Kubernetes runner: Non-privileged filesystem access, updated k8s Job version #3972

Merged
merged 10 commits into from May 4, 2017

4 participants
@pcm32
Member

commented Apr 25, 2017

This PR adds functionality to:

  • Allow access to non-privileged filesystems in Kubernetes (k8s) by introducing options that set the k8s fsGroup and supplementalGroups variables on Jobs when they are specified in the job_conf.xml file. This supplies the missing pieces required to use shared file systems provisioned directly by certain cloud providers, such as NFS.
  • Set the k8s pullPolicy for jobs through a value in job_conf.xml. This controls how a tool's containers are pulled from registries. It only affects k8s and in no way changes how Galaxy uses containers in general. This addresses #3900.
  • Move to the current version of the k8s Job definition (batch/v1) instead of the previous one (extensions/v1beta1), which is no longer available in k8s 1.6. The version can be set in job_conf.xml, like the options in the previous points.
    • No longer advise using pykube from my own fork, as all the required features have now been merged upstream in that project. This was partly required to change the Job definition version.

These changes have been tested on top of the latest Galaxy dev version, and all functionality seems fine.
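
As a rough illustration of the three options above, the following sketch shows where they would land in a batch/v1 Job manifest. It is not the code from this PR: the function name `build_k8s_job` and the parameter keys `k8s_fs_group_id`, `k8s_supplemental_group_id` and `k8s_pull_policy` are assumptions made for the example; only `k8s_job_api_version` appears in the snippet quoted later in this thread. The Kubernetes field names (`securityContext.fsGroup`, `securityContext.supplementalGroups`, `imagePullPolicy`) are the standard pod spec fields.

```python
def build_k8s_job(job_name, image, command, params):
    """Return a Kubernetes Job manifest as a plain dict (illustrative only)."""
    container = {"name": job_name, "image": image, "command": command}
    # Hypothetical key: only set a pull policy when one is configured.
    if params.get("k8s_pull_policy"):
        container["imagePullPolicy"] = params["k8s_pull_policy"]

    # fsGroup is a single integer GID, supplementalGroups a list of GIDs.
    security_context = {}
    if params.get("k8s_fs_group_id") is not None:
        security_context["fsGroup"] = int(params["k8s_fs_group_id"])
    if params.get("k8s_supplemental_group_id") is not None:
        security_context["supplementalGroups"] = [
            int(params["k8s_supplemental_group_id"])
        ]

    # Jobs only accept a restartPolicy of "Never" or "OnFailure".
    pod_spec = {"containers": [container], "restartPolicy": "Never"}
    if security_context:
        pod_spec["securityContext"] = security_context

    return {
        "apiVersion": params.get("k8s_job_api_version", "batch/v1"),
        "kind": "Job",
        "metadata": {"name": job_name, "labels": {"app": job_name}},
        "spec": {"template": {"spec": pod_spec}},
    }


job = build_k8s_job(
    "galaxy-example",
    "busybox",
    ["true"],
    {"k8s_fs_group_id": "1000", "k8s_pull_policy": "IfNotPresent"},
)
```

Leaving the security context out entirely when no group is configured keeps the manifest unchanged for clusters that do not need it.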

@galaxybot galaxybot added the triage label Apr 25, 2017

@galaxybot galaxybot added this to the 17.05 milestone Apr 25, 2017

@pcm32 pcm32 force-pushed the phnmnl:feature/pr_fs_access_job_version branch from f92e93c to 54c0249 Apr 26, 2017

@pcm32

Member Author

commented Apr 26, 2017

I have tried to address the tox errors, but the linter is still not satisfied:

py34-lint runtests: commands[0] | bash .ci/flake8_py3_wrapper.sh
lib/galaxy/jobs/runners/kubernetes.py:101:17: E123 closing bracket does not match indentation of opening bracket's line

After making some changes (there was an extra comma), the code looks good to me:

        k8s_job_obj = {
            "apiVersion": self.runner_params['k8s_job_api_version'],
            "kind": "Job",
            "metadata":
            # metadata.name is the name of the pod resource created, and must be unique
            # http://kubernetes.io/docs/user-guide/configuring-containers/
                {
                    "name": k8s_job_name,
                    "namespace": "default",  # TODO this should be set
                    "labels": {"app": k8s_job_name}
                },   # this is line 101
            "spec": self.__get_k8s_job_spec(job_wrapper)
        }

I don't understand how any of my changed code influences:

py27-lint runtests: PYTHONHASHSEED='2262571038'
py27-lint runtests: commands[0] | bash .ci/flake8_wrapper.sh
Traceback (most recent call last):
  File "/home/travis/build/galaxyproject/galaxy/.tox/py27-lint/bin/flake8", line 11, in <module>
    sys.exit(main())
  File "/home/travis/build/galaxyproject/galaxy/.tox/py27-lint/lib/python2.7/site-packages/flake8/main/cli.py", line 16, in main
    app.run(argv)
  File "/home/travis/build/galaxyproject/galaxy/.tox/py27-lint/lib/python2.7/site-packages/flake8/main/application.py", line 328, in run
    self._run(argv)
  File "/home/travis/build/galaxyproject/galaxy/.tox/py27-lint/lib/python2.7/site-packages/flake8/main/application.py", line 316, in _run
    self.run_checks()
  File "/home/travis/build/galaxyproject/galaxy/.tox/py27-lint/lib/python2.7/site-packages/flake8/main/application.py", line 246, in run_checks
    self.file_checker_manager.run()
  File "/home/travis/build/galaxyproject/galaxy/.tox/py27-lint/lib/python2.7/site-packages/flake8/checker.py", line 317, in run
    self.run_parallel()
  File "/home/travis/build/galaxyproject/galaxy/.tox/py27-lint/lib/python2.7/site-packages/flake8/checker.py", line 286, in run_parallel
    for ret in pool_map:
  File "/opt/python/2.7.12/lib/python2.7/multiprocessing/pool.py", line 287, in <genexpr>
    return (item for chunk in result for item in chunk)
  File "/opt/python/2.7.12/lib/python2.7/multiprocessing/pool.py", line 668, in next
    raise value
AttributeError: 'module' object has no attribute 'PEP257Checker'
ERROR: InvocationError: '/bin/bash .ci/flake8_wrapper.sh'

at https://travis-ci.org/galaxyproject/galaxy/jobs/225927913.

I also fixed a missing comment closure near line 190 of job_conf.xml.sample, but the check still seems to complain about the character at column 16, which looks fine to me (just a dash).

Any thoughts on how to fix these? Thanks!

the security context.
Using this requires that the Kubernetes cluster is not running the admission controller
"SecurityContextDeny". To check this, look at the --admission-control= variable setup for the

@nsoranzo

@pcm32

pcm32 Apr 26, 2017

Author Member

Thanks for pointing it out! Will fix that then!
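
The review comment being answered here is not shown in the thread, so the following is only a guess at the kind of problem involved. One plausible cause of a "dash" complaint in job_conf.xml.sample is that XML does not allow a double hyphen inside a comment, and the sample text quoted above mentions `--admission-control=`. A minimal check with the standard library, using made-up one-line documents rather than the real sample file:

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Hypothetical stand-ins for the sample file, not its actual content.
WITH_DOUBLE_DASH = (
    "<job_conf><!-- look at the --admission-control= variable --></job_conf>"
)
WITHOUT_DOUBLE_DASH = (
    "<job_conf><!-- look at the admission-control variable --></job_conf>"
)

double_dash_ok = is_well_formed(WITH_DOUBLE_DASH)
reworded_ok = is_well_formed(WITHOUT_DOUBLE_DASH)
```

If this is indeed the cause, rewording the comment so that the flag is not written with two consecutive hyphens would be enough to make the file parse.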

}
,
"labels": {"app": k8s_job_name}
},

@nsoranzo

nsoranzo Apr 26, 2017

Member

That's how I would indent this:

        k8s_job_obj = {
            "apiVersion": self.runner_params['k8s_job_api_version'],
            "kind": "Job",
            "metadata": {
                # metadata.name is the name of the pod resource created, and must be unique
                # http://kubernetes.io/docs/user-guide/configuring-containers/
                "name": k8s_job_name,
                "namespace": "default",  # TODO this should be set
                "labels": {"app": k8s_job_name}
            },
            "spec": self.__get_k8s_job_spec(job_wrapper)
        }

@pcm32

pcm32 Apr 26, 2017

Author Member

Thanks! will do!

@nsoranzo

Member

commented Apr 26, 2017

@pcm32 Our py27-lint Travis build is broken due to https://gitlab.com/pycqa/flake8-docstrings/issues/19 , hopefully that will be fixed soon.

@pcm32 pcm32 force-pushed the phnmnl:feature/pr_fs_access_job_version branch from 54c0249 to 9ffbcdd Apr 26, 2017

@pcm32

Member Author

commented Apr 26, 2017

Thanks for looking into it so promptly and for the suggestions @nsoranzo! Have added those now!

@nsoranzo

Member

commented Apr 26, 2017

@galaxybot test this

@jmchilton jmchilton modified the milestones: 17.09, 17.05 Apr 26, 2017

jmchilton and others added some commits May 1, 2017

Touch up k8 options in job_conf sample.
Optional options should be commented out - I guess only the two claim options need to have non-default values specified?
@jmchilton

Member

commented May 4, 2017

Very nice - thanks for merging my PR and uncommenting the config path.

This looks good as is - I'll probably have some follow up PRs as I make more progress on Kubernetes myself. I'm investigating getting this to work with BioContainers, docker-galaxy-stable, and mounting claims on a per-job basis.

@jmchilton jmchilton merged commit 8d09854 into galaxyproject:dev May 4, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed