Separated data PVC #12345

Merged
merged 9 commits into from
Dec 20, 2021

@almahmoud almahmoud commented Aug 13, 2021

Separates the volumes that should be mounted in all jobs (cvmfs, scripts, cache, etc.) from the PVCs that should be mounted per job (input data and job directory).

Friends with galaxyproject/galaxy-helm#314

Example volume mounts per job:


## Unique to job (generated from `k8s_data_volume_claim`)
[{'name': 'k8sfiles-gxy-rls-galaxy-pvc', 'mountPath': '/galaxy/server/database/objects/4/1/5/dataset_415a97d5-2b27-4b10-9626-4eb60372e46d.dat', 'subPath': 'objects/4/1/5/dataset_415a97d5-2b27-4b10-9626-4eb60372e46d.dat'},
 {'name': 'k8sfiles-gxy-rls-galaxy-pvc', 'mountPath': '/galaxy/server/database/jobs_directory/000/19/outputs', 'subPath': 'jobs_directory/000/19/outputs'},
 {'name': 'k8sfiles-gxy-rls-galaxy-pvc', 'mountPath': '/galaxy/server/database/jobs_directory/000/19', 'subPath': 'jobs_directory/000/19'},

## Common among jobs (from `k8s_persistent_volume_claims`)
 {'name': 'k8sfiles-gxy-rls-galaxy-pvc', 'mountPath': '/galaxy/server/database/cache', 'subPath': 'cache'},
 {'name': 'k8sfiles-gxy-rls-galaxy-pvc', 'mountPath': '/galaxy/server/database/config', 'subPath': 'config'},
 {'name': 'k8sfiles-gxy-rls-galaxy-pvc', 'mountPath': '/galaxy/server/database/deps', 'subPath': 'deps'},
 {'name': 'k8sfiles-gxy-rls-galaxy-pvc', 'mountPath': '/galaxy/server/database/object_store_cache', 'subPath': 'object_store_cache'},
 {'name': 'k8sfiles-gxy-rls-galaxy-pvc', 'mountPath': '/galaxy/server/database/tmp', 'subPath': 'tmp'},
 {'name': 'k8sfiles-gxy-rls-galaxy-pvc', 'mountPath': '/galaxy/server/database/tool-data', 'subPath': 'tool-data'},
 {'name': 'k8sfiles-gxy-rls-galaxy-pvc', 'mountPath': '/galaxy/server/database/tools', 'subPath': 'tools'},
 {'name': 'k8sfiles-gxy-rls-galaxy-pvc', 'mountPath': '/galaxy/server/database/tool_search_index', 'subPath': 'tool_search_index'},
 {'name': 'k8sfiles-gxy-rls-galaxy-cvmfs-gxy-data-pvc', 'mountPath': '/cvmfs/data.galaxyproject.org'},
 {'name': 'k8sfiles-gxy-rls-galaxy-pvc', 'mountPath': '/cvmfs/cloud.galaxyproject.org', 'subPath': 'cvmfsclone'}]
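A minimal sketch of how the per-job entries above could be derived, assuming the data claim is mounted at `/galaxy/server/database` (inferred from the `mountPath`/`subPath` pairs in the example). The helper name `relative_mounts` is hypothetical and is not taken from this PR's diff.

```python
import posixpath


def relative_mounts(volume_name, claim_mount_path, paths):
    """Build one volumeMount dict per path that lives under the claim's mount point.

    The subPath is the path relative to where the claim would be mounted
    if the whole volume were exposed to the job.
    """
    base = claim_mount_path.rstrip("/")
    mounts = []
    for path in paths:
        if not path.startswith(base + "/"):
            continue  # path is not stored on this claim, so skip it
        mounts.append({
            "name": volume_name,
            "mountPath": path,
            "subPath": posixpath.relpath(path, base),
        })
    return mounts


job_mounts = relative_mounts(
    "k8sfiles-gxy-rls-galaxy-pvc",
    "/galaxy/server/database",  # assumed mount point of the data claim
    [
        "/galaxy/server/database/jobs_directory/000/19/outputs",
        "/galaxy/server/database/jobs_directory/000/19",
    ],
)
```

Each job then only sees its own inputs and job directory on the data volume, rather than the whole shared filesystem.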

Todo

  • Make sure the config is backwards compatible. Conceptually it should be, since `k8s_persistent_volume_claims` can still be used to mount the entire shared FS in all jobs as before, but we want to test that empty/null values for `k8s_data_volume_claim` are handled properly.
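
The backward-compatibility check described in this item could be sketched as below. Both helper names (`is_unset`, `select_job_mounts`) and the idea of passing precomputed mount lists are assumptions for illustration, not the runner's actual code.

```python
def is_unset(value):
    """Treat None, empty, and whitespace-only values as 'not configured'."""
    return value is None or not str(value).strip()


def select_job_mounts(runner_params, common_mounts, per_job_mounts):
    """Return the mounts a job should get.

    When `k8s_data_volume_claim` is unset, fall back to the old behaviour:
    only the mounts from `k8s_persistent_volume_claims` are used.
    """
    if is_unset(runner_params.get("k8s_data_volume_claim")):
        return list(common_mounts)
    return list(per_job_mounts) + list(common_mounts)


common = [{"name": "galaxy-pvc", "mountPath": "/galaxy/server/database"}]
per_job = [{"name": "galaxy-pvc", "mountPath": "/data/job", "subPath": "job"}]

# All of these configurations should behave exactly as before the change.
for legacy_params in ({}, {"k8s_data_volume_claim": None}, {"k8s_data_volume_claim": "  "}):
    assert select_job_mounts(legacy_params, common, per_job) == common
```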

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

except Exception as e:
    log.debug('Failed to parse `k8s_data_volume_claim` parameter in the kubernetes runner configuration')
    raise e
inputs = job_wrapper.get_input_fnames()

This won't work with extra files paths and metadata files, I assume.

This logic shouldn't be happening at this layer.


@jmchilton Where do you think it should be happening though? I'm not sure this logic can be moved elsewhere, because mounting PVCs is inherently tied to the k8s runner. The extra files path is a fair point, but the overall way this works is that the extra metadata can be mounted into the job and made available, out-of-band if necessary.

Comment on lines 108 to 109
self._init_monitor_thread()
self._init_worker_threads()

Suggested change
self._init_monitor_thread()
self._init_worker_threads()

except Exception as e:
    log.debug('Failed to parse `k8s_data_volume_claim` parameter in the kubernetes runner configuration')
    raise e
data_volume_name = data_claim_name if "/" not in data_claim_name else data_claim_name.split("/")[0]

Parsing could be moved to a separate function and shared between `k8s_persistent_volume_claims`, `k8s_data_volume_claim`, and `k8s_working_volume_claim`.
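
One possible shape for such a shared parser, as a rough sketch only. It assumes each parameter holds comma-separated `claim[/subpath]:mount_path` entries, which is inferred from the `split("/")[0]` line in the snippet above; the function name `parse_volume_claims` and the returned dict layout are hypothetical.

```python
def parse_volume_claims(value):
    """Parse 'claim[/subpath]:/mount/path' entries separated by commas.

    Returns a list of dicts with 'name', 'subPath' (or None), and 'mountPath'.
    Unset or blank input yields an empty list.
    """
    claims = []
    if not value or not value.strip():
        return claims
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas
        claim_spec, sep, mount_path = entry.partition(":")
        if not sep or not claim_spec or not mount_path:
            raise ValueError(f"Expected 'claim:mount_path', got {entry!r}")
        # Everything before the first '/' is the volume name; the rest is the subPath.
        volume_name, _, sub_path = claim_spec.partition("/")
        claims.append({
            "name": volume_name,
            "subPath": sub_path or None,
            "mountPath": mount_path,
        })
    return claims


parsed = parse_volume_claims(
    "galaxy-pvc/cache:/galaxy/server/database/cache,cvmfs-pvc:/cvmfs/data.galaxyproject.org"
)
```

All three parameters could then go through the same function, with a single place to log and raise on malformed values.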

@nuwang nuwang changed the title [WIP] Separated data PVC Separated data PVC Dec 19, 2021
@github-actions github-actions bot added this to the 22.01 milestone Dec 19, 2021

nuwang commented Dec 20, 2021

This is working well and, since it's backward compatible, it should pose no issues even if we have to make changes in the future.

@nuwang nuwang merged commit e8cb3d7 into galaxyproject:dev Dec 20, 2021