Cloud NFS PVC needs to specify volumeName with k8s 1.11 #2475
…ass not found. Related to kubeflow/kubeflow#2475
* Improve auto_deploy to support changing zone and testing changes.
* Explicitly delete the storage deployment; the delete script won't delete it by design because we don't want to destroy data.
* Instead of depending on a dummy kubeconfig file, call generate/apply for platform and then for k8s.
* For repos, take them in the form ${ORG}/${NAME}@Branch. This matches what the other test script does. It also allows us to check out the repos from people's forks, which makes testing changes easier.
* Move the logic in checkout-snapshot.sh into repo_clone_snapshot.py. This is cleaner than having the Python script shell out to a shell script.
* Move the logic in deployment-workflow.sh into create_kf_instance.py.
* Add an option in create_kf_instance.py to read and parse the snapshot.json file rather than doing it in bash.
* Change the arg name to be datadir instead of nfs_mount because nfs is an assumption of how it is run on K8s.
* Check out the source into NFS to make it easier to debug.
* Add a bash script to set the images using YQ.
* Add a wait operation for deletes and use it to wait for deletion of storage.
* Rename init.sh to auto_deploy.sh to make it more descriptive.
* Also modify auto_deploy.sh so we can run it locally.
* Use the existing worker ci image as a base image to deduplicate the Dockerfile.
* Attach labels to the deployment not the cluster.
* We want to use deployments not cluster to decide what to recycle.
* Deployments are global but clusters are zonal and we want to be able to move to different zones to deal with issues like stockouts.
* The GCP API does return labels on deployments.
* We can figure out which deployment to recycle just by looking at the insertTime; we don't need to depend on deployment labels.
* Add retries to deal with kubeflow/kubeflow#1933.
* Fix lint.
* With K8s 1.11 we need to set volumeName otherwise we get storage class not found. Related to kubeflow/kubeflow#2475
* Fix lint.
* Change cron job to run every 12 hours.
* This should be the current schedule but it looks like it was never checked in.
* We want to leave clusters up long enough to facilitate debugging.
/assign @zabbasi
/close
@zabbasi: Closing this issue.
#2710 didn't fix the issue. We need to set storageClassName on the volume as well.
… filestore. Related to: kubeflow#2475
… filestore. Related to: kubeflow#2475 Update kf_is_ready_test to print out the namespace we are monitoring for the deployments; makes it easier to debug issues like kubeflow#3273 that was causing test failures.
… filestore. (kubeflow#3268) Related to: kubeflow#2475 Update kf_is_ready_test to print out the namespace we are monitoring for the deployments; makes it easier to debug issues like kubeflow#3273 that was causing test failures.
Fixed by #3268
Here's a PV and PVC spec for Cloud NFS.
https://github.com/jlewi/community/blob/6448276eb77ea06f971dbec470565b0b3088c349/devstats/k8s_manifests/nfs_pvc.yaml
This was based on our ksonnet prototype
https://github.com/kubeflow/kubeflow/blob/master/kubeflow/gcp/google-cloud-filestore-pv.libsonnet
On K8s 1.10 and kubeflow-testing this pattern appears to work.
But on K8s 1.11 the PVC was giving me a storage class not found error.
I fixed this by modifying the PVC to use the field storageClassName (instead of the annotation) and setting the field volumeName. Here's the fixed manifest.
https://github.com/jlewi/community/blob/ed9b1673febd88e31d4baf76c6053a303d4eeed0/devstats/k8s_manifests/nfs_pvc.yaml
I think we will need to update our ksonnet prototype to make the same changes.
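For reference, here is a minimal sketch of what a PV/PVC pair with these changes might look like. It is not copied from the linked manifest: the resource names, the `nfs-storage` class name, the server IP, the export path, and the capacity are all placeholder assumptions. The parts that matter are that `storageClassName` is set as a spec field on both the volume and the claim, and that the claim's `volumeName` points at the PV.

```yaml
# Sketch only: names, IP, path, and sizes are hypothetical placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-nfs-pv          # hypothetical name
spec:
  capacity:
    storage: 1T
  accessModes:
    - ReadWriteMany
  storageClassName: nfs-storage # assumed class name; must match the PVC
  nfs:
    server: 10.0.0.2            # placeholder for the Cloud NFS (Filestore) IP
    path: /example_share        # placeholder for the exported share
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-nfs-pvc         # hypothetical name
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs-storage # spec field instead of the annotation
  volumeName: example-nfs-pv    # bind explicitly to the PV above
  resources:
    requests:
      storage: 1T
```

Setting `volumeName` asks Kubernetes to bind the claim to that specific volume rather than looking for a dynamic provisioner for the class, which would be consistent with the storage class error going away.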