GKE Deploy fails in Australia as no NVIDIA GPUs available unless we modify cluster.jinja #4947

Closed
snowlover173 opened this issue Apr 13, 2020 · 3 comments
Labels: area/gcp, area/gpu, kind/bug, platform/gcp, priority/p2

Comments

snowlover173 commented Apr 13, 2020

/kind bug

What steps did you take and what happened:
I deployed Kubeflow 0.7, 1.0.0, and 1.0.1, following the documentation for deploying Kubeflow on GCP with the CLI.
The deployment failed with an error saying that NVIDIA GPUs were not available.
After I changed "gpu-pool-max-nodes" from 10 to 0 in gcp_config/cluster-kubeflow.yaml, the deployment ran.
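
The workaround above could be scripted roughly as follows. This is a minimal sketch, not the project's own tooling: it assumes the config contains a line of the form `gpu-pool-max-nodes: 10`, and the function name `set_gpu_pool_max_nodes`, the sample config text, and the zone value in it are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical excerpt standing in for gcp_config/cluster-kubeflow.yaml.
# The surrounding keys and the zone value are assumptions for illustration.
SAMPLE_CONFIG = """\
properties:
  zone: australia-southeast1-a
  gpu-pool-max-nodes: 10
"""

# Matches the key with its indentation, then captures the numeric value.
_GPU_POOL_LINE = re.compile(r"^([ \t]*gpu-pool-max-nodes:[ \t]*)\d+", re.MULTILINE)


def set_gpu_pool_max_nodes(config_text: str, max_nodes: int = 0) -> str:
    """Return config_text with every gpu-pool-max-nodes value set to max_nodes."""
    new_text, count = _GPU_POOL_LINE.subn(
        lambda m: f"{m.group(1)}{max_nodes}", config_text
    )
    if count == 0:
        # Fail loudly rather than silently leaving the config unchanged.
        raise ValueError("gpu-pool-max-nodes not found in config")
    return new_text


if __name__ == "__main__":
    print(set_gpu_pool_max_nodes(SAMPLE_CONFIG))
```

A plain text substitution is used here instead of a YAML parser so that comments and formatting in the file are left untouched.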

What did you expect to happen:
The deployment should complete without error.

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

Environment:

  • Kubeflow version: 0.7, 1.0.0, 1.0.1
  • kfctl version: matching each Kubeflow version above
  • Kubernetes platform: GCP
  • Kubernetes version: 1.14
  • OS (e.g. from /etc/os-release):

Issue-Label Bot is automatically applying the labels:

Label Probability
kind/bug 0.99


kubeflow-bot added this to To Do in Needs Triage Apr 13, 2020
jbottum added the area/gpu, priority/p2, area/gcp, and platform/gcp labels Apr 17, 2020
kubeflow-bot removed this from To Do in Needs Triage Apr 17, 2020

jlewi (Contributor) commented Apr 20, 2020

Which zone did you try to use?

jlewi (Contributor) commented Jun 9, 2020

I'm closing this as obsolete because with 1.1 we are switching to blueprints, which will no longer use Deployment Manager.
https://github.com/kubeflow/gcp-blueprints/tree/master/kubeflow

Blueprints use Cloud Config Connector to configure your GKE clusters, and you can configure settings like node autoprovisioning by using tools like kustomize to generate the appropriate YAML specs.

jlewi closed this as completed Jun 9, 2020