Create OCP Stack in KF #54

nakfour opened this issue Aug 27, 2020 · 3 comments

nakfour commented Aug 27, 2020

Testing was done on OCP 4.3 / Kubernetes 1.16. The code is in this PR: https://github.com/kubeflow/manifests
Issues:

  1. The profiles app does not substitute $(namespace), as the error from the validating webhook below shows. manifest/profiles/base_v3 does not seem to define the namespace variable the way base/kustomization.yaml does. DONE-FIXED
WARN[0032] Encountered error applying application profiles:  (kubeflow.error): Code 500 with message: Apply.Run : error when creating "/tmp/kout859943078": admission webhook "pilot.validation.istio.io" denied the request: configuration is invalid: domain name "$(namespace).svc.cluster.local" invalid (label "$(namespace)" invalid)  filename="kustomize/kustomize.go:266"
WARN[0032] Will retry in 6 seconds.                      filename="kustomize/kustomize.go:267"
  2. The cache-server pod crashes with: secret "webhook-server-tls" not found
  3. When creating a notebook server, fsGroup is set to 100. DONE-FIXED
  4. Launching a notebook server gives the error "no healthy upstream". DONE-FIXED
  5. The ml-pipeline pod crashes with these logs: DONE-FIXED
 config.go:37] Please specify flag PROFILES_KFAM_SERVICE_HOST

This happens because ml-pipeline needs profiles installed, and profiles only works when installed from the top-level kustomization file.
  6. Opening the Pipelines page from the Kubeflow dashboard gives the error below, seen before in kubeflow/kubeflow#5271:
upstream connect error or disconnect/reset before headers. reset reason: connection failure
To solve this, set

tls:
      mode: DISABLE

in the destination rule for ml-pipeline-ui. DONE-FIXED
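
For reference, a minimal sketch of what such a destination rule could look like. The apiVersion, the kubeflow namespace, and the fully qualified host name are assumptions, not copied from the manifests; only the tls mode setting comes from the fix described above.

```yaml
# Sketch only: metadata and host are assumed values, adjust to the actual install.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: ml-pipeline-ui
  namespace: kubeflow            # assumed namespace
spec:
  host: ml-pipeline-ui.kubeflow.svc.cluster.local   # assumed service host
  trafficPolicy:
    tls:
      mode: DISABLE              # the setting from the fix above: no TLS origination to this host
```

With mode DISABLE the sidecar connects to ml-pipeline-ui in plain text instead of attempting TLS, which would match the "connection failure" reset reason if the UI pod is not serving TLS.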

  • tf-jobs,pytorch,katib,metadata (Vasek)

  • profiles, pipelines (Landon)

  • Seldon (Juana)

  • JH web (Maros)

  • Notebook Controller

  • istio (Juana)

  • cert-manager

@nakfour nakfour created this issue from a note in ODH 0.9.0 (To do) Aug 27, 2020
@nakfour nakfour self-assigned this Aug 27, 2020
@nakfour nakfour removed this from To do in ODH 0.9.0 Aug 27, 2020
vpavlin commented Oct 21, 2020

vpavlin commented Oct 21, 2020

  1. Could it be because params.yaml is not taken into account (https://github.com/kubeflow/manifests/blob/master/profiles/overlays/istio/params.yaml), so the virtual service is not templated?
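
To illustrate the mechanism in question, here is a rough sketch of how kustomize vars and a params.yaml work together. It shows two separate files in one block; the resource file name, the ConfigMap name, and the field path are hypothetical and not taken from the manifests repo. The point is that $(namespace) is only substituted in fields that a varReference entry lists, and only if the kustomization both defines the var and loads params.yaml through configurations.

```yaml
# File 1: kustomization.yaml (sketch; names marked hypothetical are illustrative)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- virtual-service.yaml           # hypothetical resource using $(namespace)
configMapGenerator:
- name: profiles-parameters      # hypothetical ConfigMap holding the value
  literals:
  - namespace=kubeflow
vars:
- name: namespace                # defines $(namespace)
  objref:
    kind: ConfigMap
    name: profiles-parameters
    apiVersion: v1
  fieldref:
    fieldpath: data.namespace
configurations:
- params.yaml                    # without this entry the varReference below is ignored
---
# File 2: params.yaml (sketch; the path is a hypothetical example)
varReference:
- path: spec/http/route/destination/host
  kind: VirtualService
```

If either the vars entry or the configurations entry is missing, the literal string $(namespace) is left in the output, which is what the webhook error in the issue shows.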

@nakfour nakfour added this to In progress in ODH 0.9.0 Oct 21, 2020

vpavlin commented Oct 23, 2020

TF operator, PyTorch operator and Katib work

metadata-db needs a small fix mentioned in kubeflow#1567 (comment)

@nakfour nakfour moved this from In progress to Done in ODH 0.9.0 Nov 9, 2020
@nakfour nakfour closed this as completed Jun 18, 2021