Kubeflow dashboard does not show in browser #1764

Closed
kosehy opened this issue Nov 23, 2020 · 37 comments


kosehy commented Nov 23, 2020

What steps did you take and what happened:
Following the suggestion in ubuntu/microk8s#1698,
I installed Kubeflow using MicroK8s:

sudo snap install microk8s --classic --channel=latest/edge
microk8s enable dns storage gpu istio
microk8s enable kubeflow

After the install finished, I opened http://localhost
Screenshot from 2020-11-23 13-03-46

and got the error below:
Screenshot from 2020-11-23 13-11-40

What did you expect to happen:
Enter the kubeflow dashboard.

Anything else you would like to add:
inspection-report-20201123_124046.tar.gz
When I use the following command to set up port forwarding to the Istio gateway,

export NAMESPACE=istio-system
microk8s kubectl port-forward -n ${NAMESPACE} svc/istio-ingressgateway 8080:80

I get the error message below:

upstream connect error or disconnect/reset before headers. reset reason: connection failure

When I open http://10.152.183.51:8082/, I can access the Kubeflow dashboard.
Screenshot from 2020-11-23 15-01-32
However, I cannot access the other sections, such as Pipelines and Notebook Servers.
Screenshot from 2020-11-23 15-04-26
Environment:

  • Kubernetes version: (use kubectl version): v1.19.4-34+68a982ef7f1a98
  • OS (e.g. from /etc/os-release): ubuntu 20.04 LTS
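
One thing that may be worth checking before port-forwarding is which namespace actually holds the istio-ingressgateway service. The command above targets istio-system, while later in this thread the gateway pod and service appear in the kubeflow namespace. The following is a minimal sketch, assuming `microk8s kubectl` is available; `find_service_namespaces` is a hypothetical helper name, not part of MicroK8s.

```shell
# Sketch only: find_service_namespaces is a hypothetical helper.
# Prints every namespace that contains a Service with the given name.
find_service_namespaces() {
  local name="$1"
  # custom-columns with --no-headers prints "NAMESPACE NAME" per service.
  microk8s kubectl get svc --all-namespaces --no-headers \
    -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name |
    awk -v n="$name" '$2 == n { print $1 }'
}
```

A possible use is `ns="$(find_service_namespaces istio-ingressgateway | head -n 1)"`, followed by `microk8s kubectl port-forward -n "$ns" svc/istio-ingressgateway 8080:80`.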

chen98 commented Nov 23, 2020

I am having a similar issue after bringing up Kubeflow successfully.
(
sudo snap install microk8s --classic --channel=latest/edge
...
Waiting for service pods to become ready.
Congratulations, Kubeflow is now available.
The dashboard is available at http://localhost
Username: admin
Password: ...
To see these values again, run:
microk8s juju config dex-auth static-username
)
Based on the following tutorial,
https://ubuntu.com/tutorials/deploy-kubeflow-ubuntu-windows-mac#4-deploy-kubeflow
should we always use "http://10.64.140.43.xip.io" (after setting up the network proxy)?
Right now I get a "requested resource is not found" error when accessing the Kubeflow dashboard.

Thanks.


chen98 commented Nov 23, 2020

If I follow kosehy's approach (cluster IP: http://:8082/_/jupyter/), I get a "not a valid page" error.

I have no problem reaching the Kubernetes dashboard with a login token
(
https:///#/overview?namespace=default
)
Please help. Thanks.


chen98 commented Nov 23, 2020

I am getting the IP from the following service listing:
kube-system service/kubernetes-dashboard ClusterIP 10.152.183.### 443/TCP
kubeflow service/kubeflow-dashboard ClusterIP 10.152.183.### 8082/TCP
Thanks.
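
The lookup described above can also be scripted instead of read off the listing by hand. This is a minimal sketch, assuming `microk8s kubectl` is available and the service exposes at least one port; `service_url` is a hypothetical helper name. Keep in mind that a ClusterIP is normally only routable from inside the cluster or from the MicroK8s host itself.

```shell
# Sketch only: service_url is a hypothetical helper.
# Usage: service_url NAMESPACE SERVICE
# Prints http://<clusterIP>:<first service port>/
service_url() {
  local ns="$1" svc="$2" ip port
  # jsonpath extracts single fields from the Service object.
  ip="$(microk8s kubectl -n "$ns" get svc "$svc" -o jsonpath='{.spec.clusterIP}')" || return 1
  port="$(microk8s kubectl -n "$ns" get svc "$svc" -o jsonpath='{.spec.ports[0].port}')" || return 1
  [ -n "$ip" ] && [ -n "$port" ] || return 1
  printf 'http://%s:%s/\n' "$ip" "$port"
}
```

For example, `service_url kubeflow kubeflow-dashboard` would print the URL for the dashboard service listed above.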


kosehy commented Nov 24, 2020

Based on kubeflow/kubeflow#5407 (comment),
I can finally access the Kubeflow dashboard.

Screenshot from 2020-11-24 19-21-38

Here is what I did:

# remove the microk8s
sudo snap remove microk8s --purge
# reinstall microk8s
sudo snap install microk8s --classic --channel=latest/edge
microk8s enable dns storage gpu istio
# before enabling kubeflow
# if your server's ip address is "123.456.789.123"
export KUBEFLOW_HOSTNAME="http://123.456.789.123"
microk8s enable kubeflow

After enabling kubeflow

microk8s juju config dex-auth static-username=admin
microk8s juju config dex-auth static-password=1qaz2wsx
microk8s.kubectl port-forward -n kubeflow service/istio-ingressgateway 8081:80 --address 0.0.0.0

Go to "http://123.456.789.123:8081".

You can log in with admin / 1qaz2wsx.

However, I still cannot access the Pipelines and Notebook Servers pages.

Screenshot from 2020-11-24 19-22-11
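
To narrow down which pages fail through the forwarded port, the HTTP status of each path can be checked from the command line. This is a rough sketch, assuming `curl` is installed; `probe_paths` is a hypothetical helper name, and the paths passed to it are whatever you want to test. An unauthenticated request may return a redirect to the login page rather than 200.

```shell
# Sketch only: probe_paths is a hypothetical helper.
# Usage: probe_paths BASE_URL PATH...
# Prints "<http status> <url>" for each path.
probe_paths() {
  local base="${1%/}" path
  shift
  for path in "$@"; do
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code.
    printf '%s %s\n' \
      "$(curl -s -o /dev/null -w '%{http_code}' "${base}${path}")" \
      "${base}${path}"
  done
}
```

For example, `probe_paths http://123.456.789.123:8081 / /_/jupyter/` checks the dashboard root and the Jupyter path mentioned earlier in this thread.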


chen98 commented Nov 26, 2020

I grabbed the latest microk8s today and still got the same error as above.
sudo snap install microk8s --classic --channel=latest/edge
(build #1829)
...
"Sorry, /jupyter/ is not a valid page"
from "http://10.152.183.nnn:8082/_/jupyter/" URL in the browser.

Any idea how to fix it? Thanks.


kosehy commented Nov 29, 2020

@chen98 It might be a certificate issue.
kubeflow/kubeflow#5407 (comment)

I also see warning messages in the istio-ingressgateway log:
kubeflow/kubeflow#5407 (comment)

microk8s kubectl -n kubeflow get pods
NAME                                           READY   STATUS    RESTARTS   AGE
modeloperator-dc5df5b59-smwmg                  1/1     Running   0          4h29m
argo-controller-operator-0                     1/1     Running   0          4h29m
argo-ui-operator-0                             1/1     Running   0          4h29m
dex-auth-operator-0                            1/1     Running   0          4h29m
istio-ingressgateway-operator-0                1/1     Running   0          4h28m
istio-pilot-operator-0                         1/1     Running   0          4h28m
jupyter-controller-operator-0                  1/1     Running   0          4h28m
argo-ui-75bdcd5684-2cjwc                       1/1     Running   0          4h29m
jupyter-web-operator-0                         1/1     Running   0          4h27m
katib-controller-operator-0                    1/1     Running   0          4h27m
jupyter-controller-bf98b886b-9vk2x             1/1     Running   0          4h28m
katib-db-operator-0                            1/1     Running   0          4h27m
katib-ui-operator-0                            1/1     Running   0          4h27m
katib-manager-operator-0                       1/1     Running   0          4h27m
istio-pilot-7446cb564b-7lrb9                   1/1     Running   0          4h28m
kubeflow-profiles-operator-0                   1/1     Running   0          4h26m
katib-controller-7469bc6576-ssvbq              1/1     Running   0          4h27m
metacontroller-operator-0                      1/1     Running   0          4h26m
kubeflow-dashboard-operator-0                  1/1     Running   0          4h26m
katib-ui-7c96f9566c-m89hk                      1/1     Running   0          4h26m
katib-db-0                                     1/1     Running   0          4h27m
metadata-grpc-operator-0                       1/1     Running   0          4h25m
pipelines-api-operator-0                       1/1     Running   0          4h25m
pipelines-persistence-operator-0               1/1     Running   0          4h25m
seldon-core-operator-0                         1/1     Running   0          4h25m
pipelines-scheduledworkflow-operator-0         1/1     Running   0          4h25m
pipelines-ui-operator-0                        1/1     Running   0          4h25m
pipelines-viewer-operator-0                    1/1     Running   0          4h25m
metadata-api-operator-0                        1/1     Running   0          4h25m
metadata-db-operator-0                         1/1     Running   0          4h24m
metadata-envoy-operator-0                      1/1     Running   0          4h24m
metacontroller-64b95b78bd-pzz49                1/1     Running   0          4h26m
metadata-ui-operator-0                         1/1     Running   0          4h24m
katib-manager-8598bbb465-74m5b                 1/1     Running   0          4h26m
minio-operator-0                               1/1     Running   0          4h24m
oidc-gatekeeper-operator-0                     1/1     Running   0          4h24m
metadata-db-0                                  1/1     Running   0          4h24m
tf-job-operator-operator-0                     1/1     Running   0          4h23m
pipelines-db-operator-0                        1/1     Running   0          4h23m
pipelines-visualization-operator-0             1/1     Running   0          4h23m
pytorch-operator-operator-0                    1/1     Running   0          4h23m
pipelines-db-0                                 1/1     Running   0          4h23m
kubeflow-dashboard-59f7db4959-hp5t9            1/1     Running   0          4h26m
kubeflow-profiles-685d84b949-5btvz             2/2     Running   0          4h26m
pipelines-scheduledworkflow-858d496447-k8btp   1/1     Running   0          4h25m
pipelines-viewer-55d8dbb4c8-bxvcw              1/1     Running   0          4h24m
jupyter-web-558b989d55-68zx4                   2/2     Running   0          4h26m
seldon-core-7bd75c844c-vw9c5                   1/1     Running   1          4h25m
metadata-api-5b59dd78c8-jr9vr                  1/1     Running   0          4h24m
metadata-envoy-58ddcbf55c-2rcxs                1/1     Running   0          4h24m
metadata-grpc-c7f9f8968-q9wws                  1/1     Running   0          4h24m
metadata-ui-6b7f7f7988-hgq47                   1/1     Running   0          4h24m
minio-0                                        1/1     Running   0          4h24m
argo-controller-787f765b4b-2kvls               1/1     Running   0          4h23m
pipelines-ui-8494d8b984-dnrml                  2/2     Running   0          4h23m
tf-job-operator-94479fffb-52cln                1/1     Running   0          4h23m
istio-ingressgateway-6dbfc48ff-brj4g           1/1     Running   0          4h23m
pipelines-visualization-6f96f8fdff-gpvq4       1/1     Running   0          4h23m
pytorch-operator-758594d5ff-xpskb              1/1     Running   0          4h22m
pipelines-api-7bfd776db4-966jd                 2/2     Running   0          4h22m
oidc-gatekeeper-58597c8858-w79lx               2/2     Running   0          4h22m
pipelines-persistence-74cd45bcb7-2f445         1/1     Running   10         4h24m
dex-auth-868c68558-rwxfn                       2/2     Running   1          45m
microk8s kubectl -n kubeflow logs istio-ingressgateway-6dbfc48ff-brj4g
2020-11-27T06:55:06.343137Z	info	FLAG: --binaryPath="/usr/local/bin/envoy"
2020-11-27T06:55:06.343198Z	info	FLAG: --concurrency="0"
2020-11-27T06:55:06.343206Z	info	FLAG: --configPath="/etc/istio/proxy"
2020-11-27T06:55:06.343213Z	info	FLAG: --connectTimeout="10s"
2020-11-27T06:55:06.343228Z	info	FLAG: --controlPlaneAuthPolicy="NONE"
2020-11-27T06:55:06.343234Z	info	FLAG: --controlPlaneBootstrap="true"
2020-11-27T06:55:06.343238Z	info	FLAG: --customConfigFile=""
2020-11-27T06:55:06.343243Z	info	FLAG: --datadogAgentAddress=""
2020-11-27T06:55:06.343247Z	info	FLAG: --disableInternalTelemetry="false"
2020-11-27T06:55:06.343252Z	info	FLAG: --discoveryAddress="istio-pilot.kubeflow.svc:15012"
2020-11-27T06:55:06.343257Z	info	FLAG: --dnsRefreshRate="300s"
2020-11-27T06:55:06.343262Z	info	FLAG: --domain="kubeflow.svc.cluster.local"
2020-11-27T06:55:06.343267Z	info	FLAG: --drainDuration="45s"
2020-11-27T06:55:06.343271Z	info	FLAG: --envoyAccessLogService=""
2020-11-27T06:55:06.343276Z	info	FLAG: --envoyMetricsService=""
2020-11-27T06:55:06.343280Z	info	FLAG: --help="false"
2020-11-27T06:55:06.343285Z	info	FLAG: --id=""
2020-11-27T06:55:06.343289Z	info	FLAG: --ip=""
2020-11-27T06:55:06.343293Z	info	FLAG: --lightstepAccessToken=""
2020-11-27T06:55:06.343298Z	info	FLAG: --lightstepAddress=""
2020-11-27T06:55:06.343302Z	info	FLAG: --lightstepCacertPath=""
2020-11-27T06:55:06.343307Z	info	FLAG: --lightstepSecure="false"
2020-11-27T06:55:06.343312Z	info	FLAG: --log_as_json="false"
2020-11-27T06:55:06.343316Z	info	FLAG: --log_caller=""
2020-11-27T06:55:06.343321Z	info	FLAG: --log_output_level="default:info"
2020-11-27T06:55:06.343325Z	info	FLAG: --log_rotate=""
2020-11-27T06:55:06.343330Z	info	FLAG: --log_rotate_max_age="30"
2020-11-27T06:55:06.343335Z	info	FLAG: --log_rotate_max_backups="1000"
2020-11-27T06:55:06.343340Z	info	FLAG: --log_rotate_max_size="104857600"
2020-11-27T06:55:06.343344Z	info	FLAG: --log_stacktrace_level="default:none"
2020-11-27T06:55:06.343356Z	info	FLAG: --log_target="[stdout]"
2020-11-27T06:55:06.343361Z	info	FLAG: --mixerIdentity=""
2020-11-27T06:55:06.343365Z	info	FLAG: --outlierLogPath=""
2020-11-27T06:55:06.343370Z	info	FLAG: --parentShutdownDuration="1m0s"
2020-11-27T06:55:06.343373Z	info	FLAG: --pilotIdentity=""
2020-11-27T06:55:06.343379Z	info	FLAG: --proxyAdminPort="15000"
2020-11-27T06:55:06.343384Z	info	FLAG: --proxyComponentLogLevel="misc:error"
2020-11-27T06:55:06.343388Z	info	FLAG: --proxyLogLevel="warning"
2020-11-27T06:55:06.343393Z	info	FLAG: --serviceCluster="istio-ingressgateway"
2020-11-27T06:55:06.343397Z	info	FLAG: --serviceregistry="Kubernetes"
2020-11-27T06:55:06.343401Z	info	FLAG: --statsdUdpAddress=""
2020-11-27T06:55:06.343405Z	info	FLAG: --statusPort="15020"
2020-11-27T06:55:06.343410Z	info	FLAG: --stsPort="0"
2020-11-27T06:55:06.343414Z	info	FLAG: --templateFile=""
2020-11-27T06:55:06.343418Z	info	FLAG: --tokenManagerPlugin="GoogleTokenExchange"
2020-11-27T06:55:06.343423Z	info	FLAG: --trust-domain="cluster.local"
2020-11-27T06:55:06.343427Z	info	FLAG: --zipkinAddress=""
2020-11-27T06:55:06.343461Z	info	Version 1.5.0-c3c353285578eb68b334fc8766746b754b6b3789-Clean
2020-11-27T06:55:06.343676Z	info	Obtained private IP [10.1.127.80 fe80::303d:1bff:fe5f:4704]
2020-11-27T06:55:06.343720Z	info	Proxy role: &model.Proxy{ClusterID:"", Type:"router", IPAddresses:[]string{"10.1.127.80", "10.1.127.80", "fe80::303d:1bff:fe5f:4704"}, ID:"istio-ingressgateway-6dbfc48ff-brj4g.kubeflow", Locality:(*envoy_api_v2_core.Locality)(nil), DNSDomain:"kubeflow.svc.cluster.local", ConfigNamespace:"", Metadata:(*model.NodeMetadata)(nil), SidecarScope:(*model.SidecarScope)(nil), MergedGateway:(*model.MergedGateway)(nil), ServiceInstances:[]*model.ServiceInstance(nil), WorkloadLabels:labels.Collection(nil), IstioVersion:(*model.IstioVersion)(nil)}
2020-11-27T06:55:06.343730Z	info	PilotSAN []string(nil)
2020-11-27T06:55:06.343734Z	info	MixerSAN []string(nil)
2020-11-27T06:55:06.344759Z	info	Effective config: binaryPath: /usr/local/bin/envoy
configPath: /etc/istio/proxy
connectTimeout: 10s
discoveryAddress: istio-pilot.kubeflow.svc:15012
drainDuration: 45s
envoyAccessLogService: {}
envoyMetricsService: {}
parentShutdownDuration: 60s
proxyAdminPort: 15000
serviceCluster: istio-ingressgateway
statNameLength: 189

2020-11-27T06:55:06.345002Z	info	JWT policy is first-party-jwt
2020-11-27T06:55:06.345138Z	info	Using user-configured CA istio-pilot.kubeflow.svc:15012
2020-11-27T06:55:06.345188Z	info	istiod uses self-issued certificate
2020-11-27T06:55:06.345319Z	info	the CA cert of istiod is: -----BEGIN CERTIFICATE-----
MIIC3jCCAcagAwIBAgIRAOQC2c+IZZeD85Vmnqf3NmswDQYJKoZIhvcNAQELBQAw
GDEWMBQGA1UEChMNY2x1c3Rlci5sb2NhbDAeFw0yMDExMjcwNjE2NDlaFw0zMDEx
MjUwNjE2NDlaMBgxFjAUBgNVBAoTDWNsdXN0ZXIubG9jYWwwggEiMA0GCSqGSIb3
DQEBAQUAA4IBDwAwggEKAoIBAQDee2XmyvTrhmRSIpNVbCN9/0/3d6ssGuBXGTEQ
jRniY8Dt0KdAS1GXadmurUcx7qIRJFxe+HWVe0JcYvtNEoGOy4U1mhIKRq8nwzqv
Z0KmPMIeu3YfqWvFXgMSDw+zMzC2AdeKZkN00UISwFPoleAFkhP7JjM+QjDmwiKy
cs2cULAeNk3Ce+7LuE2WeIE2SWquLviP9zNMClNgl9+aDW4ES92vvk10h+BvV8bc
bjnX09Al+PAKal5J9txKVEWbKgoJwvr4nmbBc40qU2MW0qutNCFon8fko4OekGg/
eW142snaz2PLQfGmYYYraAi1aTLowJ5B3Nto6b10YINZBLTRAgMBAAGjIzAhMA4G
A1UdDwEB/wQEAwICBDAPBgNVHRMBAf8EBTADAQH/MA0GCSqGSIb3DQEBCwUAA4IB
AQCKyOXLg8SrRlly72F5756E9ZsYKF6GYupT0ThXnkDfdEfRiSVEt0st/vr38Id5
uLaoi0UMY+ihHnTzHlppw2NbTsOBth+tmtCoWays/K21G0W/vxBKBYEaGY1i8LAI
zkLNP1ubvaK2n0y6iNhKvMwtgx7l/L+ZCeDleTTShkG+CMEi3QM+SQMgjEB44a47
YExZQ31jFfpmshUHwHvXf4K3f/SZuB/QsIYsNYhHy0B1/+45fw1JcDm8C1Avjv1K
SvPWBUY4cF326yVUebC5SiigBbK5r/AZI5nJCz7rx06GYiIZx91Zs7+PbGRnKYX4
qsDgdX1MO5JKpYUEzh9tofin
-----END CERTIFICATE-----

2020-11-27T06:55:06.346031Z	info	parsed scheme: ""
2020-11-27T06:55:06.346074Z	info	scheme "" not registered, fallback to default scheme
2020-11-27T06:55:06.346175Z	info	ccResolverWrapper: sending update to cc: {[{istio-pilot.kubeflow.svc:15012  <nil> 0 <nil>}] <nil> <nil>}
2020-11-27T06:55:06.346200Z	info	ClientConn switching balancer to "pick_first"
2020-11-27T06:55:06.346417Z	info	pickfirstBalancer: HandleSubConnStateChange: 0xc000844440, {CONNECTING <nil>}
2020-11-27T06:55:06.358008Z	info	pickfirstBalancer: HandleSubConnStateChange: 0xc000844440, {READY <nil>}
2020-11-27T06:55:06.421041Z	info	Starting gateway SDS
2020-11-27T06:55:06.499843Z	warn	secretfetcher	failed load server cert/key pair from secret pipelines-api-pipelines-api-secret: server cert or private key is empty
2020-11-27T06:55:06.499935Z	warn	secretfetcher	failed load server cert/key pair from secret argo-controller-argo-controller-secret: server cert or private key is empty
2020-11-27T06:55:06.499957Z	warn	secretfetcher	failed load server cert/key pair from secret katib-ui-katib-ui-secret: server cert or private key is empty
2020-11-27T06:55:06.499977Z	warn	secretfetcher	failed load server cert/key pair from secret mlpipeline-minio-artifact: server cert or private key is empty
2020-11-27T06:55:06.499994Z	warn	secretfetcher	failed load server cert/key pair from secret kubeflow-profiles-kubeflow-profiles-secret: server cert or private key is empty
2020-11-27T06:55:06.500025Z	warn	secretfetcher	failed load server cert/key pair from secret pytorch-operator-pytorch-operator-secret: server cert or private key is empty
2020-11-27T06:55:06.500043Z	warn	secretfetcher	failed load server cert/key pair from secret metadata-api-metadata-api-secret: server cert or private key is empty
2020-11-27T06:55:06.500066Z	warn	secretfetcher	failed load server cert/key pair from secret jupyter-controller-jupyter-controller-secret: server cert or private key is empty
2020-11-27T06:55:06.500077Z	warn	secretfetcher	failed load server cert/key pair from secret metacontroller-metacontroller-secret: server cert or private key is empty
2020-11-27T06:55:06.500143Z	warn	secretfetcher	failed load server cert/key pair from secret katib-manager-katib-manager-secret: server cert or private key is empty
2020-11-27T06:55:06.500157Z	warn	secretfetcher	failed load server cert/key pair from secret pipelines-viewer-pipelines-viewer-secret: server cert or private key is empty
2020-11-27T06:55:06.500192Z	warn	secretfetcher	failed load server cert/key pair from secret metadata-grpc-metadata-grpc-secret: server cert or private key is empty
2020-11-27T06:55:06.500239Z	warn	secretfetcher	failed load server cert/key pair from secret argo-ui-argo-ui-secret: server cert or private key is empty
2020-11-27T06:55:06.500263Z	warn	secretfetcher	failed load server cert/key pair from secret jupyter-web-jupyter-web-secret: server cert or private key is empty
2020-11-27T06:55:06.500276Z	warn	secretfetcher	failed load server cert/key pair from secret oidc-gatekeeper-oidc-gatekeeper-secret: server cert or private key is empty
2020-11-27T06:55:06.500295Z	warn	secretfetcher	failed load server cert/key pair from secret metadata-envoy-metadata-envoy-secret: server cert or private key is empty
2020-11-27T06:55:06.500309Z	warn	secretfetcher	failed load server cert/key pair from secret kubeflow-profiles-kubeflow-kfam-secret: server cert or private key is empty
2020-11-27T06:55:06.500342Z	warn	secretfetcher	failed load server cert/key pair from secret seldon-core-seldon-core-secret: server cert or private key is empty
2020-11-27T06:55:06.500372Z	warn	secretfetcher	failed load server cert/key pair from secret pipelines-visualization-pipelines-visualization-secret: server cert or private key is empty
2020-11-27T06:55:06.500394Z	warn	secretfetcher	failed load server cert/key pair from secret tf-job-operator-tf-job-operator-secret: server cert or private key is empty
2020-11-27T06:55:06.500418Z	warn	secretfetcher	failed load server cert/key pair from secret katib-controller-katib-controller-secret: server cert or private key is empty
2020-11-27T06:55:06.500477Z	warn	secretfetcher	failed load server cert/key pair from secret dex-auth-dex-auth-secret: server cert or private key is empty
2020-11-27T06:55:06.500495Z	warn	secretfetcher	failed load server cert/key pair from secret pipelines-persistence-pipelines-persistence-secret: server cert or private key is empty
2020-11-27T06:55:06.500539Z	warn	secretfetcher	failed load server cert/key pair from secret pipelines-ui-pipelines-ui-secret: server cert or private key is empty
2020-11-27T06:55:06.500584Z	warn	secretfetcher	failed load server cert/key pair from secret minio-minio-secret: server cert or private key is empty
2020-11-27T06:55:06.500648Z	warn	secretfetcher	failed load server cert/key pair from secret pipelines-scheduledworkflow-pipelines-scheduledworkflow-secret: server cert or private key is empty
2020-11-27T06:55:06.500661Z	warn	secretfetcher	failed load server cert/key pair from secret metadata-ui-metadata-ui-secret: server cert or private key is empty
2020-11-27T06:55:06.500672Z	warn	secretfetcher	failed load server cert/key pair from secret kubeflow-dashboard-kubeflow-dashboard-secret: server cert or private key is empty
2020-11-27T06:55:06.524375Z	info	sds	SDS gRPC server for workload UDS starts, listening on "/etc/istio/proxy/SDS" 

2020-11-27T06:55:06.524440Z	info	sds	Start SDS grpc server
2020-11-27T06:55:06.524521Z	info	sds	SDS gRPC server for ingress gateway controller starts, listening on "/var/run/ingress_gateway/sds" 

2020-11-27T06:55:06.524574Z	info	PilotSAN []string{"istio-pilot.kubeflow.svc"}
2020-11-27T06:55:06.524545Z	info	sds	Start SDS grpc server for ingress gateway proxy
2020-11-27T06:55:06.524619Z	info	Starting proxy agent
2020-11-27T06:55:06.524643Z	info	Opening status port 15020

2020-11-27T06:55:06.524677Z	info	Received new config, creating new Envoy epoch 0
2020-11-27T06:55:06.524746Z	info	Epoch 0 starting
2020-11-27T06:55:06.531682Z	info	Envoy command: [-c /etc/istio/proxy/envoy-rev0.json --restart-epoch 0 --drain-time-s 45 --parent-shutdown-time-s 60 --service-cluster istio-ingressgateway --service-node router~10.1.127.80~istio-ingressgateway-6dbfc48ff-brj4g.kubeflow~kubeflow.svc.cluster.local --max-obj-name-len 189 --local-address-ip-version v4 --log-format [Envoy (Epoch 0)] [%Y-%m-%d %T.%e][%t][%l][%n] %v -l warning --component-log-level misc:error]
[Envoy (Epoch 0)] [2020-11-27 06:55:06.663][47][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:91] gRPC config stream closed: 14, no healthy upstream
[Envoy (Epoch 0)] [2020-11-27 06:55:06.663][47][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:54] Unable to establish new stream
2020-11-27T06:55:06.672985Z	info	sds	node:router~10.1.127.80~istio-ingressgateway-6dbfc48ff-brj4g.kubeflow~kubeflow.svc.cluster.local-1 resource:default new connection
2020-11-27T06:55:06.953794Z	info	cache	Root cert has changed, start rotating root cert for SDS clients
2020-11-27T06:55:06.954162Z	info	sds	node:router~10.1.127.80~istio-ingressgateway-6dbfc48ff-brj4g.kubeflow~kubeflow.svc.cluster.local-1 resource:default pushed key/cert pair to proxy
2020-11-27T06:55:06.954186Z	info	sds	node:router~10.1.127.80~istio-ingressgateway-6dbfc48ff-brj4g.kubeflow~kubeflow.svc.cluster.local-1 resource:default pushed secret
2020-11-27T06:55:07.403922Z	info	sds	node:router~10.1.127.80~istio-ingressgateway-6dbfc48ff-brj4g.kubeflow~kubeflow.svc.cluster.local-2 resource:ROOTCA new connection
2020-11-27T06:55:07.404245Z	info	sds	node:router~10.1.127.80~istio-ingressgateway-6dbfc48ff-brj4g.kubeflow~kubeflow.svc.cluster.local-2 resource:ROOTCA pushed root cert to proxy
2020-11-27T06:55:07.404268Z	info	sds	node:router~10.1.127.80~istio-ingressgateway-6dbfc48ff-brj4g.kubeflow~kubeflow.svc.cluster.local-2 resource:ROOTCA pushed secret
[Envoy (Epoch 0)] [2020-11-27 06:55:07.411][47][warning][config] [external/envoy/source/common/protobuf/message_validator_impl.cc:28] Unknown field: type envoy.extensions.filters.http.ext_authz.v3.ExtAuthz reason INVALID_ARGUMENT:(http_service.server_uri) failureModeAllow: Cannot find field.
2020-11-27T06:55:07.599714Z	info	Envoy proxy is ready
2020-11-27T07:22:18.715102Z	info	transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2020-11-27T07:22:18.715194Z	info	pickfirstBalancer: HandleSubConnStateChange: 0xc000844440, {CONNECTING <nil>}
2020-11-27T07:22:18.727227Z	info	pickfirstBalancer: HandleSubConnStateChange: 0xc000844440, {READY <nil>}
[Envoy (Epoch 0)] [2020-11-27 07:26:51.183][47][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:91] gRPC config stream closed: 13, 
2020-11-27T07:49:44.235979Z	info	transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2020-11-27T07:49:44.236053Z	info	pickfirstBalancer: HandleSubConnStateChange: 0xc000844440, {CONNECTING <nil>}
2020-11-27T07:49:44.247639Z	info	pickfirstBalancer: HandleSubConnStateChange: 0xc000844440, {READY <nil>}
[Envoy (Epoch 0)] [2020-11-27 07:57:13.794][47][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:91] gRPC config stream closed: 13, 
2020-11-27T08:20:06.076927Z	info	transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2020-11-27T08:20:06.077008Z	info	pickfirstBalancer: HandleSubConnStateChange: 0xc000844440, {CONNECTING <nil>}
2020-11-27T08:20:06.087898Z	info	pickfirstBalancer: HandleSubConnStateChange: 0xc000844440, {READY <nil>}
[Envoy (Epoch 0)] [2020-11-27 08:24:26.066][47][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:91] gRPC config stream closed: 13, 
2020-11-27T08:48:04.989575Z	info	transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2020-11-27T08:48:04.989590Z	info	pickfirstBalancer: HandleSubConnStateChange: 0xc000844440, {CONNECTING <nil>}
2020-11-27T08:48:05.000068Z	info	pickfirstBalancer: HandleSubConnStateChange: 0xc000844440, {READY <nil>}
[Envoy (Epoch 0)] [2020-11-27 08:54:07.009][47][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:91] gRPC config stream closed: 13, 
2020-11-27T09:17:52.764790Z	info	transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2020-11-27T09:17:52.764862Z	info	pickfirstBalancer: HandleSubConnStateChange: 0xc000844440, {CONNECTING <nil>}
2020-11-27T09:17:52.774860Z	info	pickfirstBalancer: HandleSubConnStateChange: 0xc000844440, {READY <nil>}
[Envoy (Epoch 0)] [2020-11-27 09:26:31.604][47][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:91] gRPC config stream closed: 13, 
2020-11-27T09:49:04.033166Z	info	transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2020-11-27T09:49:04.033244Z	info	pickfirstBalancer: HandleSubConnStateChange: 0xc000844440, {CONNECTING <nil>}
2020-11-27T09:49:04.044356Z	info	pickfirstBalancer: HandleSubConnStateChange: 0xc000844440, {READY <nil>}
[Envoy (Epoch 0)] [2020-11-27 09:56:24.163][47][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:91] gRPC config stream closed: 13, 
2020-11-27T10:19:36.984633Z	info	transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2020-11-27T10:19:36.984718Z	info	pickfirstBalancer: HandleSubConnStateChange: 0xc000844440, {CONNECTING <nil>}
2020-11-27T10:19:36.996377Z	info	pickfirstBalancer: HandleSubConnStateChange: 0xc000844440, {READY <nil>}
[Envoy (Epoch 0)] [2020-11-27 10:27:53.218][47][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:91] gRPC config stream closed: 13,
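
A log of this size is easier to scan once the secretfetcher warnings are pulled out. The sketch below only filters text on standard input with `grep` and `sed`. Both function names are hypothetical, and whether these warnings are actually related to the broken pages is not established in this thread.

```shell
# Sketch only: both helpers are hypothetical and read a gateway log on stdin.

# Print how many "server cert or private key is empty" warnings appear.
count_empty_cert_warnings() {
  grep -c 'secretfetcher.*server cert or private key is empty'
}

# Print the secret name from each of those warnings, one per line.
list_empty_cert_secrets() {
  sed -n 's/.*failed load server cert\/key pair from secret \([^:]*\): server cert or private key is empty.*/\1/p'
}
```

For example: `microk8s kubectl -n kubeflow logs istio-ingressgateway-6dbfc48ff-brj4g | list_empty_cert_secrets`.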


chen98 commented Nov 29, 2020

Build #1830 (latest/edge) still has this issue.
Build #1810 (1.19/candidate) still has the "oidc gatekeeper crash" issue.
Did you find any previous build (1.18, 1.19 or 1.20) where Kubeflow still works?

Thanks.


kosehy commented Nov 29, 2020

> Build #1830 (latest/edge) still has this issue.
> Build #1810 (1.19/candidate) still has the "oidc gatekeeper crash" issue.
> Did you find any previous build (1.18, 1.19 or 1.20) where Kubeflow still works?
>
> Thanks.

@chen98 I will test the previous builds.

mutatingwebhook/validatingwebhook(m/v) issue

mutating_error

The connection to the server 127.0.0.1:16443 was refused issue

server_was_refused

Empty charm patch

empty charm patch

  1.20/beta:        v1.20.0-beta.2  2020-11-20 (1825) 215MB classic 127.0.0.1:16443 issue
  1.20/edge:        v1.20.0-alpha.3 2020-11-05 (1788) 214MB classic 127.0.0.1:16443 issue
  1.19/stable:      v1.19.3         2020-11-12 (1791) 214MB classic m/v issue
  1.19/candidate:   v1.19.4         2020-11-27 (1810) 214MB classic m/v issue
  1.19/beta:        v1.19.4         2020-11-27 (1810) 214MB classic m/v issue
  1.19/edge:        v1.19.4         2020-11-16 (1810) 214MB classic notebook server error
  latest/stable:    v1.19.3         2020-11-10 (1786) 214MB classic m/v issue
  latest/candidate: v1.19.3         2020-11-10 (1786) 214MB classic m/v issue
  latest/beta:      v1.19.3         2020-11-10 (1786) 214MB classic m/v issue
  1.18/stable:      v1.18.9         2020-09-26 (1702) 199MB classic m/v issue
  1.18/candidate:   v1.18.10        2020-11-04 (1777) 199MB classic m/v issue
  1.18/beta:        v1.18.10        2020-11-04 (1777) 199MB classic m/v issue
  1.18/edge:        v1.18.12        2020-11-14 (1807) 199MB classic m/v issue
  1.17/stable:      v1.17.12        2020-09-25 (1701) 177MB classic empty charm patch issue
  1.17/candidate:   v1.17.14        2020-11-12 (1796) 177MB classic empty charm patch issue
  1.17/beta:        v1.17.14        2020-11-12 (1796) 177MB classic empty charm patch issue
  1.17/edge:        v1.17.14        2020-11-11 (1796) 177MB classic empty charm patch issue

@chen98 I tested every channel from 1.17 to latest.
Only latest/edge can enable Kubeflow.

  latest/edge:      v1.19.4         2020-11-27 (1830) 213MB classic can enable kubeflow
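
The channel-by-channel test summarized above could be scripted roughly as follows. This is a sketch under stated assumptions. It reuses the install and enable commands from earlier in this thread, it assumes `microk8s enable kubeflow` exits non-zero on failure, and `try_channel` is a hypothetical helper name. It is destructive because it purges the existing MicroK8s install. A successful enable also does not guarantee that the dashboard pages work.

```shell
# Sketch only: try_channel is a hypothetical helper. DESTRUCTIVE: purges MicroK8s.
# Usage: try_channel CHANNEL
# Prints "<channel>: ok" if Kubeflow could be enabled, "<channel>: failed" otherwise.
try_channel() {
  local channel="$1"
  # Start from a clean state; ignore the result if MicroK8s is not installed.
  sudo snap remove microk8s --purge >/dev/null 2>&1
  if sudo snap install microk8s --classic --channel="$channel" >/dev/null 2>&1 &&
     microk8s enable dns storage gpu istio >/dev/null 2>&1 &&
     microk8s enable kubeflow >/dev/null 2>&1; then
    printf '%s: ok\n' "$channel"
  else
    printf '%s: failed\n' "$channel"
  fi
}
```

For example: `for c in 1.18/stable 1.19/stable latest/edge; do try_channel "$c"; done`.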

danudeep90 commented

@chen98: I installed and enabled Kubeflow on a cloud virtual machine using
sudo snap install microk8s --classic --channel=latest/edge
microk8s.enable dns dashboard storage
microk8s.enable kubeflow

I get a success message saying the Kubeflow dashboard is available at http://localhost.

I set up a SOCKS proxy on port 9999 and can open the Kubeflow page using the service's ClusterIP, but I cannot access the Pipelines and Notebook Servers pages.

Any idea how we can get the Notebook Servers and Pipelines pages working?


chen98 commented Dec 2, 2020

I am still having the same issue as you and @kosehy: I cannot access the Notebook Servers or Pipelines pages. It has been broken since the November releases. I am still waiting for a response from the MicroK8s team. For now I have switched to minikube to use Kubeflow.

danudeep90 commented

@chen98: Are there any older MicroK8s versions that work fine, such as 1.18/stable?


kosehy commented Dec 2, 2020

> I am still having the same issue as you and @kosehy: I cannot access the Notebook Servers or Pipelines pages. It has been broken since the November releases. I am still waiting for a response from the MicroK8s team. For now I have switched to minikube to use Kubeflow.

@chen98 Would you please share some more details about how you switched to minikube to use Kubeflow?
Whenever I tried to install minikube and enable Kubeflow, I got an error and could not enable it.


kosehy commented Dec 2, 2020

> @chen98: Are there any older MicroK8s versions that work fine, such as 1.18/stable?

@danudeep90 I tested the channels from 1.18 to latest, and only latest/edge works.

I will test 1.17 and let you know.
-> Update: 1.17 doesn't work either, because of the empty charm patch issue.


kosehy commented Dec 2, 2020

> @chen98: I installed and enabled kubeflow on a cloud virtual machine using:
> sudo snap install microk8s --classic --channel=latest/edge
> microk8s.enable dns dashboard storage
> microk8s.enable kubeflow
>
> I get a success message saying the kubeflow dashboard is available at http://localhost
>
> I set up a SOCKS proxy on port 9999 and can open the Kubeflow page using the service's ClusterIP, but I cannot access the Pipelines and Notebook Servers pages.
>
> Any idea how we can get the Notebook Servers and Pipelines pages working?

@danudeep90 Would you please share more details about how you set up the SOCKS proxy on port 9999 and opened the Kubeflow page using the ClusterIP?

@danudeep90

@kosehy: Please follow the steps in the video below from 7:55 to 8:55 to set up the SOCKS proxy:
https://www.youtube.com/watch?v=KPEGKKNB63Q&t=557s&ab_channel=celebrateubuntu

After that, you can get the IP of the kubeflow dashboard with the command below:
microk8s kubectl -n kubeflow describe service/kubeflow-dashboard

Enter the IP in your browser and you should see the kubeflow dashboard.
These steps work only for a VM provisioned in the cloud. I am not sure whether similar steps work on a local machine.
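
The steps above can be sketched roughly as follows. This is only a sketch under assumptions: `ssh -D` is one common way to open a SOCKS proxy (the video may do it differently), `user@vm-host` is a placeholder for your VM login, and the helper names `dashboard_cluster_ip` and `socks_proxy_command` are made up for illustration. The `-o jsonpath` query is an alternative to reading the IP out of the `describe` output by eye.

```shell
# Sketch only: helper names and the user@vm-host placeholder are assumptions.

# Print the ClusterIP of the kubeflow-dashboard service (run on the VM).
dashboard_cluster_ip() {
  microk8s kubectl -n kubeflow get service kubeflow-dashboard \
    -o jsonpath='{.spec.clusterIP}'
}

# Print the ssh command that opens a SOCKS proxy on a local port.
# $1 = ssh target, $2 = local port (defaults to 9999, as used in this thread).
socks_proxy_command() {
  printf 'ssh -N -D %s %s\n' "${2:-9999}" "$1"
}

# Example use (commented out so that sourcing this file has no side effects):
#   socks_proxy_command user@vm-host   # run the printed command on your workstation
#   dashboard_cluster_ip               # run on the VM, then open the IP in the browser
```

The browser then has to be configured to use the SOCKS proxy on localhost at the chosen port before the ClusterIP becomes reachable.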

@danudeep90

> @chen98: I installed and enabled kubeflow on a cloud virtual machine using:
> sudo snap install microk8s --classic --channel=latest/edge
> microk8s.enable dns dashboard storage
> microk8s.enable kubeflow
>
> I get a success message saying the kubeflow dashboard is available at http://localhost
>
> I set up a SOCKS proxy on port 9999 and can open the Kubeflow page using the service's ClusterIP, but I cannot access the Pipelines and Notebook Servers pages.
>
> Any idea how we can get the Notebook Servers and Pipelines pages working?

@knkski: Can you help me with this issue?

@chen98

chen98 commented Dec 9, 2020

@kosehy is this issue resolved for you, or is it still there with the new microk8s releases?
I am using miniKF from the GCP marketplace now. The cost is about the same in the cloud, whether it is microk8s or miniKF, in GCP or in AWS. If you run kubeflow locally on your machine: microk8s worked for me until the November releases, but not anymore.

@kosehy
Author

kosehy commented Dec 9, 2020

> @kosehy is this issue resolved for you, or is it still there with the new microk8s releases?
> I am using miniKF from the GCP marketplace now. The cost is about the same in the cloud, whether it is microk8s or miniKF, in GCP or in AWS. If you run kubeflow locally on your machine: microk8s worked for me until the November releases, but not anymore.

@chen98 I will test the versions below on my local Ubuntu machine now and let you know!

mutatingwebhook/validatingwebhook (m/v) issue:

[Screenshot: mutatingwebhook-validatingwebhook issue]

1.19/edge:        v1.19.4         2020-12-08 (1846) 216MB classic classic m/v issue
1.19/edge:        v1.19.5         2020-12-09 (1856) 216MB classic classic m/v issue
latest/edge:      v1.19.4         2020-12-08 (1845) 215MB classic cannot access pipeline and notebook servers
latest/edge:      v1.20.0         2020-12-08 (1861) 219MB classic cannot access notebook servers

1.19/edge: v1.19.5 2020-12-09 (1856) Inspection report
latest/edge: v1.20.0 2020-12-08 (1861) inspection-report
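
The version lines quoted throughout this thread appear to match what `snap info microk8s` prints. For anyone reporting results for another revision, here is a minimal sketch that pulls out just the installed version and revision. `installed_microk8s` is a hypothetical helper name, and the parsing assumes the `installed:` line keeps the layout seen in this thread.

```shell
# Sketch only: assumes `snap info` prints a line such as
#   installed:          v1.20.0             (1861) 219MB classic
# Prints the version and revision fields of that line.
installed_microk8s() {
  snap info microk8s | awk '$1 == "installed:" { print $2, $3 }'
}

# Example use (commented out so that sourcing this file has no side effects):
#   installed_microk8s
```

Including the revision number in a report helps, because several revisions in this thread share the same version string.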

@VibratingPig

Following the instructions above, all pods and services come up, but the webpage is empty.

Ubuntu 20.04 LTS
Microk8s
installed: v1.20.0 (1851) 219MB classic
Inspection report
inspection-report-20201210_072055.tar.gz

@kosehy
Author

kosehy commented Dec 11, 2020

kubeflow/kubeflow#5429 (comment)
Based on @bsamadi's instructions:

microk8s and kubeflow installation
# remove the microk8s
sudo snap remove microk8s --purge

# reinstall microk8s (version v1.20.0  2020-12-10 (1861) 219MB classic)
sudo snap install microk8s --classic --channel=latest/edge
microk8s enable dns dashboard gpu helm3 host-access storage istio
microk8s enable kubeflow

After enabling kubeflow

microk8s juju config dex-auth static-username=admin
microk8s juju config dex-auth static-password=1234

[Screenshot: kubeflow_enable]

kubeflow dashboard login
Go to "http://10.64.140.44.xip.io" and log in with:

id: admin
password: 1234

[Screenshot: kubeflow_dashboard_login]

kubeflow dashboard UI

[Screenshot: kubeflow_dashboard_UI]

Now I can access Pipelines.
However, I still cannot access Notebook Servers.

[Screenshot: notebook_servers_not_working]

Environment:
microk8s version: v1.20.0  2020-12-10 (1861) 219MB classic
OS (e.g. from /etc/os-release): On-prem ubuntu 20.04 LTS

Inspection report
inspection-report-20201211_121455.tar.gz

@kosehy
Author

kosehy commented Dec 13, 2020

kubeflow/kubeflow#5429 (comment)

I tested two versions, v1.20.0 (1864) and v1.20.0 (1873).

Version:
microk8s version: latest/stable v1.20.0 2020-12-12 (1864) 219MB classic
Problems:

  1. kubeflow is not enabled:
microk8s status

[Screenshot: microk8s status output]

  2. I cannot access the Notebook Servers section.

Inspection report:
inspection-report-20201213_010143.tar.gz

Version:
microk8s version: latest/stable v1.20.0 2020-12-11 (1873) 219MB classic
Problems:

  1. kubeflow is not enabled:
microk8s status

[Screenshot: microk8s status output]

  2. I cannot access either the Pipelines or the Notebook Servers section.

Inspection report:
inspection-report-20201213_010207.tar.gz

@danudeep90

@kosehy: Is there a fix on the microk8s side so that everything works, including accessing the dashboard and spinning up notebook servers? If yes, can you tell me how to install microk8s so that everything works?

@VibratingPig

installed: v1.20.0 (1876) 219MB classic

I followed the instructions above. The kubeflow page appears and I can log in, but the page has errors:

  1. When launching, it reports the following JS errors:
    Uncaught TypeError: Cannot read property 'pipelines' of null
    Uncaught TypeError: Cannot read property 'runs' of null
    This may indicate that some underlying services are not running.
  2. Navigating to create a new notebook server, I find this error three times, for runtime, polyfill and main.js:
    Uncaught SyntaxError: Unexpected token '<'

Inspection report
inspection-report-20201215_072743.tar.gz

@VibratingPig

Actually, I've been looking at the bundle that makes up the Kubeflow deployment, specifically the jupyter web app, and the release used looks like it doesn't work. It's early in the morning here, and I'll try to look at this over the next couple of days, but the docker image has inconsistent references to the required JavaScript.

@kosehy
Author

kosehy commented Jan 1, 2021

Finally, I found a great tutorial on how to install kubeflow 1.2 using microk8s.
It covers both the dex and non-dex versions.
how-to-install-kubeflow1.2
kubeflow/kubeflow#5429 (comment)

I tested the 1.18/stable, 1.19/stable, and 1.20/stable versions and can access the kubeflow dashboard, pipelines, and notebook servers.

installed:          v1.18.13            (1854) 199MB classic with kfctl_istio_dex.v1.2.0.yaml
installed:          v1.19.5             (1856) 216MB classic with kfctl_istio_dex.v1.2.0.yaml
installed:          v1.20.0             (1864) 219MB classic with kfctl_istio_dex.v1.2.0.yaml

@kosehy kosehy closed this as completed Jan 4, 2021
@knkski
Contributor

knkski commented Jan 6, 2021

@kosehy: apologies for the slow response. Can you try running microk8s enable kubeflow with the latest version of microk8s?

sudo snap switch microk8s --channel=latest/edge && sudo snap refresh

@kosehy kosehy reopened this Jan 6, 2021
@kosehy
Author

kosehy commented Jan 7, 2021

@knkski I still cannot access the notebook servers.

[Screenshot]

Here is my inspection report tarball:
inspection-report-20210107_045728.tar.gz

@tritran-cotai

I got the same issue as @kosehy when installing with --channel=latest/edge on a fresh Ubuntu 18.0.5.

@burningion

burningion commented Jan 13, 2021

Just adding to the conversation, I also tried installing --channel=latest/edge, and also don't have access to the notebook servers.

I'm running on 18.04, and had to manually set a username and password with:

$ microk8s juju config dex-auth static-username=my_username
$ microk8s juju config dex-auth static-password=my_password

It seems the username and password chosen during microk8s enable kubeflow don't actually get set, and end up as empty strings for me.
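
A quick way to check whether the credentials were stored is to read the values back. This is a minimal sketch, assuming that `juju config <application> <key>` prints the current value of that key. The helper names `dex_auth_value` and `dex_credentials_set` are made up for illustration.

```shell
# Sketch only: helper names are hypothetical.

# Print the current value of one dex-auth config key.
dex_auth_value() {
  microk8s juju config dex-auth "$1"
}

# Succeed (exit 0) only if both static credentials are non-empty.
dex_credentials_set() {
  [ -n "$(dex_auth_value static-username)" ] &&
    [ -n "$(dex_auth_value static-password)" ]
}

# Example use (commented out so that sourcing this file has no side effects):
#   dex_credentials_set || echo "dex-auth credentials are empty, set them manually"
```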

From what I can tell, it appears as if the JS and CSS assets aren't being served from the /jupyter routes as expected. Instead, it's the same HTML being served for each compiled JS and CSS asset.
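
This would also explain the `Unexpected token '<'` errors reported earlier in the thread: a script URL that returns an HTML page starts with `<`. Here is a minimal sketch for checking a response body from the command line. `looks_like_html` is a hypothetical helper, the asset URL is a placeholder, and because the dashboard sits behind a login, the request probably needs your session cookie to be meaningful.

```shell
# Sketch only: the helper name and ASSET_URL are assumptions.

# Succeed (exit 0) if the text on stdin starts with '<' after leading
# whitespace, i.e. it looks like HTML rather than JavaScript or CSS.
looks_like_html() {
  first=$(tr -d ' \t\r\n' | cut -c1)
  [ "$first" = "<" ]
}

# Example use (commented out so that sourcing this file has no side effects):
#   curl -s -b cookies.txt "$ASSET_URL" | looks_like_html \
#     && echo "asset URL is returning HTML"
```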

@knkski
Contributor

knkski commented Jan 20, 2021

@kosehy, @tritran-cotai, @burningion: Sorry about the issues with this. I've got #1919 merged, which should fix these issues. Can you try it again after making sure to refresh the snap (sudo snap refresh microk8s --channel latest/edge)? I've also added some functional UI tests that will ensure we don't get regressions on this.

@kosehy
Author

kosehy commented Jan 20, 2021

> sudo snap refresh microk8s --channel latest/edge

@knkski I will try!

@chen98

chen98 commented Jan 20, 2021

@knkski working like a charm, both notebook servers and pipelines. Thank you very much.

@kosehy
Author

kosehy commented Jan 21, 2021

@knkski works great!!

@tritran-cotai

> @kosehy, @tritran-cotai, @burningion: Sorry about the issues with this. I've got #1919 merged, which should fix these issues. Can you try it again after making sure to refresh the snap (sudo snap refresh microk8s --channel latest/edge)? I've also added some functional UI tests that will ensure we don't get regressions on this.

@knkski Fantastic !!!

@kosehy kosehy closed this as completed Feb 4, 2021
@danudeep90

danudeep90 commented Feb 8, 2021

@kosehy : I tried doing a fresh installation using

sudo snap remove microk8s --purge
sudo snap install microk8s --classic --channel=latest/edge

Enabling kubeflow is taking a long time, and it still shows "waited 4375s for operator pods to come up".

Can you share fresh installation instructions so that kubeflow is deployed properly?

@knkski: Can you help me with this?

@knkski
Contributor

knkski commented Feb 8, 2021

@danudeep90: can you create a new issue with the output from microk8s enable kubeflow --debug?
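
One way to capture that output for a new issue is to pipe it through `tee`, so it is shown on screen and saved to a file at the same time. This is a minimal sketch. The helper name `capture_kubeflow_debug` and the default log file name are arbitrary choices, not anything microk8s prescribes.

```shell
# Sketch only: helper name and default log file name are assumptions.

# Run `microk8s enable kubeflow --debug`, showing stdout and stderr on screen
# and saving a copy to a log file.
# $1 = log file path (defaults to kubeflow-enable-debug.log)
capture_kubeflow_debug() {
  log="${1:-kubeflow-enable-debug.log}"
  microk8s enable kubeflow --debug 2>&1 | tee "$log"
}

# Example use (commented out so that sourcing this file has no side effects):
#   capture_kubeflow_debug kubeflow-debug.log
```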

@danudeep90

@knkski: Created a new issue as suggested: #1995
