Unable to start Che on Microsoft Azure (AKS) - Exception while retrieving OpenId configuration from endpoint #17760

Closed
4 of 22 tasks
desaiRahulS opened this issue Aug 31, 2020 · 10 comments
Labels
area/install Issues related to installation, including offline/air gap and initial setup kind/question Questions that haven't been identified as being feature requests or bugs.

Comments

@desaiRahulS

desaiRahulS commented Aug 31, 2020

Describe the bug

I am very new to Eclipse Che and wanted to install it on Azure, so I followed the steps documented here: https://www.eclipse.org/che/docs/che-7/overview/installing-che-on-microsoft-azure/
I got through every step except the last one, which installs Che:
chectl server:start --installer=helm --platform=k8s --domain=cheide.site --multiuser

› Current Kubernetes context: 'eclipse-che'
√ Verify Kubernetes API...OK
√ Looking for an already existing Eclipse Che instance
√ Verify if Eclipse Che is deployed into namespace "che"...it is not
✈️ Kubernetes preflight checklist
√ Verify if kubectl is installed
√ Verify remote kubernetes status...done.
√ Check Kubernetes version: Found v1.16.13.
√ Verify domain is set...set to cheide.site.
↓ Check if cluster accessible [skipped]
Eclipse Che logs will be available in 'C:\Users\rahul\AppData\Local\Temp\chectl-logs\1598889415391'
√ Start following logs
↓ Start following Operator logs [skipped]
√ Start following Eclipse Che logs...done
√ Start following Postgres logs...done
√ Start following Keycloak logs...done
√ Start following Plugin registry logs...done
√ Start following Devfile registry logs...done
√ Start following events
√ Start following namespace events...done
√ Running Helm to install Eclipse Che
√ Verify if helm is installed
√ Check Helm Version: Found v2.16.10+gbceca24
√ Create Namespace (che)...does already exist.
√ Check Eclipse Che TLS certificate...TLS certificate secret found
√ Create Tiller Role Binding...it already exists.
√ Create Tiller Service Account...it already exists.
√ Create Tiller RBAC
√ Create Tiller Service...it already exists.
√ Preparing Eclipse Che Helm Chart...done.
√ Updating Helm Chart dependencies...done.
√ Deploying Eclipse Che Helm Chart...done.
✅ Post installation checklist
√ PostgreSQL pod bootstrap
√ scheduling...done.
√ downloading images...done.
√ starting...done.
√ Devfile registry pod bootstrap
√ scheduling...done.
√ downloading images...done.
√ starting...done.
√ Plugin registry pod bootstrap
√ scheduling...done.
√ downloading images...done.
√ starting...done.
> Eclipse Che pod bootstrap
√ scheduling...done.
√ downloading images...done.
× starting
→ ERR_TIMEOUT: Timeout set to pod ready timeout 130000
Retrieving Eclipse Che server URL
Eclipse Che status check
Show important messages
» Error: Error: ERR_TIMEOUT: Timeout set to pod ready timeout 130000
» Installation failed, check logs in 'C:\Users\rahul\AppData\Local\Temp\chectl-logs\1598889415391'

Logs from che.log
Caused by: java.lang.RuntimeException: Exception while retrieving OpenId configuration from endpoint: https://keycloak-che.cheide.site/auth/realms/che/.well-known/openid-configuration
	at org.eclipse.che.multiuser.keycloak.server.KeycloakSettings.<init>(KeycloakSettings.java:103)
	at org.eclipse.che.multiuser.keycloak.server.KeycloakSettings$$FastClassByGuice$$e0d0786b.newInstance(<generated>)
	at com.google.inject.internal.DefaultConstructionProxyFactory$FastClassProxy.newInstance(DefaultConstructionProxyFactory.java:89)
	at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:114)
	at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91)
	at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:306)
	at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
	at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:168)
....................
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
	at java.base/sun.security.validator.PKIXValidator.doBuild(Unknown Source)
	at java.base/sun.security.validator.PKIXValidator.engineValidate(Unknown Source)
	at java.base/sun.security.validator.Validator.validate(Unknown Source)
	at java.base/sun.security.ssl.X509TrustManagerImpl.validate(Unknown Source)
	at java.base/sun.security.ssl.X509TrustManagerImpl.checkTrusted(Unknown Source)
	at java.base/sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(Unknown Source)
	... 116 more
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
	at java.base/sun.security.provider.certpath.SunCertPathBuilder.build(Unknown Source)
	at java.base/sun.security.provider.certpath.SunCertPathBuilder.engineBuild(Unknown Source)
	at java.base/java.security.cert.CertPathBuilder.build(Unknown Source)
	... 122 more

The URL https://keycloak-che.cheide.site/auth/realms/che/.well-known/openid-configuration is accessible from the browser.

Che version

  • latest
  • nightly
  • other: please specify

Steps to reproduce

Follow the steps here https://www.eclipse.org/che/docs/che-7/overview/installing-che-on-microsoft-azure/

Expected behavior

Che installed in Microsoft Azure

Runtime

  • kubernetes (include output of kubectl version)
    Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.0", GitCommit:"e19964183377d0ec2052d1f1fa930c4d7575bd50", GitTreeState:"clean", BuildDate:"2020-08-26T14:30:33Z", GoVersion:"go1.15", Compiler:"gc", Platform:"windows/amd64"}
    Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.13", GitCommit:"1da71a35d52fa82847fd61c3db20c4f95d283977", GitTreeState:"clean", BuildDate:"2020-07-15T21:59:26Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
  • Openshift (include output of oc version)
  • minikube (include output of minikube version and kubectl version)
  • minishift (include output of minishift version and oc version)
  • docker-desktop + K8S (include output of docker version and kubectl version)
  • other: (please specify)

Screenshots

Installation method

  • chectl
    chectl server:start --installer=helm --platform=k8s --domain=cheide.site --multiuser
    • provide a full command that was used to deploy Eclipse Che (including the output)
    • provide an output of chectl version command
  • OperatorHub
  • I don't know

Environment

  • my computer
    • Windows
    • Linux
    • macOS
  • Cloud
    • Amazon
    • Azure
    • GCE
    • other (please specify)
  • other: please specify

Eclipse Che Logs

che.log
keycloak.log
events.txt

Additional context

kubectl describe certificate/che-tls -n che

Events:
  Type    Reason        Age  From          Message
  ----    ------        ---  ----          -------
  Normal  GeneratedKey  36m  cert-manager  Generated a new private key
  Normal  Requested     36m  cert-manager  Created new CertificateRequest resource "che-tls-2540347572"
  Normal  Issued        35m  cert-manager  Certificate issued successfully

@desaiRahulS desaiRahulS added the kind/bug Outline of a bug - must adhere to the bug report template. label Aug 31, 2020
@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Aug 31, 2020
@desaiRahulS
Author

Apologies for the formatting issues.
chectl version
chectl/0.0.20200828-next.3ba7a8c win32-x64 node-v10.22.0

@tolusha
Contributor

tolusha commented Sep 1, 2020

@desaiRahulS
Did you configure dnsNames correctly when creating the certificate?

Let's check.

  1. Grab the public part of the certificate with this command (or any equivalent): oc get secret che-tls -n che -o json | jq '.data["tls.crt"]' | tr -d '"' | base64 -d
  2. Decode the certificate with openssl x509 -text -noout
  3. Check the X509v3 Subject Alternative Name attribute.
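
One thing to keep in mind when reading the Subject Alternative Name: a wildcard entry only covers a single extra label, so an entry like *.cheide.site would cover keycloak-che.cheide.site but not cheide.site itself. Below is a minimal Python sketch of that rule, simplified from the usual wildcard-matching behaviour. san_matches is a hypothetical helper for illustration only, not part of chectl or openssl.

```python
def san_matches(hostname: str, dns_name: str) -> bool:
    """Check a hostname against one SAN dNSName entry.

    Simplified rule (an assumption for this sketch): a '*' is only
    honoured as the whole left-most label and stands for exactly one label.
    """
    host_labels = hostname.lower().rstrip(".").split(".")
    name_labels = dns_name.lower().rstrip(".").split(".")
    # A wildcard never spans several labels, so the label counts must agree.
    if len(host_labels) != len(name_labels):
        return False
    if name_labels[0] == "*":
        # The wildcard replaces one non-empty label; the rest must be equal.
        return bool(host_labels[0]) and host_labels[1:] == name_labels[1:]
    return host_labels == name_labels


if __name__ == "__main__":
    for host in ("keycloak-che.cheide.site", "che-che.cheide.site", "cheide.site"):
        print(host, san_matches(host, "*.cheide.site"))
```

If the hostname the Che server calls is not covered by any SAN entry, the certificate would be rejected regardless of who signed it.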

@tolusha tolusha added area/install Issues related to installation, including offline/air gap and initial setup kind/question Questions that haven't been identified as being feature requests or bugs. labels Sep 1, 2020
@skabashnyuk skabashnyuk removed kind/bug Outline of a bug - must adhere to the bug report template. status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. labels Sep 1, 2020
@desaiRahulS
Author

desaiRahulS commented Sep 1, 2020

Thank you for your response. Here are the details you have asked for:

  1. Grab public part of the certificate (or any other command) oc get secret che-tls -n che -o json | jq '.data["tls.crt"]' | tr -d '"' | base64 -d
    This was saved to a .crt file

  2. Use command openssl x509 -text -noout to decode the certificate

  3. Check attribute X509v3 Subject Alternative Name:
    openssl x509 -noout -text -in mycert.crt | grep DNS:
    DNS:*.cheide.site

X509v3 Subject Alternative Name:
DNS:*.cheide.site

The certificate was generated by following this url https://cert-manager.io/docs/tutorials/acme/dns-validation/

cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: che-tls
  namespace: che
spec:
  secretName: che-tls
  dnsNames:
    - '*.cheide.site'
  issuerRef:
    name: che-certificate-issuer
    kind: ClusterIssuer
EOF

The domain "cheide.site" is registered with GoDaddy

@tolusha
Contributor

tolusha commented Sep 1, 2020

Do you have a self-signed-certificate secret in the che namespace? I guess not.

I see that the server certificate is signed by an intermediate certificate.

  1. So, create a file ca.crt.
  2. Put the certificate chain of trust of your website into this file (from the intermediate to the root).
  3. Create a secret: kubectl create secret generic self-signed-certificate --from-file=ca.crt=./ca.crt -n che
  4. Scale the che-server and keycloak deployments down and back up to propagate the CA certificates to the components:
    kubectl scale deployment keycloak --replicas=0 -n che
    kubectl scale deployment keycloak --replicas=1 -n che
    kubectl scale deployment che --replicas=0 -n che
    kubectl scale deployment che --replicas=1 -n che
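
For step 2, here is a rough sketch of how the CA part could be pulled out of a PEM bundle. It assumes the bundle lists the server (leaf) certificate first, followed by the intermediate(s). split_pem_bundle and ca_chain are hypothetical names, and if the bundle does not contain the root certificate, it would still need to be appended by hand.

```python
import re
from typing import List

# Matches one PEM-encoded certificate, including its BEGIN/END markers.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def split_pem_bundle(bundle: str) -> List[str]:
    """Return every certificate block found in the bundle, in order."""
    return PEM_BLOCK.findall(bundle)


def ca_chain(bundle: str) -> str:
    """Drop the first (assumed leaf) certificate and keep the rest."""
    certs = split_pem_bundle(bundle)
    issuers = certs[1:]
    # End with a newline so the result can be written to a file as-is.
    return "\n".join(issuers) + ("\n" if issuers else "")


if __name__ == "__main__":
    # Placeholder bodies, not real certificates.
    demo = (
        "-----BEGIN CERTIFICATE-----\nLEAF\n-----END CERTIFICATE-----\n"
        "-----BEGIN CERTIFICATE-----\nINTERMEDIATE\n-----END CERTIFICATE-----\n"
    )
    print(ca_chain(demo), end="")
```

The output of ca_chain would be what gets written to ca.crt before creating the secret in step 3.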

@desaiRahulS
Author

desaiRahulS commented Sep 1, 2020

@tolusha, I am using Let's Encrypt (https://letsencrypt.org/docs/challenge-types/) for certificate management.

Could you please shed some light on the expected outcome once these 4 steps are done?
Not sure if this helps, but here is some additional information:

kubectl describe certificate/che-tls -n che

Name:         che-tls
Namespace:    che
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"cert-manager.io/v1alpha2","kind":"Certificate","metadata":{"annotations":{},"name":"che-tls","namespace":"che"},"spec":{"dn...
API Version:  cert-manager.io/v1alpha3
Kind:         Certificate
Metadata:
  Creation Timestamp:  2020-08-31T15:49:22Z
  Generation:          1
  Resource Version:    2858
  Self Link:           /apis/cert-manager.io/v1alpha3/namespaces/che/certificates/che-tls
  UID:                 b5090632-21c8-44a0-964d-eb683c67d00b
Spec:
  Dns Names:
    *.cheide.site
  Issuer Ref:
    Kind:       ClusterIssuer
    Name:       che-certificate-issuer
  Secret Name:  che-tls
Status:
  Conditions:
    Last Transition Time:  2020-08-31T15:50:29Z
    Message:               Certificate is up to date and has not expired
    Reason:                Ready
    Status:                True
    Type:                  Ready
  Not After:               2020-11-29T14:50:28Z
Events:                    <none>

@desaiRahulS
Author

desaiRahulS commented Sep 1, 2020

OK, it looks like I am past that issue. I realized that I was using the staging ACME environment URL instead of the production one.
Modifying the acme server url to https://acme-v02.api.letsencrypt.org/directory has let me pass this issue.
I can access the Keycloak URL at https://keycloak-che.cheide.site/auth/

However, the Eclipse Che install still fails with a timeout.

`2020-09-01 17:45:48,025[ost-startStop-1] [INFO ] [o.e.c.a.w.s.WorkspaceRuntimes 175] - Configured factories for environments: '[kubernetes, no-environment]'
2020-09-01 17:45:48,026[ost-startStop-1] [INFO ] [o.e.c.a.w.s.WorkspaceRuntimes 176] - Registered infrastructure 'kubernetes'
2020-09-01 17:45:48,096[ost-startStop-1] [INFO ] [o.e.c.a.w.s.WorkspaceRuntimes 701] - Infrastructure is tracking 0 active runtimes
2020-09-01 17:45:48,218[ost-startStop-1] [INFO ] [o.e.c.a.c.u.ApiInfoLogInformer 36] - Eclipse Che Api Core: Build info '7.19.0-SNAPSHOT' scmRevision '7fe641196a005f14e2cdb984156efe9189044374' implementationVersion '7.19.0-SNAPSHOT'
2020-09-01 17:45:48,256[ost-startStop-1] [WARN ] [p.s.AdminPermissionInitializer 69] - Admin admin not found yet.
2020-09-01 17:45:48,417[ost-startStop-1] [ERROR] [o.a.c.c.C.[.[localhost].[/api] 175] - Exception sending context initialized event to listener instance of class [org.eclipse.che.inject.CheBootstrap]
com.google.inject.CreationException: Unable to create injector, see the following errors:

  1. Error injecting constructor, org.eclipse.che.api.workspace.server.spi.InfrastructureException: Neither KUBERNETES_NAMESPACE nor POD_NAMESPACE is defined. Unable to determine Che installation location
    at org.eclipse.che.workspace.infrastructure.kubernetes.namespace.CheNamespace.<init>(CheNamespace.java:54)
    at org.eclipse.che.workspace.infrastructure.kubernetes.namespace.CheNamespace.class(CheNamespace.java:54)
    while locating org.eclipse.che.workspace.infrastructure.kubernetes.namespace.CheNamespace

1 error
at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:543)
at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:186)
at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:109)
at com.google.inject.Guice.createInjector(Guice.java:87)
at org.everrest.guice.servlet.EverrestGuiceContextListener.getInjector(EverrestGuiceContextListener.java:141)
at com.google.inject.servlet.GuiceServletContextListener.contextInitialized(GuiceServletContextListener.java:45)
at org.everrest.guice.servlet.EverrestGuiceContextListener.contextInitialized(EverrestGuiceContextListener.java:86)
at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4689)
at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5155)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:743)
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:719)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:705)
at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:970)
at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1840)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: org.eclipse.che.api.workspace.server.spi.InfrastructureException: Neither KUBERNETES_NAMESPACE nor POD_NAMESPACE is defined. Unable to determine Che installation location
at org.eclipse.che.workspace.infrastructure.kubernetes.environment.CheInstallationLocation.getInstallationLocationNamespace(CheInstallationLocation.java:44)
at org.eclipse.che.workspace.infrastructure.kubernetes.namespace.CheNamespace.<init>(CheNamespace.java:55)
at org.eclipse.che.workspace.infrastructure.kubernetes.namespace.CheNamespace$$FastClassByGuice$$2cecf5a3.newInstance(<generated>)
at com.google.inject.internal.DefaultConstructionProxyFactory$FastClassProxy.newInstance(DefaultConstructionProxyFactory.java:89)
at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:114)
at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91)
at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:306)
at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:168)
at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:39)
at com.google.inject.internal.InternalInjectorCreator.loadEagerSingletons(InternalInjectorCreator.java:211)
at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:182)
... 18 common frames omitted`
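
Going by the error message, the server appears to look for its own namespace in two environment variables and gives up when neither is set. The following is only a rough Python sketch of that kind of lookup, not Che's actual code (per the trace, that lives in CheInstallationLocation.java). The precedence order, the InfrastructureError class and the installation_namespace name are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional


class InfrastructureError(Exception):
    """Hypothetical stand-in for Che's InfrastructureException."""


def installation_namespace(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first non-empty namespace variable, or raise.

    Assumption: KUBERNETES_NAMESPACE is checked before POD_NAMESPACE.
    """
    env = os.environ if env is None else env
    for var in ("KUBERNETES_NAMESPACE", "POD_NAMESPACE"):
        value = env.get(var, "").strip()
        if value:
            return value
    raise InfrastructureError(
        "Neither KUBERNETES_NAMESPACE nor POD_NAMESPACE is defined. "
        "Unable to determine Che installation location"
    )


if __name__ == "__main__":
    # With only POD_NAMESPACE set, the lookup falls through to it.
    print(installation_namespace({"POD_NAMESPACE": "che"}))
```

Read this way, the failure would mean that the che pod was started without either variable in its environment.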

@tolusha
Contributor

tolusha commented Sep 2, 2020

@desaiRahulS
We are working on this issue right now
#17766

Modifying the acme server url to https://acme-v02.api.letsencrypt.org/directory has let me pass this issue.

Yes, I see that the server certificate is now signed by Let's Encrypt, which doesn't require creating the self-signed-certificate secret.
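
For anyone landing here with the same PKIX error: certificates from the Let's Encrypt staging environment are issued by a test CA that clients do not trust by default, which fits the failure reported above. A minimal sketch of a sanity check on the ACME server URL configured in the issuer follows. classify_acme_server is a hypothetical helper, and only the two Let's Encrypt ACME v2 hosts are recognised.

```python
from urllib.parse import urlparse

# Let's Encrypt ACME v2 directory hosts.
PRODUCTION_HOST = "acme-v02.api.letsencrypt.org"
STAGING_HOST = "acme-staging-v02.api.letsencrypt.org"


def classify_acme_server(url: str) -> str:
    """Return 'production', 'staging', or 'unknown' for an ACME directory URL."""
    host = (urlparse(url).hostname or "").lower()
    if host == PRODUCTION_HOST:
        return "production"
    if host == STAGING_HOST:
        return "staging"
    return "unknown"


if __name__ == "__main__":
    print(classify_acme_server("https://acme-v02.api.letsencrypt.org/directory"))
```

A result of "staging" would suggest switching the issuer to the production URL and re-issuing the certificate before starting Che again.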

@tolusha
Contributor

tolusha commented Sep 2, 2020

We have an issue with the latest nightly version.
But it is still possible to use the latest stable version.
So please:

  1. chectl update stable
  2. chectl server:delete
  3. chectl server:start .....

@desaiRahulS
Author

desaiRahulS commented Sep 2, 2020

Thanks @tolusha. The installation worked!!

√ Retrieving Eclipse Che server URL... https://che-che.cheide.site
√ Eclipse Che status check
√ Show important messages
√ Autogenerated Keycloak credentials are: "admin:<....>"
Command server:start has completed successfully.

I do have one follow-up question on factories. The version that got installed is 7.18.0, and it gives me an option to create factories; however, I do not see any formal documentation for it. I get a 404 at https://www.eclipse.org/che/docs/factories-getting-started.html
I created a free account at che.openshift.io, which is also version 7.18.0 but does not give me an option to create factories.
Is this feature deprecated?

@tolusha tolusha mentioned this issue Sep 3, 2020
58 tasks
@tolusha tolusha closed this as completed Sep 3, 2020