
Multi-user Che fails to start on k8s using helm #13625

Closed
z0r0 opened this issue Jun 25, 2019 · 23 comments
Assignees
Labels
kind/bug Outline of a bug - must adhere to the bug report template. severity/blocker Causes system to crash and be non-recoverable or prevents Che developers from working on Che code.
Milestone

Comments

@z0r0

z0r0 commented Jun 25, 2019

Description

Multi-user Che fails to complete installation when using minikube and the chectl operator.

Reproduction Steps

chectl server:start --multiuser
...

    > Che pod bootstrap
      √ scheduling...done.
      √ downloading images...done.
      × starting
        → ERR_TIMEOUT: Timeout set to pod ready timeout 130000
      Retrieving Che Server URL
      Che status check
Error: ERR_TIMEOUT: Timeout set to pod ready timeout 130000
    at KubeHelper.<anonymous> (C:/snapshot/chectl/lib/api/kube.js:0:0)
    at Generator.next (<anonymous>)
    at fulfilled (C:/snapshot/chectl/node_modules/tslib/tslib.js:107:62)

OS and version:
Tried on both CentOS and Windows, running the latest chectl:
chectl/0.0.2-33a6fa1 win32-x64 node-v10.4.1

Diagnostics:
Error in the che pod logs:
Using embedded assembly...
2019-06-25 14:05:41,894[main]             [INFO ] [o.a.c.s.VersionLoggerListener 89]    - Server version:        Apache Tomcat/8.5.35
2019-06-25 14:05:41,896[main]             [INFO ] [o.a.c.s.VersionLoggerListener 91]    - Server built:          Nov 3 2018 17:39:20 UTC
2019-06-25 14:05:41,897[main]             [INFO ] [o.a.c.s.VersionLoggerListener 93]    - Server number:         8.5.35.0
2019-06-25 14:05:41,897[main]             [INFO ] [o.a.c.s.VersionLoggerListener 95]    - OS Name:               Linux
2019-06-25 14:05:41,897[main]             [INFO ] [o.a.c.s.VersionLoggerListener 97]    - OS Version:            4.15.0
2019-06-25 14:05:41,897[main]             [INFO ] [o.a.c.s.VersionLoggerListener 99]    - Architecture:          amd64
2019-06-25 14:05:41,898[main]             [INFO ] [o.a.c.s.VersionLoggerListener 101]   - Java Home:             /usr/lib/jvm/java-1.8-openjdk/jre
2019-06-25 14:05:41,898[main]             [INFO ] [o.a.c.s.VersionLoggerListener 103]   - JVM Version:           1.8.0_191-b12
2019-06-25 14:05:41,898[main]             [INFO ] [o.a.c.s.VersionLoggerListener 105]   - JVM Vendor:            Oracle Corporation
2019-06-25 14:05:41,898[main]             [INFO ] [o.a.c.s.VersionLoggerListener 107]   - CATALINA_BASE:         /home/user/eclipse-che/tomcat
2019-06-25 14:05:41,899[main]             [INFO ] [o.a.c.s.VersionLoggerListener 109]   - CATALINA_HOME:         /home/user/eclipse-che/tomcat
2019-06-25 14:05:41,899[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Djava.util.logging.config.file=/home/user/eclipse-che//tomcat/conf/logging.properties
2019-06-25 14:05:41,899[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
2019-06-25 14:05:41,899[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -XX:MaxRAMFraction=2
2019-06-25 14:05:41,899[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -XX:+UseParallelGC
2019-06-25 14:05:41,900[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -XX:MinHeapFreeRatio=10
2019-06-25 14:05:41,900[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -XX:MaxHeapFreeRatio=20
2019-06-25 14:05:41,900[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -XX:GCTimeRatio=4
2019-06-25 14:05:41,900[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -XX:AdaptiveSizePolicyWeight=90
2019-06-25 14:05:41,900[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -XX:+UnlockExperimentalVMOptions
2019-06-25 14:05:41,901[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -XX:+UseCGroupMemoryLimitForHeap
2019-06-25 14:05:41,901[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Dsun.zip.disableMemoryMapping=true
2019-06-25 14:05:41,901[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Xms20m
2019-06-25 14:05:41,901[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Dche.docker.network=bridge
2019-06-25 14:05:41,901[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Dport.http=8080
2019-06-25 14:05:41,901[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Dche.home=/home/user/eclipse-che/
2019-06-25 14:05:41,902[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Dche.logs.dir=/logs/
2019-06-25 14:05:41,902[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Dche.logs.level=INFO
2019-06-25 14:05:41,902[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Djuli-logback.configurationFile=file:/home/user/eclipse-che//tomcat/conf/tomcat-logger.xml
2019-06-25 14:05:41,902[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Djdk.tls.ephemeralDHKeySize=2048
2019-06-25 14:05:41,902[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Djava.protocol.handler.pkgs=org.apache.catalina.webresources
2019-06-25 14:05:41,903[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Dorg.apache.catalina.security.SecurityListener.UMASK=0022
2019-06-25 14:05:41,903[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Dcom.sun.management.jmxremote
2019-06-25 14:05:41,903[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Dcom.sun.management.jmxremote.ssl=false
2019-06-25 14:05:41,904[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Dcom.sun.management.jmxremote.authenticate=false
2019-06-25 14:05:41,904[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Dche.local.conf.dir=/home/user/eclipse-che//tomcat/conf/
2019-06-25 14:05:41,904[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Dignore.endorsed.dirs=
2019-06-25 14:05:41,904[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Dcatalina.base=/home/user/eclipse-che//tomcat
2019-06-25 14:05:41,905[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Dcatalina.home=/home/user/eclipse-che//tomcat
2019-06-25 14:05:41,905[main]             [INFO ] [o.a.c.s.VersionLoggerListener 115]   - Command line argument: -Djava.io.tmpdir=/home/user/eclipse-che//tomcat/temp
2019-06-25 14:05:41,989[main]             [INFO ] [o.a.c.http11.Http11NioProtocol 560]  - Initializing ProtocolHandler ["http-nio-8080"]
2019-06-25 14:05:41,997[main]             [INFO ] [o.a.t.util.net.NioSelectorPool 67]   - Using a shared selector for servlet write/read
2019-06-25 14:05:42,005[main]             [INFO ] [o.a.catalina.startup.Catalina 649]   - Initialization processed in 321 ms
2019-06-25 14:05:42,017[main]             [INFO ] [c.m.JmxRemoteLifecycleListener 336]  - The JMX Remote Listener has configured the registry on port [32001] and the server on port [32101] for the [Platform] server
2019-06-25 14:05:42,017[main]             [INFO ] [o.a.c.core.StandardService 416]      - Starting service [Catalina]
2019-06-25 14:05:42,017[main]             [INFO ] [o.a.c.core.StandardEngine 259]       - Starting Servlet Engine: Apache Tomcat/8.5.35
2019-06-25 14:05:42,449[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 957]        - Deploying web application archive [/home/user/eclipse-che/tomcat/webapps/swagger.war]
2019-06-25 14:05:42,539[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 1020]       - Deployment of web application archive [/home/user/eclipse-che/tomcat/webapps/swagger.war] has finished in [89] ms
2019-06-25 14:05:42,540[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 957]        - Deploying web application archive [/home/user/eclipse-che/tomcat/webapps/api.war]
2019-06-25 14:05:43,417[ost-startStop-1]  [INFO ] [.e.c.c.d.JNDIDataSourceFactory 63]   - This=org.eclipse.che.core.db.postgresql.PostgreSQLJndiDataSourceFactory@5724fcab obj=ResourceRef[className=javax.sql.DataSource,factoryClassLocation=null,factoryClassName=org.apache.naming.factory.ResourceFactory,{type=scope,content=Shareable},{type=auth,content=Container},{type=singleton,content=true},{type=factory,content=org.eclipse.che.api.CommonJndiDataSourceFactory}] name=che Context=org.apache.naming.NamingContext@6a839d1d environment={}
2019-06-25 14:05:45,620[ost-startStop-1]  [ERROR] [o.a.c.c.C.[.[localhost].[/api] 4798] - Exception sending context initialized event to listener instance of class [org.eclipse.che.inject.CheBootstrap]
com.google.inject.CreationException: Unable to create injector, see the following errors:

1) No implementation for java.util.Map<java.lang.String, java.util.Set<java.lang.String>> annotated with @com.google.inject.name.Named(value=allowedEnvironmentTypeUpgrades) was bound.
  while locating java.util.Map<java.lang.String, java.util.Set<java.lang.String>> annotated with @com.google.inject.name.Named(value=allowedEnvironmentTypeUpgrades)
    for the 2nd parameter of org.eclipse.che.workspace.infrastructure.kubernetes.devfile.KubernetesEnvironmentProvisioner.<init>(KubernetesEnvironmentProvisioner.java:59)
  while locating org.eclipse.che.workspace.infrastructure.kubernetes.devfile.KubernetesEnvironmentProvisioner
    for the 2nd parameter of org.eclipse.che.workspace.infrastructure.kubernetes.devfile.KubernetesComponentToWorkspaceApplier.<init>(KubernetesComponentToWorkspaceApplier.java:66)
  at org.eclipse.che.workspace.infrastructure.kubernetes.KubernetesInfraModule.lambda$configure$1(KubernetesInfraModule.java:194) (via modules: org.eclipse.che.api.deploy.WsMasterModule -> org.eclipse.che.workspace.infrastructure.kubernetes.KubernetesInfraModule)

2) No implementation for java.util.Map<java.lang.String, java.util.Set<java.lang.String>> annotated with @com.google.inject.name.Named(value=allowedEnvironmentTypeUpgrades) was bound.
  while locating java.util.Map<java.lang.String, java.util.Set<java.lang.String>> annotated with @com.google.inject.name.Named(value=allowedEnvironmentTypeUpgrades)
    for the 2nd parameter of org.eclipse.che.workspace.infrastructure.kubernetes.devfile.KubernetesEnvironmentProvisioner.<init>(KubernetesEnvironmentProvisioner.java:59)
  while locating org.eclipse.che.workspace.infrastructure.kubernetes.devfile.KubernetesEnvironmentProvisioner
    for the 2nd parameter of org.eclipse.che.workspace.infrastructure.kubernetes.devfile.DockerimageComponentToWorkspaceApplier.<init>(DockerimageComponentToWorkspaceApplier.java:78)
  at org.eclipse.che.workspace.infrastructure.kubernetes.KubernetesInfraModule.lambda$configure$1(KubernetesInfraModule.java:197) (via modules: org.eclipse.che.api.deploy.WsMasterModule -> org.eclipse.che.workspace.infrastructure.kubernetes.KubernetesInfraModule)

2 errors
        at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:543)
        at com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:159)
        at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:106)
        at com.google.inject.Guice.createInjector(Guice.java:87)
        at org.everrest.guice.servlet.EverrestGuiceContextListener.getInjector(EverrestGuiceContextListener.java:140)
        at com.google.inject.servlet.GuiceServletContextListener.contextInitialized(GuiceServletContextListener.java:45)
        at org.everrest.guice.servlet.EverrestGuiceContextListener.contextInitialized(EverrestGuiceContextListener.java:85)
        at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4792)
        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5256)
        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:754)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:730)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:734)
        at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:985)
        at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1857)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
2019-06-25 14:05:45,620[ost-startStop-1]  [ERROR] [o.a.c.core.StandardContext 5257]     - One or more listeners failed to start. Full details will be found in the appropriate container log file
2019-06-25 14:05:45,621[ost-startStop-1]  [ERROR] [o.a.c.core.StandardContext 5308]     - Context [/api] startup failed due to previous errors
2019-06-25 14:05:45,637[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 1020]       - Deployment of web application archive [/home/user/eclipse-che/tomcat/webapps/api.war] has finished in [3,097] ms
2019-06-25 14:05:45,637[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 957]        - Deploying web application archive [/home/user/eclipse-che/tomcat/webapps/docs.war]
2019-06-25 14:05:45,990[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 1020]       - Deployment of web application archive [/home/user/eclipse-che/tomcat/webapps/docs.war] has finished in [353] ms
2019-06-25 14:05:45,990[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 957]        - Deploying web application archive [/home/user/eclipse-che/tomcat/webapps/workspace-loader.war]
2019-06-25 14:05:46,044[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 1020]       - Deployment of web application archive [/home/user/eclipse-che/tomcat/webapps/workspace-loader.war] has finished in [54] ms
2019-06-25 14:05:46,045[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 957]        - Deploying web application archive [/home/user/eclipse-che/tomcat/webapps/ROOT.war]
2019-06-25 14:05:46,868[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 1020]       - Deployment of web application archive [/home/user/eclipse-che/tomcat/webapps/ROOT.war] has finished in [823] ms
2019-06-25 14:05:46,869[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 957]        - Deploying web application archive [/home/user/eclipse-che/tomcat/webapps/dashboard.war]
2019-06-25 14:05:47,096[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 1020]       - Deployment of web application archive [/home/user/eclipse-che/tomcat/webapps/dashboard.war] has finished in [227] ms
2019-06-25 14:05:47,101[main]             [INFO ] [o.a.c.http11.Http11NioProtocol 588]  - Starting ProtocolHandler ["http-nio-8080"]
2019-06-25 14:05:47,113[main]             [INFO ] [o.a.catalina.startup.Catalina 700]   - Server startup in 5107 ms

Received SIGTERM
Stopping Che server running on localhost:8080
2019-06-25 14:07:01,223[main]             [INFO ] [o.a.c.core.StandardServer 524]       - A valid shutdown command was received via the shutdown port. Stopping the Server instance.
2019-06-25 14:07:01,224[main]             [INFO ] [o.a.c.http11.Http11NioProtocol 609]  - Pausing ProtocolHandler ["http-nio-8080"]
2019-06-25 14:07:01,242[main]             [INFO ] [o.a.c.core.StandardService 480]      - Stopping service [Catalina]
2019-06-25 14:07:01,285[main]             [INFO ] [o.a.c.http11.Http11NioProtocol 629]  - Stopping ProtocolHandler ["http-nio-8080"]
2019-06-25 14:07:01,289[main]             [INFO ] [o.a.c.http11.Http11NioProtocol 643]  - Destroying ProtocolHandler ["http-nio-8080"]
@sleshchenko
Member

@z0r0 Can you check which Che Server image is used? It should be fixed by #13466; the changes are present in 7.0.0-RC-2.0.

@sleshchenko sleshchenko added the kind/bug Outline of a bug - must adhere to the bug report template. label Jun 26, 2019
@tomevision

tomevision commented Jun 27, 2019

I have tried this with single user and it gave me this error:

    ✔ Create Tiller Role Binding...it already exist.
    ✔ Create Tiller Service Account...it already exist.
    ✖ Create Tiller RBAC
      → Error from server (NotFound): the server could not find the requested resource
      Create Tiller Service
      Preparing Che Helm Chart
      Updating Helm Chart dependencies
      Deploying Che Helm Chart
Error: Command failed: /bin/sh -c echo "#
# Copyright (c) 2012-2019 Red Hat, Inc.
# This program and the accompanying materials are made
# available under the terms of the Eclipse Public License 2.0
# which is available at https://www.eclipse.org/legal/epl-2.0/
#
# SPDX-License-Identifier: EPL-2.0
#
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: tiller-role-binding
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: tiller
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
####################################################
# after applying this resource, run this command:
#   helm init --service-account tiller
# or if your already performed helm init, run this command:
#   kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'
# see: https://github.com/kubernetes/helm/issues/2224, https://stackoverflow.com/a/45306258/2365824
####################################################
" | kubectl apply -f -
Error from server (NotFound): the server could not find the requested resource
    at makeError (/snapshot/chectl/node_modules/execa/index.js:174:9)
    at module.exports.Promise.all.then.arr (/snapshot/chectl/node_modules/execa/index.js:278:16)

I have also tried this with multi-user and it gave me the following error message:

    ❯ Keycloak pod bootstrap
      ✔ scheduling...done.
      ✔ downloading images...done.
      ✖ starting
        → ERR_TIMEOUT: Timeout set to pod ready timeout 130000
      Che pod bootstrap
      Retrieving Che Server URL
      Che status check
Error: ERR_TIMEOUT: Timeout set to pod ready timeout 130000
    at KubeHelper.<anonymous> (/snapshot/chectl/lib/api/kube.js:0:0)
    at Generator.next (<anonymous>)
    at fulfilled (/snapshot/chectl/node_modules/tslib/tslib.js:107:62)

@nils-mosbach

Fails on nightly and RC2 images during Che pod scheduling as well. Installing in multi-user mode results in the following message.

✔ PostgreSQL pod bootstrap
✔ scheduling...done.
✔ downloading images...done.
✔ starting...done.
✔ Keycloak pod bootstrap
✔ scheduling...done.
✔ downloading images...done.
✔ starting...done.
❯ Che pod bootstrap
✖ scheduling

Error: ERR_TIMEOUT: Timeout set to pod wait timeout 300000
    at KubeHelper.<anonymous> (/snapshot/chectl/lib/api/kube.js:0:0)
    at Generator.next (<anonymous>)
    at fulfilled (/snapshot/chectl/node_modules/tslib/tslib.js:107:62)

Single-user mode gives a slightly different message but also fails.

❯ 🏃‍ Running Helm to install Che
✔ Verify if helm is installed
✔ Create Tiller Role Binding...it already exist.
✔ Create Tiller Service Account...it already exist.
✔ Create Tiller RBAC
✔ Create Tiller Service...it already exist.
✔ Preparing Che Helm Chart...done.
✔ Updating Helm Chart dependencies...done.
✖ Deploying Che Helm Chart

→ Error: Upgrade --force successfully deleted the previous release, but encountered 4 error(s) and cannot continue: release "che": object "" not found, skipping delete; release "che": object "" not f
…

Error: Command failed: /bin/sh -c helm upgrade --install che --force --namespace che --set global.ingressDomain=10.--.--.---.nip.io --set cheImage=eclipse/che-server:nightly --set global.cheWorkspacesNamespace=che   /root/.cache/chectl/templates/kubernetes/helm/che/
Error: UPGRADE FAILED: Upgrade --force successfully deleted the previous release, but encountered 4 error(s) and cannot continue: release "che": object "" not found, skipping delete; release "che": object "" not found, skipping delete; release "che": object "" not found, skipping delete; release "che": object "" not found, skipping delete
UPGRADE FAILED
Error: Upgrade --force successfully deleted the previous release, but encountered 4 error(s) and cannot continue: release "che": object "" not found, skipping delete; release "che": object "" not found, skipping delete; release "che": object "" not found, skipping delete; release "che": object "" not found, skipping delete
    at makeError (/snapshot/chectl/node_modules/execa/index.js:174:9)
    at module.exports.Promise.all.then.arr (/snapshot/chectl/node_modules/execa/index.js:278:16)

@l0rd
Contributor

l0rd commented Jul 1, 2019

Should be fixed by che-incubator/chectl#179

@ghost

ghost commented Jul 1, 2019

@l0rd Thank you for this. Quick question: do we have to build this from source? Currently I cannot see it in the release panel and only older versions are available.

@l0rd
Contributor

l0rd commented Jul 1, 2019

@gito0o the PR hasn't been merged yet; we are working on it. I will provide an update when it is merged and a new version of chectl is released.

@ghost

ghost commented Jul 1, 2019

@l0rd Any advice on which release we can revert to?
I was able to bring the pod to running and the tests to pass by manually changing the image from nightly to latest; that works for me.
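For reference, the manual image swap described above can also be expressed as Helm chart values instead of editing the deployment by hand — a sketch, assuming the chart reads the same keys as the --set flags quoted elsewhere in this thread (cheImage, cheImagePullPolicy):

```yaml
# Sketch of a helm values override; key names are taken from the --set flags
# and log lines quoted in this thread, so verify them against the chart's
# values.yaml before relying on this.
cheImage: eclipse/che-server:latest   # pin "latest" instead of "nightly"
cheImagePullPolicy: Always            # always re-pull the tag on deploy
```

Such a file could then be passed with `helm upgrade --install che -f values.yaml ...` in place of repeated --set flags.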

@l0rd
Contributor

l0rd commented Jul 1, 2019

@gito0o If that works for you that's great. Otherwise, until the PR gets merged there are two options:

  1. use helm with multi-user support:
chectl server:start --installer=helm --multiuser
  2. clone the eclipse/che-operator repository and run the deploy script as described at https://github.com/eclipse/che-operator/#how-to-deploy

@l0rd l0rd changed the title multiuser fails to complete installation when using minikube, and the chectl operator. Multi-user deploy fails on minikube Jul 1, 2019
@l0rd
Contributor

l0rd commented Jul 1, 2019

I have been confused by the title of this issue. This issue isn't related to the Che operator and doesn't look related to chectl either. I have changed the issue title (cc @z0r0).

Anyway, I have been able to reproduce it by cloning eclipse/che and using the helm chart in the master branch (as described here).

@sleshchenko the image is eclipse/che-server:nightly but my error log doesn't match exactly the log in the issue description. I have pasted my che server logs here https://gist.github.com/l0rd/0bbaafb52c51bf2fa9d6e56e2d957951

@l0rd l0rd added the severity/blocker Causes system to crash and be non-recoverable or prevents Che developers from working on Che code. label Jul 1, 2019
@l0rd l0rd modified the milestones: 7.1.0, 7.0.0 Jul 1, 2019
@ghost

ghost commented Jul 1, 2019

@l0rd Thank you for the provided information; I am confused as well. The error you posted also showed up in some of my past troubleshooting, but here is the error I am getting most of the time. I believe it has something to do with permissions on the mounted volume. I tried to grant the permissions in the Dockerfile, but changing the image to latest had the same effect.

This is happening in Keycloak and not in Che; Che is just in waiting mode and Keycloak never finishes.

{
 severity: "ERROR"  
 textPayload: "Failed to read or configure the org.jboss.logmanager.LogManager
java.lang.IllegalArgumentException: Failed to instantiate class "org.jboss.logmanager.handlers.PeriodicRotatingFileHandler" for handler "FILE"  
}

@l0rd
Contributor

l0rd commented Jul 2, 2019

Ok, it looks like the image eclipse/che-server:nightly contains che-server 7.0.0-beta-4.0-SNAPSHOT:

$ docker run --rm -ti --entrypoint=bash eclipse/che-server:nightly

bash-4.4# jar xvf home/user/eclipse-che/tomcat/webapps/api.war META-INF/maven/org.eclipse.che/assembly-wsmaster-war/pom.properties
 inflated: META-INF/maven/org.eclipse.che/assembly-wsmaster-war/pom.properties

bash-4.4# more META-INF/maven/org.eclipse.che/assembly-wsmaster-war/pom.properties 
#Created by Apache Maven 3.3.9
version=7.0.0-beta-4.0-SNAPSHOT
groupId=org.eclipse.che
artifactId=assembly-wsmaster-war

bash-4.4# 

Assigning to @vparfonov

UPDATE: actually that was not the issue; I had not pulled the image locally, whereas helm correctly applies cheImagePullPolicy: Always.

@l0rd l0rd changed the title Multi-user deploy fails on minikube che-server nightly image contains che-server beta4-snapshot Jul 2, 2019
@l0rd l0rd changed the title che-server nightly image contains che-server beta4-snapshot Che server nightly image contains beta4-snapshot Jul 2, 2019
@l0rd l0rd mentioned this issue Jul 2, 2019
@l0rd
Contributor

l0rd commented Jul 2, 2019

I have double-checked and the version of che-server used is the correct one (7.0.0-rc-3.0-SNAPSHOT), but nevertheless che-server is still not starting when using helm.

Assigning back to @skabashnyuk

@l0rd l0rd assigned skabashnyuk and unassigned vparfonov Jul 2, 2019
@l0rd l0rd changed the title Che server nightly image contains beta4-snapshot Multi-user Che fails to start on minikube using helm Jul 2, 2019
@ghost

ghost commented Jul 2, 2019

I can only imagine the init containers and the whole waiting chain between PostgreSQL -> Keycloak -> Che might have some room for improvement. I can contribute a few hours to this if this is currently the focus.

@l0rd
Contributor

l0rd commented Jul 2, 2019

@gito0o yes, that's definitely a blocker we need to fix. Consider that it used to work, so this should be a "subtle" regression.

@mkuznyetsov
Contributor

So the failure of Che startup on minikube+helm comes down to missing Che configuration on Keycloak.
The Che Keycloak container initialization script is failing on minikube (first entries of the pod logs):

Configuring Keycloak by modifying realm and user templates...
/scripts/kc_realm_user.sh: line 17: /scripts/che-users-0.json: Permission denied
/scripts/kc_realm_user.sh: line 27: /scripts/che-realm.json: Permission denied
Creating Admin user...

The user under which this script is executed is jboss, which indeed doesn't have such permissions.
On OpenShift this problem doesn't exist, since the script is executed by another user that does have these permissions.

@davidfestal
Contributor

Wouldn't it be sufficient to add the 0 group to the Dockerfile user, since the 0 group already has the required permissions on the scripts folder?

Instead of:

USER root
RUN chgrp -R 0 /scripts && \
    chmod -R g+rwX /scripts

USER 1000

we would simply have:

USER root
RUN chgrp -R 0 /scripts && \
    chmod -R g+rwX /scripts

USER 1000:0

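The group-write scheme in the snippet above can be demonstrated locally: a directory is handed to a group with g+rwX, and any process whose group matches can write into it regardless of its UID. This sketch uses a scratch directory and the current user's own group instead of group 0 and /scripts, purely for illustration:

```shell
# Demo of "chgrp <group> dir && chmod g+rwX dir" granting group write access.
# Hypothetical stand-ins: a mktemp scratch dir instead of /scripts, and the
# current user's primary group instead of root's group 0.
set -e
dir=$(mktemp -d)
chgrp "$(id -g)" "$dir"      # analogous to: chgrp -R 0 /scripts
chmod g+rwX "$dir"           # analogous to: chmod -R g+rwX /scripts
touch "$dir/che-users-0.json" && echo "group write OK"
rm -r "$dir"
```

This is why `USER 1000:0` in the Dockerfile could help: the UID stays non-root, but the process gains group 0, which is exactly the group the RUN step granted rwX to.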

@mkuznyetsov Do you think you could test it?

@mkuznyetsov
Contributor

It didn't work; Keycloak then fails to start for another permissions-related reason:
logs-from-keycloak-in-keycloak-7694fd66db-hhqqz (1).txt

@benoitf
Contributor

benoitf commented Jul 8, 2019

Updating the title to k8s instead of minikube, as we have the same issue on top of Kubernetes.

@benoitf benoitf changed the title Multi-user Che fails to start on minikube using helm Multi-user Che fails to start on k8s using helm Jul 8, 2019
@davidwindell
Contributor

Just hit this error (che-users-0.json: Permission denied) while trying to get a new install of Che on K8s running. Is there a temporary workaround?

@ghost

ghost commented Jul 10, 2019

I was able to get it working by setting the securityContext to root. Not ideal, but I really haven't had time to dig into this properly.
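The root workaround mentioned above would look roughly like the following in the Keycloak deployment's pod template. The securityContext fields are standard Kubernetes; where exactly this patch lands in the Che chart templates is an assumption:

```yaml
# Sketch: pod-level securityContext forcing the containers to run as root so
# the init script can write under /scripts. A stopgap, not a recommended
# default; the surrounding Deployment structure is assumed.
spec:
  template:
    spec:
      securityContext:
        runAsUser: 0   # root
```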

@mkuznyetsov
Contributor

This PR should contain fixes for the file-permission-related issues: #13798

@SDAdham

SDAdham commented Jul 12, 2019

Could this issue, #13838, be related to this one here?

@hari1992-web

@gito0o If that works for you that's great. Otherwise, until the PR gets merged there are two options:

1. use helm with multi-user support:
chectl server:start --installer=helm --multiuser
2. clone the eclipse/che-operator repository and run the deploy script as described at https://github.com/eclipse/che-operator/#how-to-deploy

Whenever I use multi-user it fails. Why?
