PKIX error and impossible to start workspaces on OCP 4.1 #13607

Closed
slemeur opened this issue Jun 21, 2019 · 18 comments
Labels
kind/bug Outline of a bug - must adhere to the bug report template. severity/P1 Has a major impact to usage or development of the system. status/blocked Issue that can’t be moved forward. Must include a comment on the reason for the blockage.
Comments

@slemeur
Contributor

slemeur commented Jun 21, 2019

Description

2019-06-21 08:52:06,511[557-wjt8m-47779]  [WARN ] [unknown.jul.logger 49]               - Problem getting Pod json from Kubernetes Client[masterUrl=https://172.30.0.1:443/api/v1, headers={}, connectTimeout=5000, readTimeout=30000, operationAttempts=3, operationSleep=1000, streamProvider=org.openshift.ping.common.stream.TokenStreamProvider@3d3d5e6a] for cluster [EclipseLinkCommandChannel], namespace [che7], labels [app=che]; encountered [java.lang.Exception: 3 attempt(s) with a 1000ms sleep to execute [OpenStream] failed. Last failure was [javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target]]

And then:

2019-06-21 08:52:06,965[aceSharedPool-1]  [WARN ] [.i.k.KubernetesInternalRuntime 245]  - Failed to start Kubernetes runtime of workspace workspacef4huz4zllvbwgxom. Cause: Pod creation timeout exceeded. -id: workspacef4huz4zllvbwgxom.workspace -message: null

Reproduction Steps

  • Deploy Che 7 RC 2 on OCP 4.1
  • Create 5 workspaces
  • Stop all of them
  • Try to start a new workspace

OS and version:
Che 7 RC 2 - OCP 4.1

@slemeur added the kind/bug, severity/blocker, and team/platform labels Jun 21, 2019
@davidfestal
Contributor

@slemeur could you specify the installation mode and, if it was installed with the Che operator, share the details of your CheCluster custom resource?

@skabashnyuk
Contributor

Might be related to:

  1. kubernetes.KUBE_PING using token auth does not verify CA and always d… jgroups-extras/jgroups-kubernetes#69
  2. CLOUD-3228 openshift.KUBE_PING doesn't work on OCP 4.1 since it only … jboss-openshift/openshift-ping#43

@skabashnyuk
Contributor

@mshaposhnik what is the state of this task?

@mshaposhnik
Contributor

I did a brief investigation. We still can't upgrade to JGroups 4.x because it is not supported in EclipseLink (https://bugs.eclipse.org/bugs/show_bug.cgi?id=531910).
So the only way to get this fixed is to ask the jgroups-kubernetes maintainers to backport their CA fix into the 0.9.x branch (called stable) in their repo. It can't be done easily (by merge or cherry-pick) since old jgroups-kubernetes versions have a different code structure, modules, etc.
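
For readers unfamiliar with the failure mode: "PKIX path building failed" generally means the JVM could not build a chain from the server certificate to any CA it trusts. The sketch below illustrates, in plain JSSE, what trusting a cluster CA on the client side can look like. It is not the jgroups-kubernetes fix itself. The class name `ClusterCaTrust` and its methods are hypothetical, and the CA path is the conventional in-pod service account location, which is an assumption about the deployment.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.security.cert.CertificateFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

/** Hypothetical helper for illustration; not part of Che, JGroups, or KUBE_PING. */
final class ClusterCaTrust {

    // Assumption: conventional location of the CA bundle mounted into a pod.
    static final String SERVICE_ACCOUNT_CA =
        "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt";

    /** Loads every X.509 certificate found in the stream into a fresh in-memory trust store. */
    static KeyStore trustStoreFrom(InputStream pem)
            throws GeneralSecurityException, IOException {
        KeyStore store = KeyStore.getInstance(KeyStore.getDefaultType());
        store.load(null, null); // initialise an empty store, not backed by a file
        CertificateFactory x509 = CertificateFactory.getInstance("X.509");
        int index = 0;
        for (Certificate ca : x509.generateCertificates(pem)) {
            store.setCertificateEntry("cluster-ca-" + index++, ca);
        }
        return store;
    }

    /** Builds a TLS context whose trust decisions are based only on the given store. */
    static SSLContext sslContextTrusting(KeyStore trustStore)
            throws GeneralSecurityException {
        TrustManagerFactory tmf =
            TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, tmf.getTrustManagers(), null);
        return context;
    }
}
```

A caller would open `SERVICE_ACCOUNT_CA` with `Files.newInputStream`, pass the stream to `trustStoreFrom`, and hand the resulting context's socket factory to the HTTPS client, for example via `HttpsURLConnection#setSSLSocketFactory`.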

@l0rd
Contributor

l0rd commented Jun 27, 2019

We discussed yesterday that this is not a blocker for the 7.0.0 release but will be a blocker for the 7.1.0 release. Hence I am labelling it as P1 and setting the milestone to 7.1.0.

@l0rd added the severity/P1 label and removed the severity/blocker label Jun 27, 2019
@l0rd added this to the 7.1.0 milestone Jun 27, 2019
@mshaposhnik
Contributor

mshaposhnik commented Aug 6, 2019

So, I did a bunch of testing, and it seems that switching to JGroups 4.x and KUBE_PING 1.10+ resolves the described error. The main impediment for now is getting the PR accepted in EclipseLink (eclipse-ee4j/eclipselink#500) and waiting for a release that includes it.

@ibuziuk
Member

ibuziuk commented Aug 8, 2019

@l0rd @slemeur it looks like it is critically important to get eclipse-ee4j/eclipselink#500 merged before GA. Should we ask someone from the foundation to speed up the review process?

@slemeur
Contributor Author

slemeur commented Aug 8, 2019

@ibuziuk this is not for GA, but 7.1.0

@ibuziuk
Member

ibuziuk commented Aug 8, 2019

@slemeur correct, I was just reviewing issues for 7.1.0, and it looks like eclipse-ee4j/eclipselink#500 might be a blocker for it. So just a heads-up: we might need to ask / push for review sooner rather than later.

@skabashnyuk
Contributor

Max already sent a review request: https://www.eclipse.org/lists/eclipselink-dev/msg07786.html

@ibuziuk
Member

ibuziuk commented Aug 20, 2019

It looks like no one on the Eclipse side has reviewed eclipse-ee4j/eclipselink#500.
@slemeur @l0rd I guess this issue is a blocker for 7.1.0.

@skabashnyuk modified the milestones: 7.1.0, 7.2.0 Sep 5, 2019
@mshaposhnik
Contributor

The PR is merged; let's wait for the EL release and then switch to the new version.

@skabashnyuk modified the milestones: 7.2.0, 7.x Sep 19, 2019
@skabashnyuk
Contributor

waiting for EclipseLink release

@skabashnyuk added the status/blocked label Sep 19, 2019
@nickboldt
Contributor

The EL fix was pushed in eclipse-ee4j/eclipselink#500 on Sept 11 for the 2.7 branch. But there have been no new EL releases since 2.6.8 on 19 Jun and 2.7.4 on 18 Jan, so it looks like we're still blocked here.

As this blocks https://issues.jboss.org/browse/CRW-304 we need this for Che 7.2 (or 7.3 if upstream EclipseLink won't deliver in time).

Can anyone push on the EclipseLink team to deliver their next 2.7.x release?

@skabashnyuk
Contributor

Lukas Jungmann's response was that they wish to have 2.7.5 this month.

@nickboldt
Contributor

Does that mean in the next 8 days? Or in time for 7.2?

@skabashnyuk
Contributor

I believe that was a wish, not an obligation, and I doubt that it can be part of Eclipse Che 7.2.

@nickboldt
Contributor

Too bad they don't do releases as part of the Eclipse simrel trains, or we'd have had it last week. :'(
