
Added propagation of plugin broker failing error #11344

Merged
sleshchenko merged 3 commits into eclipse-che:master from sleshchenko:brokerPodStatus on Sep 27, 2018

Conversation

@sleshchenko (Member) commented Sep 25, 2018

What does this PR do?

It propagates the plugin broker failing error to the user instead of the generic "Plugins installation process timed out." message.
The changes are separated into different commits:

  1. CHE-11070 Add waiting until the Plugin Broker Pod becomes RUNNING before proceeding to the next phase. Note that this alone does not propagate the original error: because pods are created via Deployments, the restart policy is Always, so the Pod status stays Running even while a particular container fails and is periodically restarted.
  2. Make BrokerEvent suitable for the STARTED status event. This includes renaming BrokerResultEvent to BrokerStatusChangedEvent and removing the check that requires an error or a plugin list.
  3. Extract the unrecoverable-events listener from KubernetesInternalRuntime so that it can be reused by the Plugin Broker.
  4. Add listening for unrecoverable events while the Plugin Broker is starting.
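The unrecoverable-event handling in item 4 can be sketched roughly as follows. This is an illustrative assumption, not Che's actual implementation (which reuses the listener extracted from KubernetesInternalRuntime); the set of event reasons treated as unrecoverable is also assumed for the example:

```java
import java.util.Set;

public class UnrecoverableEvents {
  // Illustrative list of Kubernetes event reasons that indicate a container
  // cannot start; the real list lives in Che's unrecoverable-events listener.
  private static final Set<String> UNRECOVERABLE_REASONS =
      Set.of("Failed", "FailedScheduling", "FailedMount", "ErrImagePull", "ImagePullBackOff");

  // Returns true when the event reason means the broker pod will never
  // become RUNNING, so the start can fail fast instead of timing out.
  public static boolean isUnrecoverable(String reason) {
    return reason != null && UNRECOVERABLE_REASONS.contains(reason);
  }
}
```

With such a check, a failed image pull surfaces immediately as a start error rather than as a generic timeout.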

Example of what a user sees when the plugin broker fails to start because of a non-existing image:
[screenshot: screenshot_20180926_154525]

What issues does this PR fix or reference?

#11070
#11068

Release Notes

N/A

Docs PR

N/A

@sleshchenko sleshchenko added kind/enhancement A feature request - must adhere to the feature request template. status/code-review This issue has a pull request posted for it and is awaiting code review completion by the community. labels Sep 25, 2018
@sleshchenko sleshchenko self-assigned this Sep 25, 2018

@garagatyi garagatyi left a comment


Looks good. I have commented on some stuff, please take a look at the inline comments.
I would LOVE to see unit tests in the PR too! I know that it was me who didn't put them in place in the first place, but the code is getting complicated. I started adding unit tests in my recent PRs and would really love some assistance :-)


if (pods.size() > 1) {
  throw new InternalInfrastructureException(
      "There are multiply pods in plugin broker environment");

WDYT if we change two checks to just one? Something like:
Plugin brokering supports one pod only. Workspace 'ID' contains 'N' pods

Member Author

Good idea.
Changed to

format("Plugin broker environment must have only one pod. Workspace %s contains %s pods.", workspaceId, pods.size())

Do you think it is good enough?


+1
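The agreed single-pod guard might look like this minimal sketch (class and method names are assumed, and the exception type is simplified; Che's actual code throws InternalInfrastructureException):

```java
import java.util.List;

public class BrokerPodCheck {
  // Rejects broker environments with any number of pods other than one,
  // using the message wording agreed on in the review above.
  public static void ensureSinglePod(String workspaceId, List<?> pods) {
    if (pods.size() != 1) {
      throw new IllegalStateException(
          String.format(
              "Plugin broker environment must have only one pod. Workspace %s contains %s pods.",
              workspaceId, pods.size()));
    }
  }
}
```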

String encodedTooling = event.getTooling();
if (event.getStatus() == null
    || event.getWorkspaceId() == null
    || (event.getError() == null && event.getTooling() == null)) {

I believe that null tooling may cause an actual NPE here. Please ensure we don't get an NPE here.
We can also get an NPE because of that elsewhere, so it would be nice to fix that too.

Member Author

Makes sense. I've added a check before completion of the future, and the whole plugin broker process will fail if a DONE event is published but plugins are null 4858738
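A minimal sketch of that guard (event and exception types are simplified assumptions, not Che's actual classes): on a DONE event, the broker future is completed exceptionally when the plugin list is missing, instead of completing normally with null and risking an NPE downstream.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class BrokerDoneGuard {
  // Fails the broker future when a DONE event carries no plugin list,
  // so downstream code never dereferences a null tooling value.
  public static void onDone(CompletableFuture<List<String>> finishFuture, List<String> tooling) {
    if (tooling == null) {
      finishFuture.completeExceptionally(
          new IllegalStateException("Broker DONE event is received but plugin list is missing"));
    } else {
      finishFuture.complete(tooling);
    }
  }
}
```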


tnx

finishFuture.completeExceptionally(
    new InfrastructureException("Broker process failed with error: " + event.getError()));
InfrastructureException ex =
    new InfrastructureException("Broker process failed with error: " + event.getError());

Can you also include the workspace ID in the message? Just in case.

Member Author

Added
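As a rough illustration of that change (class and method names are assumed, not Che's actual code, which builds the message inline when completing the future exceptionally), the failure message with the workspace ID might be composed like this:

```java
public class BrokerErrorMessage {
  // Hypothetical helper: includes the workspace ID in the broker failure
  // message, as requested in the review above.
  public static String compose(String workspaceId, String error) {
    return String.format(
        "Plugin broker process in workspace %s failed with error: %s", workspaceId, error);
  }
}
```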

DeployBroker deployBroker = getDeployBrokerPhase(kubernetesNamespace, brokerEnvironment);
DeployBroker deployBroker =
    getDeployBrokerPhase(
        runtimeID.getWorkspaceId(), kubernetesNamespace, brokerEnvironment, startFuture);

I would rather set WS_ID in the BrokerEnvironmentFactory, WDYT?

Member Author

It was done in the same way for ListenBrokerEvents and PrepareStorage.
Sorry, I'm not sure that I fully understand your idea. Do you propose storing WS_ID in the KubernetesEnvironment that is created by brokerEnvironmentFactory and then fetching this value in the phases from there?


+1


Oh, I see that it is not in the K8sEnv. Sorry for the confusion.

@sleshchenko
Member Author

@garagatyi I've added a few more test classes; please let me know if you think any other components need to be covered with tests.

@sleshchenko
Member Author

ci-test

@riuvshin
Contributor

Results of automated E2E tests of Eclipse Che Multiuser on OCP:
Build details
Test report
docker image: eclipseche/che-server:11344
https://github.com/orgs/eclipse/teams/eclipse-che-qa please check this report.

Member

@ibuziuk ibuziuk left a comment


+1 but I have only reviewed bits related to Unrecoverable events

@sleshchenko
Member Author

ci-test

@riuvshin
Contributor

Results of automated E2E tests of Eclipse Che Multiuser on OCP:
Build details
Test report
docker image: eclipseche/che-server:11344
https://github.com/orgs/eclipse/teams/eclipse-che-qa please check this report.

@Ohrimenko1988
Contributor

ci-test

@riuvshin
Contributor

Results of automated E2E tests of Eclipse Che Multiuser on OCP:
Build details
Test report
docker image: eclipseche/che-server:11344
https://github.com/orgs/eclipse/teams/eclipse-che-qa please check this report.

@Ohrimenko1988
Contributor

The selenium tests did not reveal any regression; see report #11344 (comment)

@sleshchenko sleshchenko merged commit eddff2b into eclipse-che:master Sep 27, 2018
@sleshchenko sleshchenko deleted the brokerPodStatus branch September 27, 2018 15:11
@benoitf benoitf removed the status/code-review This issue has a pull request posted for it and is awaiting code review completion by the community. label Sep 27, 2018
@benoitf benoitf added this to the 6.12.0 milestone Sep 27, 2018

Labels

kind/enhancement A feature request - must adhere to the feature request template.


6 participants