Properly handle situations when a ws agent is killed by OOM #1988

Closed
ghost opened this issue Jul 29, 2016 · 7 comments
Labels
kind/bug Outline of a bug - must adhere to the bug report template.

Comments


ghost commented Jul 29, 2016

When a workspace agent stops responding for whatever reason, projects disappear from the project tree, and a page refresh results in a "Cannot get project types" error.

Reproduction Steps:

  1. Start a workspace with 1GB of RAM
  2. Import a fairly large Java project
  3. Start Maven build

Expected behavior:

Build is a success and the IDE remains functional

Observed behavior:

When OOM happens in a workspace container, the kernel may kill the WS agent, or the WS agent may become unreachable for some other reason. The IDE keeps trying to reach the WS agent and fails. As a result, projects disappear from the project tree. A page refresh results in a "Cannot load project type" error, which is caused by the client trying to reach the API deployed with the workspace agent.

Proposed solution

The user should at least be notified that the workspace is malfunctioning, probably because of OOM. It may also be a good idea to try to restart the workspace agent, while warning the user that this situation is likely to happen again with this kind of workspace and that more RAM may be required.

  • Problem started happening recently, didn't happen in an older version of Che: [Yes]
  • Problem can be reliably reproduced, doesn't happen randomly: [Yes]
ghost added the kind/bug label Jul 29, 2016
@riuvshin
Contributor

This is the same issue as #1817.


bmicklea commented Aug 2, 2016

#1817

@JamesDrummond
Contributor

@vparfonov Please assign this to yourself. Thanks.

@vparfonov
Contributor

We can't properly detect OOM, but we can in some way check that the ws-agent process is still alive and notify the user if it was killed by the OS.
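
A rough sketch of what such a liveness check could look like, assuming a POSIX system and a watcher that started the agent itself. The child process here is only a stand-in for the ws-agent, and `describe_exit` is a hypothetical helper, not Che code. It also shows why OOM cannot be identified reliably: the kernel OOM killer sends SIGKILL, and any other `kill -9` produces the same exit status.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Map a subprocess return code to a short status string (hypothetical helper)."""
    if returncode is None:
        # poll() returns None while the process is still running.
        return "running"
    if returncode == -signal.SIGKILL:
        # On POSIX a negative return code is the number of the terminating signal.
        # SIGKILL is what the OOM killer sends, but a manual `kill -9` looks the same.
        return "killed"
    return "exited"


# Stand-in for the ws-agent: a child process that just sleeps.
agent = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
print(describe_exit(agent.poll()))

# Simulate the OS killing the agent, then check its status again.
agent.kill()  # sends SIGKILL on POSIX
agent.wait()
print(describe_exit(agent.returncode))
```

A watcher built this way can only report "killed by the OS", so any message to the user would have to say that OOM is the likely cause rather than the confirmed one.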

vparfonov (Contributor) commented Aug 19, 2016

We found another solution that does not need an additional process on the dev machine side.
If the client (IDE) loses its WebSocket connection to the ws-agent, it asks another service in our infrastructure to check the ws-agent state. If the ws-agent is not available to that service either, the IDE proposes stopping the workspace.
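
The flow described above could be sketched roughly as follows. All names here are hypothetical: `agent_reachable` stands in for the infrastructure-side check, and `Action` is an illustrative enum. The comment only specifies the unreachable case, so two behaviours in this sketch are assumptions: reconnecting when the agent is still reachable, and treating a failed check as an unreachable agent.

```python
from enum import Enum
from typing import Callable


class Action(Enum):
    RECONNECT = "reconnect"        # assumption: agent is alive, only the client link dropped
    PROPOSE_STOP = "propose_stop"  # agent looks dead: suggest stopping the workspace


def on_websocket_lost(workspace_id: str,
                      agent_reachable: Callable[[str], bool]) -> Action:
    """Decide what the IDE does after losing its WebSocket to the ws-agent.

    `agent_reachable` is a placeholder for the server-side service that
    checks the ws-agent state; it returns True if the agent still responds.
    """
    try:
        alive = agent_reachable(workspace_id)
    except OSError:
        # Assumption: if the check itself fails, treat the agent as unreachable.
        alive = False
    return Action.RECONNECT if alive else Action.PROPOSE_STOP


# Usage with stubbed checks in place of a real service call.
print(on_websocket_lost("workspace-1", lambda _id: True))
print(on_websocket_lost("workspace-1", lambda _id: False))
```

Passing the check in as a callable keeps the decision logic independent of how the service is actually reached.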

bmicklea added the status/analyzing label (An issue has been proposed and it is currently being analyzed for effort and implementation approach) Aug 23, 2016
@bmicklea

This sounds reasonable - thanks Vitalii

vparfonov added status/in-progress (This issue has been taken by an engineer and is under active development.) and removed status/analyzing (An issue has been proposed and it is currently being analyzed for effort and implementation approach) sprint/current labels Aug 30, 2016
vparfonov added status/open-for-dev (An issue has had its specification reviewed and confirmed. Waiting for an engineer to take it.) and removed status/in-progress (This issue has been taken by an engineer and is under active development.) labels Sep 7, 2016
vparfonov added status/pending-merge and removed status/open-for-dev (An issue has had its specification reviewed and confirmed. Waiting for an engineer to take it.) labels Oct 5, 2016
@vparfonov
Contributor

Workaround #2369

5 participants