Workspace diagnosis capabilities in Eclipse Che #15047

Closed
12 tasks done
skabashnyuk opened this issue Oct 31, 2019 · 3 comments
Labels
kind/epic A long-lived, PM-driven feature request. Must include a checklist of items that must be completed. severity/P1 Has a major impact to usage or development of the system.

Comments

@skabashnyuk
Contributor

skabashnyuk commented Oct 31, 2019

Is your task related to a problem? Please describe.

A workspace is a set of microservices, and a lot of complexity is hidden from the end user.
When a workspace does not behave as expected, it is very hard to understand what is going on and where to find the relevant information and logs. This makes diagnosing bugs very hard.

Describe the solution you'd like

Workspace startup

  • All logs of the workspace startup should be persisted and available to the end user.
    • All container logs of the workspace should be available.
    • The plugin broker logs
    • The JWT proxy logs
    • The Che server logs (when reading the devfile)
    • The logs of the language servers, the editor, and any plugins (note that these logs are generally not available as container logs)
  • Logs should be downloadable, so they can be exported and shared in an issue.
  • We would keep a history of the latest 5 startups of the workspace.
  • Logs should be available from chectl (to start)
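
To make the "latest 5 startups" retention rule concrete, here is a minimal sketch. It assumes, purely for illustration, that each startup's logs live in their own sub-directory under a per-workspace root, and that the directory names sort chronologically (for example UTC timestamps). The function name and layout are hypothetical, not an existing Che API.

```python
import shutil
from pathlib import Path
from typing import List

MAX_STARTUPS = 5  # from the list above: keep the latest 5 startups


def prune_startup_logs(workspace_log_root: Path, keep: int = MAX_STARTUPS) -> List[Path]:
    """Delete all but the `keep` most recent startup directories.

    Assumption (hypothetical layout): one sub-directory per startup, named so
    that lexicographic order equals chronological order.
    Returns the directories that were removed.
    """
    startups = sorted(p for p in workspace_log_root.iterdir() if p.is_dir())
    stale = startups[:-keep] if keep > 0 else startups
    for directory in stale:
        shutil.rmtree(directory)
    return stale
```

Such a function could run right after a new startup directory has been written, so that the number of kept startups never exceeds the limit.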

Running workspace

  • We should have a widget displaying the status of the workspace. This could be an option of the “workspace panel”.
    • We should display the resource usage in each of the containers
    • We should allow the user to access the logs of a container (links to the command for getting the logs, or links to the OpenShift console)
  • chectl should allow exporting the logs of a currently running workspace.
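
As an illustration of the "command for getting the logs" idea, a widget could render a ready-to-copy `kubectl logs` invocation per container. The sketch below only builds the command string; the helper name and the example namespace, pod, and container names are made up for this example.

```python
import shlex
from typing import Optional


def kubectl_logs_command(namespace: str, pod: str, container: str,
                         tail_lines: Optional[int] = None) -> str:
    """Build a shell-safe `kubectl logs` command for one container of a pod."""
    args = ["kubectl", "logs", pod, "-c", container, "-n", namespace]
    if tail_lines is not None:
        # --tail limits the output to the most recent lines
        args.append("--tail=%d" % tail_lines)
    # shlex.join quotes any argument that needs it
    return shlex.join(args)


# Hypothetical names, for illustration only
print(kubectl_logs_command("che-ws", "workspace-pod", "theia-ide", tail_lines=200))
```

On OpenShift the same arguments work with `oc logs`, so the helper could take the binary name as a parameter if needed.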

Implementation

Access to the workspace logs

as a result of #15134

Access to the workspace lifetime event log

as a result of #15135
TBD

Describe alternatives you've considered

Additional context

@skabashnyuk skabashnyuk added kind/task Internal things, technical debt, and to-do tasks to be performed. kind/epic A long-lived, PM-driven feature request. Must include a checklist of items that must be completed. team/platform and removed kind/task Internal things, technical debt, and to-do tasks to be performed. labels Oct 31, 2019
@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Oct 31, 2019
@skabashnyuk skabashnyuk added severity/P1 Has a major impact to usage or development of the system. and removed status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. labels Oct 31, 2019
@skabashnyuk skabashnyuk added this to the Backlog - Platform milestone Oct 31, 2019
@skabashnyuk skabashnyuk added this to In progress in Platform Epics Nov 20, 2019
@skabashnyuk skabashnyuk moved this from In progress to To do in Platform Epics Jan 9, 2020
@skabashnyuk
Contributor Author

@l0rd @slemeur

  1. The description of this epic concentrates mostly on logs. Should we invest our time in k8s events ([POC] Access to the workspace lifetime event log #15135), or should logs be our P1?

  2. Can we put resource usage out of the scope of this task? It is closely connected to "Dashboard: provide target k8s namespace information" #13484. Maybe we should implement it as part of that epic?

@skabashnyuk skabashnyuk moved this from To do to In progress in Platform Epics Jan 10, 2020
@sparkoo
Member

sparkoo commented Jan 23, 2020

Here is a draft proposal for a possible solution:

1. archive stdout logs

  1. Write the stdout/stderr logs of all workspace containers to a file in each container. The location has to be on the workspace PV and has to be known.
  2. Watch for the workspace exit event, then start a collector pod that grabs all the logs and sends them to the backend.

What the backend will be is an open question. For now, we would send the logs to the master's logs endpoint, which does not exist yet.
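
A rough sketch of what the collector's "grab all logs" step could look like, assuming the logs sit under one known directory on the workspace PV. It only bundles the files into a gzip tarball; how that bundle is sent to the backend is left out because the endpoint does not exist yet. The function name is hypothetical.

```python
import tarfile
from pathlib import Path


def archive_logs(log_dir: Path, archive_path: Path) -> int:
    """Pack every regular file under `log_dir` into a gzip tarball.

    Paths inside the archive are relative to `log_dir`.
    Returns the number of files that were added.
    """
    count = 0
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(log_dir.rglob("*")):
            # Skip directories, and skip the archive itself if it lives in log_dir
            if path.is_file() and path.resolve() != archive_path.resolve():
                tar.add(path, arcname=str(path.relative_to(log_dir)))
                count += 1
    return count
```

A single archive per workspace run would also fit the "downloadable" requirement from the issue description, since the same file can be handed to the user as-is.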

2. access archived logs

  1. Design che-server API to provide logs to the user
  2. Provide the logs on request via the API. This very much depends on the chosen storage backend.

3. collect file logs

Some components might log into files somewhere in the container. We can't know where all the logs are, so we have to provide an option to define the directory a component logs to, in order to collect those logs later. We need to enhance the plugin's meta.yaml as well as the devfile. Once we have that, we can collect the logs with the collector pod, the same as in 1.

  1. Design the meta.yaml and devfile changes needed to define the file log directories to collect
  2. Grab these logs with the collector (how exactly is still open)
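
To show how the collector might consume such a declaration, here is a small sketch. The `logsDir` key is purely hypothetical (it does not exist in any current meta.yaml or devfile schema), and `name` is just an illustrative identifier key; the input is assumed to be the already-parsed list of components.

```python
from typing import Dict, List

# Hypothetical field name; NOT part of any existing devfile or meta.yaml schema
LOGS_DIR_KEY = "logsDir"


def log_dirs_by_component(components: List[dict]) -> Dict[str, str]:
    """Map component name -> declared log directory.

    Components that do not declare a log directory are skipped, because
    only their stdout/stderr logs would be collected.
    """
    return {
        component["name"]: component[LOGS_DIR_KEY]
        for component in components
        if component.get(LOGS_DIR_KEY)
    }


# Illustrative input, as it might look after parsing YAML
components = [
    {"name": "java-ls", LOGS_DIR_KEY: "/home/user/.ls/logs"},
    {"name": "terminal"},
]
print(log_dirs_by_component(components))
```

The resulting mapping could then be passed to the collector so it knows which extra directories to pick up per component.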

extras: things we should not forget to at least think about

  • quotas management
  • what if the logs are big (gigabytes)? Collect only the last 100M?
  • what about ephemeral mode?
  • what should be configurable? (number of archived workspace starts, max size of the logs per user, cleanup of old logs based on time)
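
For the "collect only the last 100M" idea, a minimal sketch of reading just the tail of a large log file. Interpreting "100M" as 100 MiB is an assumption, and the function name is made up for this example.

```python
import os

# Assumption: "100M" is read as 100 MiB; this would likely be configurable
DEFAULT_MAX_BYTES = 100 * 1024 * 1024


def tail_bytes(path: str, max_bytes: int = DEFAULT_MAX_BYTES) -> bytes:
    """Return at most the last `max_bytes` bytes of the file at `path`."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        # Jump to the start of the tail; for small files this is offset 0
        f.seek(max(0, size - max_bytes))
        return f.read()
```

This reads the whole tail into memory, which is fine for a sketch; a real collector would probably stream it in chunks. Cutting at a byte offset can also split the first line, so the collector may want to drop everything up to the first newline.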

@sleshchenko sleshchenko moved this from TODO to In progress in Controller Epics Mar 5, 2020
@sleshchenko sleshchenko moved this from In progress to Done in Controller Epics Mar 26, 2020
@skabashnyuk skabashnyuk moved this from In progress to Done in Platform Epics Apr 1, 2020
@sparkoo
Member

sparkoo commented Apr 8, 2020

All issues in the scope of this epic are done, closing.

@sparkoo sparkoo closed this as completed Apr 8, 2020