This is the tracking issue for the LFX Mentorship 2024 March-May term (project link, related issue #3651). Our mentors are @STRRL, @g1eny0ung, and @cwen0. In this project, we aim to improve the observability of StressChaos by exposing the pod-level metrics (e.g., CPU usage seconds) related to StressChaos experiments and by providing a way to visualize them. In the end, we hope Chaos Mesh end users can observe the injected stress at the pod level.
Steps
Step 1. Enabling Pod-level Metrics
In this step, we want to make sure the metrics related to StressChaos can be observed at the pod level. To achieve this, we need to evaluate the current implementation of StressChaos and check whether the stressor process is injected into the target container's PID namespace, cgroups, etc.
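The check described above can be sketched in Go on a Linux node. This is a minimal illustration, not the Chaos Mesh implementation: the helper names (`pidNamespace`, `cgroupPaths`, `sameCgroups`) are hypothetical, and it assumes you already know the PID of a container process and the PID of the stressor, and that you have enough privileges to read their `/proc` entries.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// pidNamespace returns the PID-namespace identifier of a process,
// e.g. "pid:[4026531836]", by reading the /proc/<pid>/ns/pid symlink.
func pidNamespace(pid int) (string, error) {
	return os.Readlink(fmt.Sprintf("/proc/%d/ns/pid", pid))
}

// cgroupPaths parses the content of /proc/<pid>/cgroup into a map from
// controller list to cgroup path. Each line has the form
// "hierarchy-ID:controllers:path"; on cgroup v2 the single line is
// "0::<path>", so the key is the empty string.
func cgroupPaths(content string) map[string]string {
	paths := make(map[string]string)
	for _, line := range strings.Split(strings.TrimSpace(content), "\n") {
		parts := strings.SplitN(line, ":", 3)
		if len(parts) != 3 {
			continue // skip malformed or empty lines
		}
		paths[parts[1]] = parts[2]
	}
	return paths
}

// sameCgroups reports whether two parsed cgroup maps are identical.
func sameCgroups(a, b map[string]string) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if w, ok := b[k]; !ok || w != v {
			return false
		}
	}
	return true
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: nscheck <container-pid> <stressor-pid>")
		return
	}
	containerPID, err1 := strconv.Atoi(os.Args[1])
	stressorPID, err2 := strconv.Atoi(os.Args[2])
	if err1 != nil || err2 != nil {
		fmt.Println("both arguments must be numeric PIDs")
		return
	}

	nsA, errA := pidNamespace(containerPID)
	nsB, errB := pidNamespace(stressorPID)
	if errA != nil || errB != nil {
		fmt.Println("cannot read PID namespace:", errA, errB)
		return
	}
	fmt.Println("same PID namespace:", nsA == nsB)

	cgA, errA := os.ReadFile(fmt.Sprintf("/proc/%d/cgroup", containerPID))
	cgB, errB := os.ReadFile(fmt.Sprintf("/proc/%d/cgroup", stressorPID))
	if errA != nil || errB != nil {
		fmt.Println("cannot read cgroup membership:", errA, errB)
		return
	}
	fmt.Println("same cgroups:", sameCgroups(cgroupPaths(string(cgA)), cgroupPaths(string(cgB))))
}
```

If both checks print `true`, resource usage of the stressor should be accounted to the same cgroup as the container, which is what pod-level metrics rely on.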
Step 2. Exposing Pod-level Metrics
In this step, we want to provide a way to expose the pod-level metrics to an external observability stack (e.g., Prometheus). Every metric should carry labels that identify the injected StressChaos.
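To make the labeling requirement concrete, here is a small standard-library sketch that renders one sample in the Prometheus text exposition format. The metric name `chaos_mesh_stress_cpu_usage_seconds_total` and the label keys (`stresschaos`, `namespace`, `pod`) are assumptions chosen for illustration; the names actually used by Chaos Mesh may differ, and a real exporter would more likely use the Prometheus Go client library than hand-format lines.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// labelEscaper escapes backslash, double quote, and newline, the three
// characters that must be escaped inside a label value in the
// Prometheus text exposition format.
var labelEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`)

// formatSample renders a single sample line such as
//   name{k1="v1",k2="v2"} 12.5
// Label keys are sorted so the output is deterministic.
func formatSample(name string, labels map[string]string, value float64) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, fmt.Sprintf(`%s="%s"`, k, labelEscaper.Replace(labels[k])))
	}

	v := strconv.FormatFloat(value, 'g', -1, 64)
	if len(pairs) == 0 {
		return name + " " + v
	}
	return name + "{" + strings.Join(pairs, ",") + "} " + v
}

func main() {
	// Hypothetical metric and label names, for illustration only.
	line := formatSample(
		"chaos_mesh_stress_cpu_usage_seconds_total",
		map[string]string{
			"stresschaos": "cpu-burn",
			"namespace":   "default",
			"pod":         "web-0",
		},
		12.5,
	)
	fmt.Println(line)
}
```

With labels like these attached, a query can select only the pods targeted by a given experiment rather than every pod on the node.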
Step 3. Visualizing the Metrics
In this step, we want to provide a way for end users to visualize the exposed experiment metrics. We expect to deliver a Grafana dashboard, the related Prometheus configuration, and accompanying documentation.
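As a rough idea of what a dashboard panel for such a metric could contain, the sketch below builds a PromQL `rate(...)` expression and wraps it in a minimal Grafana panel fragment. It is not the dashboard shipped by the project: the metric name and the `stresschaos` label are the same illustrative assumptions as above, and the struct covers only a few fields of Grafana's panel JSON (`title`, `type`, `targets` with `expr` and `legendFormat`), not a complete dashboard.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Target is one query attached to a panel.
type Target struct {
	Expr         string `json:"expr"`
	LegendFormat string `json:"legendFormat"`
}

// Panel is a deliberately small subset of a Grafana panel definition.
type Panel struct {
	Title   string   `json:"title"`
	Type    string   `json:"type"`
	Targets []Target `json:"targets"`
}

// rateQuery builds a PromQL expression for the per-second rate of a
// counter, restricted to one StressChaos experiment.
func rateQuery(metric, chaosName, window string) string {
	return fmt.Sprintf("rate(%s{stresschaos=%q}[%s])", metric, chaosName, window)
}

// buildPanel returns a time-series panel with one series per pod.
func buildPanel(metric, chaosName string) Panel {
	return Panel{
		Title: "CPU usage under StressChaos " + chaosName,
		Type:  "timeseries",
		Targets: []Target{{
			Expr:         rateQuery(metric, chaosName, "5m"),
			LegendFormat: "{{pod}}",
		}},
	}
}

func main() {
	// Hypothetical metric and experiment names, for illustration only.
	panel := buildPanel("chaos_mesh_stress_cpu_usage_seconds_total", "cpu-burn")
	out, err := json.MarshalIndent(panel, "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

Generating the panel from code keeps the query and the label names in one place, so the dashboard can follow any renaming of the exported metrics.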
Progress
Step 1. Enabling Pod-level Metrics (Done)
Evaluating the current StressChaos implementation
We reviewed the current implementation of the namespace and cgroup mechanism.
The injected stress-ng process is spawned in the same PID namespace as the Pod container.
Relevant code: (*CommandBuilder) SetNS, which utilizes nsexec, and (*AttachCGroupV1).AttachProcess or (*AttachCGroupV2).AttachProcess.
Related: PidPath returns an unexpected error #4407
Step 2. Exposing Pod-level Metrics (Done)
Step 3. Visualizing the Metrics (WIP)