feat: support cgroup v2 for linux stress experiments #2928
Welcome @igcherkaev! |
Signed-off-by: Igor Cherkaev <igor.cherkaev@copart.com>
Force-pushed from 66db273 to d05294c.
Codecov Report
@@ Coverage Diff @@
## master #2928 +/- ##
==========================================
+ Coverage 37.96% 38.12% +0.16%
==========================================
Files 106 106
Lines 9083 9106 +23
==========================================
+ Hits 3448 3472 +24
- Misses 5327 5328 +1
+ Partials 308 306 -2
This PR cannot close #2652, because the problem in #2652 is actually not in "Unified" mode but in "Hybrid" mode. In #2652 (comment) , only the But it's still fine to support unified cgroup v2 only, for now.
Oh, sorry, I did not know that. @STRRL was talking about cgroup v2 there, so I assumed unified mode. I don't have an environment with cgroups in hybrid mode, and I don't know the root cause of the issue in #2652 or why it's related to blkio, so I'll leave it out of the scope of this PR.
Also, I can backport this fix to |
I tested this patch with different container runtimes; it works on containerd but does not work on Docker. It seems Docker isolates the "cgroup namespace" for the container, but containerd (and also CRI-O) does not. When we call
But chaos-daemon (in Docker) would return
The latter would not work well. Should we also mount the host's /proc into chaos-daemon, like /host-sys? @YangKeao
Should I add a condition to apply it only when runtime is containerd? Docker runtime is deprecated in k8s anyway, maybe Chaos Mesh should do the same and begin deprecating docker runtime? |
I think this PR would not require that. I think the support for docker should be finished in the next several PRs. |
Hi @igcherkaev , could you resolve the conflicts? Then appending an entry in the |
@STRRL sure, will work on that. |
Signed-off-by: Igor Cherkaev <igor.cherkaev@copart.com>
Signed-off-by: SiyuChen <ryougi201@gmail.com>
* chore: move fake_image to linux_amd64 Signed-off-by: SiyuChen <ryougi201@gmail.com> * chore: fix upload image Signed-off-by: SiyuChen <ryougi201@gmail.com> * fix: remove deafult project env Signed-off-by: SiyuChen <ryougi201@gmail.com> * chore: add FakeImage to fake_image Signed-off-by: SiyuChen <ryougi201@gmail.com> * chore: update e2e image name Signed-off-by: SiyuChen <ryougi201@gmail.com> * chore: fix e2e test Signed-off-by: SiyuChen <ryougi201@gmail.com> * chore: fix e2e test Signed-off-by: SiyuChen <ryougi201@gmail.com> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
* fix(ui): pod phases should be first letter capitalized Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: unify label case Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
* feat: OpenAPI to TypeScript API Client and Forms Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: license checker Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: go (verify) Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: add comments Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: reuse kubebuilder marks Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: update according to RFC Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: prevent ui:form appearing in crds Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: add missing actions Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: update lockfile Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: support PhysicalMachineChaos Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: merge commands into codegen Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: license checker Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: ci Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: supplement another ignore check Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: update package info Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: update README Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: update Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: gen HTTPChaos Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: update changelog Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: remove nested ignores Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: remove remaining markers Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: typo Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: supplement comments Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Signed-off-by: imlonghao <git@imlonghao.com> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
…-mesh#2921) * helm: update archived data's ttl Signed-off-by: cwen0 <cwenyin0@gmail.com> * update helm doc Signed-off-by: cwen0 <cwenyin0@gmail.com> * fix comments Signed-off-by: cwen0 <cwenyin0@gmail.com> * fix comments Signed-off-by: cwen0 <cwenyin0@gmail.com> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
* fix wrong comment on workflow Dashboard HTTP API Signed-off-by: xiang <xiang13225080@163.com> * update swagger Signed-off-by: xiang <xiang13225080@163.com>
* add customized logger for daemon and bpm Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * extract the grpc metadata into the context Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * make check Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * make check Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * remove log parameter in killIOChaos Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add logs Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * remove log parameter in httpchaos server Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add the context comments for process builder Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * make check Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * use the logger in arguments Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add back boilerplate Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * fix make check Signed-off-by: YangKeao <yangkeao@chunibyo.icu>
* add auto compile github action Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * modify the name of action Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * remove comment expression Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add comments validation Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * rename the zst file name Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * auto detect buildenv and devenv Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * fix artifact url Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add debug information Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add cache and debug artifact url Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * fix bash script Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * modify cache step name Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * use another docker buildx driver Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add download bash script Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add boilerplate Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add some comments and help message Signed-off-by: YangKeao <yangkeao@chunibyo.icu> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Signed-off-by: Igor Cherkaev <igor.cherkaev@copart.com>
Hi @igcherkaev, the |
Signed-off-by: Igor Cherkaev <igor.cherkaev@copart.com>
Hmm, I thought I'd done that. Did it again and pushed the changes. |
LGTM
LGTM!
Thanks for your contribution!
/merge |
This pull request has been accepted and is ready to merge. Commit hash: a434135
/run-e2e-tests |
I think DCO check fails due to rebasing my branch and including commits made by other folks. |
I would manually set it to PASS. 🥲 |
Thank you! Glad my efforts didn't go to waste :) |
What problem does this PR solve?
Support the cgroup v2 driver when running stress experiments on Linux (CPU and memory). If the host and containers where chaos-daemon is running are in full cgroup v2 (unified) mode, users will no longer see:
when they run CPU or memory stress experiments.
What's changed and how it works?
Introduced a cgroup v2 manager that adds newly created stress processes to the respective container's cgroup, depending on which cgroup driver the containers are using.
P.S. I did not re-order the import statements myself; it must have been done by the format/lint/tidy goals in the Makefile when I ran make check or make test. I can revert these changes if need be.
Related changes
chaos-mesh/website
Dashboard UI
Checklist
Tests
Start Kubernetes 1.21+ with cgroupfs and the cgroup v2 driver.
Verify which cgroup driver you have on the node and in the container:
Schedule a new CPU/memory stress experiment and verify that it works by checking the cgroup assigned to the respective processes on the worker nodes.
Verify it on the node by using the PIDs of the main process in the container and of the stress-ng/memstress process:
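The checks in the steps above can also be sketched in Go by parsing `/proc/<pid>/cgroup`. Each line there has the form `hierarchy-ID:controller-list:path`, and the cgroup v2 entry is the one with hierarchy ID `0` and an empty controller list. The helper names `parseProcCgroup` and `sameV2Cgroup` and the `cgroupInfo` type are hypothetical and not part of this PR; classifying the mode from this file alone is an assumption that should hold on typical systemd hosts.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// cgroupInfo summarizes the contents of a /proc/<pid>/cgroup file.
type cgroupInfo struct {
	Mode   string // "unified", "hybrid", or "legacy"
	V2Path string // path of the "0::" entry, empty in legacy mode
}

// parseProcCgroup classifies the cgroup setup from the file contents:
// only a "0::" line means unified, "0::" plus v1 lines means hybrid,
// and only v1 lines means legacy.
func parseProcCgroup(content string) (cgroupInfo, error) {
	var info cgroupInfo
	hasV1, hasV2 := false, false
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		parts := strings.SplitN(line, ":", 3)
		if len(parts) != 3 {
			return info, fmt.Errorf("malformed cgroup line %q", line)
		}
		if parts[0] == "0" && parts[1] == "" {
			hasV2 = true
			info.V2Path = parts[2]
		} else {
			hasV1 = true
		}
	}
	if err := sc.Err(); err != nil {
		return info, err
	}
	switch {
	case hasV2 && !hasV1:
		info.Mode = "unified"
	case hasV2 && hasV1:
		info.Mode = "hybrid"
	case hasV1:
		info.Mode = "legacy"
	default:
		return info, fmt.Errorf("no cgroup entries found")
	}
	return info, nil
}

// sameV2Cgroup reports whether two processes sit in the same v2 cgroup,
// e.g. the container's main process and the stress process.
func sameV2Cgroup(a, b cgroupInfo) bool {
	return a.V2Path != "" && a.V2Path == b.V2Path
}

func main() {
	data, err := os.ReadFile("/proc/self/cgroup")
	if err != nil {
		fmt.Println("cannot read cgroup file:", err)
		return
	}
	info, err := parseProcCgroup(string(data))
	if err != nil {
		fmt.Println("cannot parse cgroup file:", err)
		return
	}
	fmt.Printf("mode=%s v2path=%s\n", info.Mode, info.V2Path)
}
```

To compare the container's main process with the stress process, read `/proc/<pid>/cgroup` for each PID on the node, parse both, and pass the results to `sameV2Cgroup`.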
Side effects
N/A
Release note
DCO
If you find that the DCO check fails, please run commands like the ones below to fix it (the exact commands depend on the situation, for example if the failed commit isn't the most recent):