-
Notifications
You must be signed in to change notification settings - Fork 806
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor bpm: generate uid for each process; remove identifier lock; remove blockingBuffer #2918
Conversation
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. Reviewer can indicate their review by submitting an approval review. |
/run-e2e-tests |
1 similar comment
/run-e2e-tests |
Codecov Report
@@ Coverage Diff @@
## master #2918 +/- ##
==========================================
+ Coverage 36.88% 37.25% +0.37%
==========================================
Files 111 110 -1
Lines 9487 9365 -122
==========================================
- Hits 3499 3489 -10
+ Misses 5666 5564 -102
+ Partials 322 312 -10
Continue to review full report at Codecov.
|
Signed-off-by: xixi <i@hexilee.me>
Signed-off-by: xixi <i@hexilee.me>
Signed-off-by: xixi <i@hexilee.me>
Signed-off-by: xixi <i@hexilee.me>
/run-e2e-tests |
Signed-off-by: xixi <i@hexilee.me>
Signed-off-by: xixi <i@hexilee.me>
/run-e2e-tests |
1 similar comment
/run-e2e-tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/run-e2e-tests |
Signed-off-by: xixi <i@hexilee.me>
Signed-off-by: xixi <i@hexilee.me>
Signed-off-by: xixi <i@hexilee.me>
Signed-off-by: xixi <i@hexilee.me>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! ❤️
My only concern is about the process lifecycle managed in BackgroundProcessManager
: from starting a process to waiting for a process dead, the logic is splited into several methods. IMO, it would be much helpful if we could split the lifecycle of the process out from the uid things.
If you do not want to do more things in this PR in the future, I think we could merge it. @Hexilee |
/merge |
This pull request has been accepted and is ready to merge. Commit hash: 1138ef8
|
Do you mean we shouldn't recycle processes by the uid? |
We should recycle the uid, else we would cause a resource leak. But I find that the path from |
…remove blockingBuffer (chaos-mesh#2918) * support uuid Signed-off-by: xixi <i@hexilee.me> * chaosdaemon fits new bpm Signed-off-by: xixi <i@hexilee.me> * save chaosdaemon/server Signed-off-by: xixi <i@hexilee.me> * make check Signed-off-by: xixi <i@hexilee.me> * add some comments Signed-off-by: xixi <i@hexilee.me> * fix stress chaos Signed-off-by: xixi <i@hexilee.me> * deprecate stdio lock and blockingbuffer Signed-off-by: xixi <i@hexilee.me> * remove buffers in bpm Signed-off-by: xixi <i@hexilee.me> * make check Signed-off-by: xixi <i@hexilee.me> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
* Support cgroup v2 for linux stress experiments Signed-off-by: Igor Cherkaev <igor.cherkaev@copart.com> * Update CHANGELOG.md Signed-off-by: Igor Cherkaev <igor.cherkaev@copart.com> * Trigger verify action when helm chart is modified (#2937) Signed-off-by: SiyuChen <ryougi201@gmail.com> * ci: fix `upload-image` action (#2935) * chore: move fake_image to linux_amd64 Signed-off-by: SiyuChen <ryougi201@gmail.com> * chore: fix upload image Signed-off-by: SiyuChen <ryougi201@gmail.com> * fix: remove deafult project env Signed-off-by: SiyuChen <ryougi201@gmail.com> * chore: add FakeImage to fake_image Signed-off-by: SiyuChen <ryougi201@gmail.com> * chore: update e2e image name Signed-off-by: SiyuChen <ryougi201@gmail.com> * chore: fix e2e test Signed-off-by: SiyuChen <ryougi201@gmail.com> * chore: fix e2e test Signed-off-by: SiyuChen <ryougi201@gmail.com> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io> * fix(ui): pod phases should be first letter capitalized (#2915) * fix(ui): pod phases should be first letter capitalized Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: unify label case Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io> * feat: OpenAPI to TypeScript API Client and Form Data (#2770) * feat: OpenAPI to TypeScript API Client and Forms Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: license checker Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: go (verify) Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: add comments Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: reuse kubebuilder marks Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: update according to RFC Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: prevent ui:form appearing in crds Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: add missing actions Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: update lockfile Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: support PhysicalMachineChaos Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: merge commands into codegen Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: license checker Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: ci Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: supplement another ignore check Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: update package info Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: update README Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: update Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: gen HTTPChaos Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * chore: update changelog Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: remove nested ignores Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: remove remaining markers Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: typo Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> * fix: supplement comments Signed-off-by: Yue Yang <g1enyy0ung@gmail.com> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io> * fix(workflow): no more event after accomplished (#2911) Signed-off-by: imlonghao <git@imlonghao.com> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io> * helm: add TTL configuration on value.yaml and update helm docs (#2921) * helm: update archived data's ttl Signed-off-by: cwen0 <cwenyin0@gmail.com> * update helm doc Signed-off-by: cwen0 <cwenyin0@gmail.com> * fix comments Signed-off-by: cwen0 <cwenyin0@gmail.com> * fix comments Signed-off-by: cwen0 <cwenyin0@gmail.com> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io> * fix wrong comment on workflow Dashboard HTTP API (#2947) * fix wrong comment on workflow Dashboard HTTP API Signed-off-by: xiang <xiang13225080@163.com> * update swagger Signed-off-by: xiang <xiang13225080@163.com> * api: export one module for a group of api (#2824) * move v1alpha1 module to api Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * make check Signed-off-by: YangKeao <yangkeao@chunibyo.icu> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io> * fix: replaced localhost:5000 with ghcr.io as registry name (#2919) * fix: replaced localhost:5000 with ghcr.io as registry name when building images Signed-off-by: Rohan Kumar <rohankmr414@gmail.com> * import fix Signed-off-by: Rohan Kumar <rohankmr414@gmail.com> * import fix Signed-off-by: Rohan Kumar <rohankmr414@gmail.com> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io> * refactor bpm: generate uid for each process; remove identifier lock; remove blockingBuffer (#2918) * support uuid Signed-off-by: xixi <i@hexilee.me> * chaosdaemon fits new bpm Signed-off-by: xixi <i@hexilee.me> * save chaosdaemon/server Signed-off-by: xixi <i@hexilee.me> * make check Signed-off-by: xixi <i@hexilee.me> * add some comments Signed-off-by: xixi <i@hexilee.me> * fix stress chaos Signed-off-by: xixi <i@hexilee.me> * deprecate stdio lock and blockingbuffer Signed-off-by: xixi <i@hexilee.me> * remove buffers in bpm Signed-off-by: xixi <i@hexilee.me> * make check Signed-off-by: xixi <i@hexilee.me> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io> * log: add customized logger for daemon and bpm (#2902) * add customized logger for daemon and bpm Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * extract the grpc metadata into the context Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * make check Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * make check Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * remove log parameter in killIOChaos Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add logs Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * remove log parameter in httpchaos server Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add the context comments for process builder Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * make check Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * use the logger in arguments Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add back boilerplate Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * fix make check Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add auto compile github action (#2948) * add auto compile github action Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * modify the name of action Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * remove comment expression Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add comments validation Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * rename the zst file name Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * auto detect buildenv and devenv Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * fix artifact url Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add debug information Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add cache and debug artifact url Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * fix bash script Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * modify cache step name Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * use another docker buildx driver Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add download bash script Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add boilerplate Signed-off-by: YangKeao <yangkeao@chunibyo.icu> * add some comments and help message Signed-off-by: YangKeao <yangkeao@chunibyo.icu> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io> * Support cgroup v2 for linux stress experiments Signed-off-by: Igor Cherkaev <igor.cherkaev@copart.com> * Update auto-generated code Signed-off-by: Igor Cherkaev <igor.cherkaev@copart.com> Co-authored-by: Siyu Chen <ryougi201@gmail.com> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io> Co-authored-by: Yue Yang <g1enyy0ung@gmail.com> Co-authored-by: imlonghao <git@imlonghao.com> Co-authored-by: CWen <cwenyin0@gmail.com> Co-authored-by: WangXiang <xiang13225080@163.com> Co-authored-by: YangKeao <yangkeao@chunibyo.icu> Co-authored-by: Rohan Kumar <rohankmr414@gmail.com> Co-authored-by: xixi <i@hexilee.me>
…remove blockingBuffer (chaos-mesh#2918) * support uuid Signed-off-by: xixi <i@hexilee.me> * chaosdaemon fits new bpm Signed-off-by: xixi <i@hexilee.me> * save chaosdaemon/server Signed-off-by: xixi <i@hexilee.me> * make check Signed-off-by: xixi <i@hexilee.me> * add some comments Signed-off-by: xixi <i@hexilee.me> * fix stress chaos Signed-off-by: xixi <i@hexilee.me> * deprecate stdio lock and blockingbuffer Signed-off-by: xixi <i@hexilee.me> * remove buffers in bpm Signed-off-by: xixi <i@hexilee.me> * make check Signed-off-by: xixi <i@hexilee.me> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
What problem does this PR solve?
Currently, the bpm locate managed processes by both PID and create time, because the OS may reuse PID, we must check the create time to avoid locating the wrong process.
However, the two-step locating is messy and the create time may be imprecise (we have fixed a relevant bug). In this PR, we generate random UID for each process to avoid id-reusing.
For example, to locate and kill a process, current code is complex:
In this PR, we do that in the following code:
Moreover, the identifier lock caused some dead-lock issues, like #2660 and #2661, so we remove the lock, in this PR, applying requests will fail fast by conflicts of identifiers instead of running into dead-locked.
What's changed and how it works?
To keep compatible with v2.0 and v2.1, current we need a mapping between the
pid pair
(pid and create time) and theUID
.In v2.2, we should store
UID
in the status of CRDs (like podiochaos, podhttpchaos and stresschaos), then deprecate theinstance
(PID) andstartTime
(create time). In future versions like v3, we can remove theinstance
(PID) andstartTime
(createTime) from CRDs.Others
sync.Map
os.Pipe
Related changes
chaos-mesh/website
Dashboard UI
Checklist
Tests
Side effects
Release note
DCO
If you find the DCO check fails, please run commands like below (Depends on the actual situations. For example, if the failed commit isn't the most recent) to fix it: