cmake/AddCephTest: bind crimson unittest to different cores #55328
|
jenkins test make check arm64 |
|
For x86 make check, the longest crimson unittest job After(x86) |
|
@ljflores @athanatos @Matan-B @rzarzynski @cyx1231st Please help to review, thanks. |
|
@rosinL So different crimson unittest processes are being bound to core 0? I'd have expected --smp 0 to limit any particular process to a single reactor, but not to limit it to any specific core. |
|
jenkins test make check arm64 |
|
@athanatos In my testing, with --smp 0, the test will report |
|
I ran 2 crimson unittest jobs in parallel. With |
|
@athanatos If there is a better solution, please let me know, thanks. |
|
jenkins test make check arm64 |
e6ef226 to
f1fa30c
Compare
@rosinL Hi Rixin, thank you for your efforts to improve the efficiency of crimson tests. i'd suggest taking a look at https://cmake.org/cmake/help/latest/manual/ctest.1.html#ctest-resource-allocation, and consider CPU shards as a resource. actually, the same applies to the memory.
|
@tchaikov I will do some research and give it a try |
|
jenkins test make check arm64 |
The --cpuset configuration policy looks good to me to distribute the crimson tests to all the available cores in a round-robin way.
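For illustration, the round-robin policy mentioned above could look roughly like the sketch below. This is not the code from this PR; the function name and the test names in the usage note are hypothetical, and it only assumes that each test accepts seastar's `--cpuset` option with a single core id.

```python
from itertools import cycle


def assign_cpusets(test_names, cpu_ids):
    """Pair each test with one core, wrapping around when tests outnumber cores.

    Hypothetical helper for illustration only; it is not part of AddCephTest.
    """
    if not cpu_ids:
        raise ValueError("need at least one cpu id")
    cores = cycle(cpu_ids)  # 0, 1, ..., n-1, 0, 1, ...
    return {name: f"--cpuset {next(cores)}" for name in test_names}
```

Calling `assign_cpusets(["unittest-a", "unittest-b", "unittest-c"], [0, 1])` would put the first and third test on core 0 and the second on core 1. A static assignment like this does not know which tests actually run at the same time, which is one reason a scheduler-aware mechanism such as ctest resource allocation may be preferable.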
|
@tchaikov According to my understanding, we need |
|
jenkins test api |
|
As make check (arm64) is running pretty slowly, can we merge this, or take a further step if needed? @ljflores |
ctest populates the allocated resource using environment variables like |
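As a reference point, the ctest manual documents `CTEST_RESOURCE_GROUP_COUNT`, `CTEST_RESOURCE_GROUP_<num>` (the resource types in a group) and `CTEST_RESOURCE_GROUP_<num>_<TYPE>` (entries of the form `id:<id>,slots:<n>` separated by `;`). A minimal sketch of how a test wrapper might read them follows. The resource type name `cpus` and the helper name are assumptions made for the example, not necessarily what this PR uses.

```python
import os


def allocated_ids(resource_type, env=None):
    """Return the resource ids ctest allocated to this test for one resource type.

    `resource_type` is the lower-case name from the resource spec file
    ("cpus" here is an assumed name, not necessarily the one used in Ceph).
    """
    env = os.environ if env is None else env
    count = int(env.get("CTEST_RESOURCE_GROUP_COUNT", "0"))
    ids = []
    for group in range(count):
        # Comma-separated list of resource types present in this group.
        types = env.get(f"CTEST_RESOURCE_GROUP_{group}", "").split(",")
        if resource_type not in types:
            continue
        # The per-type variable uses the upper-cased resource type name.
        spec = env[f"CTEST_RESOURCE_GROUP_{group}_{resource_type.upper()}"]
        for item in spec.split(";"):
            fields = dict(part.split(":", 1) for part in item.split(","))
            ids.append(fields["id"])
    return ids
```

With such a helper, a wrapper could turn the returned ids into a `--cpuset` argument for the test binary. ctest only sets these variables when it is run with a resource spec file and the test declares `RESOURCE_GROUPS`.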
@tchaikov Do you mean like this: |
left a couple of comments. also, please explain the motivation in the related commit message, and paste the test results from before and after the change to show the improvement it contributes.
60331be to
e786793
Compare
|
jenkins test make check |
|
jenkins retest this please |
|
For x86 (irvingi07, 24 cores), the Total Test time stays the same as the long-tail job. After |
Some older Arm servers run pretty slowly, and make check jobs such as `check-generated.sh` are killed when they hit the job timeout. Make CEPH_TEST_TIMEOUT longer. Signed-off-by: luo rixin <luorixin@huawei.com>
e786793 to
2e124b5
Compare
tchaikov
left a comment
@rosinL Rixin, we are close. lgtm modulo a couple of nits.
When running crimson unittests, the seastar framework always uses cpu0 and only cpu0. With many parallel crimson unittest jobs, all of them run on cpu0 and the other cpu cores can't be used, which makes make check run very slowly and even time out. Use set_property RESOURCE_GROUPS to assign cpu resources to crimson unittests and speed up make check. Fixes: https://tracker.ceph.com/issues/64117 Co-authored-by: Kefu Chai <tchaikov@gmail.com> Signed-off-by: luo rixin <luorixin@huawei.com>
Co-authored-by: Kefu Chai <tchaikov@gmail.com> Signed-off-by: luo rixin <luorixin@huawei.com>
Co-authored-by: Kefu Chai <tchaikov@gmail.com> Signed-off-by: luo rixin <luorixin@huawei.com>
2e124b5 to
7fe2323
Compare
tchaikov
left a comment
lgtm
|
@athanatos hey Sam, do you want to take a final look? |
|
Can we merge this directly? This PR only affects unit tests, and we have no QA suite to test it. |
|
yes, agreed. merged. |
When running crimson unittests, the seastar framework always
uses cpu0 and only cpu0. With many parallel crimson unittest
jobs, all of them run on cpu0 and the other cpu cores can't
be used, which makes make check run very slowly and even
time out. Use set_property RESOURCE_GROUPS to assign cpu resources
to crimson unittests and speed up make check.
Fixes: https://tracker.ceph.com/issues/64117
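For readers unfamiliar with ctest resource allocation: `RESOURCE_GROUPS` only takes effect when ctest is given a resource spec file describing what the machine has. The sketch below shows one way such a file and a matching property value could be produced. It is not the code merged here; the resource name `cpus`, the one-slot-per-core layout, and both helper names are assumptions for illustration.

```python
import json


def make_resource_spec(n_cpus, resource_name="cpus"):
    """Build a ctest resource spec (format version 1.0) with one slot per core.

    `resource_name` is an assumed name; ctest requires it to start with a
    lower-case letter or underscore.
    """
    return {
        "version": {"major": 1, "minor": 0},
        "local": [
            {resource_name: [{"id": str(i), "slots": 1} for i in range(n_cpus)]}
        ],
    }


def resource_groups_value(n_shards, resource_name="cpus"):
    """Value for the RESOURCE_GROUPS test property: n groups of one slot each."""
    return f"{n_shards},{resource_name}:1"


def write_resource_spec(path, n_cpus):
    """Write the spec as JSON so it can be passed to ctest."""
    with open(path, "w") as f:
        json.dump(make_resource_spec(n_cpus), f, indent=2)
```

The written file would be passed with `ctest --resource-spec-file <path>`, and a test asking for a single core would carry `RESOURCE_GROUPS "1,cpus:1"`, so ctest never schedules more such tests concurrently than there are cores listed.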