Add tests for cgroup subsystem availability #6520
@babsingh Most of the test code is copied from the implementation. If I were to continue to add tests this way there will be a lot of test code, especially for the iterator functions where I would have to copy over the structs as well. However there is no way to check if the api returns correct values other than reading from the cgroup and cgroup metric files. Would it be better to just check that the values returned by the api make sense rather than checking if they are the same as the values in the cgroup files? |
@EricYangIBM Copying code from the actual implementation is not the only path. In the tests, we just need an alternate implementation to derive the values from cgroup and cgroup metric files for verification purposes. Since this is test code, we can incorporate more expensive approaches such as using C++ libraries and third-party libraries to achieve our goals. This would reduce the amount of work to derive the values from cgroup and cgroup metric files. Example libraries which can be used for verification:
Copying code from the actual implementation also has a drawback: if our actual implementation has a bug it won't be caught in our tests. I will leave it up to you on which libraries to incorporate in the test code. We want to choose an approach which will minimize test code and our effort. For cgroup, we only need something that works on all Linux platforms (x, p, z, arm, arch). But, it will be preferable to use libraries which will work on all OMR supported platforms. This will allow your test examples to be used by others in the future due to their platform agnostic property.
This won't offer complete verification. There is a probability that the values returned by the API may seem valid (make sense) but they can still be incorrect. |
I haven't been able to find any third-party libraries that match our API. If we use the C++ standard library, the test code will still have the same logic as our APIs, with similar complexity and a similar amount of code. |
@EricYangIBM Can you provide more details on how the values will be checked? We can get a second opinion if this approach will be sufficient. @pshipton @tajila for a second opinion and ideas for easily achieving complete verification. Note: Values generated by some of the cgroup API can be used by autoscalers which rely upon information about system resources (CPU, memory) to make decisions. Having incorrect values will lead to bad decisions in terms of perf, or potentially fatal decisions if the resource limit is exceeded. |
Maybe something like https://github.com/eclipse/omr/blob/974dee8597698debd9babcd12af9d4aa4b207cb4/fvtest/porttest/si.cpp#L1245? It checks that no errors are returned and that the memory limit is >0 and the usage is < limit. There will be no way to check some of the metrics e.g. Throttled count, oom_kill_disable (a boolean). |
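A sanity check of this kind could look roughly like the sketch below. The struct and its field names are hypothetical stand-ins for the values the port library returns, not the actual OMR API; the function only encodes the three conditions mentioned above (no error, limit > 0, usage < limit).

```cpp
#include <cstdint>

// Hypothetical container for values obtained from the API under test.
// The field names are illustrative and do not come from OMR.
struct MemoryInfo {
	int32_t rc;      // return code of the API call, 0 meaning success
	uint64_t limit;  // reported memory limit in bytes
	uint64_t usage;  // reported memory usage in bytes
};

// Returns true when the values pass the bound checks: the call succeeded,
// the limit is non-zero, and the usage is strictly below the limit.
bool memoryInfoLooksSane(const MemoryInfo &info)
{
	if (0 != info.rc) {
		return false;
	}
	if (0 == info.limit) {
		return false;
	}
	return info.usage < info.limit;
}
```

Checks like this only reject obviously wrong values. A plausible but incorrect value still passes, which is the limitation raised earlier in the thread.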
We can prioritize the metric values related to perf (CPU, memory) for stricter verification. To minimize work, we can use a set of generic functions that read a specified file and apply a regex to derive the required value. Other metrics can have simple bound checks.
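The generic read-and-match helpers described above could be sketched as follows. The function names readFile and extractUint64 are assumptions made for illustration and are not taken from the PR; the code uses only the C++ standard library.

```cpp
#include <cstdint>
#include <exception>
#include <fstream>
#include <regex>
#include <sstream>
#include <string>

// Reads the whole file at `path` into `out`.
// Returns false if the file cannot be opened.
bool readFile(const std::string &path, std::string &out)
{
	std::ifstream in(path);
	if (!in.is_open()) {
		return false;
	}
	std::ostringstream buffer;
	buffer << in.rdbuf();
	out = buffer.str();
	return true;
}

// Searches `text` for `pattern` and parses the first capture group as an
// unsigned 64-bit integer. Returns false if the pattern is invalid, does not
// match, has no capture group, or the captured text is not a valid number.
bool extractUint64(const std::string &text, const std::string &pattern, uint64_t &value)
{
	try {
		const std::regex re(pattern);
		std::smatch match;
		if (!std::regex_search(text, match, re) || (match.size() < 2)) {
			return false;
		}
		value = std::stoull(match[1].str());
	} catch (const std::exception &) {
		return false;
	}
	return true;
}
```

A test could then read a metric file such as cpu.stat (its location depends on the cgroup version and mount point, so the path is left to the caller), call extractUint64(contents, "nr_throttled ([0-9]+)", expected), and compare expected with the value returned by the API.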
Sure. All cgroup tests will be consistent i.e. written in C++. It will be an alternate impl. The drawback of reusing the current impl won't apply. It should be less code. Plus, you will get C++ experience. |
191eba3 to 4ede54e
Added tests |
9b568ad to c3adfdb
|
Yes |
jenkins build all |
Tests are failing: https://ci.eclipse.org/omr/job/PullRequest-linux_x86/2795/consoleFull
14:18:25 6: [ FAILED ] CgroupTest.sysinfo_cgroup_get_available_subsystems
14:18:25 6: [ FAILED ] CgroupTest.sysinfo_cgroup_are_subsystems_available
14:18:25 6: [ FAILED ] CgroupTest.sysinfo_cgroup_enable
Infra changes, for running tests on cgroup v1/v2 systems, are still WIP: #6525.
For the time being, can you locally verify if these tests pass on both cgroup v1 and v2 systems?
Also, the commits can be squashed.
3ffc941 to e3d0d91
jenkins build linux |
jenkins build xlinux |
jenkins build linux |
jenkins build all |
@EricYangIBM Not everyone can launch PR builds. Message me on Slack, and I will launch them for you. |
I think the error is in the regex instantiation: |
All the machines, where the tests are failing, have:
This @jdekonin @AdamBrousseau We were updating machines to use a newer Ubuntu OS and newer gcc/g++. What's the state of this work (expected timeline)? Are we planning to adopt gcc/g++-11? |
We have two machines:
These machines have
CHANGE the below line: https://github.com/eclipse/omr/blob/dc315bf5cbe9bd629da73dcb186c77e0753ba3a1/buildenv/jenkins/omrbuild.groovy#L307
TO run tests on the cgroup v1 machine:
TO run tests on the cgroup v2 machine:
|
jenkins build all |
jenkins build x32linux |
fyi @jdekonin The following failures were seen on Failures
Failing build: https://ci.eclipse.org/omr/job/PullRequest-linux_x86/2798/consoleFull
Resolved build (after installing
There are other failures seen for the resolved build:
|
Thanks @babsingh. I've updated ub20-x64-omr3 -> ub20-x64-omr6 as well. Those are pending a jiro PR merge that should go in over the weekend. |
@babsingh the test failure is in porttest
Is this the same failure you saw in #6525 (comment)? |
Current test output, from https://ci.eclipse.org/omr/job/PullRequest-linux_x86/2799/consoleFull, does not indicate the failing test. So, verifying via the PR build in #6533; it has Also,
|
jenkins build all |
3fe8478 to db999a8
- omrsysinfo_get_cgroup_subsystem_list looks good to me.
- Both commits mention: "add test for omrsysinfo_get_cgroup_subsystem_list". This should be fixed.
reportTestExit(OMRPORTLIB, testName);
return;
}
#endif /* !defined(LINUX) || (GTEST_GCC_VER_ >= 40900) */
We should send a message to the OMR user that the Cgroup tests are disabled. Once all the machines in the OMR build-farm have newer compilers, we should attempt to change this to an #error.
#endif /* !defined(LINUX) || (GTEST_GCC_VER_ >= 40900) */
#else /* !defined(LINUX) || (GTEST_GCC_VER_ >= 40900) */
#warning "Cgroup tests are disabled due to an unsupported compiler."
#endif /* !defined(LINUX) || (GTEST_GCC_VER_ >= 40900) */
Testing locally:
error: #warning "Cgroup tests are disabled due to an unsupported compiler." [-Werror=cpp]
#warning "Cgroup tests are disabled due to an unsupported compiler."
Won't the builds fail with this?
Ref: https://gcc.gnu.org/onlinedocs/gcc-5.3.0/gcc/Warnings-and-Errors.html
Warnings report other unusual conditions in your code that may indicate a problem, although compilation can (and does) proceed. Warning messages also report the source file name and line number, but include the text ‘warning:’ to distinguish them from error messages.
We are compiling with -Werror, which converts all warnings to errors.
Does this work?
#pragma message("Cgroup tests are disabled due to an unsupported compiler.")
Works for me locally but may not be supported on older gcc. Pushed the changes, can you try running the builds?
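For reference, the guard discussed in this thread could be arranged roughly as in the sketch below. The condition is the one shown in the diff; CGROUP_TESTS_ENABLED and cgroupTestsEnabled are hypothetical names added only to make the sketch self-contained. In the real test GTEST_GCC_VER_ comes from the gtest headers; here an undefined macro simply evaluates to 0 in the #if. GCC documents #pragma message as informational only, neither a warning nor an error, which is why it is not affected by -Werror the way #warning is.

```cpp
// Sketch only: CGROUP_TESTS_ENABLED is a hypothetical helper macro, not part of OMR.
#if !defined(LINUX) || (GTEST_GCC_VER_ >= 40900)
#define CGROUP_TESTS_ENABLED 1
#else /* !defined(LINUX) || (GTEST_GCC_VER_ >= 40900) */
/* Informational message; unlike #warning it is not promoted by -Werror. */
#pragma message("Cgroup tests are disabled due to an unsupported compiler.")
#define CGROUP_TESTS_ENABLED 0
#endif /* !defined(LINUX) || (GTEST_GCC_VER_ >= 40900) */

// Returns true when the cgroup tests were compiled in.
inline bool cgroupTestsEnabled()
{
	return 0 != CGROUP_TESTS_ENABLED;
}
```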
Not sure what happened with the Windows build, but the ppc failures also appear in other PRs.
Add tests for omrsysinfo_cgroup_is_system_available, get_available_subsystems, are_subsystems_available, get_enabled_subsystems, enable_subsystems, and are_subsystems_enabled. Also add test for omrsysinfo_get_cgroup_subsystem_list. Issue: eclipse-omr#1281 Signed-off-by: Eric Yang <eric.yang@ibm.com>
db999a8 to 343a93a
jenkins build all |
x86 and PPC are known failures:
The Windows build crashed very quickly (3.68 sec). Probably related to infra. Let me see if it goes away with a rerun. |
jenkins build win |
LGTM. @0xdaryl Passing the PR to you for review/merge.
Add tests for omrsysinfo_cgroup_is_system_available, get_available_subsystems,
are_subsystems_available, get_enabled_subsystems, enable_subsystems, and
are_subsystems_enabled. Also add test for omrsysinfo_get_cgroup_subsystem_list.
Issue: #1281
Signed-off-by: Eric Yang <eric.yang@ibm.com>