Add resource leak integration test #6009
87b60f3 to 1c7e6db
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #6009   +/-   ##
========================================
  Coverage    90.21%   90.21%
========================================
  Files          180      180
  Lines        15822    15822
========================================
  Hits         14274    14274
  Misses        1548     1548
f5025f1 to fa13db4
@pytest.mark.usefixtures("instance", "os", "scheduler")
def test_resource_leaks(
Can we avoid having a specific integration test for this, and instead add the logic that checks for file descriptor and resource leaks by default to all the tests? We don't need to wait 30 minutes in every test, but we can still do the check.
I updated the PR to add the logic as functions in utils.py and call them in the starccm test. Unfortunately, I don't think we can add it by default to all the tests unless we add the assert to cluster creation/deletion, which I think we'd want to avoid.
How long does the starccm test run typically?
~20 mins, which is long enough for the resource leak test to fail. I tested running this starccm test with the resource leak issue here and it failed as expected.
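The check discussed in this thread (snapshot the open-file count on each compute node, run the long workload, then compare) can be sketched roughly as below. This is only an illustration: `find_open_file_leaks`, `assert_no_open_file_leaks`, and the `tolerance` parameter are hypothetical names, not the functions actually added to utils.py in this PR.

```python
from typing import Dict


def find_open_file_leaks(
    before: Dict[str, int], after: Dict[str, int], tolerance: int = 0
) -> Dict[str, int]:
    """Return {node_ip: growth} for nodes whose open-file count grew by more than `tolerance`.

    Hypothetical helper: `before` and `after` map a compute node IP to the
    number of open files observed at the start and end of the test.
    """
    leaks = {}
    for ip, count_after in after.items():
        # A node missing from the first snapshot has no baseline, so skip it.
        if ip not in before:
            continue
        growth = count_after - before[ip]
        if growth > tolerance:
            leaks[ip] = growth
    return leaks


def assert_no_open_file_leaks(
    before: Dict[str, int], after: Dict[str, int], tolerance: int = 0
) -> None:
    """Fail the calling test if any node's open-file count grew beyond `tolerance`."""
    leaks = find_open_file_leaks(before, after, tolerance)
    assert not leaks, f"Open file count grew on compute nodes: {leaks}"
```

A long-running test such as the starccm one would take one snapshot right after the cluster is up, a second one after the job finishes, and pass both to `assert_no_open_file_leaks`. A small non-zero tolerance may be needed in practice, since open-file counts fluctuate even without a leak.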
fa13db4 to 1e15161
    return result.stdout.strip()


def check_file_handler_leak(remote_command_executor, slurm_commands, cluster, region):
I'm not sure this function name best describes what this is doing. It seems like this just gets the number of open files.
True, I renamed it to get_compute_ip_to_no_files.
nit: I think "no" is less explicit than "num" as an abbreviation for the number of files. "no" has binary or boolean connotations.
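For illustration, a function that only collects the per-node counts (which is what the reviewer observes this helper does) might look like the sketch below, using the "num" spelling suggested in the nit. Everything here is an assumption rather than the PR's code: the `run_on_node` callable is a hypothetical stand-in for the test suite's remote command execution, and reading `/proc/sys/fs/file-nr` (whose first field is the number of allocated file handles on Linux) is just one possible way to obtain the count.

```python
from typing import Callable, Dict, Iterable

# Assumption: the first field of /proc/sys/fs/file-nr is used as the "number of open files".
FILE_NR_COMMAND = "cat /proc/sys/fs/file-nr"


def get_compute_ip_to_num_files(
    compute_ips: Iterable[str],
    run_on_node: Callable[[str, str], str],
) -> Dict[str, int]:
    """Map each compute node IP to its number of allocated file handles.

    `run_on_node(ip, command)` is a hypothetical callable that runs `command`
    on the node with the given IP and returns its stdout as a string.
    """
    ip_to_num_files = {}
    for ip in compute_ips:
        output = run_on_node(ip, FILE_NR_COMMAND)
        # file-nr prints three whitespace-separated fields: allocated, unused, max.
        ip_to_num_files[ip] = int(output.split()[0])
    return ip_to_num_files
```

Keeping the function limited to collecting counts, with the comparison and assertion done elsewhere, matches the reviewer's point that the name should describe what the function returns.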
da28e00 to 5f04eda
Signed-off-by: Judy Ng <njud@amazon.com>
5f04eda to 599feb1
* Add resource leak check into util, call checks in starccm test
Signed-off-by: Judy Ng <njud@amazon.com>
Description of changes
Tests
References
Checklist
develop add the branch name as prefix in the PR title (e.g. [release-3.6]).
Please review the guidelines for contributing and Pull Request Instructions.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.