flux-jobs: improve recursive job listing #4024
4c12325 to c04b07f
Wow, there are a lot of great improvements in here! Approving, just a couple of minor comments.
src/bindings/python/flux/job/info.py
Outdated
@@ -191,7 +191,7 @@ def __getattr__(self, attr):
raise AttributeError
In commit message for c4a3555, missed a "t" in "AtributeError".
doc/man1/flux-jobs.rst
Outdated
**--recurse-all**
By default, jobs not owned by the user running ``flux jobs`` are
skipped with ``-R, --recursive``. This option forces the command
to attempt to recurse into the jobs of other users. Implies
``--recursive``.
Maybe add a note that "By default, Flux instances can only permit the instance owner to connect"?
Great idea! I added that comment, rebased on current master and pushed the result.
c04b07f to 78d0cac
Problem: In a multi-user instance, flux-jobs -R, --recursive will attempt to recurse into all jobs found in the initial listing. If the -A or --user=all options are used, then many of these jobs may be for different users, which almost certainly will result in failures to connect, greatly slowing the program down as each flux_open(3) fails. Only recurse into jobs owned by the current user by default, unless a new --recurse-all option is provided.
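The default described in this commit message amounts to a filter on job ownership before any connection is attempted. A minimal sketch, assuming a hypothetical `Job` record and a hypothetical helper `jobs_to_recurse` (neither is taken from the flux Python bindings, and the URIs are made up):

```python
import os
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical stand-in for a job record; not the flux JobInfo class."""

    jobid: int
    userid: int
    uri: str


def jobs_to_recurse(jobs, recurse_all=False, current_userid=None):
    """Return the jobs that recursion should attempt to connect to.

    By default only jobs owned by the current user are kept; with
    recurse_all=True every job is returned, whoever owns it.
    """
    if current_userid is None:
        current_userid = os.getuid()
    if recurse_all:
        return list(jobs)
    return [job for job in jobs if job.userid == current_userid]


# Illustrative data only: one job owned by uid 1000, one by uid 2000.
jobs = [Job(1, 1000, "local:///tmp/a"), Job(2, 2000, "local:///tmp/b")]
mine = jobs_to_recurse(jobs, current_userid=1000)
everyone = jobs_to_recurse(jobs, recurse_all=True, current_userid=1000)
```

Filtering before connecting is what avoids the slow failed connection attempts mentioned above: jobs owned by other users are never opened unless the caller opts in.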
Problem: The --recurse-all option is not documented in the flux-jobs man page. Add documentation for this option.
Problem: flux-jobs -R, --recursive currently queries recursive job lists serially, which can be quite slow especially if one or more connections time out or get errors. Use a ThreadPoolExecutor to run all recursive job-list queries.
Problem: It may be useful to allow the user to tune the max number of worker threads when flux-jobs uses a ThreadPoolExecutor. Add a --threads option to set the max_workers option in the ThreadPoolExecutor constructor.
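The two commits above can be pictured with a short sketch: one query per child instance is submitted to a `ThreadPoolExecutor`, and a `--threads` value is passed through as `max_workers`. This is not the flux-jobs implementation; `fetch_joblist`, `fetch_all`, and the URIs are hypothetical placeholders so the example runs without Flux.

```python
import argparse
import concurrent.futures


def fetch_joblist(uri):
    """Hypothetical placeholder for connecting to a child instance and listing its jobs."""
    if uri.endswith("bad"):
        # Simulate a failed connection to an instance we cannot open.
        raise OSError(f"cannot connect to {uri}")
    return [f"{uri}/job0", f"{uri}/job1"]


def fetch_all(uris, max_workers=None):
    """Run one query per URI in a thread pool; return (results, errors) keyed by URI."""
    results, errors = {}, {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(fetch_joblist, uri): uri for uri in uris}
        for future in concurrent.futures.as_completed(futures):
            uri = futures[future]
            try:
                results[uri] = future.result()
            except OSError as exc:
                # A failed query is recorded without blocking the other queries.
                errors[uri] = str(exc)
    return results, errors


parser = argparse.ArgumentParser()
parser.add_argument(
    "--threads",
    type=int,
    metavar="N",
    default=None,
    help="max worker threads (default: chosen by ThreadPoolExecutor)",
)
args = parser.parse_args(["--threads", "4"])
results, errors = fetch_all(
    ["local:///a", "local:///b", "local:///bad"], max_workers=args.threads
)
```

Leaving `max_workers` as `None` lets `ThreadPoolExecutor` pick its own default, so the option only matters when a user wants to cap or raise concurrency.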
Problem: The flux-jobs --threads option is not documented. Add a section describing the --threads option to the flux-jobs(1) manpage.
Problem: t2800-jobs-recursive.t doesn't test handling of jobs owned by users other than the current user and the --recurse-all option. Add support for running a fake job as an alternate user to this test, and ensure that this job is skipped by default for recursion, even with -A, and recursion on this job is attempted with --recurse-all.
78d0cac to fe89e96
Rebased after #4022 landed. Ok to set MWP?
went ahead and set MWP.
Codecov Report
@@ Coverage Diff @@
## master #4024 +/- ##
==========================================
+ Coverage 83.32% 83.36% +0.04%
==========================================
Files 373 373
Lines 61060 61078 +18
==========================================
+ Hits 50877 50918 +41
+ Misses 10183 10160 -23
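This PR makes some improvements to recursive job listing:
- Jobs owned by other users are skipped during recursion unless --recurse-all is used.
- Recursive job-list queries are run in a ThreadPoolExecutor instead of serially.
- A --threads option is added to optionally adjust the max_workers value for the recursive jobs as well as instance-info gathering ThreadPools.

This is based on top of #4022.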