New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add flux-pstree command #4026
add flux-pstree command #4026
Conversation
This pull request introduces 1 alert when merging a701c17 into 412831f - view on LGTM.com new alerts:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks awesome, just some initial few comments
src/bindings/python/flux/util.py
Outdated
): | ||
if pstack is None: | ||
pstack = [] | ||
if level and level < len(pstack): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be level > len(pstack)
? just doesn't seem right to be <
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we want to stop recursion when the actual depth (represented by the length of pstack
) is greater than the requested level
. Perhaps the variable names are confusing here, I'll fix that (level
is the requested maximum depth to recurse, not the current level of the tree recursion)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahhh, i think that was it, I was thinking current level not max level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I renamed level
to max_level
in _render()
in the most recent fixup command (and reversed the sense of the test, you were right it was pretty confusing)
label = f"{self.count}*[{self.label}]" | ||
else: | ||
label = self.label | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
took me awhile to grok this recursion. I'm sure it was hard to write :-)
I think the pstack[:-1]
sort of threw me for a loop b/c we're not using the last piece of the pstack, so it sort of doesn't make sense until you walk yourself through it a bit. Like one level of recursion down, the append to the pstack isn't used at all.
I'm sure some comments would help.
One random idea (possibly not good, but just brainstorming), since the last append of the pstack
isn't used when you recurse only one level down, instead of appending to pstack
, how about just pass a variable down. Like hypothetically:
def _render(
self,
pstack=None,
next_in_pstack=None
connector="",
style="box",
level=None,
truncate=None,
):
So like if next_in_pstack != None
then you append to pstack, then set next_in_pstack
to the next recursive step?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it help if you think of the stack as the stack of characters required in before the tree connector if you need to recurse another level down? Your suggestion to pass the next character required as a separate argument seems good, but I don't see much difference between the current approach and that (except adding more parameters to the function). Perhaps some comments will help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, some comments would help. The way I eventually thought about it is that you are passing a connector and an additional prefix into the next level of recursion. It just so happens that the additional prefix is never used in the next recursive level, but is only used 2 recursions later.
Another brainstorming idea (again, possibly not good), you could do something like:
if pstack[-1] == connectors.TEE
pstack[-1] = connectors.PIPE
pstack.append(connects.TEE)
kinda thing in the recursive step so that the whole pstack
is just output everytime you recurse. I'm not sure if it's a net win logic wise or not. Perhaps simplifying the output code but complicating the recursive step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know, it feels like we're getting into a matter of opinion and style here and I'd prefer to keep the code the way I wrote it since I understand that approach a little better. Let me push my fixup commit with additional comments and see what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
acknowledged, its a style thing at this point. The comments you added definitely help.
src/bindings/python/flux/util.py
Outdated
|
||
self.duplicates = {} | ||
self.children = [] | ||
self.count = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would suggest combine_count
or duplicate_count
or something like that for this variable, as I thought we might be counting levels or something. Below
if self.count > 1:
label = f"{self.count}*[{self.label}]"
was initially confusing, I had no idea what was going on until I figured out "ohh this is the duplicate count label".
perhaps increment
below would be renamed as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems like a valid suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in most recent push
src/bindings/python/flux/util.py
Outdated
for child in self.children: | ||
if child == self.children[-1]: | ||
pstack.append(connectors.BLANK) | ||
connector = connectors.ELBOW |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if connector
is the best variable since you have the connectors
variable. Although unsure of a better variable name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll try to think of something better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed connectors
to cstyle
in latest fixup commit.
src/cmd/flux-pstree.py
Outdated
parser.add_argument( | ||
"--no-skip-root", | ||
action="store_false", | ||
dest="skip_root", | ||
help="Include root in output", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very first time I ran this I got:
flux
flux
and I thought something was broken. But running flux pstree --no-skip-root
, I got what I thought I would see
.
├── flux
└── flux
dunno if default should be with root?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I struggled with this as well, and had it the way you suggest for awhile. The problem I ran into is that including root doesn't add much useful data, except when you catch a couple instances before they have run any jobs as you have here, and makes it more likely the tree will overflow the terminal.
However, maybe my contrived examples were deeper than in the common case, so possibly it won't be a problem in reality.
Another idea would be to include the "root" in the normal case, but exclude in when printing "detailed" information, in which case the output would be less likely to be confusing and we'd be more likely to exceed the current terminal width.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another idea would be to include the "root" in the normal case, but exclude in when printing "detailed" information, in which case the output would be less likely to be confusing and we'd be more likely to exceed the current terminal width.
I like this idea.
In time perhaps a jobid input would go along with flux pstree
? In that case I could see skip-root
being normal (sort of like when you do pstree PID
). But for the general listing of all jobs of a user, listing root seems right? Sort of like the default output of pstree
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flux pstree
can already take a list of JOBIDs.
The real problem with "root" is that we can't easily create a JobInfo
object for it, and for test instances or root instances run under a foreign RM there isn't even a jobid, so the comparison to pstree
falls flat there. However, since the root is handled specially anyway and the label is always forced to .
, maybe all that is fine.
I can also try to make a "fake" JobInfo object for the enclosing instance and see how that goes.
This pull request introduces 1 alert when merging b6efea6 into 0bc2092 - view on LGTM.com new alerts:
|
src/bindings/python/flux/util.py
Outdated
truncate=None, | ||
): | ||
|
||
# `pstack` is a stack of prfix characters used during |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo prfx
src/bindings/python/flux/util.py
Outdated
# | ||
# Note: the connector at the top of the stack is always | ||
# the connector required for the _next_ level down in the | ||
# tree. This is because we know what the prefx connector |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo prefx
label = f"{self.count}*[{self.label}]" | ||
else: | ||
label = self.label | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
acknowledged, its a style thing at this point. The comments you added definitely help.
@chu11: The most recent fixup commit here does the following
Let me know what you think. |
After testing on fluke I found that To remedy, I added support for using a ThreadPoolExecutor to |
It's lookin pretty good! One high level comment. Do you imagine the Second high level comment (now that I read the above), |
My first thought was to redo Additionally, having the class in the bindings allows easier unit testing.
That's a good question. Since Python namespaces everything, the class is actually |
Problem: There is currently no way to display tree-like information in the Python bindings. Add flux.util.Tree, a simple helper class for the display of tree-like data, with optional support for combining identical leaf nodes, as in pstree(1).
Problem: Flux doesn't have a tool to effectively display and list information for job hierarchies. Add a `flux pstree` command, which by default displays a tree of running jobs by job name, similar to the pstree(1) system command. Like pstree(1), identical leaves of the tree are combined, which should result in condensed output when many jobs share the same name. The flux-pstree(1) command supports custom node labels, which accept any `JobInfo` format fields (i.e. anything supported by flux-jobs(1)), and "parent" nodes (i.e. Flux instances) may be labeled separately from simple jobs. The command only lists actively running jobs by default, but similar to flux-jobs(1) supports an `-a, --all` option to query jobs in all states for the current user. In the case `-a` is used, then the label will automatically include the job state, unless separately overridden on the commandline. flux-pstree(1) additionally supports listing of extended information before the tree display with the `-x, --extended` and `-d, --details=NAME` options, e.g.: $ flux pstree -x JOBID USER ST NTASKS NNODES RUNTIME ƒ3KrufpK grondo R 1 1 13.61s flux ƒUXhfWP grondo R 1 1 11.77s └── flux ƒUAwqZZ grondo R 1 1 10.03s └── flux ƒ2DrcgAs grondo R 1 1 6.400s └── sleep Several builtin detail formats are available via the `-d, --details` option, e.g. progress, resources, and stats. $ flux pstree --details=stats JOBID STATS RUNTIME ƒ3KrufpK PD:2 R:1 CD:0 F:0 0:01:24 flux ƒUXhfWP PD:0 R:1 CD:0 F:0 0:01:22 └── flux ƒUAwqZZ PD:934 R:1 CD:65 F:0 0:01:20 └── flux ƒ2LrqHPN 0:00:01 └── sleep A custom details format can be supplied with the `--detail-format=FORMAT` option. Trees are truncated if they exceed the current value of the COLUMNS environment variable, or the terminal size if COLUMNS is not set. To disable this behavior, a `-l, --long` option is provided. By default, the enclosing instance, or root of the tree, is skipped when -x, --extended and/or -d, --details=NAME options are used, but the root is displayed when flux-pstree(1) is run without these options. This behavior can be changed with `--skip-root=[yes|no]`. See `flux pstree --help` for other available options.
Problem: The limit of 4 characters for !P conversion spec causes issues in boundary conditions such as 0.05% and 99.9%. Update the conversion to allow one digit more precision. This causes the field to now take 5 characters instead of 4. As a side benefit, CORE% can now be used as the default heading instead of CPU%, which could be seen as slightly misleading. Update documentation. Update tests.
acc31be
to
8029f61
Compare
Problem: The flux.util.Tree class doesn't have any unit tests. Add python/t0026-tree.py to test basic functionality of the Tree class.
I've added some tests and a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall LGTM, just a few comments below
@@ -0,0 +1,231 @@ | |||
============== |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this manpage worthy of being added to flux help output? I would think so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure at this point if it rose to that level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its certainly a "user tool", so not in the same class as lets say flux keygen
. But it's clearly a more advanced user tool. I dunno, I would have leaned yes, but it's definitely borderline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead and added it. Thanks for the suggestion!
SEE ALSO | ||
======== | ||
|
||
:man1:`flux-jobs` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should add pstree
to flux jobs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Will do.
t/t2803-flux-pstree.t
Outdated
test_expect_success 'flux-pstree truncates at COLUMNS' ' | ||
name="truncated" && | ||
COLUMNS=16 flux pstree -a > ${name}.output && | ||
test_debug "cat ${name}.output" && | ||
cat <<-EOF >${name}.expected && | ||
. | ||
├── flux | ||
│ ├── flux | ||
│ │ └── sle+ | ||
│ └── flux | ||
│ └── sle+ | ||
└── flux:CD | ||
EOF | ||
test_cmp ${name}.expected ${name}.output | ||
' | ||
test_expect_success 'flux-pstree truncates at COLUMNS' ' | ||
name="truncated" && | ||
COLUMNS=16 flux pstree -a > ${name}.output && | ||
test_debug "cat ${name}.output" && | ||
cat <<-EOF >${name}.expected && | ||
. | ||
├── flux | ||
│ ├── flux | ||
│ │ └── sle+ | ||
│ └── flux | ||
│ └── sle+ | ||
└── flux:CD | ||
EOF | ||
test_cmp ${name}.expected ${name}.output | ||
' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
duplicated test? (or if its different and i'm missing the difference, the title of the test is the same atleast)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Accidental cut-and-paste. Will fix.
Problem: The flux-pstree command is not tested in the testsuite. Add t2803-flux-pstree.t with a suite of tests for flux-pstree(1).
Problem: No documentation for flux-pstree exists. Add a first-cut flux-pstree(1) man page.
Problem: The flux-pstree(1) command offers a different way to view Flux jobs recursively, but is not referenced in flux-jobs(1). Add a reference to flux-pstree(1) in the SEE ALSO section of flux-jobs(1).
Ok, I've addressed @chu11's comments and will now set MWP. |
Codecov Report
@@ Coverage Diff @@
## master #4026 +/- ##
==========================================
+ Coverage 83.35% 83.40% +0.05%
==========================================
Files 373 374 +1
Lines 61122 61356 +234
==========================================
+ Hits 50948 51177 +229
- Misses 10174 10179 +5
|
This PR adds a
flux-pstree(1)
command which offerspstree(1)
like output for Flux job hierarchies.This is still a WIP because there isn't yet any documentation or tests. It seemed a good idea to get some early feedback before taking a pass at either, in case the interface needs to be changed. It is built on top of the two pending PRs #4024 and #4022.
Currently this command only targets jobs for the calling user, as it didn't seem too useful to attempt to list other user's jobs at this time. Perhaps in the future a
-A, --user=all
option could be added if necessary.The commit that adds the command has a good overview, so I won't repeat that here. Mainly I'll just show some examples:
You can watch a complex hierarchy form using
watch -n1 flux pstree
(here I've used the--ascii
option since Unicode box connectors don't show up withasciicast2gif
: