-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rewrite flux-resource status #4997
Conversation
Some examples of changed output: Offline nodes now reported separately for each "state":
After
After
No more "all" state, but "all" selects all states even if empty: Before:
After:
online/offline act as filters, not a state in and of themselves:
Note you can get similar output with a format that displays the status instead of state:
|
Great! The one thing that gives me pause is listing nodes that are turned off as
Before
After
Maybe slightly more informative?
Aside, I couldn't get
|
Ah, sorry, I forgot to add a header for I also don't like |
How about UP with y/n values? |
Ok, I added BTW, I would have renamed |
Hm, what about something like this? $ flux resource status -o "{state:>8} {st:>2} {nnodes:>4} {nodelist}"
STATE ON NNODES NODELIST
avail ✓ 2 asp,asp
avail* ✗ 1 asp
drained ✓ 1 asp |
I like it! |
Sorry I missed that comment initially. That seems fine to me. |
I think I like yours more but either way works for me. |
|
I'll add a commit that makes this the new default? $ flux resource status -o "{statex:>8} {up:>2} {nnodes:>6} {nodelist}"
STATE UP NNODES NODELIST
avail y 2 asp,asp
avail n 1 asp
drained y 1 asp Edit: fixed the format |
Sounds good to me. |
Ok I pushed the updated default format but I'll still have to fix all the tests :-( |
Problem: The ability to easily copy an instance of the OutputFormat class, possibly with some fields removed, would be useful when utilities wish to attempt deduplication of similar lines - but this functionality does not exist. Add a copy method to the OutputFormat class with an optional `except_fields` parameter. This allows copying of format strings with some fields removed.
Problem: It is probably not useful to print a blank line in the OutputFormat print_items() method. Skip blank lines in flux.util.OutputFormat.print_items().
Problem: The combination of like lines in formatted output is a recurring use case in utilities that output line-based output, e.g. the flux-resource list, status and drain commands. However, each utility implements the deduplication of lines in their own way. Attempt to add a reusable class that automates deduplication of similar lines based on formatted output. Given an output formatter and a function used to combine similar items, the Deduplicator class simply combines items where the supplied formatter (minus any excluded fields) renders the exact same line. The class can then be iterated like a list to output the combined items.
Problem: Functions for accessing the scheduler's view of resources are present in the Python API, but no such functionality exists for obtaining the resource module view of resource, including drained and excluded ranks. Add a new function flux.resource.resource_status() which returns a ResourceStatus object containining a combination of information from the resource.status and sched.resource-status RPCs. The resulting object can be used to get idsets of ranks in various states, e.g. online, offline, excluded, allocated, drained, draining, along with any drained rank extra information such as timestamp and reason.
Problem: The `flux resource status` subcommand is awkward to use in many situations, since it conflates resource "state" (i.e. available, excluded, drained), with online/offline status. This results in nodes appearing in multiple lines, which can be confusing, and erroneous output when attempting to combine similar formatted lines in the output. There are also other design issues, such as the use of the floating point drain timestamp when combining similar drained nodes in output, which means nodes must be drained at the same millisecond to be combined. Rewrite `flux resource status` using the newly added interfaces in flux.resource.status, and the new flux.util.Deduplicator class. High level changes include: * Add a new field for online/offline status, instead of treating the online status as a "state". * Rename the state field header to STATE, and add a new status field which returns one of "online" or "offline". * A node/rank may now only appear in one of the displayed output lines. * Lines are combined based on the current output format using the Deduplicator class. * The "all" state is dropped. When selected, `--state=all` just displays all states, even if they have zero resources in them. * The "online", "offline" states are dropped. When selected, `--state=online|offline` filters the output to only include resource with the specified online status. * Instead of displaying a row of offline ranks/nodes, the command now tags offline nodes in a given state (avail, exclude, draining, drained) with an asterisk if they are offline. Therefore 'avail*' means the resources would be available if they were not offline.
Problem: expected output of `flux resource status` has changed. Update tests given the changes.
Problem: The status field takes up too much horizontal space when used in flux-resource status output. Add an "up" field with a heading of UP and alternate non-ascii and ascii "y/n" values. The ascii values are available as `up.ascii`.
Codecov Report
@@ Coverage Diff @@
## master #4997 +/- ##
==========================================
- Coverage 83.58% 83.11% -0.48%
==========================================
Files 440 444 +4
Lines 74740 76257 +1517
==========================================
+ Hits 62471 63379 +908
- Misses 12269 12878 +609
|
I've updated the tests and documentation here, so this PR is probably ready for a more thorough review. Note that there is still one case where a node may appear in multiple lines of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, and I really like the new output format. Very readable to me.
Just a couple typos to fix.
@@ -36,7 +36,7 @@ class FluxResourceConfig(UtilConfig): | |||
builtin_formats["status"] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the commit message for "status: support color in UP column": s/rsources/resources/ and s/throught/through/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Fixed that and will force a push and set MWP.
Problem: It isn't clear which lines of flux resource status output correspond to online vs offline nodes. Add the 'up' field to the default `flux resource status` output.
Problem: The online/offline state of resources could be made more readily apparent through the use of color. Add a new field ({color_up}) that emits an ANSI color escape sequence based on the online/offline status of a resource status line. Also add a corresponding field to reset colors ({color_off}). Change `{up:>2}` in the default output format for flux-resource status to `{color_up}{up:>2}{color_off}` so that the online/offline indicator character is colorized in status output.
Problem: The "long" output format should probably include the UP column, which is present in the default format. Add the colorized UP column to `flux resource status -o long`.
Problem: Python doesn't handle non-ascii characters in output when the current environment doesn't support it. This causes an exception to be thrown instead of escaping the characters. Reopen stdout with the surrogateescape option to avoid these errors.
Problem: The flux-resource status subcommand has been rewritten, but the tests have not been updated or expanded. Expand the tests in t2351-resource-status-input.t to include coverage of many of the newer features in the flux-resource status command.
Problem: The description of subcommands in the flux-resource(1) man page is compact and therefore difficult to read. Split descriptions of commands in flux-resource(1) into multiple lines so it is easier to read. Give each option its own line for the most part, and move more important options first. Expand the description of the `status` command so it is clearer what the command does.
Problem: Recent changes in the `flux resource status` command are not reflected in the flux-resource(1) man page. Update flux-resource(1) to describe the new version of the status subcommand.
This PR is a WIP. Still need to write a few more tests and documentation. However, since this makes some fundamental changes to the
flux resource status
command, it would be good to get feedback before finishing up those items.This is pretty much a full rewrite of
flux resource status
. It is broken up into a few parts, but the final rewrite is one big change, I wasn't sure how else to accomplish that.flux.util.Deduplicator
class, which combines items that would format to the same string given an output formatter, a list of fields that should be excluded from duplicate detection, and a method to combine similar items.flux.resource.status
flux resource status
is rewritten using these new pieces.The major changes in
flux resource status
include:offline
andonline
from "states" to a new field "status". (results can still be "filtered" by e.g.--states=online
){state}
header from STATUS to STATEall
is no longer a "state". Instead--states=all
displays all states even if they have zero resourcesexclude
,drained
,draining
oravail
(i.e. not excluded or drained/draining).{state}
output field appends an asterisk*
to the state to indicate the rank is offline.{statex}
is available to get the state without the*
flux resource drain
does not collate down to a single line due to timing #4985 among other things(Note this all applies to
flux resource drain
(no args) as well since that command is just a special case offlux resource status
)Fixes #4985
Fixes #4910