nvme top : monitoring dashboard#3350
Conversation
The table API automatically adjusts column width based on the width of the value being printed. While table_print_XXX() already supports unsigned, unsigned long, and long data types, the corresponding helper table_get_value_width() does not account for these types. Add support for unsigned, unsigned long, and long in table_get_value_ width() so that column width calculation is consistent with the supported print helpers. This will be used by the nvme top dashboard, where several statistics are represented using these data types. Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
table_print_centered() open-codes the logic to determine the width of a table value, even though a helper already exists for this purpose. Replace the open-coded width calculation with a call to table_get_value_width() to avoid duplication and keep the behavior consistent across the table helpers. Also, ensure that table_get_value_width() returns -1 on failure so the caller could then handle error as needed. Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
The table_print_XXX() APIs do not currently support printing values of type float or double. Add support for float and double so that these data types can be used with the table printing helpers. This will be later used for printing nvme-top stat. While at it, switch error reporting to nvme_show_error() for consistency with the rest of the code. Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
The table APIs currently print output only to stdout. However, callers may need to direct output to other destinations such as stderr, a file, or an in-memory buffer. Add support for passing a FILE * stream so callers can choose where the table output is written. This will be used by the nvme top dashboard to control how statistics are displayed. Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Add a sigaction handler for SIGWINCH so that nvme-top can detect terminal window size changes. This allows the dashboard layout to be adjusted and redrawn when the terminal is resized. To avoid undefined behavior and async-signal-safety the data type for nvme_sigwinch_received is defined as volatile sig_atomic_t. While we are at it, also update the data type of nvme_sigint_received to volatile sig_atomic_t. Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Add a generic dashboard framework to support interactive, top-like views. The framework renders data within a fixed-size window backed by a larger buffer and provides event-driven APIs for updating the display. Supported events: - key input (ESC, Enter, 'q') - escape sequences (up/down arrows) - SIGWINCH for terminal resize handling - periodic refresh via timeout - kobject uevents (via netlink) to detect NVMe topology changes (subsystem, controller, namespace) - reverse video highlighting for rows Exported APIs: - FILE *dashboard_init(struct dashboard_ctx **db_ctx, int interval) Initialize the dashboard context. Returns a FILE * stream used by the caller to write data for rendering. The returned context is used by all other dashboard APIs. The @interval is specified in seconds which represents the dashboard refresh interval. - void dashboard_exit(struct dashboard_ctx *db_ctx) Tear down the dashboard and free all resources. - int dashboard_draw_frame(struct dashboard_ctx *db_ctx, int scroll) Render the current frame. If @scroll is non-zero, adjust the view based on scrolling (up/down); otherwise render a new buffer. - enum event_type dashboard_wait_for_event(struct dashboard_ctx *db_ctx) Wait for events such as key input, timeout, uevent, or SIGWINCH. Callers are expected to implement an event loop, and then based on returned event type take the action. For instance, if the returned event is timeout then caller shall update/refill the data store buffer and invoke dashboard_draw_frame() so that new/updated data could be rendered on dashboard . If the returned event type is up/down arrow keys then caller shall adjust the data start index in data store buffer and invoke dashboard_draw_frame() for rendering data. If the returned event type is key press 'ESC' or 'q' or kobject uevent then user shall take action as appropriate. Reverse video helpers: dashboard_{set,reset}_header_row_reverse() dashboard_{set,reset}_footer_row_reverse() dashboard_{set,reset}_data_row_reverse() Additional getters/setters: dashboard_get_interval() dashboard_get_header_rows() dashboard_set_header_rows() dashboard_get_footer_rows() dashboard_set_footer_rows() dashboard_get_data_rows() dashboard_get_data_start() dashboard_set_data_start() dashboard_get_frame_data_rows() This framework is intended for use by nvme-top and similar tools that require dynamic, event-driven terminal dashboards. Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Add a new "nvme top" CLI command that provides an interactive, top-like dashboard for real-time monitoring of NVMe devices and paths. The dashboard presents continuously updated information about NVMe fabrics/PCIe paths and devices, similar in spirit to tools such as top or iotop. It helps administrators observe device health, detect path degradation, identify multipath imbalances, and catch transient failures. The interface consists of two views: * Subsystem list view Displays all NVMe subsystems present on the system. Users can navigate the list using arrow keys and select a subsystem for detailed inspection. * Subsystem detail view Shows detailed statistics for the selected subsystem. When native multipath is enabled, this includes namespace head statistics, path statistics, path health, and controller summary. Without multipath, it displays namespace statistics and controller summary. Users can switch between views using the ESC/return key, exit with 'q', and navigate using arrow keys. This command builds on the generic dashboard framework to provide a flexible and extensible real-time monitoring interface. Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
| #include "cmd.h" | ||
|
|
||
| COMMAND_LIST( | ||
| ENTRY("top", "nvme top", top) |
There was a problem hiding this comment.
IIRC, @MatiasBjorling had sorted the COMMAND_LIST according usage, that's why nvme list is on the beginning of the list. I suggest to move it down. I'll do it myself if the rest of the review is just nitpicking.
There was a problem hiding this comment.
Just saw in the other PR that you are planing to send the new version to the mailing list. That's fine with me too :)
There was a problem hiding this comment.
IIRC, @MatiasBjorling had sorted the COMMAND_LIST according usage, that's why
nvme listis on the beginning of the list. I suggest to move it down. I'll do it myself if the rest of the review is just nitpicking.
Yes I'd move down "top".
There was a problem hiding this comment.
Just saw in the other PR that you are planing to send the new version to the mailing list. That's fine with me too :)
Yeah the reason is as I sent the first patchset upstream, I thought it'd be better if we could close it upstream.
There was a problem hiding this comment.
Pull request overview
Adds a new nvme top command to provide an interactive, top-like TUI dashboard for monitoring NVMe subsystem/controller/path performance and reacting to topology/terminal changes (uevents + SIGWINCH).
Changes:
- Introduces a generic dashboard backend (
util/dashboard.*) for frame rendering, scrolling, key handling, SIGWINCH, and NVMe uevent-driven refresh. - Extends the table utility to support float/double values and printing to an arbitrary
FILE *stream (needed for memstream-backed rendering). - Adds a stdout implementation of
nvme top(nvme-print-stdout-top.c) and wires it into the CLI command list and print-ops dispatch.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| util/table.h | Adds float/double value support and table_print_stream(FILE*, ...) API. |
| util/table.c | Implements stream-based printing and float/double formatting; routes errors through nvme error printer. |
| util/sighdl.h | Switches signal flags to volatile sig_atomic_t and adds SIGWINCH support declarations. |
| util/sighdl.c | Implements SIGWINCH handler installation alongside SIGINT. |
| util/meson.build | Adds util/dashboard.c to the build. |
| util/dashboard.h | New public API for the generic dashboard (init/draw/wait/reset/exit, scrolling/highlighting). |
| util/dashboard.c | New dashboard implementation: memstream-backed buffer, frame mapping, pselect event loop, uevent listening. |
| nvme.h | Moves subsystem_iopolicy_filter() into a header inline for reuse. |
| nvme.c | Adds the top subcommand and validates refresh interval/output format; installs SIGWINCH handler. |
| nvme-print.h | Extends print_ops with .top() and declares nvme_show_top(). |
| nvme-print.c | Adds nvme_show_top() dispatch wrapper. |
| nvme-print-stdout.c | Hooks stdout_top into stdout print ops; removes duplicated filter helper. |
| nvme-print-stdout-top.c | New implementation of the interactive dashboard screens and stat formatting. |
| nvme-builtin.h | Registers the new top command. |
| meson.build | Adds nvme-print-stdout-top.c to build sources. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| static inline bool subsystem_iopolicy_filter(const char *name, void *arg) | ||
| { | ||
| libnvme_subsystem_t s = arg; | ||
| const char *iopolicy = libnvme_subsystem_get_iopolicy(s); | ||
|
|
||
| if (!strcmp(iopolicy, "queue-depth")) { | ||
| /* exclude "Nodes" for iopolicy queue-depth */ | ||
| if (!strcmp(name, "Nodes")) | ||
| return false; | ||
| } else if (!strcmp(iopolicy, "numa")) { | ||
| /* exclude "Qdepth" for iopolicy numa */ | ||
| if (!strcmp(name, "Qdepth")) | ||
| return false; | ||
| } else { /* round-robin */ | ||
| /* exclude "Nodes" and "Qdepth" for iopolicy round-robin */ | ||
| if (!strcmp(name, "Nodes") || !strcmp(name, "Qdepth")) | ||
| return false; | ||
| } | ||
|
|
||
| return true; | ||
| } |
There was a problem hiding this comment.
Yes will add string.h header.
| int dashboard_set_data_start(struct dashboard_ctx *db_ctx, int off) | ||
| { | ||
| if (off >= db_ctx->ds.data_rows) | ||
| return -EINVAL; | ||
|
|
||
| db_ctx->ds.data_start_idx = off; | ||
|
|
||
| return 0; |
There was a problem hiding this comment.
Technically that's not possible, as for empty data section start index would be anyways zero. As you could notice when user scroll the dashboard using arrow keys, we update the start index but for empty data buffer user would not be able to scroll the dashboard and so that means start index would also not change.
Above boundary check (off >= db_ctx->ds.data_rows) was necessary to ensure that start index doesn't overflow the data buffer.
| sigemptyset(&ctx->orig_set); | ||
| sigaddset(&sigwinch_set, SIGWINCH); | ||
| /* block SIGWINCH */ | ||
| sigprocmask(SIG_SETMASK, &sigwinch_set, &ctx->orig_set); |
There was a problem hiding this comment.
Yes SIG_SETMASK should be replaced with SIG_BLOCK.
| ds->data_rows = ds->num_rows - resrv_rows; | ||
|
|
There was a problem hiding this comment.
The ds->num_rows already accounts for the header, data, and footer rows. Therefore, ds->data_rows should never become negative unless there is a bug elsewhere that incorrectly sets the number of header or footer rows. In such a case, the caller using the dashboard API is already providing inconsistent state and needs to be fixed accordingly. With incorrect header/footer row configuration, the dashboard layout would appear broken or awkward anyway.
| static int dashboard_uevent_fd(void) | ||
| { | ||
| int fd, ret; | ||
| struct sockaddr_nl sa; | ||
|
|
||
| fd = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_KOBJECT_UEVENT); | ||
| if (fd < 0) { | ||
| nvme_show_perror("uevent socket"); | ||
| return fd; | ||
| } | ||
|
|
||
| sa.nl_family = AF_NETLINK; | ||
| sa.nl_pad = 0; | ||
| sa.nl_pid = getpid(); | ||
| sa.nl_groups = 1; | ||
|
|
||
| ret = bind(fd, (struct sockaddr *)&sa, sizeof(struct sockaddr_nl)); | ||
| if (ret < 0) { | ||
| close(fd); | ||
| nvme_show_perror("uevent bind"); | ||
| return ret; | ||
| } | ||
|
|
There was a problem hiding this comment.
Ok will update this in the code.
| if (!scroll) { | ||
| /* flushing stream would synchronize screen buffer */ | ||
| fflush(stream); | ||
|
|
||
| /* | ||
| * Rewind the stream to reset the file offset position to the | ||
| * start of the buffer. | ||
| */ | ||
| rewind(stream); | ||
|
|
||
| /* If there's nothing to print then return early. */ | ||
| if (!ds->len) | ||
| return 0; | ||
|
|
||
| ds->buf[ds->len] = '\0'; | ||
|
|
There was a problem hiding this comment.
Yes, if the topology shrinks, there may be some leftover trailing contents remaining in ds->buf. However, those contents are intentionally ignored. After each refresh interval, we rewind the memstream, and during the next fflush(stream) the buffer length (ds->len) is updated to reflect the current file offset. We also explicitly place a null terminator at the last valid byte of the buffer, ensuring that anything beyond that point is ignored. This guarantees that on every refresh interval we consume only the current valid contents and never access stale or leftover buffer data.
So I think the current implementation effectively handles both buffer shrink and grow cases without any issue.
|
let's not ask copilot for more review. it starts to hallucinate too much. we can fix the rest in the tree. I'll wait for your post on the mailing list. |
Sure, I will re-spin the patchset and send it upstream. Thanks! |
Add a new "nvme top" CLI command that provides an interactive, top-like dashboard for real-time monitoring of NVMe devices and paths.