Skip to content

nvme top : monitoring dashboard#3350

Open
shroffni wants to merge 7 commits into
linux-nvme:masterfrom
shroffni:nvme-top
Open

nvme top : monitoring dashboard#3350
shroffni wants to merge 7 commits into
linux-nvme:masterfrom
shroffni:nvme-top

Conversation

@shroffni
Copy link
Copy Markdown
Contributor

Add a new "nvme top" CLI command that provides an interactive, top-like dashboard for real-time monitoring of NVMe devices and paths.

shroffni added 7 commits May 12, 2026 15:31
The table API automatically adjusts column width based on the width of
the value being printed. While table_print_XXX() already supports
unsigned, unsigned long, and long data types, the corresponding helper
table_get_value_width() does not account for these types.

Add support for unsigned, unsigned long, and long in table_get_value_
width() so that column width calculation is consistent with the
supported print helpers.

This will be used by the nvme top dashboard, where several statistics
are represented using these data types.

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
table_print_centered() open-codes the logic to determine the width of a
table value, even though a helper already exists for this purpose.

Replace the open-coded width calculation with a call to
table_get_value_width() to avoid duplication and keep the behavior
consistent across the table helpers. Also, ensure that
table_get_value_width() returns -1 on failure so the caller could
then handle error as needed.

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
The table_print_XXX() APIs do not currently support printing values of
type float or double.

Add support for float and double so that these data types can be used
with the table printing helpers. This will be later used for printing
nvme-top stat.

While at it, switch error reporting to nvme_show_error() for
consistency with the rest of the code.

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
The table APIs currently print output only to stdout. However, callers
may need to direct output to other destinations such as stderr, a file,
or an in-memory buffer.

Add support for passing a FILE * stream so callers can choose where the
table output is written.

This will be used by the nvme top dashboard to control how statistics
are displayed.

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Add a sigaction handler for SIGWINCH so that nvme-top can detect
terminal window size changes.

This allows the dashboard layout to be adjusted and redrawn when the
terminal is resized. To avoid undefined behavior and async-signal-safety
the data type for nvme_sigwinch_received is defined as volatile
sig_atomic_t.

While we are at it, also update the data type of nvme_sigint_received
to volatile sig_atomic_t.

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Add a generic dashboard framework to support interactive, top-like
views. The framework renders data within a fixed-size window backed by
a larger buffer and provides event-driven APIs for updating the display.

Supported events:

- key input (ESC, Enter, 'q')
- escape sequences (up/down arrows)
- SIGWINCH for terminal resize handling
- periodic refresh via timeout
- kobject uevents (via netlink) to detect NVMe topology changes
  (subsystem, controller, namespace)
- reverse video highlighting for rows

Exported APIs:

- FILE *dashboard_init(struct dashboard_ctx **db_ctx, int interval)
  Initialize the dashboard context. Returns a FILE * stream used by the
  caller to write data for rendering. The returned context is used by
  all other dashboard APIs. The @interval is specified in seconds which
  represents the dashboard refresh interval.

- void dashboard_exit(struct dashboard_ctx *db_ctx)
  Tear down the dashboard and free all resources.

- int dashboard_draw_frame(struct dashboard_ctx *db_ctx, int scroll)
  Render the current frame. If @scroll is non-zero, adjust the view
  based on scrolling (up/down); otherwise render a new buffer.

- enum event_type dashboard_wait_for_event(struct dashboard_ctx *db_ctx)
  Wait for events such as key input, timeout, uevent, or SIGWINCH.
  Callers are expected to implement an event loop, and then based on
  returned event type take the action. For instance, if the returned
  event is timeout then caller shall update/refill the data store buffer
  and invoke dashboard_draw_frame() so that new/updated data could be
  rendered on dashboard . If the returned event type is up/down arrow
  keys then caller shall adjust the data start index in data store
  buffer and invoke dashboard_draw_frame() for rendering data. If the
  returned event type is key press 'ESC' or 'q' or kobject uevent then
  user shall take action as appropriate.

Reverse video helpers:

dashboard_{set,reset}_header_row_reverse()
dashboard_{set,reset}_footer_row_reverse()
dashboard_{set,reset}_data_row_reverse()

Additional getters/setters:

dashboard_get_interval()
dashboard_get_header_rows()
dashboard_set_header_rows()
dashboard_get_footer_rows()
dashboard_set_footer_rows()
dashboard_get_data_rows()
dashboard_get_data_start()
dashboard_set_data_start()
dashboard_get_frame_data_rows()

This framework is intended for use by nvme-top and similar tools that
require dynamic, event-driven terminal dashboards.

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Add a new "nvme top" CLI command that provides an interactive,
top-like dashboard for real-time monitoring of NVMe devices and
paths.

The dashboard presents continuously updated information about
NVMe fabrics/PCIe paths and devices, similar in spirit to tools
such as top or iotop. It helps administrators observe device
health, detect path degradation, identify multipath imbalances,
and catch transient failures.

The interface consists of two views:

* Subsystem list view
  Displays all NVMe subsystems present on the system. Users can
  navigate the list using arrow keys and select a subsystem for
  detailed inspection.

* Subsystem detail view
  Shows detailed statistics for the selected subsystem. When
  native multipath is enabled, this includes namespace head
  statistics, path statistics, path health, and controller
  summary. Without multipath, it displays namespace statistics
  and controller summary.

Users can switch between views using the ESC/return key, exit with
'q', and navigate using arrow keys.

This command builds on the generic dashboard framework to provide
a flexible and extensible real-time monitoring interface.

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Comment thread nvme-builtin.h
#include "cmd.h"

COMMAND_LIST(
ENTRY("top", "nvme top", top)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, @MatiasBjorling had sorted the COMMAND_LIST according usage, that's why nvme list is on the beginning of the list. I suggest to move it down. I'll do it myself if the rest of the review is just nitpicking.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just saw in the other PR that you are planing to send the new version to the mailing list. That's fine with me too :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, @MatiasBjorling had sorted the COMMAND_LIST according usage, that's why nvme list is on the beginning of the list. I suggest to move it down. I'll do it myself if the rest of the review is just nitpicking.

Yes I'd move down "top".

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just saw in the other PR that you are planing to send the new version to the mailing list. That's fine with me too :)

Yeah the reason is as I sent the first patchset upstream, I thought it'd be better if we could close it upstream.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new nvme top command to provide an interactive, top-like TUI dashboard for monitoring NVMe subsystem/controller/path performance and reacting to topology/terminal changes (uevents + SIGWINCH).

Changes:

  • Introduces a generic dashboard backend (util/dashboard.*) for frame rendering, scrolling, key handling, SIGWINCH, and NVMe uevent-driven refresh.
  • Extends the table utility to support float/double values and printing to an arbitrary FILE * stream (needed for memstream-backed rendering).
  • Adds a stdout implementation of nvme top (nvme-print-stdout-top.c) and wires it into the CLI command list and print-ops dispatch.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
util/table.h Adds float/double value support and table_print_stream(FILE*, ...) API.
util/table.c Implements stream-based printing and float/double formatting; routes errors through nvme error printer.
util/sighdl.h Switches signal flags to volatile sig_atomic_t and adds SIGWINCH support declarations.
util/sighdl.c Implements SIGWINCH handler installation alongside SIGINT.
util/meson.build Adds util/dashboard.c to the build.
util/dashboard.h New public API for the generic dashboard (init/draw/wait/reset/exit, scrolling/highlighting).
util/dashboard.c New dashboard implementation: memstream-backed buffer, frame mapping, pselect event loop, uevent listening.
nvme.h Moves subsystem_iopolicy_filter() into a header inline for reuse.
nvme.c Adds the top subcommand and validates refresh interval/output format; installs SIGWINCH handler.
nvme-print.h Extends print_ops with .top() and declares nvme_show_top().
nvme-print.c Adds nvme_show_top() dispatch wrapper.
nvme-print-stdout.c Hooks stdout_top into stdout print ops; removes duplicated filter helper.
nvme-print-stdout-top.c New implementation of the interactive dashboard screens and stat formatting.
nvme-builtin.h Registers the new top command.
meson.build Adds nvme-print-stdout-top.c to build sources.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread nvme.h
Comment on lines +107 to +127
static inline bool subsystem_iopolicy_filter(const char *name, void *arg)
{
libnvme_subsystem_t s = arg;
const char *iopolicy = libnvme_subsystem_get_iopolicy(s);

if (!strcmp(iopolicy, "queue-depth")) {
/* exclude "Nodes" for iopolicy queue-depth */
if (!strcmp(name, "Nodes"))
return false;
} else if (!strcmp(iopolicy, "numa")) {
/* exclude "Qdepth" for iopolicy numa */
if (!strcmp(name, "Qdepth"))
return false;
} else { /* round-robin */
/* exclude "Nodes" and "Qdepth" for iopolicy round-robin */
if (!strcmp(name, "Nodes") || !strcmp(name, "Qdepth"))
return false;
}

return true;
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes will add string.h header.

Comment thread util/dashboard.c
Comment on lines +283 to +290
int dashboard_set_data_start(struct dashboard_ctx *db_ctx, int off)
{
if (off >= db_ctx->ds.data_rows)
return -EINVAL;

db_ctx->ds.data_start_idx = off;

return 0;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically that's not possible, as for empty data section start index would be anyways zero. As you could notice when user scroll the dashboard using arrow keys, we update the start index but for empty data buffer user would not be able to scroll the dashboard and so that means start index would also not change.

Above boundary check (off >= db_ctx->ds.data_rows) was necessary to ensure that start index doesn't overflow the data buffer.

Comment thread util/dashboard.c
sigemptyset(&ctx->orig_set);
sigaddset(&sigwinch_set, SIGWINCH);
/* block SIGWINCH */
sigprocmask(SIG_SETMASK, &sigwinch_set, &ctx->orig_set);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes SIG_SETMASK should be replaced with SIG_BLOCK.

Comment thread util/dashboard.c
Comment on lines +688 to +689
ds->data_rows = ds->num_rows - resrv_rows;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ds->num_rows already accounts for the header, data, and footer rows. Therefore, ds->data_rows should never become negative unless there is a bug elsewhere that incorrectly sets the number of header or footer rows. In such a case, the caller using the dashboard API is already providing inconsistent state and needs to be fixed accordingly. With incorrect header/footer row configuration, the dashboard layout would appear broken or awkward anyway.

Comment thread util/dashboard.c
Comment on lines +774 to +796
static int dashboard_uevent_fd(void)
{
int fd, ret;
struct sockaddr_nl sa;

fd = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_KOBJECT_UEVENT);
if (fd < 0) {
nvme_show_perror("uevent socket");
return fd;
}

sa.nl_family = AF_NETLINK;
sa.nl_pad = 0;
sa.nl_pid = getpid();
sa.nl_groups = 1;

ret = bind(fd, (struct sockaddr *)&sa, sizeof(struct sockaddr_nl));
if (ret < 0) {
close(fd);
nvme_show_perror("uevent bind");
return ret;
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok will update this in the code.

Comment thread util/dashboard.c
Comment on lines +610 to +625
if (!scroll) {
/* flushing stream would synchronize screen buffer */
fflush(stream);

/*
* Rewind the stream to reset the file offset position to the
* start of the buffer.
*/
rewind(stream);

/* If there's nothing to print then return early. */
if (!ds->len)
return 0;

ds->buf[ds->len] = '\0';

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if the topology shrinks, there may be some leftover trailing contents remaining in ds->buf. However, those contents are intentionally ignored. After each refresh interval, we rewind the memstream, and during the next fflush(stream) the buffer length (ds->len) is updated to reflect the current file offset. We also explicitly place a null terminator at the last valid byte of the buffer, ensuring that anything beyond that point is ignored. This guarantees that on every refresh interval we consume only the current valid contents and never access stale or leftover buffer data.

So I think the current implementation effectively handles both buffer shrink and grow cases without any issue.

@igaw
Copy link
Copy Markdown
Collaborator

igaw commented May 12, 2026

let's not ask copilot for more review. it starts to hallucinate too much. we can fix the rest in the tree. I'll wait for your post on the mailing list.

@shroffni
Copy link
Copy Markdown
Contributor Author

let's not ask copilot for more review. it starts to hallucinate too much. we can fix the rest in the tree. I'll wait for your post on the mailing list.

Sure, I will re-spin the patchset and send it upstream. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants