## Step 3: Excursion into Nsight Systems Plugins

In this notebook, we will learn about [Nsight Systems plugins](https://docs.nvidia.com/nsight-systems/UserGuide/index.html#nsight-systems-plugins-preview) and how they can be used to write your own data collectors.


## 3.1 Enable Plugins

Recent versions of Nsight Systems can run additional executables, called plugins, alongside a profiling session via the `--enable` flag. The following command lists the available prebuilt collector plugins.

In [1]:
!nsys profile --enable=help

Available plugins:
	doca: Collects performance metrics from the NVIDIA BlueField Data Path Accelerator (DPA) processor.
	network_interface: Collects network adapter metrics from /sys/class/net/
	nvml_metrics: Collects power and temperature metrics using the NVIDIA Management Library (NVML) API
	storage_metrics: Collect traffic quantity, throughput and operation counters of mounted remote volumes.
	efa_metrics: Collects AWS EFA Infiniband and Ethernet metrics


To enable multiple plugins in the same profiling run, the `--enable` flag can be used multiple times.

The plugins use NVTX annotations to pass data to Nsight Systems (or any other NVTX handler).

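The `network_interface` plugin is open source. Its code ships in the samples directory of the Nsight Systems installation, for example */opt/nvidia/nsight-systems/2025.1.1/target-linux-x64/samples/NetworkPlugin.cpp* (the version directory depends on the installed release).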
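To get a feeling for the kind of data such a collector reads, here is a minimal Python sketch that polls the byte counters Linux exposes under `/sys/class/net/`. It is not the plugin's implementation (that lives in `NetworkPlugin.cpp`); the function name `read_interface_counters` is made up for this illustration, and only the `rx_bytes` and `tx_bytes` files are read.

```python
from pathlib import Path


def read_interface_counters(root="/sys/class/net"):
    """Return {interface: {"rx_bytes": int, "tx_bytes": int}} read from sysfs."""
    counters = {}
    root_path = Path(root)
    if not root_path.is_dir():
        return counters  # not on Linux, or sysfs is not mounted
    for iface in sorted(root_path.iterdir()):
        stats = iface / "statistics"
        try:
            counters[iface.name] = {
                name: int((stats / name).read_text().strip())
                for name in ("rx_bytes", "tx_bytes")
            }
        except (OSError, ValueError):
            continue  # skip entries without readable statistics files
    return counters


if __name__ == "__main__":
    # Hypothetical usage: print the current counters once.
    for name, values in read_interface_counters().items():
        print(name, values)
```

A collector would call such a function periodically and forward the differences between two samples, which is roughly what the bar charts in the timeline represent.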

## 3.2 Writing a Custom Plugin

Let's take a look at a simple plugin which uses NVTX to pass counter data to Nsight Systems.
Open the file [mynvml.c](nsys/plugins/mynvml.c) in another tab.
It's a simple C program that uses the NVML API to sample several counters, similar to the prebuilt NVML plugin, but collecting different counters.
NVTX annotations are used to describe the `counters_t` data structure and expose it to Nsight Systems.
We're using the NVTX domain "MyNvml" to avoid collisions with other NVTX instrumented code.
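Describing a data structure to a tool essentially means listing each field with its name, offset, and size. The following Python sketch mirrors that idea with `ctypes`. The field names are hypothetical and do not reproduce the actual `counters_t` from `mynvml.c`, and `describe_layout` is a helper invented for this illustration, not part of NVTX.

```python
import ctypes


class Counters(ctypes.Structure):
    # Hypothetical fields; the real counters_t is defined in mynvml.c.
    _fields_ = [
        ("gpu_util", ctypes.c_uint32),
        ("mem_util", ctypes.c_uint32),
        ("pcie_tx_bytes", ctypes.c_uint64),
    ]


def describe_layout(struct_type):
    """List (field name, byte offset, byte size) for every field of a struct."""
    layout = []
    for name, _ctype in struct_type._fields_:
        descriptor = getattr(struct_type, name)
        layout.append((name, descriptor.offset, descriptor.size))
    return layout


if __name__ == "__main__":
    for entry in describe_layout(Counters):
        print(entry)
    print("total size:", ctypes.sizeof(Counters))
```

With such a description registered once, each sample only needs to carry the raw bytes of the structure; the consumer can decode the individual counters from the layout.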

The following code box compiles our custom plugin code and copies the executable and a [yaml file](nsys/plugins/mynvml/nsys-plugin.yaml) into the Nsight Systems plugins folder.

In [17]:
!gcc /home/sanjay42/sanjay/cuda/AcceleratedPythonProgramming/nsys/plugins/mynvml.c \
    -I/dli/task/NVTX/c/include -I/usr/local/cuda-12.9/include \
    -L/usr/local/cuda-12.9/lib64/stubs -lnvidia-ml -ldl \
    -o /home/sanjay42/sanjay/cuda/AcceleratedPythonProgramming/nsys/plugins/mynvml/mynvml_plugin

# Copy the plugin binary and the yaml file into the Nsight Systems plugins directory.
!sudo cp -r /home/sanjay42/sanjay/cuda/AcceleratedPythonProgramming/nsys/plugins/mynvml /opt/nvidia/nsight-systems/2025.1.3/target-linux-x64/plugins

!nsys profile --enable=help

[01m[K/home/sanjay42/sanjay/cuda/AcceleratedPythonProgramming/nsys/plugins/mynvml.c:6:10:[m[K [01;31m[Kfatal error: [m[Knvtx3/nvToolsExtCounters.h: No such file or directory
    6 | #include [01;31m[K<nvtx3/nvToolsExtCounters.h>[m[K
      |          [01;31m[K^~~~~~~~~~~~~~~~~~~~~~~~~~~~[m[K
compilation terminated.
Invalid plugin configuration: Executable path does not exist: /opt/nvidia/nsight-systems/2025.1.3/target-linux-x64/plugins/mynvml/mynvml_plugin


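Once the build succeeds, our plugin shows up in the list printed by `nsys profile --enable=help`. In the output above the build failed: the compiler could not find `nvtx3/nvToolsExtCounters.h`, so no executable was copied and Nsight Systems reported an invalid plugin configuration. In that case, point the first `-I` option at an NVTX checkout whose `c/include` directory contains this header, then rerun the cell.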

## 3.3 Profile with Plugins Enabled

Let's enable our custom plugin and the prebuilt plugins _nvml_metrics_ and _network_interface_ for a profiling run of the video segmentation pipeline.

In [None]:
!nsys profile --trace cuda,nvtx,nvvideo \
--output reports/optimized_cvcuda_plugins \
--force-overwrite=true \
--enable=MyNvml \
--enable=nvml_metrics \
--enable="network_interface,--device=.*" \
python video_segmentation/main_nvtx-cvcuda-nvcodec.py

With the last (third) enable flag in the command above you can see how arguments can be passed to a plugin. The `--device=.*` flag tells the plugin to collect data for all devices (default is physical devices only).
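When profiling runs are launched from a script, the repeated `--enable` flags can be assembled programmatically. The sketch below builds the command line used above; `enable_flag` and `build_profile_command` are hypothetical helper names, and appending more than one comma-separated plugin argument is an assumption, since the command above only shows a single argument.

```python
import shlex


def enable_flag(plugin, *plugin_args):
    """Build one --enable flag; plugin arguments follow the name, comma-separated."""
    return "--enable=" + ",".join((plugin,) + plugin_args)


def build_profile_command(output, plugins, app_command):
    """plugins is a list of (name, [args]) pairs; app_command is a list of strings."""
    command = ["nsys", "profile", "--output", output, "--force-overwrite=true"]
    for name, args in plugins:
        command.append(enable_flag(name, *args))
    return command + list(app_command)


if __name__ == "__main__":
    command = build_profile_command(
        "reports/optimized_cvcuda_plugins",
        [
            ("MyNvml", []),
            ("nvml_metrics", []),
            ("network_interface", ["--device=.*"]),
        ],
        ["python", "video_segmentation/main_nvtx-cvcuda-nvcodec.py"],
    )
    # shlex.join quotes arguments such as ".*" that the shell would otherwise expand.
    print(shlex.join(command))
```

Passing the command as a list (for example to `subprocess.run`) avoids shell quoting issues with patterns such as `.*` altogether.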

To check that the plugins collected additional metrics, we open the generated report file *reports/optimized_cvcuda_plugins.nsys-rep* in the Nsight Systems GUI.

Screenshot of the Nsight Systems GUI:

<center><img src=images/step3/nsys_timeline_plugins.png></center>

There are new expandable rows at the top of the timeline. After expanding them, you should see green bar charts for
* power usage and temperature of the installed GPUs, collected by the NVML plugin
* transmitted and received bytes for the available network interfaces, collected by the network interface plugin
* additional GPU utilization metrics, collected by our custom plugin, in the row named "MyNVML".

## 3.4 Improve the Custom Plugin with NVTX Semantics

Our first custom plugin was pretty basic. Let's improve it by adding NVTX semantics to facilitate the analysis.

Execute the following code box to see how NVTX counter semantics can be used to set the counter unit and minimum and maximum values.

In [None]:
!diff -d -U1 --color=always nsys/plugins/mynvml.c nsys/plugins/mynvml_units.c

Execute the following code box to compile the modified code and copy the executable and the yaml file into the Nsight Systems plugins folder.

In [None]:
!gcc nsys/plugins/mynvml_units.c \
    -I/dli/task/NVTX/c/include -I/usr/local/cuda/include \
    -L/usr/local/cuda/lib64/stubs -lnvidia-ml -ldl \
    -o nsys/plugins/mynvml/mynvml_plugin

# Copy the plugin binary and the yaml file into the Nsight Systems plugins directory.
!cp -r nsys/plugins/mynvml /opt/nvidia/nsight-systems/2025.1.1/target-linux-x64/plugins/

!echo done

<div class="alert alert-block alert-info">
<b>Exercise:</b> Profile the improved plugin and see the changes in the rows under <i>MyNVML</i> in the Nsight Systems timeline.
</div>

In [None]:
!nsys profile \
--trace cuda,nvtx,nvvideo \
--output reports/optimized_cvcuda_plugins_mynvml_semantics --force-overwrite=true \
--enable=MyNvml \
python video_segmentation/main_nvtx-cvcuda-nvcodec.py

The following screenshot shows the Nsight Systems timeline for the improved custom plugin.

<center><img src=images/step3/nsys_timeline_plugin_mynvml_semantics.png></center>

Specifying the units allows Nsight Systems to apply appropriate unit prefixes such as *k*, *M*, *G*, etc.
Specifying the limits for the utilization counters provides a better visual assessment.
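The effect of both pieces of metadata can be illustrated with a small Python sketch. It does not reflect how Nsight Systems renders counters internally; `format_with_prefix` and `fraction_of_range` are names made up for this example, and decimal (power of 1000) prefixes are assumed.

```python
SI_PREFIXES = ["", "k", "M", "G", "T"]


def format_with_prefix(value, unit):
    """Scale value by powers of 1000 and attach the matching SI prefix."""
    magnitude = float(value)
    index = 0
    while abs(magnitude) >= 1000.0 and index < len(SI_PREFIXES) - 1:
        magnitude /= 1000.0
        index += 1
    return f"{magnitude:g} {SI_PREFIXES[index]}{unit}"


def fraction_of_range(value, minimum, maximum):
    """Map value into [0, 1] relative to the declared limits, clamping outliers."""
    if maximum <= minimum:
        raise ValueError("maximum must be greater than minimum")
    fraction = (value - minimum) / (maximum - minimum)
    return min(1.0, max(0.0, fraction))


if __name__ == "__main__":
    print(format_with_prefix(2_500_000, "Hz"))
    print(fraction_of_range(35, 0, 100))
```

Without a unit, a tool can only show the raw number; without limits, it has to scale each bar chart to the values observed so far, which makes a utilization of 35% look like a full bar if nothing higher was sampled.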

You can also specify detailed hierarchies for your NVTX collection. See NVTX scopes in the [nvToolsExtPayload.h](/dli/task/NVTX/c/include/nvtx3/nvToolsExtPayload.h) header.


<div class="alert alert-block alert-success">
    <b>Summary</b>
    <p>In this step, we learned how to use Nsight Systems plugins to collect additional data.</p>
    <p>We built a custom plugin that uses extended NVTX annotations to expose counter data.</p>
    <p>NVTX extended payloads can be used to describe data structures and attach additional data to NVTX events.</p>
</div>

Further information about plugins can be found in the Nsight Systems documentation: https://docs.nvidia.com/nsight-systems/UserGuide/index.html#nsight-systems-plugins-preview.

Please click [here](step4.ipynb) to move to the next step.