# Find the Bottleneck - Optimize AI Pipelines with Nsight Systems

<center><img src="images/lab_image.jpg" width=45%></center>

[Nsight Systems](https://developer.nvidia.com/nsight-systems) is part of the NVIDIA Developer Tools, a collection of tools that enable you to develop world-class software that efficiently utilizes the latest NVIDIA hardware, from large computing clusters to mobile targets.
NVIDIA offers a wide range of developer tools. The table below gives an overview and shows which applications each tool targets.

<center><img src="images/nvidia-developer-tools.svg" width=85%></center>

Nsight Systems is designed to give you a system-wide performance view of your application’s algorithms. It helps you identify the largest opportunities to optimize, and tune your application to scale efficiently across any number of CPUs and GPUs.
For further optimizations to CUDA kernels, graphs or OptiX command lists, developers should use [Nsight Compute](https://developer.nvidia.com/nsight-compute).

<center><img src="images/nsight_flow.png" width=60%></center>

The typical optimization workflow is an iterative process with three main steps:

<table style="float: left">
<colgroup>
       <col span="1" style="width: 25%;">
       <col span="1" style="width: 50%;">
    </colgroup>
<tr>
<td>

- #### Profile the application

- #### Inspect and analyze the profile to identify any bottlenecks

- #### Optimize the application to address the bottlenecks

</td>
<td>
<img src="images/Optimization_workflow.jpg" width=50%>
</td>
</tr>
</table>
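
The profile and inspect steps of this loop can be sketched from Python as follows. This is a minimal illustration, not the lab's own tooling: the script name `my_pipeline.py`, the report name `baseline`, and the helper functions are hypothetical, and the selected trace set (`cuda,nvtx,osrt`) is only one reasonable starting point. The sketch only builds and prints the `nsys` command lines, so it also runs on a machine without Nsight Systems installed.

```python
import shlex


def build_nsys_profile_cmd(app_cmd, report_name="baseline",
                           traces=("cuda", "nvtx", "osrt")):
    """Build the argument list for profiling `app_cmd` with `nsys profile`.

    `report_name` and the default trace selection are illustrative choices.
    """
    return [
        "nsys", "profile",
        "--trace=" + ",".join(traces),   # APIs to trace during the run
        "--force-overwrite=true",        # replace a report from an earlier iteration
        "--output=" + report_name,       # writes <report_name>.nsys-rep
        *app_cmd,                        # the application and its arguments
    ]


def build_nsys_stats_cmd(report_name="baseline"):
    """Build the argument list that prints summary statistics for a report."""
    return ["nsys", "stats", report_name + ".nsys-rep"]


# Step 1: profile the application (hypothetical script name).
profile_cmd = build_nsys_profile_cmd(["python", "my_pipeline.py"])
# Step 2: inspect the resulting report on the command line.
stats_cmd = build_nsys_stats_cmd()

print(shlex.join(profile_cmd))
print(shlex.join(stats_cmd))
```

To actually execute a step, pass the list to `subprocess.run(profile_cmd, check=True)`. The third step, optimizing the application, happens in your own code; afterwards you would profile again under a new report name and compare the two reports.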

## Working with JupyterLab Notebooks
You will find the following elements in the JupyterLab notebooks:

Code cells can be executed by selecting them and pressing *Shift+Enter* or using the `>` (play) icon in the toolbar. You can try it out on the code cell below.

In [1]:
!nvidia-smi

Wed May 28 17:19:46 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.51.03              Driver Version: 575.51.03      CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce RTX 4060 Ti     Off |   00000000:01:00.0 Off |                  N/A |
|  0%   41C    P8             11W /  165W |      58MiB /  16380MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

<div class="alert alert-block alert-info">
    <p>Blue blocks start an exercise.</p>
</div>

Hidden cells: click to expand them.

Here you will find the suggested solution to an exercise, or other notes and instructions.

<div class="alert alert-block alert-success">
    <p>Green blocks recap important lessons.</p>
</div>

## Table of Contents

- [Introduction of the Example: Video Segmentation Pipeline for Background Blurring](Intro.ipynb)
- [Step 1 - Getting Started with Nsight Systems](step1.ipynb)
- [Step 2 - Data Transfers between Host and GPU](step2.ipynb)
- [Step 3 - Nsight Systems Plugins / Write Your Own Data Collectors](step3.ipynb)
- [Step 4 - Multi-Report Analysis](step4.ipynb)
- [Step 5 - Multi-Node Analysis](step5.ipynb)

You can also use the _File Browser_ in the panel on the left to jump to the individual notebooks.

Click on the link below to get started. It opens the notebook in a new JupyterLab tab. You will find a similar link at the end of each notebook.

[Let's get started!](Intro.ipynb)