# Top NVTX ranges on CPU and GPU
Users add NVTX ranges on a CPU thread to annotate the phases of their code’s algorithms. This notebook identifies the top NVTX ranges in each report by duration. Nsight Systems automatically projects an NVTX range onto the GPU by analyzing the CUDA work launched from within that range on the same CPU thread. The projection refits the range's start and end times to tightly wrap the CUDA launches, memory copies, and memsets invoked within it. The resulting duration (end - start) is then used here to identify the top NVTX ranges when projected onto the GPU.

NOTES:
* CUDA work launched on threads other than the one that opened and closed the range is not counted towards the projection, because it may belong to other NVTX ranges or may not be intended to be tracked at all.
* Any NVTX ranges that start or end outside the scope of the report being analyzed are discarded.
* Any NVTX ranges that start and end on different threads are discarded.
* Any NVTX ranges with zero duration after GPU projection are discarded.
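
The projection rule and the notes above can be sketched in a few lines of Python. This is only an illustrative model, not the Nsight Systems implementation: the `NvtxRange` and `CudaOp` classes, their field names, and the choice to treat an operation as "launched within the range" when its API call starts inside the range are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class NvtxRange:
    # Hypothetical record for one NVTX range (times in ns on the CPU timeline).
    text: str
    tid: int
    start: int
    end: int


@dataclass
class CudaOp:
    # Hypothetical record for one kernel, memory copy, or memset.
    tid: int        # CPU thread that issued the launching API call
    api_start: int  # ns, when the launching API call began on the CPU
    gpu_start: int  # ns, when the work began on the GPU
    gpu_end: int    # ns, when the work finished on the GPU


def project_onto_gpu(rng: NvtxRange, ops: List[CudaOp]) -> Optional[Tuple[int, int]]:
    """Return the (start, end) of the range refitted to its GPU work, or None."""
    # Assumption: only work launched on the range's own thread, with the API
    # call starting inside the range, counts towards the projection.
    inside = [
        op for op in ops
        if op.tid == rng.tid and rng.start <= op.api_start <= rng.end
    ]
    if not inside:
        return None
    start = min(op.gpu_start for op in inside)
    end = max(op.gpu_end for op in inside)
    # Ranges with zero duration after projection are discarded.
    if end == start:
        return None
    return start, end


step = NvtxRange(text="step", tid=1, start=100, end=200)
ops = [
    CudaOp(tid=1, api_start=110, gpu_start=150, gpu_end=180),  # counted
    CudaOp(tid=1, api_start=190, gpu_start=220, gpu_end=260),  # counted
    CudaOp(tid=2, api_start=120, gpu_start=130, gpu_end=400),  # other thread
    CudaOp(tid=1, api_start=250, gpu_start=300, gpu_end=320),  # after the range
]
projected = project_onto_gpu(step, ops)
print(projected)
```

In this toy data the projected range ends after the CPU range does, because GPU work runs asynchronously with respect to the thread that launched it.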

In [None]:
import pandas as pd
import plotly.offline as pyo

from IPython.display import display, HTML, Markdown

import nsys_pres

display(HTML("<style>.container { width:95% !important; }</style>"))
pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.float_format', '{:.1f}'.format)
pyo.init_notebook_mode()

## Top NVTX ranges per rank

The table and the bar chart below show the top N NVTX ranges on the CPU, ranked by total duration, for each report. Use the slider to adjust the value of N.
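
As a rough illustration of what "top N per rank" means, the selection can be reproduced with plain pandas. This is a sketch on made-up data and is not how `nsys_pres.display_top_n_per_rank` is implemented; the helper `top_n_per_rank`, the `toy_df` frame, and its values are hypothetical.

```python
import pandas as pd

# Made-up per-rank totals, in seconds, using the column names from this notebook.
toy_df = pd.DataFrame({
    "Rank": [0, 0, 0, 1, 1, 1],
    "Text": ["forward", "backward", "optimizer",
             "forward", "backward", "optimizer"],
    "Sum of NVTX Ranges on CPU": [3.0, 5.0, 1.0, 2.0, 6.0, 4.0],
})


def top_n_per_rank(df: pd.DataFrame, value_col: str, n: int) -> pd.DataFrame:
    """Keep the n rows with the largest value_col within each rank."""
    return (
        df.sort_values(["Rank", value_col], ascending=[True, False])
          .groupby("Rank", sort=False)
          .head(n)
          .reset_index(drop=True)
    )


top2 = top_n_per_rank(toy_df, "Sum of NVTX Ranges on CPU", 2)
print(top2)
```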

In [None]:
ranks_df = pd.read_parquet("rank_stats.parquet")
original_sum_col_name = "Sum of NVTX Ranges on CPU"
# Convert ns to s.
ranks_df[original_sum_col_name] = ranks_df[original_sum_col_name] * 1e-9

# The following two lines have been added to show report names instead of ranks.
files_df = pd.read_parquet("files.parquet")
df = pd.merge(ranks_df.reset_index(), files_df, on='Rank')

nsys_pres.display_top_n_per_rank(df, 'Text', original_sum_col_name, 'File', xaxis_title='NVTX Range', yaxis_title='Duration (s)', title='Duration of NVTX ranges on CPU')

## Top NVTX ranges per rank when projected on the GPU

The table and the bar chart below show the total duration of the top N NVTX ranges when projected onto the GPU, for each report. Use the slider to adjust the value of N.

In [None]:
projected_sum_col_name = "Sum"
# Convert ns to s.
ranks_df[projected_sum_col_name] = ranks_df[projected_sum_col_name] * 1e-9

# The following line has been added to show report names instead of ranks.
df = pd.merge(ranks_df.reset_index(), files_df, on='Rank')

nsys_pres.display_top_n_per_rank(df, 'Text', projected_sum_col_name, 'File', xaxis_title='NVTX Range', yaxis_title='Duration (s)', title='Duration of NVTX ranges when projected on GPU')

## Files
The table below maps each rank number to its original report file name. Ranks are assigned on the assumption that the file names include the rank with enough zero padding to sort correctly. Otherwise, the assigned ID may differ from the actual rank.
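
The zero-padding caveat is easy to see with a small example. The sketch below assumes that IDs follow a plain lexicographic sort of the file names, which is an assumption about the behavior described above rather than a statement of how the tool assigns them; the file names and the `assign_ids` helper are hypothetical.

```python
def assign_ids(names):
    """Assign consecutive IDs in lexicographic file-name order (assumed behavior)."""
    return {name: i for i, name in enumerate(sorted(names))}


# Hypothetical report names for ranks 0..10, without and with zero padding.
unpadded = [f"rank{i}.nsys-rep" for i in range(11)]
padded = [f"rank{i:02d}.nsys-rep" for i in range(11)]

unpadded_ids = assign_ids(unpadded)
padded_ids = assign_ids(padded)

# Without padding, "rank10" sorts before "rank2", so the IDs drift from the ranks.
print(unpadded_ids["rank10.nsys-rep"], unpadded_ids["rank2.nsys-rep"])
# With padding, every ID matches the rank embedded in the file name.
print(padded_ids["rank10.nsys-rep"], padded_ids["rank02.nsys-rep"])
```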

In [None]:
display(files_df)