Closed
Labels: bug (Something isn't working)
Description
Problem
When using the @nsight.analyze.kernel decorator on multiple functions with the same name in a single Python script, profiling results can be attributed to the wrong function. This occurs because the ncu subprocess identifies functions by name alone, which is insufficient when multiple functions share a name.
Current Behavior
In the current implementation, the function name is the primary identifier used inside the ncu subprocess to decide whether a function should be profiled. When a decorated function is encountered during execution, the following check runs:
nsight-python/nsight/collection/ncu.py, lines 243 to 249 in 64f60bf:

    else:
        # If NSPY_NCU_PROFILE is set, just run the function normally
        name = os.environ["NSPY_NCU_PROFILE"]
        # If this is not the function we are profiling, stop
        if func.__name__ != name:
            return None

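The ambiguity of this name-only comparison can be reproduced without a GPU or the library. The sketch below is illustrative only: `make_kernel` and `should_profile` are hypothetical helpers, not part of nsight-python, and only the `NSPY_NCU_PROFILE` variable and the `__name__` comparison are taken from the snippet above.

```python
import os


def make_kernel(label):
    # Hypothetical factory: returns distinct functions sharing one __name__.
    def benchmark_test():
        return label

    return benchmark_test


first = make_kernel("first")
second = make_kernel("second")


def should_profile(func):
    # Hypothetical helper mirroring the name-only comparison quoted above.
    return func.__name__ == os.environ["NSPY_NCU_PROFILE"]


# Ask for "first" to be profiled, identified by its name only.
os.environ["NSPY_NCU_PROFILE"] = first.__name__

# Both functions pass the check, although only one was requested.
matches = [f() for f in (first, second) if should_profile(f)]
print(matches)  # ['first', 'second']
```

Because both objects report the name `benchmark_test`, a check based on `__name__` cannot tell them apart.
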
This leads to the following problem:

    import torch
    import nsight

    @nsight.analyze.kernel(metric="smsp__sass_inst_executed_op_shared_ld.sum")
    def benchmark_test(n: int) -> None:
        a = torch.randn(n, n, device="cuda")
        b = torch.randn(n, n, device="cuda")
        with nsight.annotate("matmul"):
            _ = a @ b

    # When calling:
    res1 = benchmark_test(1024)  # Profiles correctly
    res2 = benchmark_test(2048)  # Returns results from the first call or the wrong function
    print(res1.to_dataframe())
    print("\n")
    print(res2.to_dataframe())

The outputs of the two dataframes are the same:

       Annotation     n   AvgValue  StdDev   MinValue   MaxValue  NumRuns  ...  ComputeClock  MemoryClock  CI95_Lower  CI95_Upper  RelativeStdDevPct  StableMeasurement    Geomean
    0      matmul  1024  2131968.0     NaN  2131968.0  2131968.0        1  ...       1980000      2619000         NaN         NaN                NaN              False  2131968.0
    [1 rows x 19 columns]

       Annotation     n   AvgValue  StdDev   MinValue   MaxValue  NumRuns  ...  ComputeClock  MemoryClock  CI95_Lower  CI95_Upper  RelativeStdDevPct  StableMeasurement    Geomean
    0      matmul  2048  2131968.0     NaN  2131968.0  2131968.0        1  ...       1980000      2619000         NaN         NaN                NaN              False  2131968.0
    [1 rows x 19 columns]
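
One possible direction, shown here only as a sketch and not as the fix adopted by the project, is to compare a more specific key than `__name__`. The hypothetical `profile_key` helper below combines the module, qualified name, and first source line of a function. It assumes an undecorated function object is available, and it only separates same-named functions defined at different places. It would not distinguish repeated calls of one function, as in the example above.

```python
def profile_key(func):
    # Hypothetical identifier: module, qualified name, and definition line.
    code = func.__code__
    return f"{func.__module__}:{func.__qualname__}:{code.co_firstlineno}"


def benchmark_test():
    return "a"


first = benchmark_test


def benchmark_test():  # same name, defined a second time
    return "b"


second = benchmark_test

# The names collide, but the keys differ because the definition lines differ.
print(first.__name__ == second.__name__)            # True
print(profile_key(first) == profile_key(second))    # False
```
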