Bug: Multiple Functions with Same Name Cause Incorrect Profiling Results #13

@ConvolutedDog

Problem

When the @nsight.analyze.kernel decorator is applied to multiple functions with the same name in a single Python script, profiling results can be associated with the wrong function. This occurs because the ncu subprocess identifies functions by name alone, which is ambiguous when several functions share a name.
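
The name collision itself can be reproduced without nsight. The following is a minimal sketch in plain Python; the variables `first` and `second` are hypothetical names used only for illustration. It shows that two distinct function objects defined under the same name report the same `__name__`, so a comparison on the name alone cannot tell them apart.

```python
def benchmark_test(n):
    return ("first", n)


first = benchmark_test  # keep a reference to the first definition


def benchmark_test(n):  # same name, but a different function object
    return ("second", n)


second = benchmark_test

# The names match even though the objects (and their behavior) differ.
same_name = first.__name__ == second.__name__
same_object = first is second
print(same_name, same_object)
```

Any lookup keyed only on `func.__name__` treats `first` and `second` as the same target.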

Current Behavior

In the current implementation, the ncu subprocess uses the function name as the primary identifier when deciding whether to profile a function. When it encounters a decorated function during execution, it performs the following check:

else:
    # If NSPY_NCU_PROFILE is set, just run the function normally
    name = os.environ["NSPY_NCU_PROFILE"]
    # If this is not the function we are profiling, stop
    if func.__name__ != name:
        return None

This leads to the following problem:

import nsight
import torch

@nsight.analyze.kernel(metric="smsp__sass_inst_executed_op_shared_ld.sum")
def benchmark_test(n: int) -> None:
    a = torch.randn(n, n, device="cuda")
    b = torch.randn(n, n, device="cuda")

    with nsight.annotate("matmul"):
        _ = a @ b

# When calling:
res1 = benchmark_test(1024)  # Profiles correctly
res2 = benchmark_test(2048)  # Returns results from the first call or the wrong function
print(res1.to_dataframe())
print("\n")
print(res2.to_dataframe())

Both dataframes report identical metric values (AvgValue of 2131968.0), even though n differs between the two calls:

  Annotation     n   AvgValue  StdDev   MinValue   MaxValue  NumRuns  ... ComputeClock  MemoryClock CI95_Lower CI95_Upper RelativeStdDevPct  StableMeasurement    Geomean
0     matmul  1024  2131968.0     NaN  2131968.0  2131968.0        1  ...      1980000      2619000        NaN        NaN               NaN              False  2131968.0

[1 rows x 19 columns]

  Annotation     n   AvgValue  StdDev   MinValue   MaxValue  NumRuns  ... ComputeClock  MemoryClock CI95_Lower CI95_Upper RelativeStdDevPct  StableMeasurement    Geomean
0     matmul  2048  2131968.0     NaN  2131968.0  2131968.0        1  ...      1980000      2619000        NaN        NaN               NaN              False  2131968.0

[1 rows x 19 columns]
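
One possible direction, sketched here under stated assumptions and not taken from the nsight code base, is to match on an identifier more specific than `__name__`. The helper `profile_key` and the `_call_counters` table below are hypothetical. The key combines the module, qualified name, and first source line of the function with a per-function call index, so that both same-named definitions and repeated calls yield distinct keys. The sketch assumes the parent process and the ncu subprocess execute the script identically, so both sides would derive the same keys. It also inspects `func.__code__` directly, so it would need to be applied to the undecorated function rather than to a wrapper.

```python
import itertools
from collections import defaultdict

# Hypothetical per-function call counters; not part of nsight.
_call_counters = defaultdict(itertools.count)


def profile_key(func):
    """Build an identifier that is more specific than func.__name__."""
    code = func.__code__
    base = f"{func.__module__}:{func.__qualname__}:{code.co_firstlineno}"
    call_index = next(_call_counters[base])  # 0 on first call, then 1, 2, ...
    return f"{base}#{call_index}"


def benchmark_test(n):
    return n


first_def = benchmark_test


def benchmark_test(n):  # same name, defined on a different source line
    return n * 2


second_def = benchmark_test

key_a = profile_key(first_def)   # first definition, call 0
key_b = profile_key(second_def)  # second definition, call 0
key_c = profile_key(second_def)  # second definition, call 1
print(key_a, key_b, key_c, sep="\n")
```

All three keys differ: `key_a` and `key_b` by source line, and `key_b` and `key_c` by call index.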

Labels: bug