Description
I'm using torch.profiler to profile model inference. When I first started using the profiler, I was very confused to see the overall inference time come out ~20x higher than it should be. At the time, I was manually warming up my model by running it before profiling. After looking into the source code of torch.profiler, I discovered that the large initial latency was the result of not warming up through the profiler itself: _default_schedule_fn causes torch.profiler to record immediately, without a warmup phase.
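For reference, this is roughly what the default schedule boils down to (paraphrased from torch/profiler/profiler.py; exact code may differ by version):

from torch.profiler import ProfilerAction

def _default_schedule_fn(_: int) -> ProfilerAction:
    # Record from the very first step: there is no WARMUP phase,
    # so all of the startup cost lands in the recorded timings.
    return ProfilerAction.RECORD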
This doesn't seem obvious, considering that the documentation and tutorial demonstrate code snippets that do not warm up. For example:
import torch
from torch.profiler import profile, record_function, ProfilerActivity

with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    with record_function("model_inference"):
        model(inputs)
The only mention of the need for warming up is step 8 of the tutorial, which covers scheduling across iterations. For code snippets like the one above, it's not clear how to warm up.
My hacky workarounds look like this:
with torch.profiler.profile(
    activities=[
        ProfilerActivity.CPU,
        ProfilerActivity.CUDA,
    ],
    # Force every step into WARMUP so this pass records nothing
    schedule=lambda step: torch.profiler.ProfilerAction.WARMUP,
) as prof:
    ...  # code to profile

# Then profile for real, now that everything is warm
with torch.profiler.profile(
    activities=[
        ProfilerActivity.CPU,
        ProfilerActivity.CUDA,
    ],
) as prof:
    ...  # code to profile
# Or: use the schedule helper with one warmup step and one active step
with torch.profiler.profile(
    activities=[
        ProfilerActivity.CPU,
        ProfilerActivity.CUDA,
    ],
    schedule=torch.profiler.schedule(
        wait=0,
        warmup=1,
        active=1,
    ),
) as prof:
    for _ in range(2):  # one warmup iteration + one recorded iteration
        ...  # code to profile
        prof.step()
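In each case I then inspect the results with the usual key_averages() call (the sort key here is just an example):

print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))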
Is there a better way to warm up when profiling a single code block? And could the need for warming up be made more explicit in the documentation and tutorial?
Sorry if this issue would be more appropriate in the PyTorch repository; it seemed like it would take longer to get a response there, considering the 5k+ open issues.