[rank1]:[2024-05-05 09:59:51,519] [0/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[rank0]:[2024-05-05 09:59:51,523] [0/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[rank1]:[2024-05-05 09:59:51,523] [0/0_1] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[rank0]:[2024-05-05 09:59:51,527] [0/0_1] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[rank2]:[2024-05-05 09:59:51,535] [0/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[rank3]:[2024-05-05 09:59:51,538] [0/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[rank2]:[2024-05-05 09:59:51,539] [0/0_1] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[rank3]:[2024-05-05 09:59:51,542] [0/0_1] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
Traceback (most recent call last):
File "/var/folders/h8/1_7bqspx4mj27hqz4qr1gp_m0000gn/T/ipykernel_37353/4058334054.py", line 165, in train_donut
File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 543, in fit
call._call_and_handle_interrupt(
File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py", line 44, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 579, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 986, in _run
results = self._run_stage()
File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 1032, in _run_stage
self.fit_loop.run()
File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/loops/fit_loop.py", line 205, in run
self.advance()
File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/loops/fit_loop.py", line 363, in advance
self.epoch_loop.run(self._data_fetcher)
File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/loops/training_epoch_loop.py", line 138, in run
self.advance(data_fetcher)
File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/loops/training_epoch_loop.py", line 242, in advance
batch_output = self.automatic_optimization.run(trainer.optimizers[0], batch_idx, kwargs)
File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/loops/optimization/automatic.py", line 184, in run
closure()
File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/loops/optimization/automatic.py", line 144, in __call__
self._result = self.closure(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/loops/optimization/automatic.py", line 129, in closure
step_output = self._step_fn()
File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/loops/optimization/automatic.py", line 319, in _training_step
training_step_output = call._call_strategy_hook(trainer, "training_step", *kwargs.values())
File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py", line 309, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/strategies/strategy.py", line 390, in training_step
return self._forward_redirection(self.model, self.lightning_module, "training_step", *args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/strategies/strategy.py", line 642, in __call__
wrapper_output = wrapper_module(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 849, in forward
output = self._fsdp_wrapped_module(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/strategies/strategy.py", line 635, in wrapped_forward
out = method(*_args, **_kwargs)
File "/var/folders/h8/1_7bqspx4mj27hqz4qr1gp_m0000gn/T/ipykernel_37353/1410972043.py", line 86, in training_step
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1561, in _call_impl
result = forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 489, in _fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1550, in _call_impl
args_result = hook(self, args)
File "/usr/local/lib/python3.10/site-packages/pytorch_lightning/profilers/pytorch.py", line 72, in _start_recording_forward
record.__enter__()
TypeError: nullcontext.__enter__() missing 1 required positional argument: 'self'
[rank2]:[W record_function.cpp:499] Exception in RecordFunction callback: state_ptr INTERNAL ASSERT FAILED at "../torch/csrc/profiler/standalone/nvtx_observer.cpp":115, please report a bug to PyTorch. Expected profiler state set
Exception raised from updateOutputTensorTracker at ../torch/csrc/profiler/standalone/nvtx_observer.cpp:115 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7fd0d5e76d87 in /usr/local/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x64 (0x7fd0d5e2775f in /usr/local/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #2: c10::detail::torchInternalAssertFail(char const*, char const*, unsigned int, char const*, char const*) + 0x43 (0x7fd0d5e74873 in /usr/local/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #3: <unknown function> + 0x56c3f26 (0x7fd0be294f26 in /usr/local/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #4: at::RecordFunction::end() + 0x51 (0x7fd0ba5bf411 in /usr/local/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #5: at::RecordFunction::~RecordFunction() + 0x22 (0x7fd0ba5bf462 in /usr/local/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #6: <unknown function> + 0x4ee58a8 (0x7fd0bdab68a8 in /usr/local/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #7: <unknown function> + 0x7a067c (0x7fd0d672267c in /usr/local/lib/python3.10/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xa480b5 (0x7fd0d69ca0b5 in /usr/local/lib/python3.10/site-packages/torch/lib/libtorch_python.so)
frame #9: <unknown function> + 0x4117ab (0x7fd0d63937ab in /usr/local/lib/python3.10/site-packages/torch/lib/libtorch_python.so)
frame #10: <unknown function> + 0x412731 (0x7fd0d6394731 in /usr/local/lib/python3.10/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #22: __libc_start_main + 0xea (0x7fd18ce5ed0a in /lib/x86_64-linux-gnu/libc.so.6)
frame #23: _start + 0x2a (0x55e3c20bf07a in /usr/local/bin/python)
, for the range [pl][module]torch._dynamo.eval_frame.OptimizedModule: model
Bug description
🐛 Bug
I am trying to use `PyTorchProfiler` and write its output to a TensorBoard folder on S3, and I get the exception shown above.
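For anyone triaging: the `TypeError` at the end of the traceback has the shape Python produces when `__enter__` is called on the `contextlib.nullcontext` class itself rather than on an instance. The sketch below only reproduces that message in isolation. It is an assumption, not something the logs confirm, that the `record` object in `_start_recording_forward` ended up as the bare class after Dynamo ignored `record_function`. The helper names `enter_on_class` and `enter_on_instance` are hypothetical.

```python
import contextlib


def enter_on_class():
    """Call __enter__ on the nullcontext class (no instance) and return the error text."""
    record = contextlib.nullcontext  # the class itself, not an instance
    try:
        record.__enter__()  # no 'self' is bound, so this raises TypeError
    except TypeError as exc:
        return str(exc)
    return None


def enter_on_instance():
    """Call __enter__ on a nullcontext instance, which succeeds."""
    record = contextlib.nullcontext()  # a proper instance
    return record.__enter__()  # returns enter_result, which defaults to None


message = enter_on_class()
print(message)
```

If the assumption holds, the failure is unrelated to the S3 output path and would also show up with a local TensorBoard directory, which may be worth checking when narrowing down a reproduction.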
What version are you seeing the problem on?
v2.2
How to reproduce the bug
Error messages and logs
Environment
Current environment
More info
No response