🚀 Feature
Add a method, similar to `to_torchscript` in `lightning.py`, that converts a model to Torch-TensorRT in order to increase inference performance.
Motivation
Increase performance during inference.
Proposal
```python
from typing import Any, Optional, Set, Union

import torch
import torch_tensorrt
from torch.jit import ScriptModule


@torch.no_grad()
def to_torch_tensorrt(
    self,
    example_inputs: Optional[Any] = None,
    enabled_precisions: Optional[Set[Union[torch.dtype, torch_tensorrt.dtype]]] = None,
    **kwargs: Any,
) -> ScriptModule:
    mode = self.training
    # if no example inputs are provided, try to see if the model has `example_input_array` set
    if example_inputs is None:
        if self.example_input_array is None:
            raise ValueError(
                "`to_torch_tensorrt` requires either `example_inputs`"
                " or `model.example_input_array` to be defined."
            )
        example_inputs = self.example_input_array
    # automatically send the example inputs to the right device
    example_inputs = self._apply_batch_transfer_handler(example_inputs)
    # `torch_tensorrt.compile` expects a list of inputs
    if not isinstance(example_inputs, (list, tuple)):
        example_inputs = [example_inputs]
    trt_module = torch_tensorrt.compile(
        self.eval(),
        inputs=example_inputs,
        # e.g. {torch.half} to run with FP16
        enabled_precisions=enabled_precisions or {torch.float32},
        **kwargs,
    )
    self.train(mode)
    return trt_module
```
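For reference, a minimal usage sketch, assuming the method lands on `LightningModule` as proposed; `TinyModel` and the shapes below are hypothetical, and a CUDA device is required since TensorRT runs on GPU:

```python
import torch
from torch import nn
import pytorch_lightning as pl


class TinyModel(pl.LightningModule):  # hypothetical model for illustration
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(32, 2)
        # used by `to_torch_tensorrt` when `example_inputs` is omitted
        self.example_input_array = torch.randn(4, 32)

    def forward(self, x):
        return self.layer(x)


model = TinyModel().cuda()
# compile to a TensorRT-backed module; {torch.half} requests FP16 kernels
trt_model = model.to_torch_tensorrt(enabled_precisions={torch.half})
out = trt_model(torch.randn(4, 32).cuda())
```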
Additional context
A possible problem is the dependencies: Torch-TensorRT depends on CUDA, cuDNN and TensorRT (see https://nvidia.github.io/Torch-TensorRT/v1.0.0/tutorials/installation.html), and some of these dependencies, I think, work only on Linux. One mitigation is to keep the import optional, as in the sketch below.
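A minimal sketch of guarding the optional dependency, following the pattern Lightning uses for other optional integrations; the `_TORCH_TENSORRT_AVAILABLE` flag name is hypothetical:

```python
import importlib.util

# hypothetical availability flag, set once at import time
_TORCH_TENSORRT_AVAILABLE = importlib.util.find_spec("torch_tensorrt") is not None


def to_torch_tensorrt(self, *args, **kwargs):
    if not _TORCH_TENSORRT_AVAILABLE:
        raise ModuleNotFoundError(
            "`to_torch_tensorrt` requires `torch_tensorrt` to be installed."
            " See https://nvidia.github.io/Torch-TensorRT/v1.0.0/tutorials/installation.html"
        )
    import torch_tensorrt  # imported lazily so the package stays an optional dependency
    ...
```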
cc @Borda @carmocca @awaelchli @ninginthecloud @daniellepintz @rohitgr7