Integrate Torch-TensorRT in order to increase speed during inference #11438

@Actis92

🚀 Feature

Add a method like `to_torchscript` in lightning.py that allows converting a model with Torch-TensorRT in order to increase performance.

Motivation

Increase performance during inference

Proposal

    @torch.no_grad()
    def to_torch_tensorrt(
        self,
        example_inputs: Optional[Any] = None,
        enabled_precisions: Optional[Set[Union[torch.dtype, torch_tensorrt.dtype]]] = None,
        **kwargs,
    ) -> ScriptModule:
        mode = self.training

        # if no example inputs are provided, try to see if model has example_input_array set
        if example_inputs is None:
            if self.example_input_array is None:
                raise ValueError(
                    "`to_torch_tensorrt` requires either `example_inputs`"
                    " or `model.example_input_array` to be defined."
                )
            example_inputs = self.example_input_array

        # automatically send example inputs to the right device
        example_inputs = self._apply_batch_transfer_handler(example_inputs)
        # torch_tensorrt expects one entry per positional model input
        inputs = list(example_inputs) if isinstance(example_inputs, (list, tuple)) else [example_inputs]

        # default to full precision; pass e.g. {torch.half} to also build FP16 engines
        if enabled_precisions is None:
            enabled_precisions = {torch.float}

        trt_module = torch_tensorrt.compile(
            self.eval(),
            inputs=inputs,
            enabled_precisions=enabled_precisions,
            **kwargs,
        )
        self.train(mode)

        return trt_module
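
For illustration, calling it might look like this (a sketch; `MyLitModel`, the checkpoint path, and the input shape are hypothetical):

    model = MyLitModel.load_from_checkpoint("model.ckpt")
    trt_module = model.to_torch_tensorrt(
        example_inputs=torch.randn(1, 3, 224, 224),
        enabled_precisions={torch.half},  # also build FP16 engines
    )
    prediction = trt_module(torch.randn(1, 3, 224, 224).to("cuda"))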

Additional context

A possible problem could be the dependencies: Torch-TensorRT depends on CUDA, cuDNN and TensorRT, as you can see at https://nvidia.github.io/Torch-TensorRT/v1.0.0/tutorials/installation.html, and I think some of these dependencies only work on Linux.
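
If this lands, the import would presumably have to be optional so that Lightning keeps working on platforms without CUDA/TensorRT. A minimal sketch of the usual guard pattern (the `_TORCH_TENSORRT_AVAILABLE` flag name is my own, not an existing Lightning symbol):

    # at import time, record whether the optional dependency is present
    try:
        import torch_tensorrt

        _TORCH_TENSORRT_AVAILABLE = True
    except ImportError:
        _TORCH_TENSORRT_AVAILABLE = False

`to_torch_tensorrt` could then raise a `ModuleNotFoundError` with an installation hint when the flag is `False`, rather than breaking `import pytorch_lightning` itself.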

cc @Borda @carmocca @awaelchli @ninginthecloud @daniellepintz @rohitgr7

@luca-medeiros (Contributor) commented on Jun 22, 2022

Recently the PyTorch team integrated Torch-TensorRT into the PyTorch ecosystem (blog post).
Any tips on how one would implement an `export_trt` on the Trainer?

@carmocca (Contributor) commented on Aug 19, 2022

We could follow the pattern used by `to_onnx`: https://github.com/Lightning-AI/lightning/blob/0ca3b5aa1b16667cc2d006c3833f4953b5706e72/src/pytorch_lightning/core/module.py#L1798. Compared to the snippet in the linked blog post, the advantage would be that `self.example_input_array` is used automatically (if defined) and the batch transfer hooks are called to apply any transformations (if defined). This is what the top post suggests as well.
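
To make the comparison concrete, a rough sketch of both flows (the shapes are made up; `torch_tensorrt.Input` is Torch-TensorRT's input-spec class):

    # manual flow from the blog post: the user specifies inputs and precision
    trt_module = torch_tensorrt.compile(
        model.eval(),
        inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
        enabled_precisions={torch.half},
    )

    # to_onnx-style flow: self.example_input_array supplies the inputs and the
    # batch transfer hooks take care of device placement
    trt_module = model.to_torch_tensorrt(enabled_precisions={torch.half})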

