🚀 Feature
PyTorch 1.12 came with a speed-up in Transformer encoder inference through BetterTransformer. Given Unbabel's efforts to improve usability and efficiency (e.g., COMETINHO), such improvements might be a useful addition to the library.
Implementation
Implementation is relatively straightforward (blog, docs). It requires `accelerate` and `optimum`, which can be optional dependencies (`pip install unbabel-comet[bettertransformer]`). If both of these are installed (with sufficient versions) alongside PyTorch 1.13 (the minimum version that works with the Optimum API), then we can set a global constant `BETTER_TRANSFORMER_AVAILABLE = True`.
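The availability check could look something like this (a minimal sketch; the module location and the use of `packaging` for the version comparison are just suggestions):

```python
# Sketch of an optional-dependency probe, e.g. at module level in
# comet/encoders/__init__.py (location is only a suggestion).
import torch
from packaging import version

try:
    import accelerate  # noqa: F401
    import optimum.bettertransformer  # noqa: F401

    # The Optimum BetterTransformer API requires torch >= 1.13.
    BETTER_TRANSFORMER_AVAILABLE = version.parse(
        torch.__version__
    ) >= version.parse("1.13")
except ImportError:
    BETTER_TRANSFORMER_AVAILABLE = False
```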
As an example, this line:
https://github.com/Unbabel/COMET/blob/master/comet/encoders/bert.py#L38
should be followed by:
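(A sketch of what I mean; the exact snippet is illustrative, and `self.use_bettertransformer` is a placeholder flag name.)

```python
if BETTER_TRANSFORMER_AVAILABLE and self.use_bettertransformer:
    from optimum.bettertransformer import BetterTransformer

    # Swaps supported encoder layers for BetterTransformer's fused
    # fastpath kernels; intended for inference.
    self.model = BetterTransformer.transform(self.model)
```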
And that's about it.
Of course, which class should implement `use_bettertransformer` is something that has to be decided, but apart from that I think this is a feasible addition that can lead to significant speed improvements.

I can work on this if needed, but I need guidance on which class to implement `self.use_bettertransformer` in, and on whether the logic (the if-statement above) should be implemented on a per-model basis or whether it can be generalized somehow (one possible shape for a generalized helper is sketched below).
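For reference, one way to generalize it would be a small helper on the shared encoder base class that each concrete encoder calls right after loading its model. The method name below is an assumption, not existing COMET API, and it relies on the `BETTER_TRANSFORMER_AVAILABLE` constant from the probe above:

```python
class Encoder:  # stand-in for comet.encoders.base.Encoder
    def maybe_convert_to_bettertransformer(self) -> None:
        """Hypothetical helper: convert self.model in place if enabled.

        Subclasses (BERT, XLM-R, ...) would call this right after
        loading self.model with from_pretrained().
        """
        if BETTER_TRANSFORMER_AVAILABLE and getattr(self, "use_bettertransformer", False):
            from optimum.bettertransformer import BetterTransformer

            # transform() raises NotImplementedError for unsupported
            # architectures, so we could also catch that and warn
            # instead of failing hard.
            self.model = BetterTransformer.transform(self.model)
```

That would keep the per-model code down to setting the flag.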