
[Faster transformer] having a guide on how to use weights from a Hugginface transfomer model (Roberta based) with faster transformer 3.1 #56

Closed
pommedeterresautee opened this issue Mar 31, 2021 · 4 comments

@pommedeterresautee

Related to Faster transformers + Hugging face + Pytorch

Is your feature request related to a problem? Please describe.
It seems that FasterTransformer should be able to import weights from a RoBERTa-based Hugging Face model, but how to do so is not obvious.

Describe the solution you'd like
A section of the README dedicated to using weights from Hugging Face v4 (the latest version) in a FasterTransformer model.

Describe alternatives you've considered
N/A

Additional context
At some point in the project, Hugging Face v2 is used, but my attempt to load a RoBERTa-based model from Hugging Face v4 failed, even though in theory it's the same architecture. I tried renaming the layers to match those expected by BERT, but it didn't work: the outputs didn't match the ones before the transfer... There are probably other transformations to perform, but I couldn't find which ones.

def rewrite_layer_name(layer_name: str) -> str:
    # Map Hugging Face Roberta layer names onto the BERT names expected by FT.
    if "roberta." in layer_name:
        # Roberta's embeddings/encoder share BERT's structure, so only the prefix changes.
        layer_name = layer_name.replace("roberta.", "bert.")
    elif "classifier.dense." in layer_name:
        # Roberta's classification-head dense layer plays the role of BERT's pooler.
        layer_name = layer_name.replace("classifier.dense.", "bert.pooler.dense.")
    elif "classifier.out_proj." in layer_name:
        # The output projection becomes the single-layer classifier.
        layer_name = layer_name.replace("classifier.out_proj.", "classifier.")
    return layer_name
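For reference, here is a minimal, self-contained sketch of how such a rename could be applied over a whole state dict. The function and sample keys below are illustrative (a handful of made-up entries, not a full RoBERTa checkpoint), and the mapping mirrors the snippet above rather than a verified complete conversion:

```python
def rewrite_layer_name(layer_name: str) -> str:
    # Illustrative Roberta -> BERT name mapping (same rules as the snippet above).
    if "roberta." in layer_name:
        return layer_name.replace("roberta.", "bert.")
    if "classifier.dense." in layer_name:
        return layer_name.replace("classifier.dense.", "bert.pooler.dense.")
    if "classifier.out_proj." in layer_name:
        return layer_name.replace("classifier.out_proj.", "classifier.")
    return layer_name

# Placeholder state dict: in practice this would be model.state_dict() tensors.
state_dict = {
    "roberta.encoder.layer.0.attention.self.query.weight": "tensor-0",
    "classifier.dense.weight": "tensor-1",
    "classifier.out_proj.bias": "tensor-2",
}

# Rename every key; values (the weight tensors) are carried over unchanged.
renamed = {rewrite_layer_name(k): v for k, v in state_dict.items()}
```

After this pass, `renamed` would be loaded into the BERT-side model with something like `load_state_dict`; as noted above, renaming alone did not make the outputs match, so further transformations are presumably still needed.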
@pommedeterresautee pommedeterresautee changed the title [Fast transformer] having a guide on how to use weights from a Hugginface transfomer model (Roberta based) with fast transformer 3.1 [Faster transformer] having a guide on how to use weights from a Hugginface transfomer model (Roberta based) with faster transformer 3.1 Mar 31, 2021
@byshiue byshiue self-assigned this Mar 31, 2021
@byshiue byshiue transferred this issue from NVIDIA/DeepLearningExamples Apr 5, 2021
@byshiue
Collaborator

byshiue commented Apr 6, 2021

If the model architectures of BERT and RoBERTa are the same, you should be able to run RoBERTa on FT.
You can refer to this converter and check that the weights are placed correctly in https://github.com/NVIDIA/FasterTransformer/blob/main/sample/pytorch/utils/encoder.py.

You can also try to run RoBERTa with the BERT model directly, because we have shown that FT can use a BERT model directly.
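As a sketch of "checking that the weights are placed correctly", one could compare the renamed key set and tensor shapes against what the BERT-side model expects before handing them to FT. The shapes below are illustrative placeholders, except the position-embedding rows: RoBERTa is known to allocate 514 position slots (512 plus a 2-slot padding offset) where BERT allocates 512, so a pure rename leaves a shape and indexing mismatch to resolve, which may explain the differing outputs:

```python
# Key/shape sanity check between a renamed Roberta state dict and the
# BERT-side expectation. Shapes are placeholders except the position
# embeddings, where Roberta's 514 rows vs BERT's 512 is a real difference.
renamed_roberta = {
    "bert.embeddings.word_embeddings.weight": (50265, 768),
    "bert.embeddings.position_embeddings.weight": (514, 768),
    "bert.pooler.dense.weight": (768, 768),
}
bert_expected = {
    "bert.embeddings.word_embeddings.weight": (50265, 768),
    "bert.embeddings.position_embeddings.weight": (512, 768),
    "bert.pooler.dense.weight": (768, 768),
}

# Keys the BERT side expects but the converted checkpoint lacks.
missing = sorted(set(bert_expected) - set(renamed_roberta))
# Keys present on both sides whose shapes disagree.
mismatched = sorted(k for k in renamed_roberta
                    if bert_expected.get(k) not in (None, renamed_roberta[k]))
```

In a real run, the dicts would be built from `{k: tuple(v.shape) for k, v in model.state_dict().items()}` on each side; an empty `missing` and `mismatched` is a necessary (though not sufficient) condition for the transfer being correct.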

@pommedeterresautee
Author

Thanks for the link, it will be very helpful!

@byshiue
Collaborator

byshiue commented Apr 18, 2022

Closing this issue because it is inactive. Feel free to reopen it if you still have any problems.

@byshiue byshiue closed this as completed Apr 18, 2022
@ehuaa

ehuaa commented Nov 8, 2023

> Thanks for the link, it will be very helpful!

@pommedeterresautee I'm running into the same issue using weights from a Hugging Face RoBERTa model with the BERT FT model. Were you able to use BERT directly? Thanks
