
[Faster transformer] having a guide on how to use weights from a Hugginface transfomer model (Roberta based) with faster transformer 3.1 #56

Closed
pommedeterresautee opened this issue Mar 31, 2021 · 4 comments

@pommedeterresautee

Related to Faster transformers + Hugging face + Pytorch

Is your feature request related to a problem? Please describe.
It seems that FasterTransformer should be able to import weights from a RoBERTa-based Hugging Face model, but how to do so is not obvious.

Describe the solution you'd like
A section of the README dedicated to using weights from Hugging Face v4 (the latest version) in a FasterTransformer model.

Describe alternatives you've considered
N/A

Additional context
At some point in the project, Hugging Face v2 is used, but my attempt to load a RoBERTa-based model from Hugging Face v4 failed, even though in theory it's the same architecture. I tried renaming the layers to match those expected by BERT, but it didn't work: the outputs didn't match the ones before the transfer... There are probably other transformations to perform, but I couldn't find which ones.

def rewrite_layer_name(layer_name: str) -> str:
    # Map Hugging Face Roberta layer names onto the BERT names expected by FT.
    if "roberta." in layer_name:
        # Roberta's embeddings/encoder share BERT's structure, so only the prefix changes.
        layer_name = layer_name.replace("roberta.", "bert.")
    elif "classifier.dense." in layer_name:
        # Roberta's classification-head dense layer plays the role of BERT's pooler.
        layer_name = layer_name.replace("classifier.dense.", "bert.pooler.dense.")
    elif "classifier.out_proj." in layer_name:
        # The output projection becomes the single-layer classifier.
        layer_name = layer_name.replace("classifier.out_proj.", "classifier.")
    return layer_name
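For reference, here is a minimal, self-contained sketch of how such a rename could be applied over a whole state dict. The function and sample keys below are illustrative (a handful of made-up entries, not a full RoBERTa checkpoint), and the mapping mirrors the snippet above rather than a verified complete conversion:

```python
def rewrite_layer_name(layer_name: str) -> str:
    # Illustrative Roberta -> BERT name mapping (same rules as the snippet above).
    if "roberta." in layer_name:
        return layer_name.replace("roberta.", "bert.")
    if "classifier.dense." in layer_name:
        return layer_name.replace("classifier.dense.", "bert.pooler.dense.")
    if "classifier.out_proj." in layer_name:
        return layer_name.replace("classifier.out_proj.", "classifier.")
    return layer_name

# Placeholder state dict: in practice this would be model.state_dict() tensors.
state_dict = {
    "roberta.encoder.layer.0.attention.self.query.weight": "tensor-0",
    "classifier.dense.weight": "tensor-1",
    "classifier.out_proj.bias": "tensor-2",
}

# Rename every key; values (the weight tensors) are carried over unchanged.
renamed = {rewrite_layer_name(k): v for k, v in state_dict.items()}
```

After this pass, `renamed` would be loaded into the BERT-side model with something like `load_state_dict`; as noted above, renaming alone did not make the outputs match, so further transformations are presumably still needed.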
@pommedeterresautee pommedeterresautee changed the title [Fast transformer] having a guide on how to use weights from a Hugginface transfomer model (Roberta based) with fast transformer 3.1 [Faster transformer] having a guide on how to use weights from a Hugginface transfomer model (Roberta based) with faster transformer 3.1 Mar 31, 2021
@byshiue byshiue self-assigned this Mar 31, 2021
@byshiue byshiue transferred this issue from NVIDIA/DeepLearningExamples Apr 5, 2021
@byshiue
Collaborator

byshiue commented Apr 6, 2021

If the model architectures of BERT and RoBERTa are the same, you should be able to run RoBERTa on FT.
You can refer to this converter and check that the weights are placed correctly in https://github.com/NVIDIA/FasterTransformer/blob/main/sample/pytorch/utils/encoder.py.

You can also try to run RoBERTa with the BERT model directly, because we have shown that FT can use a BERT model directly.
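As a sketch of "checking that the weights are placed correctly", one could compare the renamed key set and tensor shapes against what the BERT-side model expects before handing them to FT. The shapes below are illustrative placeholders, except the position-embedding rows: RoBERTa is known to allocate 514 position slots (512 plus a 2-slot padding offset) where BERT allocates 512, so a pure rename leaves a shape and indexing mismatch to resolve, which may explain the differing outputs:

```python
# Key/shape sanity check between a renamed Roberta state dict and the
# BERT-side expectation. Shapes are placeholders except the position
# embeddings, where Roberta's 514 rows vs BERT's 512 is a real difference.
renamed_roberta = {
    "bert.embeddings.word_embeddings.weight": (50265, 768),
    "bert.embeddings.position_embeddings.weight": (514, 768),
    "bert.pooler.dense.weight": (768, 768),
}
bert_expected = {
    "bert.embeddings.word_embeddings.weight": (50265, 768),
    "bert.embeddings.position_embeddings.weight": (512, 768),
    "bert.pooler.dense.weight": (768, 768),
}

# Keys the BERT side expects but the converted checkpoint lacks.
missing = sorted(set(bert_expected) - set(renamed_roberta))
# Keys present on both sides whose shapes disagree.
mismatched = sorted(k for k in renamed_roberta
                    if bert_expected.get(k) not in (None, renamed_roberta[k]))
```

In a real run, the dicts would be built from `{k: tuple(v.shape) for k, v in model.state_dict().items()}` on each side; an empty `missing` and `mismatched` is a necessary (though not sufficient) condition for the transfer being correct.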

@pommedeterresautee
Author

Thanks for the link, it will be very helpful!

@byshiue
Collaborator

byshiue commented Apr 18, 2022

Closing this issue because it is inactive. Feel free to reopen it if you still have any problems.

@byshiue byshiue closed this as completed Apr 18, 2022
@ehuaa

ehuaa commented Nov 8, 2023

> Thanks for the link, it will be very helpful!

@pommedeterresautee I'm running into the same issue using weights from a Hugging Face RoBERTa model with the BERT FT model. Were you able to use BERT directly? Thanks
