How to load the pretrained models in pytorch #3

Closed
ayushtues opened this issue Jun 12, 2022 · 5 comments

@ayushtues

Hi, how can I instantiate the SpeechT5 model in a PyTorch script and load the provided pretrained weights into it?

Something similar to this (which doesn't work, by the way):
[image not preserved]

@Ajyy
Collaborator

Ajyy commented Jun 13, 2022

Hi,

I have updated the steps to instantiate the model and load the checkpoint here. Thanks.

@ayushtues
Author

Hi, thanks for the quick reply and for providing the instructions! I had a few more questions.

In the updated code, we need access to hubert_label_dir and data (here) to create the task object, which is then used to define the model architecture:

checkpoint['cfg']['task'].t5_task = 'pretrain'
checkpoint['cfg']['task'].hubert_label_dir = "/path/to/hubert_label"
checkpoint['cfg']['task'].data = "/path/to/tsv_file"

task = SpeechT5Task.setup_task(checkpoint['cfg']['task'])
model = T5TransformerModel.build_model(checkpoint['cfg']['model'], task)

Are there small dummy files which can be used here, or a way to define the model architecture without these files?

I just want to load the model using the SpeechT5 Base pretrained weights provided in the Readme (here) to inspect it, and maybe run some forward passes on dummy inputs. Is it necessary to download the data for this (which is pretty huge)?

Thanks in advance!

@Ajyy
Collaborator

Ajyy commented Jun 15, 2022

Hi, I'm glad it helped.

Yes, if you just want to load the model, you only need to put the dictionaries under those paths. More concretely, put the text dictionary under data and the pseudo-code dictionary under hubert_label_dir, since both are needed to set up the task (here).
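A quick way to see which dictionary files are still missing before calling setup_task could look like the sketch below. The helper name missing_dictionaries and the file names dict.txt and dict.km.txt are assumptions for illustration and are not stated in this thread; check the task code for the names it actually loads.

```python
from pathlib import Path


def missing_dictionaries(data_dir, hubert_label_dir,
                         text_dict_name="dict.txt",
                         code_dict_name="dict.km.txt"):
    """Return the expected dictionary paths that do not exist yet.

    The default file names are assumptions; adjust them to whatever
    the task code actually reads from the two directories.
    """
    expected = [
        Path(data_dir) / text_dict_name,          # text dictionary under `data`
        Path(hubert_label_dir) / code_dict_name,  # pseudo-code dictionary under `hubert_label_dir`
    ]
    return [path for path in expected if not path.is_file()]
```

Calling missing_dictionaries("/path/to/tsv_file", "/path/to/hubert_label") returns an empty list once both files are in place.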

The pseudo-code dictionary can be created with the code here, with n_clusters set to 500. The text dictionary can be downloaded here.
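As a rough Python sketch of the pseudo-code dictionary step, assuming the fairseq-style dictionary format of one `<symbol> <count>` pair per line: the function name write_pseudo_code_dict and the output file name are hypothetical, and the count of 1 is only a placeholder.

```python
from pathlib import Path


def write_pseudo_code_dict(path, n_clusters=500):
    """Write one "<cluster_id> 1" line per cluster id, from 0 to n_clusters - 1.

    The count column is a placeholder value; only the symbols matter
    when the dictionary is used just to set up the task.
    """
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        for cluster_id in range(n_clusters):
            f.write(f"{cluster_id} 1\n")
    return path
```

For example, write_pseudo_code_dict("/path/to/hubert_label/dict.km.txt") would produce a 500-line file; the file name here is an assumption, so use whichever name the task code expects.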

To prepare dummy inputs and run forward passes, you may need to follow the dataset code.

Thanks!

@ayushtues
Author

ayushtues commented Jun 15, 2022

Thanks a lot, this helped me load the model!

For future reference, the pseudo-code dictionary code is here; the link above points to the task code.

@Ajyy
Collaborator

Ajyy commented Jun 15, 2022

Oh yes, sorry for the mistake. If you have further problems, please let me know.
