This repository has been archived by the owner on Jan 16, 2022. It is now read-only.

Beginner question about the models #15

Closed
SugariuClaudiu opened this issue Jan 17, 2020 · 6 comments
Labels
wontfix This will not be worked on

Comments

@SugariuClaudiu

SugariuClaudiu commented Jan 17, 2020

Hello and thank you for the great work.

This might be a rather stupid question, but I'm just getting started with NLP. Apologies in advance.
Could you give me a brief intro to the files generated in the models?

I am asking this because when I try to load the head model class and the tokenizer class I get the following error: We assumed '../russian_models/gpt2/m_checkpoint-3364613' was a path or url to a directory containing vocabulary files named ['vocab.json', 'merges.txt'] but couldn't find such vocabulary files at this path or url

Of course, those files are not present there, but I'm not sure where to start at the moment.
And a second question if possible, are you licensing this project under MIT by any chance?

Thank you in advance.

@mgrankin
Owner

Hello, how did you get this awesome error message, and what are you trying to achieve?

@SugariuClaudiu
Author

SugariuClaudiu commented Jan 17, 2020

I'm trying to generate text based on a prompt input.

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

MODEL_CLASSES = {'gpt2': (GPT2LMHeadModel, GPT2Tokenizer)}
model_class, tokenizer_class = MODEL_CLASSES['gpt2']
tokenizer = tokenizer_class.from_pretrained('gpt2')
model = model_class.from_pretrained('gpt2')
model.to(device)
model.eval()

And then sample a sequence.
I've used this as inspiration: https://github.com/gabrielelanaro/ml-prototypes/blob/master/prototypes/styletransfer/huggingface/huggingface.py

@mgrankin
Owner

I didn't use the default tokeniser, but you're using the default code with the default tokeniser, and that's what causes the error.

To understand how to generate text you should start by looking at rest.py.

@SugariuClaudiu
Author

Thank you for this. I had a look. However, this approach is too computationally heavy. I am trying to run it on an EC2 instance in a reasonable amount of time so I can serve it as an API call.

Any suggestions?
And again, the question about the license: under what terms can I use your models?

@mgrankin
Owner

Have you tried to run it on a GPU instance?

Apache 2.0

@stale

stale bot commented Mar 19, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Mar 19, 2020
@stale stale bot closed this as completed Mar 26, 2020