
Unable to download community models #2392

Closed
2 of 4 tasks
cbowdon opened this issue Jan 3, 2020 · 6 comments
@cbowdon
cbowdon commented Jan 3, 2020

🐛 Bug

Model I am using (Bert, XLNet....): bert-base-cased-finetuned-conll03-english

Language I am using the model on (English, Chinese....): English

The problem arises when using:

  • the official example scripts: running a small snippet from docs (see below)
  • my own modified scripts: (give details)

The task I am working on is:

  • an official GLUE/SQuAD task: (give the name)
  • my own task or dataset: just trying to load the model at this stage

To Reproduce

Steps to reproduce the behavior:

I'm following the instructions at https://huggingface.co/bert-large-cased-finetuned-conll03-english but failing at the first hurdle. This is the snippet from the docs that I've run:

from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-large-cased-finetuned-conll03-english")
model = AutoModel.from_pretrained("bert-large-cased-finetuned-conll03-english")

It fails with this message:

OSError: Model name 'bert-base-cased-finetuned-conll03-english' was not found in model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased, bert-base-japanese, bert-base-japanese-whole-word-masking, bert-base-japanese-char, bert-base-japanese-char-whole-word-masking, bert-base-finnish-cased-v1, bert-base-finnish-uncased-v1). We assumed 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-conll03-english/config.json' was a path or url to a configuration file named config.json or a directory containing such a file but couldn't find any such file at this path or url.

The message mentions looking at https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-conll03-english/config.json and finding nothing.

I also tried the CLI: transformers-cli download bert-base-cased-finetuned-conll03-english, but that failed with a similar message. However, both methods work for namespaced models, e.g. dbmdz/bert-base-italian-cased (see the snippet below).
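
For reference, a minimal sketch of the namespaced case that does work for me (assuming transformers 2.3.0 and network access to the S3 bucket):

from transformers import AutoModel, AutoTokenizer

# Namespaced community models resolve fine, e.g. dbmdz/bert-base-italian-cased
tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-italian-cased")
model = AutoModel.from_pretrained("dbmdz/bert-base-italian-cased")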

Expected behavior

The community model should download. :)

Environment

  • OS: openSUSE Tumbleweed 20200101
  • Python version: 3.7
  • PyTorch version: 1.3.1
  • PyTorch Transformers version (or branch): 2.3.0
  • Using GPU? n/a
  • Distributed or parallel setup? n/a
  • Any other relevant information:

Additional context

I browsed https://s3.amazonaws.com/models.huggingface.co/ and see that the model is there, but paths are like:

https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-conll03-english-config.json

rather than:

https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-conll03-english/config.json

(note -config.json vs /config.json)

If I download the files manually, rename them, and point from_pretrained at the local directory, the model loads (rough sketch below). So it looks like just a naming problem.
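
A hedged sketch of that manual workaround; the local directory name is my own choice, and I'm assuming vocab.txt and pytorch_model.bin follow the same -&lt;filename&gt; pattern as the config:

import os
import urllib.request

from transformers import AutoModel, AutoTokenizer

# The S3 layout is flat ("<model>-<file>"), so fetch each file and save it
# under the name that from_pretrained expects inside a local directory.
base = "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-conll03-english"
local_dir = "bert-base-cased-finetuned-conll03-english"  # hypothetical local folder
os.makedirs(local_dir, exist_ok=True)

for filename in ("config.json", "vocab.txt", "pytorch_model.bin"):
    urllib.request.urlretrieve(base + "-" + filename, os.path.join(local_dir, filename))

tokenizer = AutoTokenizer.from_pretrained(local_dir)
model = AutoModel.from_pretrained(local_dir)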

@mandubian

I can confirm what you're seeing. In the current master code, bert-large-cased-finetuned-conll03-english has no mapping in the tokenizer or model shortcut lists, so it can't be resolved the same way as, for example, bert-base-uncased.

But it works if you target the files directly:

AutoTokenizer.from_pretrained("https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-conll03-english-config.json")

AutoModel.from_pretrained("https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-conll03-english-pytorch_model.bin")

@julien-c
Member

Hmm, I think I see the issue. @stefan-it @mfuntowicz we could either:

  • move bert-large-cased-finetuned-conll03-english to dbmdz/bert-large-cased-finetuned-conll03-english
  • or add shortcut model names inside the codebase (config, model, tokenizer)

What do you think?

(also kinda related to #2281)

@stefan-it
Collaborator

@julien-c I think it would be better to move the model under the dbmdz namespace, as it is not an "official" model!

@mfuntowicz
Member

@julien-c moving to dbmdz is fine. We need to update the default NER pipeline's model provider to reflect the new path.
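
For context, a rough user-facing sketch of what the call would look like once the model moves; the explicit model/tokenizer arguments are only for illustration, the actual default mapping lives in the pipelines code:

from transformers import pipeline

# Point the NER pipeline at the relocated, namespaced model explicitly.
ner = pipeline(
    "ner",
    model="dbmdz/bert-large-cased-finetuned-conll03-english",
    tokenizer="dbmdz/bert-large-cased-finetuned-conll03-english",
)
print(ner("Hugging Face Inc. is based in New York City."))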

@julien-c
Member

Model now lives at https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english

Let me know if everything works correctly!
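
A quick way to verify, assuming transformers 2.3.x:

from transformers import AutoModel, AutoTokenizer

# The model now resolves via the dbmdz namespace.
tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-large-cased-finetuned-conll03-english")
model = AutoModel.from_pretrained("dbmdz/bert-large-cased-finetuned-conll03-english")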

@cbowdon
Author

cbowdon commented Jan 17, 2020

Works perfectly now, thanks!
