
Help with NER usage #8

Closed

max-frai opened this issue Feb 27, 2020 · 5 comments

@max-frai commented Feb 27, 2020

Hello, I have this trained model:
https://github.com/deepmipt/DeepPavlov/blob/0.8.0/deeppavlov/configs/ner/ner_ontonotes_bert_mult.json

In the downloads section it mentions two items: a BERT model and a NER model. Is it possible to use the NER model with this crate? It requires vocab.txt, but the list of files is:

checkpoint 
model.data-00000-of-00001
model.index
model.meta
tag.dict
@guillaume-be (Owner)

Hello,

This library has been designed to offer compatibility with the Pytorch models trained with Huggingface's Transformers library (https://github.com/huggingface/transformers). These models include the official Huggingface pretrained and finetuned models (https://huggingface.co/transformers/pretrained_models.html) and the community models (https://huggingface.co/models).

They require the following:

  1. config.json file
  2. pytorch_model.bin file
  3. vocab.txt file
  4. (optional) merges.txt file for the BPE-based models

In order to use this Rust implementation, there are 3 alternatives:

  1. Use one of the Transformers pre-trained models. An example for NER (finetuned on CoNLL03) can be set up automatically by running the download-dependencies_bert_ner.py script in this repository. Any other community model should work similarly by modifying the download script.
  2. Train your own model using the Transformers library, and run the script pointing to the model you trained this way.
  3. Convert an existing model that was not trained using the Transformers library to the format required by this library. This 3rd option is the most flexible but probably requires the most work.

For the 3rd option, the parameter names are assumed to follow the same conventions as the models from the Python Transformers library. In order to load weights from a different definition, additional pre-processing steps will be required (a sketch of steps 4 and 5 is given after this list). These include:

  1. Creation of a config.json file matching the requirements of the library (for BERT, an example can be found at https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json)
  2. Saving the wordpiece vocabulary in a format similar to the one used in https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt
  3. If you start from a Tensorflow model, conversion of the model to a Pytorch set of weights
  4. Updating the parameter names to match those expected by the Transformers implementation. An example can be found at https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin
  5. Modifying the download-dependencies_bert.py script to use the files you prepared in order to generate the .ot file expected by the Rust implementation.
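For illustration, here is a minimal sketch of steps 4 and 5. The file paths and the renaming map are placeholders; the actual renames depend on how the source checkpoint named its layers:

import numpy as np
import torch
from pathlib import Path

target_path = Path('path/to/output')

# Step 3 output: a Pytorch state dict, loaded on CPU
weights = torch.load('path/to/pytorch_model.bin', map_location='cpu')

# Step 4: map the source parameter names to the Transformers conventions
# (illustrative entry only; build the full mapping for your checkpoint)
renames = {'classifier_layer.bias': 'classifier.bias'}

nps = {}
for name, tensor in weights.items():
    new_name = renames.get(name, name)
    nps[new_name] = np.ascontiguousarray(tensor.cpu().numpy())

# Step 5: save as .npz; the repository scripts then convert this to the .ot file
np.savez(target_path / 'model.npz', **nps)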

I hope this helps,

@max-frai (Author)

Hello! Thank you for such a descriptive answer. I need NER for the Russian language, so I can't use the base models from Huggingface.
Here I found a trained BERT for Russian:

RuBERT, Russian, cased, 12-layer, 768-hidden, 12-heads, 180M parameters: [deeppavlov], [deeppavlov_pytorch]

The pytorch version contains exactly the files needed by this crate.

I converted it with a modified download-dependencies_bert.py. The first problem is that the initial bert_config.json does not have any label information (label2id, id2label, num_labels). I tried to fill it in manually from the documentation on the download page, but I'm not sure how to find out which ids were used for the labels during training.

Now the NER example fails with:

Error: ErrorMessage { msg: "cannot find classifier.bias in \"/Users/frai/rustbert/bert-ner/model.ot\"" }

I think the problem is wrong label ids and counts.
Also, I'm relatively new to pytorch, tensorflow and deep learning, so I'm not sure about some questions.
Do I need a special NER model based on BERT? Have all BERT models been trained for the NER task? Also, how do I get the correct ids and labels if they are missing from the config?

Thank you.

@max-frai (Author)

I dumped some values from tch in the load function:

        // collect the tensors read from the weights file into a map
        let named_tensors: HashMap<_, _> = named_tensors.into_iter().collect();
        // the variables currently registered in the var store
        let mut variables = self.variables_.lock().unwrap();

So variables contains:

"classifier.bias": [0.014156151562929153, -0.026622507721185684, 0.0007353387773036957, 0.021386444568634033, 0.0057175420224666595, -0.020915638655424118, -0.03568051755428314, -0.025959249585866928, -0.027042701840400696],

And it tries to get classifier.bias from named_tensors, where it is missing. So it looks like the problem is in the model itself, perhaps some other naming mismatch.

So I looked at named_tensors and found the following candidate keys:

"cls.seq_relationship.bias": [-0.025538872927427292, 0.0279309693723917],
"cls.predictions.bias": Tensor[[119547], Float],
"bert.embeddings.LayerNorm.bias": Tensor[[768], Float],
"cls.predictions.transform.dense.bias": Tensor[[768], Float],

Maybe one of these keys is actually classifier.bias?

@guillaume-be (Owner) commented Feb 29, 2020

Hello,

I believe the Pytorch model you are trying to use for NER does not contain the set of weights required for the task. The cls.seq_relationship layer seems to be the last layer used for token classification. However, its dimension (2) seems too low for a NER task, and definitely too low for an Ontonotes NER model. Instead, it corresponds to a BERT model for sequence classification (e.g. sentiment analysis).
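
To check what a checkpoint actually contains, you can inspect the state dict directly; a minimal sketch (the path is a placeholder):

import torch

# List every parameter name with its shape. A token classification (NER)
# head would appear as e.g. 'classifier.weight' with a first dimension
# equal to the number of labels.
weights = torch.load('path/to/pytorch_model.bin', map_location='cpu')
for name, tensor in weights.items():
    print(name, tuple(tensor.shape))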

The NER models seem to be stored on another page: http://docs.deeppavlov.ai/en/master/features/models/ner.html. I have tried downloading the model, but could not find a Pytorch version.

However, I validated that the Deeppavlov models would be compatible with this Rust implementation. If you replace the conversion section of the script with the following:

import numpy as np
import torch
from pathlib import Path

target_path = Path('path/to/output')

# load the weights and rename the final layer to the expected 'classifier'
weights = torch.load('path/to/rubert_cased_L-12_H-768_A-12_pt/pytorch_model.bin')
nps = {}
for k, v in weights.items():
    nps[k.replace('cls.seq_relationship', 'classifier')] = np.ascontiguousarray(v.cpu().numpy())
np.savez(target_path / 'model.npz', **nps)

And add the following to the config.json file:

  "num_labels": 2,
    "label2id": {
    "O": 0,
    "B-MISC": 1
  },
  "id2label": {
    "0": "O",
    "1": "B-MISC"
  }

You should then be able to run the model. However, as this is not the correct last layer for NER, it will run but not yield any useful results.
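
Regarding the missing label ids: if the tag.dict file shipped with the NER model lists the tags used during training, you may be able to rebuild the mapping from it. This is a hypothetical sketch, assuming one tag per line, in id order, possibly followed by a tab-separated count:

import json

label2id = {}
with open('path/to/tag.dict', encoding='utf-8') as f:
    for i, line in enumerate(f):
        # keep only the tag, dropping a possible tab-separated count
        tag = line.rstrip('\n').split('\t')[0]
        label2id[tag] = i

id2label = {str(i): tag for tag, i in label2id.items()}
print(json.dumps({'num_labels': len(label2id),
                  'label2id': label2id,
                  'id2label': id2label}, indent=2))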

If you can find a Pytorch version of the NER model, or successfully convert the Tensorflow weights to Pytorch, I'd be glad to help further.

Thank you,

@guillaume-be (Owner)

Hello,

The Deeppavlov team has uploaded models in the Huggingface repository: https://huggingface.co/DeepPavlov. As far as I could see, this does not include a NER model yet.

I will close this issue for now. Please re-open when you have found a NER model to use and need support for loading the weights.

Thank you
