
Help with NER usage #8

Closed

max-frai opened this issue Feb 27, 2020 · 5 comments

@max-frai commented Feb 27, 2020

Hello, I have this trained model:
https://github.com/deepmipt/DeepPavlov/blob/0.8.0/deeppavlov/configs/ner/ner_ontonotes_bert_mult.json

In the downloads section it mentions two items: a BERT model and a NER model. Is it possible to use the NER model with this crate? It requires vocab.txt, but the list of files is:

checkpoint 
model.data-00000-of-00001
model.index
model.meta
tag.dict
@guillaume-be (Owner)

Hello,

This library has been designed to offer compatibility with the Pytorch models trained with Huggingface's Transformers library (https://github.com/huggingface/transformers). These models include the official Huggingface pretrained and finetuned models (https://huggingface.co/transformers/pretrained_models.html) and the community models (https://huggingface.co/models).

They require the following:

  1. config.json file
  2. pytorch_model.bin file
  3. vocab.txt file
  4. (optional) merges.txt file for the BPE-based models

In order to use this Rust implementation, there are 3 alternatives:

  1. Use one of the Transformers pre-trained models. An example for NER (finetuned on CoNLL03) can be set up automatically by running the download-dependencies_bert_ner.py script in this repository. Any other community model should work similarly by modifying the download script.
  2. Train your own model using the Transformers library, and run the script pointing to the model you trained this way.
  3. Convert an existing model that was not trained using the Transformers library to the format required by this library. This 3rd option is the most flexible but probably requires the most work.

For the 3rd option, the parameter names are assumed to follow the same conventions as the models from the Python Transformers library. In order to load weights from a different definition, additional pre-processing steps will be required (a sketch of steps 4 and 5 is given after this list). These include:

  1. Creation of a config.json file matching the requirements of the library (for BERT, an example can be found at https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json)
  2. Saving the wordpiece vocabulary in a format similar to the one used in https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt
  3. If you start from a Tensorflow model, conversion of the model to a Pytorch set of weights
  4. Updating the parameter names to match those expected by the Transformers implementation. An example can be found at https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin
  5. Modifying the download-dependencies_bert.py script to use the files you prepared in order to generate the .ot file expected by the Rust implementation.
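For illustration, here is a minimal sketch of steps 4 and 5. The file paths and the renaming map are placeholders; the actual renames depend on how the source checkpoint named its layers:

import numpy as np
import torch
from pathlib import Path

target_path = Path('path/to/output')

# Step 3 output: a Pytorch state dict, loaded on CPU
weights = torch.load('path/to/pytorch_model.bin', map_location='cpu')

# Step 4: map the source parameter names to the Transformers conventions
# (illustrative entry only; build the full mapping for your checkpoint)
renames = {'classifier_layer.bias': 'classifier.bias'}

nps = {}
for name, tensor in weights.items():
    new_name = renames.get(name, name)
    nps[new_name] = np.ascontiguousarray(tensor.cpu().numpy())

# Step 5: save as .npz; the repository scripts then convert this to the .ot file
np.savez(target_path / 'model.npz', **nps)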

I hope this helps,

@max-frai (Author)

Hello! Thank you for such a descriptive answer. I need NER for the Russian language, so I can't use the base models from Huggingface.
Here I found a trained BERT for Russian:

RuBERT, Russian, cased, 12-layer, 768-hidden, 12-heads, 180M parameters: [deeppavlov], [deeppavlov_pytorch]

The pytorch version contains exactly the files needed by this crate.

I converted it with a modified download-dependencies_bert.py. The first problem is that the initial bert_config.json does not have any label information (label2id, id2label, num_labels). I tried to fill it in manually from the documentation on the download page, but I'm not sure how to find out which ids were used for the labels during training.

Now the NER example fails with:

Error: ErrorMessage { msg: "cannot find classifier.bias in \"/Users/frai/rustbert/bert-ner/model.ot\"" }

I think the problem is wrong label ids and counts.
Also, I'm relatively new to pytorch, tensorflow and deep learning, so I'm not sure about some questions.
Do I need a special NER model based on BERT? Have all BERT models been trained for the NER task? Also, how do I get the correct ids and labels if they are missing from the config?

Thank you.

@max-frai (Author)

I dumped some values from tch in the load function:

        // collect the tensors read from the weights file into a map
        let named_tensors: HashMap<_, _> = named_tensors.into_iter().collect();
        // the variables currently registered in the var store
        let mut variables = self.variables_.lock().unwrap();

So variables contains:

"classifier.bias": [0.014156151562929153, -0.026622507721185684, 0.0007353387773036957, 0.021386444568634033, 0.0057175420224666595, -0.020915638655424118, -0.03568051755428314, -0.025959249585866928, -0.027042701840400696],

And it tries to get classifier.bias from named_tensors, where it is missing. So it looks like the problem is in the model itself, perhaps some other naming mismatch.

So I looked at named_tensors and found the following candidate keys:

"cls.seq_relationship.bias": [-0.025538872927427292, 0.0279309693723917],
"cls.predictions.bias": Tensor[[119547], Float],
"bert.embeddings.LayerNorm.bias": Tensor[[768], Float],
"cls.predictions.transform.dense.bias": Tensor[[768], Float],

Maybe one of these keys is actually classifier.bias?

@guillaume-be (Owner) commented Feb 29, 2020

Hello,

I believe the Pytorch model you are trying to use for NER does not contain the set of weights required for the task. The cls.seq_relationship layer seems to be the last layer used for token classification. However, its dimension (2) seems too low for a NER task, and definitely too low for an Ontonotes NER model. Instead, it corresponds to a BERT model for sequence classification (e.g. sentiment analysis).
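
To check what a checkpoint actually contains, you can inspect the state dict directly; a minimal sketch (the path is a placeholder):

import torch

# List every parameter name with its shape. A token classification (NER)
# head would appear as e.g. 'classifier.weight' with a first dimension
# equal to the number of labels.
weights = torch.load('path/to/pytorch_model.bin', map_location='cpu')
for name, tensor in weights.items():
    print(name, tuple(tensor.shape))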

The NER models seem to be stored on another page: http://docs.deeppavlov.ai/en/master/features/models/ner.html. I have tried downloading the model, but could not find a Pytorch version.

However, I validated that the Deeppavlov models would be compatible with this Rust implementation. If you replace the conversion section of the script with the following:

import numpy as np
import torch
from pathlib import Path

target_path = Path('path/to/output')

# load the weights and rename the final layer to the expected 'classifier'
weights = torch.load('path/to/rubert_cased_L-12_H-768_A-12_pt/pytorch_model.bin')
nps = {}
for k, v in weights.items():
    nps[k.replace('cls.seq_relationship', 'classifier')] = np.ascontiguousarray(v.cpu().numpy())
np.savez(target_path / 'model.npz', **nps)

And add the following to the config.json file:

  "num_labels": 2,
    "label2id": {
    "O": 0,
    "B-MISC": 1
  },
  "id2label": {
    "0": "O",
    "1": "B-MISC"
  }

You should then be able to run the model. However, as this is not the correct last layer for NER, it will run but not yield any useful results.
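
Regarding the missing label ids: if the tag.dict file shipped with the NER model lists the tags used during training, you may be able to rebuild the mapping from it. This is a hypothetical sketch, assuming one tag per line, in id order, possibly followed by a tab-separated count:

import json

label2id = {}
with open('path/to/tag.dict', encoding='utf-8') as f:
    for i, line in enumerate(f):
        # keep only the tag, dropping a possible tab-separated count
        tag = line.rstrip('\n').split('\t')[0]
        label2id[tag] = i

id2label = {str(i): tag for tag, i in label2id.items()}
print(json.dumps({'num_labels': len(label2id),
                  'label2id': label2id,
                  'id2label': id2label}, indent=2))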

If you can find a Pytorch version of the NER model, or successfully convert the Tensorflow weights to Pytorch, I'd be glad to help further.

Thank you,

@guillaume-be (Owner)

Hello,

The Deeppavlov team has uploaded models in the Huggingface repository: https://huggingface.co/DeepPavlov. As far as I could see, this does not include a NER model yet.

I will close this issue for now. Please re-open when you have found a NER model to use and need support for loading the weights.

Thank you
