
Incorrect prediction using Hugging Face Transformers model converted to ONNX format #62

Closed
aforoughi1 opened this issue Mar 5, 2022 · 1 comment

Comments

@aforoughi1

Problem:
All my predictions for the test set were "neutral"

Platform:
1- Hugging Face Transformers
2- ProsusAI/finbert pretrained model
3- ML.NET and VS2022

Steps to reproduce:
1- installed 64-bit Miniconda
2- pip install transformers
3- pip install onnxruntime
4- pip install torch
5- python -m transformers.onnx --model=ProsusAI/finbert --feature=sequence-classification onnx
6- Inspect the model (model.onnx) and figure out its inputs and outputs; we used Netron.
7- Create classes that handle the model input and output. The ONNX model is loaded with ML.NET
using ApplyOnnxModel when creating the training pipeline. Make sure that shapeDictionary is
defined when calling ApplyOnnxModel during pipeline creation, then call the Fit method with an
empty list.
8- The first challenge when working with Hugging Face Transformers in .NET is that we need to
build our own tokenizer, which also means taking care of the vocabulary. We did not use the
FinBERT tokenizer ("ProsusAI/finbert"); instead we used an uncased FinBERT vocabulary downloaded from
https://github.com/yya518/FinBERT

9- The model output is stored in a variable logits, an n×3 matrix where n is the number of sentences. We applied the softmax function to express the output as a discrete probability distribution and mapped ids to labels using label2id from config.json (a sketch of this post-processing follows the results below):
"label2id": {
"positive": 0,
"negative": 1,
"neutral": 2
}

10- The result:

BERT tokenizer output
Index: 3 112 21 24250 94 12 22158 4
Token: '[CLS]' 'there' 'are' 'doubts' 'about' 'our' 'finances' '[SEP]'

Index   Score    Probability  Label
0        0.4706  0.3272       Positive
1       -1.0515  0.0714       Negative
2        1.0794  0.6014       Neutral  <--- best prediction
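As a reference for step 9, this is a minimal Python sketch of that post-processing, using the scores in the table above and an id2label mapping derived from the label2id block in config.json (variable names are illustrative):

```python
import numpy as np

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    exps = np.exp(logits - np.max(logits))
    return exps / exps.sum()

# Inverse of the label2id mapping from config.json.
id2label = {0: "positive", 1: "negative", 2: "neutral"}

logits = np.array([0.4706, -1.0515, 1.0794])  # scores from the table above
probs = softmax(logits)
print({id2label[i]: round(float(p), 4) for i, p in enumerate(probs)})
# {'positive': 0.3272, 'negative': 0.0714, 'neutral': 0.6014}
```

These values match the probabilities reported in the table, so the softmax step itself reproduces the listed scores.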

Expected behaviour:

I tested the model on Hugging Face and it seems to work fine:

"growth is strong and we have plenty of liquidity"

[
  [
    {
      "label": "positive",
      "score": 0.9025793075561523
    },
    {
      "label": "negative",
      "score": 0.010695128701627254
    },
    {
      "label": "neutral",
      "score": 0.08672556281089783
    }
  ]
]
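To check whether the exported model.onnx reproduces this result outside of .NET (isolating the export from the ML.NET code), it can be run directly with onnxruntime and the original tokenizer. A sketch, assuming the onnxruntime and transformers packages and the "onnx" output directory from step 5:

```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ProsusAI/finbert")
session = ort.InferenceSession("onnx/model.onnx")

# Tokenize the same sentence that was tested on Hugging Face.
inputs = tokenizer("growth is strong and we have plenty of liquidity",
                   return_tensors="np")

# ONNX Runtime expects int64 ids; cast to be safe on all platforms.
feed = {name: array.astype(np.int64) for name, array in inputs.items()}

# The sequence-classification export exposes a single "logits" output.
logits = session.run(None, feed)[0]
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
print(probs)  # should be close to the hosted scores above
```

If this matches the hosted output, the exported model is fine and the difference lies in the .NET tokenization or post-processing.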

All my predictions for the test set were neutral, which led me to believe something was wrong with my code. Possible causes:

1- the vocabulary
2- the interpretation of the model output logits using softmax
3- the mapping of label2id

Could you help?
- Provide your vocabulary file so we can eliminate 1.
- Explain the output, and is softmax the correct approach?
- Is the label2id mapping correct?
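One way to rule out the vocabulary (item 1 above) without the file itself is to compare the ids produced by the custom .NET tokenizer against the reference Hugging Face tokenizer for the same sentence. A minimal sketch, assuming the transformers package is installed:

```python
from transformers import AutoTokenizer

# Reference tokenizer for the model that was exported to ONNX.
tokenizer = AutoTokenizer.from_pretrained("ProsusAI/finbert")

sentence = "there are doubts about our finances"
ids = tokenizer(sentence)["input_ids"]

# Compare these against the ids produced by the .NET tokenizer
# (3 112 21 24250 94 12 22158 4 in the results above).
print(ids)
print(tokenizer.convert_ids_to_tokens(ids))
```

If the ids differ, the vocabulary used in .NET does not match the one the exported model was trained with, which would explain the wrong predictions.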

@aforoughi1 (Author)

The problem was the language model. I switched to "bert-base-uncased", and the ONNX export and ML.NET pipeline then worked perfectly.
