
Fine-Tune on our dataset #6

Closed
miladfa7 opened this issue Sep 12, 2020 · 16 comments

Comments

@miladfa7

How can I fine-tune the ParsBERT model on our dataset?
Please help me.
Thanks

@miladfa7 miladfa7 changed the title Fine-Tune our dataset Fine-Tune on our dataset Sep 12, 2020
@m3hrdadfi
Member

You can use this Colab to fine-tune on your own dataset for text classification tasks. For other downstream tasks, I'm afraid you'll need to be patient; I'll add the others soon!
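
In case the Colab is unavailable, here is a minimal fine-tuning sketch along the same lines, following the Transformers TF2 examples of that era. The toy data, hyperparameters, and output directory are illustrative, not the notebook's exact values:

from transformers import BertTokenizer, TFBertForSequenceClassification
import tensorflow as tf

MODEL_NAME = "HooshvareLab/bert-fa-base-uncased"

# Hypothetical toy data; replace with your own texts and integer class ids.
texts = ["متن نمونه یک", "متن نمونه دو"]
labels = [0, 1]

tokenizer = BertTokenizer.from_pretrained(MODEL_NAME)
encodings = tokenizer(texts, truncation=True, padding=True, max_length=128, return_tensors="tf")
dataset = tf.data.Dataset.from_tensor_slices((dict(encodings), labels)).batch(16)

model = TFBertForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
model.fit(dataset, epochs=3)

# Save everything Transformers needs to reload the fine-tuned model later.
model.save_pretrained("./my-finetuned-parsbert")
tokenizer.save_pretrained("./my-finetuned-parsbert")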

@tannazhp74

How can I get embeddings from the pretrained ParsBERT model?

@MinaRezaee

I used this Persian BERT classification model.
The model was saved with a config,
but when I try to load that model separately and predict a sentence's label with it,
I get this error:

raise ValueError('No model found in config file.')
ValueError: No model found in config file.

How can I add a config file so that I do not get this error?

@m3hrdadfi
Member

Did you fine-tune ParsBERT on your dataset? Which method did you use (PyTorch, TensorFlow, or a script)? If you didn't use the script technique, did you save your model, and what type of files do you have in your saved model directory?

@MinaRezaee

https://github.com/hooshvare/parsbert/blob/master/notebooks/Taaghche_Sentiment_Analysis.ipynb

I built my model from this link.
Yes, I fine-tuned on my data, and I have 3 labels.
I have two saved files, named tf_model.h5 and config.json (no pytorch_model.bin).
I load my model this way:

from keras.models import load_model
model = load_model('tf_model.h5')

@m3hrdadfi
Member

m3hrdadfi commented Oct 3, 2020

OK, then. Since your model was fine-tuned with Transformers, you can't load it as a plain Keras model; you must load your fine-tuned model using Transformers.
If you have tf_model.h5 in your saved directory, use this:

from transformers import TFAutoModelForSequenceClassification

tf_model = TFAutoModelForSequenceClassification.from_pretrained(YOURSAVED_DIRECTORY)

otherwise, if you have pytorch_model.bin

from transformers import TFAutoModelForSequenceClassification

tf_model = TFAutoModelForSequenceClassification.from_pretrained(YOURSAVED_DIRECTORY, from_pt=True)

Also, make sure you have config.json and vocab.txt in your directory!
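
If those files are missing, it's usually because only the Keras weights were written. A minimal sketch of saving with the Transformers API so that all three files end up in one directory (the base checkpoint and directory name below are just placeholders; in practice you'd save your own fine-tuned model and its tokenizer):

from transformers import BertTokenizer, TFBertForSequenceClassification

SAVE_DIR = "./bert-fa-cls-base-uncased"  # example directory name

# Loaded here only so the snippet runs end to end; use your fine-tuned objects instead.
tokenizer = BertTokenizer.from_pretrained("HooshvareLab/bert-fa-base-uncased")
model = TFBertForSequenceClassification.from_pretrained("HooshvareLab/bert-fa-base-uncased")

model.save_pretrained(SAVE_DIR)      # writes tf_model.h5 and config.json
tokenizer.save_pretrained(SAVE_DIR)  # writes vocab.txt and the tokenizer settings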

@MinaRezaee

With my model, only these two files are saved, named tf_model.h5 and config.json.
I do not have the vocab.txt file.

from transformers import TFAutoModelForSequenceClassification
tf_model = TFAutoModelForSequenceClassification.from_pretrained('tf_model.h5')

This is how I got this error:

(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte

@MinaRezaee

In the notebook I used (linked above), no file with that name (vocab.txt) gets saved.

@m3hrdadfi
Member

m3hrdadfi commented Oct 3, 2020

First of all, you can download vocab.txt from here.

https://cdn.huggingface.co/HooshvareLab/bert-base-parsbert-uncased/vocab.txt

Secondly, you must load the model from the saved directory, not just the h5 file! Suppose I have a directory named bert-fa-cls-base-uncased and it includes:

+ bert-fa-cls-base-uncased
    - config.json
    - vocab.txt
    - tf_model.h5

You need to pass the directory, not the model file by itself; that is, load your model using this piece of code:

from transformers import TFAutoModelForSequenceClassification

tf_model = TFAutoModelForSequenceClassification.from_pretrained("./bert-fa-cls-base-uncased/")

@MinaRezaee

Please help me.
I take an input sentence and I want to predict its label with the saved model,
but I think the preprocessing and padding of my sentence were done wrong,
because I cannot predict the label.

from transformers import BertConfig, BertTokenizer
MODEL_NAME_OR_PATH = 'HooshvareLab/bert-fa-base-uncased'
tokenizer = BertTokenizer.from_pretrained(MODEL_NAME_OR_PATH)
sample_comment= "شعار ما هوش مصنوعی برای همه است"
max_length=32
tokens = tokenizer.tokenize(sample_comment, padding=True, max_length=42)
token_ids = tokenizer.convert_tokens_to_ids(tokens)
from transformers import TFAutoModelForSequenceClassification
tf_model = TFAutoModelForSequenceClassification.from_pretrained("./pytorch_model.bin/")
predictions = tf_model.predict(token_ids)
print(predictions)

@m3hrdadfi
Member

The whole process is simpler than you might think! But before diving in, we need to set some ground rules:

  1. The fine-tuned model is saved in a directory, in this case bert-fa-base-uncased-sentiment-snappfood.
  2. The directory contains these files: config.json, tf_model.h5, vocab.txt.

I'm going to demonstrate all the steps using one of our models, bert-fa-base-uncased-sentiment-snappfood. The procedure is as follows:

  1. Load the packages
  2. Load the config, tokenizer, and the model
  3. Run inference

There is also a preliminary Step 0 for downloading the mentioned model; in your case, you don't need this part.

Step 0

!pip install -qU transformers

!mkdir -p /content/bert-fa-base-uncased-sentiment-snappfood
!wget https://s3.amazonaws.com/models.huggingface.co/bert/HooshvareLab/bert-fa-base-uncased-sentiment-snappfood/config.json -qO /content/bert-fa-base-uncased-sentiment-snappfood/config.json
!wget https://cdn.huggingface.co/HooshvareLab/bert-fa-base-uncased-sentiment-snappfood/tf_model.h5 -qO /content/bert-fa-base-uncased-sentiment-snappfood/tf_model.h5
!wget https://cdn.huggingface.co/HooshvareLab/bert-fa-base-uncased-sentiment-snappfood/vocab.txt -qO /content/bert-fa-base-uncased-sentiment-snappfood/vocab.txt

!ls /content/bert-fa-base-uncased-sentiment-snappfood

Output

config.json  tf_model.h5  vocab.txt

Step 1

from transformers import TFBertForSequenceClassification
from transformers import AutoConfig
from transformers import AutoTokenizer

import tensorflow as tf
import numpy as np

Step 2

config = AutoConfig.from_pretrained('/content/bert-fa-base-uncased-sentiment-snappfood/')
tokenizer = AutoTokenizer.from_pretrained('/content/bert-fa-base-uncased-sentiment-snappfood/')

model = TFBertForSequenceClassification.from_pretrained('/content/bert-fa-base-uncased-sentiment-snappfood/')
model.summary()

Output

All model checkpoint weights were used when initializing TFBertForSequenceClassification.

All the weights of TFBertForSequenceClassification were initialized from the model checkpoint at /content/bert-fa-base-uncased-sentiment-snappfood/.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertForSequenceClassification for predictions without further training.
Model: "tf_bert_for_sequence_classification"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
bert (TFBertMainLayer)       multiple                  162841344 
_________________________________________________________________
dropout_37 (Dropout)         multiple                  0         
_________________________________________________________________
classifier (Dense)           multiple                  1538      
=================================================================
Total params: 162,842,882
Trainable params: 162,842,882
Non-trainable params: 0
_________________________________________________________________

Step 3

prompt = 'این خوراک بسیار خوب است'

inputs = tokenizer.encode(prompt, return_tensors="tf", max_length=128, padding=True, truncation=True)
logits = model(inputs)[0]
outputs = tf.keras.backend.softmax(logits)
prediction = tf.argmax(outputs, axis=1)
prediction = prediction[0].numpy()
scores = outputs[0].numpy()

labels = config.id2label
print(scores)
print(labels[prediction])

Output

[0.9952093  0.00479068]
HAPPY
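
The same recipe works for your own fine-tuned model: point every from_pretrained call at your saved directory instead of the snappfood one. A minimal sketch, assuming a directory (the path below is illustrative) that contains config.json, tf_model.h5, and vocab.txt, with your three-label id2label mapping stored in the config:

from transformers import AutoConfig, AutoTokenizer, TFBertForSequenceClassification
import tensorflow as tf

MODEL_DIR = "./my-3-label-model/"  # illustrative path; use your own saved directory

config = AutoConfig.from_pretrained(MODEL_DIR)
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = TFBertForSequenceClassification.from_pretrained(MODEL_DIR)

def predict(sentence):
    # Tokenize, run the model, and map the argmax class id back to its label.
    inputs = tokenizer.encode(sentence, return_tensors="tf", max_length=128,
                              padding=True, truncation=True)
    logits = model(inputs)[0]
    probs = tf.keras.backend.softmax(logits)
    label_id = int(tf.argmax(probs, axis=1)[0].numpy())
    return config.id2label[label_id], probs[0].numpy()

print(predict("شعار ما هوش مصنوعی برای همه است"))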

@MinaRezaee

Thank you very much for your help

@MinaRezaee

Hello Mehrdad,
Can you help me again?
The accuracy of the model on 170,000 samples is 89%.
What I think is not very good is the validation loss: as the number of epochs increases, the validation loss increases.

I used 45,300 test samples for prediction with the model.
Unfortunately, the predictions were not good, and the model recognizes many negative sentences as positive.

اصلاح طلبی باید راهبرد راهگشایی برای برونرفت حاکمیت ازین بن بست سیاسی چهل ساله که ریشه همه یا کمینه بیشتر مشکلات کنونی کشوراست ارائه کند وانهم تناقض وتنافر بزرگ حاکمیتی یعنی جمهوریت وولایت مطلقه است تا موضع وسمت وسوی خود را شفاف وبوضوح بیان نکند ازاصلاح طلبی فقط همان نامش را یدک میکشد,positive,political

The sentence's label is positive, but the model predicts political.

my model have 3 labels :

label2id: {'negative': 0, 'political': 1, 'positive': 2}

id2label: {0: 'negative', 1: 'political', 2: 'positive'}

How do you think I can improve the model's accuracy and loss to get better predictions?

@MinaRezaee

Hello Mehrdad,
Can you help me again?
What parameters can I change to get better accuracy?
I changed these parameters to some extent:
MAX_LEN = 64
TRAIN_BATCH_SIZE = 16
VALID_BATCH_SIZE = 16
TEST_BATCH_SIZE = 16
EPOCHS = 10
EEVERY_EPOCH = 500
LEARNING_RATE = 2e-5
CLIP = 0.0

But my validation loss increased and the accuracy decreased.

How can I increase my accuracy to get better predictions?

@MinaRezaee

Epoch 1/10
3260/3260 [==============================] - 1011s 310ms/step - loss: 0.3203 - accuracy: 0.8780 - val_loss: 0.2650 - val_accuracy: 0.9039
Epoch 2/10
3260/3260 [==============================] - 1012s 310ms/step - loss: 0.1934 - accuracy: 0.9294 - val_loss: 0.2776 - val_accuracy: 0.9107
Epoch 3/10
3260/3260 [==============================] - 1013s 311ms/step - loss: 0.1207 - accuracy: 0.9580 - val_loss: 0.3280 - val_accuracy: 0.8958
