
Fine-Tune on our dataset #6

Closed
miladfa7 opened this issue Sep 12, 2020 · 16 comments

Comments

@miladfa7

How can I fine-tune the ParsBERT model on our dataset?
Please help me.
Thanks

@miladfa7 miladfa7 changed the title Fine-Tune our dataset Fine-Tune on our dataset Sep 12, 2020
@m3hrdadfi
Member

You can use this Colab to fine-tune on your own dataset for text classification tasks. For other downstream tasks, I'm afraid you'll need to be patient; I'll add the others soon!
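
In case the Colab is unavailable, here is a minimal fine-tuning sketch along the same lines, following the Transformers TF2 examples of that era. The toy data, hyperparameters, and output directory are illustrative, not the notebook's exact values:

from transformers import BertTokenizer, TFBertForSequenceClassification
import tensorflow as tf

MODEL_NAME = "HooshvareLab/bert-fa-base-uncased"

# Hypothetical toy data; replace with your own texts and integer class ids.
texts = ["متن نمونه یک", "متن نمونه دو"]
labels = [0, 1]

tokenizer = BertTokenizer.from_pretrained(MODEL_NAME)
encodings = tokenizer(texts, truncation=True, padding=True, max_length=128, return_tensors="tf")
dataset = tf.data.Dataset.from_tensor_slices((dict(encodings), labels)).batch(16)

model = TFBertForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
model.fit(dataset, epochs=3)

# Save everything Transformers needs to reload the fine-tuned model later.
model.save_pretrained("./my-finetuned-parsbert")
tokenizer.save_pretrained("./my-finetuned-parsbert")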

@tannazhp74

How can I get embeddings from the pretrained ParsBERT model?

@MinaRezaee

I used this Persian BERT classification model.
The model was saved with a config,
but when I try to load that model separately and predict a sentence's label with it,
I get this error:

raise ValueError('No model found in config file.')
ValueError: No model found in config file.

How can I add a config file so that I do not get this error?

@m3hrdadfi
Member

Did you fine-tune ParsBERT on your dataset? Which method did you use (PyTorch, TensorFlow, or a script)? If you didn't use the script technique, did you save your model, and what type of files do you have in your saved model directory?

@MinaRezaee

https://github.com/hooshvare/parsbert/blob/master/notebooks/Taaghche_Sentiment_Analysis.ipynb

I built my model from this link.
Yes, I fine-tuned on my data, and I have 3 labels.
I have two saved files, named tf_model.h5 and config.json (no pytorch_model.bin).
I load my model this way:

from keras.models import load_model
model = load_model('tf_model.h5')

@m3hrdadfi
Member

m3hrdadfi commented Oct 3, 2020

OK, then. Since your model was fine-tuned with Transformers, you can't load it as a plain Keras model; you must load your fine-tuned model using Transformers.
If you have tf_model.h5 in your saved directory, use this:

from transformers import TFAutoModelForSequenceClassification

tf_model = TFAutoModelForSequenceClassification.from_pretrained(YOURSAVED_DIRECTORY)

otherwise, if you have pytorch_model.bin

from transformers import TFAutoModelForSequenceClassification

tf_model = TFAutoModelForSequenceClassification.from_pretrained(YOURSAVED_DIRECTORY, from_pt=True)

Also, make sure you have config.json and vocab.txt in your directory!
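
If those files are missing, it's usually because only the Keras weights were written. A minimal sketch of saving with the Transformers API so that all three files end up in one directory (the base checkpoint and directory name below are just placeholders; in practice you'd save your own fine-tuned model and its tokenizer):

from transformers import BertTokenizer, TFBertForSequenceClassification

SAVE_DIR = "./bert-fa-cls-base-uncased"  # example directory name

# Loaded here only so the snippet runs end to end; use your fine-tuned objects instead.
tokenizer = BertTokenizer.from_pretrained("HooshvareLab/bert-fa-base-uncased")
model = TFBertForSequenceClassification.from_pretrained("HooshvareLab/bert-fa-base-uncased")

model.save_pretrained(SAVE_DIR)      # writes tf_model.h5 and config.json
tokenizer.save_pretrained(SAVE_DIR)  # writes vocab.txt and the tokenizer settings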

@MinaRezaee

With my model, only these two files are saved, named tf_model.h5 and config.json.
I do not have the vocab.txt file.

from transformers import TFAutoModelForSequenceClassification
tf_model = TFAutoModelForSequenceClassification.from_pretrained('tf_model.h5')

This is how I got this error:

(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte

@MinaRezaee

In the notebook I used (linked above), no file with that name (vocab.txt) gets saved.

@m3hrdadfi
Member

m3hrdadfi commented Oct 3, 2020

First of all, you can download vocab.txt from here.

https://cdn.huggingface.co/HooshvareLab/bert-base-parsbert-uncased/vocab.txt

Secondly, you must load the model from the saved directory, not just the h5 file! Suppose I have a directory named bert-fa-cls-base-uncased and it includes:

+ bert-fa-cls-base-uncased
    - config.json
    - vocab.txt
    - tf_model.h5

You need to pass the directory, not the model file by itself; that is, load your model using this piece of code:

from transformers import TFAutoModelForSequenceClassification

tf_model = TFAutoModelForSequenceClassification.from_pretrained("./bert-fa-cls-base-uncased/")

@MinaRezaee

Please help me.
I take an input sentence and I want to predict its label with the saved model,
but I think the preprocessing and padding of my sentence were done wrong,
because I cannot predict the label.

from transformers import BertConfig, BertTokenizer
MODEL_NAME_OR_PATH = 'HooshvareLab/bert-fa-base-uncased'
tokenizer = BertTokenizer.from_pretrained(MODEL_NAME_OR_PATH)
sample_comment= "شعار ما هوش مصنوعی برای همه است"
max_length=32
tokens = tokenizer.tokenize(sample_comment, padding=True, max_length=42)
token_ids = tokenizer.convert_tokens_to_ids(tokens)
from transformers import TFAutoModelForSequenceClassification
tf_model = TFAutoModelForSequenceClassification.from_pretrained("./pytorch_model.bin/")
predictions = tf_model.predict(token_ids)
print(predictions)

@m3hrdadfi
Member

The whole process is simpler than you might think! But before diving in, we need to set some ground rules:

  1. The fine-tuned model is saved in a directory, in this case bert-fa-base-uncased-sentiment-snappfood.
  2. The directory contains these files: config.json, tf_model.h5, vocab.txt.

I'm going to demonstrate all the steps using one of our models, bert-fa-base-uncased-sentiment-snappfood. The procedure is as follows:

  1. Load the packages
  2. Load the config, tokenizer, and the model
  3. Run inference

There is also a preliminary Step 0 for downloading the mentioned model; in your case, you don't need this part.

Step 0

!pip install -qU transformers

!mkdir -p /content/bert-fa-base-uncased-sentiment-snappfood
!wget https://s3.amazonaws.com/models.huggingface.co/bert/HooshvareLab/bert-fa-base-uncased-sentiment-snappfood/config.json -qO /content/bert-fa-base-uncased-sentiment-snappfood/config.json
!wget https://cdn.huggingface.co/HooshvareLab/bert-fa-base-uncased-sentiment-snappfood/tf_model.h5 -qO /content/bert-fa-base-uncased-sentiment-snappfood/tf_model.h5
!wget https://cdn.huggingface.co/HooshvareLab/bert-fa-base-uncased-sentiment-snappfood/vocab.txt -qO /content/bert-fa-base-uncased-sentiment-snappfood/vocab.txt

!ls /content/bert-fa-base-uncased-sentiment-snappfood

Output

config.json  tf_model.h5  vocab.txt

Step 1

from transformers import TFBertForSequenceClassification
from transformers import AutoConfig
from transformers import AutoTokenizer

import tensorflow as tf
import numpy as np

Step 2

config = AutoConfig.from_pretrained('/content/bert-fa-base-uncased-sentiment-snappfood/')
tokenizer = AutoTokenizer.from_pretrained('/content/bert-fa-base-uncased-sentiment-snappfood/')

model = TFBertForSequenceClassification.from_pretrained('/content/bert-fa-base-uncased-sentiment-snappfood/')
model.summary()

Output

All model checkpoint weights were used when initializing TFBertForSequenceClassification.

All the weights of TFBertForSequenceClassification were initialized from the model checkpoint at /content/bert-fa-base-uncased-sentiment-snappfood/.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertForSequenceClassification for predictions without further training.
Model: "tf_bert_for_sequence_classification"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
bert (TFBertMainLayer)       multiple                  162841344 
_________________________________________________________________
dropout_37 (Dropout)         multiple                  0         
_________________________________________________________________
classifier (Dense)           multiple                  1538      
=================================================================
Total params: 162,842,882
Trainable params: 162,842,882
Non-trainable params: 0
_________________________________________________________________

Step 3

prompt = 'این خوراک بسیار خوب است'

inputs = tokenizer.encode(prompt, return_tensors="tf", max_length=128, padding=True, truncation=True)
logits = model(inputs)[0]
outputs = tf.keras.backend.softmax(logits)
prediction = tf.argmax(outputs, axis=1)
prediction = prediction[0].numpy()
scores = outputs[0].numpy()

labels = config.id2label
print(scores)
print(labels[prediction])

Output

[0.9952093  0.00479068]
HAPPY
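
The same recipe works for your own fine-tuned model: point every from_pretrained call at your saved directory instead of the snappfood one. A minimal sketch, assuming a directory (the path below is illustrative) that contains config.json, tf_model.h5, and vocab.txt, with your three-label id2label mapping stored in the config:

from transformers import AutoConfig, AutoTokenizer, TFBertForSequenceClassification
import tensorflow as tf

MODEL_DIR = "./my-3-label-model/"  # illustrative path; use your own saved directory

config = AutoConfig.from_pretrained(MODEL_DIR)
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = TFBertForSequenceClassification.from_pretrained(MODEL_DIR)

def predict(sentence):
    # Tokenize, run the model, and map the argmax class id back to its label.
    inputs = tokenizer.encode(sentence, return_tensors="tf", max_length=128,
                              padding=True, truncation=True)
    logits = model(inputs)[0]
    probs = tf.keras.backend.softmax(logits)
    label_id = int(tf.argmax(probs, axis=1)[0].numpy())
    return config.id2label[label_id], probs[0].numpy()

print(predict("شعار ما هوش مصنوعی برای همه است"))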

@MinaRezaee

Thank you very much for your help

@MinaRezaee

Hello Mehrdad,
Can you help me again?
The accuracy of the model on 170,000 samples is 89%.
What I think is not very good is the validation loss: as the number of epochs increases, the validation loss increases.

I used 45,300 test samples for prediction with the model.
Unfortunately, the predictions were not good, and the model recognizes many negative sentences as positive.

اصلاح طلبی باید راهبرد راهگشایی برای برونرفت حاکمیت ازین بن بست سیاسی چهل ساله که ریشه همه یا کمینه بیشتر مشکلات کنونی کشوراست ارائه کند وانهم تناقض وتنافر بزرگ حاکمیتی یعنی جمهوریت وولایت مطلقه است تا موضع وسمت وسوی خود را شفاف وبوضوح بیان نکند ازاصلاح طلبی فقط همان نامش را یدک میکشد,positive,political

The sentence's label is positive, but the model predicts political.

my model have 3 labels :

label2id: {'negative': 0, 'political': 1, 'positive': 2}

id2label: {0: 'negative', 1: 'political', 2: 'positive'}

How do you think I can improve the model's accuracy and loss to get better predictions?

@MinaRezaee

Hello Mehrdad,
Can you help me again?
What parameters can I change to get better accuracy?
I changed these parameters to some extent:
MAX_LEN = 64
TRAIN_BATCH_SIZE = 16
VALID_BATCH_SIZE = 16
TEST_BATCH_SIZE = 16
EPOCHS = 10
EEVERY_EPOCH = 500
LEARNING_RATE = 2e-5
CLIP = 0.0

But my validation loss increased and the accuracy decreased.

How can I increase my accuracy to get better predictions?

@MinaRezaee

Epoch 1/10
3260/3260 [==============================] - 1011s 310ms/step - loss: 0.3203 - accuracy: 0.8780 - val_loss: 0.2650 - val_accuracy: 0.9039
Epoch 2/10
3260/3260 [==============================] - 1012s 310ms/step - loss: 0.1934 - accuracy: 0.9294 - val_loss: 0.2776 - val_accuracy: 0.9107
Epoch 3/10
3260/3260 [==============================] - 1013s 311ms/step - loss: 0.1207 - accuracy: 0.9580 - val_loss: 0.3280 - val_accuracy: 0.8958
