
Getting all negative predictions when fine-tuning my data #154

Closed · yanfan0531 opened this issue Nov 21, 2018 · 14 comments
@yanfan0531

Hi, I'm following the fine-tuning code on my own dataset, which is a sentence-pair classification task. All the parameters are the same as in the example code. However, I get all negative predictions when running the evaluation. Any idea what happened?

The code:

export BERT_BASE_DIR=/home/fy/uncased_L-12_H-768_A-12
export GLUE_DIR=/home/fy/glue_data
export TRAINED_CLASSIFIER=/tmp/ml_output/

python run_concept_classifier.py \
  --task_name=MRPC \
  --do_eval=true \
  --data_dir=$GLUE_DIR/ml_concept \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$TRAINED_CLASSIFIER \
  --max_seq_length=512 \
  --output_dir=/tmp/ml_output/

The evaluation result:
INFO:tensorflow:***** Eval results *****
INFO:tensorflow: eval_accuracy = 0.64309764
INFO:tensorflow: eval_fn = 106.0
INFO:tensorflow: eval_fp = 0.0
INFO:tensorflow: eval_loss = 1.2086438
INFO:tensorflow: eval_precision = 0.0
INFO:tensorflow: eval_recall = 0.0
INFO:tensorflow: eval_tn = 191.0
INFO:tensorflow: eval_tp = 0.0
INFO:tensorflow: global_step = 83
INFO:tensorflow: loss = 1.1816607

@monanahe

Do you shuffle your data?

@yanfan0531 (Author)

> Do you shuffle your data?

Thanks for the reminder. I forgot to shuffle the data. I'll update the results once the model has been trained on the shuffled dataset. Thanks again!

@yanfan0531 (Author)

> Do you shuffle your data?

The model performs perfectly after shuffling the dataset. Thanks! I'm closing this.

INFO:tensorflow: eval_accuracy = 0.97306395
INFO:tensorflow: eval_fn = 5.0
INFO:tensorflow: eval_fp = 3.0
INFO:tensorflow: eval_loss = 0.13963467
INFO:tensorflow: eval_precision = 0.97115386
INFO:tensorflow: eval_recall = 0.9528302
INFO:tensorflow: eval_tn = 188.0
INFO:tensorflow: eval_tp = 101.0
INFO:tensorflow: global_step = 83
INFO:tensorflow: loss = 0.13657574
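
For anyone hitting the same symptom: if train.tsv is sorted by label, every training batch contains a single class and the classifier can collapse to always predicting the majority class. A minimal sketch of shuffling the file before training (assumes a tab-separated train.tsv with no header row; set the header aside first if yours has one):

import random

# Read all training examples, shuffle them in memory, write them back out.
# Assumes the file fits in memory and has no header row.
with open("train.tsv", "r", encoding="utf-8") as f:
    lines = f.readlines()

random.seed(42)  # fixed seed so the shuffle is reproducible
random.shuffle(lines)

with open("train_shuffled.tsv", "w", encoding="utf-8") as f:
    f.writelines(lines)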

@tiru1930

Even after shuffling, my model predicts only positive, i.e. the second column always gets the higher probability. My dataset has question-answer pairs with labels: the label is 1 if the question and answer match, else 0.

Am I doing anything wrong here? I have tried different sequence lengths.

@anubhavpnp

@yanfan0531 Does shuffling the dataset mean that the examples of each class should not be in serial order (i.e. all class 0 examples followed by all class 1 examples, and so on)? Instead, the examples in train.tsv should be mixed: a few of class 0, then a few of class 1, then a few of class 0 again, and so on? My BERT classifier always predicts one particular class for every example in the test dataset, and my training examples are in serial order, not mixed.

@yanfan0531 (Author)

@anubhavpnp Hi, you are right about the shuffling. Try mixing the order of labels in the training data; it solved my problem.
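
Concretely, "mixing the order" just means randomly permuting the rows of the training file so the labels interleave instead of sitting in two contiguous blocks. A pandas sketch that preserves a header row (hypothetical file names; adapt the paths to your setup):

import pandas as pd

# Permute the rows of the training file; the header row stays in place.
df = pd.read_csv("train.tsv", sep="\t")
df = df.sample(frac=1, random_state=42).reset_index(drop=True)  # full random permutation
df.to_csv("train_shuffled.tsv", sep="\t", index=False)

After this, every batch should see a mix of both classes.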

@anubhavpnp

Thanks for your help @yanfan0531. Shuffling solved my issue and BERT now gives correct predictions. I read a bit more, and shuffling is a standard prerequisite when training neural networks.

@adrianog

How do we get probabilities in [0.0, 1.0]? Do we need to add an additional softmax as per #322?

@anubhavpnp

> How do we get probabilities in [0.0, 1.0]? Do we need to add an additional softmax as per #322?

I did not add any additional softmax layer; my BERT prediction classifier is working fine. @adrianog

@adrianog

Are the probabilities returned for each class in the range [0.0, 1.0]? Not even the example Colab returns normalised probabilities.

@anubhavpnp

Yes, it is between 0 and 1.

@adrianog

Thanks for confirming. Is your code the same as in the example Colab?

@anubhavpnp

I am using the code from https://github.com/winstarwang/rasa_nlu_bert

@adrianog

I don't see any difference from the official Google BERT model function, so I'm not sure why the probabilities are not in the [0, 1] range in the example Colab, or in my code.

Relevant bits:
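
For reference, a minimal sketch of the classifier head from create_model in run_classifier.py (reconstructed for illustration, not a verbatim copy of the excerpt); the stock code already applies tf.nn.softmax, so each returned row of probabilities lies in [0, 1] and sums to 1:

import tensorflow as tf

# Sketch of the classifier head (TF 1.x style), with placeholder shapes:
# batch of 2 examples, hidden size 768, 2 labels.
output_layer = tf.zeros([2, 768])  # stands in for the pooled [CLS] representation
output_weights = tf.get_variable(
    "output_weights", [2, 768],
    initializer=tf.truncated_normal_initializer(stddev=0.02))
output_bias = tf.get_variable(
    "output_bias", [2], initializer=tf.zeros_initializer())

logits = tf.matmul(output_layer, output_weights, transpose_b=True)
logits = tf.nn.bias_add(logits, output_bias)
probabilities = tf.nn.softmax(logits, axis=-1)  # rows sum to 1; values in [0, 1]

If the numbers you see are outside [0, 1], chances are the prediction code is surfacing the logits (or log-probs) rather than the probabilities.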
