
Getting all negative predictions when fine-tuning my data #154

Closed · yanfan0531 opened this issue Nov 21, 2018 · 14 comments
@yanfan0531

Hi, I'm following the fine-tuning code on my own dataset, which is a sentence-pair classification task. All the parameters are the same as in the example code. However, I get all negative predictions when running the evaluation. Any idea what happened?

The code:

export BERT_BASE_DIR=/home/fy/uncased_L-12_H-768_A-12
export GLUE_DIR=/home/fy/glue_data
export TRAINED_CLASSIFIER=/tmp/ml_output/

python run_concept_classifier.py \
  --task_name=MRPC \
  --do_eval=true \
  --data_dir=$GLUE_DIR/ml_concept \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$TRAINED_CLASSIFIER \
  --max_seq_length=512 \
  --output_dir=/tmp/ml_output/

The evaluation result:
INFO:tensorflow:***** Eval results *****
INFO:tensorflow: eval_accuracy = 0.64309764
INFO:tensorflow: eval_fn = 106.0
INFO:tensorflow: eval_fp = 0.0
INFO:tensorflow: eval_loss = 1.2086438
INFO:tensorflow: eval_precision = 0.0
INFO:tensorflow: eval_recall = 0.0
INFO:tensorflow: eval_tn = 191.0
INFO:tensorflow: eval_tp = 0.0
INFO:tensorflow: global_step = 83
INFO:tensorflow: loss = 1.1816607

@monanahe

Do you shuffle your data?

@yanfan0531 (Author)

> Do you shuffle your data?

Thanks for the reminder. I forgot to shuffle the data. I'll update the results once the model has been trained on the shuffled dataset. Thanks again!

@yanfan0531 (Author)

> Do you shuffle your data?

The model performs perfectly after shuffling the dataset. Thanks! I'm closing this.

INFO:tensorflow: eval_accuracy = 0.97306395
INFO:tensorflow: eval_fn = 5.0
INFO:tensorflow: eval_fp = 3.0
INFO:tensorflow: eval_loss = 0.13963467
INFO:tensorflow: eval_precision = 0.97115386
INFO:tensorflow: eval_recall = 0.9528302
INFO:tensorflow: eval_tn = 188.0
INFO:tensorflow: eval_tp = 101.0
INFO:tensorflow: global_step = 83
INFO:tensorflow: loss = 0.13657574
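
For anyone hitting the same symptom: if train.tsv is sorted by label, every training batch contains a single class and the classifier can collapse to always predicting the majority class. A minimal sketch of shuffling the file before training (assumes a tab-separated train.tsv with no header row; set the header aside first if yours has one):

import random

# Read all training examples, shuffle them in memory, write them back out.
# Assumes the file fits in memory and has no header row.
with open("train.tsv", "r", encoding="utf-8") as f:
    lines = f.readlines()

random.seed(42)  # fixed seed so the shuffle is reproducible
random.shuffle(lines)

with open("train_shuffled.tsv", "w", encoding="utf-8") as f:
    f.writelines(lines)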

@tiru1930

Even after shuffling, my model predicts only positive, i.e. the second column always gets the higher probability. My dataset has question-answer pairs with labels: the label is 1 if the question and answer match, else 0.

Am I doing anything wrong here? I have tried different sequence lengths.

@anubhavpnp

@yanfan0531 Does shuffling the dataset mean that the examples of each class should not be in serial order (i.e. all class 0 examples followed by all class 1 examples, and so on)? Instead, the examples in train.tsv should be mixed: a few of class 0, then a few of class 1, then a few of class 0 again, and so on? My BERT classifier always predicts one particular class for every example in the test dataset, and my training examples are in serial order, not mixed.

@yanfan0531 (Author)

@anubhavpnp Hi, you are right about the shuffling. Try mixing the order of labels in the training data; it solved my problem.
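
Concretely, "mixing the order" just means randomly permuting the rows of the training file so the labels interleave instead of sitting in two contiguous blocks. A pandas sketch that preserves a header row (hypothetical file names; adapt the paths to your setup):

import pandas as pd

# Permute the rows of the training file; the header row stays in place.
df = pd.read_csv("train.tsv", sep="\t")
df = df.sample(frac=1, random_state=42).reset_index(drop=True)  # full random permutation
df.to_csv("train_shuffled.tsv", sep="\t", index=False)

After this, every batch should see a mix of both classes.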

@anubhavpnp

Thanks for your help @yanfan0531. Shuffling solved my issue and BERT now gives correct predictions. I read a bit more, and shuffling is a standard prerequisite when training neural networks.

@adrianog

How do we get probabilities in [0.0, 1.0]? Do we need to add an additional softmax as per #322?

@anubhavpnp

> How do we get probabilities in [0.0, 1.0]? Do we need to add an additional softmax as per #322?

I did not add any additional softmax layer; my BERT prediction classifier is working fine. @adrianog

@adrianog

Are the probabilities returned for each class in the range [0.0, 1.0]? Not even the example Colab returns normalised probabilities.

@anubhavpnp

Yes, it is between 0 and 1.

@adrianog

Thanks for confirming. Is your code the same as in the example Colab?

@anubhavpnp

I am using the code from https://github.com/winstarwang/rasa_nlu_bert

@adrianog

I don't see any difference from the official Google BERT model function, so I'm not sure why the probabilities are not in the [0, 1] range in the example Colab, or in my code.

Relevant bits:
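
For reference, a minimal sketch of the classifier head from create_model in run_classifier.py (reconstructed for illustration, not a verbatim copy of the excerpt); the stock code already applies tf.nn.softmax, so each returned row of probabilities lies in [0, 1] and sums to 1:

import tensorflow as tf

# Sketch of the classifier head (TF 1.x style), with placeholder shapes:
# batch of 2 examples, hidden size 768, 2 labels.
output_layer = tf.zeros([2, 768])  # stands in for the pooled [CLS] representation
output_weights = tf.get_variable(
    "output_weights", [2, 768],
    initializer=tf.truncated_normal_initializer(stddev=0.02))
output_bias = tf.get_variable(
    "output_bias", [2], initializer=tf.zeros_initializer())

logits = tf.matmul(output_layer, output_weights, transpose_b=True)
logits = tf.nn.bias_add(logits, output_bias)
probabilities = tf.nn.softmax(logits, axis=-1)  # rows sum to 1; values in [0, 1]

If the numbers you see are outside [0, 1], chances are the prediction code is surfacing the logits (or log-probs) rather than the probabilities.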
