Reproduce ACE05 nested result #10

Closed
iofu728 opened this issue May 26, 2020 · 4 comments

Comments

@iofu728

iofu728 commented May 26, 2020

I tried to reproduce the high performance of your model on the ACE05 nested set.

The dataset is the one you provided.

We use the same version of the pip package.
The config is the same as in your log, including the same seed, except for batch_size and n_gpu, since we only use one P100.
But I don't think that matters.

{
  "bert_frozen": "false",
  "hidden_size": 768,
  "hidden_dropout_prob": 0.2,
  "classifier_sign": "multi_nonlinear",
  "clip_grad": 1,
  "bert_config": {
    "attention_probs_dropout_prob": 0.1,
    "hidden_act": "gelu",
    "hidden_dropout_prob": 0.1,
    "hidden_size": 768,
    "initializer_range": 0.02,
    "intermediate_size": 3072,
    "max_position_embeddings": 512,
    "num_attention_heads": 12,
    "num_hidden_layers": 12,
    "type_vocab_size": 2,
    "vocab_size": 30522
  },
  "config_path": "./config/en_bert_base_uncased.json",
  "data_dir": "./data_preprocess/en_ace05",
  "bert_model": "./uncased_L-12_H-768_A-12",
  "task_name": null,
  "max_seq_length": 160,
  "train_batch_size": 15,
  "dev_batch_size": 16,
  "test_batch_size": 16,
  "checkpoint": 1200,
  "learning_rate": 4e-05,
  "num_train_epochs": 20,
  "warmup_proportion": -1.0,
  "local_rank": -1,
  "gradient_accumulation_steps": 1,
  "seed": 2333,
  "export_model": false,
  "output_dir": "result",
  "data_sign": "ace2005",
  "weight_start": 1.0,
  "weight_end": 1.0,
  "weight_span": 1.0,
  "entity_sign": "nested",
  "n_gpu": 1,
  "dropout": 0.2,
  "entity_threshold": 0.5,
  "data_cache": false
}


After running 20 epochs, I only get a 79.53% F1 score on the test set.
That is a gap of nearly 7 points from your reported result of 86.88%.
I don't think it is caused by the multi-GPU difference.
Could you rerun your code on another machine or with another seed?
Thx!
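
(For reference: if the gap did come from the effective batch size, gradient accumulation would be one way to probe it on a single P100. Below is a minimal sketch; the parameter names mirror the config above, but the loop is illustrative only and is not the repo's actual training code.)

# Hedged sketch: emulate a larger effective batch on one GPU via gradient
# accumulation. Names mirror the config above; the loop below is illustrative.
train_batch_size = 15            # per-step batch on the single P100
gradient_accumulation_steps = 2  # accumulate 2 steps -> effective batch of 30
n_gpu = 1

effective_batch = train_batch_size * gradient_accumulation_steps * n_gpu
print(f"effective batch size: {effective_batch}")

# Illustrative training loop (not the repo's code):
# for step, batch in enumerate(train_dataloader):
#     loss = model(batch)
#     (loss / gradient_accumulation_steps).backward()
#     if (step + 1) % gradient_accumulation_steps == 0:
#         optimizer.step()
#         optimizer.zero_grad()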

@littlesulley
Contributor

Hi, thank you for the comment.
Batch size actually matters a lot in this setting. Could you please run the code on two machines to enable an apples-to-apples comparison?

@iofu728
Author

iofu728 commented May 26, 2020

Hi, thank you for the comment.
Batch size actually matters a lot in this setting. Could you please run the code on two machines to enable an apples-to-apples comparison?

Ok. I'll try it again.

@littlesulley
Contributor

Thank you very much!

@iofu728
Author

iofu728 commented May 27, 2020

After increasing the batch_size to 32, the performance did not change.
Could you provide the predicted results for ACE04/05?
I only need pred_span_triple_lst and gold_span_triple_lst from

span_precision, span_recall, span_f1 = nest_span_f1.nested_calculate_f1(pred_span_triple_lst, gold_span_triple_lst, dims=2)

They would only be used for analysis.
Thx!
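
(For context, a minimal sketch of how an exact-match span F1 over such triple lists could be computed for this analysis. The (start, end, label) triple layout is an assumption and may differ from the repo's nested_calculate_f1.)

# Hedged sketch: exact-match precision/recall/F1 over predicted vs. gold span triples.
# The (start, end, label) layout is an assumption; nested_calculate_f1 may differ.
def span_f1(pred_span_triple_lst, gold_span_triple_lst):
    pred = set(map(tuple, pred_span_triple_lst))
    gold = set(map(tuple, gold_span_triple_lst))
    tp = len(pred & gold)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Example: one of two gold spans predicted correctly -> P=1.0, R=0.5, F1~0.67.
print(span_f1([(0, 2, "PER")], [(0, 2, "PER"), (5, 7, "ORG")]))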
