When trying to train the model with python ./jerex_train.py --config-path configs/docred_joint, I get this message:
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
What does it mean? How should I train the model instead?
OS: Ubuntu 18.04.4
/jerex$ python ./jerex_train.py --config-path configs/docred_joint
datasets:
  train_path: ./data/datasets/docred_joint/train_joint.json
  valid_path: ./data/datasets/docred_joint/dev_joint.json
  test_path: null
  types_path: ./data/datasets/docred_joint/types.json
model:
  model_type: joint_multi_instance
  encoder_path: bert-base-cased
  tokenizer_path: bert-base-cased
  mention_threshold: 0.85
  coref_threshold: 0.85
  rel_threshold: 0.6
  prop_drop: 0.1
  meta_embedding_size: 25
  size_embeddings_count: 30
  ed_embeddings_count: 300
  token_dist_embeddings_count: 700
  sentence_dist_embeddings_count: 50
  position_embeddings_count: 700
sampling:
  neg_mention_count: 200
  neg_coref_count: 200
  neg_relation_count: 200
  max_span_size: 10
  sampling_processes: 8
  neg_mention_overlap_ratio: 0.5
  lowercase: false
loss:
  mention_weight: 1.0
  coref_weight: 1.0
  entity_weight: 0.25
  relation_weight: 1.0
inference:
  valid_batch_size: 1
  test_batch_size: 1
  max_spans: null
  max_coref_pairs: null
  max_rel_pairs: null
training:
  batch_size: 1
  min_epochs: 20
  max_epochs: 20
  lr: 5.0e-05
  lr_warmup: 0.1
  weight_decay: 0.01
  max_grad_norm: 1.0
  accumulate_grad_batches: 1
  max_spans: null
  max_coref_pairs: null
  max_rel_pairs: null
distribution:
  gpus: []
  accelerator: ''
  prepare_data_per_node: false
misc:
  store_predictions: true
  store_examples: true
  flush_logs_every_n_steps: 1000
  log_every_n_steps: 1000
  deterministic: false
  seed: null
  cache_path: null
  precision: 32
  profiler: null
  final_valid_evaluate: true
Parse dataset '/home/marco/PyTorchMatters/EntitiesRelationsExtraction/jerex/data/datasets/docred_joint/train_joint.json': 100%|██████| 3008/3008 [00:41<00:00, 71.72it/s]
Parse dataset '/home/marco/PyTorchMatters/EntitiesRelationsExtraction/jerex/data/datasets/docred_joint/dev_joint.json': 100%|██████████| 300/300 [00:03<00:00, 75.22it/s]
Some weights of the model checkpoint at bert-base-cased were not used when initializing JointMultiInstanceModel: ['cls.predictions.decoder.weight', 'cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.bias', 'bert.pooler.dense.weight', 'cls.predictions.transform.LayerNorm.weight', 'bert.pooler.dense.bias', 'cls.predictions.transform.dense.bias']
- This IS expected if you are initializing JointMultiInstanceModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing JointMultiInstanceModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of JointMultiInstanceModel were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['entity_classification.entity_classifier.weight', 'relation_classification.pair_linear.weight', 'coreference_resolution.coref_classifier.weight', 'relation_classification.rel_classifier.weight', 'mention_localization.size_embeddings.weight', 'relation_classification.rel_linear.weight', 'mention_localization.linear.weight', 'relation_classification.sentence_distance_embeddings.weight', 'relation_classification.token_distance_embeddings.weight', 'coreference_resolution.coref_linear.bias', 'coreference_resolution.coref_linear.weight', 'mention_localization.mention_classifier.weight', 'entity_classification.linear.bias', 'mention_localization.linear.bias', 'relation_classification.entity_type_embeddings.weight', 'entity_classification.linear.weight', 'relation_classification.rel_linear.bias', 'relation_classification.rel_classifier.bias', 'coreference_resolution.coref_ed_embeddings.weight', 'coreference_resolution.coref_classifier.bias', 'entity_classification.entity_classifier.bias', 'mention_localization.mention_classifier.bias', 'relation_classification.pair_linear.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
  | Name  | Type                    | Params
--------------------------------------------------
0 | model | JointMultiInstanceModel | 113 M
--------------------------------------------------
113 M     Trainable params
0         Non-trainable params
113 M     Total params
455.954   Total estimated model params size (MB)
This is just a remark by the Hugging Face library, no need to worry. We use Hugging Face's BERT implementation internally. You are doing everything correctly here. When you execute the training code (as you do), you train JEREX (and fine-tune BERT) on a down-stream task, namely end-to-end relation extraction, and you can then use the model for prediction.
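For illustration, the same warning can be reproduced with plain transformers, outside of JEREX: loading a pretrained checkpoint into a model class that adds an untrained task head always triggers it. The model class and num_labels below are arbitrary choices for this sketch, not JEREX's actual setup.

```python
# Minimal sketch (not JEREX code): loading pretrained BERT weights into a model
# class with an extra, randomly initialized task head triggers the warning.
from transformers import BertForSequenceClassification

# The classification head does not exist in the bert-base-cased checkpoint, so
# it is newly initialized, and transformers prints: "You should probably TRAIN
# this model on a down-stream task to be able to use it for predictions and
# inference." Fine-tuning (which jerex_train.py does) is exactly that training.
model = BertForSequenceClassification.from_pretrained("bert-base-cased", num_labels=2)
```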