[QUESTION] Train my own Metric #115

Closed · sdlmw opened this issue Feb 23, 2023 · 4 comments
Labels: question (Further information is requested)

sdlmw commented Feb 23, 2023

Hi,

I downloaded the experiment file and tried to train the model myself, but I always get the error below. I could not find the cause. What is causing this problem?

Code

ranking_metric:
  class_path: comet.models.RankingMetric
  init_args:
    nr_frozen_epochs: 0.3
    keep_embeddings_frozen: True
    optimizer: AdamW
    encoder_learning_rate: 5.0e-06
    learning_rate: 1.5e-05
    layerwise_decay: 0.95
    encoder_model: XLM-RoBERTa
    pretrained_model: xlm-roberta-base
    pool: avg
    layer: mix
    dropout: 0.1
    batch_size: 4
    train_data: 
      - /MT-work/COMET/data/apequest/train.csv
    validation_data:
      - /MT-work/COMET/data/apequest/test.csv      
trainer: /MT-work/COMET/configs/trainer.yaml
early_stopping: /MT-work/COMET/configs/early_stopping.yaml
model_checkpoint: /MT-work/COMET/configs/model_checkpoint.yaml

comet-train: error: Parser key "ranking_metric": Problem with given class_path "comet.models.RankingMetric":
  - Parser key "train_data": Value "['/MT-work/COMET/data/apequest/train.csv']" does not validate against any of the types in typing.Union[str, NoneType]:
    - Expected a <class 'str'> but got "['/MT-work/COMET/data/apequest/train.csv']"
    - Expected a <class 'NoneType'> but got "['/MT-work/COMET/data/apequest/train.csv']"
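
For context on what the parser is complaining about: jsonargparse validates each init argument against its type annotation, and here train_data is annotated as typing.Union[str, NoneType], so a YAML list is rejected. A minimal sketch (hypothetical class, not COMET's actual source) that reproduces the same kind of failure:

from typing import Optional
from jsonargparse import ArgumentParser

# Hypothetical stand-in whose train_data is typed Optional[str], which is
# what "typing.Union[str, NoneType]" in the error above implies for
# unbabel-comet==1.1.3.
class RankingMetricLike:
    def __init__(self, train_data: Optional[str] = None):
        self.train_data = train_data

parser = ArgumentParser()
parser.add_class_arguments(RankingMetricLike, "ranking_metric")

# A single path validates fine against Optional[str]:
parser.parse_string("ranking_metric:\n  train_data: train.csv\n")

# A YAML list raises the same kind of error as above:
# Value "['train.csv']" does not validate against any of the types in
# typing.Union[str, NoneType]
parser.parse_string("ranking_metric:\n  train_data: [train.csv]\n")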

What have you tried?

What's your environment?

  • OS: Ubuntu 18.04
  • Packaging: pip and conda
  • Version: COMET 1.1.3, Python 3.8
ricardorei (Collaborator) commented

There is a mismatch between unbabel-comet==1.1.3 and the current master branch.

If you are using version 1.1.3 you can't pass a list of training files; the config is just:

ranking_metric:
  class_path: comet.models.RankingMetric
  init_args:
    nr_frozen_epochs: 0.3
    keep_embeddings_frozen: True
    optimizer: AdamW
    encoder_learning_rate: 5.0e-06
    learning_rate: 1.5e-05
    layerwise_decay: 0.95
    encoder_model: XLM-RoBERTa
    pretrained_model: xlm-roberta-base
    pool: avg
    layer: mix
    dropout: 0.1
    batch_size: 4
    train_data: /MT-work/COMET/data/apequest/train.csv
    validation_data:
      - /MT-work/COMET/data/apequest/test.csv      
trainer: /MT-work/COMET/configs/trainer.yaml
early_stopping: /MT-work/COMET/configs/early_stopping.yaml
model_checkpoint: /MT-work/COMET/configs/model_checkpoint.yaml
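
If you want to keep passing a list of files, the alternative is to install the current master from source rather than the 1.1.3 release; assuming the repository is pip-installable in the usual way (check the README for the canonical instructions), something like:

pip install git+https://github.com/Unbabel/COMET.git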

sdlmw (Author) commented Feb 27, 2023

Hi @ricardorei

Thanks for the explanation.

I just pulled the latest version.

git clone https://github.com/Unbabel/COMET

The error has not changed.
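
One thing worth ruling out: cloning the repository does not by itself change which package Python imports, so a previously installed unbabel-comet==1.1.3 in site-packages would still win. A minimal check (standard library only; assumes a pip-managed environment):

import comet
from importlib.metadata import version  # Python 3.8+

print(comet.__file__)            # filesystem path of the package actually imported
print(version("unbabel-comet"))  # version of the installed distribution

If the printed path points into site-packages rather than the clone, install the clone, for example with pip install -e . from inside it (if the project supports editable installs).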

ricardorei (Collaborator) commented

Hi @sdlmw, I just tested the code on master and everything is working fine.

Here are my configs:

ranking_metric:
  class_path: comet.models.RankingMetric
  init_args:
    nr_frozen_epochs: 0.3
    keep_embeddings_frozen: True
    optimizer: AdamW
    encoder_learning_rate: 1.0e-06
    learning_rate: 1.5e-05
    layerwise_decay: 0.95
    encoder_model: XLM-RoBERTa
    pretrained_model: xlm-roberta-base
    pool: avg
    layer: mix
    layer_transformation: sparsemax
    layer_norm: False
    dropout: 0.1
    batch_size: 4
    train_data: 
      - tests/data/ranking_data.csv
    validation_data:
      - tests/data/ranking_data.csv
      
trainer: ../trainer.yaml
early_stopping: ../early_stopping.yaml
model_checkpoint: ../model_checkpoint.yaml

and for the trainer.yaml:

class_path: pytorch_lightning.trainer.trainer.Trainer
init_args:
  accelerator: gpu
  devices: 1
  accumulate_grad_batches: 4
  amp_backend: native
  amp_level: null
  auto_lr_find: False
  auto_scale_batch_size: False
  auto_select_gpus: False
  benchmark: null
  check_val_every_n_epoch: 1
  default_root_dir: null
  deterministic: False
  fast_dev_run: False
  gradient_clip_val: 1.0
  gradient_clip_algorithm: norm
  limit_train_batches: 1.0
  limit_val_batches: 1.0
  limit_test_batches: 1.0
  limit_predict_batches: 1.0
  log_every_n_steps: 50
  profiler: null
  overfit_batches: 0
  plugins: null
  precision: 16
  max_epochs: 4
  min_epochs: 1
  max_steps: -1
  min_steps: null
  max_time: null
  num_nodes: 1
  num_sanity_val_steps: 10
  reload_dataloaders_every_n_epochs: 0
  replace_sampler_ddp: True
  sync_batchnorm: False
  detect_anomaly: False
  tpu_cores: null
  track_grad_norm: -1
  val_check_interval: 1.0
  enable_model_summary: True
  move_metrics_to_cpu: True
  multiple_trainloader_mode: max_size_cycle
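
With both files in place, training would then be launched through the comet-train entry point that appears in the error message above; assuming the --cfg flag from the COMET README, and with ranking_metric.yaml standing in for whatever you named the model config:

comet-train --cfg ranking_metric.yaml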

ricardorei (Collaborator) commented

Note that the data I am using is in the tests folder. Make sure the data you are using for the ranking model has the same shape.
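
A quick way to compare the two is to print the column lists side by side (a minimal pandas sketch; the reference file is the one used in the config above, and the second path is taken from earlier in this thread):

import pandas as pd

# Columns of the reference ranking file shipped in the repo's tests folder.
print(pd.read_csv("tests/data/ranking_data.csv").columns.tolist())

# Columns of the user's own training file; names and order should match.
print(pd.read_csv("/MT-work/COMET/data/apequest/train.csv").columns.tolist())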
