Training data and scripts used for wmt22-cometkiwi-da #217

Open

rohitk-cognizant opened this issue May 2, 2024 · 3 comments
Labels
question Further information is requested

Comments

@rohitk-cognizant

Hi Team,

Can you share the training data and training scripts used for wmt22-cometkiwi-da? We want to use them as a reference for training with our own sample reference data.

@rohitk-cognizant added the question label on May 2, 2024
@ricardorei
Collaborator

Hi @rohitk-cognizant,

To train wmt22-cometkiwi-da you just have to run:

comet-train --cfg configs/models/{your_model_config}.yaml

Your config should look something like this:

unified_metric:
  class_path: comet.models.UnifiedMetric
  init_args:
    nr_frozen_epochs: 0.3
    keep_embeddings_frozen: True
    optimizer: AdamW
    encoder_learning_rate: 1.0e-06
    learning_rate: 1.5e-05
    layerwise_decay: 0.95
    encoder_model: XLM-RoBERTa
    pretrained_model: microsoft/infoxlm-large
    sent_layer: mix
    layer_transformation: sparsemax
    word_layer: 24
    loss: mse
    dropout: 0.1
    batch_size: 16
    train_data: 
      - TRAIN_DATA.csv
    validation_data: 
      - VALIDATION_DATA.csv
    hidden_sizes:
      - 3072
      - 1024
    activations: Tanh
    input_segments:
      - mt
      - src
    word_level_training: False
    
trainer: ../trainer.yaml
early_stopping: ../early_stopping.yaml
model_checkpoint: ../model_checkpoint.yaml
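
The TRAIN_DATA.csv and VALIDATION_DATA.csv entries are placeholders for your own files; save the block above as, say, configs/models/my_cometkiwi.yaml (hypothetical name) and point comet-train --cfg at it. As a minimal sketch of the expected data layout, assuming the usual COMET convention of one column per input segment (here src and mt) plus a score column, with the scores below being purely illustrative:

src,mt,score
"Der Hund bellt laut.","The dog barks loudly.",0.92
"Das ist nur ein Beispiel.","That is just one example.",0.45

Whatever scale you use for score, keep it consistent between the training and validation files.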

@rohitk-cognizant
Author

rohitk-cognizant commented May 5, 2024

Hi @ricardorei,

Thanks for the update. Can I use the same training parameters from the trainer.yaml file on the master branch?

@ricardorei
Collaborator

ricardorei commented May 5, 2024

Hmm, maybe you should change them a bit. For example, to train on a single GPU (which is usually faster) and with 16-bit precision, use this:

  accelerator: gpu
  devices: 1
  # strategy: ddp # uncomment this line only for distributed training
  precision: 16

You might also want to consider reducing accumulate_grad_batches from 8 to 2:

  accumulate_grad_batches: 2
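
Putting the two suggestions together, the relevant part of trainer.yaml would look roughly like this (a sketch assuming the same key layout as the snippets above; any keys not shown keep their existing values):

  accelerator: gpu
  devices: 1
  # strategy: ddp # uncomment only for multi-GPU training
  precision: 16
  accumulate_grad_batches: 2

With batch_size: 16 in the model config, accumulate_grad_batches: 2 gives an effective batch size of 32 per optimizer step, versus 128 with the default of 8.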
