GaMS-Team/LoRA-Translation-Example

LoRA-Translation-Example

This repository contains example scripts for LoRA-based SFT (supervised fine-tuning) of translation models. Scripts are provided for the following two frameworks:

  • Transformers + DeepSpeed: the script is provided in the transformers_training dir
  • Megatron-Bridge: the script is provided in the megatron_bridge_training dir

Data

The data used in the experiments is provided in the data dir. Both frameworks use the same dataset, but it is provided in different formats:

  • The megatron subdir contains data in the "standard messages" format, which is used by Megatron-Bridge.
  • The transformers subdir contains data in the "prompt-completion" format, which is used by the Transformers script.
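To illustrate how the two layouts relate (the field names below are illustrative assumptions, not taken from the repository's actual data files), a single-turn record in the "messages" format can be converted to the "prompt-completion" format like this:

```python
import json

def messages_to_prompt_completion(record):
    """Convert a chat record in "messages" format to "prompt-completion" format.

    Assumes a single-turn record: one user message followed by one
    assistant message. The field names here are illustrative, not
    taken from the repository's data files.
    """
    messages = record["messages"]
    user = next(m["content"] for m in messages if m["role"] == "user")
    assistant = next(m["content"] for m in messages if m["role"] == "assistant")
    return {"prompt": user, "completion": assistant}

# Example record in the "standard messages" layout
record = {
    "messages": [
        {"role": "user", "content": "Translate to Slovene: Good morning."},
        {"role": "assistant", "content": "Dobro jutro."},
    ]
}

print(json.dumps(messages_to_prompt_completion(record)))
```
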

Environment

We use the official NeMo container, version 26.02, for Megatron-Bridge. Warning: we made two modifications to it:

  • /opt/Megatron-Bridge/src/megatron/bridge/data/datasets/utils.py: we added loss masking based on the chat template and messages instead of the generation keyword.
  • /opt/Megatron-Bridge/src/megatron/bridge/models/gemma/gemma3_provider.py: we use the Gemma 3 version from the main branch, which contains some bug fixes.

Both modified files are bind-mounted into the container in the sbatch script.
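The idea behind the loss-masking modification can be sketched as follows. This is a simplified illustration of masking based on roles from the chat template, not the actual patched utils.py: labels for every token outside the assistant turns are set to -100, the value ignored by the cross-entropy loss.

```python
IGNORE_INDEX = -100  # label value ignored by PyTorch cross-entropy loss

def mask_non_assistant_tokens(token_ids, roles):
    """Build training labels that keep the loss only on assistant tokens.

    token_ids: token ids for the whole rendered conversation.
    roles: parallel list giving the role ("user"/"assistant") that
    produced each token. Simplified sketch, not the patched code.
    """
    return [
        tok if role == "assistant" else IGNORE_INDEX
        for tok, role in zip(token_ids, roles)
    ]

token_ids = [5, 6, 7, 8, 9]
roles = ["user", "user", "assistant", "assistant", "user"]
labels = mask_non_assistant_tokens(token_ids, roles)
# labels: [-100, -100, 7, 8, -100]
```
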

For Transformers + DeepSpeed we provide a Singularity recipe to build an image. The recipe is located at singularity/transformers_deepspeed_recipe.def.

Hardware

The scripts are prepared for the LEONARDO Booster partition.
