PARAMETERS.md

Here's a short description of each argument used in the different commands:

  • --huggingface_read_token: Your Hugging Face authentication token for read access. It allows you to download datasets and models from Hugging Face.

  • --huggingface_push_token: Your Hugging Face authentication token for write access. It's used for uploading models to your Hugging Face account.

  • --dataset_name: The name of the dataset you wish to use for training. Example: 'mozilla-foundation/common_voice_16_1'. This should match the dataset's identifier on the Hugging Face Datasets Hub.

  • --language_abbr: The abbreviation of the language for the dataset you're using. Example: 'sw' for Swahili. This is used to specify the language variant of the dataset if it supports multiple languages.

  • --model_id: Identifier for the pre-trained model you wish to fine-tune. Example: 'openai/whisper-small'. This should match the model's identifier on the Hugging Face Model Hub.

  • --processing_task: Specifies the task for which the model is being trained. Example: 'transcribe'. This defines the objective of the model training, such as transcribing audio to text.

  • --wandb_api_key: Your Weights & Biases (W&B) API key. This is used for logging and tracking the training process if you're using W&B for experiment tracking.

  • --use_peft: Add this flag to fine-tune with PEFT (parameter-efficient fine-tuning); omit it to do full fine-tuning. PEFT only works on a notebook with GPU support.
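To make the list above concrete, here is a minimal sketch of how these arguments could be declared with Python's standard `argparse` module. It is not taken from the project's scripts: the `build_parser` function, the help strings, and the choice to make every argument optional with a default of `None` are assumptions for illustration, and the real commands may require some of these arguments or validate them differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the arguments described above."""
    parser = argparse.ArgumentParser(
        description="Illustrative argument parser (assumed, not the project's own)."
    )
    # Hugging Face tokens: read access for downloads, write access for uploads.
    parser.add_argument("--huggingface_read_token", default=None,
                        help="Hugging Face token with read access.")
    parser.add_argument("--huggingface_push_token", default=None,
                        help="Hugging Face token with write access.")
    # Dataset and model identifiers as they appear on the Hugging Face Hub.
    parser.add_argument("--dataset_name", default=None,
                        help="e.g. 'mozilla-foundation/common_voice_16_1'")
    parser.add_argument("--language_abbr", default=None,
                        help="e.g. 'sw' for Swahili")
    parser.add_argument("--model_id", default=None,
                        help="e.g. 'openai/whisper-small'")
    parser.add_argument("--processing_task", default=None,
                        help="e.g. 'transcribe'")
    # Optional experiment tracking with Weights & Biases.
    parser.add_argument("--wandb_api_key", default=None,
                        help="W&B API key, if W&B logging is used.")
    # Boolean switch: present means PEFT, absent means full fine-tuning.
    parser.add_argument("--use_peft", action="store_true",
                        help="Fine-tune with PEFT instead of full fine-tuning.")
    return parser


# Usage example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args([
    "--dataset_name", "mozilla-foundation/common_voice_16_1",
    "--language_abbr", "sw",
    "--model_id", "openai/whisper-small",
    "--processing_task", "transcribe",
    "--use_peft",
])
print(args.use_peft)  # True, because the flag was passed
```

Declaring `--use_peft` with `action="store_true"` matches the description above: the flag takes no value, and leaving it off yields `False`, which corresponds to full fine-tuning.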