Finetune Parakeet for ASR with Transformers 🤗

This repository fine-tunes Parakeet CTC speech models on conversational speech datasets using the Hugging Face transformers and datasets libraries.

Installation

Step 1: Clone the repository

git clone https://github.com/Deep-unlearning/Finetune-Parakeet.git
cd Finetune-Parakeet

Step 2: Set up environment

Choose your preferred package manager:

📦 Using uv (recommended)

Install uv, then create and activate a virtual environment and install the dependencies:

uv venv .venv --python 3.10 && source .venv/bin/activate
uv pip install -r requirements.txt

🐍 Using pip

python3.10 -m venv .venv && source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

Dataset Preparation

If you swap in a different dataset, make sure that after loading it still provides:

  • an audio column (cast to Audio(sampling_rate=16000)), and
  • a text column (the reference transcription).

If your dataset uses different column names, rename them to audio and text before returning the dataset.
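
The column requirements above can be sketched as a small helper. This is a minimal sketch, not the code used in train.py: the function names `column_renames` and `to_audio_text` are hypothetical, and the helper assumes a single-split `datasets.Dataset` (not a `DatasetDict`). It renames the source columns to audio and text, then casts the audio column to `Audio(sampling_rate=16000)`.

```python
def column_renames(columns, audio_column, text_column):
    """Return the {old_name: new_name} renames needed to reach audio/text.

    Hypothetical helper for illustration; not part of this repository.
    """
    wanted = {audio_column: "audio", text_column: "text"}
    for source, target in wanted.items():
        if source not in columns:
            raise ValueError(f"column {source!r} not found in {list(columns)}")
        if source != target and target in columns:
            raise ValueError(f"cannot rename {source!r}: {target!r} already exists")
    # Columns that already carry the expected name need no rename.
    return {src: dst for src, dst in wanted.items() if src != dst}


def to_audio_text(dataset, audio_column, text_column, sampling_rate=16000):
    """Rename columns to audio/text and cast audio to the target sampling rate."""
    # Imported here so column_renames stays usable without `datasets` installed.
    from datasets import Audio

    renames = column_renames(dataset.column_names, audio_column, text_column)
    if renames:
        dataset = dataset.rename_columns(renames)
    # Audio is decoded (and resampled if needed) to `sampling_rate` on access.
    return dataset.cast_column("audio", Audio(sampling_rate=sampling_rate))
```

For a dataset whose columns are named, say, file and sentence, calling `to_audio_text(ds, "file", "sentence")` would return a dataset with the audio and text columns described above.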

Training

Run the training script:

uv run train.py

Logs and checkpoints will be saved under the outputs/ directory by default.
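
To pick up the most recent checkpoint afterwards, something like the following may help. It is a sketch under an assumption: that train.py saves checkpoints in subfolders named `checkpoint-<step>`, which is what the transformers `Trainer` does by default. The README does not confirm the layout, and `latest_checkpoint` is a hypothetical helper name.

```python
import re
from pathlib import Path


def latest_checkpoint(output_dir="outputs"):
    """Return the checkpoint-<step> directory with the highest step, or None.

    Assumes Trainer-style folder names; adjust the pattern if the layout differs.
    """
    root = Path(output_dir)
    if not root.is_dir():
        return None
    best_step, best_path = -1, None
    for child in root.iterdir():
        match = re.fullmatch(r"checkpoint-(\d+)", child.name)
        if child.is_dir() and match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), child
    return best_path


if __name__ == "__main__":
    print(latest_checkpoint("outputs"))
```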

Training with LoRA

To fine-tune with LoRA adapters instead, run the LoRA training script:

uv run train_lora.py
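
As a rough illustration of why LoRA is cheaper to train: for one adapted weight matrix of shape d_out x d_in, LoRA keeps the original weights frozen and trains two low-rank factors, A (rank x d_in) and B (d_out x rank). The sketch below only does that arithmetic. The dimensions and rank are illustrative assumptions, not values taken from train_lora.py or from Parakeet, and biases are ignored.

```python
def full_params(d_in, d_out):
    """Parameters in one dense weight matrix (bias ignored)."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Trainable parameters added by one LoRA adapter: A (rank x d_in) + B (d_out x rank)."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    # Illustrative sizes only (assumed, not read from the model config).
    d_in, d_out, rank = 1024, 1024, 8
    added = lora_params(d_in, d_out, rank)
    total = full_params(d_in, d_out)
    print(f"LoRA trains {added} parameters vs {total} for the full matrix "
          f"({added / total:.2%})")
```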

Happy fine-tuning Parakeet! 🚀
