# 🚀 ASR Training - SUPER EASY MODE

**Just run each cell in order! That's it!** ⬇️

Almost no configuration is needed: you only supply your HuggingFace token (Step 5) and username (Step 7). Everything else is automatic!

---

## ✅ Step 1: Select GPU Runtime

**Before starting:**
1. Click **Runtime** → **Change runtime type**
2. Select **GPU** (T4 is fine, L4 is better)
3. Click **Save**

Then run the cells below! ⬇️

In [None]:
# ✅ STEP 2: Install Everything (takes ~2 minutes)
# Just click the play button ▶️ and wait!

print("📦 Installing required packages...")
print("This will take about 2 minutes. Please wait...\n")

!pip install -q transformers datasets accelerate torchaudio librosa jiwer evaluate psutil soundfile

print("\n✅ Installation complete!")

In [None]:
# ✅ STEP 3: Clone the training system
# Just click play ▶️

!git clone https://github.com/NursultanMRX/ASR-training-new-l.git
%cd ASR-training-new-l

print("✅ Code downloaded!")

In [None]:
# ✅ STEP 4: Check your GPU
# Click play ▶️ to see what GPU you have

import sys
import torch
import psutil

print("System Information:")
print(f"Python: {sys.version.split()[0]}")
print(f"PyTorch: {torch.__version__}")
print(f"CUDA Available: {torch.cuda.is_available()}")

if torch.cuda.is_available():
    print(f"\n🎮 GPU: {torch.cuda.get_device_name(0)}")
    print(f"   Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
    print(f"   CUDA Version: {torch.version.cuda}")
    print("\n✅ Perfect! You have a GPU ready for training!")
else:
    print("\n⚠️ WARNING: No GPU detected!")
    print("Go to Runtime → Change runtime type → Select GPU")

print(f"\n💾 RAM: {psutil.virtual_memory().total / 1e9:.2f} GB")

In [None]:
# ✅ STEP 5: Login to HuggingFace
# Click play ▶️ and paste your token when asked
# Get your token from: https://huggingface.co/settings/tokens

from huggingface_hub import login

print("🔑 Please paste your HuggingFace token below")
print("Get it from: https://huggingface.co/settings/tokens\n")

login()

print("\n✅ Logged in successfully!")

In [None]:
# ✅ STEP 6: Activate Keep-Alive (helps prevent idle disconnection)
# Click play ▶️

import sys
sys.path.append('src')

from colab_keeper import activate_colab_keepalive

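print("🔒 Activating Colab Keep-Alive...\n")
keeper = activate_colab_keepalive()

# Keep-alive guards against idle timeouts; Colab can still end a session
# for other reasons, such as its maximum session length.
print("\n✅ Keep-alive active! This should prevent idle disconnects.")
print("You can minimize this tab now.")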

In [None]:
# ✅ STEP 7: START TRAINING! 🚀
# Click play ▶️ and let it run!
# This will take several hours depending on your GPU.

# Default settings (you can change these if you want):
DATASET = "nickoo004/karakalpak-speech-60h-production-v2"
MODEL = "facebook/wav2vec2-xls-r-1b"
OUTPUT_NAME = "wav2vec2-xls-r-1b-karakalpak-colab"
YOUR_HF_USERNAME = "nickoo004"  # ← Change this to YOUR username!

print("🚀 Starting ASR Training...")
print(f"   Dataset: {DATASET}")
print(f"   Model: {MODEL}")
print(f"   Output: {OUTPUT_NAME}")
print("\nThis will take several hours. You can:")
print("  - Minimize the tab (keep-alive will keep it running)")
print("  - Check progress periodically")
print("  - The model will auto-save to your HuggingFace account\n")
print("="*60)

# Import and run training
from optimized_training import train_asr_model

train_asr_model(
    dataset_repo=DATASET,
    base_model=MODEL,
    output_name=OUTPUT_NAME,
    hf_username=YOUR_HF_USERNAME,
    num_epochs=20,
    target_batch_size=32,
    learning_rate=3e-4,
    use_deepspeed=False,  # Set to True for 2x memory savings
    push_to_hub=True
)

print("\n🎉 Training Complete!")
print(f"Your model is available at: https://huggingface.co/{YOUR_HF_USERNAME}/{OUTPUT_NAME}")

---

## 📊 OPTIONAL: Monitor Training

While training runs, you can check progress with TensorBoard:

In [None]:
# OPTIONAL: View training graphs
%load_ext tensorboard
%tensorboard --logdir outputs/

---

## ❓ Troubleshooting

### If you get disconnected:
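Just re-run **Step 7** and it will auto-resume from the last checkpoint! If the runtime itself was reset, re-run Steps 2 to 6 first so the packages, code, and login are restored.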
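
To see whether there is a checkpoint to resume from before re-running, a minimal sketch like the one below may help. It assumes checkpoints are stored under `outputs/` in `checkpoint-<step>` folders (the Hugging Face Trainer naming convention); the training system's actual layout is not confirmed here, and `latest_checkpoint` is a hypothetical helper, not part of the repository.

```python
import re
from pathlib import Path
from typing import Optional


def latest_checkpoint(output_dir: str = "outputs") -> Optional[Path]:
    """Return the checkpoint-<step> directory with the highest step, or None.

    Assumption: checkpoints follow the Trainer-style `checkpoint-<step>` naming.
    """
    root = Path(output_dir)
    if not root.is_dir():
        return None

    pattern = re.compile(r"^checkpoint-(\d+)$")
    candidates = []
    for path in root.rglob("checkpoint-*"):
        match = pattern.match(path.name)
        if match and path.is_dir():
            candidates.append((int(match.group(1)), path))

    if not candidates:
        return None
    # Pick the directory with the largest step number.
    return max(candidates, key=lambda item: item[0])[1]


print(latest_checkpoint() or "No checkpoint found; training would start from scratch.")
```

If this prints a path, re-running Step 7 should have something to resume from, provided the assumed layout matches.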

### If you get OOM (Out of Memory) error:
1. Go back to Step 7
2. Change `use_deepspeed=False` to `use_deepspeed=True`
3. Re-run Step 7
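
If memory is still tight, a common general remedy is a smaller per-device batch combined with gradient accumulation, which keeps the effective batch size at the `target_batch_size=32` used in Step 7. Whether `train_asr_model` exposes a per-device batch setting is not stated here, so the sketch below only shows the arithmetic; `accumulation_steps` and its parameters are hypothetical names.

```python
import math


def accumulation_steps(target_batch_size: int,
                       per_device_batch_size: int,
                       num_devices: int = 1) -> int:
    """Micro-batches to accumulate so the effective batch reaches the target."""
    if min(target_batch_size, per_device_batch_size, num_devices) < 1:
        raise ValueError("all arguments must be positive integers")
    return math.ceil(target_batch_size / (per_device_batch_size * num_devices))


# Smaller micro-batches need less GPU memory but more accumulation steps.
for micro_batch in (8, 4, 2):
    steps = accumulation_steps(32, micro_batch)
    print(f"per-device batch {micro_batch} -> accumulate {steps} steps")
```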

### If training is too slow:
Make sure you selected a GPU runtime (see Step 1). The output of Step 4 shows which GPU, if any, is attached.

### Still having issues?
Check: [TROUBLESHOOTING.md](https://github.com/NursultanMRX/ASR-training-new-l/blob/main/TROUBLESHOOTING.md)

---

## 🎯 Expected Training Time

| GPU | Time |
|-----|------|
| T4 (Free) | ~24-36 hours |
| L4 (Colab Pro) | ~12-18 hours |
| A100 (Colab Pro+) | ~6-8 hours |
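
These estimates can exceed the length of a single Colab session, so a run may span several sessions joined by checkpoint resumes. The arithmetic is sketched below, assuming a 12-hour session cap; that figure is an assumption for illustration only, since actual Colab limits vary by plan and usage. `sessions_needed` is a hypothetical helper, and the hours are the upper bounds from the table.

```python
import math

# Upper-bound estimates from the table above, in hours.
ESTIMATED_HOURS = {"T4": 36, "L4": 18, "A100": 8}

# Assumption: a session lasts at most 12 hours. Adjust to your plan.
SESSION_LIMIT_HOURS = 12


def sessions_needed(total_hours: float, session_limit_hours: float) -> int:
    """Number of sessions required to cover total_hours of training."""
    if total_hours <= 0 or session_limit_hours <= 0:
        raise ValueError("hours must be positive")
    return math.ceil(total_hours / session_limit_hours)


for gpu, hours in ESTIMATED_HOURS.items():
    count = sessions_needed(hours, SESSION_LIMIT_HOURS)
    print(f"{gpu}: up to {hours} h -> about {count} session(s)")
```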

---

**That's it! Super easy! 🚀**

Made with ❤️ for the ASR community