# 🎯 LSB Temporal Action Detection - Google Colab

Temporal action detection system for Bolivian Sign Language (Lengua de Señas Boliviana, LSB)

---

## 📋 Before you start:

1. **Set up the GPU**: Runtime → Change runtime type → **T4 GPU**
2. **Duration**: ~2-4 hours for a full training run
3. **Disk space**: ~10-15GB required
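
The disk space requirement can be checked before anything is downloaded. The following is a minimal sketch using only the standard library; the 15 GB threshold is taken from the upper end of the estimate above, and the helper name `free_space_gb` is hypothetical.

```python
import shutil

def free_space_gb(path="/"):
    """Return the free disk space at `path` in GiB."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return usage.free / 1024**3

REQUIRED_GB = 15  # assumption: upper end of the ~10-15GB estimate

free_gb = free_space_gb("/")  # on Colab, "/content" is the usual working directory
print(f"Free disk space: {free_gb:.1f} GB")
if free_gb < REQUIRED_GB:
    print(f"Less than {REQUIRED_GB} GB free; training may run out of disk space")
```

If the check fails, storing checkpoints and logs in Google Drive reduces the pressure on the local disk.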

---

### ⚙️ Recommended Hardware:
- **GPU**: T4 (16GB VRAM) ✅ RECOMMENDED
- **RAM**: 12GB High-RAM

**Why a T4 GPU and not a TPU?**
- Video Swin Transformer is optimized for CUDA
- Native PyTorch (TPUs require JAX/TensorFlow, or PyTorch through the separate PyTorch/XLA backend)
- Better support for transformers and timm
- 16GB of VRAM is enough for video clips

---

## 🔧 1. Initial Setup

In [None]:
# Check the available GPU
!nvidia-smi

import torch
print(f"\n{'='*60}")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA version: {torch.version.cuda}")
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.2f} GB")
print(f"{'='*60}")

## 📦 2. Clone the Repository

In [None]:
# Clone the repository
!git clone https://github.com/borysinho/LSB-Temporal-Action-Detection.git
%cd LSB-Temporal-Action-Detection

# Show the directory structure
!ls -la

## 📥 3. Install Dependencies

In [None]:
# Install PyTorch with CUDA (optimized for the Colab T4)
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

In [None]:
# Install the remaining dependencies
!pip install -r requirements.txt -q

print("✅ Installation complete")

## 💾 4. Connect Google Drive (Optional but Recommended)

Used to:
- Save checkpoints
- Load your videos
- Persist results

In [None]:
from google.colab import drive
drive.mount('/content/drive')

# Create directories in Drive
import os
DRIVE_PATH = '/content/drive/MyDrive/LSB_TAD'
os.makedirs(f"{DRIVE_PATH}/checkpoints", exist_ok=True)
os.makedirs(f"{DRIVE_PATH}/logs", exist_ok=True)
os.makedirs(f"{DRIVE_PATH}/data", exist_ok=True)

print(f"✅ Google Drive mounted; project folder: {DRIVE_PATH}")

## 📁 5. Prepare the Data

### Option A: Upload from your Drive
Upload your videos to `MyDrive/LSB_TAD/data/videos/`

### Option B: Download from a URL
Use this option if your data is hosted in the cloud

In [None]:
# Configure data paths
import yaml
from pathlib import Path

# Option A: use data from Google Drive
USE_DRIVE = True

if USE_DRIVE:
    VIDEO_DIR = f"{DRIVE_PATH}/data/videos"
    CHECKPOINT_DIR = f"{DRIVE_PATH}/checkpoints"
    LOG_DIR = f"{DRIVE_PATH}/logs"
else:
    # Option B: use Colab's local storage (lost when the session disconnects)
    VIDEO_DIR = "/content/data/videos"
    CHECKPOINT_DIR = "/content/checkpoints"
    LOG_DIR = "/content/logs"

# Make sure all three directories exist in either case
os.makedirs(VIDEO_DIR, exist_ok=True)
os.makedirs(CHECKPOINT_DIR, exist_ok=True)
os.makedirs(LOG_DIR, exist_ok=True)

print(f"📂 Videos: {VIDEO_DIR}")
print(f"💾 Checkpoints: {CHECKPOINT_DIR}")
print(f"📊 Logs: {LOG_DIR}")

In [None]:
# OPTIONAL: download example data (if you have a URL)
# Uncomment and adjust for your case

# !wget -O /tmp/lsb_videos.zip "YOUR_URL_HERE"
# !unzip /tmp/lsb_videos.zip -d {VIDEO_DIR}
# !rm /tmp/lsb_videos.zip

print("⚠️ Make sure you have videos in:", VIDEO_DIR)
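
Before moving on, it can help to confirm that the video directory actually contains files. The sketch below counts videos recursively with the standard library; the list of extensions and the helper name `count_videos` are assumptions, not something the repository defines.

```python
from collections import Counter
from pathlib import Path

VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv"}  # assumed set of formats

def count_videos(video_dir):
    """Count video files under `video_dir` (recursively), grouped by extension."""
    root = Path(video_dir)
    if not root.is_dir():
        return Counter()  # missing directory: report zero videos
    return Counter(
        p.suffix.lower()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    )

counts = count_videos("/content/data/videos")  # or pass VIDEO_DIR from the cell above
print(f"Found {sum(counts.values())} videos: {dict(counts)}")
```

A count of zero usually means the path is wrong or Drive has not finished syncing the upload.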

## 🏷️ 6. Prepare Temporal Annotations

Create the annotation file with the start and end timestamps of each sign

In [None]:
# Create the temporal annotations using the optimized script
!python scripts/prepare_annotations_fast.py \
    --videos_dir {VIDEO_DIR} \
    --segments_dir {VIDEO_DIR} \
    --output_dir data/annotations \
    --train_ratio 0.7 \
    --val_ratio 0.15 \
    --test_ratio 0.15

print("✅ Annotations prepared")
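
To illustrate what the 0.7 / 0.15 / 0.15 ratios mean, here is a minimal sketch of a seeded train/val/test split. It is not the logic of `prepare_annotations_fast.py`, which may split differently (for example per class); the function `split_items` and the video names are hypothetical.

```python
import random

def split_items(items, train_ratio=0.7, val_ratio=0.15, test_ratio=0.15, seed=42):
    """Shuffle `items` deterministically and split them by the given ratios."""
    if abs(train_ratio + val_ratio + test_ratio - 1.0) > 1e-6:
        raise ValueError("ratios must sum to 1.0")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = int(len(shuffled) * train_ratio)
    n_val = int(len(shuffled) * val_ratio)
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],  # the remainder goes to test
    }

splits = split_items([f"video_{i:03d}" for i in range(100)])
print({name: len(videos) for name, videos in splits.items()})
```

Fixing the seed matters when training is resumed in a new session: a different shuffle would move videos between the training and test sets.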

## ⚙️ 7. Model Configuration

Adjust the configuration for Colab

In [None]:
# Load and modify the configuration
with open('config.yaml', 'r') as f:
    config = yaml.safe_load(f)

# Settings for the Colab T4 GPU (16GB VRAM)
config['data']['videos_dir'] = VIDEO_DIR
config['training']['batch_size'] = 4  # Adjust to the available memory
config['training']['num_epochs'] = 100  # You can lower this for quick tests
config['training']['num_workers'] = 2  # Colab has 2 CPUs
config['training']['checkpoint_dir'] = CHECKPOINT_DIR
config['training']['log_dir'] = LOG_DIR

# Enable mixed precision on the T4
config['training']['mixed_precision'] = True  # FP16 for better performance

# Save the modified configuration
with open('config_colab.yaml', 'w') as f:
    yaml.dump(config, f, default_flow_style=False)

print("✅ Configuration adjusted for the Colab T4 GPU")
print(f"   - Batch size: {config['training']['batch_size']}")
print(f"   - Epochs: {config['training']['num_epochs']}")
print(f"   - Mixed Precision: {config['training']['mixed_precision']}")

## 🚀 8. Training

### Option A: Train from scratch

In [None]:
# Train from scratch
!python scripts/train.py \
    --config config_colab.yaml \
    --gpu 0 \
    --experiment_name "lsb_tad_v1"

print("✅ Training finished")

### Option B: Resume training

In [None]:
# Resume from a checkpoint (useful if Colab disconnects)
CHECKPOINT_PATH = f"{CHECKPOINT_DIR}/epoch_50.pth"  # Set this to the latest checkpoint

!python scripts/train.py \
    --config config_colab.yaml \
    --resume {CHECKPOINT_PATH} \
    --gpu 0

print("✅ Resumed training finished")
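
Instead of editing the checkpoint path by hand, the latest checkpoint can be looked up automatically. This sketch assumes checkpoints are named `epoch_<N>.pth`, as in the `epoch_50.pth` example above; whether `scripts/train.py` always uses that pattern is not confirmed here, and `find_latest_checkpoint` is a hypothetical helper.

```python
import re
from pathlib import Path

def find_latest_checkpoint(checkpoint_dir):
    """Return the path of the `epoch_<N>.pth` file with the highest N, or None."""
    pattern = re.compile(r"^epoch_(\d+)\.pth$")
    root = Path(checkpoint_dir)
    if not root.is_dir():
        return None
    best_epoch, best_path = -1, None
    for path in root.iterdir():
        match = pattern.match(path.name)
        # Compare epochs numerically so that epoch_50 beats epoch_9
        if match and int(match.group(1)) > best_epoch:
            best_epoch, best_path = int(match.group(1)), path
    return best_path

latest = find_latest_checkpoint("/content/checkpoints")  # or pass CHECKPOINT_DIR
print(f"Latest checkpoint: {latest}")
```

The result can then be assigned to `CHECKPOINT_PATH` before running the resume command.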

## 📊 9. Monitoring with TensorBoard

In [None]:
%load_ext tensorboard
%tensorboard --logdir {LOG_DIR}

## 🎯 10. Evaluation

In [None]:
# Evaluate the model on the test set
!python scripts/evaluate.py \
    --config config_colab.yaml \
    --checkpoint {CHECKPOINT_DIR}/best_model.pth \
    --split test \
    --output_dir results

# Show the metrics
import json
with open('results/metrics.json', 'r') as f:
    metrics = json.load(f)

print("\n" + "="*60)
print("FINAL RESULTS")
print("="*60)
for metric, value in metrics.items():
    # Format numeric metrics to 4 decimals; print anything else as-is
    if isinstance(value, (int, float)):
        print(f"{metric}: {value:.4f}")
    else:
        print(f"{metric}: {value}")
print("="*60)
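
The training plots in this notebook track mAP@0.5, where 0.5 is a temporal IoU threshold: a predicted segment counts as a match only if its overlap with a ground-truth segment reaches that ratio. The sketch below shows the standard temporal IoU formula; it is an illustration and not necessarily how `scripts/evaluate.py` computes it.

```python
def temporal_iou(pred, gt):
    """IoU of two temporal segments, each given as (start, end)."""
    intersection = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - intersection
    return intersection / union if union > 0 else 0.0

# Made-up segments in frames: they share 10 frames out of a 30-frame union
iou = temporal_iou((10, 30), (20, 40))
print(f"IoU = {iou:.3f}, counts as a match at 0.5: {iou >= 0.5}")
```

In this example the IoU is one third, so the prediction would not count as a correct detection at the 0.5 threshold even though it overlaps the sign.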

## 🔮 11. Inference on a New Video

In [None]:
# Inference on a single video
import torch
from models.complete_model import build_model
from utils.video_utils import load_video
from utils.visualization import visualize_detections
import cv2
from IPython.display import Video, display

# Load the model
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = build_model(config)
checkpoint = torch.load(f"{CHECKPOINT_DIR}/best_model.pth", map_location=device)
model.load_state_dict(checkpoint['model_state_dict'])
model.to(device)
model.eval()

print("✅ Model loaded")

# Process the video
VIDEO_PATH = f"{VIDEO_DIR}/test_video.mp4"  # Adjust the path

with torch.no_grad():
    # Load the video
    frames = load_video(VIDEO_PATH, num_frames=64)
    frames = frames.unsqueeze(0).to(device)  # (1, 3, T, H, W)

    # Inference
    outputs = model(frames)

    # Post-processing
    detections = outputs['detections']  # [(start, end, class, score), ...]

    print(f"\n🎯 Detected {len(detections)} signs:")
    for i, det in enumerate(detections):
        start, end, class_id, score = det
        print(f"  {i+1}. Sign {class_id}: frames {start}-{end} (confidence: {score:.3f})")

# Visualize the results
output_path = "/content/output_visualized.mp4"
visualize_detections(VIDEO_PATH, detections, output_path)

# Show the video with the detections
display(Video(output_path, width=640, embed=True))
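
Detections are reported as frame positions, which are harder to read than timestamps. The sketch below converts them to seconds, assuming that `load_video` samples `num_frames` frames uniformly across the whole video and that the detection boundaries are indices into that sampled clip. Neither assumption is confirmed by the notebook, and both helper names are hypothetical.

```python
def clip_index_to_seconds(index, num_frames, duration_s):
    """Map a position in a uniformly sampled clip to a time in the source video."""
    return index / num_frames * duration_s

def detections_to_seconds(detections, num_frames, duration_s):
    """Convert (start, end, class_id, score) tuples from clip indices to seconds."""
    return [
        (
            clip_index_to_seconds(start, num_frames, duration_s),
            clip_index_to_seconds(end, num_frames, duration_s),
            class_id,
            score,
        )
        for start, end, class_id, score in detections
    ]

example = [(16, 32, 3, 0.91)]  # made-up detection, not real model output
print(detections_to_seconds(example, num_frames=64, duration_s=8.0))
```

If the model instead reports indices into the original video, the conversion is simply the frame index divided by the video's frame rate.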

## 📦 12. Batch Inference

In [None]:
# Process multiple videos
import glob
from tqdm import tqdm

input_dir = f"{VIDEO_DIR}/test_videos"
output_dir = "/content/predictions"
os.makedirs(output_dir, exist_ok=True)

video_files = glob.glob(f"{input_dir}/*.mp4")

results = {}

for video_path in tqdm(video_files, desc="Processing videos"):
    video_name = Path(video_path).stem

    with torch.no_grad():
        frames = load_video(video_path, num_frames=64)
        frames = frames.unsqueeze(0).to(device)

        outputs = model(frames)
        detections = outputs['detections']

        results[video_name] = detections

# Save the results
import pickle
with open(f"{output_dir}/batch_predictions.pkl", 'wb') as f:
    pickle.dump(results, f)

print(f"✅ {len(results)} videos processed")
print(f"   Results saved to: {output_dir}/batch_predictions.pkl")
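
Pickle files can only be read back from Python, so a JSON copy is handy when the predictions are consumed by other tools. The sketch below assumes each detection is a `(start, end, class_id, score)` tuple, as in the inference cell; the helper name and the example data are hypothetical.

```python
import json
import os
import tempfile

def save_detections_json(results, path):
    """Write {video_name: [(start, end, class_id, score), ...]} as JSON."""
    serializable = {
        name: [
            # float()/int() also accept NumPy scalars and single-element tensors
            {"start": float(s), "end": float(e), "class_id": int(c), "score": float(score)}
            for s, e, c, score in detections
        ]
        for name, detections in results.items()
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(serializable, f, indent=2)
    return serializable

example_results = {"video_001": [(12, 40, 3, 0.87)]}  # made-up detection
out_path = os.path.join(tempfile.gettempdir(), "batch_predictions.json")
saved = save_detections_json(example_results, out_path)
print(f"Wrote {len(saved)} entries to {out_path}")
```

In the notebook, the same function could be called with the `results` dictionary and a path inside `output_dir`.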

## 💾 13. Export the Model

In [None]:
# Export to ONNX for production
dummy_input = torch.randn(1, 3, 64, 224, 224).to(device)

torch.onnx.export(
    model,
    dummy_input,
    f"{CHECKPOINT_DIR}/model.onnx",
    export_params=True,
    opset_version=14,
    do_constant_folding=True,
    input_names=['video'],
    output_names=['detections'],
    dynamic_axes={
        'video': {0: 'batch_size'},
        'detections': {0: 'batch_size'}
    }
)

print(f"✅ Model exported to ONNX: {CHECKPOINT_DIR}/model.onnx")

## 📥 14. Download Results

In [None]:
# Bundle the results for download
from google.colab import files

# Build a ZIP containing only the important results
!zip -r /content/lsb_tad_results.zip \
    {CHECKPOINT_DIR}/best_model.pth \
    {CHECKPOINT_DIR}/model.onnx \
    results/ \
    config_colab.yaml

# Download
files.download('/content/lsb_tad_results.zip')

print("✅ Download started")

## 💡 15. Tips and Troubleshooting

### ⚠️ If you run out of memory (CUDA OOM):
```python
# Reduce the batch size
config['training']['batch_size'] = 2

# Reduce the input resolution
config['data']['input_size'] = 192  # instead of 224

# Reduce the clip length
config['data']['clip_length'] = 32  # instead of 64

# Save the changes so that scripts/train.py picks them up
with open('config_colab.yaml', 'w') as f:
    yaml.dump(config, f, default_flow_style=False)
```
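
To get a feel for how much these three settings matter, the size of the input tensor can be computed directly. This is a rough sketch: it only measures the `(B, 3, T, H, W)` input in FP32, while the activations inside the network dominate memory use. Their size should shrink roughly with the same product, but that is an assumption about the model rather than a measurement.

```python
def input_tensor_mib(batch_size, clip_length, input_size, bytes_per_value=4):
    """Size in MiB of a (B, 3, T, H, W) input tensor; FP32 uses 4 bytes per value."""
    values = batch_size * 3 * clip_length * input_size * input_size
    return values * bytes_per_value / 1024**2

default = input_tensor_mib(batch_size=4, clip_length=64, input_size=224)
reduced = input_tensor_mib(batch_size=2, clip_length=32, input_size=192)
print(f"default: {default:.1f} MiB, reduced: {reduced:.1f} MiB, "
      f"ratio: {reduced / default:.2f}")
```

With the values above the input shrinks from 147 MiB to 27 MiB, a little under a fifth of the original size.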

### 🔄 If Colab disconnects:
- The checkpoints are in Google Drive (if you set `USE_DRIVE = True`)
- Re-run the setup cells (sections 1 to 7), then run the "Resume training" cell

### 📊 Monitor GPU usage:
```python
# Snapshot of GPU utilization and memory; re-run the cell to refresh
!nvidia-smi
```

### 🚀 Speed up training:
1. **Mixed Precision (FP16)**: Already enabled in the config
2. **Gradient Accumulation**: Use it when the batch size is very small
3. **DataLoader workers**: Tune `num_workers`
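
Gradient accumulation works because the average of the gradients of several equally sized micro-batches equals the gradient of the combined batch. The sketch below checks this numerically in plain Python for a one-parameter model `y = w * x` with mean squared error; the model, data and function names are made up for illustration. In PyTorch the same idea is usually written by dividing the loss by the number of accumulation steps, calling `backward()` on every micro-batch, and calling `optimizer.step()` only every N micro-batches.

```python
def mse_grad(w, batch):
    """Gradient of the mean squared error of y = w * x over `batch`."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def accumulated_grad(w, data, micro_batch_size):
    """Average the micro-batch gradients, as gradient accumulation does."""
    micro_batches = [
        data[i:i + micro_batch_size] for i in range(0, len(data), micro_batch_size)
    ]
    total = 0.0
    for micro in micro_batches:
        # Scaling by 1/steps is the analogue of `loss / accumulation_steps`
        total += mse_grad(w, micro) / len(micro_batches)
    return total

data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]  # made-up (x, y) pairs
full = mse_grad(0.5, data)
accumulated = accumulated_grad(0.5, data, micro_batch_size=2)
print(f"full batch: {full:.6f}, accumulated: {accumulated:.6f}")
```

The two values agree up to floating-point rounding, so a batch size of 2 with 4 accumulation steps behaves like a batch size of 8 for the optimizer, at the cost of more forward and backward passes per update.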

### 📈 Improve accuracy:
1. Train for more epochs
2. Use more aggressive data augmentation
3. Tune the learning rate
4. Use a learning rate scheduler
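
As an example of the last item, a common choice for transformer models is a linear warmup followed by cosine decay. The sketch below computes such a schedule in plain Python; the base learning rate of `1e-4` and the 5 warmup epochs are assumptions, not values from `config.yaml`, and the repository may already configure a different scheduler.

```python
import math

def lr_at_epoch(epoch, num_epochs, base_lr, warmup_epochs=5, min_lr=0.0):
    """Linear warmup to `base_lr`, then cosine decay towards `min_lr`."""
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, num_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

schedule = [lr_at_epoch(e, num_epochs=100, base_lr=1e-4) for e in range(100)]
print(f"epoch 0: {schedule[0]:.2e}, epoch 5: {schedule[5]:.2e}, "
      f"epoch 99: {schedule[99]:.2e}")
```

The warmup keeps the first updates small while the optimizer statistics settle, and the cosine decay lowers the learning rate smoothly over the remaining epochs.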

## 📈 16. Real-Time Monitoring

In [None]:
# Monitor training progress
import matplotlib.pyplot as plt
import pandas as pd
from IPython.display import clear_output
import time

def plot_training_progress(log_file):
    """Plot training progress from the CSV log."""
    try:
        df = pd.read_csv(log_file)

        fig, axes = plt.subplots(2, 2, figsize=(15, 10))

        # Loss
        axes[0, 0].plot(df['epoch'], df['train_loss'], label='Train')
        axes[0, 0].plot(df['epoch'], df['val_loss'], label='Val')
        axes[0, 0].set_title('Loss')
        axes[0, 0].set_xlabel('Epoch')
        axes[0, 0].legend()
        axes[0, 0].grid(True)

        # mAP
        axes[0, 1].plot(df['epoch'], df['val_map'])
        axes[0, 1].set_title('mAP@0.5')
        axes[0, 1].set_xlabel('Epoch')
        axes[0, 1].grid(True)

        # Learning Rate
        axes[1, 0].plot(df['epoch'], df['lr'])
        axes[1, 0].set_title('Learning Rate')
        axes[1, 0].set_xlabel('Epoch')
        axes[1, 0].set_yscale('log')
        axes[1, 0].grid(True)

        # GPU Memory (only if the log contains this column)
        if 'gpu_memory' in df.columns:
            axes[1, 1].plot(df['epoch'], df['gpu_memory'])
            axes[1, 1].set_title('GPU Memory (GB)')
            axes[1, 1].set_xlabel('Epoch')
            axes[1, 1].grid(True)

        plt.tight_layout()
        plt.show()

    except Exception as e:
        print(f"Error while plotting: {e}")

# Refresh every 30 seconds during training
LOG_FILE = f"{LOG_DIR}/training.csv"

# Uncomment for real-time monitoring (run in a separate cell)
# while True:
#     clear_output(wait=True)
#     plot_training_progress(LOG_FILE)
#     time.sleep(30)
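
The same log can answer a simpler question without any plotting: which epoch had the best validation mAP so far. The sketch below uses the standard `csv` module and assumes the log has the `epoch` and `val_map` columns that the plotting function reads; the helper name `best_epoch` and the example path are hypothetical.

```python
import csv
import os

def best_epoch(log_file, metric="val_map"):
    """Return (epoch, value) for the row with the highest `metric`, or None."""
    best = None
    with open(log_file, newline="") as f:
        for row in csv.DictReader(f):
            if not row.get(metric):
                continue  # skip rows where the metric is missing or empty
            value = float(row[metric])
            if best is None or value > best[1]:
                best = (int(float(row["epoch"])), value)
    return best

log_path = "/content/logs/training.csv"  # or LOG_FILE from the cell above
if os.path.exists(log_path):
    print(f"Best epoch so far: {best_epoch(log_path)}")
else:
    print(f"No training log found at {log_path}")
```

This is useful for deciding which checkpoint to resume from, or whether training has stopped improving.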

## 🧹 17. Cleanup (Optional)

In [None]:
# Remove temporary files to free up space
!rm -rf /tmp/*
!rm -rf ~/.cache/pip
!rm -rf data/cache/*

# Show the available space
!df -h /content

print("✅ Cleanup complete")

---

## 🎓 Additional Resources

- **Repository**: https://github.com/borysinho/LSB-Temporal-Action-Detection
- **Video Swin paper**: [Video Swin Transformer](https://arxiv.org/abs/2106.13230)
- **PyTorch documentation**: https://pytorch.org/docs/

---

## 📞 Support

Problems or questions? Open an issue on GitHub.

---

**Made with ❤️ for the LSB community**