# SSH Key

We create an SSH key so that we can use GitHub without entering a username and password each time.

In [1]:
# Check for an available GPU
!nvidia-smi || echo "No GPU detected, or nvidia-smi is not available."

Tue Jan 13 00:29:26 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   34C    P8              9W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

In [None]:
%%bash
set -e
GIT_EMAIL=${GIT_EMAIL:-"joaquin.arroyo100@gmail.com"}

# Generate an ephemeral key for this session
mkdir -p ~/.ssh && chmod 700 ~/.ssh
if [ -f ~/.ssh/id_ed25519 ]; then
  echo "~/.ssh/id_ed25519 already exists; reusing it."
else
  ssh-keygen -t ed25519 -C "$GIT_EMAIL" -f ~/.ssh/id_ed25519 -N ""
fi

echo "Public key (paste it into GitHub → SSH Keys):"
cat ~/.ssh/id_ed25519.pub
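Once the public key has been added to GitHub, the connection can be checked with `ssh -T git@github.com`. The following is a minimal sketch, not part of the original notebook; the helper names `looks_authenticated` and `check_github_ssh` are hypothetical. GitHub does not provide shell access, so the command exits with a nonzero status even when authentication succeeds, which is why the sketch inspects the greeting text rather than the return code. It also assumes the GitHub host key is already in `~/.ssh/known_hosts`; with `BatchMode=yes`, ssh fails instead of prompting when it is not.

```python
import subprocess


def looks_authenticated(ssh_output: str) -> bool:
    """Return True if the text printed by `ssh -T git@github.com` contains GitHub's success greeting."""
    return "successfully authenticated" in ssh_output


def check_github_ssh(timeout_s: float = 15.0) -> bool:
    """Run `ssh -T git@github.com` and report whether the key was accepted.

    BatchMode=yes makes ssh fail instead of prompting, so the notebook never hangs
    on an interactive question.
    """
    proc = subprocess.run(
        ["ssh", "-T", "-o", "BatchMode=yes", "git@github.com"],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    # The greeting may arrive on either stream, so check both.
    return looks_authenticated(proc.stdout + proc.stderr)
```

Calling `check_github_ssh()` after the key is registered should return `True`; a `False` result usually means the key has not been added yet or the host key is missing.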

In [None]:
# Clone only if missing, then list and update (SSH)
import os
import subprocess

GIT_USER = "Joaquin Arroyo"
GIT_EMAIL = "joaquin.arroyo100@gmail.com"
GITHUB_USER = "joaquinarroyo"
REPO_NAME = "Tesina"
repo_path = REPO_NAME

os.environ["GIT_USER"] = GIT_USER
os.environ["GIT_EMAIL"] = GIT_EMAIL
os.environ["GITHUB_USER"] = GITHUB_USER
os.environ["REPO_NAME"] = REPO_NAME
os.environ["REPO_PATH"] = repo_path

# Add GitHub to known_hosts to avoid "Host key verification failed"
print("Configuring SSH for GitHub...")
subprocess.run("mkdir -p ~/.ssh && ssh-keyscan -t rsa github.com >> ~/.ssh/known_hosts 2>/dev/null", shell=True)

cwd_name = os.path.basename(os.getcwd())

if cwd_name == REPO_NAME:
    print("Already inside the repo; running git pull.")
    !git remote set-url origin git@github.com:{GITHUB_USER}/{REPO_NAME}.git
    !git pull --rebase
else:
    if os.path.exists(repo_path):
        print(f"The folder {repo_path} already exists; running git pull.")
        %cd {repo_path}
        !git remote set-url origin git@github.com:{GITHUB_USER}/{REPO_NAME}.git
        !git pull --rebase
    else:
        !git clone git@github.com:{GITHUB_USER}/{REPO_NAME}.git
        %cd {repo_path}

print(f"Contents of directory {repo_path}:")
%ls

In [None]:
print("Installing dependencies...")
!pip install -r requirements.txt

## Sequential Experiment Plan

Baseline phase, to be compared against MPS later. Run the cells in order to build the reference data.

### 1. E1 Sequential - Setup Validation
**MNIST + SimpleCNN** - Fast; ideal for verifying that everything works (metrics, energy, plots)

In [None]:
!python -m experiments.sequential --exp E1 --out ./runs --repeat 3 --seed 42 --gpu-index 0

Ejecutando E1 | condition=sequential | seed=42 | pid=1668
Dataset: mnist | Modelo: simplecnn | bs=256 | epochs=3
Salida: runs/E1_mnist_simplecnn_sequential_20260109_013023_981992_pid1668
100% 9.91M/9.91M [00:00<00:00, 20.6MB/s]
100% 28.9k/28.9k [00:00<00:00, 486kB/s]
100% 1.65M/1.65M [00:00<00:00, 4.54MB/s]
100% 4.54k/4.54k [00:00<00:00, 16.0MB/s]
[mnist | simplecnn] epoch 1/3 train_acc=0.891 val_acc=0.971 train_thr=2938.7 samp/s
[mnist | simplecnn] epoch 2/3 train_acc=0.975 val_acc=0.983 train_thr=7355.2 samp/s
[mnist | simplecnn] epoch 3/3 train_acc=0.983 val_acc=0.984 train_thr=7909.6 samp/s
Ejecutando E1 | condition=sequential | seed=43 | pid=1668
Dataset: mnist | Modelo: simplecnn | bs=256 | epochs=3
Salida: runs/E1_mnist_simplecnn_sequential_20260109_013110_466931_pid1668
[mnist | simplecnn] epoch 1/3 train_acc=0.894 val_acc=0.970 train_thr=8043.1 samp/s
[mnist | simplecnn] epoch 2/3 train_acc=0.974 val_acc=0.981 train_thr=7080.5 samp/s
[mnist | simplecnn] epoch 3/3 train_acc=0.9
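The per-epoch lines in the output above follow a regular pattern, so they can be turned into numbers for later comparison. The sketch below is an illustration, not code from the repository: the regular expression is inferred from the log lines shown here, and `parse_epoch_lines` is a hypothetical helper. It drops epoch 1 when averaging, on the assumption that the slow first epoch of the first run (2938.7 samp/s against more than 7000 afterwards) reflects warm-up rather than steady-state speed.

```python
import re
from statistics import mean

# Pattern inferred from the log lines printed by the experiment runner.
EPOCH_RE = re.compile(
    r"\[(?P<dataset>\w+) \| (?P<model>\w+)\] epoch (?P<epoch>\d+)/(?P<total>\d+) "
    r"train_acc=(?P<train_acc>[\d.]+) val_acc=(?P<val_acc>[\d.]+) "
    r"train_thr=(?P<train_thr>[\d.]+) samp/s"
)


def parse_epoch_lines(log_text):
    """Return one dict per complete epoch line; truncated lines are skipped."""
    rows = []
    for m in EPOCH_RE.finditer(log_text):
        rows.append({
            "dataset": m["dataset"],
            "model": m["model"],
            "epoch": int(m["epoch"]),
            "total": int(m["total"]),
            "train_acc": float(m["train_acc"]),
            "val_acc": float(m["val_acc"]),
            "train_thr": float(m["train_thr"]),
        })
    return rows


log = """\
[mnist | simplecnn] epoch 1/3 train_acc=0.891 val_acc=0.971 train_thr=2938.7 samp/s
[mnist | simplecnn] epoch 2/3 train_acc=0.975 val_acc=0.983 train_thr=7355.2 samp/s
[mnist | simplecnn] epoch 3/3 train_acc=0.983 val_acc=0.984 train_thr=7909.6 samp/s
"""
rows = parse_epoch_lines(log)
# Skip epoch 1 (assumed warm-up) when estimating steady-state throughput.
steady = [r["train_thr"] for r in rows if r["epoch"] > 1]
print(f"steady-state throughput: {mean(steady):.1f} samp/s")  # 7632.4 samp/s
```

The same parser handles the other experiments' logs, since dataset and model names such as `cifar10` and `mobilenet_v3_small` match `\w+`.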

### 2. E3 Sequential - Robust Baseline
**CIFAR-10 + ResNet-18** - Medium complexity; ideal for seeing the impact of MPS later

In [None]:
!python -m experiments.sequential --exp E3 --out ./runs --repeat 3 --seed 100 --gpu-index 0

Ejecutando E3 | condition=sequential | seed=100 | pid=2349
Dataset: cifar10 | Modelo: resnet18 | bs=128 | epochs=5
Salida: runs/E3_cifar10_resnet18_sequential_20260109_013217_679903_pid2349
100% 170M/170M [00:04<00:00, 40.9MB/s] 
[cifar10 | resnet18] epoch 1/5 train_acc=0.437 val_acc=0.423 train_thr=1176.2 samp/s
[cifar10 | resnet18] epoch 2/5 train_acc=0.573 val_acc=0.583 train_thr=2558.0 samp/s
[cifar10 | resnet18] epoch 3/5 train_acc=0.636 val_acc=0.603 train_thr=2585.8 samp/s
[cifar10 | resnet18] epoch 4/5 train_acc=0.672 val_acc=0.639 train_thr=2628.7 samp/s
[cifar10 | resnet18] epoch 5/5 train_acc=0.699 val_acc=0.678 train_thr=2491.3 samp/s
Ejecutando E3 | condition=sequential | seed=101 | pid=2349
Dataset: cifar10 | Modelo: resnet18 | bs=128 | epochs=5
Salida: runs/E3_cifar10_resnet18_sequential_20260109_013439_692412_pid2349
[cifar10 | resnet18] epoch 1/5 train_acc=0.437 val_acc=0.372 train_thr=2511.7 samp/s
[cifar10 | resnet18] epoch 2/5 train_acc=0.576 val_acc=0.576 train_thr

### 3. E6 Sequential - Heavy Model
**Flowers102 + ResNet-50** - Higher energy consumption; useful for analyzing saturation under MPS

In [None]:
!python -m experiments.sequential --exp E6 --out ./runs --repeat 2 --seed 200 --gpu-index 0

Ejecutando E6 | condition=sequential | seed=200 | pid=4150
Dataset: flowers102 | Modelo: resnet50 | bs=32 | epochs=5
Salida: runs/E6_flowers102_resnet50_sequential_20260109_013826_820364_pid4150
100% 345M/345M [00:17<00:00, 19.8MB/s] 
100% 502/502 [00:00<00:00, 2.16MB/s]
100% 15.0k/15.0k [00:00<00:00, 47.2MB/s]
[flowers102 | resnet50] epoch 1/5 train_acc=0.005 val_acc=0.005 train_thr=30.9 samp/s
[flowers102 | resnet50] epoch 2/5 train_acc=0.017 val_acc=0.023 train_thr=110.2 samp/s
[flowers102 | resnet50] epoch 3/5 train_acc=0.034 val_acc=0.070 train_thr=108.7 samp/s
[flowers102 | resnet50] epoch 4/5 train_acc=0.054 val_acc=0.086 train_thr=109.9 samp/s
[flowers102 | resnet50] epoch 5/5 train_acc=0.075 val_acc=0.126 train_thr=124.4 samp/s
Ejecutando E6 | condition=sequential | seed=201 | pid=4150
Dataset: flowers102 | Modelo: resnet50 | bs=32 | epochs=5
Salida: runs/E6_flowers102_resnet50_sequential_20260109_014045_992804_pid4150
[flowers102 | resnet50] epoch 1/5 train_acc=0.012 val_acc=

### 4. E2 Sequential - Lightweight Model
**CIFAR-10 + MobileNet-v3 (small)** - Same dataset as E3 with a lighter model; checks whether model size affects efficiency

In [None]:
!python -m experiments.sequential --exp E2 --out ./runs --repeat 3 --seed 300 --gpu-index 0

Ejecutando E2 | condition=sequential | seed=300 | pid=5269
Dataset: cifar10 | Modelo: mobilenet_v3_small | bs=256 | epochs=5
Salida: runs/E2_cifar10_mobilenet_v3_small_sequential_20260109_014212_423570_pid5269
[cifar10 | mobilenet_v3_small] epoch 1/5 train_acc=0.294 val_acc=0.100 train_thr=1138.3 samp/s
[cifar10 | mobilenet_v3_small] epoch 2/5 train_acc=0.400 val_acc=0.100 train_thr=2604.2 samp/s
[cifar10 | mobilenet_v3_small] epoch 3/5 train_acc=0.451 val_acc=0.382 train_thr=2512.9 samp/s
[cifar10 | mobilenet_v3_small] epoch 4/5 train_acc=0.478 val_acc=0.479 train_thr=2634.2 samp/s
[cifar10 | mobilenet_v3_small] epoch 5/5 train_acc=0.500 val_acc=0.475 train_thr=2654.8 samp/s
Ejecutando E2 | condition=sequential | seed=301 | pid=5269
Dataset: cifar10 | Modelo: mobilenet_v3_small | bs=256 | epochs=5
Salida: runs/E2_cifar10_mobilenet_v3_small_sequential_20260109_014428_015126_pid5269
[cifar10 | mobilenet_v3_small] epoch 1/5 train_acc=0.296 val_acc=0.100 train_thr=2592.1 samp/s
[cifar10 |

### Commit and Push of Sequential Results

In [14]:
!git config user.name "Joaquin Arroyo"
!git config user.email "joaquin.arroyo100@gmail.com"
!git add runs/
!git commit -m "Resultados baseline secuencial: E1, E3, E6, E2" || echo "No changes to commit."
!git push origin main

[main 6e23b3c] Resultados baseline secuencial: E1, E3, E6, E2
 111 files changed, 34041 insertions(+), 1 deletion(-)
 create mode 100644 runs/E1_mnist_simplecnn_sequential_20260109_013023_981992_pid1668/acc_vs_epoch.png
 create mode 100644 runs/E1_mnist_simplecnn_sequential_20260109_013023_981992_pid1668/epochs.csv
 create mode 100644 runs/E1_mnist_simplecnn_sequential_20260109_013023_981992_pid1668/gpu_power_over_time.png
 create mode 100644 runs/E1_mnist_simplecnn_sequential_20260109_013023_981992_pid1668/gpu_samples.csv
 create mode 100644 runs/E1_mnist_simplecnn_sequential_20260109_013023_981992_pid1668/gpu_util_over_time.png
 create mode 100644 runs/E1_mnist_simplecnn_sequential_20260109_013023_981992_pid1668/loss_vs_epoch.png
 create mode 100644 runs/E1_mnist_simplecnn_sequential_20260109_013023_981992_pid1668/run.json
 create mode 100644 runs/E1_mnist_simplecnn_sequential_20260109_013023_981992_pid1668/summary_row.csv
 create mode 100644 runs/E1_mnist_simplecnn_sequential_202601
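Each run directory committed above contains a `summary_row.csv`, so the baseline can be gathered into a single table. This is a minimal sketch under two assumptions: that each `summary_row.csv` starts with a header row, and that the directory layout is `runs/<run_name>/summary_row.csv` as in the commit listing. The column names are not shown in this notebook, so the sketch does not depend on them; `collect_summary_rows` is a hypothetical helper.

```python
import csv
from pathlib import Path


def collect_summary_rows(runs_dir="runs"):
    """Read every <runs_dir>/<run>/summary_row.csv and return the rows as dicts."""
    rows = []
    for path in sorted(Path(runs_dir).glob("*/summary_row.csv")):
        with path.open(newline="") as fh:
            for row in csv.DictReader(fh):
                # Remember which run directory the row came from.
                row["run_dir"] = path.parent.name
                rows.append(row)
    return rows


if __name__ == "__main__":
    for row in collect_summary_rows("runs"):
        print(row)
```

Because the run directory names encode experiment, dataset, model and condition, the `run_dir` field is enough to separate sequential rows from MPS rows later.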

## MPS (Multi-Process Service) Experiment Plan

NVIDIA MPS lets multiple CUDA processes share a single GPU efficiently.
This is useful for measuring:
- **Aggregate throughput**: Is training 2+ models in parallel more efficient than training them sequentially?
- **Energy consumption**: Does sharing resources make better use of the GPU?
- **Saturation**: How many parallel processes does it take to saturate the GPU?
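To compare energy between conditions, the sampled GPU power has to be integrated over time. The sketch below shows one way to do that with the trapezoidal rule. It takes plain lists of timestamps and power readings, because the column names in `gpu_samples.csv` are not shown in this notebook; the function names and the numbers in the example are illustrative, not measured.

```python
def energy_joules(timestamps_s, power_w):
    """Integrate power (W) over time (s) with the trapezoidal rule; returns joules."""
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have the same length")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        total += dt * (power_w[i] + power_w[i - 1]) / 2.0
    return total


def joules_per_sample(energy_j, n_samples):
    """Energy cost of processing one training sample."""
    return energy_j / n_samples


# Illustrative numbers only (not measured): 4 s at a constant 50 W.
e = energy_joules([0.0, 1.0, 2.0, 3.0, 4.0], [50.0] * 5)
print(e)                           # 200.0
print(joules_per_sample(e, 1000))  # 0.2
```

Dividing by the number of samples processed makes sequential and MPS runs comparable even when their wall-clock durations differ.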

### 0. Initialize MPS
Before running the MPS experiments, we need to start the MPS daemon.

In [13]:
# Start the NVIDIA MPS daemon, then install nvtop and task-spooler.
# Note: each `!` command runs in its own subshell, so variables exported by
# setup_mps.sh do not persist into later cells.
!source setup_mps.sh
!apt install nvtop task-spooler -y -qq
print("✅ MPS started, nvtop and tsp installed")

MPS iniciado
nvtop is already the newest version (1.2.2-1).
The following NEW packages will be installed:
  task-spooler
0 upgraded, 1 newly installed, 0 to remove and 41 not upgraded.
Need to get 34.4 kB of archives.
After this operation, 95.2 kB of additional disk space will be used.
Selecting previously unselected package task-spooler.
(Reading database ... 121694 files and directories currently installed.)
Preparing to unpack .../task-spooler_1.0.1+dfsg1-1_amd64.deb ...
Unpacking task-spooler (1.0.1+dfsg1-1) ...
Setting up task-spooler (1.0.1+dfsg1-1) ...
Processing triggers for man-db (2.10.2-1) ...
✅ MPS started, nvtop and tsp installed
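The contents of `setup_mps.sh` are not shown in this notebook, so the following is only a sketch of how the daemon could be started from Python instead. It uses the `CUDA_MPS_PIPE_DIRECTORY` and `CUDA_MPS_LOG_DIRECTORY` environment variables and `nvidia-cuda-mps-control -d`. The directory paths and the helper names `mps_environment` and `start_mps_daemon` are assumptions made for illustration.

```python
import os
import shutil
import subprocess


def mps_environment(pipe_dir="/tmp/nvidia-mps", log_dir="/tmp/nvidia-mps-log"):
    """Return a copy of os.environ with the MPS directories set (paths are placeholders)."""
    env = dict(os.environ)
    env["CUDA_MPS_PIPE_DIRECTORY"] = pipe_dir
    env["CUDA_MPS_LOG_DIRECTORY"] = log_dir
    return env


def start_mps_daemon(env):
    """Start the MPS control daemon; returns False if the binary is not on PATH."""
    if shutil.which("nvidia-cuda-mps-control") is None:
        return False
    os.makedirs(env["CUDA_MPS_PIPE_DIRECTORY"], exist_ok=True)
    os.makedirs(env["CUDA_MPS_LOG_DIRECTORY"], exist_ok=True)
    # -d launches the control daemon in the background.
    subprocess.run(["nvidia-cuda-mps-control", "-d"], env=env, check=True)
    return True
```

If custom directories are used, applying the same two variables to `os.environ` in the notebook process lets training jobs launched from later cells find the daemon.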


### 1. MPS E1 x2 - Basic MPS Validation
**2 instances of MNIST + SimpleCNN** - Verifies that MPS works and measures the speedup over sequential

In [14]:
# Enqueue MPS E1 x2
!tsp python -m experiments.mps --exp E1 --parallel 2 --out ./runs --seed 1000 --mps-group MPS_E1_x2

0


### 2. MPS E1 x3 - Scalability
**3 instances of MNIST + SimpleCNN** - Does throughput scale linearly?
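One way to answer the scaling question is to compare the summed throughput of the concurrent processes with the single-process baseline. The sketch below assumes the per-process throughputs have already been extracted from the logs; `scaling_efficiency` is a hypothetical helper and the numbers are illustrative, not measured. An efficiency of 1.0 would mean perfectly linear scaling.

```python
def scaling_efficiency(per_process_thr, single_thr):
    """Return (aggregate throughput, speedup over one process, efficiency).

    Efficiency is the aggregate throughput divided by N times the
    single-process throughput, where N is the number of concurrent processes.
    """
    n = len(per_process_thr)
    if n == 0 or single_thr <= 0:
        raise ValueError("need at least one process and a positive baseline")
    aggregate = sum(per_process_thr)
    return aggregate, aggregate / single_thr, aggregate / (n * single_thr)


# Illustrative numbers only (not measured): three processes at 3000 samp/s each,
# against a sequential baseline of 6000 samp/s.
agg, speedup, eff = scaling_efficiency([3000.0, 3000.0, 3000.0], 6000.0)
print(agg, speedup, eff)  # 9000.0 1.5 0.5
```

In this made-up case MPS would still be worthwhile (a speedup of 1.5) even though scaling is far from linear (efficiency of 0.5).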

In [15]:
# Enqueue MPS E1 x3
!tsp python -m experiments.mps --exp E1 --parallel 3 --out ./runs --seed 2000 --mps-group MPS_E1_x3

1


### 3. MPS E3 x2 - Medium Model in Parallel
**2 instances of CIFAR-10 + ResNet-18** - Compare against the sequential E3 baseline

In [16]:
# Enqueue MPS E3 x2
!tsp python -m experiments.mps --exp E3 --parallel 2 --out ./runs --seed 3000 --mps-group MPS_E3_x2

2


### 4. MPS E2 x2 - Lightweight Model in Parallel
**2 instances of CIFAR-10 + MobileNet-v3** - An efficient model: does MPS make better use of the GPU?

In [17]:
# Enqueue MPS E2 x2
!tsp python -m experiments.mps --exp E2 --parallel 2 --out ./runs --seed 4000 --mps-group MPS_E2_x2

3


### 5. Heterogeneous MPS: E1 + E3
**MNIST + CIFAR-10 in parallel** - What happens when models of different sizes compete for the GPU?

In [18]:
# Enqueue heterogeneous MPS E1+E3
!tsp python -m experiments.mps --exp E1 E3 --out ./runs --seed 5000 --mps-group MPS_E1_E3

4


### 6. MPS E6 x2 - Saturation with a Heavy Model
**2 instances of Flowers102 + ResNet-50** - Saturation point with a memory-bound model

In [19]:
# Enqueue MPS E6 x2 (careful: this may exhaust GPU memory)
!tsp python -m experiments.mps --exp E6 --parallel 2 --out ./runs --seed 6000 --mps-group MPS_E6_x2

5


In [23]:
# View the MPS job queue
!tsp -l

ID   State      Output               E-Level  Times(r/u/s)   Command [run=0/1]
0    finished   /tmp/ts-out.0U5c1f   0        0.07/0.04/0.01 python -m experiments.mps --exp E1 --parallel 2 --out ./runs --seed 1000 --mps-group MPS_E1_x2
1    finished   /tmp/ts-out.e8bV0e   0        0.09/0.07/0.01 python -m experiments.mps --exp E1 --parallel 3 --out ./runs --seed 2000 --mps-group MPS_E1_x3
2    finished   /tmp/ts-out.rdGtc4   0        0.08/0.06/0.01 python -m experiments.mps --exp E3 --parallel 2 --out ./runs --seed 3000 --mps-group MPS_E3_x2
3    finished   /tmp/ts-out.iWA90z   0        0.06/0.05/0.01 python -m experiments.mps --exp E2 --parallel 2 --out ./runs --seed 4000 --mps-group MPS_E2_x2
4    finished   /tmp/ts-out.5Le0oC   0        0.07/0.05/0.01 python -m experiments.mps --exp E1 E3 --out ./runs --seed 5000 --mps-group MPS_E1_E3
5    finished   /tmp/ts-out.iJb5bi   0        0.06/0.05/0.01 python -m experiments.mps --exp E6 --parallel 2 --out ./runs --seed 6000 --mps-group MPS_E
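In the listing above every job reports well under a second of real time, while the sequential E1 runs started about 47 seconds apart according to their directory timestamps. This may mean the queued commands returned before training, either by exiting early or by handing work to background processes. The sketch below parses the `finished` rows of `tsp -l` and flags such jobs. It relies only on the column layout visible above; `parse_tsp_finished` and the 1-second threshold are illustrative choices.

```python
def parse_tsp_finished(listing):
    """Parse the 'finished' rows of `tsp -l` output into dicts."""
    jobs = []
    for line in listing.splitlines():
        # Columns: ID, State, Output, E-Level, Times(r/u/s), Command
        parts = line.split(None, 5)
        if len(parts) < 6 or parts[1] != "finished":
            continue  # header, blank line, or a job that has not finished
        job_id, _state, output, e_level, times, command = parts
        jobs.append({
            "id": int(job_id),
            "output": output,
            "exit_code": int(e_level),
            "real_s": float(times.split("/")[0]),
            "command": command,
        })
    return jobs


listing = """\
ID   State      Output               E-Level  Times(r/u/s)   Command [run=0/1]
0    finished   /tmp/ts-out.0U5c1f   0        0.07/0.04/0.01 python -m experiments.mps --exp E1 --parallel 2 --out ./runs --seed 1000 --mps-group MPS_E1_x2
1    finished   /tmp/ts-out.e8bV0e   0        0.09/0.07/0.01 python -m experiments.mps --exp E1 --parallel 3 --out ./runs --seed 2000 --mps-group MPS_E1_x3
"""
jobs = parse_tsp_finished(listing)
suspicious = [j["id"] for j in jobs if j["real_s"] < 1.0]
print(suspicious)  # [0, 1]
```

For any flagged job, `tsp -c <id>` prints the captured output, which is the quickest way to see what the command actually did.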

In [None]:
# Try MPS E1 x2 directly (without tsp)
!python -m experiments.mps --exp E1 --parallel 2 --out ./runs --seed 1000 --mps-group MPS_E1_x2

[?1l>[32m[7m    PID USER DEV    TYPE  GPU [36m       GPU MEM[32m    CPU  HOST MEM Command           [24;1H[m[mF2[36m[7mSetup   [m[mF6[36m[7mSort    [m[mF9[36m[7mKill    [m[mF10[36m[7mQuit    [m[mF12[36m[7mSave Config                       [4h [4l[24;56H[m[mmq[36mq[33mq[36mq[33mq[36mq[33mq[36mq[33mq[36mq[33mq[36mq[33mq[36mq[33mq[36mq[33mq[36mq[33mq[36mq[33mq[36mq[33mq[36mq[33mq[36mq[33mq[36mq[33mq[36mq[33mq[36mq[33mq[36mq[33mq[mx[17;4Hmqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqj[18;31H[24;1H[2J[?47l8

### Commit and Push of MPS Results

In [None]:
!git add runs/ experiments/mps.py
!git commit -m "Resultados MPS: E1x2, E1x3, E3x2, E2x2, E1+E3, E6x2" || echo "No changes to commit."
!git push origin main

### Stop MPS (optional)
Run this at the end of the session if you want to free resources.

In [None]:
# Stop the MPS daemon
!echo quit | nvidia-cuda-mps-control 2>/dev/null || echo "MPS was not running"