# GANNs with friends - Google Colab worker

Run this notebook to participate in distributed GAN training using Google Colab's free GPU.

## Setup environment

Clone the repository and install the required packages.

In [None]:
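# Clone the repository on first run; on later runs, update the existing checkout
import os
if os.path.basename(os.getcwd()) == 'GANNs-with-friends':
    # Already inside the repository (e.g. this cell was re-run), so just update
    !git pull
elif not os.path.exists('GANNs-with-friends'):
    !git clone https://github.com/gperdrizet/GANNs-with-friends.git
    %cd GANNs-with-friends
else:
    %cd GANNs-with-friends
    !git pull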

In [None]:
# Install dependencies
# Note: You may see warnings about numpy version conflicts with Colab's pre-installed
# packages (jax, opencv, etc.). These can be safely ignored - we don't use those packages.
!pip install --quiet -r requirements.txt

In [None]:
# Check GPU availability
# If this shows "No GPU", go to Runtime > Change runtime type > T4 GPU
import torch
if torch.cuda.is_available():
    print(f'GPU: {torch.cuda.get_device_name(0)}')
    print(f'Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB')
else:
    print('No GPU detected! Go to Runtime > Change runtime type and select T4 GPU')

## Download CelebA dataset

This step downloads the CelebA face dataset from Hugging Face (~1.3 GB compressed).

The dataset is read directly from the zip file without extracting it. This avoids writing a large number of small image files to disk, which can be slow on cloud VMs like Colab.
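
As a rough illustration of what reading directly from a zip file means, the sketch below lists the image members of an archive and reads one of them into memory with Python's standard `zipfile` module. It is not the repository's actual loader: the helper names, file names and archive contents are made up so that the example runs on its own.

```python
import io
import zipfile

IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png')


def list_image_members(archive: zipfile.ZipFile) -> list:
    """Return the sorted names of image files stored in the archive."""
    return sorted(
        name for name in archive.namelist()
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )


def read_member_bytes(archive: zipfile.ZipFile, name: str) -> bytes:
    """Read one member's bytes into memory; nothing is written to disk."""
    with archive.open(name) as member:
        return member.read()


# Build a tiny stand-in archive in memory (hypothetical file names and contents)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('img/000001.jpg', b'fake-jpeg-bytes-1')
    archive.writestr('img/000002.jpg', b'fake-jpeg-bytes-2')
    archive.writestr('README.txt', b'not an image')

# Open the archive once and pull individual files out on demand
with zipfile.ZipFile(buffer) as archive:
    names = list_image_members(archive)
    first_image = read_member_bytes(archive, names[0])

print(names)
print(len(first_image))
```

In a real loader the returned bytes would typically be decoded with an image library, for example by wrapping them in `io.BytesIO` and passing that to Pillow's `Image.open`.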

In [None]:
# Download dataset from Hugging Face if not already present
import os
import shutil
import sys
sys.path.insert(0, 'src')

from utils import load_config, ensure_dataset_available

# Create config from template if it doesn't exist yet
if not os.path.exists('config.yaml'):
    shutil.copy('config.yaml.template', 'config.yaml')
    print('Created config.yaml from template')

# Load config to get the Hugging Face repo and dataset path
config = load_config('config.yaml')

# Download if needed (reads directly from zip, no extraction required)
ensure_dataset_available(config)

## Configure worker

The config file was created from the template. Review and customize your settings.

**The default config should work out of the box** - it's pre-configured with the database credentials for the class training session.

**Optional customization:**
- Your name (shown on the dashboard leaderboard)
- Batch size (tune for your GPU memory - Colab T4 can usually handle 64)

In [None]:
# Verify config exists (should have been created in the download step)
if os.path.exists('config.yaml'):
    print('config.yaml exists - optionally edit it with your settings before running the worker')
else:
    # Create from template if somehow missing
    import shutil
    shutil.copy('config.yaml.template', 'config.yaml')
    print('Created config.yaml from template')

In [None]:
# Display current config
!cat config.yaml

### Edit config.yaml (optional)

Use the file browser in the left sidebar to open and edit `config.yaml`.

**Worker section** (customize to identify yourself):
```yaml
worker:
  name: YourName          # Shows on dashboard leaderboard
  batch_size: 64          # Colab T4 can handle 64; use 32 if you get memory errors
```
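
If you would rather change these two values from a code cell than from the file browser, something along the following lines could work. This is only a sketch: `apply_worker_settings` and `update_config_file` are hypothetical helpers, not part of the repository, and the file-writing helper assumes PyYAML is installed.

```python
import copy
from typing import Optional


def apply_worker_settings(config: dict, name: Optional[str] = None,
                          batch_size: Optional[int] = None) -> dict:
    """Return a copy of config with the worker name and/or batch size changed."""
    if batch_size is not None and batch_size <= 0:
        raise ValueError('batch_size must be a positive integer')
    updated = copy.deepcopy(config)
    worker = updated.setdefault('worker', {})
    if name is not None:
        worker['name'] = name
    if batch_size is not None:
        worker['batch_size'] = batch_size
    return updated


def update_config_file(path: str = 'config.yaml', **settings) -> None:
    """Rewrite the YAML file in place (hypothetical helper; assumes PyYAML)."""
    import yaml  # imported lazily so apply_worker_settings has no dependencies
    with open(path) as f:
        config = yaml.safe_load(f) or {}
    config = apply_worker_settings(config, **settings)
    with open(path, 'w') as f:
        yaml.safe_dump(config, f, sort_keys=False)


# Try the pure-Python part on an in-memory config
example = {'worker': {'name': 'YourName', 'batch_size': 64}}
smaller = apply_worker_settings(example, name='Ada', batch_size=32)
print(smaller['worker'])
```

Calling `update_config_file(name='Ada', batch_size=32)` would apply the same change to `config.yaml`. Note that rewriting the file with `yaml.safe_dump` drops any comments it contained, so editing by hand may be preferable if you want to keep them.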

**Database section** (already configured - only change it if the coordinator gives you different credentials):
```yaml
database:
  host: perdrizet.org
  port: 54321
  database: distributed_gan
  user: admin
  password: <provided>
```
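
Before starting the worker, it can be handy to confirm that the database host and port are reachable from your Colab VM. The sketch below is one way to do that with the standard `socket` module. `can_reach` is a hypothetical helper, not something the repository provides, and it only tests that a TCP connection can be opened; it does not check the credentials.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures
        return False


# Demonstrate against a throwaway local listener so the example is self-contained
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))  # port 0 lets the OS pick a free port
listener.listen(1)
local_port = listener.getsockname()[1]

reachable = can_reach('127.0.0.1', local_port, timeout=2.0)
listener.close()
print(f'listener reachable: {reachable}')
```

To check the class database, you would call `can_reach` with the `host` and `port` values from your `config.yaml`, for example `can_reach('perdrizet.org', 54321)`.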

## Run worker

Start the training worker. It will:
1. Connect to the database and register your worker
2. Download the current model weights
3. Process work units and upload gradients
4. Repeat until training completes or you stop it

Your progress will appear on the coordinator's dashboard.
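
The four steps above can be pictured as a simple loop. The sketch below is purely illustrative and is not the code in `src/worker.py`: `FakeCoordinator`, `compute_gradient` and `run_worker` are hypothetical names, the "database" is an in-memory object, and the "gradient" is a placeholder number rather than the result of real training.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCoordinator:
    """Hypothetical in-memory stand-in for the database the worker talks to."""
    pending_units: list = field(default_factory=lambda: [0, 1, 2])
    weights: float = 1.0
    workers: list = field(default_factory=list)
    received: list = field(default_factory=list)

    def register(self, name):
        self.workers.append(name)

    def current_weights(self):
        return self.weights

    def next_work_unit(self):
        # None signals that no work is left
        return self.pending_units.pop(0) if self.pending_units else None

    def upload_gradients(self, name, unit, gradient):
        self.received.append((name, unit, gradient))


def compute_gradient(weights, unit):
    """Placeholder for a real forward/backward pass on one batch of images."""
    return weights * (unit + 1) * 0.1


def run_worker(coordinator, name):
    coordinator.register(name)                        # 1. register the worker
    completed = 0
    while True:
        weights = coordinator.current_weights()       # 2. get current weights
        unit = coordinator.next_work_unit()
        if unit is None:                              # 4. stop when work runs out
            break
        gradient = compute_gradient(weights, unit)    # 3. process the work unit
        coordinator.upload_gradients(name, unit, gradient)  # 3. upload the result
        completed += 1
    return completed


coordinator = FakeCoordinator()
units_done = run_worker(coordinator, 'YourName')
print(f'{units_done} work units completed')
```

Fetching the weights at the top of every iteration reflects the idea that the shared model may have changed since the previous work unit; whether the real worker refreshes them that often is an assumption of this sketch.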

In [None]:
!python src/worker.py