# U-Net Training on Google Colab

This notebook sets up and trains the U-Net segmentation model using your Google Drive dataset.

## Prerequisites
- Your dataset is stored at the top level of your Google Drive (My Drive) in `sentinel2_datasets/`, with the structure:
  - `sentinel2_datasets/train/train_images/`
  - `sentinel2_datasets/train/train_masks/`
  - `sentinel2_datasets/val/val_images/`
  - `sentinel2_datasets/val/val_masks/`
  - `sentinel2_datasets/test/test_images/`
  - `sentinel2_datasets/test/test_masks/`

## Setup
1. Mount Google Drive
2. Get the training code (choose one option):

   **Option A: Clone Repository (Recommended)**:
   ```bash
   !git clone https://github.com/ns530/skycrop.git
   %cd skycrop/ml-training
   ```

   **Option B: Upload to Drive (Alternative)**:
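   Upload the `ml-training` folder to your Google Drive and navigate to it. The navigation cell in this notebook assumes it is at `MyDrive/SkyCrop/ml-training`; adjust the path if you upload it elsewhere.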

3. Install dependencies
4. Run training

In [None]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Navigate to your project directory
import os

# Option A: the repository was cloned into the Colab runtime
repo_dir = '/content/skycrop/ml-training'
# Option B: the ml-training folder was uploaded to Drive (adjust this path if needed)
drive_dir = '/content/drive/MyDrive/SkyCrop/ml-training'

# Prefer the cloned repository if it exists; otherwise fall back to the Drive copy
os.chdir(repo_dir if os.path.isdir(repo_dir) else drive_dir)
print("Current directory:", os.getcwd())

In [None]:
# Install dependencies
!pip install -r requirements.txt

In [None]:
# Set environment variables
import os
os.environ['DATA_DIR'] = '/content/drive/MyDrive'  # Folder that contains sentinel2_datasets/
os.environ['RUNS_DIR'] = '/content/drive/MyDrive/runs'  # Optional: save runs to Drive so they persist after the session ends

# Verify dataset path
dataset_path = '/content/drive/MyDrive/sentinel2_datasets'
if os.path.exists(dataset_path):
    print(f"Dataset found at: {dataset_path}")
    print("Contents:", os.listdir(dataset_path))
else:
    print(f"Dataset not found at: {dataset_path}")
    print("Please ensure your dataset is at the correct location.")
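
The check above only confirms that the top-level folder exists. As a rough sketch, the layout listed under Prerequisites could also be verified one sub-folder at a time. `missing_dataset_dirs` is a hypothetical helper written for this notebook, not part of the training code, and the path assumes the dataset sits at the top level of My Drive.

```python
import os

# Split names taken from the Prerequisites section
EXPECTED_SPLITS = ("train", "val", "test")


def missing_dataset_dirs(root):
    """Return the expected sub-folders that do not exist under `root`.

    Hypothetical helper: expects <root>/<split>/<split>_images and
    <root>/<split>/<split>_masks for each split.
    """
    missing = []
    for split in EXPECTED_SPLITS:
        for kind in ("images", "masks"):
            path = os.path.join(root, split, f"{split}_{kind}")
            if not os.path.isdir(path):
                missing.append(path)
    return missing


# Assumed location; adjust if your dataset lives elsewhere
dataset_path = '/content/drive/MyDrive/sentinel2_datasets'
missing = missing_dataset_dirs(dataset_path)
if missing:
    print("Missing folders:")
    for path in missing:
        print("  ", path)
else:
    print("All expected dataset folders are present.")
```

An empty list means all six folders were found; anything else points at the exact paths to fix before starting training.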

In [None]:
# Run training
!python train_unet.py --config config.yaml

## Notes
- Training may take several hours depending on your dataset size and Colab runtime.
- Checkpoints and logs will be saved to the runs directory.
- If you encounter memory issues, reduce `batch_size` in `config.yaml`.
- For GPU acceleration, select a GPU under Runtime > Change runtime type before running any cells; changing the runtime type restarts the session.
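
To confirm that a GPU is actually attached before starting a long run, one framework-agnostic option is to query `nvidia-smi`. The snippet below is a minimal sketch; `gpu_summary` is a hypothetical helper, and it only reports whether an NVIDIA GPU is visible to the runtime, not whether the training script will use it.

```python
import shutil
import subprocess


def gpu_summary():
    """Return the output of `nvidia-smi -L`, or None if no NVIDIA GPU is visible.

    Hypothetical helper for this notebook; it does not depend on the
    deep learning framework used by the training script.
    """
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


summary = gpu_summary()
if summary:
    print(summary)
else:
    print("No GPU detected. Check Runtime > Change runtime type.")
```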