# 🚀 Crack Detection - Complete Training Pipeline

This notebook installs everything automatically and runs the full training pipeline.

**Just run the cells in order!**

---

## 📋 What this notebook does:

1. ✅ Checks the environment (GPU, Python)
2. ✅ Installs all dependencies
3. ✅ Configures Kaggle credentials
4. ✅ Downloads the dataset (2.1 GB)
5. ✅ Runs the full training (50 epochs)
6. ✅ Runs inference and evaluation

**Total runtime: ~2-3 hours**

## ⚙️ Step 1: Check the Environment

In [None]:
# Check Python and GPU
import sys

print(f"Python version: {sys.version}")
print(f"Python executable: {sys.executable}")

# Check the GPU
try:
    import torch
    print(f"\n✅ PyTorch already installed: {torch.__version__}")
    print(f"CUDA available: {torch.cuda.is_available()}")
    if torch.cuda.is_available():
        print(f"GPU: {torch.cuda.get_device_name(0)}")
        print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.2f} GB")
except ImportError:
    print("\n⚠️ PyTorch not installed yet (it will be installed in the next step)")
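The check above only prints what it finds. A minimal sketch of turning the same check into a device choice is shown below; `pick_device` is a hypothetical helper, not something taken from the project scripts, and it falls back to the CPU when PyTorch is missing.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily: it may not be installed yet
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Selected device: {device}")
```

The returned string can be passed to `torch.device(...)` once PyTorch is installed.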

## 📦 Step 2: Install Dependencies

**This automatically installs all the required libraries:**
- PyTorch and torchvision (CUDA 12.1 wheels, which use the GPU when one is available)
- OpenCV
- Matplotlib, Pillow, NumPy
- Tensorboard, tqdm
- Kaggle CLI

In [None]:
# Install all dependencies
!pip install --quiet --upgrade pip
!pip install --quiet torch torchvision --index-url https://download.pytorch.org/whl/cu121
!pip install --quiet opencv-python matplotlib Pillow tqdm tensorboard numpy kaggle

print("✅ All dependencies installed!")
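Because `pip install --quiet` hides most output, a failed install can go unnoticed. The sketch below checks that each package can be located without importing it; `REQUIRED_MODULES` and `missing_modules` are illustrative names, and the list assumes the usual import names for the packages installed above.

```python
from importlib.util import find_spec

# Import names differ from some pip package names (opencv-python -> cv2, Pillow -> PIL).
REQUIRED_MODULES = [
    "torch", "torchvision", "cv2", "matplotlib", "PIL",
    "tqdm", "tensorboard", "numpy", "kaggle",
]


def missing_modules(names):
    """Return the module names that cannot be located, without importing them."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
print("Missing modules:", missing if missing else "none")
```

`find_spec` is used instead of `import` so that the check stays fast and has no side effects.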

## 📥 Step 3: Clone the GitHub Repository

In [None]:
# Clone the repository (or update it if it already exists)
import os
from pathlib import Path

REPO_URL = "https://github.com/Biobay/DeepLearningHard_ISWM.git"
PROJECT_DIR = Path("/content/DeepLearningHard_ISWM")  # For Google Colab
# For local Jupyter: PROJECT_DIR = Path.home() / "DeepLearningHard_ISWM"

if PROJECT_DIR.exists():
    print("Repository already exists, pulling updates...")
    !git -C {PROJECT_DIR} pull
else:
    print("Cloning repository...")
    !git clone {REPO_URL} {PROJECT_DIR}

# Move into the project directory
os.chdir(PROJECT_DIR)
print(f"\n✅ Working directory: {os.getcwd()}")

## 🔑 Step 4: Configure Kaggle Credentials

**IMPORTANT:** Replace the placeholders with your own credentials!

Get them from: https://www.kaggle.com/settings → API → Create New Token

In [None]:
# ⚠️ REPLACE WITH YOUR OWN KAGGLE CREDENTIALS!
# Never commit or share a notebook that contains a real API key.
import json
import os
from pathlib import Path

KAGGLE_USERNAME = "your_kaggle_username"  # ← Replace with yours
KAGGLE_KEY = "your_kaggle_api_key"        # ← Replace with yours

# Set up the credentials file read by the Kaggle CLI
kaggle_dir = Path.home() / ".kaggle"
kaggle_dir.mkdir(parents=True, exist_ok=True)

kaggle_config = {
    "username": KAGGLE_USERNAME,
    "key": KAGGLE_KEY
}

kaggle_file = kaggle_dir / "kaggle.json"
with open(kaggle_file, 'w') as f:
    json.dump(kaggle_config, f)

# Restrict the file to the current user (the Kaggle CLI warns otherwise)
os.chmod(kaggle_file, 0o600)

print("✅ Kaggle credentials configured!")
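Typing the key into a cell means it is saved with the notebook. One alternative, sketched below, is to read it from the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, which the Kaggle CLI also recognizes directly. `load_kaggle_credentials` and `write_kaggle_config` are hypothetical helpers, and the demo writes to a temporary directory with made-up values rather than to `~/.kaggle`.

```python
import json
import os
import tempfile
from pathlib import Path


def load_kaggle_credentials(env):
    """Build the kaggle.json payload from a mapping such as os.environ."""
    missing = [name for name in ("KAGGLE_USERNAME", "KAGGLE_KEY") if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {"username": env["KAGGLE_USERNAME"], "key": env["KAGGLE_KEY"]}


def write_kaggle_config(config, kaggle_dir):
    """Write kaggle.json into kaggle_dir and restrict it to the current user."""
    kaggle_dir = Path(kaggle_dir)
    kaggle_dir.mkdir(parents=True, exist_ok=True)
    kaggle_file = kaggle_dir / "kaggle.json"
    kaggle_file.write_text(json.dumps(config))
    os.chmod(kaggle_file, 0o600)
    return kaggle_file


# Demo with placeholder values and a throwaway directory.
demo_env = {"KAGGLE_USERNAME": "demo_user", "KAGGLE_KEY": "demo_key"}
with tempfile.TemporaryDirectory() as tmp:
    config_file = write_kaggle_config(load_kaggle_credentials(demo_env), tmp)
    print(f"Wrote {config_file.name}")
```

In the notebook itself the call would be `write_kaggle_config(load_kaggle_credentials(os.environ), Path.home() / ".kaggle")`.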

## 📥 Step 5: Download the Dataset from Kaggle

**This downloads 2.1 GB and can take 5-10 minutes**

In [None]:
# Download the dataset from Kaggle
import zipfile
import shutil

KAGGLE_DATASET = "lakshaymiddha/crack-segmentation-dataset"
dataset_path = PROJECT_DIR / "dataset"
train_images = dataset_path / "train" / "images"

# Skip the download if the dataset is already in place
existing_images = list(train_images.glob("*.jpg")) if train_images.exists() else []
if existing_images:
    print(f"✅ Dataset already present ({len(existing_images)} images)")
else:
    print("📥 Downloading dataset from Kaggle (2.1 GB)...")
    !kaggle datasets download -d {KAGGLE_DATASET} -p {PROJECT_DIR}

    # Locate the downloaded archive; fail clearly if the download did not work
    zip_files = list(PROJECT_DIR.glob("*.zip"))
    if not zip_files:
        raise FileNotFoundError("No .zip archive found: check the Kaggle credentials and rerun this cell")
    zip_file = zip_files[0]
    print(f"📦 Extracting {zip_file.name}...")

    with zipfile.ZipFile(zip_file, 'r') as zip_ref:
        zip_ref.extractall(PROJECT_DIR)

    zip_file.unlink()

    # Reorganize the extracted folders
    print("📁 Organizing dataset structure...")
    dataset_path.mkdir(exist_ok=True)

    mappings = {
        'train_images': dataset_path / 'train' / 'images',
        'train_masks': dataset_path / 'train' / 'masks',
        'test_images': dataset_path / 'test' / 'images',
        'test_masks': dataset_path / 'test' / 'masks',
    }

    for src_name, dest_path in mappings.items():
        src_path = PROJECT_DIR / src_name
        if src_path.exists():
            dest_path.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src_path), str(dest_path))

    print("✅ Dataset downloaded and organized!")

# Verify
train_count = len(list((dataset_path / "train" / "images").glob("*.jpg")))
test_count = len(list((dataset_path / "test" / "images").glob("*.jpg")))
print(f"\n📊 Training images: {train_count}")
print(f"📊 Test images: {test_count}")
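The counts above only cover the images. A further sanity check, sketched below, looks for images that have no mask. It assumes that each mask has the same file name as its image, which is not confirmed here; `unpaired_images` is a hypothetical helper and the demo runs on empty placeholder files in a temporary directory.

```python
import tempfile
from pathlib import Path


def unpaired_images(images_dir, masks_dir, pattern="*.jpg"):
    """Return the sorted image file names that have no same-named mask."""
    mask_names = {p.name for p in Path(masks_dir).glob(pattern)}
    return sorted(
        p.name for p in Path(images_dir).glob(pattern) if p.name not in mask_names
    )


# Demo: two images, only one of which has a mask.
with tempfile.TemporaryDirectory() as tmp:
    images = Path(tmp, "images")
    masks = Path(tmp, "masks")
    images.mkdir()
    masks.mkdir()
    for name in ("a.jpg", "b.jpg"):
        (images / name).touch()
    (masks / "a.jpg").touch()
    demo_unpaired = unpaired_images(images, masks)

print(demo_unpaired)  # → ['b.jpg']
```

In the notebook the call would be `unpaired_images(dataset_path / "train" / "images", dataset_path / "train" / "masks")`; an empty list means every image has a mask.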

## 📁 Step 6: Set Up Directories

In [None]:
# Create the output directories
directories = ['models', 'checkpoints', 'predictions', 'runs']

for dir_name in directories:
    dir_path = PROJECT_DIR / dir_name
    dir_path.mkdir(exist_ok=True)

print("✅ Directories created!")

## 🚀 Step 7: TRAINING (50 Epochs)

**This takes ~2-3 hours on a GPU**

This step runs `train_cloud.py`, which includes:
- A convolutional autoencoder
- MSE loss for reconstruction
- Automatic checkpoints every 5 epochs
- Automatic resume if training is interrupted
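The checkpoint and resume behavior listed above can be sketched in plain Python. This is not the code in `train_cloud.py`, which is not shown here: the `train` function, the JSON checkpoint, and the placeholder loop body are all assumptions made for illustration. A real PyTorch loop would typically store `model.state_dict()` and `optimizer.state_dict()` with `torch.save` instead.

```python
import json
import tempfile
from pathlib import Path

CHECKPOINT_EVERY = 5  # matches "every 5 epochs" in the list above


def train(total_epochs, checkpoint_file, resume=True):
    """Run a placeholder loop with periodic checkpoints; return the epochs executed."""
    checkpoint_file = Path(checkpoint_file)
    start_epoch = 0
    if resume and checkpoint_file.exists():
        # Continue after the last epoch that was safely checkpointed.
        start_epoch = json.loads(checkpoint_file.read_text())["epoch"]
    executed = []
    for epoch in range(start_epoch + 1, total_epochs + 1):
        executed.append(epoch)  # a real loop would train for one epoch here
        if epoch % CHECKPOINT_EVERY == 0:
            checkpoint_file.write_text(json.dumps({"epoch": epoch}))
    return executed


with tempfile.TemporaryDirectory() as tmp:
    ckpt = Path(tmp, "checkpoint.json")
    first_run = train(7, ckpt)    # simulates an interruption after epoch 7
    second_run = train(12, ckpt)  # resumes from the epoch-5 checkpoint

print(first_run)   # → [1, 2, 3, 4, 5, 6, 7]
print(second_run)  # → [6, 7, 8, 9, 10, 11, 12]
```

Epochs 6 and 7 are repeated in the second run because the last checkpoint was written at epoch 5; that is the cost of checkpointing every few epochs rather than every epoch.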

In [None]:
# Full training run
print("🚀 Starting training...")
print("=" * 60)

!python train_cloud.py --resume

print("\n✅ Training completed!")

## 🔮 Step 8: INFERENCE

Generates prediction masks for all test images
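How `inference.py` turns the autoencoder output into a mask is not shown in this notebook. One common approach with reconstruction-based models, sketched here purely as an assumption, is to threshold the per-pixel squared reconstruction error. The `error_mask` helper and the tiny 2x2 "images" are hypothetical.

```python
def error_mask(original, reconstruction, threshold):
    """Mark pixels whose squared reconstruction error exceeds the threshold."""
    return [
        [1 if (o - r) ** 2 > threshold else 0 for o, r in zip(orig_row, rec_row)]
        for orig_row, rec_row in zip(original, reconstruction)
    ]


# Pixel intensities in [0, 1]; only the top-right pixel is reconstructed badly.
original = [[0.9, 0.1], [0.8, 0.2]]
reconstruction = [[0.85, 0.7], [0.8, 0.25]]

mask = error_mask(original, reconstruction, threshold=0.1)
print(mask)  # → [[0, 1], [0, 0]]
```

The threshold is the quantity that an evaluation step can then tune against the ground-truth masks.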

In [None]:
# Inference: generate the predicted masks
print("🔮 Running inference...")
!python inference.py
print("\n✅ Inference completed!")

## 📊 Step 9: EVALUATION

Computes the IoU, Dice, and F1-score metrics and optimizes the threshold
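The metrics named above can be written out for binary masks as follows. This is a sketch, not the code in `evaluate.py`: the function names are illustrative, masks are assumed to be flattened into 1D lists of 0/1 values, and the threshold search simply keeps the candidate with the highest IoU. For binary masks the Dice coefficient and the F1-score are the same quantity, 2·TP / (2·TP + FP + FN).

```python
def confusion_counts(pred, target):
    """Count true positives, false positives, and false negatives."""
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)
    return tp, fp, fn


def iou(pred, target):
    """Intersection over union: TP / (TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, target)
    denom = tp + fp + fn
    return tp / denom if denom else 1.0  # two empty masks agree perfectly


def dice(pred, target):
    """Dice coefficient (equal to F1): 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def best_threshold(scores, target, candidates):
    """Return (threshold, IoU) for the first candidate with the highest IoU."""
    results = [(iou([s > th for s in scores], target), th) for th in candidates]
    best_iou, best_th = max(results, key=lambda r: r[0])
    return best_th, best_iou


scores = [0.9, 0.8, 0.4, 0.1]  # per-pixel scores from the model
target = [1, 1, 0, 0]          # ground-truth mask

print(best_threshold(scores, target, candidates=[0.2, 0.5, 0.7]))  # → (0.5, 1.0)
```

At threshold 0.2 the third pixel is a false positive (IoU 2/3), while 0.5 and 0.7 both give a perfect mask; `max` keeps the first of the tied candidates.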

In [None]:
# Evaluation: compute the metrics
print("📊 Running evaluation...")
!python evaluate.py
print("\n✅ Evaluation completed!")

## ✅ DONE!

### 📁 Results are available in:

- **Trained model**: `models/best_autoencoder.pth`
- **Predicted masks**: `predictions/*.jpg`
- **Visualizations**: `results_visualization.png`, `threshold_optimization.png`

### 💾 To download the results:

#### On Google Colab:
```python
from google.colab import files
files.download('models/best_autoencoder.pth')
```

#### On local Jupyter:
The files are already in the project directory!

In [None]:
# Show a summary of the results
print("="*60)
print("📊 FINAL SUMMARY")
print("="*60)

# Model size
model_path = PROJECT_DIR / "models" / "best_autoencoder.pth"
if model_path.exists():
    size_mb = model_path.stat().st_size / 1024 / 1024
    print(f"\n✅ Model: {size_mb:.2f} MB")

# Predictions count
predictions = list((PROJECT_DIR / "predictions").glob("*.jpg"))
print(f"✅ Predictions generated: {len(predictions)}")

# Visualizations
viz_files = list(PROJECT_DIR.glob("*.png"))
print(f"✅ Visualizations: {len(viz_files)}")

print("\n" + "="*60)
print("🎉 ALL DONE!")
print("="*60)