# 🎯 Practical Guide - Semantic Segmentation
## For the Deep Learning Practical Exam

This notebook contains ready-to-use templates and examples for practical exams on semantic image segmentation.

### What Is Semantic Segmentation?
- **Pixel-by-pixel classification**: every pixel in the image is assigned to a category
- **Example**: separating the foreground (cat/dog) from the background in an image
- **Applications**: background removal, medical image segmentation, object detection, etc.
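
To make "one label per pixel" concrete, here is a minimal sketch with a made-up 4x4 mask. The class meanings (0 = foreground, 1 = background, 2 = contour) follow the convention used later in this notebook; the array values themselves are invented for illustration.

```python
import numpy as np

# Hypothetical 4x4 mask: 0 = foreground, 1 = background, 2 = contour.
mask = np.array([
    [1, 1, 1, 1],
    [1, 2, 2, 1],
    [1, 2, 0, 2],
    [1, 1, 2, 1],
], dtype="uint8")

# One label per pixel: the mask has the same height and width as the image.
class_ids, pixel_counts = np.unique(mask, return_counts=True)
counts_by_class = {int(c): int(n) for c, n in zip(class_ids, pixel_counts)}
print(counts_by_class)
```

Counting pixels per class like this is also a quick way to see how unbalanced the classes are.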


## 📚 1. ESSENTIAL IMPORTS


In [None]:
# Basic imports
import numpy as np
import matplotlib.pyplot as plt
import os
import glob
import tarfile
import urllib.request
import random

# TensorFlow/Keras
from tensorflow import keras
from keras import layers
from keras.utils import load_img, img_to_array, array_to_img

# For visualization
plt.style.use('default')


## 📥 2. DATA DOWNLOAD AND LOADING


In [None]:
# 🔥 Function to download and extract the Oxford-IIIT Pets dataset
def download_pets_dataset(data_dir="data"):
    """
    Downloads and extracts the Oxford-IIIT Pets dataset for semantic segmentation
    """
    os.makedirs(data_dir, exist_ok=True)

    # Dataset URLs
    images_url = 'http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz'
    annotations_url = 'http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz'

    print("📥 Downloading images.tar.gz...")
    images_path = os.path.join(data_dir, 'images.tar.gz')
    urllib.request.urlretrieve(images_url, images_path)

    print("📥 Downloading annotations.tar.gz...")
    annotations_path = os.path.join(data_dir, 'annotations.tar.gz')
    urllib.request.urlretrieve(annotations_url, annotations_path)

    # Extract the archives
    print("📦 Extracting images.tar.gz...")
    with tarfile.open(images_path, 'r:gz') as tar:
        tar.extractall(data_dir)

    print("📦 Extracting annotations.tar.gz...")
    with tarfile.open(annotations_path, 'r:gz') as tar:
        tar.extractall(data_dir)

    print("✅ Download and extraction complete!")
    return data_dir

# Run: download_pets_dataset()


In [None]:
# 🔥 Function to prepare image and mask paths
def prepare_dataset_paths(input_dir, target_dir):
    """
    Builds matching lists of image paths and mask paths

    Args:
        input_dir: Directory with the original images
        target_dir: Directory with the segmentation masks

    Returns:
        input_img_paths: List of image paths
        target_img_paths: List of mask paths
    """
    # Collect all JPG images
    input_img_paths = sorted(glob.glob(os.path.join(input_dir, "*.jpg")))

    # Build the matching mask path for each image (same base name, .png extension)
    target_img_paths = []
    for img_path in input_img_paths:
        filename = os.path.basename(img_path)
        name_without_ext = os.path.splitext(filename)[0]
        mask_path = os.path.join(target_dir, name_without_ext + ".png")
        target_img_paths.append(mask_path)

    print(f"✅ Found {len(input_img_paths)} images")
    print(f"📁 First 3 images: {input_img_paths[:3]}")
    print(f"📁 First 3 masks: {target_img_paths[:3]}")
    
    return input_img_paths, target_img_paths
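
`prepare_dataset_paths` derives each mask path from the image name and does not check that the file exists. The sketch below shows one way to catch missing masks before loading. `find_missing_masks` is a hypothetical helper, not part of the notebook, and the file names in the demo are made up.

```python
import os
import tempfile

def find_missing_masks(input_img_paths, target_img_paths):
    """Return the image paths whose paired mask file does not exist (hypothetical helper)."""
    return [
        img for img, mask in zip(input_img_paths, target_img_paths)
        if not os.path.isfile(mask)
    ]

# Tiny demo in a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    images = [os.path.join(tmp, "cat_1.jpg"), os.path.join(tmp, "dog_2.jpg")]
    masks = [os.path.splitext(p)[0] + ".png" for p in images]
    open(masks[0], "w").close()  # only the first mask exists on disk
    missing = find_missing_masks(images, masks)
    missing_names = [os.path.basename(p) for p in missing]

print(missing_names)
```

Running such a check once after building the path lists gives a clearer error than a failure in the middle of the loading loop.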


In [None]:
# 🔥 Function to load and preprocess the full dataset
def load_segmentation_dataset(input_img_paths, target_img_paths, img_size=(200, 200), 
                             val_split=0.15, shuffle_seed=1337):
    """
    Loads the full semantic segmentation dataset into memory

    Args:
        input_img_paths: List of image paths (shuffled in place)
        target_img_paths: List of mask paths (shuffled in place)
        img_size: Size to resize to (height, width)
        val_split: Fraction held out for validation (e.g. 0.15 = 15%)
        shuffle_seed: Seed for shuffling

    Returns:
        train_input_imgs, train_targets, val_input_imgs, val_targets
    """
    # Shuffle both lists with the same seed so image/mask pairs stay aligned
    random.Random(shuffle_seed).shuffle(input_img_paths)
    random.Random(shuffle_seed).shuffle(target_img_paths)

    # Helper functions
    def path_to_input_image(path):
        return img_to_array(load_img(path, target_size=img_size))

    def path_to_target(path):
        img = img_to_array(
            load_img(path, target_size=img_size, color_mode="grayscale")
        )
        # Shift trimap labels 1, 2, 3 to 0, 1, 2 (foreground, background, contour)
        img = img.astype("uint8") - 1
        return img

    # Load all images
    num_imgs = len(input_img_paths)
    input_imgs = np.zeros((num_imgs,) + img_size + (3,), dtype="float32")
    targets = np.zeros((num_imgs,) + img_size + (1,), dtype="uint8")

    print(f"📊 Loading {num_imgs} images...")
    for i in range(num_imgs):
        if (i + 1) % 1000 == 0:
            print(f"  Processed {i + 1}/{num_imgs} images...")
        input_imgs[i] = path_to_input_image(input_img_paths[i])
        targets[i] = path_to_target(target_img_paths[i])

    # Split into training and validation sets.
    # Index-based slicing also works when num_val_samples is 0
    # (a negative-index slice such as [:-0] would return an empty array).
    num_val_samples = int(num_imgs * val_split)
    num_train_samples = num_imgs - num_val_samples
    train_input_imgs = input_imgs[:num_train_samples]
    train_targets = targets[:num_train_samples]
    val_input_imgs = input_imgs[num_train_samples:]
    val_targets = targets[num_train_samples:]

    print("✅ Dataset loaded:")
    print(f"   Training: {len(train_input_imgs)} images")
    print(f"   Validation: {len(val_input_imgs)} images")
    
    return train_input_imgs, train_targets, val_input_imgs, val_targets
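
The loader relies on two details worth checking by hand: shuffling two equal-length lists with the same seed applies the same permutation to both, and the last `int(n * val_split)` samples are held out for validation. A small standard-library sketch with made-up file names:

```python
import random

images = [f"img_{i}.jpg" for i in range(10)]
masks = [f"img_{i}.png" for i in range(10)]

# Same seed + same list length => the same permutation is applied to both lists.
random.Random(1337).shuffle(images)
random.Random(1337).shuffle(masks)
still_paired = all(
    img.rsplit(".", 1)[0] == msk.rsplit(".", 1)[0]
    for img, msk in zip(images, masks)
)

# Hold out the last `num_val` samples for validation.
val_split = 0.15
num_val = int(len(images) * val_split)  # int() truncates, so 1.5 becomes 1
num_train = len(images) - num_val
train, val = images[:num_train], images[num_train:]

print(still_paired, len(train), len(val))
```

Because `int()` truncates, small datasets can end up with fewer validation samples than the fraction suggests.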


## 🏗️ 3. SEMANTIC SEGMENTATION MODELS


### 🔍 U-Net-Style Architecture

**Main idea:**
- **Encoder (downsampling)**: shrinks the spatial size and increases the number of filters (captures features)
- **Decoder (upsampling)**: grows the spatial size and reduces the number of filters (reconstructs the mask)
- Unlike a full U-Net, the templates below have no skip connections between encoder and decoder

**Why not use MaxPooling?**
- MaxPooling discards the position of each feature inside its pooling window
- Segmentation needs precise spatial location for every pixel
- **Solution**: use convolutions with `strides=2` (downsample) and `Conv2DTranspose` with `strides=2` (upsample)
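
The encoder and decoder only cancel out if the image size survives the round trip. With `padding="same"`, a stride-2 `Conv2D` produces `ceil(size / 2)` and a stride-2 `Conv2DTranspose` produces `size * 2`. The sketch below does only that arithmetic; `encoder_decoder_size` is a hypothetical helper and no Keras layers are built.

```python
import math

def encoder_decoder_size(size, num_down, num_up, stride=2):
    """Spatial size after `num_down` strided convs and `num_up` transposed convs."""
    for _ in range(num_down):
        size = math.ceil(size / stride)  # Conv2D with padding="same"
    for _ in range(num_up):
        size = size * stride             # Conv2DTranspose with padding="same"
    return size

print(encoder_decoder_size(200, 3, 3))  # 200 -> 100 -> 50 -> 25 -> 50 -> 100 -> 200
print(encoder_decoder_size(200, 4, 4))  # 25 rounds up to 13, so the output is 208
print(encoder_decoder_size(224, 4, 4))  # 224 is divisible by 16, so it round-trips
```

A safe rule of thumb: with N stride-2 stages on each side, choose a height and width divisible by 2^N.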


In [None]:
# 🔥 TEMPLATE 1: Basic semantic segmentation model (U-Net style)
def create_segmentation_model(img_size, num_classes):
    """
    Builds a basic model for semantic segmentation

    Args:
        img_size: Tuple (height, width) of the image; each side should be divisible by 8
        num_classes: Number of classes (e.g. 3 for foreground/background/contour)

    Returns:
        Keras model (not compiled; call model.compile() before training)
    """
    inputs = keras.Input(shape=img_size + (3,))

    # Rescale pixel values to [0, 1]
    x = layers.Rescaling(1./255)(inputs)

    # ENCODER (downsampling) - captures features
    x = layers.Conv2D(64, 3, strides=2, activation="relu", padding="same")(x)
    x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)

    x = layers.Conv2D(128, 3, strides=2, activation="relu", padding="same")(x)
    x = layers.Conv2D(128, 3, activation="relu", padding="same")(x)

    x = layers.Conv2D(256, 3, strides=2, activation="relu", padding="same")(x)
    x = layers.Conv2D(256, 3, activation="relu", padding="same")(x)

    # DECODER (upsampling) - reconstructs the mask
    x = layers.Conv2DTranspose(256, 3, activation="relu", padding="same")(x)
    x = layers.Conv2DTranspose(256, 3, activation="relu", padding="same", strides=2)(x)

    x = layers.Conv2DTranspose(128, 3, activation="relu", padding="same")(x)
    x = layers.Conv2DTranspose(128, 3, activation="relu", padding="same", strides=2)(x)

    x = layers.Conv2DTranspose(64, 3, activation="relu", padding="same")(x)
    x = layers.Conv2DTranspose(64, 3, activation="relu", padding="same", strides=2)(x)

    # Final layer: per-pixel classification
    outputs = layers.Conv2D(num_classes, 3, activation="softmax", padding="same")(x)
    
    model = keras.Model(inputs=inputs, outputs=outputs)
    return model


### 🔥 TEMPLATE 2: Deeper model (optional)


In [None]:
# Deeper model template (for larger images or harder problems)
def create_deep_segmentation_model(img_size, num_classes):
    """Deeper model with four downsampling and four upsampling stages.

    Each side of img_size should be divisible by 16 (e.g. 208 or 224),
    otherwise the output mask will not match the input size.
    """
    inputs = keras.Input(shape=img_size + (3,))
    x = layers.Rescaling(1./255)(inputs)

    # Deeper ENCODER: four stride-2 stages
    x = layers.Conv2D(32, 3, strides=2, activation="relu", padding="same")(x)
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(x)

    x = layers.Conv2D(64, 3, strides=2, activation="relu", padding="same")(x)
    x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)

    x = layers.Conv2D(128, 3, strides=2, activation="relu", padding="same")(x)
    x = layers.Conv2D(128, 3, activation="relu", padding="same")(x)

    x = layers.Conv2D(256, 3, strides=2, activation="relu", padding="same")(x)
    x = layers.Conv2D(256, 3, activation="relu", padding="same")(x)

    # DECODER: four stride-2 stages, one for each encoder stage
    x = layers.Conv2DTranspose(256, 3, activation="relu", padding="same")(x)
    x = layers.Conv2DTranspose(256, 3, activation="relu", padding="same", strides=2)(x)

    x = layers.Conv2DTranspose(128, 3, activation="relu", padding="same")(x)
    x = layers.Conv2DTranspose(128, 3, activation="relu", padding="same", strides=2)(x)

    x = layers.Conv2DTranspose(64, 3, activation="relu", padding="same")(x)
    x = layers.Conv2DTranspose(64, 3, activation="relu", padding="same", strides=2)(x)

    # Fourth upsampling stage (missing in the original template, which returned a half-size mask)
    x = layers.Conv2DTranspose(32, 3, activation="relu", padding="same")(x)
    x = layers.Conv2DTranspose(32, 3, activation="relu", padding="same", strides=2)(x)

    outputs = layers.Conv2D(num_classes, 3, activation="softmax", padding="same")(x)
    
    model = keras.Model(inputs=inputs, outputs=outputs)
    return model


## 🎨 4. VISUALIZING MASKS AND RESULTS


In [None]:
# 🔥 Function to visualize segmentation masks
def display_target(target_array):
    """
    Displays a segmentation mask (rescaled to gray levels for viewing)

    Args:
        target_array: Array with values 0, 1, 2 (foreground, background, contour)
    """
    # Map (0, 1, 2) -> (0, 127, 254) for display. The labels were already shifted
    # to start at 0 when loaded, so do not subtract 1 again (uint8 would wrap around).
    normalized_array = target_array.astype("uint8") * 127
    if len(normalized_array.shape) == 3:
        normalized_array = normalized_array[:, :, 0]

    plt.axis("off")
    plt.imshow(normalized_array, cmap='gray')
    plt.title("Segmentation Mask")

# Example usage:
# display_target(targets[0])
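
A quick NumPy check of the gray-level mapping for labels that are already in the range 0 to 2. It also shows why subtracting 1 from such a `uint8` array is unsafe: the value 0 wraps around to 255 instead of becoming negative.

```python
import numpy as np

labels = np.array([0, 1, 2], dtype="uint8")  # labels already shifted to 0..2

gray = labels * 127           # intended mapping to gray levels
wrapped = (labels - 1) * 127  # subtracting 1 again wraps 0 around to 255 first

print(gray.tolist())
print(wrapped.tolist())
```

The second result no longer orders the classes from dark to light, so the displayed mask would be misleading.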


In [None]:
# 🔥 Function to visualize model predictions
def display_mask(pred_mask):
    """
    Displays a mask predicted by the model

    Args:
        pred_mask: Prediction array with shape (H, W, num_classes) or (H, W)
    """
    # If the array has a class axis, take the argmax over it
    if len(pred_mask.shape) == 3:
        mask = np.argmax(pred_mask, axis=-1)
    else:
        mask = pred_mask

    # Rescale for display: (0, 1, 2) -> (0, 127, 254)
    mask = mask * 127
    
    plt.axis("off")
    plt.imshow(mask, cmap='gray')
    plt.title("Predicted Mask")
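
The argmax step can be checked on a tiny example. The probabilities below are made up and stand in for a softmax output of shape `(H, W, num_classes)`; no model is involved.

```python
import numpy as np

# Hypothetical softmax output for a 2x2 image and 3 classes.
pred = np.array([
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.2, 0.3, 0.5], [0.6, 0.3, 0.1]],
])

class_mask = np.argmax(pred, axis=-1)           # shape (2, 2), values in {0, 1, 2}
gray_mask = (class_mask * 127).astype("uint8")  # 0, 127, 254 for display

print(class_mask.tolist())
print(gray_mask.tolist())
```

Taking the argmax over the last axis removes the class dimension, which is why the result can go straight to `plt.imshow`.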


## 🚀 5. COMPLETE TRAINING WORKFLOW


In [None]:
# 🔥 Complete training workflow for semantic segmentation
def complete_segmentation_workflow(img_size=(200, 200), num_classes=3, epochs=50, batch_size=64):
    """
    Complete workflow for training a semantic segmentation model

    Steps:
    1. Prepare paths (assumes the dataset was already downloaded)
    2. Load data
    3. Create model
    4. Compile
    5. Set up callbacks
    6. Train
    7. Visualize predictions
    8. Plot the loss history

    Note: compare_segmentation is defined in a later cell of this notebook and
    must already be defined when this function is called.
    """

    # 1. Prepare paths (assuming the dataset was already downloaded)
    input_dir = "data/images"
    target_dir = "data/annotations/trimaps"
    input_img_paths, target_img_paths = prepare_dataset_paths(input_dir, target_dir)

    # 2. Load dataset
    train_input_imgs, train_targets, val_input_imgs, val_targets = load_segmentation_dataset(
        input_img_paths, target_img_paths, img_size=img_size
    )

    # 3. Create model
    model = create_segmentation_model(img_size, num_classes)
    model.summary()

    # 4. Compile
    model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy")
    
    # 5. Callbacks
    callbacks = [
        keras.callbacks.ModelCheckpoint(
            "segmentation_model.keras",
            save_best_only=True,
            monitor="val_loss"
        ),
        keras.callbacks.EarlyStopping(
            monitor="val_loss",
            patience=10,
            restore_best_weights=True
        )
    ]
    
    # 6. Train
    print("🚀 Starting training...")
    history = model.fit(
        train_input_imgs, train_targets,
        epochs=epochs,
        batch_size=batch_size,
        validation_data=(val_input_imgs, val_targets),
        callbacks=callbacks
    )

    # 7. Visualize results on a few validation images
    print("\n📊 Visualizing results...")
    model_best = keras.models.load_model("segmentation_model.keras")

    # Show a few predictions
    for i in range(min(3, len(val_input_imgs))):
        test_image = val_input_imgs[i]
        true_mask = val_targets[i]

        # Predict (add a batch axis, then take the single result)
        pred_mask = model_best.predict(np.expand_dims(test_image, axis=0))[0]

        # Display
        compare_segmentation(test_image, true_mask, pred_mask, title_prefix=f"Example {i+1}: ")

    # 8. Plot the loss history
    epochs_range = range(1, len(history.history["loss"]) + 1)
    loss = history.history["loss"]
    val_loss = history.history["val_loss"]
    
    plt.figure(figsize=(10, 5))
    plt.plot(epochs_range, loss, "bo", label="Training loss")
    plt.plot(epochs_range, val_loss, "b", label="Validation loss")
    plt.title("Training and Validation Loss")
    plt.xlabel("Epochs")
    plt.ylabel("Loss")
    plt.legend()
    plt.grid(True)
    plt.show()
    
    return model_best, history

# Run: model, history = complete_segmentation_workflow()
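
The workflow passes `patience=10` to `EarlyStopping`. As a mental model only, the pure-Python sketch below simulates the patience rule on made-up validation losses. It is a simplification (no `min_delta`, no weight restoring) and not the Keras implementation; `simulate_early_stopping` is a hypothetical name.

```python
def simulate_early_stopping(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-based, for a list of validation losses."""
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # Improvement: remember it and reset the patience counter.
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch

# Made-up validation losses: the best value is at epoch 3.
stop_epoch, best_epoch = simulate_early_stopping(
    [0.9, 0.7, 0.6, 0.65, 0.7, 0.72], patience=2
)
print(stop_epoch, best_epoch)
```

Training stops a few epochs after the best one, which is why the workflow also saves the best checkpoint and reloads it before visualizing.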


## 📋 6. QUICK TIPS FOR THE EXAM


### ⚡ Essential Commands

```python
# 1. Check data shapes
print("Input shape:", train_input_imgs.shape)  # (N, H, W, 3)
print("Target shape:", train_targets.shape)    # (N, H, W, 1)

# 2. Check mask values
print("Unique mask values:", np.unique(train_targets))  # Should be [0, 1, 2]

# 3. Predict on a single image
pred = model.predict(np.expand_dims(test_image, axis=0))[0]
predicted_mask = np.argmax(pred, axis=-1)

# 4. Check the output shape
print("Output shape:", model.output_shape)  # Should be (None, H, W, num_classes)
```
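
The mask-value check can be turned into a guard that fails loudly before training. This is a minimal sketch: `check_mask_values` is a hypothetical helper, and the two small arrays are made up to show a valid mask and a raw trimap whose labels were not shifted.

```python
import numpy as np

def check_mask_values(targets, num_classes):
    """Raise ValueError if any label falls outside 0..num_classes-1 (hypothetical helper)."""
    values = np.unique(targets)
    bad = values[(values < 0) | (values >= num_classes)]
    if bad.size > 0:
        raise ValueError(f"Unexpected mask labels: {bad.tolist()}")
    return values.tolist()

good_targets = np.array([[0, 1], [2, 1]], dtype="uint8")
raw_trimap = np.array([[1, 2], [3, 2]], dtype="uint8")  # labels not yet shifted by -1

print(check_mask_values(good_targets, num_classes=3))

try:
    check_mask_values(raw_trimap, num_classes=3)
except ValueError as err:
    print("Caught:", err)
```

With `sparse_categorical_crossentropy`, a label equal to `num_classes` is out of range, so catching it early avoids a confusing failure during `fit`.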

### 🎯 Exam Checklist

- [ ] Dataset loaded correctly (images and masks)
- [ ] Masks have the correct values (0, 1, 2 for 3 classes)
- [ ] Model built as an encoder-decoder (Conv2D + Conv2DTranspose)
- [ ] Last layer is `Conv2D(num_classes, 3, activation="softmax", padding="same")`
- [ ] Loss: `sparse_categorical_crossentropy` (for integer targets)
- [ ] Model compiled and trained
- [ ] Results visualized correctly

### 🔧 Common Troubleshooting

**Error: shape mismatch**
- Check that the model's input shape matches the image size
- Check that `num_classes` matches the number of classes in the masks
- Check that the image size survives the encoder-decoder round trip: with N stride-2 stages, height and width should be divisible by 2^N

**Loss does not decrease**
- Check that the mask labels are in the expected range (0, 1, 2)
- Try a lower learning rate
- Check that the data is being loaded correctly

**Very poor predictions**
- Check that the model has enough layers
- Increase the number of epochs
- Check for overfitting (validation loss rising while training loss keeps falling)

### 💡 Key Differences: Classification vs Segmentation

| Classification | Segmentation |
|--------------|-------------|
| Output: (batch, num_classes) | Output: (batch, H, W, num_classes) |
| Typically downsamples with MaxPooling | Downsamples with Conv2D and strides=2 |
| Flatten + Dense head | Conv2DTranspose for upsampling |
| Categorical crossentropy (one-hot labels) | Sparse categorical crossentropy (integer masks, as in this guide) |
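
The choice between the two losses depends on how the labels are encoded, not on the task itself. The NumPy sketch below computes both forms by hand on made-up probabilities and shows that they agree. It illustrates the formulas and is not how Keras implements them.

```python
import numpy as np

num_classes = 3
# Hypothetical per-pixel softmax output, shape (H, W, num_classes) = (1, 2, 3).
probs = np.array([[[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]])
sparse_targets = np.array([[0, 2]])                    # one integer label per pixel
one_hot_targets = np.eye(num_classes)[sparse_targets]  # shape (1, 2, 3)

# Sparse form: pick the probability of the true class for each pixel.
p_true = np.take_along_axis(probs, sparse_targets[..., None], axis=-1)[..., 0]
sparse_loss = -np.log(p_true).mean()

# One-hot form: sum over classes of target * log(prob).
categorical_loss = -(one_hot_targets * np.log(probs)).sum(axis=-1).mean()

print(np.isclose(sparse_loss, categorical_loss))
```

Integer masks avoid materializing an `(N, H, W, num_classes)` one-hot array, which is the practical reason the sparse form suits segmentation.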



In [None]:
# 🔥 Function to compare the original image, the true mask, and the prediction
def compare_segmentation(image, true_mask, pred_mask=None, title_prefix=""):
    """
    Displays the original image, the true mask, and the prediction side by side

    Args:
        image: Original image (array)
        true_mask: True mask (ground truth)
        pred_mask: Predicted mask (optional)
        title_prefix: Text prepended to each subplot title
    """
    num_cols = 3 if pred_mask is not None else 2

    plt.figure(figsize=(15, 5))

    # Original image
    plt.subplot(1, num_cols, 1)
    plt.axis("off")
    if len(image.shape) == 3 and image.shape[2] == 3:
        plt.imshow(image.astype("uint8"))
    else:
        plt.imshow(array_to_img(image))
    plt.title(f"{title_prefix}Original Image")

    # True mask
    plt.subplot(1, num_cols, 2)
    display_target(true_mask)
    plt.title(f"{title_prefix}True Mask")

    # Prediction (if provided)
    if pred_mask is not None:
        plt.subplot(1, num_cols, 3)
        display_mask(pred_mask)
        plt.title(f"{title_prefix}Predicted Mask")

    plt.tight_layout()
    plt.show()

# Example usage:
# compare_segmentation(val_input_imgs[0], val_targets[0], pred_mask)
