### Graduate Program in Data Science and Machine Learning

#### Course: **Introduction to Neural Networks**

#### Final Project for the Introduction to Neural Networks course

<BR>
    
### **Team Members**:
#### Name:  Gustavo Gomes Balbino
#### RA: 
---
#### Name:  Gustavo Lopes Urio Fonseca
#### RA: 52400113
---

# Breast Tumor Classification in Ultrasound Images

**Objectives**  
1. **Preprocessing**: use the segmentation masks to isolate the tumor region.  
2. **Modeling**: pretrained ResNet50 + custom blocks (PDFBlock + SEBlock).  
3. **Training**: optimize a binary cross-entropy loss, tracking AUC and accuracy.  
4. **Evaluation**: confusion matrix, AUC, and a results report.

**Success Metrics**  
- Accuracy ≥ X%  
- AUC ≥ Y%  
- Good class separation (errors spread evenly across both classes in the confusion matrix)
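
The quantities named above can be computed directly from true labels and predicted probabilities. The following is a minimal sketch in plain Python, not the notebook's evaluation code: the function names, sample labels, and scores are illustrative assumptions, and the targets X and Y stay unspecified. The AUC here uses the exact pairwise definition; the Keras `AUC` metric approximates it over a fixed set of thresholds, so the two may differ slightly.

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels in {0, 1}."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1 and p == 0:
            fn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true class."""
    tn, fp, fn, tp = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def roc_auc(y_true, scores):
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (0 = benign, 1 = malignant) and predicted probabilities
y_true = [0, 0, 1, 1, 1, 0]
scores = [0.1, 0.6, 0.8, 0.4, 0.9, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 decision threshold

print("confusion (tn, fp, fn, tp):", confusion_counts(y_true, y_pred))
print("accuracy:", accuracy(y_true, y_pred))
print("auc:", roc_auc(y_true, scores))
```

A confusion matrix with similar false-positive and false-negative counts, as in this toy case, is one way to read the "good class separation" criterion.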

---
### **Step 1:**
#### Let's start by installing and importing the required libraries:

In [1]:
!pip install --quiet tensorflow matplotlib pandas scikit-learn


In [2]:
import os, glob

# Hide all GPUs from TensorFlow (must be set before importing it) so the notebook runs on CPU
os.environ["CUDA_VISIBLE_DEVICES"] = ""
import numpy  as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.utils          import shuffle as skshuffle

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, callbacks, Input, Model
from tensorflow.keras.applications import ResNet50

print("Visible devices:", tf.config.list_physical_devices())

tf.__version__

Visible devices: [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]

'2.19.0'

---
### **Step 2:**
#### A quick look at 'dataset.csv'

In [3]:
df = pd.read_csv('dataset.csv', sep=',')
df.head(10)

| Unnamed: 0 | Level | Name | Type | Description | File Count |
|---|---|---|---|---|---|
| 0 | 0 | . | directory | Root directory | 2 |
| 1 | 1 | images | directory | Contains all ultrasound images | 811 |
| 2 | 1 | masks | directory | Contains corresponding segmentation masks | 811 |
| 3 | 1 | Benign | directory | Contains benign tumor data | 2 |
| 4 | 2 | Benign/images | directory | Ultrasound images of benign tumors | 358 |
| 5 | 2 | Benign/masks | directory | Segmentation masks for benign tumors | 358 |
| 6 | 1 | Malignant | directory | Contains malignant tumor data | 2 |
| 7 | 2 | Malignant/images | directory | Ultrasound images of malignant tumors | 453 |
| 8 | 2 | Malignant/masks | directory | Segmentation masks for malignant tumors | 453 |


##### Below we list the images, masks, and labels

In [4]:
BASE_DIR = "BUS_UC/BUS_UC/BUS_UC"
IMG_PATTERN = "*.png"

# Image and mask directories per class label (0 = benign, 1 = malignant)
dirs = {
    0: (os.path.join(BASE_DIR, "Benign",    "images"),
        os.path.join(BASE_DIR, "Benign",    "masks")),
    1: (os.path.join(BASE_DIR, "Malignant", "images"),
        os.path.join(BASE_DIR, "Malignant", "masks")),
}

all_images, all_masks, all_labels = [], [], []
for label, (img_dir, mask_dir) in dirs.items():
    # Sorting both listings keeps each image aligned with its mask,
    # assuming the two folders use matching filenames
    imgs  = sorted(glob.glob(os.path.join(img_dir,  IMG_PATTERN)))
    masks = sorted(glob.glob(os.path.join(mask_dir, IMG_PATTERN)))
    assert len(imgs) == len(masks), f"{img_dir}: image/mask count mismatch"
    all_images += imgs
    all_masks  += masks
    all_labels += [label] * len(imgs)

print(f"Total: {len(all_images)} images, {len(all_masks)} masks")
print("Distribution:", pd.Series(all_labels).value_counts().to_dict())

Total: 811 images, 811 masks
Distribution: {1: 453, 0: 358}
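
Matching counts alone do not guarantee that each image is paired with its own mask. As a hedged sketch, the check below compares base filenames after sorting. It assumes that a mask shares its image's filename, which the listing above does not confirm; `unpaired_files` and the example paths are hypothetical.

```python
import os


def unpaired_files(image_paths, mask_paths):
    """Return the (image, mask) pairs whose base filenames differ after sorting."""
    if len(image_paths) != len(mask_paths):
        raise ValueError("image and mask lists have different lengths")
    mismatches = []
    for img, msk in zip(sorted(image_paths), sorted(mask_paths)):
        # Assumption: a mask is stored under the same filename as its image
        if os.path.basename(img) != os.path.basename(msk):
            mismatches.append((img, msk))
    return mismatches


# Hypothetical paths, for illustration only
images = ["data/images/001.png", "data/images/002.png"]
masks = ["data/masks/001.png", "data/masks/002.png"]
print(unpaired_files(images, masks))
```

An empty result means every sorted image lines up with a mask of the same name; any returned pair points to a file that should be inspected before training.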


---
### **Step 3:**
#### Now let's shuffle these lists and split them into training, validation, and test sets.

In [5]:
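# Shuffle the three lists together so images, masks, and labels stay aligned
imgs, masks, labs = skshuffle(all_images, all_masks, all_labels, random_state=42)
# Hold out 30% of the data, stratified by label
X_train, X_rem, M_train, M_rem, y_train, y_rem = train_test_split(
    imgs, masks, labs,
    test_size=0.30, stratify=labs, random_state=42
)
# Split the held-out part in half: validation and test
X_val, X_test, M_val, M_test, y_val, y_test = train_test_split(
    X_rem, M_rem, y_rem,
    test_size=0.50, stratify=y_rem, random_state=42
)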

print(f"Train: {len(X_train)} | Val: {len(X_val)} | Test: {len(X_test)}")

Train: 567 | Val: 122 | Test: 122
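
The printed sizes follow from simple arithmetic on the 811 samples. The sketch below reproduces them, assuming the held-out size is rounded up and the training set takes the remainder (which matches the counts above); `split_sizes` and its parameter names are illustrative.

```python
import math


def split_sizes(n_samples, holdout_frac=0.30, test_frac_of_holdout=0.50):
    """Return (n_train, n_val, n_test) for a two-stage split."""
    # Stage 1: hold out a fraction of all samples, rounding up
    n_holdout = math.ceil(holdout_frac * n_samples)
    n_train = n_samples - n_holdout
    # Stage 2: split the held-out part into validation and test
    n_test = math.ceil(test_frac_of_holdout * n_holdout)
    n_val = n_holdout - n_test
    return n_train, n_val, n_test


print(split_sizes(811))
```

In proportion, this is roughly a 70/15/15 split.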


In [6]:
IMG_SIZE = (224, 224)

def parse_image_mask(img_path, mask_path, label):
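    # Read the image and scale pixel values to [0, 1]
    img  = tf.io.read_file(img_path)
    img  = tf.image.decode_png(img, channels=3)
    img  = tf.image.resize(img, IMG_SIZE) / 255.0

    # Read the mask, resize it, and binarize it to {0, 1}
    # (assumes background pixels are exactly 0; without this step a 0/255 mask
    # would scale the tumor region back up to the [0, 255] range)
    mask = tf.io.read_file(mask_path)
    mask = tf.image.decode_png(mask, channels=1)
    mask = tf.image.resize(mask, IMG_SIZE)
    mask = tf.cast(mask > 0, tf.float32)

    # Apply the mask so that only the tumor region is kept
    img = img * mask

    return img, label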
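
Because masking is an element-wise multiplication, the value range of the mask determines the value range of the result. The sketch below illustrates this in plain Python, without TensorFlow. It assumes masks stored as 0/255, which is common for PNG masks but not confirmed here; `apply_mask` and the pixel values are illustrative.

```python
def apply_mask(pixels, mask, binarize=True):
    """Multiply each pixel by its mask value, optionally binarizing the mask first."""
    out = []
    for p, m in zip(pixels, mask):
        if binarize:
            weight = 1.0 if m > 0 else 0.0  # any non-zero mask value keeps the pixel
        else:
            weight = float(m)               # raw mask value, e.g. 255
        out.append(p * weight)
    return out


# Hypothetical normalized pixels and a 0/255 mask (illustrative values only)
pixels = [0.2, 0.5, 0.8, 1.0]
mask = [0, 255, 255, 0]

raw = apply_mask(pixels, mask, binarize=False)
binary = apply_mask(pixels, mask, binarize=True)
print("raw mask:   ", raw)
print("binary mask:", binary)
```

With the raw mask, pixels inside the tumor region leave the [0, 1] range; with a binary mask they keep their normalized values and the background is zeroed.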

In [7]:
def make_dataset(imgs, masks, labels, batch_size=16, shuffle=True):
    ds = tf.data.Dataset.from_tensor_slices((imgs, masks, labels))
    if shuffle:
        ds = ds.shuffle(len(imgs))
    ds = ds.map(parse_image_mask,
                num_parallel_calls=tf.data.AUTOTUNE)
    ds = ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)
    return ds

train_ds = make_dataset(X_train, M_train, y_train)
val_ds   = make_dataset(X_val,   M_val,   y_val,   shuffle=False)
test_ds  = make_dataset(X_test,  M_test,  y_test,  shuffle=False)

# quick sanity check
for imgs, labs in train_ds.take(1):
    print("Image batch:", imgs.shape, "Label batch:", labs.numpy())

Image batch: (16, 224, 224, 3) Label batch: [0 1 0 1 0 0 1 1 1 0 0 0 1 1 1 1]
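
With a batch size of 16 and no `drop_remainder`, the last batch of each dataset is smaller than the others. A short sketch of the arithmetic, using the split sizes printed earlier (the helper names are illustrative):

```python
import math


def steps_per_epoch(n_samples, batch_size=16):
    """Number of batches needed to cover all samples, keeping the partial last batch."""
    return math.ceil(n_samples / batch_size)


def last_batch_size(n_samples, batch_size=16):
    """Size of the final batch when the remainder is kept."""
    remainder = n_samples % batch_size
    return remainder if remainder else batch_size


for name, n in [("train", 567), ("val", 122), ("test", 122)]:
    print(name, steps_per_epoch(n), "batches, last batch has", last_batch_size(n), "samples")
```

The 36 training batches agree with the step count shown in the training progress line.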


---
### **Step 4:**
#### Let's define the model and the custom blocks

In [8]:
class SEBlock(layers.Layer):
    """Squeeze-and-Excitation: dynamically recalibrates the importance of each channel."""
    def __init__(self, channels, reduction=16, **kwargs):
        super().__init__(**kwargs)
        self.global_pool = layers.GlobalAveragePooling2D()
        self.fc1         = layers.Dense(channels // reduction, activation="relu")
        self.fc2         = layers.Dense(channels,          activation="sigmoid")
        self.reshape     = layers.Reshape((1, 1, channels))
        self.multiply    = layers.Multiply()

    def call(self, x):
        se = self.global_pool(x)      # [B, C]
        se = self.fc1(se)             # [B, C/r]
        se = self.fc2(se)             # [B, C]
        se = self.reshape(se)         # [B, 1, 1, C]
        return self.multiply([x, se])

class PDFBlock(layers.Layer):
    """Pyramid-Dilated Fusion: multiple dilated convolutions + 1×1 projection."""
    def __init__(self, out_channels, kernel_sizes, dilations, **kwargs):
        super().__init__(**kwargs)
        assert len(kernel_sizes) == len(dilations), "kernel_sizes and dilations must have the same length"
        self.branches = []
        for k, d in zip(kernel_sizes, dilations):
            self.branches.append(
                layers.Conv2D(
                    filters=out_channels,
                    kernel_size=k,
                    padding="same",
                    dilation_rate=d,
                    activation="relu"
                )
            )
        self.project = layers.Conv2D(filters=out_channels, kernel_size=1, activation="relu")

    def call(self, x):
        feats = [branch(x) for branch in self.branches]
        x_cat = tf.concat(feats, axis=-1)
        return self.project(x_cat)
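
A dilated convolution spreads its kernel taps apart, so each branch of the PDFBlock covers a wider window than its kernel size suggests. The sketch below computes that effective size with the standard formula `k + (k - 1) * (d - 1)`, using the kernel sizes and dilations passed to `PDFBlock` when the model is built; the function name is illustrative.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span, in input positions, covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


kernel_sizes = [1, 3, 5, 7]
dilations = [1, 2, 3, 4]

spans = [effective_kernel_size(k, d) for k, d in zip(kernel_sizes, dilations)]
for k, d, s in zip(kernel_sizes, dilations, spans):
    print(f"kernel {k}x{k}, dilation {d}: effective span {s}x{s}")
```

For a 224×224 input, ResNet50 produces a 7×7 feature map, so the two widest branches span more than the whole map, and with `padding="same"` many of their taps fall on zero padding. Whether smaller dilations would work better here is an open question that this sketch does not answer.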

In [9]:
# 1) ResNet50 backbone without the top layer, keeping the spatial feature map
# Note: the ImageNet weights were trained on inputs passed through
# keras.applications.resnet50.preprocess_input, which this pipeline does not apply.
backbone = ResNet50(
    include_top=False,
    weights="imagenet",
    input_shape=(*IMG_SIZE, 3)
)
backbone.trainable = False  # frozen at the start

# 2) Build the graph
inp = Input(shape=(*IMG_SIZE, 3), name="input_image")
x   = backbone(inp)                            # [B, H', W', C=2048]

# 3) Apply PDFBlock (multi-scale)
x = PDFBlock(
    out_channels=512,
    kernel_sizes=[1, 3, 5, 7],
    dilations=[1, 2, 3, 4]
)(x)  # now [B, H', W', 512]

# 4) Channel recalibration via SEBlock
x = SEBlock(channels=512, reduction=16)(x)      # [B, H', W', 512]

# 5) Aggregation and classification head
x = layers.GlobalAveragePooling2D(name="gap")(x)  # [B, 512]
x = layers.Dense(256, activation="relu", name="fc1")(x)
x = layers.Dropout(0.5, name="dropout")(x)
out = layers.Dense(1, activation="sigmoid", name="output")(x)

model_clf = Model(inputs=inp, outputs=out, name="ResNet50_PDF_SE_Clf")

# 6) Compile the model
model_clf.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    loss="binary_crossentropy",
    metrics=["accuracy", tf.keras.metrics.AUC(name="auc")]
)

model_clf.summary()
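
Since the backbone is frozen, the trainable parameters come only from the PDFBlock, the SEBlock, and the classification head. The sketch below is a hand calculation based on the layer shapes in the code above, assuming every convolution and dense layer has a bias term (the Keras default); the output of `model_clf.summary()` remains the authoritative count. The helper names are illustrative.

```python
def conv2d_params(kernel_size, c_in, c_out):
    """Weights plus biases of a square Conv2D layer (dilation does not add parameters)."""
    return kernel_size * kernel_size * c_in * c_out + c_out


def dense_params(n_in, n_out):
    """Weights plus biases of a Dense layer."""
    return n_in * n_out + n_out


BACKBONE_CHANNELS = 2048  # ResNet50 feature-map depth
OUT_CHANNELS = 512
REDUCTION = 16

# PDFBlock: four parallel branches on the backbone output, then a 1x1 projection
pdf_branches = sum(conv2d_params(k, BACKBONE_CHANNELS, OUT_CHANNELS) for k in [1, 3, 5, 7])
pdf_project = conv2d_params(1, 4 * OUT_CHANNELS, OUT_CHANNELS)

# SEBlock: bottleneck of two dense layers (512 -> 32 -> 512)
se_block = (dense_params(OUT_CHANNELS, OUT_CHANNELS // REDUCTION)
            + dense_params(OUT_CHANNELS // REDUCTION, OUT_CHANNELS))

# Head: Dense(256) followed by the sigmoid output unit
head = dense_params(OUT_CHANNELS, 256) + dense_params(256, 1)

trainable_total = pdf_branches + pdf_project + se_block + head
print(f"PDFBlock: {pdf_branches + pdf_project:,}")
print(f"SEBlock:  {se_block:,}")
print(f"Head:     {head:,}")
print(f"Total trainable (backbone frozen): {trainable_total:,}")
```

Almost all of the trainable weights sit in the 5×5 and 7×7 branches of the PDFBlock, which is worth keeping in mind given that the training set has only 567 images.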

---
### **Step 5:**
#### Training and monitoring

In [10]:
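callbacks_list = [
    # Stop once validation loss has not improved for 5 epochs and restore the best weights
    callbacks.EarlyStopping(
        monitor="val_loss",
        patience=5,
        restore_best_weights=True,
        verbose=1
    ),
    # Save the model whenever validation AUC reaches a new maximum
    callbacks.ModelCheckpoint(
        "best_model.h5",
        monitor="val_auc",
        mode="max",
        save_best_only=True,
        verbose=1
    ),
    # Log metrics and weight histograms for TensorBoard
    callbacks.TensorBoard(
        log_dir="logs",
        histogram_freq=1
    )
]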
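
The patience rule used by early stopping can be illustrated without Keras. The sketch below is a simplified model of that logic, not the library's implementation: training stops once the monitored loss has failed to improve for `patience` consecutive epochs. The function name and the loss values are hypothetical.

```python
def simulate_early_stopping(val_losses, patience=5):
    """Return (stopped_epoch, best_epoch), both 0-based.

    stopped_epoch is None when training runs through every epoch.
    """
    best = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # Improvement: remember it and reset the patience counter
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Hypothetical validation losses, for illustration only
losses = [0.70, 0.60, 0.55, 0.56, 0.57, 0.58, 0.59, 0.60, 0.50]
stopped, best = simulate_early_stopping(losses, patience=5)
print("stopped at epoch", stopped, "best epoch", best)
```

In this toy run the loss would have improved again one epoch after the stop. Note also that the configuration above stops on `val_loss` but checkpoints on `val_auc`, so the restored weights and the saved file may come from different epochs.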

In [None]:
history = model_clf.fit(
    train_ds,
    validation_data=val_ds,
    epochs=30,
    callbacks=callbacks_list
)

Epoch 1/30
11/36 ━━━━━━━━━━━━━━━━━━━━ 16:23 39s/step - accuracy: 0.5373 - auc: 0.5294 - loss: 0.7248
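
The progress line reports 36 steps per epoch at about 39 s/step on CPU. As a rough back-of-the-envelope sketch, assuming that rate stays constant and ignoring validation time and early stopping, the training cost can be estimated as follows (the function name is illustrative):

```python
def estimate_training_time(steps_per_epoch, seconds_per_step, epochs):
    """Return (seconds per epoch, seconds for all epochs) at a constant step rate."""
    epoch_seconds = steps_per_epoch * seconds_per_step
    return epoch_seconds, epoch_seconds * epochs


epoch_s, total_s = estimate_training_time(steps_per_epoch=36, seconds_per_step=39, epochs=30)
print(f"about {epoch_s / 60:.1f} minutes per epoch")
print(f"about {total_s / 3600:.1f} hours for 30 epochs")
```

This is an upper bound on the training passes only, since early stopping may end the run well before epoch 30.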