# 🎉 Out-of-Distribution (OOD) with PCA using Deep Features from the Latent Space

The goal of this notebook is to explore how Principal Component Analysis (PCA) can be used for out-of-distribution (OOD) detection on deep features taken from the latent space.

## 📝 Plan of action

### ♻️ Preprocessing phase

In order to achieve our goal, we need to understand how the dataset is structured.

For this notebook, we are going to use the CBIR 15 dataset, which contains images of different places, such as an office, a bedroom, a mountain, etc. Note that some places look similar to one another, e.g. a bedroom and a living room.

Thus, in order to extract the features of the images we have to preprocess those images:

- Get the images that are located in data/CBIR_15-scene and load them into a dataframe using Pandas
  - Locate the "Labels.txt" file: it shows where the indexes of the images from each category start
- Create the dataset from this information with two columns: the path to the image and its category
- Resize all of the images to the same size (in this case, we are going with 256x256)
  
Now, in order to extract the features, the resized images have to be divided into patches of 32x32 pixels. Working on small patches keeps each processing step fast and avoids long waiting times.
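
A minimal sketch of the patching step, assuming grayscale images stored as 2-D NumPy arrays whose sides are multiples of the patch size. The helper name `split_into_patches` is hypothetical and is not the function from the project's custom modules.

```python
import numpy as np

def split_into_patches(image, patch_size=32):
    """Split a 2-D image into non-overlapping square patches (row-major order)."""
    height, width = image.shape
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be multiples of patch_size")
    rows, cols = height // patch_size, width // patch_size
    return (image.reshape(rows, patch_size, cols, patch_size)
                 .swapaxes(1, 2)
                 .reshape(rows * cols, patch_size, patch_size))

# Synthetic 256x256 image standing in for a resized dataset image
image = np.arange(256 * 256, dtype=np.float32).reshape(256, 256)
patches = split_into_patches(image)
print(patches.shape)  # (64, 32, 32)
```

Each 256x256 image yields 64 patches, and flattening one patch gives the 1024 features mentioned in the training phase.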

After all the preprocessing, we should separate the images into two different folders: one contains the patches of the training images, which give us the principal components and dimensions, and the other contains the patches of the test images, which are projected onto those dimensions to obtain an OOD score.

### 🏋🏽‍♂️ Training phase

With the images stored inside the "patches_train" folder, the first thing we are going to do is _normalize_ all of them, so that every variable is on the same scale and the covariance that PCA maximizes is not dominated by a few features.

Next, we apply PCA with all the components. As the patches are 32x32, each one has 1024 features, and hence up to 1024 components. Then we plot the cumulative explained variance to see how many components truly account for most of the variance of the data. In this notebook, we keep the components that explain 95% of the variance.
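
The selection of components by explained variance can be sketched as follows. This is an illustration on synthetic data, not the notebook's own training code. It shows two equivalent routes in scikit-learn: counting components on the cumulative variance curve, and passing the variance fraction directly as `n_components`.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic data: 200 samples, 50 features with decaying variance
scales = np.linspace(5.0, 0.1, 50)
X = rng.normal(size=(200, 50)) * scales

# Option 1: fit all components, then count how many reach 95% cumulative variance
pca_full = PCA(svd_solver="full").fit(X)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)
n_keep = int(np.searchsorted(cumulative, 0.95) + 1)

# Option 2: let scikit-learn choose the count from a variance fraction
pca_95 = PCA(n_components=0.95, svd_solver="full").fit(X)
print(n_keep, pca_95.n_components_)
```

Option 2 avoids fitting the model twice, which matters when the feature vectors are large.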

After fitting the PCA with the components that describe 95% of the variance, it's time to test our images and see how much of their data falls in the residual space.

### ⚗️ Test phase and results

In this phase, we take the test images and normalize them with the same scale used for each PCA. This is important to maintain consistency throughout the final results and to measure the norms in the new dimensions properly.
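
A minimal sketch of reusing the training scale on test data, assuming flattened 32x32 patches and scikit-learn's `StandardScaler`. The arrays are synthetic placeholders.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
train_patches = rng.uniform(0, 255, size=(100, 1024))  # synthetic flattened patches
test_patches = rng.uniform(0, 255, size=(20, 1024))

# Statistics are estimated on the training data only
scaler = StandardScaler().fit(train_patches)
train_scaled = scaler.transform(train_patches)

# The same mean and scale are reused for the test data
test_scaled = scaler.transform(test_patches)
```

Fitting a second scaler on the test set would shift the test data into a different coordinate system than the one the PCA was fitted in.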

After that, we calculate the norm of the projection of the data onto the orthogonal complement of the principal components (the residual space) and divide it by the norm of the data relative to the origin. This ratio is the OOD score.
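
One possible reading of this score, sketched with scikit-learn. The function name `ood_score` is hypothetical, and measuring both norms after subtracting the PCA mean is an assumption, since the text does not say which origin is used.

```python
import numpy as np
from sklearn.decomposition import PCA

def ood_score(features, pca):
    """Residual norm divided by data norm, both measured after removing the PCA mean."""
    features = np.atleast_2d(features)
    centered = features - pca.mean_
    reconstructed = pca.inverse_transform(pca.transform(features)) - pca.mean_
    residual = centered - reconstructed  # part of the data outside the principal subspace
    # Note: a sample exactly equal to the PCA mean would divide by zero here
    return np.linalg.norm(residual, axis=1) / np.linalg.norm(centered, axis=1)

rng = np.random.default_rng(0)
train = np.zeros((100, 5))
train[:, :2] = rng.normal(size=(100, 2))  # training data lives in a 2-D subspace
pca = PCA(n_components=2).fit(train)

in_dist = np.array([1.0, 2.0, 0.0, 0.0, 0.0])                # inside the subspace
out_dist = pca.mean_ + np.array([0.0, 0.0, 1.0, 0.0, 0.0])   # orthogonal to it
print(ood_score(in_dist, pca), ood_score(out_dist, pca))
```

Under this definition the score lies between 0 (fully explained by the principal components) and 1 (entirely in the residual space).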

We calculate the mean score for each category and take the smallest one. The category with the smallest mean score is taken as the current environment.
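
The decision rule can be sketched as below. The helper name `pick_environment` is hypothetical and the score values are made up purely for illustration; they are not results from this notebook.

```python
import numpy as np

def pick_environment(scores_by_category):
    """Return the category with the smallest mean OOD score, plus all mean scores."""
    mean_scores = {category: float(np.mean(scores))
                   for category, scores in scores_by_category.items()}
    best = min(mean_scores, key=mean_scores.get)
    return best, mean_scores

# Made-up scores, for illustration only
scores_by_category = {"Coast": [0.21, 0.18, 0.25], "Office": [0.62, 0.70, 0.66]}
best, mean_scores = pick_environment(scores_by_category)
print(best)  # Coast
```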


--------------------------

First of all, we need to understand which libraries we are going to use:

- os: Deals with the operating system interface, for example finding the relative and absolute paths of files inside a project and reading/writing files.
- sys: This module provides access to some variables used or maintained by the interpreter and to functions that interact strongly with the interpreter.
- numpy: NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.
- pandas: Pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
- matplotlib: Deals with plotting graphs to visualize data in a graphical way.
- sklearn: Scikit-learn provides dozens of built-in machine learning algorithms and models, called estimators.

In [1]:
import os
import sys
import cv2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA


I'd suggest using a conda virtual environment in order to avoid messing up your base kernel environment and causing dependency errors in the future.

After you have successfully installed all the modules, it's time to import our custom modules that are going to deal with:

- Creation of our dataframe using pandas
- Separation of our dataset into patches of 32x32 in folders of training and test

In [2]:

sys.path.append(os.path.abspath('..'))

from dataframe_generator import *
from images_standardizing import *

In [3]:
import tarfile

def extract_tgz(tgz_path, extract_to):
    if not os.path.exists(extract_to):
        os.makedirs(extract_to)
    
    with tarfile.open(tgz_path, 'r:gz') as tar:
        tar.extractall(path=extract_to)
        print(f"Files extracted to {extract_to}")

tgz_path = '../CBIR_15-Scene.tgz'
extract_to = '../data/'

extract_tgz(tgz_path, extract_to)

Files extracted to ../data/


In [4]:
df = create_dataframe()
df

                             image_path category
0        ../data/CBIR_15-Scene/00/1.jpg  Bedroom
1        ../data/CBIR_15-Scene/00/2.jpg  Bedroom
2        ../data/CBIR_15-Scene/00/3.jpg  Bedroom
3        ../data/CBIR_15-Scene/00/4.jpg  Bedroom
4        ../data/CBIR_15-Scene/00/5.jpg  Bedroom
...                                 ...      ...
4480  ../data/CBIR_15-Scene/14/4481.jpg    Store
4481  ../data/CBIR_15-Scene/14/4482.jpg    Store
4482  ../data/CBIR_15-Scene/14/4483.jpg    Store
4483  ../data/CBIR_15-Scene/14/4484.jpg    Store
4484  ../data/CBIR_15-Scene/14/4485.jpg    Store

[4485 rows x 2 columns]




## ☝️ Part I: Comparing two different environments

### ♻️ Preprocessing phase

Now we start our experiments to check whether our idea works. This time, we look at what happens with our approach when it is applied to two different environments.

In our case, I'm going to take the **Coast** and **Office** environments arbitrarily.


In [5]:
train_categories = ['Coast', 'Office']

df_different = df[df['category'].isin(train_categories)]
df_different

Unnamed: 0,image_path,category
1267,../data/CBIR_15-Scene/05/1268.jpg,Coast
1268,../data/CBIR_15-Scene/05/1269.jpg,Coast
1269,../data/CBIR_15-Scene/05/1270.jpg,Coast
1270,../data/CBIR_15-Scene/05/1271.jpg,Coast
1271,../data/CBIR_15-Scene/05/1272.jpg,Coast
...,...,...
4165,../data/CBIR_15-Scene/13/4166.jpg,Office
4166,../data/CBIR_15-Scene/13/4167.jpg,Office
4167,../data/CBIR_15-Scene/13/4168.jpg,Office
4168,../data/CBIR_15-Scene/13/4169.jpg,Office


It's time to separate our dataset into train and test sets. We use scikit-learn's built-in `train_test_split` function to do this:

In [6]:
X = df_different['image_path'].tolist()
y = df_different['category'].tolist()
unique_categories = list(df_different['category'].unique())
print(f"Unique categories: {unique_categories}")

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=10)

standard_size = (256, 256)

Unique categories: ['Coast', 'Office']
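
The split above is purely random, so the class proportions in the test set can drift from those of the full set. A possible variant, shown here on placeholder paths and labels, passes `stratify` to keep the proportions equal in both parts. This is a suggestion, not what the cell above does.

```python
from collections import Counter
from sklearn.model_selection import train_test_split

paths = [f"img_{i}.jpg" for i in range(150)]   # placeholder paths, not dataset files
labels = ["Coast"] * 100 + ["Office"] * 50     # placeholder class counts

# stratify=labels keeps the Coast/Office ratio the same in train and test
X_tr, X_te, y_tr, y_te = train_test_split(
    paths, labels, test_size=0.2, random_state=10, stratify=labels
)
print(Counter(y_tr), Counter(y_te))
```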


To make sure that everything went well, we plot the grid of all the patches from the first image of our training set.

This is exactly what the module inside our "image_patching.py" does. So now we need to save everything into the subfolders by calling that function:

In [7]:
create_images_set(X_train, X_test, y_train, y_test, output_dir_train='images_train', output_dir_test='images_test', standard_size=standard_size)

Now, we should load our patches for training:

In [8]:
training_images_by_category = load_images_by_category('images_train', unique_categories, image_size=(256, 256))
print(training_images_by_category['Coast'].shape)

(348, 256, 256)


### Centering images

Now, we need to center the images to make the training of the neural network more efficient. We are not normalizing them, to avoid information loss.

In [9]:
def center_images(images):
    num_images, height, width = images.shape
    flattened_images = images.reshape((num_images, -1))
    
    mean = np.mean(flattened_images, axis=0)
    
    centered_flattened_images = flattened_images - mean
    
    centered_images = centered_flattened_images.reshape((num_images, height, width))
    return centered_images


centralized_images_by_category = {}
scalers_by_category = {}
for category, images in training_images_by_category.items():
    print(images.shape)
    centralized_images = center_images(images)
    centralized_images_by_category[category] = centralized_images
    print(f"Category {category}, images shape: {centralized_images.shape}")


(348, 256, 256)
Category Coast, images shape: (348, 256, 256)
(209, 256, 256)
Category Office, images shape: (209, 256, 256)


In [10]:
def check_centralization(images):
    mean = np.mean(images, axis=(0, 1, 2))
    return mean

for category, images in centralized_images_by_category.items():
    mean = check_centralization(images)
    print(f"Mean pixel values after centralization for category {category}: {mean}")


Mean pixel values after centralization for category Coast: -1.013238024772844e-15
Mean pixel values after centralization for category Office: 5.664793461532378e-15


Since the means are close to zero (on the order of 1e-15, which is floating-point round-off), the pixel values of each category are correctly centered.

### 🏋🏽‍♂️ Training phase

With everything preprocessed, we now need to train our neural network. In this notebook, I chose the VGG16 because it's a well-known neural network that is often used for computer vision applications.

I'm using no weights, because the underlying goal of this research is to use the results from this work in an unsupervised environment.

Now, we take the output of the second-to-last layer to extract our latent features from the neural network.

This result means that we extracted 348 images with 4096 features each for the Coast category and 209 images with 4096 features each for the Office category.

Now we have to reduce the dimensionality. In order to do that, we use PCA. But before that, we should center the features.

In [11]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
import matplotlib.pyplot as plt
from torch.utils.data import DataLoader, TensorDataset
from torchvision import models
from sklearn.decomposition import PCA

# Clear GPU cache before running the model
torch.cuda.empty_cache()

# Check if GPU is available and set the device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Using device: {device}')

# UNet model modified to output features
class UNet(nn.Module):
    def __init__(self, in_channels=1, out_channels=1):
        super(UNet, self).__init__()

        # Encoder
        self.encoder1 = self.contracting_block(in_channels, 64)
        self.encoder2 = self.contracting_block(64, 128)
        self.encoder3 = self.contracting_block(128, 256)
        self.encoder4 = self.contracting_block(256, 512)

        # Decoder
        self.upconv4 = nn.ConvTranspose2d(512, 256, kernel_size=2, stride=2)
        self.decoder4 = self.expansive_block(512, 256)
        self.upconv3 = nn.ConvTranspose2d(256, 128, kernel_size=2, stride=2)
        self.decoder3 = self.expansive_block(256, 128)
        self.upconv2 = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)
        self.decoder2 = self.expansive_block(128, 64)
        self.decoder1 = self.final_block(64, out_channels)

    def contracting_block(self, in_channels, out_channels):
        block = nn.Sequential(
            nn.Conv2d(kernel_size=3, in_channels=in_channels, out_channels=out_channels, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.Conv2d(kernel_size=3, in_channels=out_channels, out_channels=out_channels, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.Dropout(0.5)
        )
        return block

    def expansive_block(self, in_channels, out_channels):
        block = nn.Sequential(
            nn.Conv2d(kernel_size=3, in_channels=in_channels, out_channels=out_channels, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.Conv2d(kernel_size=3, in_channels=out_channels, out_channels=out_channels, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.Dropout(0.5)
        )
        return block

    def final_block(self, in_channels, out_channels):
        block = nn.Sequential(
            nn.Conv2d(kernel_size=3, in_channels=in_channels, out_channels=out_channels, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.Conv2d(kernel_size=3, in_channels=out_channels, out_channels=out_channels, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.Conv2d(kernel_size=1, in_channels=out_channels, out_channels=out_channels),
            nn.Sigmoid()  # Sigmoid activation keeps the output in [0, 1]
        )
        return block

    def crop_and_concat(self, upsampled, bypass):
        _, _, H, W = upsampled.size()
        _, _, H_b, W_b = bypass.size()
        if H_b != H or W_b != W:
            bypass = F.interpolate(bypass, size=(H, W), mode='bilinear', align_corners=True)
        return torch.cat((upsampled, bypass), 1)

    def forward(self, x):
        # Encoder path
        enc1 = self.encoder1(x)
        enc2 = self.encoder2(F.max_pool2d(enc1, kernel_size=2, stride=2))
        enc3 = self.encoder3(F.max_pool2d(enc2, kernel_size=2, stride=2))
        enc4 = self.encoder4(F.max_pool2d(enc3, kernel_size=2, stride=2))

        # Decoder path
        dec4 = self.crop_and_concat(self.upconv4(enc4), enc3)
        dec4 = self.decoder4(dec4)
        dec3 = self.crop_and_concat(self.upconv3(dec4), enc2)
        dec3 = self.decoder3(dec3)
        dec2 = self.crop_and_concat(self.upconv2(dec3), enc1)
        dec2 = self.decoder2(dec2)
        dec1 = self.decoder1(dec2)

        return dec1, enc4  # Returns both the reconstructed image and the latent features (output of the last encoder block)

# VGG-based perceptual loss (Feature Extractor)
class VGGPerceptualLoss(nn.Module):
    def __init__(self):
        super(VGGPerceptualLoss, self).__init__()
        vgg = models.vgg19(pretrained=True).features
        self.slice1 = nn.Sequential(*vgg[:5])   # conv1_2
        self.slice2 = nn.Sequential(*vgg[5:10]) # conv2_2
        self.slice3 = nn.Sequential(*vgg[10:19])# conv3_4
        self.slice4 = nn.Sequential(*vgg[19:28])# conv4_4
        for p in self.parameters():
            p.requires_grad = False

    def forward(self, x):
        # Convert grayscale images to 3 channels by repeating the single channel
        x = x.repeat(1, 3, 1, 1)
        h_relu1 = self.slice1(x)
        h_relu2 = self.slice2(h_relu1)
        h_relu3 = self.slice3(h_relu2)
        h_relu4 = self.slice4(h_relu3)
        return [h_relu1, h_relu2, h_relu3, h_relu4]

# Training function for the model
def train_unet(unet, data_loader, num_epochs=20):
    criterion = nn.MSELoss()
    optimizer = torch.optim.Adam(unet.parameters(), lr=0.0001)

    for epoch in range(num_epochs):
        unet.train()
        epoch_loss = 0
        for images, in data_loader:
            images = images.to(device).float()
            
            # Normalize to the [0, 1] range
            images /= 255.0

            optimizer.zero_grad()
            reconstructed_images, _ = unet(images)
            loss = criterion(reconstructed_images, images)
            loss.backward()
            optimizer.step()

            epoch_loss += loss.item()

        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {epoch_loss/len(data_loader):.4f}')

# Initialize the UNet and VGG (perceptual loss) models
unet = UNet().to(device)
vgg_loss = VGGPerceptualLoss().to(device)

# Define the DataLoader here
categories = centralized_images_by_category.keys()

for category in categories:
    images = centralized_images_by_category[category]

    # Add the channel dimension and create the DataLoader
    images = np.expand_dims(images, axis=1)  # Add the channel dimension
    dataset = TensorDataset(torch.tensor(images, dtype=torch.float32))
    loader = DataLoader(dataset, batch_size=1, shuffle=True)

    # Train the U-Net for 20 epochs (adjust as needed)
    train_unet(unet, loader, num_epochs=20)

    # Visualize the first images of each category
    unet.eval()
    vgg_loss.eval()
    with torch.no_grad():
        for i in range(3):  # Show 3 images per category
            image = centralized_images_by_category[category][i]
            
            # Normalize the image before passing it through the UNet
            image_tensor = torch.tensor(image / 255.0).unsqueeze(0).unsqueeze(0).float().to(device)

            # Run the image through the UNet
            reconstructed_image, unet_features = unet(image_tensor)

            # Extract features with the VGG model
            vgg_features = vgg_loss(image_tensor)
            vgg_features_combined = torch.cat([f.view(-1) for f in vgg_features], dim=0).cpu().numpy()

            # Extract the U-Net features and convert them to numpy
            unet_features_combined = unet_features.view(-1).cpu().numpy()

            # Convert the tensor to numpy for visualization
            reconstructed_image_np = reconstructed_image.squeeze().cpu().numpy()

            # Plot the images
            plt.figure(figsize=(20, 5))
            
            # Original image
            plt.subplot(1, 4, 1)
            plt.imshow(image, cmap='gray')
            plt.title(f'Original {category} Image {i+1}')
            plt.axis('off')

            # Reconstructed image
            plt.subplot(1, 4, 2)
            plt.imshow(reconstructed_image_np, cmap='gray')
            plt.title(f'Reconstructed {category} Image {i+1}')
            plt.axis('off')

            # UNet Features
            plt.subplot(1, 4, 3)
            plt.plot(unet_features_combined)
            plt.title(f'UNet Features {category} Image {i+1}')
            plt.axis('on')

            # VGG Features
            plt.subplot(1, 4, 4)
            plt.plot(vgg_features_combined)
            plt.title(f'VGG Features {category} Image {i+1}')
            plt.axis('on')

            plt.tight_layout()
            plt.show()


Using device: cuda




Epoch [1/20], Loss: 0.0937


KeyboardInterrupt: 
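
To get a feel for the size of the feature vectors produced by the cell above, the arithmetic can be sketched in plain Python. It assumes 256x256 inputs, that the padded 3x3 convolutions preserve spatial size, that the U-Net encoder applies three 2x2 max-pools before `enc4`, and that each of the four VGG19 slices ends in a 2x2 max-pool (as in torchvision's VGG19 layout). The helper names are hypothetical.

```python
def unet_bottleneck_shape(height, width, channels=512, num_pools=3):
    """Shape (C, H, W) of enc4: three 2x2 max-pools halve each side three times."""
    factor = 2 ** num_pools
    return channels, height // factor, width // factor

def vgg_slice_shapes(height, width):
    """Shapes of the four VGG slice outputs, assuming each slice ends in a 2x2 max-pool."""
    shapes = []
    for channels in (64, 128, 256, 512):
        height, width = height // 2, width // 2
        shapes.append((channels, height, width))
    return shapes

def flat_size(shape):
    channels, height, width = shape
    return channels * height * width

unet_features = flat_size(unet_bottleneck_shape(256, 256))
vgg_features = sum(flat_size(s) for s in vgg_slice_shapes(256, 256))
print(unet_features, vgg_features)  # 524288 1966080
```

Under these assumptions each image yields far more features than there are training images per category (348 and 209), so the number of PCA components is bounded by the number of samples.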

In [None]:
# import torch
# import torch.nn as nn
# import torch.optim as optim
# from torchvision import models
# from torch.utils.data import DataLoader, TensorDataset
# import numpy as np
# import matplotlib.pyplot as plt

# # Check the available device (GPU or CPU)
# device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# print(f'Using device: {device}')

# # Define a simple convolutional autoencoder model
# class ConvAutoencoder(nn.Module):
#     def __init__(self):
#         super(ConvAutoencoder, self).__init__()
#         # Encoder
#         self.encoder = nn.Sequential(
#             nn.Conv2d(1, 64, kernel_size=3, stride=2, padding=1),  # [B, 64, 112, 112]
#             nn.ReLU(True),
#             nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),  # [B, 128, 56, 56]
#             nn.ReLU(True),
#             nn.Conv2d(128, 256, kernel_size=3, stride=2, padding=1),  # [B, 256, 28, 28]
#             nn.ReLU(True)
#         )
#         # Decoder
#         self.decoder = nn.Sequential(
#             nn.ConvTranspose2d(256, 128, kernel_size=3, stride=2, padding=1, output_padding=1),  # [B, 128, 56, 56]
#             nn.ReLU(True),
#             nn.ConvTranspose2d(128, 64, kernel_size=3, stride=2, padding=1, output_padding=1),  # [B, 64, 112, 112]
#             nn.ReLU(True),
#             nn.ConvTranspose2d(64, 1, kernel_size=3, stride=2, padding=1, output_padding=1),  # [B, 1, 224, 224]
#             nn.Sigmoid()  # Output in the [0, 1] range
#         )

#     def forward(self, x):
#         x = self.encoder(x)
#         x = self.decoder(x)
#         return x

# # Define the VGG perceptual loss class
# class VGGPerceptualLoss(nn.Module):
#     def __init__(self):
#         super(VGGPerceptualLoss, self).__init__()
#         vgg = models.vgg19(pretrained=True).features
#         self.slice1 = nn.Sequential(*vgg[:5])   # Conv1_2
#         self.slice2 = nn.Sequential(*vgg[5:10]) # Conv2_2
#         self.slice3 = nn.Sequential(*vgg[10:19])# Conv3_4
#         self.slice4 = nn.Sequential(*vgg[19:28])# Conv4_4
#         for p in self.parameters():
#             p.requires_grad = False

#     def forward(self, x):
#         # Convert to 3 channels by repeating the single channel (grayscale case)
#         x = x.repeat(1, 3, 1, 1)
#         h_relu1 = self.slice1(x)
#         h_relu2 = self.slice2(h_relu1)
#         h_relu3 = self.slice3(h_relu2)
#         h_relu4 = self.slice4(h_relu3)
#         return [h_relu1, h_relu2, h_relu3, h_relu4]

# # Perceptual loss function combined with MSE
# def perceptual_loss_function(vgg_loss, recon_images, orig_images):
#     # Extract the features of the original and reconstructed images
#     orig_features = vgg_loss(orig_images)
#     recon_features = vgg_loss(recon_images)
    
#     # Compute the perceptual loss as the MSE of the intermediate features
#     perceptual_loss = 0
#     for orig_f, recon_f in zip(orig_features, recon_features):
#         perceptual_loss += nn.functional.mse_loss(recon_f, orig_f)
    
#     return perceptual_loss

# # Training function
# def train_autoencoder(autoencoder, vgg_loss, data_loader, num_epochs=20):
#     # Use Adam as the optimizer
#     optimizer = optim.Adam(autoencoder.parameters(), lr=0.0001)
    
#     for epoch in range(num_epochs):
#         autoencoder.train()
#         epoch_loss = 0
        
#         for images, in data_loader:
#             images = images.to(device).float()
#             images /= 255.0  # Normalize to [0, 1]

#             # Pass the images through the autoencoder
#             recon_images = autoencoder(images)
            
#             # Compute the perceptual loss
#             perceptual_loss = perceptual_loss_function(vgg_loss, recon_images, images)
#             mse_loss = nn.functional.mse_loss(recon_images, images)
#             total_loss = perceptual_loss + mse_loss

#             # Optimize the model
#             optimizer.zero_grad()
#             total_loss.backward()
#             optimizer.step()

#             epoch_loss += total_loss.item()

#         print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {epoch_loss/len(data_loader):.4f}')

# # Initialize the autoencoder and the VGG perceptual loss
# autoencoder = ConvAutoencoder().to(device)
# vgg_loss = VGGPerceptualLoss().to(device)

# # Define the DataLoader here
# categories = centralized_images_by_category.keys()

# for category in categories:
#     images = centralized_images_by_category[category]

#     # Add the channel dimension and create the DataLoader
#     images = np.expand_dims(images, axis=1)  # Add the channel dimension
#     dataset = TensorDataset(torch.tensor(images, dtype=torch.float32))
#     loader = DataLoader(dataset, batch_size=1, shuffle=True)

#     # Train the autoencoder model
#     train_autoencoder(autoencoder, vgg_loss, loader, num_epochs=20)

#     # Evaluate and visualize some reconstructed images
#     autoencoder.eval()
#     with torch.no_grad():
#         for i in range(3):  # Show 3 images per category
#             image = centralized_images_by_category[category][i]
#             image_tensor = torch.tensor(image / 255.0).unsqueeze(0).unsqueeze(0).float().to(device)

#             # Reconstruct the image
#             reconstructed_image = autoencoder(image_tensor)

#             # Convert to numpy for visualization
#             reconstructed_image_np = reconstructed_image.squeeze().cpu().numpy()

#             # Plot the images
#             plt.figure(figsize=(10, 5))
#             plt.subplot(1, 2, 1)
#             plt.imshow(image, cmap='gray')
#             plt.title(f'Original {category} Image {i+1}')
#             plt.axis('off')

#             plt.subplot(1, 2, 2)
#             plt.imshow(reconstructed_image_np, cmap='gray')
#             plt.title(f'Reconstructed {category} Image {i+1}')
#             plt.axis('off')

#             plt.show()


In [None]:
from sklearn.decomposition import PCA
import numpy as np

# Function to apply the reduced PCA
def apply_reduced_pca(features_by_category, n_components=1024, number_variance=0.95):
    pca_by_category = {}
    num_components_reduced_dict = {}
    
    for category, features in features_by_category.items():
        all_features = np.vstack(features)  # Stack all features of the category into one matrix
        
        # Check whether there are features to process
        if all_features.size == 0:
            continue
        
        # The number of components is bounded by min(n_samples, n_features)
        n_samples, n_features = all_features.shape
        max_components = min(n_samples, n_features)

        # Fit the PCA with the initial number of components
        pca = PCA(n_components=max_components)
        pca.fit(all_features)
        
        # Find the number of components that retain the desired variance
        cumulative_variance = np.cumsum(pca.explained_variance_ratio_)
        num_components_reduced = np.where(cumulative_variance >= number_variance)[0][0] + 1
        
        # Refit the PCA with the reduced number of components
        pca = PCA(n_components=num_components_reduced)
        pca.fit(all_features)
        
        # Store the PCA and the reduced number of components per category
        pca_by_category[category] = pca
        num_components_reduced_dict[category] = num_components_reduced

    # Compute the minimum number of components across all categories
    min_num_components = min(num_components_reduced_dict.values())
    return pca_by_category, num_components_reduced_dict, min_num_components


# Extract features per category for UNet and VGG
def extract_features_by_category(unet, vgg_loss, images_by_category):
    unet_features_by_category = {}
    vgg_features_by_category = {}

    for category, images in images_by_category.items():
        unet_features_by_category[category] = []
        vgg_features_by_category[category] = []
        
        for image in images:
            image_tensor = torch.tensor(image / 255.0).unsqueeze(0).unsqueeze(0).float().to(device)

            # Extract the U-Net features
            _, unet_features = unet(image_tensor)
            unet_features_combined = unet_features.view(-1).detach().cpu().numpy()
            unet_features_by_category[category].append(unet_features_combined)

            # Extract the VGG features
            vgg_features = vgg_loss(image_tensor)
            vgg_features_combined = torch.cat([f.view(-1).detach() for f in vgg_features], dim=0).cpu().numpy()
            vgg_features_by_category[category].append(vgg_features_combined)
    
    return unet_features_by_category, vgg_features_by_category


# Main function to compute the reduced PCA for U-Net and VGG
def calculate_pca_for_unet_and_vgg(unet, vgg_loss, images_by_category):
    # Extract the U-Net and VGG features per category
    unet_features_by_category, vgg_features_by_category = extract_features_by_category(unet, vgg_loss, images_by_category)

    # Apply the reduced PCA to the U-Net and VGG features at 95%, 90%, 85%, and 80% variance
    pca_unet_95, num_components_unet_95, _ = apply_reduced_pca(unet_features_by_category, number_variance=0.95)
    pca_vgg_95, num_components_vgg_95, _ = apply_reduced_pca(vgg_features_by_category, number_variance=0.95)

    pca_unet_90, num_components_unet_90, _ = apply_reduced_pca(unet_features_by_category, number_variance=0.90)
    pca_vgg_90, num_components_vgg_90, _ = apply_reduced_pca(vgg_features_by_category, number_variance=0.90)

    pca_unet_85, num_components_unet_85, _ = apply_reduced_pca(unet_features_by_category, number_variance=0.85)
    pca_vgg_85, num_components_vgg_85, _ = apply_reduced_pca(vgg_features_by_category, number_variance=0.85)

    pca_unet_80, num_components_unet_80, _ = apply_reduced_pca(unet_features_by_category, number_variance=0.80)
    pca_vgg_80, num_components_vgg_80, _ = apply_reduced_pca(vgg_features_by_category, number_variance=0.80)

    # Return the PCAs computed for U-Net and VGG
    return {
        "unet": {
            "95_variance": pca_unet_95,
            "90_variance": pca_unet_90,
            "85_variance": pca_unet_85,
            "80_variance": pca_unet_80
        },
        "vgg": {
            "95_variance": pca_vgg_95,
            "90_variance": pca_vgg_90,
            "85_variance": pca_vgg_85,
            "80_variance": pca_vgg_80
        },
        "num_components": {
            "unet": {
                "95_variance": num_components_unet_95,
                "90_variance": num_components_unet_90,
                "85_variance": num_components_unet_85,
                "80_variance": num_components_unet_80
            },
            "vgg": {
                "95_variance": num_components_vgg_95,
                "90_variance": num_components_vgg_90,
                "85_variance": num_components_vgg_85,
                "80_variance": num_components_vgg_80
            }
        }
    }

# Apply PCA to the U-Net and VGG features
pca_results = calculate_pca_for_unet_and_vgg(unet, vgg_loss, centralized_images_by_category)

# PCA results
unet_pca_95 = pca_results["unet"]["95_variance"]
vgg_pca_95 = pca_results["vgg"]["95_variance"]

num_components_unet_95 = pca_results["num_components"]["unet"]["95_variance"]
num_components_vgg_95 = pca_results["num_components"]["vgg"]["95_variance"]

# Show the reduced number of components per category for U-Net and VGG
print("Number of Components for U-Net (95% Variance):", num_components_unet_95)
print("Number of Components for VGG (95% Variance):", num_components_vgg_95)


The `components_` matrix has the shape (n_components, n_features), but when you project the original data into this new principal-component space, the data is transformed into a matrix of shape (n_samples, n_components).
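
These shapes can be checked on a small synthetic example. The array sizes below are arbitrary and chosen only for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 8))        # 30 samples, 8 features (synthetic)

pca = PCA(n_components=3).fit(X)
Z = pca.transform(X)                # coordinates in the principal-component space
X_back = pca.inverse_transform(Z)   # mapped back to the original feature space

print(pca.components_.shape, Z.shape, X_back.shape)  # (3, 8) (30, 3) (30, 8)
```

With the default settings, `transform` is equivalent to subtracting `pca.mean_` and multiplying by `components_.T`, which is why the result has one column per component.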

### Testing phase


In [None]:
def load_and_preprocess_test_images(test_dir, categories, image_size, input_size=(224,224)):
    test_images_by_category = load_images_by_category(test_dir, categories, image_size)
    test_centralized_images_by_category = {}

    for category, images in test_images_by_category.items():
        test_centralized_images = center_images(images)
        test_centralized_images_by_category[category] = test_centralized_images

    return test_centralized_images_by_category

# Use the same categories and image size as in training so the feature dimensions match
test_preprocessed_images_by_category = load_and_preprocess_test_images('images_test', unique_categories, image_size=standard_size)

In [None]:
for category, images in test_preprocessed_images_by_category.items():
    mean = check_centralization(images)
    print(f"Mean pixel values after centralization for category {category}: {mean}")

In [None]:
# Function to extract the test features for UNet and VGG
def extract_test_features(test_images_by_category, unet_model, vgg_model):
    unet_model.eval()
    vgg_model.eval()
    
    unet_features_by_category = {}
    vgg_features_by_category = {}

    with torch.no_grad():
        for category, images in test_images_by_category.items():
            unet_features_by_category[category] = []
            vgg_features_by_category[category] = []

            for i, image in enumerate(images):
                # Preprocess the image: scale by 1/255 and add batch and channel dimensions
                image_tensor = torch.tensor(image / 255.0).unsqueeze(0).unsqueeze(0).float().to(device)

                # Extract the U-Net features
                reconstructed_image, unet_features = unet_model(image_tensor)
                unet_features_combined = unet_features.view(-1).cpu().numpy()
                unet_features_by_category[category].append(unet_features_combined)

                # Extract the VGG features
                vgg_features = vgg_model(image_tensor)
                vgg_features_combined = torch.cat([f.view(-1) for f in vgg_features], dim=0).cpu().numpy()
                vgg_features_by_category[category].append(vgg_features_combined)

    return unet_features_by_category, vgg_features_by_category

# Extract features from the preprocessed test images
# (assumes test_preprocessed_images_by_category was loaded in an earlier cell)
unet_test_features, vgg_test_features = extract_test_features(test_preprocessed_images_by_category, unet, vgg_loss)

# The test features are now stored in 'unet_test_features' and 'vgg_test_features'


In [None]:
from sklearn.decomposition import PCA
import numpy as np

# Compute residuals by projecting the features onto the PCA subspace and back
def project_and_calculate_residuals(features, pca):
    # Make sure the features are 2D: (n_samples, n_features)
    if features.ndim == 1:
        features = features.reshape(1, -1)
    
    # Reconstruct the features from their PCA projection and subtract
    projected_features = pca.inverse_transform(pca.transform(features))
    residuals = features - projected_features
    return residuals

# Compute the OOD score from the residuals
def calculate_ood_scores(residuals):
    # Per-sample score: mean absolute residual; the returned value averages over all samples
    ood_scores = np.mean(np.abs(residuals), axis=1)
    return np.mean(ood_scores)

# Stack a list of residual arrays and compute the OOD score
def process_and_calculate_ood(residuals_list):
    total_residuals = np.vstack(residuals_list)
    return calculate_ood_scores(total_residuals)

# Project each feature vector with the given PCA and compute the OOD score
def process_and_calculate_ood_with_pca(features_list, pca_model):
    residuals_list = []
    
    for features in features_list:
        # Treat a single feature vector as a batch of one sample
        if features.ndim == 1:
            features = features.reshape(1, -1)
        
        # Apply the PCA and compute the residuals
        residuals = project_and_calculate_residuals(features, pca_model)
        residuals_list.append(residuals)
    
    return process_and_calculate_ood(residuals_list)

# Fit a PCA on a list of feature vectors (no normalization)
def train_pca_on_features(features_list, n_components=0.95):
    # Stack all feature vectors into one matrix
    all_features = np.vstack(features_list)
    
    # Keep enough components to explain the requested fraction of variance (95% by default)
    pca_model = PCA(n_components=n_components)
    pca_model.fit(all_features)
    
    return pca_model

# Reuse the per-category PCA models fitted on the training U-Net features (95% variance),
# so the test features below are scored against subspaces they did not help fit
pca_coast = unet_pca_95['Coast']
pca_office = unet_pca_95['Office']

# Compute OOD scores by projecting the test features onto the PCA components
ood_score_coast_train_coast_test = process_and_calculate_ood_with_pca(unet_test_features['Coast'], pca_coast)
ood_score_coast_train_office_test = process_and_calculate_ood_with_pca(unet_test_features['Office'], pca_coast)
ood_score_office_train_office_test = process_and_calculate_ood_with_pca(unet_test_features['Office'], pca_office)
ood_score_office_train_coast_test = process_and_calculate_ood_with_pca(unet_test_features['Coast'], pca_office)

# Show the results
print("OOD scores using Coast training data for the PCA components - Coast test data:")
print(f"OOD Score: {ood_score_coast_train_coast_test}")

print("\nOOD scores using Coast training data for the PCA components - Office test data:")
print(f"OOD Score: {ood_score_coast_train_office_test}")

print("\nOOD scores using Office training data for the PCA components - Office test data:")
print(f"OOD Score: {ood_score_office_train_office_test}")

print("\nOOD scores using Office training data for the PCA components - Coast test data:")
print(f"OOD Score: {ood_score_office_train_coast_test}")


In [None]:
# Reuse the per-category PCA models fitted on the training VGG features (95% variance)
pca_coast = vgg_pca_95['Coast']
pca_office = vgg_pca_95['Office']

# Compute OOD scores by projecting the test VGG features onto the PCA components
ood_score_coast_train_coast_test = process_and_calculate_ood_with_pca(vgg_test_features['Coast'], pca_coast)
ood_score_coast_train_office_test = process_and_calculate_ood_with_pca(vgg_test_features['Office'], pca_coast)
ood_score_office_train_office_test = process_and_calculate_ood_with_pca(vgg_test_features['Office'], pca_office)
ood_score_office_train_coast_test = process_and_calculate_ood_with_pca(vgg_test_features['Coast'], pca_office)

# Show the results
print("OOD scores using Coast training data for the PCA components - Coast test data (VGG features):")
print(f"OOD Score: {ood_score_coast_train_coast_test}")

print("\nOOD scores using Coast training data for the PCA components - Office test data (VGG features):")
print(f"OOD Score: {ood_score_coast_train_office_test}")

print("\nOOD scores using Office training data for the PCA components - Office test data (VGG features):")
print(f"OOD Score: {ood_score_office_train_office_test}")

print("\nOOD scores using Office training data for the PCA components - Coast test data (VGG features):")
print(f"OOD Score: {ood_score_office_train_coast_test}")
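
The cells above collapse each train/test pair into a single averaged number. As a complement, here is a minimal sketch of a per-sample residual score, which makes it possible to look at the score distribution instead of only its mean. It assumes synthetic data in place of the U-Net or VGG features; `residual_scores`, `toy_basis`, and the other names are hypothetical and not part of the notebook. It uses the L2 norm of the residual rather than the mean absolute residual used above.

```python
import numpy as np
from sklearn.decomposition import PCA

def residual_scores(features, pca):
    """Per-sample L2 norm of the part of each vector the PCA subspace cannot reconstruct."""
    reconstructed = pca.inverse_transform(pca.transform(features))
    return np.linalg.norm(features - reconstructed, axis=1)

# Synthetic stand-in: in-distribution samples lie near a 3-D subspace of a 30-D feature space
rng = np.random.default_rng(0)
toy_basis = rng.normal(size=(3, 30))

def sample_in_distribution(n):
    return rng.normal(size=(n, 3)) @ toy_basis + 0.05 * rng.normal(size=(n, 30))

toy_train = sample_in_distribution(200)
toy_test_in = sample_in_distribution(50)
toy_test_out = 2.0 * rng.normal(size=(50, 30))   # isotropic samples, not tied to the subspace

# Fit on the training samples only, then score both held-out sets
toy_pca = PCA(n_components=0.95).fit(toy_train)
scores_in = residual_scores(toy_test_in, toy_pca)
scores_out = residual_scores(toy_test_out, toy_pca)

print(f"mean in-distribution score: {scores_in.mean():.3f}")
print(f"mean out-of-distribution score: {scores_out.mean():.3f}")
```

Samples drawn away from the training subspace leave a larger residual, which is the property the OOD score relies on.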


## ✌️ Part II: Comparing two similar environments

In [None]:
train_categories = ['Bedroom', 'LivingRoom']

df_different = df[df['category'].isin(train_categories)]
df_different

In [None]:
X = df_different['image_path']
y = df_different['category']
(X_train, X_test, y_train, y_test) = train_test_split(X, y, test_size=0.2, random_state=10)

image_size = (224, 224)
unique_categories = list(df_different['category'].unique())
print(f"Unique categories: {unique_categories}")


In [None]:
create_images_set(X_train, X_test, y_train, y_test, output_dir_train='images_train', output_dir_test='images_test', standard_size=standard_size)

In [None]:
training_images_by_category = load_images_by_category('images_train', unique_categories, image_size=(224, 224))

In [None]:
centralized_images_by_category = {}
scalers_by_category = {}
for category, images in training_images_by_category.items():
    centralized_images = center_images(images)
    centralized_images_by_category[category] = centralized_images


In [None]:
def check_centralization(images):
    mean = np.mean(images, axis=(0, 1, 2))
    return mean

for category, images in centralized_images_by_category.items():
    mean = check_centralization(images)
    print(f"Mean pixel values after centralization for category {category}: {mean}")


In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
import matplotlib.pyplot as plt
from torch.utils.data import DataLoader, TensorDataset
from torchvision import models
from sklearn.decomposition import PCA

# Clear GPU cache before running the model
torch.cuda.empty_cache()

# Check if GPU is available and set the device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Using device: {device}')

# UNet model modified to output features
class UNet(nn.Module):
    def __init__(self, in_channels=1, out_channels=1):
        super(UNet, self).__init__()

        # Encoder
        self.encoder1 = self.contracting_block(in_channels, 64)
        self.encoder2 = self.contracting_block(64, 128)
        self.encoder3 = self.contracting_block(128, 256)
        self.encoder4 = self.contracting_block(256, 512)

        # Decoder
        self.upconv4 = nn.ConvTranspose2d(512, 256, kernel_size=2, stride=2)
        self.decoder4 = self.expansive_block(512, 256)
        self.upconv3 = nn.ConvTranspose2d(256, 128, kernel_size=2, stride=2)
        self.decoder3 = self.expansive_block(256, 128)
        self.upconv2 = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)
        self.decoder2 = self.expansive_block(128, 64)
        self.decoder1 = self.final_block(64, out_channels)

    def contracting_block(self, in_channels, out_channels):
        block = nn.Sequential(
            nn.Conv2d(kernel_size=3, in_channels=in_channels, out_channels=out_channels, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.Conv2d(kernel_size=3, in_channels=out_channels, out_channels=out_channels, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.Dropout(0.5)
        )
        return block

    def expansive_block(self, in_channels, out_channels):
        block = nn.Sequential(
            nn.Conv2d(kernel_size=3, in_channels=in_channels, out_channels=out_channels, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.Conv2d(kernel_size=3, in_channels=out_channels, out_channels=out_channels, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.Dropout(0.5)
        )
        return block

    def final_block(self, in_channels, out_channels):
        block = nn.Sequential(
            nn.Conv2d(kernel_size=3, in_channels=in_channels, out_channels=out_channels, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.Conv2d(kernel_size=3, in_channels=out_channels, out_channels=out_channels, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.Conv2d(kernel_size=1, in_channels=out_channels, out_channels=out_channels),
            nn.Sigmoid()  # Sigmoid activation keeps the output in [0, 1]
        )
        return block

    def crop_and_concat(self, upsampled, bypass):
        _, _, H, W = upsampled.size()
        _, _, H_b, W_b = bypass.size()
        if H_b != H or W_b != W:
            bypass = F.interpolate(bypass, size=(H, W), mode='bilinear', align_corners=True)
        return torch.cat((upsampled, bypass), 1)

    def forward(self, x):
        # Encoder path
        enc1 = self.encoder1(x)
        enc2 = self.encoder2(F.max_pool2d(enc1, kernel_size=2, stride=2))
        enc3 = self.encoder3(F.max_pool2d(enc2, kernel_size=2, stride=2))
        enc4 = self.encoder4(F.max_pool2d(enc3, kernel_size=2, stride=2))

        # Decoder path
        dec4 = self.crop_and_concat(self.upconv4(enc4), enc3)
        dec4 = self.decoder4(dec4)
        dec3 = self.crop_and_concat(self.upconv3(dec4), enc2)
        dec3 = self.decoder3(dec3)
        dec2 = self.crop_and_concat(self.upconv2(dec3), enc1)
        dec2 = self.decoder2(dec2)
        dec1 = self.decoder1(dec2)

        return dec1, enc4  # Return both the reconstructed image and the deepest encoder (bottleneck) features

# VGG-based perceptual loss (Feature Extractor)
class VGGPerceptualLoss(nn.Module):
    def __init__(self):
        super(VGGPerceptualLoss, self).__init__()
        vgg = models.vgg19(pretrained=True).features
        self.slice1 = nn.Sequential(*vgg[:5])   # conv1_2
        self.slice2 = nn.Sequential(*vgg[5:10]) # conv2_2
        self.slice3 = nn.Sequential(*vgg[10:19])# conv3_4
        self.slice4 = nn.Sequential(*vgg[19:28])# conv4_4
        for p in self.parameters():
            p.requires_grad = False

    def forward(self, x):
        # Convert grayscale images to 3 channels by repeating the single channel
        x = x.repeat(1, 3, 1, 1)
        h_relu1 = self.slice1(x)
        h_relu2 = self.slice2(h_relu1)
        h_relu3 = self.slice3(h_relu2)
        h_relu4 = self.slice4(h_relu3)
        return [h_relu1, h_relu2, h_relu3, h_relu4]

# Training function for the U-Net (reconstruction objective)
def train_unet(unet, data_loader, num_epochs=20):
    criterion = nn.MSELoss()
    optimizer = torch.optim.Adam(unet.parameters(), lr=0.0001)

    for epoch in range(num_epochs):
        unet.train()
        epoch_loss = 0
        for images, in data_loader:
            images = images.to(device).float()
            
            # Scale pixel values by 1/255
            images /= 255.0

            optimizer.zero_grad()
            reconstructed_images, _ = unet(images)
            loss = criterion(reconstructed_images, images)
            loss.backward()
            optimizer.step()

            epoch_loss += loss.item()

        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {epoch_loss/len(data_loader):.4f}')

# Initialize the U-Net and the VGG feature extractor
unet = UNet().to(device)
vgg_loss = VGGPerceptualLoss().to(device)

# Build one DataLoader per category
categories = centralized_images_by_category.keys()

for category in categories:
    images = centralized_images_by_category[category]

    # Add the channel dimension and create the DataLoader
    images = np.expand_dims(images, axis=1)
    dataset = TensorDataset(torch.tensor(images, dtype=torch.float32))
    loader = DataLoader(dataset, batch_size=1, shuffle=True)

    # Train the U-Net for 20 epochs on this category (adjust as needed)
    train_unet(unet, loader, num_epochs=20)

    # Visualize the first images of each category
    unet.eval()
    vgg_loss.eval()
    with torch.no_grad():
        for i in range(3):  # Show 3 images per category
            image = centralized_images_by_category[category][i]
            
            # Scale the image and add batch and channel dimensions before the U-Net
            image_tensor = torch.tensor(image / 255.0).unsqueeze(0).unsqueeze(0).float().to(device)

            # Run the U-Net
            reconstructed_image, unet_features = unet(image_tensor)

            # Extract features with the VGG model
            vgg_features = vgg_loss(image_tensor)
            vgg_features_combined = torch.cat([f.view(-1) for f in vgg_features], dim=0).cpu().numpy()

            # Flatten the U-Net features and convert them to numpy
            unet_features_combined = unet_features.view(-1).cpu().numpy()

            # Convert the reconstruction to numpy for plotting
            reconstructed_image_np = reconstructed_image.squeeze().cpu().numpy()

            # Plot the images
            plt.figure(figsize=(20, 5))
            
            # Original image
            plt.subplot(1, 4, 1)
            plt.imshow(image, cmap='gray')
            plt.title(f'Original {category} Image {i+1}')
            plt.axis('off')

            # Reconstructed image
            plt.subplot(1, 4, 2)
            plt.imshow(reconstructed_image_np, cmap='gray')
            plt.title(f'Reconstructed {category} Image {i+1}')
            plt.axis('off')

            # UNet Features
            plt.subplot(1, 4, 3)
            plt.plot(unet_features_combined)
            plt.title(f'UNet Features {category} Image {i+1}')
            plt.axis('on')

            # VGG Features
            plt.subplot(1, 4, 4)
            plt.plot(vgg_features_combined)
            plt.title(f'VGG Features {category} Image {i+1}')
            plt.axis('on')

            plt.tight_layout()
            plt.show()


In [None]:
from sklearn.decomposition import PCA
import numpy as np

# Fit a reduced PCA per category that keeps the requested fraction of variance
def apply_reduced_pca(features_by_category, n_components=1024, number_variance=0.95):
    pca_by_category = {}
    num_components_reduced_dict = {}
    
    for category, features in features_by_category.items():
        all_features = np.vstack(features)  # Stack all features of the category into one matrix
        
        # Skip categories that have no features
        if all_features.size == 0:
            continue
        
        # The number of components cannot exceed min(n_samples, n_features)
        n_samples, n_features = all_features.shape
        max_components = min(n_samples, n_features)

        # Fit a PCA with the maximum number of components
        pca = PCA(n_components=max_components)
        pca.fit(all_features)
        
        # Find the smallest number of components that retains the desired variance
        cumulative_variance = np.cumsum(pca.explained_variance_ratio_)
        num_components_reduced = np.where(cumulative_variance >= number_variance)[0][0] + 1
        
        # Refit the PCA with the reduced number of components
        pca = PCA(n_components=num_components_reduced)
        pca.fit(all_features)
        
        # Store the PCA and the reduced number of components for this category
        pca_by_category[category] = pca
        num_components_reduced_dict[category] = num_components_reduced

    # Smallest number of components across all categories
    min_num_components = min(num_components_reduced_dict.values())
    return pca_by_category, num_components_reduced_dict, min_num_components


# Extract features per category with the U-Net and VGG
def extract_features_by_category(unet, vgg_loss, images_by_category):
    unet_features_by_category = {}
    vgg_features_by_category = {}

    for category, images in images_by_category.items():
        unet_features_by_category[category] = []
        vgg_features_by_category[category] = []
        
        for image in images:
            image_tensor = torch.tensor(image / 255.0).unsqueeze(0).unsqueeze(0).float().to(device)

            # Extract the U-Net features
            _, unet_features = unet(image_tensor)
            unet_features_combined = unet_features.view(-1).detach().cpu().numpy()
            unet_features_by_category[category].append(unet_features_combined)

            # Extract the VGG features
            vgg_features = vgg_loss(image_tensor)
            vgg_features_combined = torch.cat([f.view(-1).detach() for f in vgg_features], dim=0).cpu().numpy()
            vgg_features_by_category[category].append(vgg_features_combined)
    
    return unet_features_by_category, vgg_features_by_category


# Main function: compute the reduced PCA for the U-Net and VGG features
def calculate_pca_for_unet_and_vgg(unet, vgg_loss, images_by_category):
    # Extract the U-Net and VGG features per category
    unet_features_by_category, vgg_features_by_category = extract_features_by_category(unet, vgg_loss, images_by_category)

    # Apply the reduced PCA to the U-Net and VGG features at 95%, 90%, 85%, and 80% variance
    pca_unet_95, num_components_unet_95, _ = apply_reduced_pca(unet_features_by_category, number_variance=0.95)
    pca_vgg_95, num_components_vgg_95, _ = apply_reduced_pca(vgg_features_by_category, number_variance=0.95)

    pca_unet_90, num_components_unet_90, _ = apply_reduced_pca(unet_features_by_category, number_variance=0.90)
    pca_vgg_90, num_components_vgg_90, _ = apply_reduced_pca(vgg_features_by_category, number_variance=0.90)

    pca_unet_85, num_components_unet_85, _ = apply_reduced_pca(unet_features_by_category, number_variance=0.85)
    pca_vgg_85, num_components_vgg_85, _ = apply_reduced_pca(vgg_features_by_category, number_variance=0.85)

    pca_unet_80, num_components_unet_80, _ = apply_reduced_pca(unet_features_by_category, number_variance=0.80)
    pca_vgg_80, num_components_vgg_80, _ = apply_reduced_pca(vgg_features_by_category, number_variance=0.80)

    # Return the fitted PCAs and the component counts for U-Net and VGG
    return {
        "unet": {
            "95_variance": pca_unet_95,
            "90_variance": pca_unet_90,
            "85_variance": pca_unet_85,
            "80_variance": pca_unet_80
        },
        "vgg": {
            "95_variance": pca_vgg_95,
            "90_variance": pca_vgg_90,
            "85_variance": pca_vgg_85,
            "80_variance": pca_vgg_80
        },
        "num_components": {
            "unet": {
                "95_variance": num_components_unet_95,
                "90_variance": num_components_unet_90,
                "85_variance": num_components_unet_85,
                "80_variance": num_components_unet_80
            },
            "vgg": {
                "95_variance": num_components_vgg_95,
                "90_variance": num_components_vgg_90,
                "85_variance": num_components_vgg_85,
                "80_variance": num_components_vgg_80
            }
        }
    }



# Apply PCA to the U-Net and VGG features
pca_results = calculate_pca_for_unet_and_vgg(unet, vgg_loss, centralized_images_by_category)

# PCA results: dictionaries mapping each category to its fitted PCA model
unet_pca_95 = pca_results["unet"]["95_variance"]
vgg_pca_95 = pca_results["vgg"]["95_variance"]

num_components_unet_95 = pca_results["num_components"]["unet"]["95_variance"]
num_components_vgg_95 = pca_results["num_components"]["vgg"]["95_variance"]

# Show the number of retained components per category for U-Net and VGG
print("Number of Components for U-Net (95% Variance):", num_components_unet_95)
print("Number of Components for VGG (95% Variance):", num_components_vgg_95)
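
The cumulative-variance selection in `apply_reduced_pca` can be cross-checked against scikit-learn's built-in fractional `n_components`, which `train_pca_on_features` relies on. This is a minimal sketch on synthetic data; `X_toy`, `k_manual`, and `auto_pca` are illustrative names, not part of the notebook.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic features whose columns have decreasing spread
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(100, 10)) * np.linspace(5.0, 0.5, 10)

# Manual rule: smallest k whose cumulative explained-variance ratio reaches 95%
full_pca = PCA().fit(X_toy)
cumulative = np.cumsum(full_pca.explained_variance_ratio_)
k_manual = int(np.where(cumulative >= 0.95)[0][0] + 1)

# Built-in rule: a float in (0, 1) asks scikit-learn to pick the count itself
auto_pca = PCA(n_components=0.95).fit(X_toy)

print("manual selection:", k_manual)
print("PCA(n_components=0.95):", auto_pca.n_components_)
```

Both routes should agree on generic data, so the manual two-pass fit is mainly useful when the intermediate cumulative-variance curve is needed for plotting.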


In [None]:
def load_and_preprocess_test_images(test_dir, categories, image_size, input_size=(224,224)):
    test_images_by_category = load_images_by_category(test_dir, categories, image_size)
    test_centralized_images_by_category = {}

    for category, images in test_images_by_category.items():
        test_centralized_images = center_images(images)
        test_centralized_images_by_category[category] = test_centralized_images

    return test_centralized_images_by_category

test_preprocessed_images_by_category = load_and_preprocess_test_images('images_test', unique_categories, image_size=(224,224), input_size=(224,224))

In [None]:
for category, images in test_preprocessed_images_by_category.items():
    mean = check_centralization(images)
    print(f"Mean pixel values after centralization for test category {category}: {mean}")

In [None]:
import torch
from sklearn.decomposition import PCA
import numpy as np

# Extract features per category with the U-Net and VGG
def extract_features_by_category(unet, vgg_loss, images_by_category):
    unet_features_by_category = {}
    vgg_features_by_category = {}

    for category, images in images_by_category.items():
        unet_features_by_category[category] = []
        vgg_features_by_category[category] = []
        
        for image in images:
            image_tensor = torch.tensor(image / 255.0).unsqueeze(0).unsqueeze(0).float().to(device)

            # Extract the U-Net features
            _, unet_features = unet(image_tensor)
            unet_features_combined = unet_features.view(-1).detach().cpu().numpy()
            unet_features_by_category[category].append(unet_features_combined)

            # Extract the VGG features
            vgg_features = vgg_loss(image_tensor)
            vgg_features_combined = torch.cat([f.view(-1).detach() for f in vgg_features], dim=0).cpu().numpy()
            vgg_features_by_category[category].append(vgg_features_combined)
    
    return unet_features_by_category, vgg_features_by_category

# Extract features from the training images
# (centralized_images_by_category holds the centered training images loaded above)
unet_features_by_category, vgg_features_by_category = extract_features_by_category(unet, vgg_loss, centralized_images_by_category)

# Compute residuals by projecting the features onto the PCA subspace and back
def project_and_calculate_residuals(features, pca):
    if features.ndim == 1:
        features = features.reshape(1, -1)
    
    projected_features = pca.inverse_transform(pca.transform(features))
    residuals = features - projected_features
    return residuals

# Compute the OOD score (mean absolute residual, averaged over samples)
def calculate_ood_scores(residuals):
    ood_scores = np.mean(np.abs(residuals), axis=1)
    return np.mean(ood_scores)

# Project each feature vector with the given PCA and compute the OOD score
def process_and_calculate_ood_with_pca(features_list, pca_model):
    residuals_list = []
    
    for features in features_list:
        if features.ndim == 1:
            features = features.reshape(1, -1)
        
        residuals = project_and_calculate_residuals(features, pca_model)
        residuals_list.append(residuals)
    
    return calculate_ood_scores(np.vstack(residuals_list))

# Fit a PCA on a list of feature vectors
def train_pca_on_features(features_list, n_components=0.95):
    all_features = np.vstack(features_list)
    
    # Keep enough components to explain the requested fraction of variance (95% by default)
    pca_model = PCA(n_components=n_components)
    pca_model.fit(all_features)
    
    return pca_model

# Fit one PCA per category on the training features extracted above
pca_bedroom = train_pca_on_features(unet_features_by_category['Bedroom'], n_components=0.95)
pca_livingRoom = train_pca_on_features(unet_features_by_category['LivingRoom'], n_components=0.95)

# Extract features from the held-out test images so the scores are not computed on the data the PCA was fit on
unet_test_features_by_category, vgg_test_features_by_category = extract_features_by_category(unet, vgg_loss, test_preprocessed_images_by_category)

# Compute the OOD scores of the test features against each PCA subspace
ood_score_bedroom_train_bedroom_test = process_and_calculate_ood_with_pca(unet_test_features_by_category['Bedroom'], pca_bedroom)
ood_score_bedroom_train_livingRoom_test = process_and_calculate_ood_with_pca(unet_test_features_by_category['LivingRoom'], pca_bedroom)
ood_score_livingRoom_train_livingRoom_test = process_and_calculate_ood_with_pca(unet_test_features_by_category['LivingRoom'], pca_livingRoom)
ood_score_livingRoom_train_bedroom_test = process_and_calculate_ood_with_pca(unet_test_features_by_category['Bedroom'], pca_livingRoom)

# Show the results
print("OOD scores using Bedroom training data for the PCA components - Bedroom test data:")
print(f"OOD Score: {ood_score_bedroom_train_bedroom_test}")

print("\nOOD scores using Bedroom training data for the PCA components - LivingRoom test data:")
print(f"OOD Score: {ood_score_bedroom_train_livingRoom_test}")

print("\nOOD scores using LivingRoom training data for the PCA components - LivingRoom test data:")
print(f"OOD Score: {ood_score_livingRoom_train_livingRoom_test}")

print("\nOOD scores using LivingRoom training data for the PCA components - Bedroom test data:")
print(f"OOD Score: {ood_score_livingRoom_train_bedroom_test}")


In [None]:
# Fit one PCA per category on the VGG features of the training images
pca_bedroom = train_pca_on_features(vgg_features_by_category['Bedroom'], n_components=0.95)
pca_livingRoom = train_pca_on_features(vgg_features_by_category['LivingRoom'], n_components=0.95)

# Extract VGG features from the held-out test images (the U-Net output is ignored here)
_, vgg_test_features_by_category = extract_features_by_category(unet, vgg_loss, test_preprocessed_images_by_category)

# Compute OOD scores by projecting the test VGG features onto each PCA subspace
ood_score_bedroom_train_bedroom_test = process_and_calculate_ood_with_pca(vgg_test_features_by_category['Bedroom'], pca_bedroom)
ood_score_bedroom_train_livingRoom_test = process_and_calculate_ood_with_pca(vgg_test_features_by_category['LivingRoom'], pca_bedroom)
ood_score_livingRoom_train_livingRoom_test = process_and_calculate_ood_with_pca(vgg_test_features_by_category['LivingRoom'], pca_livingRoom)
ood_score_livingRoom_train_bedroom_test = process_and_calculate_ood_with_pca(vgg_test_features_by_category['Bedroom'], pca_livingRoom)

# Show the results
print("OOD scores using Bedroom training data for the PCA components - Bedroom test data (VGG features):")
print(f"OOD Score: {ood_score_bedroom_train_bedroom_test}")

print("\nOOD scores using Bedroom training data for the PCA components - LivingRoom test data (VGG features):")
print(f"OOD Score: {ood_score_bedroom_train_livingRoom_test}")

print("\nOOD scores using LivingRoom training data for the PCA components - LivingRoom test data (VGG features):")
print(f"OOD Score: {ood_score_livingRoom_train_livingRoom_test}")

print("\nOOD scores using LivingRoom training data for the PCA components - Bedroom test data (VGG features):")
print(f"OOD Score: {ood_score_livingRoom_train_bedroom_test}")


## All environments

In [None]:
X = df['image_path'].tolist()
y = df['category'].tolist()
unique_categories = list(df['category'].unique())
print(f"Unique categories: {unique_categories}")

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=10)

standard_size = (224, 224)

In [None]:
create_images_set(X_train, X_test, y_train, y_test, output_dir_train='images_train', output_dir_test='images_test', standard_size=standard_size)

In [None]:
all_test_preprocessed_images_by_category = load_and_preprocess_test_images('images_test', y, image_size, input_size)

In [None]:
all_test_features_by_category = extract_features_with_vgg16(model, all_test_preprocessed_images_by_category)

In [None]:
all_training_preprocessed_images_by_category = load_and_preprocess_test_images('images_train', y, image_size, input_size)

In [None]:
all_training_features_by_category = extract_features_with_vgg16(model, all_training_preprocessed_images_by_category)

In [None]:
pca_by_category = {}
explained_variance_by_category = {}

for category, features in all_training_features_by_category.items():
    pca = PCA(n_components=0.95)  
    principal_components = pca.fit_transform(features)
    pca_by_category[category] = pca
    explained_variance_by_category[category] = pca.explained_variance_ratio_
    
    print(f"Category {category}, principal components: {principal_components.shape[1]}")

for category, pca in pca_by_category.items():
    print(f"Category {category}, principal components shape: {pca.components_.shape}")
    print(f"Category {category}, explained variance: {np.sum(explained_variance_by_category[category]) * 100:.2f}%")

In [None]:
def calculate_reconstruction_error(test_features, pca_by_category):
    reconstruction_errors_by_category = {}
    mean_reconstruction_errors_by_category = {}
    
    for category, pca in pca_by_category.items():
        principal_components = pca.transform(test_features)
        reconstructed_features = pca.inverse_transform(principal_components)
        
        # Relative error: each sample's residual norm divided by that sample's own feature norm
        reconstruction_error = np.linalg.norm(test_features - reconstructed_features, axis=1)
        reconstruction_errors_by_category[category] = reconstruction_error / np.linalg.norm(test_features, axis=1)

    for category, errors in reconstruction_errors_by_category.items():
        mean_reconstruction_errors_by_category[category] = np.mean(errors)
    
    best_category = min(mean_reconstruction_errors_by_category, key=mean_reconstruction_errors_by_category.get)

    for category in mean_reconstruction_errors_by_category:
        print(f"Category {category}, mean reconstruction error: {mean_reconstruction_errors_by_category[category]}")
    
    print(f"Best category: {best_category}")
    print("=====================================")

    return mean_reconstruction_errors_by_category, best_category

for category, test_features in all_test_features_by_category.items():
    print(f"Test category: {category}")
    mean_reconstruction_errors, best_category = calculate_reconstruction_error(test_features, pca_by_category)


In [None]:
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

true_categories = []
predicted_categories = []

for true_category, test_features in all_test_features_by_category.items():
    print(f"True category: {true_category}")
    mean_reconstruction_errors, best_category = calculate_reconstruction_error(test_features, pca_by_category)
    true_categories.append(true_category)
    predicted_categories.append(best_category)

labels = list(pca_by_category.keys())
cm = confusion_matrix(true_categories, predicted_categories, labels=labels)

cm_display = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=labels)
cm_display.plot(cmap=plt.cm.Blues, xticks_rotation='vertical')
plt.title('Confusion Matrix')
plt.show()

In [None]:
labels = list(pca_by_category.keys())
# Named error_matrix to avoid shadowing sklearn's confusion_matrix function imported earlier
error_matrix = np.zeros((len(labels), len(labels)))

for true_category, test_features in all_test_features_by_category.items():
    mean_reconstruction_errors, _ = calculate_reconstruction_error(test_features, pca_by_category)
    true_idx = labels.index(true_category)
    
    for pred_category, error in mean_reconstruction_errors.items():
        pred_idx = labels.index(pred_category)
        error_matrix[true_idx, pred_idx] = error

# Normalize by the largest error so all entries fall in [0, 1]
max_error = error_matrix.max()
error_matrix_normalized = error_matrix / max_error

fig, ax = plt.subplots(figsize=(10, 10))
cax = ax.matshow(error_matrix_normalized, cmap=plt.cm.Blues)
plt.colorbar(cax)

ax.set_xticks(np.arange(len(labels)))
ax.set_yticks(np.arange(len(labels)))
ax.set_xticklabels(labels, rotation=90)
ax.set_yticklabels(labels)

plt.title('Discrimination Matrix with Mean Reconstruction Errors')
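# Columns index the PCA subspace used for reconstruction; rows index the true test category
plt.xlabel('Projection/Eigen Space (PCA category)')
plt.ylabel('Latent Space (test category)')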
plt.show()