<a href="https://colab.research.google.com/github/samugatu/CIS-IEEE/blob/main/CloudsClassification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Cloud Classification Using a CNN

# Part 1: Kaggle Challenge (Network from Scratch with PyTorch and torchvision)

**Goal**: Build and train a simple neural network and achieve good classification accuracy.

Libraries used:

In [32]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchvision import models
import os
from torchvision.datasets import ImageFolder
import zipfile


Defining the transforms:

In [33]:
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])
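
As a quick check of what `Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))` does: `ToTensor` scales pixel values to [0, 1], and `Normalize` then computes `(x - mean) / std` per channel, so with mean 0.5 and std 0.5 the values end up in [-1, 1]. The sketch below reproduces that arithmetic in plain Python; the helper name `normalize` is illustrative and not part of torchvision.

```python
def normalize(value, mean=0.5, std=0.5):
    """Apply the per-channel formula used by transforms.Normalize."""
    return (value - mean) / std

# Pixel intensities after ToTensor lie in [0, 1].
pixels = [0.0, 0.25, 0.5, 1.0]
normalized = [normalize(p) for p in pixels]
print(normalized)  # [-1.0, -0.5, 0.0, 1.0]
```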


Let's load our training and test datasets:

In [34]:
with zipfile.ZipFile("clouds.zip","r") as zip_ref:
    zip_ref.extractall("clouds")

In [35]:

train_dir = '/content/clouds/clouds/clouds_train'
test_dir = '/content/clouds/clouds/clouds_test'

train_dataset = ImageFolder(train_dir, transform=transform)
test_dataset = ImageFolder(test_dir, transform=transform)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)
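
`ImageFolder` expects one subdirectory per class and assigns label indices by sorting the subdirectory names. The sketch below mimics that mapping with the standard library only; the folder names are made up for illustration and are not the actual classes of this dataset.

```python
import os
import tempfile

def build_class_to_idx(root):
    """Map each class subdirectory to an index, in sorted name order."""
    classes = sorted(
        entry.name for entry in os.scandir(root) if entry.is_dir()
    )
    return {name: idx for idx, name in enumerate(classes)}

# Hypothetical class folders, created in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    for name in ["stratus", "cirrus", "cumulus"]:
        os.makedirs(os.path.join(root, name))
    class_to_idx = build_class_to_idx(root)

print(class_to_idx)  # {'cirrus': 0, 'cumulus': 1, 'stratus': 2}
```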

Now let's define our neural network, with two layers (one convolutional and one fully connected):

In [36]:
class Net(nn.Module):
  def __init__(self, numclasses):
    super(Net, self).__init__()

    self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
    self.relu1 = nn.ReLU()
    self.pool1 = nn.MaxPool2d(2, 2)
    self.fc1 = nn.Linear(16 * 16 * 16, numclasses)

  def forward(self, x):
    x=self.pool1(torch.relu(self.conv1(x)))
    x=x.view(-1, 16 * 16 * 16)
    x=self.fc1(x)
    return x
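
The `16 * 16 * 16` in `fc1` follows from the layer shapes: a 3x3 convolution with padding 1 keeps the 32x32 spatial size, and the 2x2 max-pool halves it to 16x16, with 16 output channels. A small sketch of that arithmetic, using the standard output-size formula `floor((n + 2p - k) / s) + 1` (the helper name `out_size` is hypothetical):

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1

side = 32                                   # input is resized to 32x32
side = out_size(side, kernel=3, padding=1)  # conv1 keeps 32
side = out_size(side, kernel=2, stride=2)   # pool1 halves to 16
flat_features = 16 * side * side            # 16 output channels
print(side, flat_features)  # 16 4096
```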

Let's initialize the model:

In [37]:
numclasses = len(train_dataset.classes)
model = Net(numclasses=numclasses)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
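
`nn.CrossEntropyLoss` takes raw logits and combines a log-softmax with the negative log-likelihood of the true class. The plain-Python sketch below shows that computation for a single sample, and also why a model with uniform outputs starts near `ln(num_classes)`. The logits and the class count of 7 are illustrative values, not taken from the dataset.

```python
import math

def cross_entropy(logits, target):
    """Negative log-softmax probability of the target class."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]

# A confident, correct prediction gives a small loss.
confident_loss = cross_entropy([4.0, 0.0, 0.0], target=0)
print(round(confident_loss, 4))

# Equal logits over 7 classes give exactly ln(7).
uniform_loss = cross_entropy([0.0] * 7, target=3)
print(round(uniform_loss, 4), round(math.log(7), 4))
```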

Now on to training:

In [38]:
num_epochs = 25

for epochs in range(num_epochs):
  running_loss = 0.0
  for inputs, labels in train_loader:
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()

    running_loss += loss.item()

  epoch_loss = running_loss / len(train_loader)
  print(f'Epoch {epochs+1}/{num_epochs}, Loss: {epoch_loss:.4f}')

Epoch 1/25, Loss: 1.6974
Epoch 2/25, Loss: 1.4237
Epoch 3/25, Loss: 1.2639
Epoch 4/25, Loss: 1.1533
Epoch 5/25, Loss: 1.0989
Epoch 6/25, Loss: 1.0502
Epoch 7/25, Loss: 1.0268
Epoch 8/25, Loss: 0.9727
Epoch 9/25, Loss: 0.8921
Epoch 10/25, Loss: 0.8659
Epoch 11/25, Loss: 0.8603
Epoch 12/25, Loss: 0.8047
Epoch 13/25, Loss: 0.7874
Epoch 14/25, Loss: 0.7655
Epoch 15/25, Loss: 0.7129
Epoch 16/25, Loss: 0.6813
Epoch 17/25, Loss: 0.6778
Epoch 18/25, Loss: 0.6852
Epoch 19/25, Loss: 0.6259
Epoch 20/25, Loss: 0.6079
Epoch 21/25, Loss: 0.6052
Epoch 22/25, Loss: 0.5729
Epoch 23/25, Loss: 0.5581
Epoch 24/25, Loss: 0.5396
Epoch 25/25, Loss: 0.5210


Let's define a function to evaluate the model:

In [39]:
def evaluate_model(model, dataloader):
  correct = 0
  total = 0
  model.eval()  # inference mode; matters for layers such as dropout or batch norm
  with torch.no_grad():
    for inputs, labels in dataloader:
      outputs = model(inputs)
      _, predicted = torch.max(outputs.data, 1)
      total += labels.size(0)
      correct += (predicted == labels).sum().item()

  accuracy = 100 * correct / total
  print(f'Accuracy: {accuracy:.2f}%')
  return accuracy
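
What `evaluate_model` computes per batch is an argmax over the class scores followed by a comparison with the labels. A dependency-free sketch of the same logic, with made-up scores and labels:

```python
def accuracy_percent(batch_scores, labels):
    """Percentage of rows whose highest-scoring class equals the label."""
    correct = 0
    for scores, label in zip(batch_scores, labels):
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == label)
    return 100 * correct / len(labels)

# Illustrative scores for 4 samples and 3 classes.
scores = [
    [2.1, 0.3, -1.0],  # predicts class 0
    [0.2, 1.5, 0.1],   # predicts class 1
    [0.0, 0.4, 0.9],   # predicts class 2
    [1.2, 1.1, 0.3],   # predicts class 0
]
labels = [0, 1, 1, 0]
print(accuracy_percent(scores, labels))  # 75.0
```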

Now let's evaluate on the test files:

In [40]:
accuracy1 = evaluate_model(model, test_loader)

Accuracy: 55.35%


This result is not satisfactory, so let's try a few changes, such as adding one more convolutional layer.

In [41]:
class Net2(nn.Module):
  def __init__(self, numclasses):
    super().__init__()

    self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
    self.relu1 = nn.ReLU()
    self.pool1 = nn.MaxPool2d(2, 2)

    self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
    self.relu2 = nn.ReLU()
    self.pool2 = nn.MaxPool2d(2, 2)

    self.fc1 = nn.Linear(32 * 8 * 8, numclasses)

  def forward(self, x):
    x=self.pool1(torch.relu(self.conv1(x)))
    x=self.pool2(torch.relu(self.conv2(x)))
    x=x.view(-1, 32 * 8 * 8)
    x=self.fc1(x)
    return x
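
To see how the second convolution changes the model's size, the sketch below counts parameters for both networks by hand. The class count is left as an argument because it depends on the dataset; the value 7 used in the call is only a placeholder.

```python
def conv_params(in_ch, out_ch, kernel):
    """Weights plus biases of a square 2D convolution."""
    return in_ch * out_ch * kernel * kernel + out_ch

def linear_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features

def net_params(num_classes):
    return conv_params(3, 16, 3) + linear_params(16 * 16 * 16, num_classes)

def net2_params(num_classes):
    return (
        conv_params(3, 16, 3)
        + conv_params(16, 32, 3)
        + linear_params(32 * 8 * 8, num_classes)
    )

# Placeholder class count, for illustration only.
print(net_params(7), net2_params(7))  # 29127 19431
```

With this placeholder, `Net2` ends up with fewer parameters than `Net` despite the extra layer, because the second pooling step shrinks the flattened input to the linear layer from 4096 to 2048 features.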

Initializing the second network:

In [42]:
numclasses = len(train_dataset.classes)
model2 = Net2(numclasses=numclasses)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model2.parameters(), lr=0.001)

Now the training and evaluation:

In [43]:
num_epochs = 50

for epochs in range(num_epochs):
  running_loss = 0.0
  for inputs, labels in train_loader:
    optimizer.zero_grad()
    outputs = model2(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()

    running_loss += loss.item()

  epoch_loss = running_loss / len(train_loader)
  print(f'Epoch {epochs+1}/{num_epochs}, Loss: {epoch_loss:.4f}')

Epoch 1/50, Loss: 1.8550
Epoch 2/50, Loss: 1.6948
Epoch 3/50, Loss: 1.5352
Epoch 4/50, Loss: 1.3733
Epoch 5/50, Loss: 1.2418
Epoch 6/50, Loss: 1.1810
Epoch 7/50, Loss: 1.1126
Epoch 8/50, Loss: 1.0946
Epoch 9/50, Loss: 1.0294
Epoch 10/50, Loss: 1.0522
Epoch 11/50, Loss: 1.0166
Epoch 12/50, Loss: 0.9440
Epoch 13/50, Loss: 0.9091
Epoch 14/50, Loss: 0.8906
Epoch 15/50, Loss: 0.8747
Epoch 16/50, Loss: 0.8505
Epoch 17/50, Loss: 0.8104
Epoch 18/50, Loss: 0.7544
Epoch 19/50, Loss: 0.7270
Epoch 20/50, Loss: 0.7023
Epoch 21/50, Loss: 0.6630
Epoch 22/50, Loss: 0.6444
Epoch 23/50, Loss: 0.6087
Epoch 24/50, Loss: 0.5965
Epoch 25/50, Loss: 0.5825
Epoch 26/50, Loss: 0.5609
Epoch 27/50, Loss: 0.5407
Epoch 28/50, Loss: 0.5153
Epoch 29/50, Loss: 0.5443
Epoch 30/50, Loss: 0.4736
Epoch 31/50, Loss: 0.4677
Epoch 32/50, Loss: 0.4406
Epoch 33/50, Loss: 0.4448
Epoch 34/50, Loss: 0.4026
Epoch 35/50, Loss: 0.3944
Epoch 36/50, Loss: 0.3956
Epoch 37/50, Loss: 0.3791
Epoch 38/50, Loss: 0.3703
Epoch 39/50, Loss: 0.

In [44]:
accuracy2 = evaluate_model(model2, test_loader)

Accuracy: 64.81%


In [45]:
print(f'Acurácia modelo 1: {accuracy1:.2f}%')
print(f'Acurácia modelo 2: {accuracy2:.2f}%')

Acurácia modelo 1: 55.35%
Acurácia modelo 2: 64.81%


Let's generate our submission file, as requested in the challenge:

In [46]:
import pandas as pd

model2.eval()
idx_to_class = {v: k for k, v in train_dataset.class_to_idx.items()}
test_filenames = [os.path.basename(f[0]) for f in test_dataset.samples]

predictions = []
with torch.no_grad():
    for inputs, _ in test_loader:
        outputs = model2(inputs)  # predict with the second (better) model
        _, predicted = torch.max(outputs.data, 1)


        predict_labels = [idx_to_class[idx.item()] for idx in predicted]
        predictions.extend(predict_labels)

submission_df = pd.DataFrame({'row_id': test_filenames, 'label': predictions})
submission_df.to_csv('submission.csv', index=False)
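
The submission step boils down to inverting `class_to_idx` and pairing each file name with its predicted class name. A standard-library sketch of that step, using made-up file names, classes, and predictions (the `row_id` and `label` column names come from the cell above):

```python
import csv
import io

# Hypothetical mapping and predictions, for illustration only.
class_to_idx = {"cirrus": 0, "cumulus": 1, "stratus": 2}
idx_to_class = {idx: name for name, idx in class_to_idx.items()}

filenames = ["img_001.jpg", "img_002.jpg", "img_003.jpg"]
predicted_indices = [2, 0, 1]

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["row_id", "label"])
for name, idx in zip(filenames, predicted_indices):
    writer.writerow([name, idx_to_class[idx]])

print(buffer.getvalue())
```

The pairing relies on `test_loader` using `shuffle=False`, so that predictions come back in the same order as `test_dataset.samples`.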


Two networks with different configurations were built here, with the following accuracies:

Model 1 accuracy: 55.35%


Model 2 accuracy: 64.81%

This is not a good result, but for a very simple network trained from scratch it is acceptable.

# Part 2: Using a Pretrained Network

In [47]:
transform_resnet = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

transform_train_resnet = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

train_dataset = datasets.ImageFolder(train_dir, transform=transform_train_resnet)
test_dataset = datasets.ImageFolder(test_dir, transform=transform_resnet)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

print("Carregando o modelo ResNet-18 pré-treinado...")
model_transfer = models.resnet18(pretrained=True)

for param in model_transfer.parameters():
    param.requires_grad = False

num_classes = len(train_dataset.classes)
num_features = model_transfer.fc.in_features
model_transfer.fc = nn.Linear(num_features, num_classes)


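# Run on the GPU when one is available, otherwise on the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model_transfer = model_transfer.to(device)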


optimizer = optim.Adam(model_transfer.fc.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

num_epochs = 10

print("Iniciando o treinamento (transfer learning)...")
for epoch in range(num_epochs):
    model_transfer.train()
    running_loss = 0.0
    for inputs, labels in train_loader:
        inputs = inputs.to(device)
        labels = labels.to(device)

        optimizer.zero_grad()
        outputs = model_transfer(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()


    model_transfer.eval()
    epoch_loss = running_loss / len(train_loader)
    correct = 0
    total = 0
    with torch.no_grad():
        for inputs, labels in test_loader:
            inputs = inputs.to(device)
            labels = labels.to(device)
            outputs = model_transfer(inputs)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    accuracy = 100 * correct / total
    print(f'Epoch {epoch+1}/{num_epochs}, Loss: {epoch_loss:.4f}, Acurácia Teste: {accuracy:.2f}%')

print("Treinamento concluído.")



Carregando o modelo ResNet-18 pré-treinado...




Iniciando o treinamento (transfer learning)...
Epoch 1/10, Loss: 1.6795, Acurácia Teste: 58.23%
Epoch 2/10, Loss: 1.0949, Acurácia Teste: 81.48%
Epoch 3/10, Loss: 0.7943, Acurácia Teste: 81.28%
Epoch 4/10, Loss: 0.6463, Acurácia Teste: 85.19%
Epoch 5/10, Loss: 0.5093, Acurácia Teste: 87.24%
Epoch 6/10, Loss: 0.4371, Acurácia Teste: 87.65%
Epoch 7/10, Loss: 0.4052, Acurácia Teste: 88.89%
Epoch 8/10, Loss: 0.3597, Acurácia Teste: 89.71%
Epoch 9/10, Loss: 0.3372, Acurácia Teste: 89.71%
Epoch 10/10, Loss: 0.3173, Acurácia Teste: 89.51%
Treinamento concluído.


Here we can see that the pretrained network reaches a far higher accuracy. We can also note that from the seventh epoch onward, although the loss keeps decreasing, test accuracy levels off just under 90% and dips slightly in the final epoch, which already hints at overfitting. We will therefore consider 7 epochs the ideal.
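
One way to make the choice of stopping epoch less ad hoc is a patience rule: stop once the monitored metric has not improved for a set number of epochs. The sketch below applies such a rule to the test accuracies printed above; the function name and the patience of 2 are assumptions, and in practice a separate validation split would be the safer thing to monitor.

```python
def best_epoch_with_patience(accuracies, patience=2):
    """Return (best_epoch, stop_epoch), both 1-based."""
    best_acc = float("-inf")
    best_epoch = 0
    waited = 0
    for epoch, acc in enumerate(accuracies, start=1):
        if acc > best_acc:
            best_acc, best_epoch, waited = acc, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return best_epoch, epoch
    return best_epoch, len(accuracies)

# Test accuracies from the 10 epochs logged above.
logged = [58.23, 81.48, 81.28, 85.19, 87.24, 87.65, 88.89, 89.71, 89.71, 89.51]
print(best_epoch_with_patience(logged))  # (8, 10)
```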

# Part 3: Regularization and Data Augmentation

Now let's add L2 regularization and dropout to our neural network:

In [49]:
num_features = model_transfer.fc.in_features
num_classes = len(train_dataset.classes)

model_transfer.fc = nn.Sequential(
    nn.Linear(num_features, 512),
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(512, num_classes)
)

model_transfer = model_transfer.to(device)
optimizer = optim.Adam(model_transfer.parameters(), lr=0.001, weight_decay=0.0001)
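
For intuition about the two regularizers in this cell: `nn.Dropout(0.5)` zeroes each activation with probability 0.5 during training and rescales the survivors by `1 / (1 - p)`, while `weight_decay` in `optim.Adam` adds `weight_decay * w` to each parameter's gradient (an L2 penalty). A plain-Python sketch of both, with illustrative values and hypothetical helper names:

```python
import random

def dropout(activations, p=0.5, rng=random):
    """Inverted dropout: drop with probability p, rescale the rest."""
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else a * scale for a in activations]

def l2_gradient(grad, weight, weight_decay=1e-4):
    """Gradient after adding the L2 penalty term weight_decay * weight."""
    return grad + weight_decay * weight

rng = random.Random(0)  # fixed seed so the sketch is repeatable
inputs = [1.0, 2.0, 3.0, 4.0]
dropped = dropout(inputs, p=0.5, rng=rng)
print(dropped)  # each entry is either 0.0 or twice its input

# Illustrative values chosen so the arithmetic is exact.
print(l2_gradient(grad=0.5, weight=2.0, weight_decay=0.25))  # 1.0
```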

In [51]:
num_epochs = 10

for eacho in range(num_epochs):
  model_transfer.train()
  running_loss = 0.0
  for inputs, labels in train_loader:
    inputs = inputs.to(device)
    labels = labels.to(device)

    outputs = model_transfer(inputs)
    loss = criterion(outputs, labels)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    running_loss += loss.item()

  epoch_loss = running_loss / len(train_loader)

  model_transfer.eval()
  correct = 0
  total = 0
  with torch.no_grad():
    for inputs, labels in test_loader:
      inputs = inputs.to(device)
      labels = labels.to(device)
      outputs = model_transfer(inputs)
      _, predicted = torch.max(outputs.data, 1)
      total += labels.size(0)
      correct += (predicted == labels).sum().item()

  accuracy = 100 * correct / total
  print(f'Epoch {eacho+1}/{num_epochs}, Loss: {epoch_loss:.4f}, Acurácia Teste: {accuracy:.2f}%')

Epoch 1/10, Loss: 1.5215, Acurácia Teste: 74.49%
Epoch 2/10, Loss: 0.7690, Acurácia Teste: 84.36%
Epoch 3/10, Loss: 0.5532, Acurácia Teste: 87.86%
Epoch 4/10, Loss: 0.4979, Acurácia Teste: 83.54%
Epoch 5/10, Loss: 0.4512, Acurácia Teste: 87.24%
Epoch 6/10, Loss: 0.5121, Acurácia Teste: 88.89%
Epoch 7/10, Loss: 0.2974, Acurácia Teste: 86.42%
Epoch 8/10, Loss: 0.2836, Acurácia Teste: 89.92%
Epoch 9/10, Loss: 0.2461, Acurácia Teste: 89.71%
Epoch 10/10, Loss: 0.2233, Acurácia Teste: 89.51%


We now have an accuracy of almost 90% using a pretrained network and some regularization methods. To try to push accuracy even higher, we will apply data augmentation: in practice, we generate "new" images from the existing ones by applying random rotations, color adjustments, and so on. This effectively gives us a larger, more varied training set.
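
As a toy illustration of the idea, the sketch below applies a random horizontal flip to a tiny 2x3 "image" stored as nested lists. It is plain Python with a hypothetical helper, not the torchvision implementation; `RandomHorizontalFlip` uses a default flip probability of 0.5.

```python
import random

def random_horizontal_flip(image, p=0.5, rng=random):
    """Mirror each row left-to-right with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return image

image = [
    [1, 2, 3],
    [4, 5, 6],
]

# With p=1.0 the flip always happens, which makes the effect visible.
flipped = random_horizontal_flip(image, p=1.0)
print(flipped)  # [[3, 2, 1], [6, 5, 4]]
```

Because the transform is re-sampled every time an image is loaded, each epoch sees a slightly different version of the same image; the number of images per epoch stays the same.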

In [52]:

transform_train_resnet = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.TrivialAugmentWide(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

transform_test_resnet = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

train_dataset = datasets.ImageFolder(train_dir, transform=transform_train_resnet)
test_dataset = datasets.ImageFolder(test_dir, transform=transform_test_resnet)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

print("Carregando o modelo ResNet-18 pré-treinado...")

model_transfer = models.resnet18(pretrained=True)

for param in model_transfer.parameters():
    param.requires_grad = False

num_features = model_transfer.fc.in_features
num_classes = len(train_dataset.classes)

model_transfer.fc = nn.Sequential(
    nn.Linear(num_features, 512),
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(512, num_classes)
)

model_transfer = model_transfer.to(device)
optimizer = optim.Adam(model_transfer.parameters(), lr=0.001, weight_decay=0.001)

num_epochs = 10

for eacho in range(num_epochs):
  model_transfer.train()
  running_loss = 0.0
  for inputs, labels in train_loader:
    inputs = inputs.to(device)
    labels = labels.to(device)

    outputs = model_transfer(inputs)
    loss = criterion(outputs, labels)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    running_loss += loss.item()

  epoch_loss = running_loss / len(train_loader)

  model_transfer.eval()
  correct = 0
  total = 0
  with torch.no_grad():
    for inputs, labels in test_loader:
      inputs = inputs.to(device)
      labels = labels.to(device)
      outputs = model_transfer(inputs)
      _, predicted = torch.max(outputs.data, 1)
      total += labels.size(0)
      correct += (predicted == labels).sum().item()

  accuracy = 100 * correct / total
  print(f'Epoch {eacho+1}/{num_epochs}, Loss: {epoch_loss:.4f}, Acurácia Teste: {accuracy:.2f}%')


Carregando o modelo ResNet-18 pré-treinado...
Epoch 1/10, Loss: 1.6701, Acurácia Teste: 57.20%
Epoch 2/10, Loss: 1.2962, Acurácia Teste: 71.60%
Epoch 3/10, Loss: 1.1017, Acurácia Teste: 79.42%
Epoch 4/10, Loss: 0.9629, Acurácia Teste: 79.01%
Epoch 5/10, Loss: 0.9077, Acurácia Teste: 84.77%
Epoch 6/10, Loss: 0.9093, Acurácia Teste: 83.74%
Epoch 7/10, Loss: 0.8924, Acurácia Teste: 85.19%
Epoch 8/10, Loss: 0.8732, Acurácia Teste: 82.92%
Epoch 9/10, Loss: 0.8549, Acurácia Teste: 86.21%
Epoch 10/10, Loss: 0.8319, Acurácia Teste: 86.42%


# Conclusion

In the first part, we tried to build a neural network from scratch, that is, with no prior training. After adjusting a few parameters, the best result obtained was **64.81%** accuracy.

Next we used a pretrained network (ResNet-18), which made it possible to reach **88.89%** accuracy.

Finally, we tried to improve this result further with regularization and data augmentation techniques. Using L2 regularization and dropout, we obtained an even better result of **89.51%** accuracy. With data augmentation, however, accuracy decreased.