# Module 3 – K-Means, PCA, Trees, Random Forest, Basic Neural Networks, CNN, NLP

This notebook collects concise examples for:

- K-Means (clustering)
- PCA (dimensionality reduction)
- Decision Tree & Random Forest
- MLP neural network with PyTorch
- Simple CNN (structure only)
- Basic Sentiment Analysis


## 1. K-Means and PCA on 2D data

In [1]:
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Four Gaussian blobs in 2D
X, _ = make_blobs(n_samples=500, centers=4, cluster_std=1.0, random_state=42)

# The data is already 2D, so PCA with 2 components only centers and rotates it
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# Cluster the points in PCA space
kmeans = KMeans(n_clusters=4, random_state=42)
labels = kmeans.fit_predict(X_pca)
print('PCA centroids:', kmeans.cluster_centers_)


PCA centroids: [[  0.3434169    8.16287054]
 [  3.65764759  -5.95323903]
 [-10.24793335  -1.99519057]
 [  6.24686887  -0.21444094]]
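
Because the blobs are already two-dimensional, PCA with `n_components=2` only centers and rotates the data, so it should keep all of the variance and leave the cluster structure unchanged. The following is a minimal sketch that checks this, assuming the same `make_blobs` settings as above. The true blob labels (`y_true`), the explicit `n_init=10`, and the `adjusted_rand_score` comparison are additions for illustration and are not part of the original cell.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score

# Same synthetic data as above, but this time we keep the true blob labels.
X, y_true = make_blobs(n_samples=500, centers=4, cluster_std=1.0, random_state=42)

pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# With 2 components on 2D data, the explained variance ratios sum to 1.
total_variance = pca.explained_variance_ratio_.sum()
print('Total explained variance:', total_variance)

# n_init=10 is set explicitly (an assumption) so the result does not depend
# on the default of the installed scikit-learn version.
labels_raw = KMeans(n_clusters=4, n_init=10, random_state=42).fit_predict(X)
labels_pca = KMeans(n_clusters=4, n_init=10, random_state=42).fit_predict(X_pca)

# Adjusted Rand index: 1.0 means identical partitions (up to label renaming).
ari_raw_vs_pca = adjusted_rand_score(labels_raw, labels_pca)
ari_pca_vs_true = adjusted_rand_score(y_true, labels_pca)
print('ARI raw vs PCA:', ari_raw_vs_pca)
print('ARI PCA vs true blobs:', ari_pca_vs_true)
```

PCA starts to matter for clustering when the input has more dimensions than the number of components kept; in this 2D example it mainly serves to show the API.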


## 2. Decision Tree & Random Forest

In [2]:
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.datasets import load_wine

wine = load_wine()
X, y = wine.data, wine.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

dt = DecisionTreeClassifier(random_state=42)
dt.fit(X_train, y_train)
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)

print('Accuracy Tree:', accuracy_score(y_test, dt.predict(X_test)))
print('Accuracy RF:', accuracy_score(y_test, rf.predict(X_test)))


Accuracy Tree: 0.9444444444444444
Accuracy RF: 1.0
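
A single split leaves only 36 test samples (20% of the 178 wines), so the two accuracies above are fairly noisy estimates. As a hedged sketch, stratified 5-fold cross-validation gives a more stable comparison. The choice of 5 folds and the use of `cross_val_score` are assumptions added here, not part of the original notebook.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)

# Stratified folds keep the class proportions roughly equal in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

models = {
    'Tree': DecisionTreeClassifier(random_state=42),
    'RF': RandomForestClassifier(n_estimators=100, random_state=42),
}

cv_scores = {}
for name, model in models.items():
    # One accuracy value per fold; tree-based models need no feature scaling.
    cv_scores[name] = cross_val_score(model, X, y, cv=cv, scoring='accuracy')
    print(f'{name}: mean={cv_scores[name].mean():.3f}, std={cv_scores[name].std():.3f}')
```

Comparing the per-fold means and standard deviations shows whether the gap between the two models is larger than the fold-to-fold variation.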


## 3. MLP neural network with PyTorch (simplified MNIST)

In [3]:
import torch
from torch import nn, optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# Convert images to tensors in [0, 1], then rescale them to [-1, 1]
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

train_ds = datasets.MNIST(root='data', train=True, transform=transform, download=True)
train_loader = DataLoader(train_ds, batch_size=64, shuffle=True)

class SimpleMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28*28, 128),
            nn.ReLU(),
            nn.Linear(128, 10)
        )
    def forward(self, x):
        return self.net(x)

model = SimpleMLP().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(1):  # a single epoch, for demo purposes
    running_loss = 0.0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f'Epoch {epoch+1}, mean loss: {running_loss/len(train_loader):.4f}')


Epoch 1, mean loss: 0.3892
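
The training loss alone does not say how well the network classifies unseen digits. The sketch below shows one way to measure accuracy with `model.eval()` and `torch.no_grad()`. To stay self-contained it uses random tensors in place of MNIST, so the printed accuracy is meaningless. The `evaluate` helper and the `fake_*` names are hypothetical and introduced only for illustration.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, loader, device):
    """Return the fraction of correctly classified samples in `loader`."""
    model.eval()  # switch layers such as dropout/batch norm to inference mode
    correct, total = 0, 0
    with torch.no_grad():  # gradients are not needed for evaluation
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            preds = model(images).argmax(dim=1)  # index of the largest logit
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    return correct / total

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Stand-in for the MNIST test set: 256 random "images" with random labels.
torch.manual_seed(0)
fake_images = torch.randn(256, 1, 28, 28)
fake_labels = torch.randint(0, 10, (256,))
fake_loader = DataLoader(TensorDataset(fake_images, fake_labels), batch_size=64)

# Same layers as SimpleMLP above, rebuilt here so the block runs on its own.
mlp = nn.Sequential(
    nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10)
).to(device)

accuracy = evaluate(mlp, fake_loader, device)
print(f'Accuracy on random data: {accuracy:.3f}')
```

With the real data, one would instead pass the trained `model` and a loader built from `datasets.MNIST(root='data', train=False, transform=transform)`.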


## 4. CNN and basic NLP

For the CNN and NLP parts, you can reuse the structure of the MLP example, adapting it to convolutional layers and to HuggingFace models.
Here we only provide a CNN skeleton (not executed, to save time):

In [4]:
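# Requires `from torch import nn` (imported in cell 3)
class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Two conv blocks; each MaxPool2d(2) halves the spatial size: 28 -> 14 -> 7
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        # Classifier head on the flattened 32 x 7 x 7 feature map
        self.fc = nn.Sequential(
            nn.Linear(32*7*7, 128),
            nn.ReLU(),
            nn.Linear(128, 10)
        )
    def forward(self, x):
        x = self.conv(x)
        x = x.view(x.size(0), -1)  # flatten to (batch, 32*7*7)
        return self.fc(x)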
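
The `32*7*7` input size of the first linear layer follows from the two `MaxPool2d(2)` layers, which reduce the 28x28 input to 14x14 and then 7x7, while `padding=1` keeps the 3x3 convolutions size-preserving. The sketch below checks these shapes on a random batch rather than real MNIST images (an assumption, to keep it fast and self-contained). It also runs one optimization step to show that the loss and optimizer setup from the MLP cell carries over unchanged. The layers are rebuilt with `nn.Sequential` so the block runs on its own.

```python
import torch
from torch import nn, optim

# Same layers as SimpleCNN, split into feature extractor and classifier head.
conv = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
)
fc = nn.Sequential(
    nn.Flatten(), nn.Linear(32 * 7 * 7, 128), nn.ReLU(), nn.Linear(128, 10)
)
cnn = nn.Sequential(conv, fc)

# Random stand-in for a batch of 8 grayscale 28x28 images with labels 0..9.
torch.manual_seed(0)
images = torch.randn(8, 1, 28, 28)
labels = torch.randint(0, 10, (8,))

feature_shape = tuple(conv(images).shape)
print('Feature map shape:', feature_shape)  # (8, 32, 7, 7)

# One training step, identical in structure to the MLP loop.
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(cnn.parameters(), lr=1e-3)

optimizer.zero_grad()
logits = cnn(images)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()
print('Logits shape:', tuple(logits.shape))  # (8, 10)
```

If the input resolution or the number of pooling layers changes, the `32 * 7 * 7` value has to be recomputed accordingly.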
