### Prof. Alfio Ferrara
# Neural Networks
### Master in Digital Humanities

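Suppose we want to teach the machine to classify the following pseudo-texts according to the climate region they represent. The words and labels are kept in Italian; the three classes are `arido` (arid), `umido` (humid), and `temperato` (temperate).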

In [4]:
texts = [
    "sabbia sabbia sabbia sole sole vento vento",                     # T1 - arido
    "sabbia sabbia sole vento vento collina",                         # T2 - arido
    "sabbia sabbia sabbia sabbia sole sole vento",                   # T3 - arido
    "sabbia sabbia sole sole vento pioggia",                         # T4 - arido
    "pioggia pioggia pioggia foresta foresta fiume fiume albero",    # T5 - umido
    "pioggia pioggia pioggia pioggia foresta fiume fiume albero",    # T6 - umido
    "pioggia pioggia foresta foresta foresta fiume",                 # T7 - umido
    "pioggia pioggia pioggia foresta foresta fiume fiume fiume",     # T8 - umido
    "sabbia sole vento pioggia collina collina albero albero nuvola",# T9 - temperato
    "sole vento pioggia foresta collina collina albero nuvola",      # T10 - temperato
    "sabbia sole vento foresta collina collina albero albero nuvola",# T11 - temperato
    "sole vento pioggia fiume collina collina albero nuvola"         # T12 - temperato
]

In [5]:
labels = [
    "arido",     # T1
    "arido",     # T2
    "arido",     # T3
    "arido",     # T4
    "umido",     # T5
    "umido",     # T6
    "umido",     # T7
    "umido",     # T8
    "temperato", # T9
    "temperato", # T10
    "temperato", # T11
    "temperato"  # T12
]

## Turning both the documents and the labels into vectors

In [6]:
from collections import defaultdict
import numpy as np
import pandas as pd

In [7]:
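def indexing(corpus):
    """Count token occurrences: one column per document, one row per token."""
    counts = defaultdict(lambda: defaultdict(int))
    for j, doc in enumerate(corpus):
        for token in doc.split():
            counts[j][token] += 1
    # Tokens that never occur in a document get a count of 0
    return pd.DataFrame(counts).fillna(0)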

In [8]:
C = indexing(texts)
target = indexing(labels)

In [9]:
C

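| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| sabbia | 3.0 | 2.0 | 4.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 |
| sole | 2.0 | 1.0 | 2.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| vento | 2.0 | 2.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| collina | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 2.0 | 2.0 | 2.0 |
| pioggia | 0.0 | 0.0 | 0.0 | 1.0 | 3.0 | 4.0 | 2.0 | 3.0 | 1.0 | 1.0 | 0.0 | 1.0 |
| foresta | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 1.0 | 3.0 | 2.0 | 0.0 | 1.0 | 1.0 | 0.0 |
| fiume | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 2.0 | 1.0 | 3.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| albero | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 2.0 | 1.0 | 2.0 | 1.0 |
| nuvola | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 |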


In [10]:
target

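| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| arido | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| umido | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| temperato | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 |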


Let us take the vector that represents the first document and the vector that represents its label.

In [15]:
x = C[0].values
y = target[0].values

print(x)
print(y)

[3. 2. 2. 0. 0. 0. 0. 0. 0.]
[1. 0. 0.]


If we had a mechanism for transforming the first vector into the second, we would in effect have a machine capable of classifying text.

Specifically, in our case we need to transform a vector with **9** dimensions into one with **3** dimensions.

To do so, it is enough to multiply a matrix with **3** rows and **9** columns by the first vector.

The matrix (also called the **parameter** matrix or **weight** matrix) must have

- 3 rows: one for each desired output (one per class)

- 9 columns: one for each input dimension (one per word in the vocabulary)

The reason is that we are computing the dot product of each row of the parameters with the input vector and placing each result in the corresponding position of the output vector.
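
As a minimal sketch of this row-by-row view, the example below uses a small hypothetical weight matrix and input (`W_demo` and `x_demo` are illustrative names and values, not part of the notebook) to check that one dot product per row gives the same result as a single matrix-vector product.

```python
import numpy as np

# Hypothetical toy case: 2 classes, 3 vocabulary words
W_demo = np.array([[1.0, 0.0, -1.0],
                   [0.5, 2.0,  0.0]])
x_demo = np.array([3.0, 2.0, 1.0])

# One dot product per row of the weight matrix
row_by_row = np.array([np.dot(row, x_demo) for row in W_demo])

# The same computation as a single matrix-vector product
all_at_once = W_demo.dot(x_demo)

print(row_by_row)
print(all_at_once)
```

Both prints show the values 2.0 and 5.5: the output has one entry per row of the matrix, which is why the number of rows must equal the number of classes.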

In [20]:
np.random.seed(42) 

W = np.round(np.random.uniform(-1, 1, size=(3, 9)), 2)

print(W)

[[-0.25  0.9   0.46  0.2  -0.69 -0.69 -0.88  0.73  0.2 ]
 [ 0.42 -0.96  0.94  0.66 -0.58 -0.64 -0.63 -0.39  0.05]
 [-0.14 -0.42  0.22 -0.72 -0.42 -0.27 -0.09  0.57 -0.6 ]]


In [23]:
y_hat = W.dot(x)

print(y_hat)

[ 1.97  1.22 -0.82]


Finally, we turn the resulting vector into a probability for each class.

In [24]:
def softmax(logits):
    exps = np.exp(logits - np.max(logits))  # numerical stabilization
    return exps / np.sum(exps)
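
For reference, the function above can be written as the following formula, where $z$ is the vector of logits and $m$ is its largest entry (the symbols $z$ and $m$ are introduced here only for illustration):

$$
\mathrm{softmax}(z)_i = \frac{e^{z_i - m}}{\sum_{j} e^{z_j - m}}, \qquad m = \max_k z_k
$$

Subtracting $m$ does not change the result, because the common factor $e^{-m}$ cancels between numerator and denominator; it only keeps the exponentials from overflowing when the logits are large.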

In [25]:
softmax(y_hat)

array([0.65198069, 0.30797387, 0.04004545])
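
To read a classification off these probabilities, one option is to pick the class with the highest value. The sketch below assumes the class order of the rows of `target` (`arido`, `umido`, `temperato`) and copies the probabilities from the output above; the names `probs`, `classes`, and `predicted` are illustrative.

```python
import numpy as np

# Probabilities copied from the softmax output above
probs = np.array([0.65198069, 0.30797387, 0.04004545])

# Assumed class order: the row order of the `target` table
classes = ["arido", "umido", "temperato"]

# The predicted class is the one with the highest probability
predicted = classes[int(np.argmax(probs))]
print(predicted)  # arido
```

Here the prediction happens to match the true label of the first document, but `W` was drawn at random, so the match is a coincidence rather than something the machine has learned.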

## Open questions
1. What is the role of the parameters?
2. How can we train the machine so that it learns the right parameters?