# Example: MLP for text classification

**Input:** $\;$ variable-length sequence of words, $\boldsymbol{v}_1,\ldots,\boldsymbol{v}_T$
* $\boldsymbol{v}_t$ is a one-hot vector whose dimension equals the vocabulary size, $V$
* The sequence is treated as a bag of words, $\{\boldsymbol{v}_t\}$, so word order is ignored

**Layer 1:** $\;$ embedding matrix $\mathbf{W}_1$ of size $E\times V$, which maps each sparse vector $\boldsymbol{v}_t$ to a dense vector $\boldsymbol{e}_t$
$$\boldsymbol{e}_t=\mathbf{W}_1\boldsymbol{v}_t$$
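As a minimal NumPy sketch of this product (toy sizes and a made-up word index, chosen only for illustration), multiplying $\mathbf{W}_1$ by a one-hot vector simply selects one column of the matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
V, E = 5, 3                       # toy vocabulary size and embedding dimension (illustrative)
W1 = rng.normal(size=(E, V))      # embedding matrix of size E x V

word_index = 2                    # hypothetical word id
v_t = np.zeros(V)
v_t[word_index] = 1.0             # one-hot vector for that word

e_t = W1 @ v_t                    # dense embedding via the matrix product
lookup = W1[:, word_index]        # same vector obtained by picking a column

print(np.allclose(e_t, lookup))   # True
```

This is why embedding layers are usually implemented as table lookups on integer word indices rather than as explicit products. Keras's `Embedding` layer stores the table transposed, with shape $V\times E$, and looks up rows.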

**Layer 2:** $\;$ reduces the input to a single $E$-dimensional vector via **global average pooling**
$$\bar{\boldsymbol{e}}=\frac{1}{T}\sum\nolimits_{t=1}^T\boldsymbol{e}_t$$
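The average can be sketched in NumPy as follows, using random toy embeddings (the sizes are assumptions for illustration). The sketch also checks that shuffling the sequence leaves $\bar{\boldsymbol{e}}$ unchanged, which is the bag-of-words property:

```python
import numpy as np

rng = np.random.default_rng(0)
T, E = 4, 3                          # toy sequence length and embedding dimension
e = rng.normal(size=(T, E))          # row t holds the embedding e_t

e_bar = e.mean(axis=0)               # global average pooling over the time axis

perm = rng.permutation(T)            # reorder the words
e_bar_shuffled = e[perm].mean(axis=0)

print(e_bar.shape)                           # (3,)
print(np.allclose(e_bar, e_bar_shuffled))    # True
```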

**Rest:** $\;$ MLP with one hidden layer
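A rough NumPy sketch of this last stage, assuming a ReLU hidden layer of 16 units and a single sigmoid output as in the Keras model of this example. The names `W2`, `b2`, `w3`, `b3` and the random weights are illustrative, not trained values:

```python
import numpy as np

rng = np.random.default_rng(0)
E, H = 16, 16                              # embedding size and hidden units

e_bar = rng.normal(size=E)                 # stand-in for the pooled embedding
W2 = 0.1 * rng.normal(size=(H, E))         # hidden-layer weights (hypothetical values)
b2 = np.zeros(H)
w3 = 0.1 * rng.normal(size=H)              # output-layer weights (hypothetical values)
b3 = 0.0

h = np.maximum(0.0, W2 @ e_bar + b2)       # hidden layer with ReLU
p = 1.0 / (1.0 + np.exp(-(w3 @ h + b3)))   # sigmoid output, read as P(positive review)

print(h.shape)   # (16,)
```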

In [3]:
import numpy as np
import tensorflow as tf
from tensorflow import keras

num_words = 10000  # vocabulary size V
embed_size = 16    # embedding dimension E
# IMDB reviews as sequences of word indices, limited to the num_words most frequent words
(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=num_words)
# Pad (or truncate) every review to 256 indices, padding at the end with 0
x_train = keras.preprocessing.sequence.pad_sequences(x_train, value=0, padding="post", maxlen=256)
x_test = keras.preprocessing.sequence.pad_sequences(x_test, value=0, padding="post", maxlen=256)
tf.random.set_seed(42)
np.random.seed(42)
model = keras.Sequential([
    keras.layers.Embedding(num_words, embed_size),   # Layer 1: embedding
    keras.layers.GlobalAveragePooling1D(),           # Layer 2: average over the sequence
    keras.layers.Dense(16, activation=tf.nn.relu),   # hidden layer of the MLP
    keras.layers.Dense(1, activation=tf.nn.sigmoid)])  # probability of the positive class
model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_1 (Embedding)     (None, None, 16)          160000    
                                                                 
 global_average_pooling1d_1  (None, 16)                0         
  (GlobalAveragePooling1D)                                       
                                                                 
 dense_2 (Dense)             (None, 16)                272       
                                                                 
 dense_3 (Dense)             (None, 1)                 17        
                                                                 
Total params: 160289 (626.13 KB)
Trainable params: 160289 (626.13 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
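The parameter counts in the summary can be reproduced by hand. The short check below assumes 4 bytes per parameter (float32) for the size estimate:

```python
num_words, embed_size, hidden = 10000, 16, 16

embedding_params = num_words * embed_size      # one E-dimensional vector per word
hidden_params = embed_size * hidden + hidden   # weights plus biases of Dense(16)
output_params = hidden * 1 + 1                 # weights plus bias of Dense(1)
total = embedding_params + hidden_params + output_params

print(embedding_params, hidden_params, output_params, total)  # 160000 272 17 160289
print(f"{total * 4 / 1024:.2f} KB")                           # 626.13 KB (assuming float32)
```

Almost all parameters sit in the embedding table; the pooling layer has none.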


In [4]:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Hold out the first 10,000 training reviews as a validation set
x_val, y_val = x_train[:10000], y_train[:10000]
x_train, y_train = x_train[10000:], y_train[10000:]
model.fit(x_train, y_train, epochs=50, batch_size=512, validation_data=(x_val, y_val), verbose=0)
score = model.evaluate(x_test, y_test, verbose=0)  # [loss, accuracy]
print(f"Test: loss: {score[0]} - accuracy: {score[1]:.1%}")

Test: loss: 0.3790019452571869 - accuracy: 86.7%
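To make the two reported numbers concrete, here is a small NumPy sketch of binary cross-entropy and accuracy on made-up labels and probabilities. The values are purely illustrative, and the 0.5 decision threshold is assumed; Keras additionally clips probabilities for numerical stability, which this sketch omits:

```python
import numpy as np

y_true = np.array([1, 0, 1, 0])            # hypothetical labels
p = np.array([0.9, 0.2, 0.4, 0.7])         # hypothetical sigmoid outputs

# Binary cross-entropy averaged over the examples
loss = -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

# Accuracy after thresholding the probabilities at 0.5
y_pred = (p > 0.5).astype(int)
accuracy = np.mean(y_pred == y_true)

print(f"accuracy: {accuracy:.1%}")         # accuracy: 50.0%
```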
