# Design Pattern 2: Embeddings

> Addresses the problem of representing high-cardinality data densely in a lower dimension by passing the input data through an embedding layer with trainable weights.
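To make the idea concrete, here is a minimal NumPy sketch of what an embedding lookup does. It is an illustration rather than the notebook's method: the category list, the names `index_of`, `one_hot`, and `embed`, and the random weights are all assumptions made for this example. In a real model the weight matrix is learned during training.

```python
import numpy as np

# Illustrative category list (an assumption for this sketch).
categories = ['Single(1)', 'Multiple(2+)', 'Twins(2)',
              'Triplets(3)', 'Quadruplets(4)', 'Quintuplets(5)']
index_of = {name: i for i, name in enumerate(categories)}

embedding_dim = 2
rng = np.random.default_rng(seed=0)
# One row per category. Random here; trainable weights in a real model.
weights = rng.normal(size=(len(categories), embedding_dim))

def one_hot(name):
    """Sparse representation: one slot per category."""
    vec = np.zeros(len(categories))
    vec[index_of[name]] = 1.0
    return vec

def embed(name):
    """Dense representation: the weight row for this category."""
    return weights[index_of[name]]

sparse = one_hot('Twins(2)')
dense = embed('Twins(2)')
print(sparse.shape, dense.shape)  # (6,) (2,)

# The lookup is equivalent to multiplying the one-hot vector by the weights.
print(np.allclose(sparse @ weights, dense))
```

The lookup avoids materializing the one-hot vector, which matters as the number of categories grows.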

### Libraries

In [1]:
import pandas as pd
import numpy as np
import tensorflow as tf
import warnings

warnings.filterwarnings('ignore')



### Test data

In [2]:
df = pd.read_csv('data/babyweight_train.csv')
df.dropna(inplace=True)
print(f'Rows: {df.shape[0]} | Columns: {df.shape[1]}')
df.head()

Rows: 185665 | Columns: 6


| Unnamed: 0 | weight_pounds | is_male | mother_age | plurality | gestation_weeks | mother_race |
|---|---|---|---|---|---|---|
| 0 | 7.749249 | False | 12 | Single(1) | 40 | 1.0 |
| 1 | 7.561856 | True | 12 | Single(1) | 40 | 2.0 |
| 2 | 7.18707 | False | 12 | Single(1) | 34 | 3.0 |
| 3 | 6.375769 | True | 12 | Single(1) | 36 | 2.0 |
| 5 | 6.000983 | False | 12 | Single(1) | 39 | 2.0 |


### Design Pattern 2: Embeddings

- Define the categorical feature column for plurality
- Wrap the categorical feature column in an embedding column
- The resulting feature column (`plurality_embed`) is used as input to the neural network

In [3]:
plurality = tf.feature_column.categorical_column_with_vocabulary_list(
    'plurality', ['Single(1)', 'Multiple(2+)', 'Twins(2)', 'Triplets(3)', 'Quadruplets(4)', 'Quintuplets(5)'])

plurality_embed = tf.feature_column.embedding_column(plurality, dimension=2)

Instructions for updating:
Use Keras preprocessing layers instead, either directly or via the `tf.keras.utils.FeatureSpace` utility. Each of `tf.feature_column.*` has a functional equivalent in `tf.keras.layers` for feature preprocessing when training a Keras model.
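The deprecation notice points to Keras preprocessing layers. As a rough sketch of what an equivalent might look like, assuming `tf.keras.layers.StringLookup` followed by `tf.keras.layers.Embedding` is an acceptable substitute for the feature columns above (the names `vocab`, `lookup`, `embed`, and `batch` are illustrative and not part of the notebook):

```python
import tensorflow as tf

vocab = ['Single(1)', 'Multiple(2+)', 'Twins(2)',
         'Triplets(3)', 'Quadruplets(4)', 'Quintuplets(5)']

# Map each string to an integer index. num_oov_indices=0 reserves no
# out-of-vocabulary slot, so indices run from 0 to 5 (assumption: every
# value in the data appears in vocab; unknown strings raise an error).
lookup = tf.keras.layers.StringLookup(vocabulary=vocab, num_oov_indices=0)

# Trainable table with one 2-dimensional row per vocabulary entry.
embed = tf.keras.layers.Embedding(input_dim=lookup.vocabulary_size(),
                                  output_dim=2)

batch = tf.constant([['Single(1)'], ['Twins(2)'], ['Single(1)']])
ids = lookup(batch)                     # integer indices, shape (3, 1)
vectors = tf.squeeze(embed(ids), axis=1)  # dense vectors, shape (3, 2)

print(ids.numpy().tolist())
print(vectors.shape)
```

With six vocabulary entries and two dimensions, the embedding table holds 12 weights, the same count a 6 x 2 embedding column would have.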


In [4]:
# String-valued input; its name must match the feature column key ('plurality')
input_layer = tf.keras.layers.Input(shape=(1,), name='plurality', dtype=tf.string)
resource_columns = [plurality_embed]
# Map each feature name to its Keras input tensor
inputs = {resource_cols.categorical_column.name: input_layer for resource_cols in resource_columns}
# DenseFeatures performs the embedding lookup; output shape is (None, 2)
embedding_layer = tf.keras.layers.DenseFeatures(resource_columns)(inputs)

# Single sigmoid unit on top of the 2-dimensional embedding
output_layer = tf.keras.layers.Dense(units=1, activation='sigmoid')(embedding_layer)

modelo = tf.keras.models.Model(inputs=inputs, outputs=output_layer)
modelo.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
modelo.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 plurality (InputLayer)      [(None, 1)]               0         
                                                                 
 dense_features (DenseFeatu  (None, 2)                 12        
 res)                                                            
                                                                 
 dense (Dense)               (None, 1)                 3         
                                                                 
Total params: 15 (60.00 Byte)
Trainable params: 15 (60.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
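The parameter counts in the summary can be checked by hand. The short sketch below reproduces them from the vocabulary size and the embedding dimension used above; the variable names are illustrative, and the byte figure assumes the weights are stored as 4-byte `float32` values.

```python
vocab_size = 6        # entries in the plurality vocabulary list
embedding_dim = 2     # dimension passed to embedding_column
dense_units = 1       # units in the output Dense layer

# Embedding table: one row of embedding_dim weights per vocabulary entry.
embedding_params = vocab_size * embedding_dim

# Dense layer: one weight per input feature per unit, plus one bias per unit.
dense_params = embedding_dim * dense_units + dense_units

total_params = embedding_params + dense_params
total_bytes = total_params * 4  # assumes float32 weights

print(embedding_params, dense_params, total_params, total_bytes)  # 12 3 15 60
```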
