# Bilma Classification demo

In [1]:
from bilma import bilma_model
import numpy as np

## Load the model

You can load the model using `bilma_model.load`.

In [2]:
model = bilma_model.load("models-final/bilma_small_MX_epoch-1_classification_epochs-13.h5")

The layers and parameters of the model are:

In [3]:
model.summary()

Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
capt_input (InputLayer)         [(None, 280)]        0                                            
__________________________________________________________________________________________________
embedding (Embedding)           (None, 280, 512)     14860800    capt_input[0][0]                 
__________________________________________________________________________________________________
encoder_5 (Encoder)             (None, 280, 512)     9456640     embedding[0][0]                  
__________________________________________________________________________________________________
tf.__operators__.getitem (Slici (None, 1, 512)       0           encoder_5[0][0]                  
______________________________________________________________________________________________

The input is an array of shape `(bs, 280)`, where `bs` is the batch size and 280 is the maximum sequence length.
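To make the fixed input length concrete, here is a minimal sketch of how variable-length ID sequences could be brought to length 280. It is only an illustration: the helper `pad_or_truncate` and the padding ID `0` are assumptions, not part of `bilma_model`, and the real tokenizer may handle padding differently.

```python
MAX_LENGTH = 280  # sequence length expected by the model input
PAD_ID = 0        # hypothetical padding ID; the real tokenizer may use another value


def pad_or_truncate(token_ids, max_length=MAX_LENGTH, pad_id=PAD_ID):
    """Return a list of exactly max_length IDs (hypothetical helper)."""
    clipped = list(token_ids)[:max_length]          # drop anything past max_length
    return clipped + [pad_id] * (max_length - len(clipped))  # fill the remainder


# Two made-up ID sequences: one short, one longer than 280.
batch = [pad_or_truncate(ids) for ids in ([5, 17, 42], list(range(1, 301)))]
batch_shape = (len(batch), len(batch[0]))
print(batch_shape)  # (2, 280)
```

In practice the tokenizer loaded later in this demo produces the input array directly, so a helper like this should not be needed.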

The outputs have shapes `(bs, 280, 29025)` and `(bs, 15)`. The first output is the same as that of the MLM model. The second output gives the probability of each of the 15 emoticons.

In [4]:
model.inputs, model.outputs

([<KerasTensor: shape=(None, 280) dtype=float32 (created by layer 'capt_input')>],
 [<KerasTensor: shape=(None, 280, 29025) dtype=float32 (created by layer 'dense_37')>,
  <KerasTensor: shape=(None, 15) dtype=float32 (created by layer 'cp')>])
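As a shape-only illustration of the first output, the sketch below takes the highest-scoring vocabulary index at every position. The array `mlm_scores` is a random stand-in, not real model output, and it uses a small made-up vocabulary of 50 entries instead of 29025 to keep the example light.

```python
import numpy as np

# Random stand-in for the first output; with the real model this would be
# an array of shape (bs, 280, 29025).
bs, seq_len, vocab_size = 2, 280, 50
rng = np.random.default_rng(0)
mlm_scores = rng.random((bs, seq_len, vocab_size), dtype=np.float32)

# Highest-scoring vocabulary index at each of the 280 positions.
token_ids = mlm_scores.argmax(axis=-1)
print(token_ids.shape)  # (2, 280)
```

The result has one integer per position, that is, the same shape as the model input.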

To feed data into the model, we need a tokenizer, just as with the MLM model.

In [5]:
tokenizer = bilma_model.tokenizer(vocab_file="d:/data/twitts/vocab_file_ALL.txt", max_length=280)

In [6]:
texts = ["Tenemos tres días sin internet ni señal de celular en el pueblo.",
         "Incomunicados en el siglo XXI tampoco hay servicio de telefonía fija",
         "Vamos a comer unos tacos",
         "Los del banco no dejan de llamarme"]
tweet = tokenizer.tokenize(texts)

We can now predict the emoticons.

In [7]:
p = model.predict(tweet)
p[1].shape

(4, 15)

In [8]:
tokenizer.decode_emo(p[1])

['🥺', '🤔', '😍', '😡']
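`decode_emo` returns a single emoticon per text. If we want to inspect several candidates, one option is to rank the class indices of the second output ourselves. The sketch below assumes a hypothetical helper `top_k_classes` and uses a small hand-made array in place of `p[1]` (5 classes instead of 15). Mapping the indices back to emoticons would require the tokenizer's label order, which is not shown here.

```python
import numpy as np


def top_k_classes(probs, k=3):
    """Return the indices of the k largest entries per row, best first."""
    probs = np.asarray(probs)
    order = np.argsort(-probs, axis=-1)  # sort each row in descending order
    return order[:, :k]


# Hand-made stand-in for p[1]: 2 texts, 5 classes.
scores = np.array([[0.10, 0.50, 0.05, 0.30, 0.05],
                   [0.60, 0.10, 0.10, 0.05, 0.15]])
top = top_k_classes(scores, k=2)
print(top.tolist())  # [[1, 3], [0, 4]]
```

With the real model, `top_k_classes(p[1], k=3)` would give a `(4, 3)` array of class indices for the four texts above.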