<img src="https://github.com/hernancontigiani/ceia_memorias_especializacion/raw/master/Figures/logoFIUBA.jpg" width="500" align="center">


# Natural Language Processing
## BERT to TensorFlow Serving

## 1 - Install dependencies (either in Colab or on your PC/server)

In [None]:
!pip install transformers --quiet


In [None]:
# Download the trained BERT weights from Google Drive (the fastest option)
# NOTE: there is no guarantee these links will stay available; if they are
# gone, the weights can be obtained from the BERT training in the previous class
!curl -L -o 'bert_weights.h5' 'https://drive.google.com/u/0/uc?id=1ILoVmLK3IFMOZiWEkqvqSmnHF7a--3h2&export=download&confirm=t'

Downloading...
From: https://drive.google.com/uc?id=1ZZaAKr4jb9eLSora5kbSg8dJqb4DkjzL&export=download
To: /content/bert_weights.h5
100%|██████████| 438M/438M [00:04<00:00, 105MB/s]
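
Google Drive downloads sometimes return an HTML page instead of the requested file, which only surfaces later when `load_weights` fails. The sketch below is one way to catch that early by checking the HDF5 signature at the start of the file. The helper name `looks_like_hdf5` is hypothetical, and the check assumes the signature sits at offset 0, which is the usual case for Keras `.h5` files.

```python
# The 8-byte signature that opens an HDF5 file (assumed to be at offset 0).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file starts with the HDF5 signature."""
    with open(path, "rb") as fh:
        return fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

In the notebook this could be called as `looks_like_hdf5("bert_weights.h5")` right after the download and before loading the weights.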


In [None]:
import tensorflow as tf
import numpy as np
from transformers import BertTokenizer, TFBertModel, BertConfig

output_shape = 3
max_length = 140

bert_model = TFBertModel.from_pretrained("bert-base-uncased")

input_ids = tf.keras.layers.Input(shape=(max_length,), dtype=tf.int32, name='input_ids')
attention_mask = tf.keras.layers.Input(shape=(max_length,), dtype=tf.int32, name='attention_mask')

# Get the pooled_output (an embedding that represents the whole input)
output = bert_model.bert(input_ids=input_ids, attention_mask=attention_mask)[1]

# We can also add dropout as a regularization technique:
output = tf.keras.layers.Dropout(rate=0.2)(output)

# More Dense layers can be added in between if desired

# Provide number of classes to the final layer:
output = tf.keras.layers.Dense(output_shape, activation='softmax')(output)

# Final model:
model = tf.keras.models.Model(inputs=[input_ids, attention_mask], outputs=output)
model.load_weights("bert_weights.h5")

Some layers from the model checkpoint at bert-base-uncased were not used when initializing TFBertModel: ['nsp___cls', 'mlm___cls']
- This IS expected if you are initializing TFBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
All the layers of TFBertModel were initialized from the model checkpoint at bert-base-uncased.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertModel for predictions without further training.
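
Both `Input` layers above expect sequences of exactly `max_length = 140` integers. The sketch below shows, in plain Python, what padding and masking to that length amounts to. The helper `pad_and_mask` and the example ids are hypothetical; in practice `BertTokenizer` can produce the same result when called with `padding="max_length"`, `truncation=True` and `max_length=140`.

```python
MAX_LENGTH = 140  # same value as max_length in the model definition
PAD_ID = 0        # [PAD] id in the bert-base-uncased vocabulary


def pad_and_mask(token_ids, max_length=MAX_LENGTH, pad_id=PAD_ID):
    """Truncate or right-pad token ids and build the matching attention mask.

    Truncation here is naive: it simply cuts the sequence, so a trailing
    [SEP] is lost for over-long inputs (a real tokenizer keeps it).
    """
    ids = list(token_ids)[:max_length]
    mask = [1] * len(ids)          # 1 marks a real token
    padding = max_length - len(ids)
    return ids + [pad_id] * padding, mask + [0] * padding  # 0 marks padding


# Hypothetical ids: 101 is [CLS], 102 is [SEP]; the two in between are arbitrary.
example_ids, example_mask = pad_and_mask([101, 7592, 2088, 102])
```

The mask tells BERT which positions hold real tokens, so padded positions do not contribute to attention.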


In [None]:
model.summary()

Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_ids (InputLayer)          [(None, 140)]        0                                            
__________________________________________________________________________________________________
attention_mask (InputLayer)     [(None, 140)]        0                                            
__________________________________________________________________________________________________
bert (TFBertMainLayer)          TFBaseModelOutputWit 109482240   input_ids[0][0]                  
                                                                 attention_mask[0][0]             
__________________________________________________________________________________________________
dropout_37 (Dropout)            (None, 768)          0           bert[0][1]                   
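
The final `Dense` layer produces three softmax probabilities per example. A minimal sketch of turning one such row into a label follows. The label names are placeholders, since the notebook only states that there are 3 classes, and `decode_prediction` is a hypothetical helper.

```python
# Hypothetical label names: the notebook only states that there are 3 classes.
LABELS = ["class_0", "class_1", "class_2"]


def decode_prediction(probabilities, labels=LABELS):
    """Return (label, probability) for the highest-scoring class."""
    if len(probabilities) != len(labels):
        raise ValueError("expected one probability per label")
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best], probabilities[best]


label, score = decode_prediction([0.1, 0.7, 0.2])
```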

In [None]:
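# Wrap the model's call in a tf.function and trace it with fixed input specs
# (named serving_fn to avoid shadowing the built-in `callable`)
serving_fn = tf.function(model.call)
concrete_function = serving_fn.get_concrete_function([tf.TensorSpec((None, max_length), tf.int32, name="input_ids"), tf.TensorSpec((None, max_length), tf.int32, name="attention_mask")])
# Export as a SavedModel; "1" is the version directory TensorFlow Serving looks for
model.save('mybert/1', signatures=concrete_function)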





INFO:tensorflow:Assets written to: mybert/1/assets
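
Once the SavedModel is loaded by TensorFlow Serving, it can be queried over the REST API. The sketch below only builds the request and sends nothing. It assumes the server was started with the model name `mybert` on the default REST port 8501, and that the exported signature exposes inputs named `input_ids` and `attention_mask`; the actual names can be checked with `saved_model_cli show --dir mybert/1 --all`. The helper `build_predict_request` is hypothetical.

```python
import json


def build_predict_request(input_ids, attention_mask,
                          host="localhost", port=8501, model_name="mybert"):
    """Return (url, json_body) for a TensorFlow Serving REST predict call."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = {
        # A single function passed to `signatures` is stored under this key.
        "signature_name": "serving_default",
        # Row format: one dict per example, keyed by input name (assumed names).
        "instances": [
            {"input_ids": list(input_ids), "attention_mask": list(attention_mask)}
        ],
    }
    return url, json.dumps(body)


# Hypothetical, already-padded example of length 140: just [CLS] and [SEP].
url, body = build_predict_request([101, 102] + [0] * 138, [1, 1] + [0] * 138)
```

The resulting `url` and `body` can be sent as a POST request with any HTTP client.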




In [None]:
# Zip the model for download
!zip -r mybert.zip mybert

  adding: mybert/ (stored 0%)
  adding: mybert/1/ (stored 0%)
  adding: mybert/1/variables/ (stored 0%)
  adding: mybert/1/variables/variables.index (deflated 77%)
  adding: mybert/1/variables/variables.data-00000-of-00001 (deflated 7%)
  adding: mybert/1/keras_metadata.pb (deflated 95%)
  adding: mybert/1/assets/ (stored 0%)
  adding: mybert/1/saved_model.pb (deflated 92%)
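
TensorFlow Serving expects the model directory to contain a numeric version subdirectory (here `1`) holding `saved_model.pb` and the `variables` folder, as in the listing above. A small sketch for confirming that the archive keeps this layout before moving it elsewhere follows. The helper name `missing_savedmodel_entries` is hypothetical.

```python
import zipfile


def missing_savedmodel_entries(zip_path, model_dir="mybert", version="1"):
    """Return the required SavedModel entries that are absent from the archive."""
    prefix = f"{model_dir}/{version}/"
    required = [
        prefix + "saved_model.pb",
        prefix + "variables/variables.index",
    ]
    with zipfile.ZipFile(zip_path) as archive:
        names = set(archive.namelist())
    return [entry for entry in required if entry not in names]
```

An empty list from `missing_savedmodel_entries("mybert.zip")` means both entries were found.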


In [None]:
# Mount Google Drive in Colab and copy the model to our account
import shutil
import os
from google.colab import drive
drive.mount('/content/drive', force_remount=True)
print(shutil.copyfile('mybert.zip', os.path.join("/content/drive/MyDrive", 'mybert.zip')))

Mounted at /content/drive
/content/drive/MyDrive/mybert.zip
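
For a large archive it can be worth confirming that the copy on Drive matches the local file. The sketch below compares SHA-256 digests; the helper names `sha256_of` and `same_content` are hypothetical, and this is an optional sanity check, not part of the original workflow.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(source, destination):
    """Return True when both files have identical SHA-256 digests."""
    return sha256_of(source) == sha256_of(destination)
```

In the notebook this could be called as `same_content('mybert.zip', '/content/drive/MyDrive/mybert.zip')`.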
