# Models (TensorFlow)

Install the Transformers and Datasets libraries to run this notebook.

In [1]:
!pip install datasets transformers[sentencepiece]


In [3]:
from transformers import BertConfig, TFBertModel

# Building the config
config = BertConfig()

# Building the model from the config
model = TFBertModel(config)

In [4]:
print(config)

BertConfig {
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "transformers_version": "4.11.3",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
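The printed configuration fixes the model's size. As a rough sanity check, the sketch below derives the per-head dimension and an approximate parameter count from those values in plain Python. It assumes the standard BERT encoder layout (token, position, and segment embeddings; query, key, value, and output projections; a two-layer feed-forward block; layer norms; and a pooler). `count_bert_parameters` is a hypothetical helper written for illustration, not part of Transformers.

```python
# Values copied from the printed BertConfig above.
config_values = {
    "vocab_size": 30522,
    "hidden_size": 768,
    "num_hidden_layers": 12,
    "num_attention_heads": 12,
    "intermediate_size": 3072,
    "max_position_embeddings": 512,
    "type_vocab_size": 2,
}


def count_bert_parameters(cfg):
    """Estimate trainable parameters, assuming the standard BERT layout."""
    h = cfg["hidden_size"]
    i = cfg["intermediate_size"]
    layer_norm = 2 * h  # scale and bias vectors

    embeddings = (
        cfg["vocab_size"] * h
        + cfg["max_position_embeddings"] * h
        + cfg["type_vocab_size"] * h
        + layer_norm
    )
    dense_hh = h * h + h  # one hidden-to-hidden dense layer with bias
    attention = 4 * dense_hh + layer_norm  # query, key, value, output
    feed_forward = (h * i + i) + (i * h + h) + layer_norm
    pooler = dense_hh
    return embeddings + cfg["num_hidden_layers"] * (attention + feed_forward) + pooler


head_dim = config_values["hidden_size"] // config_values["num_attention_heads"]
total_params = count_bert_parameters(config_values)
print(head_dim)      # 64
print(total_params)  # 109482240
```

Each of the 12 attention heads therefore works on a 64-dimensional slice of the 768-dimensional hidden state, and the default configuration lands at roughly 110 million parameters.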



In [5]:
from transformers import BertConfig, TFBertModel

config = BertConfig()
model = TFBertModel(config)

# The model's weights are randomly initialized at this point (no pretrained checkpoint is loaded)

In [6]:
from transformers import TFBertModel

model = TFBertModel.from_pretrained("bert-base-cased")


Some layers from the model checkpoint at bert-base-cased were not used when initializing TFBertModel: ['nsp___cls', 'mlm___cls']
- This IS expected if you are initializing TFBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
All the layers of TFBertModel were initialized from the model checkpoint at bert-base-cased.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertModel for predictions without further training.


In [7]:
model.save_pretrained("directory_on_my_computer")

In [9]:
!ls directory_on_my_computer

config.json  tf_model.h5
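
Of the two saved files, `config.json` holds the architecture attributes as plain JSON, and `tf_model.h5` holds the weights. The sketch below reads such a file with the standard library. To stay runnable without the saved model, it first writes a stand-in `config.json` into a temporary directory; the stand-in contains only a few of the keys printed earlier, which is an assumption made for illustration.

```python
import json
import tempfile
from pathlib import Path

# Stand-in for the directory created by save_pretrained (illustration only).
with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = Path(tmp_dir) / "config.json"
    config_path.write_text(
        json.dumps({"model_type": "bert", "hidden_size": 768, "num_hidden_layers": 12})
    )

    # Reading the file back is ordinary JSON parsing.
    loaded = json.loads(config_path.read_text())

print(loaded["model_type"], loaded["hidden_size"])  # bert 768
```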


In [10]:
sequences = [
  "Hello!",
  "Cool.",
  "Nice!"
]

In [11]:
encoded_sequences = [
  [ 101, 7592,  999,  102],
  [ 101, 4658, 1012,  102],
  [ 101, 3835,  999,  102]
]
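
Tensors must be rectangular, so every encoded sequence needs the same length before the list is converted. A minimal plain-Python sketch of that check follows; `infer_shape` is a hypothetical helper written for illustration, and the IDs are the ones from the cell above.

```python
encoded_sequences = [
    [101, 7592, 999, 102],
    [101, 4658, 1012, 102],
    [101, 3835, 999, 102],
]


def infer_shape(batch):
    """Return (batch_size, sequence_length), or raise if rows differ in length."""
    lengths = {len(row) for row in batch}
    if len(lengths) != 1:
        raise ValueError(f"ragged batch, row lengths: {sorted(lengths)}")
    return (len(batch), lengths.pop())


shape = infer_shape(encoded_sequences)
print(shape)  # (3, 4)
```

Here the three sequences each contain four token IDs, so the batch can be converted directly into a tensor with three rows and four columns.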

In [12]:
import tensorflow as tf

model_inputs = tf.constant(encoded_sequences)

In [13]:
model_inputs

<tf.Tensor: shape=(3, 4), dtype=int32, numpy=
array([[ 101, 7592,  999,  102],
       [ 101, 4658, 1012,  102],
       [ 101, 3835,  999,  102]], dtype=int32)>

In [14]:
output = model(model_inputs)

In [15]:
output

TFBaseModelOutputWithPooling([('last_hidden_state',
                               <tf.Tensor: shape=(3, 4, 768), dtype=float32, numpy=
                               array([[[ 4.4495684e-01,  4.8276263e-01,  2.7797201e-01, ...,
                                        -5.4032281e-02,  3.9393449e-01, -9.4770037e-02],
                                       [ 2.4942881e-01, -4.4092983e-01,  8.1772339e-01, ...,
                                        -3.1916580e-01,  2.2992201e-01, -4.1171677e-02],
                                       [ 1.3667591e-01,  2.2517806e-01,  1.4502057e-01, ...,
                                        -4.6914808e-02,  2.8224209e-01,  7.5566083e-02],
                                       [ 1.1788853e+00,  1.6738535e-01, -1.8187082e-01, ...,
                                         2.4671350e-01,  1.0440770e+00, -6.1969673e-03]],
                               
                                      [[ 3.6435843e-01,  3.2464169e-02,  2.0257643e-01, ...,
          