<a href="https://colab.research.google.com/github/niltonmalves/tokenizers_datasets_transformers/blob/main/intro_Create_a_custom_model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

https://huggingface.co/docs/transformers/v4.17.0/en/create_a_model

An AutoClass automatically infers the model architecture and downloads pretrained configuration and weights. Generally, we recommend using an AutoClass to produce checkpoint-agnostic code. But users who want more control over specific model parameters can create a custom 🤗 Transformers model from just a few base classes. This could be particularly useful for anyone who is interested in studying, training or experimenting with a 🤗 Transformers model. In this guide, dive deeper into creating a custom model without an AutoClass. Learn how to:

 - Load and customize a model configuration.
 - Create a model architecture.
 - Create a slow and fast tokenizer for text.
 - Create a feature extractor for audio or image tasks.
 - Create a processor for multimodal tasks.

# Configuration

A configuration refers to a model’s specific attributes. Each model configuration has different attributes; for instance, all NLP models have the hidden_size, num_attention_heads, num_hidden_layers and vocab_size attributes in common. These attributes specify the number of attention heads or hidden layers to construct a model with.

Take a closer look at DistilBERT by accessing DistilBertConfig to inspect its attributes:

In [2]:
!pip install folium==0.2.1


In [3]:
!pip install transformers


In [4]:
from transformers import DistilBertConfig

config = DistilBertConfig()
print(config)

DistilBertConfig {
  "activation": "gelu",
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "transformers_version": "4.17.0",
  "vocab_size": 30522
}
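The printed names (dim, n_heads, n_layers) differ from the generic names mentioned earlier (hidden_size, num_attention_heads, num_hidden_layers). The sketch below assumes a transformers release in which DistilBertConfig defines the attribute_map class attribute, as 4.17.0 does, and shows that both spellings read the same value.

```python
from transformers import DistilBertConfig

config = DistilBertConfig()

# attribute_map links the generic attribute names to DistilBERT's own names.
# Assumption: the installed release defines it on DistilBertConfig.
print(DistilBertConfig.attribute_map)

# Each pair reads the same underlying value through two different names.
print(config.hidden_size, config.dim)
print(config.num_attention_heads, config.n_heads)
print(config.num_hidden_layers, config.n_layers)
```

Code written against the generic names should therefore keep working when the configuration class is swapped for another architecture.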



DistilBertConfig displays all the default attributes used to build a base DistilBertModel. All attributes are customizable, creating space for experimentation. For example, you can customize a default model to:

 - Try a different activation function with the activation parameter.
 - Use a higher dropout ratio for the attention probabilities with the attention_dropout parameter.

In [5]:
my_config = DistilBertConfig(activation="relu", attention_dropout=0.4)
print(my_config)

DistilBertConfig {
  "activation": "relu",
  "attention_dropout": 0.4,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "transformers_version": "4.17.0",
  "vocab_size": 30522
}
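Comparing two long printouts by eye is error-prone. As a minimal sketch, the snippet below compares the customized configuration with a default one and keeps only the entries that differ. The names default_values and changed are introduced here for illustration only.

```python
from transformers import DistilBertConfig

default_config = DistilBertConfig()
my_config = DistilBertConfig(activation="relu", attention_dropout=0.4)

# to_dict() returns the configuration as a plain Python dictionary.
default_values = default_config.to_dict()

# Keep only the attributes whose value differs from the default configuration.
changed = {
    name: value
    for name, value in my_config.to_dict().items()
    if default_values.get(name) != value
}
print(changed)
```

Only activation and attention_dropout should appear, since every other attribute keeps its default.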



In [6]:
my_config = DistilBertConfig(activation="relu", attention_dropout=0.4, vocab_size=60000)
print(my_config)

DistilBertConfig {
  "activation": "relu",
  "attention_dropout": 0.4,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "transformers_version": "4.17.0",
  "vocab_size": 60000
}



In [15]:
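# save_pretrained() expects a directory, not a file name: this call creates a folder
# named "my_config.json" and writes the configuration to my_config.json/config.json.
my_config.save_pretrained(save_directory="./my_config.json")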

In [9]:
!rm config.json

In [14]:
!ls

config.json  sample_data  tes.json


In [16]:
# from_pretrained() accepts the save directory and reads the config.json stored inside it.
my_config = DistilBertConfig.from_pretrained("./my_config.json")
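The save and reload steps can be checked end to end. This is a minimal sketch that writes to a temporary directory (an assumption made here so the example leaves no files behind) and confirms that the customized values survive the round trip.

```python
import os
import tempfile

from transformers import DistilBertConfig

my_config = DistilBertConfig(activation="relu", attention_dropout=0.4, vocab_size=60000)

with tempfile.TemporaryDirectory() as save_dir:
    # save_pretrained() writes a file named config.json into the given directory.
    my_config.save_pretrained(save_dir)
    saved_files = sorted(os.listdir(save_dir))
    # from_pretrained() reads that config.json back from the directory.
    reloaded = DistilBertConfig.from_pretrained(save_dir)

print(saved_files)
print(reloaded.activation, reloaded.attention_dropout, reloaded.vocab_size)
```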

In [18]:
from transformers import DistilBertModel

model = DistilBertModel(my_config)
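A model built directly from a configuration starts with randomly initialized weights, not pretrained ones, so it needs training before it produces useful output. The sketch below assumes PyTorch is installed and uses deliberately small, hypothetical sizes (small_config) so that it runs quickly. It shows how the configuration values determine the shape of the resulting model.

```python
from transformers import DistilBertConfig, DistilBertModel

# Hypothetical small sizes chosen only to keep the example fast.
# dim must be divisible by n_heads.
small_config = DistilBertConfig(
    vocab_size=1000, dim=64, hidden_dim=128, n_layers=2, n_heads=4
)

# Instantiating from a config creates randomly initialized weights.
model = DistilBertModel(small_config)

# The embedding matrix has one row per vocabulary entry and dim columns.
embedding_shape = tuple(model.get_input_embeddings().weight.shape)
num_layers = len(model.transformer.layer)
num_parameters = sum(p.numel() for p in model.parameters())

print(embedding_shape)
print(num_layers)
print(num_parameters)
```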

**If you trained your own tokenizer, you can create one from your vocabulary file:**

In [None]:
# from transformers import DistilBertTokenizer

# my_tokenizer = DistilBertTokenizer(vocab_file="my_vocab_file.txt", do_lower_case=False, padding_side="left")
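The cell above is commented out because my_vocab_file.txt does not exist in this notebook. As a hedged sketch, the snippet below writes a toy vocabulary (hypothetical, for illustration only) to a temporary file and builds the slow tokenizer from it, assuming a 4.x release of transformers such as the 4.17.0 installed above.

```python
import os
import tempfile

from transformers import DistilBertTokenizer

# Hypothetical toy vocabulary: one token per line, the line index is the token id.
toy_vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]", "hello", "world"]

with tempfile.TemporaryDirectory() as tmp_dir:
    vocab_path = os.path.join(tmp_dir, "my_vocab_file.txt")
    with open(vocab_path, "w", encoding="utf-8") as f:
        f.write("\n".join(toy_vocab) + "\n")
    # The vocabulary is read into memory here, so the file can be removed afterwards.
    my_tokenizer = DistilBertTokenizer(
        vocab_file=vocab_path, do_lower_case=False, padding_side="left"
    )

tokens = my_tokenizer.tokenize("hello world")

# With padding_side="left", the shorter sequence receives [PAD] ids at the front.
batch = my_tokenizer(["hello", "hello world"], padding=True)

print(tokens)
print(batch["input_ids"])
```

A real vocabulary file would come from training a tokenizer on your own corpus. The seven-entry list here only demonstrates the file format and the effect of padding_side.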