### NLP Model Size
1. How many parameters does the BERT base cased model have (bert-base-cased)? 

2. If I know the number of parameters for a model, how can I estimate the memory required to run inference? (Assume each parameter is stored as a single-precision floating-point number.)

3. If I wanted to run GPT-3 inference, how much RAM would my infrastructure require?

In [1]:
from transformers import AutoModel, AutoTokenizer
import torch

# Two of the checkpoints that I usually use with BERT are BERT base cased
# and BERT base uncased.

# BERT base cased distinguishes between upper- and lower-case words.
# BERT base uncased does not.

def get_model_size(checkpoint='bert-base-cased'):
  '''
  Return the total number of parameters in a pretrained model.

  checkpoint - name of the NLP model to load (its configuration and its
               associated weights)
  '''

  # Only the model is needed to count parameters; no tokenizer is required.
  model = AutoModel.from_pretrained(checkpoint)

  # torch.numel gives the number of elements in each parameter tensor.
  return sum(torch.numel(param) for param in model.parameters())

checkpoint='bert-base-cased'
print(f"The number of parameters for {checkpoint} is : {get_model_size(checkpoint)}")

Some weights of the model checkpoint at bert-base-cased were not used when initializing BertModel: ['cls.predictions.transform.dense.weight', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining

The number of parameters for bert-base-cased is : 108310272


In [2]:
# 2. If I know the number of parameters for a model, how can I estimate the memory required to run inference?
# (Assume each parameter is stored as a single-precision floating-point number.)

# A single-precision (float32) number takes 4 bytes, so multiplying the
# parameter count by 4 gives the approximate size of the model weights in bytes.

# With about 108 million parameters, 4 * 108 gives that size in megabytes.

# That comes to roughly 432 MB of RAM for the weights alone, which is a little under half a gigabyte.
4 * 108

432
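The same arithmetic can be run on the exact parameter count printed above instead of the rounded 108 million. This is a minimal sketch: the helper name `estimate_weight_memory` and its `bytes_per_param` argument are made up for illustration, and the result covers only the weights, not activations or framework overhead.

```python
def estimate_weight_memory(num_params: int, bytes_per_param: int = 4) -> int:
    """Return the bytes needed to hold the weights alone.

    bytes_per_param defaults to 4, the size of a single-precision float.
    """
    return num_params * bytes_per_param


# Exact parameter count reported for bert-base-cased in the first cell.
bert_params = 108_310_272
bert_bytes = estimate_weight_memory(bert_params)

print(f"{bert_bytes:,} bytes")          # 433,241,088 bytes
print(f"{bert_bytes / 1e6:.1f} MB")     # 433.2 MB (decimal megabytes)
print(f"{bert_bytes / 2**20:.1f} MiB")  # 413.2 MiB (binary mebibytes)
```

The decimal and binary figures differ by about 5%, so it helps to state which unit an estimate uses.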

In [3]:
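# 3. If I wanted to run GPT-3 inference, how much RAM would my infrastructure require?
# GPT-3 has 175 billion parameters, i.e. 175,000 million, so 4 * 175,000 gives the size in megabytes.
# The result, 700,000 MB, is about 700 GB of RAM for the weights alone.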
175000 * 4

700000
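The estimate scales linearly with the bytes used per parameter. The sketch below repeats the GPT-3 calculation in gigabytes for single precision and, as an assumption that goes beyond the question above, for two smaller storage formats. The names `BYTES_PER_PARAM`, `GPT3_PARAMS`, and `weight_memory_gb` are hypothetical, and the figures cover the weights only.

```python
# Bytes needed to store one parameter in each format.
# float32 is the single-precision case used above; the others are
# included only for comparison.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}

# 175 billion parameters, matching the 175,000 million used above.
GPT3_PARAMS = 175_000_000_000


def weight_memory_gb(num_params: int, dtype: str = "float32") -> float:
    """Return the weight-only memory estimate in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


gpt3_estimates = {
    dtype: weight_memory_gb(GPT3_PARAMS, dtype) for dtype in BYTES_PER_PARAM
}

for dtype, gigabytes in gpt3_estimates.items():
    print(f"{dtype}: {gigabytes:.0f} GB")
# float32: 700 GB
# float16: 350 GB
# int8: 175 GB
```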