# Inference using packages

Different GPT-2 model options:
- gpt2: This is the "small" version of GPT-2. It has 124 million parameters.
- gpt2-medium: This is the "medium" version of GPT-2. It has 355 million parameters.
- gpt2-large: This is the "large" version of GPT-2. It has 774 million parameters.
- gpt2-xl: This is the "extra large" version of GPT-2. It has 1.5 billion parameters.
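
The parameter counts above can be roughly reproduced from the architecture alone. The sketch below is only an illustration: the function name `gpt2_param_count` is hypothetical, and the layer counts and hidden sizes per variant (12/768, 24/1024, 36/1280, 48/1600), the vocabulary size 50257 and the context length 1024 are assumptions taken from the commonly published GPT-2 configurations, not from this notebook. It also assumes the output head shares its weights with the token embedding, so the head is not counted twice.

```python
def gpt2_param_count(n_layer, d_model, vocab_size=50257, n_ctx=1024):
    """Count the parameters of a GPT-2 style decoder with tied embeddings."""
    embeddings = vocab_size * d_model + n_ctx * d_model  # token + position tables
    layer_norm = 2 * d_model                             # scale and shift vectors
    attention = d_model * 3 * d_model + 3 * d_model      # fused q/k/v projection
    attention += d_model * d_model + d_model             # attention output projection
    mlp = d_model * 4 * d_model + 4 * d_model            # expand to 4 * d_model
    mlp += 4 * d_model * d_model + d_model               # project back to d_model
    block = 2 * layer_norm + attention + mlp             # one transformer block
    return embeddings + n_layer * block + layer_norm     # plus the final layer norm


# Assumed (n_layer, d_model) per variant; not stated in this notebook.
configs = {
    "gpt2": (12, 768),
    "gpt2-medium": (24, 1024),
    "gpt2-large": (36, 1280),
    "gpt2-xl": (48, 1600),
}

for name, (n_layer, d_model) in configs.items():
    print(f"{name}: {gpt2_param_count(n_layer, d_model):,}")
# first line printed: gpt2: 124,439,808
```

Under these assumptions the four totals round to 124M, 355M, 774M and about 1.56B, which lines up with the figures in the list.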


In [20]:
from transformers import GPT2LMHeadModel, GPT2Tokenizer
gpt2 = GPT2LMHeadModel.from_pretrained('gpt2') # loading gpt2 from transformers library
gpt2_tokenizer = GPT2Tokenizer.from_pretrained('gpt2') # loading gpt2 tokenizer from transformers library



In [23]:
input_text = "A long time ago in a galaxy far far away ..."
input_ids = gpt2_tokenizer.encode(input_text, return_tensors='pt') # tokenize input
output = gpt2.generate(input_ids, max_length=50) # run inference
generated_text = gpt2_tokenizer.decode(output[0], skip_special_tokens=True) # decode output tokens
print(generated_text)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


A long time ago in a galaxy far far away...

The first human-made planet was discovered in the early 1960s by a team of astronomers from the University of California, Berkeley.

The discovery of the first human-made planet
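
The output stops mid-sentence because `max_length=50` counts the prompt tokens as well as the generated ones. With no sampling or beam-search options set, `generate` is assumed here to fall back to greedy decoding, which repeatedly appends the single highest-scoring next token. The sketch below illustrates only that loop: `toy_logits` is a hypothetical stand-in for the model, `greedy_generate` is not part of `transformers`, and end-of-sequence handling is left out.

```python
import numpy as np


def toy_logits(token_ids, vocab_size=8):
    """Hypothetical stand-in for a language model: returns next-token scores."""
    logits = np.zeros(vocab_size)
    logits[(token_ids[-1] + 1) % vocab_size] = 1.0  # always favours "last token + 1"
    return logits


def greedy_generate(prompt_ids, max_length, logits_fn=toy_logits):
    """Append the highest-scoring token until the sequence holds max_length tokens."""
    ids = list(prompt_ids)
    while len(ids) < max_length:
        next_id = int(np.argmax(logits_fn(ids)))  # greedy choice, no sampling
        ids.append(next_id)
    return ids


print(greedy_generate([3, 4], max_length=6))  # [3, 4, 5, 6, 7, 0]
```

Because the choice at every step is an `argmax`, the same prompt always yields the same continuation.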


# Inference using NumPy

In [18]:
import numpy as np

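def torch_to_numpy(tensor):
    """Convert a PyTorch tensor to an independent NumPy array."""
    # NumPy arrays live in CPU memory, so a GPU tensor must be moved to the host first.
    if tensor.is_cuda:
        tensor = tensor.cpu()
    # detach() is harmless for state_dict tensors and avoids an error if the tensor requires grad.
    numpy_array = tensor.detach().numpy()
    # Copy so the array does not share memory with the original tensor.
    return numpy_array.copy()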
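
Once a weight is a NumPy array, a layer can be applied with a plain matrix product. As a minimal sketch, the fused attention projection of GPT-2 small (`c_attn.weight` with shape `(768, 2304)`) is applied below to a few token vectors and split into query, key and value. The random arrays and the names `c_attn_weight`, `c_attn_bias` and `x` are illustrative stand-ins, not the real weights. The sketch assumes the weight is stored as `(in_features, out_features)`, so no transpose is needed.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, seq_len = 768, 5

# Random stand-ins (assumption) with the shapes of c_attn.weight and c_attn.bias.
c_attn_weight = rng.standard_normal((d_model, 3 * d_model)).astype(np.float32)
c_attn_bias = np.zeros(3 * d_model, dtype=np.float32)

# One hidden-state vector per token position.
x = rng.standard_normal((seq_len, d_model)).astype(np.float32)

qkv = x @ c_attn_weight + c_attn_bias  # fused projection, shape (seq_len, 2304)
q, k, v = np.split(qkv, 3, axis=-1)    # three arrays of shape (seq_len, 768)

print(qkv.shape, q.shape, k.shape, v.shape)
```

With real weights, `c_attn_weight` and `c_attn_bias` would come from `torch_to_numpy` applied to the matching `state_dict` entries.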

In [27]:
state_dict = gpt2.state_dict()
for name, param in state_dict.items():
    ans = torch_to_numpy(param)
    print(f'{name}: {ans.shape}')

transformer.wte.weight: (50257, 768)
transformer.wpe.weight: (1024, 768)
transformer.h.0.ln_1.weight: (768,)
transformer.h.0.ln_1.bias: (768,)
transformer.h.0.attn.c_attn.weight: (768, 2304)
transformer.h.0.attn.c_attn.bias: (2304,)
transformer.h.0.attn.c_proj.weight: (768, 768)
transformer.h.0.attn.c_proj.bias: (768,)
transformer.h.0.ln_2.weight: (768,)
transformer.h.0.ln_2.bias: (768,)
transformer.h.0.mlp.c_fc.weight: (768, 3072)
transformer.h.0.mlp.c_fc.bias: (3072,)
transformer.h.0.mlp.c_proj.weight: (3072, 768)
transformer.h.0.mlp.c_proj.bias: (768,)
transformer.h.1.ln_1.weight: (768,)
transformer.h.1.ln_1.bias: (768,)
transformer.h.1.attn.c_attn.weight: (768, 2304)
transformer.h.1.attn.c_attn.bias: (2304,)
transformer.h.1.attn.c_proj.weight: (768, 768)
transformer.h.1.attn.c_proj.bias: (768,)
transformer.h.1.ln_2.weight: (768,)
transformer.h.1.ln_2.bias: (768,)
transformer.h.1.mlp.c_fc.weight: (768, 3072)
transformer.h.1.mlp.c_fc.bias: (3072,)
transformer.h.1.mlp.c_proj.weight: (