First, let's install the packages we need. The model used here is GPT-Neo, EleutherAI's open-source GPT-3-style model, not GPT-3 itself. We are already in the conda environment from Jupyter.

Let's start with PyTorch.

In [1]:
!conda install pytorch torchvision torchaudio -c pytorch -c nvidia -c apple -y

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.



Now let's install Hugging Face Transformers. It makes using popular Transformer models much easier.

In [2]:
!conda install -c huggingface transformers -y

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.



Let's import the needed packages now.

In [3]:
from transformers import GPTNeoForCausalLM, GPT2Tokenizer
import torch

Now let's get the model. We can run either the 1.3 billion parameter model, "EleutherAI/gpt-neo-1.3B", or the 2.7 billion parameter model, "EleutherAI/gpt-neo-2.7B". Let's use the 2.7B model.

In [4]:
model_name = "EleutherAI/gpt-neo-2.7B"
model = GPTNeoForCausalLM.from_pretrained(model_name)

This model can run on a GPU, but does not have to. The 2.7B model takes slightly less than 13 GB of VRAM, and the 1.3B model takes slightly less than 7.5 GB. The cells below move the model to the GPU if one is present and it has enough free VRAM.
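
As a rough sanity check on those figures, the memory needed for the weights alone can be estimated as parameter count times bytes per parameter. The sketch below assumes the weights are stored as 32-bit floats (4 bytes each) and uses the nominal parameter counts from the model names; `estimate_weight_gb` is an illustrative helper, not part of any library. Real usage is higher because of buffers, activations, and the CUDA context, so treat these numbers as a lower bound.

```python
def estimate_weight_gb(num_params, bytes_per_param=4):
    """Lower-bound estimate of weight memory in GB (1 GB = 1024**3 bytes here)."""
    return num_params * bytes_per_param / 1024**3

# Nominal sizes taken from the model names (an approximation).
for name, params in [("gpt-neo-1.3B", 1.3e9), ("gpt-neo-2.7B", 2.7e9)]:
    print(f"{name}: about {estimate_weight_gb(params):.1f} GB for fp32 weights")
```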

Let's install pynvml to check how much VRAM is free.

In [5]:
!pip install pynvml



In [6]:
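free_vram = 0.0
if torch.cuda.is_available():
    # Query free memory on GPU 0 through NVML.
    from pynvml import nvmlInit, nvmlDeviceGetHandleByIndex, nvmlDeviceGetMemoryInfo
    nvmlInit()
    h = nvmlDeviceGetHandleByIndex(0)
    info = nvmlDeviceGetMemoryInfo(h)
    # info.free is in bytes; divide by 1024**3 to get GB.
    free_vram = info.free / 1024**3
    print(f"There is a GPU with {free_vram:.2f} GB of free VRAM")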

In [7]:
if model_name == "EleutherAI/gpt-neo-2.7B" and free_vram>13.5:
    use_cuda = True
    model.to("cuda:0")
elif model_name == "EleutherAI/gpt-neo-1.3B" and free_vram>7.5:
    use_cuda = True
    model.to("cuda:0")
else:
    use_cuda = False
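
The two GPU branches above differ only in the threshold, so the same decision can be expressed as a table lookup. This is a minimal sketch, assuming the thresholds from the cell above; `VRAM_THRESHOLDS_GB` and `pick_device` are hypothetical names introduced for illustration.

```python
# Free-VRAM thresholds in GB, copied from the cell above.
VRAM_THRESHOLDS_GB = {
    "EleutherAI/gpt-neo-2.7B": 13.5,
    "EleutherAI/gpt-neo-1.3B": 7.5,
}

def pick_device(model_name, free_vram_gb, thresholds=VRAM_THRESHOLDS_GB):
    """Return "cuda:0" if the model is listed and enough VRAM is free, else "cpu"."""
    required = thresholds.get(model_name)
    if required is not None and free_vram_gb > required:
        return "cuda:0"
    return "cpu"

print(pick_device("EleutherAI/gpt-neo-2.7B", 16.0))
print(pick_device("EleutherAI/gpt-neo-2.7B", 8.0))
```

In the notebook this could be used as `device = pick_device(model_name, free_vram)`, followed by `model.to(device)` and `use_cuda = device != "cpu"`.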

Now we need to load the tokenizer to prepare the input for GPT-Neo.

In [8]:
tokenizer = GPT2Tokenizer.from_pretrained(model_name)

We are almost done. At this point we need to decide what prompt we want the model to continue, as well as how long we want the generated output to be.

In [9]:
prompt = input("Please enter a prompt: ")  # input() already returns a str

Please enter a prompt: Is there a God?


In [10]:
output_length = int(input("How long should the generated output be? "))

How long should the generated output be? 100
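
Because `int(input(...))` raises a `ValueError` on non-numeric text, it can help to validate the length before generating. The helper below is a hypothetical sketch, not part of the original notebook; its 2048-token upper bound assumes GPT-Neo's context window. Keep in mind that `max_length` in `generate` counts the prompt tokens as well as the newly generated ones.

```python
def parse_output_length(raw, minimum=1, maximum=2048):
    """Parse a user-supplied length, rejecting non-integers and out-of-range values."""
    try:
        value = int(raw.strip())
    except ValueError:
        raise ValueError(f"expected a whole number, got {raw!r}") from None
    if not minimum <= value <= maximum:
        raise ValueError(f"length must be between {minimum} and {maximum}, got {value}")
    return value

print(parse_output_length("100"))
```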


We now need to tokenize the input prompt to prepare it for use with the model. If we are using a GPU, we will move the token IDs to it as well.

In [11]:
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
if use_cuda:
    input_ids = input_ids.cuda()

In [12]:
gen_tokens = model.generate(input_ids, do_sample=True, temperature=0.9, max_length=output_length)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
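
The warning above can be avoided by passing the attention mask and a pad token ID explicitly. The function below is a sketch of that, written to take the `model` and `tokenizer` objects from the earlier cells as arguments; the name `generate_text` and its `device` parameter are illustrative. It also swaps `max_length` for `max_new_tokens`, which counts only the generated tokens rather than prompt plus output.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens, device="cpu"):
    """Generate a continuation of `prompt`, passing the mask and pad token explicitly."""
    encoded = tokenizer(prompt, return_tensors="pt")
    input_ids = encoded.input_ids.to(device)
    attention_mask = encoded.attention_mask.to(device)
    gen_tokens = model.generate(
        input_ids,
        attention_mask=attention_mask,        # marks which tokens are real input
        pad_token_id=tokenizer.eos_token_id,  # GPT-Neo has no dedicated pad token
        do_sample=True,
        temperature=0.9,
        max_new_tokens=max_new_tokens,        # counts generated tokens only
    )
    return tokenizer.batch_decode(gen_tokens)[0]
```

With the objects from this notebook, a call might look like `generate_text(model, tokenizer, prompt, 100, device="cuda:0" if use_cuda else "cpu")`.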


In [13]:
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print(gen_text)

Is there a God? (and what about science?) A series of articles on whether Jesus Christ is a “god” and what science can tell us about Jesus. I believe there is a God. I’m a Christian, and I believe in Jesus. I believe science can only inform us if Jesus Christ is not.

I have been accused of being a Christian apologist. I respond:

Apologists for the supernatural are not Christians. They believe in theism
