# Parrot-GPT

Install PyTorch and HuggingFace Transformers.

In [1]:
!conda install pytorch torchvision torchaudio transformers -c pytorch -c nvidia -c apple -c huggingface -y

Retrieving notices: ...working... done
Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.



Import needed packages.

In [3]:
from transformers import GPTNeoForCausalLM, GPT2Tokenizer
import torch

Now let's get the model. We can run either the 1.3-billion-parameter model, "EleutherAI/gpt-neo-1.3B", or the 2.7-billion-parameter model, "EleutherAI/gpt-neo-2.7B". Let's use the 2.7B model.

In [7]:
model_name = "EleutherAI/gpt-neo-2.7B"
model = GPTNeoForCausalLM.from_pretrained(model_name)

This model can run on a GPU, but does not have to. The 2.7B model takes slightly less than 13 GB of VRAM, and the 1.3B model takes slightly less than 7.5 GB. The model will be placed on the GPU if one is available and it has enough free VRAM.

Install pynvml to take a look at how much VRAM we have.

In [4]:
!pip install pynvml



In [8]:
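free_vram = 0.0  # free GPU memory in GiB; stays 0.0 when no CUDA device is present
if torch.cuda.is_available():
    import pynvml
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU
    info = pynvml.nvmlDeviceGetMemoryInfo(handle)
    free_vram = info.free / 1024**3  # NVML reports bytes; convert to GiB
    print("There is a GPU with " + str(free_vram) + " GiB of free VRAM")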
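
NVML reports memory in bytes, so the free-memory figure has to be converted before it is compared against a size in gigabytes. The snippet below is a minimal sketch of that conversion; `bytes_to_gib` and the sample byte count are hypothetical and are not part of the notebook.

```python
BYTES_PER_GIB = 1024 ** 3  # 1 GiB = 1,073,741,824 bytes


def bytes_to_gib(n_bytes):
    """Convert a raw byte count (as NVML reports it) to GiB."""
    return n_bytes / BYTES_PER_GIB


# Hypothetical reading: a card reporting exactly 16 GiB free.
sample_free_bytes = 16 * 1024 ** 3
print(bytes_to_gib(sample_free_bytes))  # 16.0
```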

In [9]:
if model_name == "EleutherAI/gpt-neo-2.7B" and free_vram>13.5:
    use_cuda = True
    model.to("cuda:0")
elif model_name == "EleutherAI/gpt-neo-1.3B" and free_vram>7.5:
    use_cuda = True
    model.to("cuda:0")
else:
    use_cuda = False
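
The branching above can also be expressed as a lookup table, which may be easier to extend if more model sizes are added. This is only a sketch: `MIN_FREE_VRAM` and `should_use_cuda` are hypothetical names, and the thresholds are copied from the cell above.

```python
# Minimum free VRAM per model, mirroring the thresholds used in this notebook.
MIN_FREE_VRAM = {
    "EleutherAI/gpt-neo-2.7B": 13.5,
    "EleutherAI/gpt-neo-1.3B": 7.5,
}


def should_use_cuda(model_name, free_vram):
    """Return True when free_vram exceeds the threshold for model_name."""
    required = MIN_FREE_VRAM.get(model_name)
    # Unknown model names fall back to the CPU, like the else branch above.
    return required is not None and free_vram > required


print(should_use_cuda("EleutherAI/gpt-neo-2.7B", 16.0))  # True
print(should_use_cuda("EleutherAI/gpt-neo-1.3B", 6.0))   # False
```

In the notebook, the result would be assigned to `use_cuda`, followed by `model.to("cuda:0")` when it is `True`.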

Load the tokenizer to prepare the input for GPT-Neo.

In [10]:
tokenizer = GPT2Tokenizer.from_pretrained(model_name)

Enter your prompt.

In [11]:
prompt = str(input("Please enter a prompt: "))

Please enter a prompt: What is the meaning of life?


In [12]:
output_length = int(input("How long should the generated output be? "))

How long should the generated output be? 200


Tokenize the input prompt to prepare it for use with the model.

In [13]:
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
if use_cuda:
    input_ids = input_ids.cuda()

In [14]:
gen_tokens = model.generate(input_ids, do_sample=True, temperature=0.9, max_length=output_length)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
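
The warning above appears because `generate` received only `input_ids`. One way to avoid it, sketched below, is to pass the tokenizer's `attention_mask` along with an explicit `pad_token_id`. The helper name `generate_with_mask` and its `device` parameter are hypothetical; the sampling settings are the ones used in this notebook.

```python
def generate_with_mask(model, tokenizer, prompt, max_length, device="cpu"):
    """Generate tokens while passing the attention mask and pad token id explicitly."""
    # The tokenizer returns both input_ids and attention_mask.
    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs["input_ids"].to(device)
    attention_mask = inputs["attention_mask"].to(device)
    return model.generate(
        input_ids,
        attention_mask=attention_mask,
        # GPT-Neo has no dedicated pad token, so reuse the end-of-sequence token.
        pad_token_id=tokenizer.eos_token_id,
        do_sample=True,
        temperature=0.9,
        max_length=max_length,
    )
```

With the objects defined earlier, the call would look like `gen_tokens = generate_with_mask(model, tokenizer, prompt, output_length, "cuda:0" if use_cuda else "cpu")`. Note that `max_length` counts the prompt tokens as well as the generated ones, so a prompt of 8 tokens with `max_length=200` yields at most 192 new tokens.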


In [15]:
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print(gen_text)

What is the meaning of life?

If the meaning of life can be said to be in its own right, then it's difficult to deny its existence. Without question, life itself is the universe and everything in it, with the exception of something like death, which is simply our realization of the unendingness of the universe and its impermanence. For some people, such as the Hindus, the most meaningful thing to them is to serve others in whatever form they can (Hinduism is a very ancient religion that is still growing and evolving). This was how I thought it should be. The more that I experienced myself, as a Buddhist and as a human being, however, the more I realized I didn't really like how things were. I didn't like the way the world was making me. I was tired of the way the world was making me. I wanted to change, especially for the better.

This isn't something new to me. I've always
