Let's get GPT-J working

I started with this tutorial: https://towardsdatascience.com/how-you-can-use-gpt-j-9c4299dd8526, which did not quite work.

The following colab notebook was most helpful: https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/GPT-J-6B/Inference_with_GPT_J_6B.ipynb

In [82]:
from transformers import GPTJForCausalLM
import torch

torch.cuda.empty_cache()

In [2]:

model = GPTJForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", revision='float16', torch_dtype=torch.float16, low_cpu_mem_usage=True)
torch.save(model, 'gptj.pt')


In [83]:
model = torch.load('gptj.pt')


In [84]:
print(torch.cuda.is_available())

True


In [85]:
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

model = model.to(torch.device('cuda'))

RuntimeError: CUDA out of memory. Tried to allocate 32.00 MiB (GPU 0; 23.70 GiB total capacity; 21.77 GiB already allocated; 22.56 MiB free; 21.92 GiB reserved in total by PyTorch)

In [5]:
context = """In a shocking finding, scientists discovered a herd of unicorns living in a remote, 
           previously unexplored valley, in the Andes Mountains. Even more surprising to the 
           researchers was the fact that the unicorns spoke perfect English."""


input_ids = tokenizer(context, return_tensors="pt").input_ids.to(torch.device('cuda'))
gen_tokens = model.generate(input_ids, do_sample=True, temperature=0.7, max_length=1000,)
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print(gen_text)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In a shocking finding, scientists discovered a herd of unicorns living in a remote, 
           previously unexplored valley, in the Andes Mountains. Even more surprising to the 
           researchers was the fact that the unicorns spoke perfect English.
          
           "We've known for a while that unicorns existed in the Andes, but we didn't know 
           they spoke English," said one researcher, whose name is being withheld for 
           security reasons. "All the unicorns we've ever seen have been brown or black, but 
           these were white. And they could talk."
          
           "It was like the whole herd had been surgically altered," said another researcher, 
           adding that the unicorns still had their horns and hooves. "I thought they might 
           be part of some government experiment, but that was way too far-fetched."
          
           "I mean, if they are from some government experiment, then why would they be 
           speaking Engli

In [6]:
prompt = """
Sentence: This movie is very nice.
Sentiment: positive

#####

Sentence: I hated this movie, it sucks.
Sentiment: negative

#####

Sentence: This movie was actually pretty funny.
Sentiment: positive

#####

Sentence: This movie could have been better.
Sentiment: """
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(torch.device('cuda'))

generated_ids = model.generate(input_ids, do_sample=True, temperature=0.9, max_length=200)
generated_text = tokenizer.decode(generated_ids[0])
print(generated_text)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.



Sentence: This movie is very nice.
Sentiment: positive

#####

Sentence: I hated this movie, it sucks.
Sentiment: negative

#####

Sentence: This movie was actually pretty funny.
Sentiment: positive

#####

Sentence: This movie could have been better.
Sentiment:  neutral

#####

Sentence: This movie was awesome.
Sentiment: positive

#####

Sentence: This movie is pretty good.
Sentiment: positive

#####

Sentence: This movie sucks.
Sentiment: negative

#####

Sentence: This movie was awful.
Sentiment: negative

#####

#####

#####

#####

#####

#####
This chapter is a quick review of some common sentiment terms. Some sentiment words have positive, negative


In [7]:
prompt = """Instruction: Generate a Python function that lets you reverse a list.

Answer: """
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(torch.device('cuda'))

generated_ids = model.generate(input_ids, do_sample=True, temperature=1.0, max_length=200)
generated_text = tokenizer.decode(generated_ids[0])
print(generated_text)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Instruction: Generate a Python function that lets you reverse a list.

Answer: 
def reverse(lst):
    start = 0
    if len(lst) > 0 and len(lst) % 2!= 0:
        return lst[::-1]
    else:
        return lst    
lst = [1,3,-5,8,5,7,-7]
print(reverse(lst))
Output:

[7,-7,5,3,-5,8,5]

If you want to reverse a string or any other data type you can use following code.
def reverse(L):
    # Start with a reverse list
    vr = []
    # Add each element in list L to vr



In [8]:
import time

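def gpt_j(prompt, do_sample=True, temperature=0.8, **kwargs):
    """Generate a continuation of `prompt`, then print the timing and the decoded text."""
    # tokenizer and model are the module-level objects loaded in the cells above
    global tokenizer, model
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(torch.device('cuda'))

    # Time generation and decoding together
    start_time = time.time()
    generated_ids = model.generate(input_ids, do_sample=do_sample, temperature=temperature, **kwargs)
    generated_text = tokenizer.decode(generated_ids[0])
    tot_time = time.time() - start_time
    # generated_ids[0] holds the prompt tokens followed by the new tokens,
    # so the count printed below includes the prompt
    print(f'Total time taken: {tot_time} Tokens generated: {len(generated_ids[0])}')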
    print(generated_text)

In [9]:
# Using default values
gpt_j("""Google was founded by""", max_length=200)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Total time taken: 6.257885456085205 Tokens generated: 200
Google was founded by Larry Page and Sergey Brin on the Stanford University campus ten years ago. In the meantime, Google has become the most valuable private company in the world, a technological juggernaut that has the power to rewrite our world. And of course, the world of mobile computing is changing as well.

But if you don't have the resources of Google, or you can't afford the infrastructure, and you can't afford to buy the latest and greatest mobile handsets, what do you do? You can't count on the phone carriers to make a great phone available, because a lot of people would be happy to pay for it.

But the new Android phone is the perfect platform for a different way to be successful. You build a phone in your garage, and you charge your customers a fraction of what it costs to buy the phone. You don't have to wait for the carrier to push a phone to market, because you can push the phone yourself, when you


In [10]:
gpt_j('Google was founded by', top_p=0.9, max_length=500)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Total time taken: 16.2009699344635 Tokens generated: 500
Google was founded by Larry Page and Sergey Brin while they were students at Stanford University in 1998. It has since grown into a $130 billion company with over 40,000 employees worldwide.

Google is a leading technology company that helps people make the most of the internet. Google is known for its search engine, which is the world’s most popular search engine and its Gmail, Google Maps, Google+, and Chrome web browser.

Today, Google has a market capitalization of more than $700 billion, making it the most valuable publicly traded company in the world.

Google is based in Mountain View, California, and has been a public company since 2004. Google was created by Larry Page and Sergey Brin, while they were students at Stanford University.

The company’s headquarters are located in Mountain View, California. In addition to the headquarters, Google has offices in the United States, Australia, China, India, Japan, and several oth

In [11]:
gpt_j('Google was founded by', top_k=100, max_length=100)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Total time taken: 3.0303661823272705 Tokens generated: 100
Google was founded by two Stanford PhDs, Larry and Sergei, who had worked together at the Stanford Artificial Intelligence Laboratory. They hired a third Stanford PhD, a couple of PhDs from Berkeley, and a few other computer scientists. It was a team of very smart people working together and they developed Google Search.

The first Google search page

In November 1998, Sergey, Larry and two other guys from Stanford published a paper describing a new search engine for the World Wide Web.
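
The last two cells switch on `top_p=0.9` and `top_k=100`. As a rough illustration of what those two filters do to the next-token distribution, here is a minimal pure-Python sketch on a four-token toy vocabulary. The helper names (`top_k_filter`, `top_p_filter`, `_renormalise`) are made up for this note and are not part of transformers, which applies the equivalent masking to logits and may differ in edge cases.

```python
def _renormalise(probs, keep):
    """Zero every token outside `keep` and rescale the rest to sum to 1."""
    kept = set(keep)
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]


def top_k_filter(probs, k):
    """Keep only the k most probable tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return _renormalise(probs, order[:k])


def top_p_filter(probs, p):
    """Keep the smallest set of most probable tokens whose mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return _renormalise(probs, keep)


# Toy next-token distribution (illustrative numbers only)
toy_probs = [0.5, 0.3, 0.15, 0.05]
print(top_k_filter(toy_probs, 2))    # only the two most likely tokens survive
print(top_p_filter(toy_probs, 0.9))  # the 0.05 tail token is dropped
```

Sampling then draws from the filtered distribution, which is why both settings cut off the low-probability tail that plain sampling can wander into.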




In [12]:
gpt_j('Google was founded by', repetition_penalty=3.0)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Total time taken: 0.5128457546234131 Tokens generated: 20
Google was founded by two computer science students at Stanford, Larry Page and Sergey Brin. They had
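
For reference, `repetition_penalty` rescales the logits of tokens that already appear in the sequence. The sketch below shows the rule as I understand it (positive logits are divided by the penalty, negative ones multiplied by it) on a plain Python list. `apply_repetition_penalty` is a hypothetical helper written for this note, not a transformers function.

```python
def apply_repetition_penalty(logits, previous_token_ids, penalty):
    """Make every already-seen token less likely; leave unseen tokens alone."""
    out = list(logits)
    for token_id in set(previous_token_ids):
        if out[token_id] > 0:
            out[token_id] = out[token_id] / penalty   # shrink positive logits
        else:
            out[token_id] = out[token_id] * penalty   # push negative logits further down
    return out


# Toy logits for a 4-token vocabulary; tokens 0 and 1 were already generated
toy_logits = [2.0, -1.0, 0.5, 3.0]
print(apply_repetition_penalty(toy_logits, [0, 1, 0], penalty=2.0))
```

With a penalty as high as 3.0, any token that has already been used is pushed down hard, which is worth keeping in mind when reading the longer outputs further down.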


In [13]:
gpt_j('Google was founded by', do_sample=False)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Total time taken: 0.5042965412139893 Tokens generated: 20
Google was founded by two Stanford students, Larry Page and Sergey Brin, in 1998. The company


In [14]:
gpt_j('Google was founded by', do_sample=False, repetition_penalty=1.3, max_new_tokens=100)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Total time taken: 3.109635353088379 Tokens generated: 104
Google was founded by two Stanford students, Larry Page and Sergey Brin. They were both computer science majors at the time of their founding in 1998; they had met while working on a search engine project for one another’s dormitory room (the Google founders have said that this is how it all began).

The company has grown to become an American multinational corporation with over $100 billion annual revenue as well as being ranked among Fortune 500 companies.[1] It employs more than 20,000 people worldwide[


In [15]:
gpt_j('Google was founded by', do_sample=False, repetition_penalty=1.3, max_new_tokens=100, num_beams=5)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Total time taken: 3.397129774093628 Tokens generated: 104
Google was founded by Larry Page and Sergey Brin while they were students at Stanford University in 1998. Since then, the company has become one of the most valuable companies in the world, with a market capitalization of more than $500 billion.

The company’s mission is to “organize the world’s information and make it universally accessible and useful.” To do this, Google has developed a vast array of products and services, including Gmail, Google Maps, Google Docs,


In [16]:
gpt_j('Google was founded by', do_sample=False, repetition_penalty=1.3, max_new_tokens=100, num_beams=5, length_penalty=0.5)


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Total time taken: 3.3840086460113525 Tokens generated: 104
Google was founded by Larry Page and Sergey Brin while they were students at Stanford University in 1998. Since then, the company has become one of the most valuable companies in the world, with a market capitalization of more than $500 billion.

The company’s mission is to “organize the world’s information and make it universally accessible and useful.” To do this, Google has developed a vast array of products and services, including Gmail, Google Maps, Google Docs,
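
This output is identical to the previous cell's, even though `length_penalty=0.5` was added. One plausible explanation, which I have not verified against this run: beam hypotheses are ranked by their summed log-probability divided by `length ** length_penalty`, and when every beam runs to the `max_new_tokens` limit they all have the same length, so the exponent cannot change their order. The toy sketch below uses made-up scores; `beam_score` and `rank` are hypothetical helpers.

```python
def beam_score(sum_logprobs, length, length_penalty=1.0):
    """Length-normalised score used to rank finished beam hypotheses."""
    return sum_logprobs / (length ** length_penalty)


def rank(hypotheses, length_penalty):
    """Return hypothesis names ordered from best to worst."""
    return sorted(
        hypotheses,
        key=lambda name: beam_score(*hypotheses[name], length_penalty),
        reverse=True,
    )


# name -> (summed log-probability, length); the numbers are illustrative only
same_length = {"a": (-40.0, 104), "b": (-42.0, 104)}
mixed_length = {"short": (-10.0, 10), "long": (-18.0, 20)}

print(rank(same_length, 1.0), rank(same_length, 0.5))    # same order both times
print(rank(mixed_length, 1.0), rank(mixed_length, 0.5))  # order flips
```

If that explanation holds, `length_penalty` would only matter once some beams end early on an end-of-sequence token.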


In [17]:
gpt_j('Google was founded by', do_sample=False, repetition_penalty=1.3, max_new_tokens=100, num_beams=5, length_penalty=0.5, no_repeat_ngram_size=3)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Total time taken: 3.41542911529541 Tokens generated: 104
Google was founded by Larry Page and Sergey Brin while they were students at Stanford University in 1998. Since then, the company has become one of the most valuable companies in the world, with a market capitalization of more than $500 billion.

The company’s headquarters are located in Mountain View, California, and it employs more than 20,000 people around the world. Google is known for its search engine, Android operating system, Chrome web browser, Gmail, YouTube, Google Maps, and many other


In [18]:
gpt_j('Google was founded by', do_sample=False, max_new_tokens=200, repetition_penalty=1.3, num_beams=6, num_beam_groups=3, diversity_penalty=1.0)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Total time taken: 7.375677108764648 Tokens generated: 204
Google was founded by two former Stanford University students, Larry Page and Sergey Brin, in 1998. The company is headquartered in Mountain View, California.

The company’s mission is to “organize the world’s information and make it universally accessible and useful.” Google’s motto is “Don’t be evil.”

Google’s products include:

Google Search

Google Maps

Google Chrome

Google+

Google Play

YouTube

Android

Google AdWords

Google AdSense

Google Analytics

Google Books

Google Calendar

Google Checkout

Google Chrome OS

Google Docs

Google Earth

Google Earth Engine

Google Fiber

Google Glass

Google Goggles

Google Hangouts

Google Health

Google Inbox

Google Instant

Google Keep

Google Local

Google Maps Mania




In [44]:
gpt_j('Google was founded by', max_new_tokens=100)
gpt_j('Google was founded by', max_new_tokens=1000)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Total time taken: 3.176703691482544 Tokens generated: 104
Google was founded by Larry Page and Sergey Brin at Stanford University in June 1998, together with the two cofounders Mike Jones and Dave Maloney. The company is based in Mountain View, California, and it employs about 15,800 people worldwide.

Google, which has since grown into a $115 billion business, first focused on the web but has branched into many other areas in the years since it was founded.

Google launched several new consumer services in the 2000s, including Gmail,
Total time taken: 23.347295999526978 Tokens generated: 736
Google was founded by two Stanford students in 2004. In 2014, it acquired Nest, a home automation company, for $3.2 billion. Last year, it acquired Nest's former parent company, Dropcam, for $2.2 billion.

In January, Google announced that it would not be renewing its contract with AT&T, the largest U.S. wireless carrier, to provide its Nexus phones and Google Fiber. The same month, the company an

In [41]:
gpt_j('I am an empathetic intelligent agent called Chad. I can search the internet and I have a store off my own memories that contain facts, previous conversations, and episodic memories.'
' I work for my user, John. I need to analyse statements by John and give it one or more of the following categories to specify how I should respond: \n'
' Show Empathy, Answer the Question, Search the Internet, Search my Memory\n' 
' John: Hey Chad. Who is your favorite tennis player?\n'
' Chad: Answer the Question, Search my Memory\n'
' John: I need a hip replacement.\n'
' Chad: Show Empathy\n'
' John: What will the weather be like today.', max_new_tokens=20, do_sample=False, num_beams=6, num_beam_groups=3, diversity_penalty=1.0)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Total time taken: 0.9219684600830078 Tokens generated: 155
I am an empathetic intelligent agent called Chad. I can search the internet and I have a store off my own memories that contain facts, previous conversations, and episodic memories. I work for my user, John. I need to analyse statements by John and give it one or more of the following categories to specify how I should respond: 
 Show Empathy, Answer the Question, Search the Internet, Search my Memory
 John: Hey Chad. Who is your favorite tennis player?
 Chad: Answer the Question, Search my Memory
 John: I need a hip replacement.
 Chad: Show Empathy
 John: What will the weather be like today.
 Chad: Show Empathy, Answer the Question, Search the Internet, Search my Memory
 John


OK - so that is becoming consistent. Let's encapsulate it in a function and start categorising many inputs.

In [79]:
category_prompt = """I am an empathetic intelligent agent called Chad. I can search the internet and I have a store of my own memories that contain facts, previous conversations, and episodic memories.\n
I work for my user, John. I need to analyse statements by John and give it one of the following categories to specify how I should respond: \n
Show Empathy, Answer the Question, Search the Internet, Search my Memory\n
###
John: Hey Chad. Who is your favorite tennis player?\n
Chad: Search my Memory\n
###
John: I am feeling anxious.\n
Chad: Show Empathy\n
###
John: I lost my job today.\n
Chad: Show Empathy
###
John: What's the capital of France?
Chad: Answer the Question
###
John: Can you find me a good recipe for lasagna?
Chad: Search the Internet
###
John: Do you remember the name of the hotel we stayed at in Paris?
Chad: Search my Memory
###
John: I'm feeling really anxious lately.
Chad: Show Empathy
###
John: """

def categorize_statement(new_statement):
    global tokenizer, model, category_prompt
    prompt = category_prompt + new_statement
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(torch.device('cuda'))
    # generated_ids = model.generate(input_ids, max_new_tokens=20, pad_token_id=model.config.eos_token_id, do_sample=False, num_beams=6, num_beam_groups=3, diversity_penalty=1.0, temperature=0.3)
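    # Greedy decoding (do_sample=False with a single beam) never applies temperature, so that argument is omitted
    generated_ids = model.generate(input_ids, max_new_tokens=20, pad_token_id=model.config.eos_token_id, do_sample=False)
    generated_text = tokenizer.decode(generated_ids[0])
    # Drop the echoed prompt; this assumes the continuation starts with a newline,
    # so answer[0] is empty and answer[1] is the 'Chad: <category>' line
    answer = generated_text[len(prompt):].split('\n')
    return answer[1].replace('Chad:', '')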

def test_category(statement):
    print(f'Test: {statement} : {categorize_statement(statement)}')
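
`categorize_statement` returns whatever text follows `Chad:` on the second line of the continuation, so an unexpected continuation would pass through unchecked or raise an `IndexError`. A stricter parser could look like the sketch below. `ALLOWED_CATEGORIES` and `parse_category` are hypothetical names, not part of the notebook, and the function expects the continuation with the prompt already removed.

```python
ALLOWED_CATEGORIES = (
    "Show Empathy",
    "Answer the Question",
    "Search the Internet",
    "Search my Memory",
)


def parse_category(continuation):
    """Return the category on the first 'Chad:' line, or None if it is not an allowed one."""
    for line in continuation.splitlines():
        line = line.strip()
        if line.startswith("Chad:"):
            label = line[len("Chad:"):].strip()
            return label if label in ALLOWED_CATEGORIES else None
    return None


print(parse_category("\nChad: Show Empathy\n###\nJohn: What time is it?"))
print(parse_category("\nChad: Order Pizza\n"))
```

Returning `None` for anything outside the four labels lets the caller decide how to handle a bad generation instead of acting on free text.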


In [80]:
test_category("I lost my job today.")
test_category("What's the capital of France?")
test_category("Can you find me a good recipe for lasagna?")
test_category("Do you remember the name of the hotel we stayed at in Paris?")
test_category("I'm feeling really anxious lately.")
test_category("How do I fix a leaky faucet?")
test_category("What did I wear to the party last weekend?")
test_category("My cat just passed away.")
test_category("What are the best places to visit in Japan?")
test_category("Do you remember the name of the restaurant we went to on our anniversary?")

Test: I lost my job today. :  Show Empathy
Test: What's the capital of France? :  Answer the Question
Test: Can you find me a good recipe for lasagna? :  Search the Internet
Test: Do you remember the name of the hotel we stayed at in Paris? :  Search my Memory
Test: I'm feeling really anxious lately. :  Show Empathy
Test: How do I fix a leaky faucet? :  Search the Internet
Test: What did I wear to the party last weekend? :  Search my Memory
Test: My cat just passed away. :  Show Empathy
Test: What are the best places to visit in Japan? :  Search my Memory
Test: Do you remember the name of the restaurant we went to on our anniversary? :  Search my Memory


In [74]:
gpt_j("What is a good recipe for lasagne? ", max_new_tokens=100)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Total time taken: 3.220719814300537 Tokens generated: 110
What is a good recipe for lasagne?  I have one recipe I use that is great but I want to make it for my family so I need to add some other ingredients.

A:

I like this recipe:

1 pound ground beef
1 15-ounce can tomato soup
1 15-ounce can tomato sauce
1 cup grated cheddar cheese
1/3 cup chopped onion
1/2 cup chopped green pepper
½ teaspoon salt
1/8 teaspoon pepper
1/3 cup butter



In [81]:
gpt_j("What are the best places to visit in Japan? ", max_new_tokens=100)


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Total time taken: 3.180474281311035 Tokens generated: 111
What are the best places to visit in Japan?  
—David

In some ways, there are as many "best" places in Japan as there are places. If you're looking for a list of the most beautiful places, or the most charming or even the most famous, you will not find the list that you want. However, if you are looking for a list of the most interesting places or the places with the most history, you will find that list, and it is much longer than you might think.

The history


In [60]:
help(model)

Help on GPTJForCausalLM in module transformers.models.gptj.modeling_gptj object:

class GPTJForCausalLM(GPTJPreTrainedModel)
 |  GPTJForCausalLM(config)
 |  
 |  The GPT-J Model transformer with a language modeling head on top.
 |  
 |  This model is a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) sub-class. Use
 |  it as a regular PyTorch Module and refer to the PyTorch documentation for all matter related to general usage and
 |  behavior.
 |  
 |  Parameters:
 |      config ([`GPTJConfig`]): Model configuration class with all the parameters of the model.
 |          Initializing with a config file does not load the weights associated with the model, only the
 |          configuration. Check out the [`~PreTrainedModel.from_pretrained`] method to load the model weights.
 |  
 |  Method resolution order:
 |      GPTJForCausalLM
 |      GPTJPreTrainedModel
 |      transformers.modeling_utils.PreTrainedModel
 |      torch.nn.modules.module.Module
 |  

In [61]:
print(model.config)

GPTJConfig {
  "_name_or_path": "EleutherAI/gpt-j-6B",
  "activation_function": "gelu_new",
  "architectures": [
    "GPTJForCausalLM"
  ],
  "attn_pdrop": 0.0,
  "bos_token_id": 50256,
  "embd_pdrop": 0.0,
  "eos_token_id": 50256,
  "gradient_checkpointing": false,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gptj",
  "n_embd": 4096,
  "n_head": 16,
  "n_inner": null,
  "n_layer": 28,
  "n_positions": 2048,
  "resid_pdrop": 0.0,
  "rotary_dim": 64,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50,
      "temperature": 1.0
    }
  },
  "tie_word_embeddings": false,
  "tokenizer_class": "GPT2Tokenizer",
  "torch_dtype": "float16",
  "transformers_version": "4.19.2",
  "use_cache": true,
  "vocab_size": 50400
}
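
The config values above are enough for a back-of-the-envelope parameter count and float16 memory estimate. The sketch assumes the layer layout of the Hugging Face GPT-J implementation as I understand it: bias-free q/k/v/out attention projections, one LayerNorm per block, an MLP with `n_inner = 4 * n_embd` when `n_inner` is null, and an untied `lm_head` with bias. `gptj_param_count` is a hypothetical helper, and the result is an estimate rather than a number read from the loaded model.

```python
def gptj_param_count(vocab_size, n_embd, n_layer, n_inner=None):
    """Estimate the parameter count of a GPT-J style model from its config values."""
    n_inner = n_inner or 4 * n_embd
    wte = vocab_size * n_embd                        # token embedding matrix
    attn = 4 * n_embd * n_embd                       # q, k, v, out projections (no bias)
    mlp = (n_embd * n_inner + n_inner) + (n_inner * n_embd + n_embd)  # fc_in + fc_out with biases
    layer_norm = 2 * n_embd                          # weight + bias
    block = attn + mlp + layer_norm
    lm_head = n_embd * vocab_size + vocab_size       # untied output projection with bias
    return wte + n_layer * block + layer_norm + lm_head


params = gptj_param_count(vocab_size=50400, n_embd=4096, n_layer=28)
fp16_gib = params * 2 / 2**30                        # 2 bytes per float16 parameter
print(f"{params:,} parameters, about {fp16_gib:.1f} GiB in float16")
```

That works out to roughly 6 billion parameters and a bit over 11 GiB of weights. The 23.70 GiB card in the earlier traceback has room for one such copy but not two, so the out-of-memory error may have come from moving the model to the GPU while an earlier copy was still resident. That is a guess, not something checked here.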

