<a href="https://colab.research.google.com/github/bitanb1999/TalentSumoAI/blob/main/GPT_AI_Quote_Gen.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install transformers



In [2]:
# for reproducibility
SEED = 34

# maximum number of tokens in the output text
MAX_LEN = 20

### I. Intro
A language model is a machine learning model that looks at part of a sentence and predicts the next word or sequence of words. Much like the autocomplete feature on an iPhone or Android keyboard, GPT-2 performs next-word prediction, but on a much larger and more sophisticated scale. For reference, the smallest GPT-2 has about 124 million parameters (originally reported as 117 million), and the largest has about 1.5 billion. OpenAI initially withheld the largest model and released only the smaller ones, but all four sizes have been publicly available since November 2019.
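
To make "predict the next word" concrete, here is a minimal sketch of a bigram model in plain Python. It is not how GPT-2 works internally (GPT-2 is a Transformer trained on subword tokens); it only illustrates the idea of choosing the most likely continuation. The three-sentence corpus and the helper names `train_bigram` and `predict_next` are made up for this illustration.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for every word, which words follow it in the corpus."""
    followers = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            followers[current][nxt] += 1
    return followers

def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical toy corpus, for illustration only.
toy_corpus = [
    "all our dreams can come true",
    "dreams can change the world",
    "our dreams can inspire others",
]
bigram_model = train_bigram(toy_corpus)
print(predict_next(bigram_model, "dreams"))  # prints: can
```

A real language model replaces the count table with a neural network that assigns a probability to every token in its vocabulary, conditioned on the whole preceding context rather than on a single word.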

🤗 Transformers makes it very easy to import this model with either PyTorch or TensorFlow. This notebook uses PyTorch, but the TensorFlow version is just as easy. Both the model and its tokenizer can be imported from the transformers library, which anyone can install with !pip install transformers. Let's see just how simple it is to generate text with a neural network. We begin with our input sequence:

In [3]:
input_sequence = "All our dreams can come true"

In [4]:
# get transformers (AutoModelForCausalLM replaces the deprecated AutoModelWithLMHead)
from transformers import AutoModelForCausalLM, AutoTokenizer

# get the tokenizer and model of a GPT-2 based quote-generator checkpoint
tokenizer = AutoTokenizer.from_pretrained("nandinib1999/quote-generator")
GPT2 = AutoModelForCausalLM.from_pretrained("nandinib1999/quote-generator", pad_token_id=tokenizer.eos_token_id)
GPT2

GPT2LMHeadModel(
  (transformer): GPT2Model(
    (wte): Embedding(50260, 768)
    (wpe): Embedding(1024, 768)
    (drop): Dropout(p=0.1, inplace=False)
    (h): ModuleList(
      (0): GPT2Block(
        (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (attn): GPT2Attention(
          (c_attn): Conv1D()
          (c_proj): Conv1D()
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (mlp): GPT2MLP(
          (c_fc): Conv1D()
          (c_proj): Conv1D()
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
      (1): GPT2Block(
        (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (attn): GPT2Attention(
          (c_attn): Conv1D()
          (c_proj): Conv1D()
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dro

In [5]:
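import os
import requests

API_URL = "https://api-inference.huggingface.co/models/nandinib1999/quote-generator"
# read the access token from the environment instead of hardcoding it in the notebook
# (HF_API_TOKEN is an assumed variable name; set it before running this cell)
headers = {"Authorization": f"Bearer {os.environ['HF_API_TOKEN']}"}

def query(payload):
    """POST the payload to the hosted model and return the parsed JSON response."""
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": "All our dreams can come true",
})
output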

[{'generated_text': 'All our dreams can come true. We must start with the conviction that dreams are real.'}]

In [6]:
from transformers import pipeline
generator = pipeline('text-generation',
                     model='huggingtweets/greatestquotes')
generator("Choose a job", num_return_sequences=30)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'Choose a job, get a good job, find a way. - Oprah Winfrey'},
 {'generated_text': 'Choose a job that pays well and keeps you interesting. - Andy Warhol'},
 {'generated_text': 'Choose a job that is rewarding, fun and rewarding. - Dale Carnegie'},
 {'generated_text': 'Choose a job and do the best for it. - Brian Tracy'},
 {'generated_text': 'Choose a job that matters and the rest of your life is spent helping people do the same." - Charles Buechner'},
 {'generated_text': 'Choose a job, study a book, work on something. - Henry Ford'},
 {'generated_text': 'Choose a job with lots of effort at first. Then start." - Mark Victor Hansen'},
 {'generated_text': "Choose a job that isn't filled by stress and then quit. — John Madden"},
 {'generated_text': 'Choose a job within you that interests you, not within yourself." - Dalai Lama'},
 {'generated_text': 'Choose a job you love and you will be happy. - James Allen'},
 {'generated_text': 'Choose a job well done. - Steve Jobs'},


In [9]:
from transformers import GPT2LMHeadModel, GPT2Tokenizer  # main model and tokenizer (PyTorch)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2-large")
model = GPT2LMHeadModel.from_pretrained("gpt2-large", pad_token_id=tokenizer.eos_token_id)

sentence = 'Career motivation'  # input sentence
input_ids = tokenizer.encode(sentence, return_tensors='pt')  # token ids as a PyTorch tensor

# beam search, limited to 15 tokens including the prompt
output = model.generate(input_ids, max_length=15, num_beams=5, no_repeat_ngram_size=2, early_stopping=True)
print(tokenizer.decode(output[0], skip_special_tokens=True))  # print the result

# raise the limit to 20 tokens for a longer output; it is decoded in the next cell
output = model.generate(input_ids, max_length=20, num_beams=5, no_repeat_ngram_size=2, early_stopping=True)


Career motivation is one of the most important things you can do for your


In [8]:
# decode once, print the text, and save it to a file
text = tokenizer.decode(output[0], skip_special_tokens=True)
print(text)
with open('AIBLOG.txt', 'w') as f:
    f.write(text)

Career motivation is one of the most important things you can do to improve your game.




In [22]:
beam_outputs = model.generate(
    input_ids,
    max_length = 20,
    num_beams = 20,
    temperature = .7,  # only takes effect when do_sample=True; plain beam search ignores it
    no_repeat_ngram_size = 2,
    num_return_sequences = 20,
    early_stopping = True
)

print('')
print("Output:\n" + 100 * '-')

# now we have 20 output sequences, one per returned beam
for i, beam_output in enumerate(beam_outputs):
    print("{}: {}".format(i, tokenizer.decode(beam_output, skip_special_tokens=True)))


Output:
----------------------------------------------------------------------------------------------------
0: Career motivation is one of the most important things you can do to improve your game.


1: Career motivation is one of the most important things you can do for yourself and for your career.
2: Career motivation is one of the best things you can do for yourself and your career.


3: Career motivation is one of the most important things you can do to improve your performance.


4: Career motivation is one of the most important things you can do for yourself and your career.

5: Career motivation is one of the most important things you can do to improve your life.


6: Career motivation is one of the most important things you can do for yourself and for your team.
7: Career motivation is one of the most important things you can do for yourself and your career. It
8: Career motivation is one of the most important things you can do for yourself. If you want to
9: Career motivati
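
The beams above differ only slightly because beam search keeps the `num_beams` highest-scoring partial sequences at every step. The sketch below shows that idea on a tiny hand-written probability table. It is a simplified illustration, not the transformers implementation (there is no n-gram blocking or length penalty), and `TOY_LM` and `beam_search` are hypothetical names invented for this example.

```python
import math

# Hypothetical next-token probabilities, keyed by the previous token.
TOY_LM = {
    "<s>": {"career": 0.6, "work": 0.4},
    "career": {"motivation": 0.55, "goals": 0.45},
    "work": {"hard": 0.9, "smart": 0.1},
    "motivation": {"</s>": 1.0},
    "goals": {"</s>": 1.0},
    "hard": {"</s>": 1.0},
    "smart": {"</s>": 1.0},
}

def beam_search(lm, num_beams, max_length):
    """Return up to num_beams (tokens, log_prob) pairs, best first."""
    beams = [(["<s>"], 0.0)]
    for _ in range(max_length):
        candidates = []
        for tokens, score in beams:
            if tokens[-1] == "</s>":
                # finished sequences are carried over unchanged
                candidates.append((tokens, score))
                continue
            for token, prob in lm[tokens[-1]].items():
                candidates.append((tokens + [token], score + math.log(prob)))
        # keep only the num_beams candidates with the highest cumulative log-probability
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:num_beams]
    return beams

greedy = beam_search(TOY_LM, num_beams=1, max_length=3)
wider = beam_search(TOY_LM, num_beams=2, max_length=3)
for name, result in (("num_beams=1", greedy), ("num_beams=2", wider)):
    tokens, score = result[0]
    print(name, " ".join(tokens[1:-1]), round(math.exp(score), 2))
```

With a single beam the search commits to "career" because it is the most likely first word, and ends with a sequence of probability 0.33. With two beams it also keeps "work" alive and finds "work hard", whose overall probability of 0.36 is higher.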

In [17]:
# combine top-k and top-p (nucleus) sampling
sample_outputs = model.generate(
    input_ids,
    do_sample = True,
    max_length = 30,  # longer limit, to test how long the output can get while staying coherent
    temperature = .9,
    top_k = 50,
    top_p = 0.85,
    num_return_sequences = 5
)

print("Output:\n" + 100 * '-')
for i, sample_output in enumerate(sample_outputs):
    print("{}: {}...".format(i, tokenizer.decode(sample_output, skip_special_tokens = True)))
    print('')

Output:
----------------------------------------------------------------------------------------------------
0: Career motivation for young people.

In my work with young people, I have learned that young people are always on the lookout for opportunities....

1: Career motivation was an essential part of his life.

"He has never been one to rest on his laurels, he never really looked...

2: Career motivation to be competitive is not about trying to get to the top of the mountain; it's about getting to the top of the mountain and...

3: Career motivation for men is to achieve financial security and power over others. This means having money, power, and control over others.

This...

4: Career motivation was the biggest variable among the top ten factors, with an average of 7.5 points for the top ten factors, compared with 6...
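
To see what `top_k` and `top_p` do to the candidate words before one is drawn, here is a minimal sketch on a made-up distribution. It is a simplification under stated assumptions, not the library's implementation: it works on probabilities instead of logits and skips temperature scaling. The function `top_k_top_p_filter` and the numbers in `toy_probs` are hypothetical.

```python
import random

def top_k_top_p_filter(probs, top_k, top_p):
    """Keep the top_k most likely tokens, then the smallest prefix of them
    whose cumulative probability reaches top_p, and renormalise."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:top_k]
    top_k_total = sum(prob for _, prob in ranked)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        share = prob / top_k_total  # probability after the top-k cut
        kept.append((token, share))
        cumulative += share
        if cumulative >= top_p:
            break
    kept_total = sum(share for _, share in kept)
    return {token: share / kept_total for token, share in kept}

# Hypothetical next-word distribution after the prompt "Career motivation".
toy_probs = {"is": 0.50, "for": 0.25, "was": 0.15, "to": 0.07, "and": 0.03}
filtered = top_k_top_p_filter(toy_probs, top_k=4, top_p=0.85)

rng = random.Random(34)  # same value as SEED at the top of the notebook
next_word = rng.choices(list(filtered), weights=list(filtered.values()), k=1)[0]
print(sorted(filtered), next_word)
```

Here `top_k=4` drops "and", and `top_p=0.85` then drops "to", so the next word is drawn only from "is", "for" and "was". Cutting off the unlikely tail this way is what keeps sampled text varied without drifting into incoherence.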

