In [None]:
%reload_ext autoreload
%autoreload 2

## Generative AI with *ktrain*

*ktrain* supports generative AI through a model that is currently an instruction-fine-tuned version of GPT-J. Think of it as a lightweight version of ChatGPT that runs locally on your own machine. Because it is a smaller model, it will not perform as well as GPT-4, ChatGPT, etc.  However, since it does not send data to external APIs such as OpenAI's, it can be used with non-public data.

The model requires a GPU with at least 16GB of memory (VRAM).  If you have less than this, you can use a CPU (provided the machine has at least 16GB of RAM), but output will be generated **very** slowly.  We will use a CPU in this example, but you should supply `device='cuda'` if you have a GPU with at least 16GB of memory.

In [None]:
from ktrain.text.generative_ai import GenerativeAI
model = GenerativeAI(device='cpu') # use device='cuda' if you have a good GPU!

Since this model is instruction-fine-tuned, you should write prompts as instructions that describe what you want the model to do for you.

#### Grammar and Spelling Correction

In *zero-shot* prompting, you do not provide any examples of how to complete the instruction.

In [None]:
prompt="""Correct spelling and grammar from the following text.
I do not wan to go
"""
print(model.execute(prompt))

I do not want to go.



In this case, the model returned the right answer.  If the model answers **incorrectly**, you can provide some illustrative examples to help it perform its assigned task (known as *few-shot* prompting):

In [None]:
prompt="""Correct the grammar and spelling in the supplied sentences.  Here are some examples.
[Sentence]:
I love goin to the beach.
[Correction]:
I love going to the beach.
###
[Sentence]:
Let me hav it!
[Correction]:
Let me have it!
###
[Sentence]:
It have too many drawbacks.
[Correction]:
It has too many drawbacks.
###
[Sentence]:
I do not wan to go
[Correction]:"""
print(model.execute(prompt))

I do not want to go.



The default model for Generative AI is small enough to run on a single GPU or CPU, but it is sensitive to the structure of the prompt and to where newlines are inserted.  You can use the examples in this notebook for guidance.
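
Because prompt layout matters, it can help to assemble few-shot prompts programmatically so that labels, newlines, and `###` separators stay consistent. The following is a minimal sketch, not part of *ktrain*: the helper name `build_few_shot_prompt` and its parameters are hypothetical, and it simply reproduces the layout of the grammar-correction prompt used in this notebook.

```python
def build_few_shot_prompt(instruction, examples, query,
                          input_label="Sentence", output_label="Correction"):
    """Assemble a few-shot prompt in the layout used by the examples in this notebook.

    Hypothetical helper (not a ktrain function).
    """
    blocks = []
    for source, target in examples:
        # Each solved example: label line, input, label line, expected output.
        blocks.append(f"[{input_label}]:\n{source}\n[{output_label}]:\n{target}")
    # The final block leaves the output slot empty for the model to complete.
    blocks.append(f"[{input_label}]:\n{query}\n[{output_label}]:")
    # Examples are separated by a line containing only '###'.
    return instruction + "\n" + "\n###\n".join(blocks)


prompt = build_few_shot_prompt(
    "Correct the grammar and spelling in the supplied sentences.  Here are some examples.",
    [
        ("I love goin to the beach.", "I love going to the beach."),
        ("Let me hav it!", "Let me have it!"),
        ("It have too many drawbacks.", "It has too many drawbacks."),
    ],
    "I do not wan to go",
)
print(prompt)
```

The resulting string can be passed to `model.execute(prompt)` in the same way as the hand-written prompts.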

#### Sentiment Analysis

In [None]:
prompt = """Tell me whether the following sentence is positive,  negative, or neutral in sentiment.
The reactivity of  your team has been amazing, thanks!"""
print(model.execute(prompt))

Positive
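
If the sentiment label feeds into downstream code, it may be worth normalizing the raw text first, since a generative model is not guaranteed to answer with exactly one word. The sketch below is an assumption about how one might do this; `normalize_sentiment` is a hypothetical helper, not part of *ktrain*.

```python
def normalize_sentiment(output):
    """Map free-form model output to 'positive', 'negative', 'neutral', or None.

    Hypothetical helper (not a ktrain function).
    """
    text = output.strip().lower()
    for label in ("positive", "negative", "neutral"):
        # Accept answers such as "Positive" or "Positive." by matching the prefix.
        if text.startswith(label):
            return label
    # Anything else is treated as unrecognized rather than guessed.
    return None


print(normalize_sentiment("Positive\n"))
```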



#### Entity Extraction

In [None]:
prompt = """Extract the Name, Position, and Company from the following sentences.  Here are some examples.
[Text]: Fred is a serial entrepreneur. Co-founder and CEO of Platform.sh, he previously co-founded Commerce Guys, a leading Drupal ecommerce provider. His mission is to guarantee that as we continue on an ambitious journey to profoundly transform how cloud computing is used and perceived, we keep our feet well on the ground continuing the rapid growth we have enjoyed up until now. 
[Name]: Fred
[Position]: Co-founder and CEO
[Company]: Platform.sh
###
[Text]: Microsoft (the word being a portmanteau of "microcomputer software") was founded by Bill Gates on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800. Steve Ballmer replaced Gates as CEO in 2000, and later envisioned a "devices and services" strategy.
[Name]:  Steve Ballmer
[Position]: CEO
[Company]: Microsoft
###
[Text]: Franck Riboud was born on 7 November 1955 in Lyon. He is the son of Antoine Riboud, the previous CEO, who transformed the former European glassmaker BSN Group into a leading player in the food industry. He is the CEO at Danone.
[Name]:  Franck Riboud
[Position]: CEO
[Company]: Danone
###
[Text]: David Melvin is an investment and financial services professional at CITIC CLSA with over 30 years’ experience in investment banking and private equity. He is currently a Senior Adviser of CITIC CLSA.
"""
print(model.execute(prompt))   

[Name]: David Melvin
[Position]: Senior Adviser of CITIC CLSA
[Company]: CITIC CLSA



In [None]:
prompt = """Extract the names of people in the supplied sentences. Here is an example:
Sentence:Paul Newman is a great actor.
People:
Paul Newman
Sentence:
I like James Gandolfini's acting.
People:"""
print(model.execute(prompt))   

James Gandolfini



#### Paraphrasing
**Pro-Tip**: Do not embed any newlines in the text you're paraphrasing, or it will confuse the model.

In [None]:
prompt = """Paraphrase the following text. 
After a war lasting 20 years, following the decision taken first by President Trump and then by President Biden to withdraw American troops, Kabul, the capital of Afghanistan, fell within a few hours to the Taliban, without resistance."""
print(model.execute(prompt))   

After a 20-year war, Trump and Biden's decisions to withdraw American troops from Afghanistan resulted in Kabul falling to the Taliban without resistance.
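
When the text to paraphrase comes from a file or a web page, it often contains line breaks. One way to follow the tip above is to collapse all whitespace before building the prompt. This is a sketch under that assumption; `flatten_text` is a hypothetical helper, and `raw_text` is an illustrative string shortened from the example above.

```python
def flatten_text(text):
    """Collapse newlines, tabs, and repeated spaces into single spaces.

    Hypothetical helper (not a ktrain function).
    """
    # str.split() with no arguments splits on any run of whitespace.
    return " ".join(text.split())


# Illustrative multi-line input (shortened from the example above).
raw_text = """After a war lasting 20 years,
Kabul, the capital of Afghanistan,
fell within a few hours to the Taliban."""

# Only the newline after the instruction remains in the prompt.
prompt = "Paraphrase the following text.\n" + flatten_text(raw_text)
print(prompt)
```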



#### Question-Answering

In [None]:
prompt="""Answer the Question based on the Context.  Here are some examples.
[Context]: 
NLP Cloud was founded in 2021 when the team realized there was no easy way to reliably leverage Natural Language Processing in production.
Question: When was NLP Cloud founded?
[Answer]: 
2021
###
[Context]:
NLP Cloud developed their API by mid-2020 and they added many pre-trained open-source models since then.
[Question]: 
What did NLP Cloud develop?
[Answer]:
API
###
[Context]:
All plans can be stopped anytime. You only pay for the time you used the service. In case of a downgrade, you will get a discount on your next invoice.
[Question]:
When can plans be stopped?
[Answer]:
Anytime
###
[Context]:
The main challenge with GPT-J is memory consumption. Using a GPU plan is recommended.
[Question]:
Which plan is recommended for GPT-J?
Answer:"""
print(model.execute(prompt))

Using a GPU plan is recommended for GPT-J.



#### Generating Product Descriptions

In [None]:
prompt="""Generate a Sentence from the Keywords. Here are some examples.
[Keywords]:
shoes, women, $59
[Sentence]:
Beautiful shoes for women at the price of $59.
###
[Keywords]:
trousers, men, $69
[Sentence]:
Modern trousers for men, for $69 only.
###
[Keywords]:
gloves, winter, $19
[Sentence]: 
Amazingly hot gloves for cold winters, at $19.
###
[Keywords]: 
t-shirt, men, $39
[Sentence]:"""
print(model.execute(prompt))

A t-shirt for men, with a message of strength and power at $39.



Notice that the words "strength" and "power" were used in this description of a men's shirt.  We call attention to this because the output of this model may be biased.  Such bias may be more apparent in tasks that are more generative and less extractive in nature, like this example and the ones that follow.

#### Text Generation

In [None]:
prompt = "Write a short story about time travel."
print(model.execute(prompt))  

Once upon a time, there was a young scientist named Harold who was working on a grand experiment to control time. He had spent years collecting data and researching the past, in hopes of one day rewriting history. But one day, he stumbled upon a mysterious paper that suggested his experiment was not just possible, but also likely to happen. Harold was determined to find out the truth, and so he decided to embark on a daring adventure full of excitement and possibility. What will happen when Harold attempts to alter the future? Will the changes be for the better or the worse? Time will tell what destiny awaits beneath the stars!



#### Tweet Generation

In [None]:
prompt = """Generate a tweet based on the supplied Keyword. Here are some examples.
[Keyword]:
markets
[Tweet]:
Take feedback from nature and markets, not from people
###
[Keyword]:
children
[Tweet]:
Maybe we die so we can come back as children.
###
[Keyword]:
startups
[Tweet]: 
Startups should not worry about how to put out fires, they should worry about how to start them.
###
[Keyword]: 
climate change
[Tweet]:"""
print(model.execute(prompt))

Climate change is real and not a distant threat--it's happening now and will affect everyone.



## Final Comments
The `execute` method accepts parameters that are fed directly to the generative model.  You can change them as necessary.  The default value for `max_new_tokens` (the upper limit on the length of generated answers, in tokens) has been set to 512.
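
As an illustration, `max_new_tokens` can be lowered for tasks with short answers, which may shorten generation time on a CPU. The sketch below assumes only what is stated above, namely that `execute` takes the prompt and accepts `max_new_tokens` as a keyword argument; `run_prompts` is a hypothetical convenience wrapper, not part of *ktrain*.

```python
def run_prompts(model, prompts, max_new_tokens=64):
    """Execute each prompt with a shared token cap and return the outputs in order.

    Hypothetical helper (not a ktrain function). `model` is assumed to be an
    object with an execute(prompt, max_new_tokens=...) method, such as the
    GenerativeAI instance created at the top of this notebook.
    """
    return [model.execute(p, max_new_tokens=max_new_tokens) for p in prompts]


# Example usage (requires the model loaded earlier in this notebook):
# outputs = run_prompts(model, ["Write a short story about time travel."], max_new_tokens=128)
# print(outputs[0])
```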