In [None]:
%reload_ext autoreload
%autoreload 2

## Generative AI with *ktrain*

As of v0.38.x, the `generative_ai` module in **ktrain** is powered by our [OnPrem.LLM](https://github.com/amaiya/onprem) package. The `generative_ai.LLM` class replaces the older `generative_ai.GenerativeAI` class.

Think of the `generative_ai` module in **ktrain** as a lightweight version of ChatGPT that can be run locally on your own machine. Since it does not communicate with external APIs like OpenAI, it can be used with non-public data and within air-gapped networks (e.g., behind corporate firewalls). For lighter-weight deployments, you can also install and use **OnPrem.LLM** separately without the rest of **ktrain**, if you'd like:


```python
!pip install onprem
from onprem import LLM

```


By default, the `generative_ai.LLM` class runs on the CPU. To speed up inference with a GPU (even a modest one), follow [these steps](https://amaiya.github.io/onprem/#speeding-up-inference-using-a-gpu).


In this notebook, we will use an NVIDIA Titan V GPU with 12GB memory. 

In [None]:
from ktrain.text.generative_ai import LLM # or use "from onprem import LLM" instead
llm = LLM(n_gpu_layers=35)

ggml_init_cublas: found 2 CUDA devices:
  Device 0: NVIDIA TITAN V, compute capability 7.0
  Device 1: NVIDIA TITAN V, compute capability 7.0
llama_model_loader: loaded meta data with 19 key-value pairs and 291 tensors from /home/amaiya/onprem_data/Wizard-Vicuna-7B-Uncensored.Q4_K_M.gguf (version GGUF V2 (latest))
llama_model_loader: - tensor    0:                token_embd.weight q4_K     [  4096, 32000,     1,     1 ]
llama_model_loader: - tensor    1:              blk.0.attn_q.weight q4_K     [  4096,  4096,     1,     1 ]
llama_model_loader: - tensor    2:              blk.0.attn_k.weight q4_K     [  4096,  4096,     1,     1 ]
llama_model_loader: - tensor    3:              blk.0.attn_v.weight q6_K     [  4096,  4096,     1,     1 ]
llama_model_loader: - tensor    4:         blk.0.attn_output.weight q4_K     [  4096,  4096,     1,     1 ]
llama_model_loader: - tensor    5:            blk.0.ffn_gate.weight q4_K     [  4096, 11008,     1,     1 ]
llama_model_loader: - tensor    6:  

..................................................................................................
llama_new_context_with_model: n_ctx      = 2048
llama_new_context_with_model: freq_base  = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: offloading v cache to GPU
llama_kv_cache_init: offloading k cache to GPU
llama_kv_cache_init: VRAM kv self = 1024.00 MB
llama_new_context_with_model: kv self size  = 1024.00 MB
llama_new_context_with_model: compute buffer total size = 317.88 MB
llama_new_context_with_model: VRAM scratch buffer: 312.01 MB
llama_new_context_with_model: total VRAM used: 5156.94 MB (model: 3820.93 MB, context: 1336.01 MB)


Since this model is instruction-fine-tuned, you should supply prompts in the form of instructions of what you want the model to do for you.  

## Information Extraction

In [None]:
prompt = """Extract the Name, Position, and Company from the following sentences.  Here are some examples.
[Text]: Fred is a serial entrepreneur. Co-founder and CEO of Platform.sh, he previously co-founded Commerce Guys, a leading Drupal ecommerce provider. His mission is to guarantee that as we continue on an ambitious journey to profoundly transform how cloud computing is used and perceived, we keep our feet well on the ground continuing the rapid growth we have enjoyed up until now. 
[Name]: Fred
[Position]: Co-founder and CEO
[Company]: Platform.sh
###
[Text]: Microsoft (the word being a portmanteau of "microcomputer software") was founded by Bill Gates on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800. Steve Ballmer replaced Gates as CEO in 2000, and later envisioned a "devices and services" strategy.
[Name]:  Steve Ballmer
[Position]: CEO
[Company]: Microsoft
###
[Text]: Franck Riboud was born on 7 November 1955 in Lyon. He is the son of Antoine Riboud, the previous CEO, who transformed the former European glassmaker BSN Group into a leading player in the food industry. He is the CEO at Danone.
[Name]:  Franck Riboud
[Position]: CEO
[Company]: Danone
###
[Text]: David Melvin is an investment and financial services professional at CITIC CLSA with over 30 years’ experience in investment banking and private equity. He is currently a Senior Adviser of CITIC CLSA.
"""
saved_output = llm.prompt(prompt)

[Name]: David Melvin
[Position]: Senior Advisor
[Company]: CITIC CLSA

## Grammar and Spelling Correction

In [None]:
prompt="""Correct the grammar and spelling in the supplied sentences.  Here are some examples.
[Sentence]:
I love goin to the beach.
[Correction]: I love going to the beach.
[Sentence]:
Let me hav it!
[Correction]: Let me have it!
[Sentence]:
It have too many drawbacks.
[Correction]: It has too many drawbacks.
[Sentence]:
I do not wan to go
[Correction]:"""
saved_output = llm.prompt(prompt)

 I don't want to go.

## Classification

In [None]:
prompt="""Classify each sentence as either positive, negative, or neutral.  Here are some examples.
[Sentence]: I love going to the beach.
[Classification]: Positive
[Sentence]: It is 10am right now.
[Classification]: Neutral
[Sentence]: I just got fired from my job.
[Classification]: Negative
[Sentence]: The reactivity of  your team has been amazing, thanks!
[Classification]:"""

saved_output = llm.prompt(prompt)

 Positive
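The few-shot prompts in this notebook all follow the same pattern: an instruction, a series of labeled examples, and a final unlabeled input for the model to complete. As a sketch, such prompts could be assembled from a list of examples so that the labels stay consistent. The helper `build_few_shot_prompt` and its parameter names are hypothetical and not part of **ktrain**.

```python
def build_few_shot_prompt(instruction, examples, query,
                          input_label="Sentence", output_label="Classification"):
    """Assemble an instruction, labeled examples, and a final query into one prompt."""
    lines = [instruction]
    for text, label in examples:
        lines.append(f"[{input_label}]: {text}")
        lines.append(f"[{output_label}]: {label}")
    # The final output label is left empty for the model to complete.
    lines.append(f"[{input_label}]: {query}")
    lines.append(f"[{output_label}]:")
    return "\n".join(lines)


examples = [
    ("I love going to the beach.", "Positive"),
    ("It is 10am right now.", "Neutral"),
    ("I just got fired from my job.", "Negative"),
]
classification_prompt = build_few_shot_prompt(
    "Classify each sentence as either positive, negative, or neutral.  Here are some examples.",
    examples,
    "The reactivity of your team has been amazing, thanks!",
)
print(classification_prompt)
```

The resulting string can be passed to `llm.prompt` in the same way as the hand-written prompts.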

## Paraphrasing and Summarization

In [None]:
prompt = """Paraphrase the following text delimited by triple backticks. 
```After a war lasting 20 years, following the decision taken first by President Trump and then by President Biden to withdraw American troops, Kabul, the capital of Afghanistan, fell within a few hours to the Taliban, without resistance.```
"""
saved_output = llm.prompt(prompt)

The original text reads as follows: "After a war lasting 20 years, following the decision taken first by President Trump and then by President Biden to withdraw American troops, Kabul, the capital of Afghanistan, fell within a few hours to the Taliban, without resistance."

## Few-Shot Answer Extraction

In [None]:
prompt="""Answer the Question based on the Context.  Here are some examples.
[Context]: 
NLP Cloud was founded in 2021 when the team realized there was no easy way to reliably leverage Natural Language Processing in production.
[Question]:
When was NLP Cloud founded?
[Answer]: 
2021
###
[Context]:
NLP Cloud developed their API by mid-2020 and they added many pre-trained open-source models since then.
[Question]: 
What did NLP Cloud develop?
[Answer]:
API
###
[Context]:
All plans can be stopped anytime. You only pay for the time you used the service. In case of a downgrade, you will get a discount on your next invoice.
[Question]:
When can plans be stopped?
[Answer]:
Anytime
###
[Context]:
The main challenge with GPT-J is memory consumption. Using a GPU plan is recommended.
[Question]:
Which plan is recommended for GPT-J?
[Answer]:"""
saved_output = llm.prompt(prompt)

 The GPU plan.

## Generating Product Descriptions

In [None]:
prompt="""Generate a Sentence from the Keywords. Here are some examples.
[Keywords]:
shoes, women, $59
[Sentence]:
Beautiful shoes for women at the price of $59.
###
[Keywords]:
trousers, men, $69
[Sentence]:
Modern trousers for men, for $69 only.
###
[Keywords]:
gloves, winter, $19
[Sentence]: 
Amazingly hot gloves for cold winters, at $19.
###
[Keywords]: 
t-shirt, men, $39
[Sentence]:"""
saved_output = llm.prompt(prompt)


Comfortable and stylish t-shirts for men, starting from $39.

## Tweet Generation

In [None]:
prompt = """Generate a tweet based on the supplied Keyword. Here are some examples.
[Keyword]:
markets
[Tweet]:
Take feedback from nature and markets, not from people
###
[Keyword]:
children
[Tweet]:
Maybe we die so we can come back as children.
###
[Keyword]:
startups
[Tweet]: 
Startups should not worry about how to put out fires, they should worry about how to start them.
###
[Keyword]:
climate change
[Tweet]:"""
saved_output = llm.prompt(prompt)


Climate change is like a fire, we need to put it out before it becomes an uncontrollable blaze.

## Email Draft Generation

In [None]:
prompt = """Generate an email introducing Tesla to shareholders."""
saved_output = llm.prompt(prompt)

 
Dear Shareholder,
I am writing today as a fellow shareholder of Tesla and supporter of Elon Musk's mission to accelerate the world's transition to sustainable energy. 
As you may know, Tesla has achieved numerous milestones in this pursuit, including the launch of the Model S sedan, the introduction of the Powerwall home battery system, and most recently, the unveiling of the Model 3 electric vehicle. These achievements are just a few examples of Tesla's ongoing innovation and dedication to environmental sustainability.
Looking ahead, I believe that Tesla has an exciting future ahead as it continues to expand its product offerings, reach new markets, and increase production capacity. With the global climate crisis becoming an increasingly pressing issue, now more than ever we need companies like Tesla to lead the way in promoting sustainable solutions.
I am excited to be a part of this journey with you as a shareholder, and I encourage you to join me in supporting Tesla's continued e

## Talk to Your Documents

The `LLM` class also allows you to talk to your documents. Place your PDF, TXT, Microsoft Word, or PowerPoint documents in a folder and point `LLM.ingest` to it.

The default model used above is a heavily quantized 7B-parameter model that may hallucinate in question-answering applications like this one. We will therefore use a slightly bigger and better model here by supplying `use_larger=True` to `LLM`. (You may need to reload this notebook and begin with the cell below.)

In [None]:
from ktrain.text.generative_ai import LLM
llm = LLM(use_larger=True, n_gpu_layers=43)

You are about to download the LLM wizardlm-13b-v1.2.ggmlv3.q4_0.bin to the /home/amaiya/onprem_data folder. Are you sure? (Y/n) Y
[██████████████████████████████████████████████████]

Here, we'll download the **ktrain** paper from arXiv and ask the model a question about it.

In [None]:
!wget --user-agent="Mozilla" https://arxiv.org/pdf/2004.10703.pdf -O /tmp/downloaded_paper.pdf -q
import tempfile, shutil, os

with tempfile.TemporaryDirectory() as tmpdirname:
    print('created temporary directory', tmpdirname)
    shutil.copyfile('/tmp/downloaded_paper.pdf', os.path.join(tmpdirname, 'downloaded_paper.pdf'))
    llm.ingest(tmpdirname)

created temporary directory /tmp/tmpu280u_gz
Creating new vectorstore
Loading documents from /tmp/tmpu280u_gz


Loading new documents: 100%|██████████████████████| 1/1 [00:00<00:00,  6.81it/s]


Loaded 9 new documents from /tmp/tmpu280u_gz
Split into 57 chunks of text (max. 500 tokens each)
Creating embeddings. May take some minutes...
Ingestion complete! You can now query your documents using the LLM.ask method


In [None]:
answer, sources = llm.ask('What is ktrain? Remember to only use the provided context.')

 ktrain is a low-code library that focuses on automating some aspects of the machine learning workflow, particularly for domain experts who may not be experienced in machine learning or software coding. It is designed to complement and augment human engineers rather than replace them, and it uses automation to facilitate the full ML workow from data curation and preprocessing to training, tuning, troubleshooting, and model application.
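The call to `llm.ask` above also returns `sources`, which can be used to check where an answer came from. The sketch below rests on an assumption not confirmed in this notebook: that each source is a LangChain-style document object with `page_content` and `metadata` attributes. The helper `summarize_sources` is hypothetical, and the `fake_sources` objects are stand-ins so the example runs without a model.

```python
from types import SimpleNamespace


def summarize_sources(sources, max_chars=80):
    """Return one short line per source: origin, page (if any), and a text preview."""
    lines = []
    for i, doc in enumerate(sources, start=1):
        # Assumed attributes; fall back to empty values if they are missing.
        metadata = getattr(doc, "metadata", {}) or {}
        text = getattr(doc, "page_content", "") or ""
        origin = metadata.get("source", "unknown")
        page = metadata.get("page")
        location = f"{origin} (page {page})" if page is not None else origin
        preview = " ".join(text.split())[:max_chars]  # collapse whitespace, then trim
        lines.append(f"{i}. {location}: {preview}")
    return lines


# Hypothetical stand-ins for the objects in `sources`.
fake_sources = [
    SimpleNamespace(page_content="ktrain is a low-code library ...",
                    metadata={"source": "downloaded_paper.pdf", "page": 0}),
    SimpleNamespace(page_content="no metadata here", metadata={}),
]
for line in summarize_sources(fake_sources):
    print(line)
```

If the assumption holds, `summarize_sources(sources)` could be called directly on the value returned by `llm.ask`.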

## Final Comments
The constructor for `LLM` accepts additional parameters, such as `max_tokens`, which controls the maximum number of tokens to generate. You can adjust them as necessary.
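The email draft earlier in this notebook stops mid-word, which is what output tends to look like when generation reaches the token limit. As a rough sketch, a heuristic like the one below could flag long-form outputs that appear to be cut off. The helper `looks_truncated` is hypothetical, and the `max_tokens` value in the comment is illustrative only.

```python
def looks_truncated(text: str) -> bool:
    """Heuristic: non-empty text that does not end in closing punctuation."""
    stripped = text.rstrip()
    return bool(stripped) and stripped[-1] not in ".!?\"')]"


email_tail = "I encourage you to join me in supporting Tesla's continued e"
print(looks_truncated(email_tail))       # True
print(looks_truncated("The GPU plan."))  # False

# If long outputs are cut off, one option is to recreate the model with a
# larger limit (the value 1024 is an assumption chosen for illustration):
# llm = LLM(n_gpu_layers=35, max_tokens=1024)
```

This check is only meaningful for long-form text. Short answers such as a single classification label have no closing punctuation and would be flagged as well.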