<a href="https://colab.research.google.com/github/kkrueger/Redis-Workshops/blob/main/04-Large_Language_Model/04-Large_Language_Model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Large Language Models

![Redis](https://redis.com/wp-content/themes/wpx/assets/images/logo-redis.svg?auto=webp&quality=85,75&width=120)

In this notebook you'll use two LLMs: OpenAI ChatGPT (`gpt-3.5-turbo`) and a self-hosted, in-notebook model, `databricks/dolly-v2`.

In [1]:
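# openai is pinned below 1.0 because this notebook uses openai.ChatCompletion, which was removed in 1.0
%pip -q install "openai<1.0" accelerate transformers sentence-transformers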



Initialize OpenAI. You need to supply the OpenAI API key (starts with `sk-...`) when prompted. You can find your API key at https://platform.openai.com/account/api-keys

In [2]:
import openai
import os
import getpass

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY","")
if OPENAI_API_KEY == "":
    key=getpass.getpass(prompt='OpenAI Key: ', stream=None)
    os.environ['OPENAI_API_KEY']=key
    
openai.api_key = os.getenv("OPENAI_API_KEY")

OpenAI Key: ··········


Initialize `databricks/dolly-v2-3b` via [HuggingFace](https://huggingface.co/databricks/dolly-v2-3b). Progressively more powerful models are available in 3b, 7b, and 12b sizes (the number refers to billions of parameters). `dolly-v2-3b` is the only model in the family that fits in the memory and GPU available in a free Google Colab instance.

Loading and initializing the model can take a few minutes.
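As a rough check on the memory claim, the weight footprint of a model can be estimated as parameter count times bytes per parameter. The sketch below assumes nominal counts of 3, 7, and 12 billion parameters (the actual checkpoints differ slightly) and 2 bytes per parameter for `bfloat16`. The helper name `weight_size_gb` is made up for this illustration, and the estimate ignores activations and other runtime overhead, so treat the results as lower bounds.

```python
# Rough estimate of the memory needed just to hold the model weights.
# Assumptions: nominal parameter counts; bfloat16 stores 2 bytes per parameter.
BYTES_PER_PARAM_BF16 = 2

NOMINAL_PARAMS = {
    "dolly-v2-3b": 3e9,
    "dolly-v2-7b": 7e9,
    "dolly-v2-12b": 12e9,
}

def weight_size_gb(num_params, bytes_per_param=BYTES_PER_PARAM_BF16):
    """Return the approximate weight size in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

for name, params in NOMINAL_PARAMS.items():
    print(f"{name}: ~{weight_size_gb(params):.0f} GB of weights")
```

Under these assumptions the 3b model needs about 6 GB for its weights, while the 7b and 12b models need roughly 14 GB and 24 GB.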

In [3]:
import torch
from transformers import pipeline

dolly_completion = pipeline(model="databricks/dolly-v2-3b", 
                         torch_dtype=torch.bfloat16, 
                         trust_remote_code=True, 
                         device_map="auto")


A new version of the following files was downloaded from https://huggingface.co/databricks/dolly-v2-3b:
- instruct_pipeline.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.



Helper function for the OpenAI ChatGPT model:

In [4]:
def openai_completion(prompt, model="gpt-3.5-turbo"):
    # Send the prompt as a single user message and return the raw API response.
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,  # degree of randomness of the output; 0 is the least random
    )
    return response

# Create the prompt

The prompt contains instructions, context, and the question. Feel free to experiment with the prompt and compare the responses from the two models.

News article used in this example: https://www.cnn.com/2023/05/18/media/disney-florida-desantis/index.html
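To make experimenting easier, the three parts of the prompt can be assembled by a small helper. This is only a sketch: `build_prompt` is a hypothetical name that is not part of the notebook, and the layout simply mirrors the Instruction / Context / Question / Response structure used in the next cell.

```python
def build_prompt(context, question,
                 instruction=("Use only information in the following context "
                              "to answer the question at the end.\n"
                              "If you don't know, say that you do not know.")):
    """Assemble instruction, context, and question into one prompt string."""
    return (
        f"Instruction: {instruction}\n\n"
        f"Context: {context.strip()}\n\n"
        f"Question: {question}\n\n"
        "Response:\n"
    )

# Illustrative usage with a one-sentence context taken from the article.
example = build_prompt(
    "Disney is scrapping plans to build a $1 billion office complex in Florida.",
    "What plans is Disney cancelling?",
)
print(example)
```

Keeping the instruction as a default argument makes it easy to swap in a different instruction while leaving the context and question unchanged.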

In [8]:
context = """

Disney on Thursday upped the ante in its battle with Florida’s Republican Gov. Ron DeSantis, and it cost his state 2,000 white-collar jobs.
Disney is scrapping plans to build a $1 billion office complex in Florida, citing “changing business conditions,” according to a memo provided by a Disney spokesperson.
The decision comes at a time when the company is openly feuding with DeSantis, who is expected to officially enter the 2024 GOP presidential race next week, CNN reported Thursday.
A spokesperson for DeSantis said it was “unsurprising” that Disney would cancel the project “given the company’s financial straits, falling market cap and declining stock price.”
Disney, along with the broader media industry, is grappling with a difficult advertising environment and a massive writers strike. Earlier this year it announced it would be cutting 7,000 jobs as part of a cost-cutting effort.
Separately, the company confirmed Thursday that it would shut down its Star Wars: Galactic Starcruiser resort at Disney World just over a year after it opened.
The popular attraction “will take its final voyage” at the end of September, Disney said, adding that it is working with guests to rebook reservations for later in the year.
"""

question = "What plans is Disney cancelling?"

prompt = f"""
Instruction: Use only information in the following context to answer the question at the end. 
If you don't know, say that you do not know. 
 
Context:  {context}
 
Question: {question}
 
Response:
"""
print(prompt)

res = dolly_completion(prompt)
print("Dolly:")
print(res[0]['generated_text'])


res = openai_completion(prompt)
print("\nOpenAI:")
print(res.choices[0].message["content"])


Instruction: Use only information in the following context to answer the question at the end. 
If you don't know, say that you do not know. 
 
Context:  

Disney on Thursday upped the ante in its battle with Florida’s Republican Gov. Ron DeSantis, and it cost his state 2,000 white-collar jobs.
Disney is scrapping plans to build a $1 billion office complex in Florida, citing “changing business conditions,” according to a memo provided by a Disney spokesperson.
The decision comes at a time when the company is openly feuding with DeSantis, who is expected to officially enter the 2024 GOP presidential race next week, CNN reported Thursday.
A spokesperson for DeSantis said it was “unsurprising” that Disney would cancel the project “given the company’s financial straits, falling market cap and declining stock price.”
Disney, along with the broader media industry, is grappling with a difficult advertising environment and a massive writers strike. Earlier this year it announced it would be cu

## TODO:

Some ideas for you to try:
- Add "Respond in French/Spanish" to the prompt.
- Add more information to the context until you hit the token limit of the model.
- Replace the entire prompt with a simple task like "Tell me about Newmarket, Ontario".
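The first two ideas can be tried with a couple of small helpers. This is a sketch under stated assumptions: `with_language`, `estimate_tokens`, and `fits_budget` are hypothetical names, the four-characters-per-token ratio is only a common rule of thumb for English text, and the budget of 2048 is an illustrative placeholder rather than a documented limit of either model. For an exact count, use the tokenizer that belongs to the model.

```python
def with_language(prompt, language):
    """Prepend an instruction asking the model to answer in `language`."""
    return f"Respond in {language}.\n{prompt}"

def estimate_tokens(text, chars_per_token=4):
    """Very rough token estimate; real tokenizers give different counts."""
    return max(1, len(text) // chars_per_token)

def fits_budget(text, token_budget=2048):
    """Check the rough estimate against a placeholder token budget."""
    return estimate_tokens(text) <= token_budget

prompt = with_language("Tell me about Newmarket, Ontario", "French")
print(prompt)
print("Estimated tokens:", estimate_tokens(prompt))
print("Fits the placeholder budget:", fits_budget(prompt))
```

Growing the context until `fits_budget` returns `False` gives an approximate idea of when a prompt is close to the limit; the model's own error or truncation behavior remains the real test.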

In [7]:
prompt = "Tell me about Newmarket, Ontario"

res = dolly_completion(prompt)
print("Dolly:")
print(res[0]['generated_text'])


res = openai_completion(prompt)
print("\nOpenAI:")
print(res.choices[0].message["content"])

Dolly:
Newmarket is a small town in Canada located at the northwest corner of Ontario. Newmarket is best known for thoroughbred horse racing and holds four first class racetracks including Woodbine, where a bid for the 2023 Summer Esteem of the World was made by the winner, Mukopia. Additionally, Newmarket has produced several Canadian and Olympic athletes including boxer Ray Arcel, and curler Kris Knowles.

OpenAI:
Newmarket is a town located in the Regional Municipality of York in Ontario, Canada. It is situated approximately 45 km north of Toronto and has a population of around 85,000 people. 

Newmarket is known for its historic downtown area, which features a variety of shops, restaurants, and cultural attractions. The town is also home to several parks and recreational facilities, including the Fairy Lake Park and the Newmarket Riverwalk Commons.

The town has a strong economy, with a focus on healthcare, technology, and manufacturing industries. It is also home to several educat