<a href="https://colab.research.google.com/github/Redislabs-Solution-Architects/Redis-Workshops/blob/main/04-Large_Language_Model/04.02_Large_Language_Model_Google.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Large Language Models

![Redis](https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120)

In this notebook you'll send prompts to LLMs programmatically: Google's `gemini-pro` and, optionally, a self-hosted in-notebook `databricks/dolly-v2` model.

In [1]:
# Install dependencies
!pip -q install google-generativeai

# Uncomment if you are installing the Dolly model locally
# !pip -q install accelerate transformers sentence-transformers tiktoken

## Authenticate notebook to Google Cloud API
You can get your Google Cloud API key at https://console.cloud.google.com/apis/credentials

For security reasons, we recommend restricting the key so that it allows only the Generative Language API.

***Note - If you are participating in a workshop, your instructor should provide you with an API key.***

In [2]:
import getpass
import os

if "GOOGLE_API_KEY" not in os.environ:
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Provide your Google API Key: ")

Provide your Google API Key: ··········


Test that we have access to Google APIs by requesting the list of models that support `generateContent`.

In [3]:
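import google.generativeai as genai

# Pass the key explicitly (os was imported in the previous cell)
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Print only the models that can be used for text generation
for m in genai.list_models():
    if 'generateContent' in m.supported_generation_methods:
        print(m.name)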

models/gemini-1.0-pro-latest
models/gemini-1.0-pro
models/gemini-pro
models/gemini-1.0-pro-001
models/gemini-1.0-pro-vision-latest
models/gemini-pro-vision
models/gemini-1.5-pro-latest
models/gemini-1.5-pro-001
models/gemini-1.5-pro-002
models/gemini-1.5-pro
models/gemini-1.5-pro-exp-0801
models/gemini-1.5-pro-exp-0827
models/gemini-1.5-flash-latest
models/gemini-1.5-flash-001
models/gemini-1.5-flash-001-tuning
models/gemini-1.5-flash
models/gemini-1.5-flash-exp-0827
models/gemini-1.5-flash-002
models/gemini-1.5-flash-8b
models/gemini-1.5-flash-8b-001
models/gemini-1.5-flash-8b-latest
models/gemini-1.5-flash-8b-exp-0827
models/gemini-1.5-flash-8b-exp-0924


***Note - This step is optional. Uncomment and run this cell if you do not have a Google Cloud API key or would like to compare results from Gemini and a self-hosted model***

Initialize `databricks/dolly-v2-3b` via [Hugging Face](https://huggingface.co/databricks/dolly-v2-3b). Multiple progressively more powerful models are available, including 3b, 7b and 12b (referring to billions of parameters). `dolly-v2-3b` is the only model in the family that fits in the memory and GPU available in a free Google Colab instance.

Loading and initializing the model can take a few minutes.

In [4]:
# Skip dolly initialization
# import torch
# from transformers import pipeline

# dolly_completion = pipeline(model="databricks/dolly-v2-3b",
#                         torch_dtype=torch.bfloat16,
#                         trust_remote_code=True,
#                         device_map="auto")

Helper function for the Google Gemini model

In [5]:
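# Create the model client once and reuse it across calls
model = genai.GenerativeModel('gemini-pro')

def gemini_completion(prompt):
    """Send a single prompt to Gemini and return the full response object."""
    response = model.generate_content(prompt)
    return response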
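
The cells below read the answer with `res.candidates[0].content.parts[0].text`, which raises an error when a response comes back without candidates or parts (this may happen, for example, when a prompt is blocked). A minimal sketch of a more defensive accessor follows. `response_text` is a hypothetical helper name, not part of the `google-generativeai` library, and it assumes only the attribute layout already used in this notebook. The stand-in objects let the sketch run without an API key.

```python
from types import SimpleNamespace

def response_text(response):
    """Return the concatenated text of the first candidate, or None if there is none."""
    candidates = getattr(response, "candidates", None) or []
    if not candidates:
        return None
    content = getattr(candidates[0], "content", None)
    parts = getattr(content, "parts", None) or []
    # Keep only parts that actually carry text
    texts = [part.text for part in parts if getattr(part, "text", "")]
    return "".join(texts) if texts else None

# Stand-in objects with the same attribute layout (assumption: for offline illustration only)
fake = SimpleNamespace(
    candidates=[SimpleNamespace(content=SimpleNamespace(parts=[SimpleNamespace(text="Lady Gaga")]))]
)
empty = SimpleNamespace(candidates=[])

print(response_text(fake))   # Lady Gaga
print(response_text(empty))  # None
```

With a real response, `response_text(gemini_completion(prompt))` could replace the direct attribute chain.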

# Create the prompt

The prompt contains instructions, context, and the question. Feel free to experiment with the prompt and compare the responses from different models.

News article used in this example: https://www.cnn.com/2024/10/04/entertainment/lady-gaga-bravado-harley-quinn-joker-sequel/index.html

In [6]:
# Specify the context, question, and prompt to be passed to the LLM
context = """
With Friday’s release of “Joker: Folie à Deux,” Lady Gaga will introduce audiences to Lee Quinzel, her version of the Joker’s love interest Harley Quinn from the DC comic books.
In an interview with CNN, Gaga explained just how much she had to change her typical approach to performing for the movie, which does double duty as a a somewhat pared-down musical and a dark love story between Joker, a.k.a. Arthur Fleck (Joaquin Phoenix), and Gaga’s character.
“All the showmanship, all the bravado of being on stage, all the things that I do as Gaga that come naturally to me, I tried to completely do away with those things,” Gaga, whose real name is Stefani Germanotta, said.

The 13-time Grammy-winning singer, who also won an Oscar for her original song “Shallow” from the 2018 film “A Star is Born,” even unlearned singing techniques for her latest role.

“I tried to breathe incorrectly,” she shared, later adding that she “tried not to do any vocal placement in my throat the way that I would when I’m on stage.”
The result is a character with a breathy and understated singing technique who matches Phoenix’s similar singing style in their duets. They sync up in other ways too, as she feels quite at home with violence and arson alongside the at-times homicidal Joker, as seen in the movie.
“Folie à Deux” is the sequel to 2019’s “Joker,” which saw Phoenix take home the Academy Award for best lead actor. The movie made over $1 billion at the global box office, which in part paved the way for this somewhat unlikely sequel.
“Joker: Folie à Deux” is in theaters on Friday. It’s produced by Warner Bros. Pictures, which like CNN is owned by Warner Bros. Discovery.
"""

question = "Who plays Harley Quinn in the movie?"

prompt = f"""
Instruction: Use only information in the following context to answer the question at the end.
If you don't know, say that you do not know.

Context:  {context}

Question: {question}

Response:
"""

print(prompt)

# Uncomment if you installed the dolly model locally and would like to compare results
#res = dolly_completion(prompt)
#print("Dolly:")
#print(res[0]['generated_text'])

res = gemini_completion(prompt)
print("\nGemini:")
print(res.candidates[0].content.parts[0].text)


Instruction: Use only information in the following context to answer the question at the end.
If you don't know, say that you do not know.

Context:  
With Friday’s release of “Joker: Folie à Deux,” Lady Gaga will introduce audiences to Lee Quinzel, her version of the Joker’s love interest Harley Quinn from the DC comic books.
In an interview with CNN, Gaga explained just how much she had to change her typical approach to performing for the movie, which does double duty as a a somewhat pared-down musical and a dark love story between Joker, a.k.a. Arthur Fleck (Joaquin Phoenix), and Gaga’s character.
“All the showmanship, all the bravado of being on stage, all the things that I do as Gaga that come naturally to me, I tried to completely do away with those things,” Gaga, whose real name is Stefani Germanotta, said.

The 13-time Grammy-winning singer, who also won an Oscar for her original song “Shallow” from the 2018 film “A Star is Born,” even unlearned singing techniques for her late

## TODO:
Play around with basic prompts to see what kind of replies you can generate.

Some ideas for you to try:
- Ask the same question as above, but without the context

- Add "Respond in French/Spanish" to the prompt.

- Add more information into the context until you hit the token limit of the model.

- Check whether the model remembers earlier prompts, e.g. "What was my last question?"

- Provoke hallucinations, e.g. "What was the name of the first elephant to walk on the moon?"
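
The first two ideas amount to changing how the prompt string is assembled. Here is a minimal sketch, assuming the same Instruction/Context/Question layout as the prompt cell above. `build_prompt` and its `extra_instruction` parameter are hypothetical names introduced here for illustration, and the one-sentence context is a shortened stand-in for the full article text.

```python
def build_prompt(question, context=None, extra_instruction=None):
    """Assemble a prompt; the Context section is omitted when no context is given."""
    lines = []
    if context:
        lines.append("Instruction: Use only information in the following context "
                     "to answer the question at the end.")
        lines.append("If you don't know, say that you do not know.")
    if extra_instruction:
        lines.append(extra_instruction)
    if context:
        lines += ["", f"Context: {context.strip()}"]
    lines += ["", f"Question: {question}", "", "Response:"]
    return "\n".join(lines).strip()

question = "Who plays Harley Quinn in the movie?"

# Variant 1: same question, no context
no_context = build_prompt(question)

# Variant 2: with context and an extra language instruction
in_french = build_prompt(
    question,
    context="Lady Gaga will introduce audiences to Lee Quinzel, her version of Harley Quinn.",
    extra_instruction="Respond in French.",
)

print(no_context)
print("---")
print(in_french)
```

Either string can be passed to `gemini_completion` in place of `prompt`.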

In [7]:
prompt = "Who plays Harley Quinn in the movie?"

# Uncomment if you installed the dolly model locally and would like to compare results
# res = dolly_completion(prompt)
# print("Dolly:")
# print(res[0]['generated_text'])

res = gemini_completion(prompt)
print("\nGemini:")
print(res.candidates[0].content.parts[0].text)


Gemini:
Margot Robbie
