<a href="https://colab.research.google.com/github/Redislabs-Solution-Architects/Redis-Workshops/blob/main/04-Large_Language_Model/04.02_Large_Language_Model_Google.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Large Language Models

![Redis](https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120)

In this notebook you'll send prompts to LLMs programmatically: Google's `gemini-pro` and, optionally, a self-hosted in-notebook `databricks/dolly-v2` model.

In [None]:
# Install dependencies
!pip -q install google-generativeai accelerate transformers sentence-transformers tiktoken

## Authenticate notebook to Google Cloud API
You can get your Google Cloud API key at https://console.cloud.google.com/apis/credentials

For security reasons, we recommend restricting the key to the Generative Language API only.

***Note - If you are participating in a workshop, your instructor should provide you with an API key.***

In [None]:
import getpass
import os

if "GOOGLE_API_KEY" not in os.environ:
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Provide your Google API Key: ")

Test that we have access to Google APIs by requesting the list of models

In [None]:
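import google.generativeai as genai

# Pass the key explicitly so the authentication step is visible in the notebook
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# List only the models that can be used for text generation
for m in genai.list_models():
    if 'generateContent' in m.supported_generation_methods:
        print(m.name)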

***Note - This step is optional. Uncomment and run this cell if you do not have a Google Cloud API key or would like to compare results from Gemini and a self-hosted model***

Initialize `databricks/dolly-v2-3b` via [HuggingFace](https://huggingface.co/databricks/dolly-v2-3b). Several progressively larger models are available: 3b, 7b, and 12b (the parameter count, in billions). `dolly-v2-3b` is the only model in the family that fits in the memory and GPU available in a free Google Colab instance.

Loading and initializing the model can take a few minutes.

In [None]:
# Skip dolly initialization
# import torch
# from transformers import pipeline

# dolly_completion = pipeline(model="databricks/dolly-v2-3b",
#                          torch_dtype=torch.bfloat16,
#                          trust_remote_code=True,
#                          device_map="auto")

Helper function for the Google Gemini model

In [None]:
model = genai.GenerativeModel('gemini-pro')

def gemini_completion(prompt):
    """Send the prompt to Gemini and return the full response object."""
    response = model.generate_content(prompt)
    return response
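
The helper above returns the full response object, and later cells read the reply with `res.candidates[0].content.parts[0].text`. That lookup fails with an exception if the response has no candidates or no parts, which can happen, for example, when a prompt is blocked. Below is a minimal sketch of a more defensive lookup. `extract_text` is a hypothetical helper, not part of `google-generativeai`, and the `SimpleNamespace` objects are stand-ins that only mimic the attribute path used in this notebook, so the cell runs without an API key.

```python
from types import SimpleNamespace


def extract_text(response, default=""):
    """Return the text of the first part of the first candidate, or `default`.

    Hypothetical helper (not part of google-generativeai). It relies only on
    the attribute path already used in this notebook:
    response.candidates[0].content.parts[0].text
    """
    candidates = getattr(response, "candidates", None) or []
    if not candidates:
        return default
    content = getattr(candidates[0], "content", None)
    parts = getattr(content, "parts", None) or []
    if not parts:
        return default
    return getattr(parts[0], "text", default)


# Stand-in objects that mimic the response shape, so this cell runs offline.
ok = SimpleNamespace(
    candidates=[
        SimpleNamespace(content=SimpleNamespace(parts=[SimpleNamespace(text="Hello")]))
    ]
)
empty = SimpleNamespace(candidates=[])

print(extract_text(ok))
print(extract_text(empty, default="(no text returned)"))
```

With a real response you could call `extract_text(gemini_completion(prompt))` in place of the chained attribute lookup.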

# Create the prompt

The prompt contains instructions, context, and the question. Feel free to experiment with the prompt and compare the responses from different models.

News article used in this example: https://techcrunch.com/2023/10/31/toyota-to-invest-another-8b-into-north-carolina-ev-battery-factory/

In [None]:
# Specify the context, question, and prompt to be passed to the LLM
context = """
Toyota said Tuesday it will invest another $8 billion into its first EV battery factory in North America, as the Japanese automaker tries to ramp up its electrification program and introduce 30 battery electric models globally by the end of the decade.
The North Carolina-based factory, which is slated to go into production in 2025, is now valued at $13.9 billion, according to the company. 
That’s a jump from Toyota’s initial plans to make a $1.29 billion investment to build a North American factory that will make batteries for hybrid electric vehicles and battery electric vehicles.
This latest investment will add eight battery electric and plug-in hybrid battery production lines to the facility located on 1,825 acres in Liberty, North Carolina. 
Once completed, the factory will have 10 lines and will reach a total production capacity of 30 GWh annually by 2030. To put that into perspective, the so-called Tesla gigafactory, which is a joint venture between the automaker and Panasonic, has the capacity to make 35 GWh of cells annually.
"""

question = "What is Toyota building in North Carolina?"

prompt = f"""
Instruction: Use only information in the following context to answer the question at the end.
If you don't know, say that you do not know.

Context:  {context}

Question: {question}

Response:
"""

print(prompt)

res = gemini_completion(prompt)
print("\nGemini:")
print(res.candidates[0].content.parts[0].text)
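
If you want to try many context and question pairs, it can help to move the template into a function. The sketch below is one possible way to do that. `build_prompt` and `demo_prompt` are hypothetical names introduced here for illustration, and the demo context is a shortened sentence based on the article above. The cell only builds and prints a string, so it does not call any model.

```python
def build_prompt(context, question):
    """Assemble instruction, context, and question into one prompt string."""
    instruction = (
        "Use only information in the following context to answer the question at the end.\n"
        "If you don't know, say that you do not know."
    )
    return (
        f"Instruction: {instruction}\n\n"
        f"Context: {context.strip()}\n\n"
        f"Question: {question.strip()}\n\n"
        "Response:"
    )


# Separate names are used so the notebook's own `prompt` variable is not overwritten.
demo_prompt = build_prompt(
    context="The factory is located in Liberty, North Carolina.",
    question="Where is the factory located?",
)
print(demo_prompt)
```

The result can be passed to `gemini_completion` the same way as the hand-written prompt.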

## TODO:
Play around with basic prompts to see what kind of replies you can generate.

Some ideas for you to try:
- Add "Respond in French/Spanish" to the prompt.

- Add more information into the context until you hit the token limit of the model.

- Check whether the model has any memory of earlier prompts, e.g. "What was my last question?"

- Provoke hallucinations, e.g. "What was the name of the first elephant to walk on the moon?"
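
On the memory idea above: `gemini_completion` sends each prompt on its own, so earlier prompts are not part of a new request unless you include them yourself. The sketch below shows one simple way to carry the conversation along in the prompt text. `ask_with_history` and `fake_completion` are hypothetical names, and the stub stands in for any function that takes a prompt string and returns the reply text, so the cell runs without an API key.

```python
def ask_with_history(history, question, complete):
    """Send `question` along with all earlier turns and record the new turn.

    history  -- list of (question, answer) tuples, mutated in place
    complete -- any callable that takes a prompt string and returns reply text
    """
    lines = []
    for past_question, past_answer in history:
        lines.append(f"User: {past_question}")
        lines.append(f"Assistant: {past_answer}")
    lines.append(f"User: {question}")
    lines.append("Assistant:")
    prompt = "\n".join(lines)
    answer = complete(prompt)
    history.append((question, answer))
    return answer


def fake_completion(prompt):
    """Offline stub: reports how many user turns the prompt contains."""
    user_turns = sum(1 for line in prompt.splitlines() if line.startswith("User: "))
    return f"I can see {user_turns} user turn(s)."


history = []
print(ask_with_history(history, "Tell me about Dallas, Texas", fake_completion))
print(ask_with_history(history, "What was my last question?", fake_completion))
```

To try this against Gemini, replace `fake_completion` with a small wrapper that returns the reply text, for example `lambda p: gemini_completion(p).candidates[0].content.parts[0].text`.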

In [None]:
prompt = "Tell me about Dallas, Texas"

# Uncomment if you initialized the dolly model above and would like to compare results
# res = dolly_completion(prompt)
# print("Dolly:")
# print(res[0]['generated_text'])

res = gemini_completion(prompt)
print("\nGemini:")
print(res.candidates[0].content.parts[0].text)