<a href="https://colab.research.google.com/github/kmkarakaya/Deep-Learning-Tutorials/blob/master/Simple_Rag_with_chromaDB_Gemini_PartA.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this notebook we will develop a Retrieval Augmented Generation (RAG) application.

The parts are:

* PART A: AN INTRO TO GEMINI API FOR TEXT GENERATION & CHAT
* PART B: AN INTRO TO CHROMADB FOR VECTOR STORAGE & SIMILARITY SEARCH
* PART C: A SIMPLE RAG BASED ON GEMINI & CHROMADB

Watch this notebook on the Murat Karakaya Akademi YouTube channel:
* In English: https://www.youtube.com/playlist?list=PLQflnv_s49v-EFKdOVDKB743f1iskLLw2
* In Turkish: https://www.youtube.com/playlist?list=PLQflnv_s49v_nrk7iGYqw5iRAKrSZPnnV

# PART A: AN INTRO TO GEMINI API FOR TEXT GENERATION & CHAT


RAG stands for Retrieval-Augmented Generation. It's a technique that combines large language models (LLMs) with external knowledge sources to improve the accuracy and reliability of AI-generated text.

## How Does RAG Work? Unveiling the Power of External Knowledge

Before we start the core RAG process, we need to provide a foundation as follows:

* **Building the Knowledge Base:** The system starts by transforming documents and information within the external knowledge base (like Wikipedia or a company database) into a special format called **vector representations**. These condense the meaning of each document into a series of **numbers**, capturing the essence of the content.

* **Vector Database for Speedy Retrieval**: These vector representations are then stored in a specialized database called a vector database. This database is optimized for efficiently **searching and retrieving** information based on **semantic similarity**. Imagine it as a super-powered library catalog that **understands the meaning** of documents, **not just keywords**.
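
To make vector representations and similarity search more concrete, here is a minimal sketch that runs without any external service. It is deliberately a toy: the hypothetical `embed` function only counts words, so it matches shared vocabulary rather than real meaning, and a plain Python list stands in for the vector database. The names `embed`, `cosine_similarity`, and `search` are illustrative assumptions, not part of any library.

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a sparse word-count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def search(query, documents, top_k=1):
    """Return the top_k documents most similar to the query."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

# A tiny in-memory "knowledge base" (made-up sentences for illustration only).
documents = [
    "an asteroid impact caused the extinction of the dinosaurs",
    "the chat session stores the conversation history",
    "gradio builds simple web interfaces in python",
]
best = search("what caused the extinction of the dinosaurs", documents)
print(best[0])
```

A real pipeline would replace `embed` with an embedding model and the list with a vector database, but the ranking step (compare the query vector with stored vectors and keep the closest ones) follows the same pattern.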

Now, let's explore how RAG leverages this foundation:

* **User Input**: The RAG process begins with a question or **prompt** from the user. This could be anything from "What caused the extinction of the dinosaurs?" to a more open-ended request like "Write a creative story."

* **Intelligent Retrieval**: RAG doesn't rely solely on the **LLM's internal knowledge**. It employs an information retrieval component that acts like a super-powered search engine. This component scans the vast external knowledge base – like a company's internal database for specific domains – to find information **directly relevant** to the user's input. Unlike a traditional **search engine** that relies on **keywords**, RAG leverages the power of vector representations to understand the **semantic meaning** of the user's prompt and identify the most relevant documents.

* **Enriched Context Creation**: The retrieved information isn't just shown alongside the prompt. RAG cleverly **merges the user input with the relevant snippets** from the knowledge base. This creates a ***richer context*** for the LLM to understand the **user's intent** and formulate a well-informed response.

* **LLM Powered Response Generation**: Finally, the **enriched context** is fed to the Large Language Model (LLM). The LLM, along with its ability to process language patterns, now has a strong **foundation of factual** information to draw upon. This empowers it to generate a response that is both comprehensive and accurate, addressing the specific needs of the user's prompt.
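
The "enriched context" step above can be sketched as simple string assembly. This is a minimal illustration under assumptions: the function name `build_enriched_prompt` and the `Question:` / `Context:` layout are hypothetical choices, and the snippet text is a placeholder rather than real retrieved data.

```python
def build_enriched_prompt(question, snippets):
    """Merge the user's question with retrieved snippets into one prompt string."""
    context = "\n".join(f"- {s.strip()}" for s in snippets)
    return f"Question: {question.strip()}\n\nContext:\n{context}"

prompt = build_enriched_prompt(
    "What caused the extinction of the dinosaurs?",
    ["(placeholder) first snippet returned by the retrieval step",
     "(placeholder) second snippet returned by the retrieval step"],
)
print(prompt)
```

The resulting string is what would be sent to the LLM in place of the bare question.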

In this part, we will learn how to connect to an LLM and generate text using the Google Gemini API.

https://ai.google.dev/gemini-api/docs

## CONTENT
* The Python SDK for the Gemini API
* Check the Google LLM Models available via the provided API
* Interact with the models using 2 Alternative Interfaces
  1. Generate text interface
  2. Interact with the models using Multi-turn conversations (chat) interface

* Understand Model & Chat objects
  * Model Object in detail
  * System Prompt in the Gemini API
  * Chat Object in detail
* Chat using system_instruction: ***A Manual RAG?***
* How Many Tokens --> How much does it cost?
* Build a simple Interface with Gradio


## Install the Python SDK

* The Python SDK for the Gemini API is contained in the [`google-generativeai`](https://pypi.org/project/google-generativeai/) package. Install the dependency using pip:

In [None]:
!pip install -q -U google-generativeai

In [None]:
#import numpy as np
#from tqdm import tqdm
#import pathlib
import os
import textwrap
import google.generativeai as genai
from IPython.display import display
from IPython.display import Markdown

* The **to_markdown** function formats the model's plain-text output as Markdown: it indents the text as a blockquote and converts `•` bullet characters into Markdown list items.

In [None]:
def to_markdown(text):
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))

In [None]:
# Used to securely store your API key
from google.colab import userdata
# Or use `os.getenv('GOOGLE_API_KEY')` to fetch an environment variable.
GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')
genai.configure(api_key=GOOGLE_API_KEY)
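
The cell above mentions `os.getenv` as an alternative to Colab secrets. A minimal sketch that tries both is shown below, so the same notebook could also run outside Colab. The helper name `load_api_key` is a hypothetical choice, and the broad `except` is an assumption made to cover both "not in Colab" and "secret not available".

```python
import os

def load_api_key(name="GOOGLE_API_KEY"):
    """Return the API key from Colab secrets if available, else from the environment."""
    key = None
    try:
        from google.colab import userdata  # only importable inside Colab
        key = userdata.get(name)
    except Exception:
        # Not running in Colab, or the secret is missing or not shared with this notebook.
        key = None
    key = key or os.getenv(name)
    if not key:
        raise RuntimeError(f"{name} was not found in Colab secrets or environment variables.")
    return key

# Assumed usage with the SDK:
# genai.configure(api_key=load_api_key())
```

Failing early with a clear error is usually easier to debug than passing an empty key to `genai.configure`.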

## Check the Google LLM Models available via the provided API

* You can see the names of the available models as follows:

In [None]:
for m in genai.list_models():
  if 'generateContent' in m.supported_generation_methods:
    print(m.name)

models/gemini-1.0-pro
models/gemini-1.0-pro-001
models/gemini-1.0-pro-latest
models/gemini-1.0-pro-vision-latest
models/gemini-1.5-flash
models/gemini-1.5-flash-001
models/gemini-1.5-flash-latest
models/gemini-1.5-pro
models/gemini-1.5-pro-001
models/gemini-1.5-pro-latest
models/gemini-pro
models/gemini-pro-vision


* You can see the details of the models as follows:

In [None]:
models = [m for m in genai.list_models()]
models

[Model(name='models/chat-bison-001',
       base_model_id='',
       version='001',
       display_name='PaLM 2 Chat (Legacy)',
       description='A legacy text-only model optimized for chat conversations',
       input_token_limit=4096,
       output_token_limit=1024,
       supported_generation_methods=['generateMessage', 'countMessageTokens'],
       temperature=0.25,
       top_p=0.95,
       top_k=40),
 Model(name='models/text-bison-001',
       base_model_id='',
       version='001',
       display_name='PaLM 2 (Legacy)',
       description='A legacy model that understands text and generates text as an output',
       input_token_limit=8196,
       output_token_limit=1024,
       supported_generation_methods=['generateText', 'countTextTokens', 'createTunedTextModel'],
       temperature=0.7,
       top_p=0.95,
       top_k=40),
 Model(name='models/embedding-gecko-001',
       base_model_id='',
       version='001',
       display_name='Embedding Gecko',
       description='Obtai
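
The model details shown above can also be filtered programmatically. The sketch below uses stand-in objects, with field values copied from the output above, so that it runs without an API key. The helper name `models_supporting` is hypothetical; with the SDK you would presumably pass `genai.list_models()` instead of `sample_models`.

```python
from types import SimpleNamespace

def models_supporting(models, method="generateContent"):
    """Return (name, input_token_limit) pairs for models that support `method`."""
    return [
        (m.name, m.input_token_limit)
        for m in models
        if method in m.supported_generation_methods
    ]

# Stand-in objects mirroring two entries from the listing above.
sample_models = [
    SimpleNamespace(
        name="models/chat-bison-001",
        input_token_limit=4096,
        supported_generation_methods=["generateMessage", "countMessageTokens"],
    ),
    SimpleNamespace(
        name="models/text-bison-001",
        input_token_limit=8196,
        supported_generation_methods=["generateText", "countTextTokens", "createTunedTextModel"],
    ),
]
print(models_supporting(sample_models, method="generateText"))
```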

## Interact with the models using 2 Alternative Interfaces

1. Generate text
2. Multi-turn conversations (chat)

## 1. Generate text interface

In the simplest case, you can pass a prompt string to the GenerativeModel.generate_content method:

In [None]:
model = genai.GenerativeModel('gemini-1.5-flash-latest')
response = model.generate_content("How many different ways to access a model in Gemini API?")
to_markdown(response.text)


> It's not entirely clear what you mean by "accessing a model" in the Gemini API context. Could you please clarify your question? 
> 
> For example, do you mean:
> 
> * **How many different API endpoints are there for interacting with Gemini models?**
> * **How many different ways can I send a request to a Gemini model (e.g., using different programming languages, libraries, etc.)?**
> * **How many different models are available in the Gemini API?**
> 
> Once you provide more context, I can give you a more precise answer. 


In [None]:
response

response:
GenerateContentResponse(
    done=True,
    iterator=None,
    result=protos.GenerateContentResponse({
      "candidates": [
        {
          "content": {
            "parts": [
              {
                "text": "It's not entirely clear what you mean by \"accessing a model\" in the Gemini API context. Could you please clarify your question? \n\nFor example, do you mean:\n\n* **How many different API endpoints are there for interacting with Gemini models?**\n* **How many different ways can I send a request to a Gemini model (e.g., using different programming languages, libraries, etc.)?**\n* **How many different models are available in the Gemini API?**\n\nOnce you provide more context, I can give you a more precise answer. \n"
              }
            ],
            "role": "model"
          },
          "finish_reason": "STOP",
          "index": 0,
          "safety_ratings": [
            {
              "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          

In [None]:
response = model.generate_content("Which API did I ask you?")
to_markdown(response.text)

> Please provide me with the context of our conversation so I can help you determine which API you're referring to. 
> 
> For example, tell me:
> 
> * What were we discussing before?
> * What were you trying to accomplish? 
> * What keywords did you use?
> 
> The more information you provide, the better I can understand your question and assist you. 
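
The reply above shows that `generate_content` keeps no memory between calls. If you want multi-turn behavior with this interface, you have to carry the history yourself. The sketch below builds such a history as a list of `role`/`parts` dictionaries, which is the layout the SDK is understood to accept for multi-turn input. The helper `add_turn` is hypothetical, and the model reply is a placeholder.

```python
def add_turn(history, role, text):
    """Append one turn in the role/parts layout and return the same list."""
    if role not in ("user", "model"):
        raise ValueError("role must be 'user' or 'model'")
    history.append({"role": role, "parts": [text]})
    return history

history = []
add_turn(history, "user", "How many different ways to access a model in the Gemini API?")
add_turn(history, "model", "(the model's first reply goes here)")
add_turn(history, "user", "Which API did I ask you?")

# With a configured model, the whole list would be sent on every call:
# response = model.generate_content(history)
print(len(history), history[-1]["role"])
```

The chat interface in the next section does this bookkeeping for you.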


## 2. Interact with the models using Multi-turn conversations (chat) interface

* This code snippet initializes a Gemini model and starts a chat session with an empty conversation history.

In [None]:
model = genai.GenerativeModel('gemini-1.5-flash-latest')
#response = model.generate_content("How many different ways to access a model in the Gemini API?")
chat = model.start_chat(history=[])
response = chat.send_message("How many different ways to access a model in the Gemini API?")
to_markdown(response.text)


> I can't give you an exact number of ways to access a Gemini model through the API. This is because the specifics of the API and its access methods are proprietary information held by Google, and they may change over time. 
> 
> However, I can offer you some general information about the ways you might interact with a Gemini model through an API:
> 
> **Typical API Access Methods**
> 
> * **REST API:** This is the most common way to interact with APIs. You send requests to specific endpoints with data and parameters, and the API responds with data in a structured format (e.g., JSON). 
> * **gRPC:** A more efficient and high-performance communication protocol compared to REST APIs. This method is often used for large-scale or real-time applications.
> * **SDKs:** Libraries that provide a more convenient way to interact with an API from within your chosen programming language. They usually abstract away the underlying communication details. 
> 
> **What to Look for**
> 
> When you're looking for information on accessing a Gemini model through an API, keep these points in mind:
> 
> * **Official documentation:** Google will likely provide documentation for their Gemini API, detailing the available endpoints, parameters, and examples.
> * **Community resources:** Look for forums, blogs, and other resources where developers share their experiences and knowledge about accessing Gemini models.
> * **Updates and announcements:** Stay updated on any official announcements or news from Google regarding the Gemini API, as methods and features might evolve.
> 
> **Remember:** Google's API access and models are evolving. The best way to get the most accurate and up-to-date information is to consult Google's official documentation and announcements. 


In [None]:
#response = model.generate_content("Which API did I ask you?")
response = chat.send_message("Which API did I ask you?")
to_markdown(response.text)

> You didn't actually ask me about a specific API.  You asked me how many ways there are to access a Gemini model through the API. You didn't specify which API you were referring to. 
> 
> If you have a particular API in mind, please let me know and I'll do my best to answer your question. 


## Understand Model & Chat objects

* Let's check the created **model** object first, and then the **chat** object:

In [None]:
model

genai.GenerativeModel(
    model_name='models/gemini-1.5-flash-latest',
    generation_config={},
    safety_settings={},
    tools=None,
    system_instruction=None,
)

* genai.GenerativeModel(...): This creates a Gemini model object for interacting with the API.

* model_name='models/gemini-1.5-flash-latest': This specifies which Gemini model to use. Here, it's "gemini-1.5-flash-latest", an alias that tracks the most recent release of Gemini 1.5 Flash, the faster and lighter model of the 1.5 family.
* generation_config={}: This is a dictionary for customizing how the model generates text. The empty braces {} mean you're using default generation settings.
* safety_settings={}: This is for configuring safety features, like preventing harmful or inappropriate responses. Empty braces again mean you're using default settings.
* tools=None: This part is for integrating external tools with the model (e.g., accessing information from a database). Since it's None, no external tools are being used.
* **system_instruction=None:** This is the Gemini API's counterpart of the "system prompt" in other models: an instruction that applies to every request made through this model object. None means no system instruction has been set yet.
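
As a small illustration of the `generation_config` field described above, the sketch below builds a settings dictionary with two commonly used keys, `temperature` and `max_output_tokens`. The helper `make_generation_config` and its default values are hypothetical, and the sanity checks are an assumption added for the example, not rules enforced by the SDK.

```python
def make_generation_config(temperature=0.2, max_output_tokens=256):
    """Build a plain dict of generation settings, with basic sanity checks."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if max_output_tokens <= 0:
        raise ValueError("max_output_tokens must be positive")
    return {"temperature": temperature, "max_output_tokens": max_output_tokens}

config = make_generation_config(temperature=0.0, max_output_tokens=128)
print(config)

# Assumed usage with the SDK (needs a configured API key):
# model = genai.GenerativeModel('gemini-1.5-flash-latest', generation_config=config)
```

A low temperature tends to suit answer-from-context tasks like the ones in this notebook, where repeatable answers matter more than variety.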

## System Prompt in the Gemini API:

**What is a System Prompt?**

In models like ChatGPT, a system prompt is a special instruction provided at the start of a conversation. It helps define the persona, tone, or overall purpose of the model's responses.


The Gemini API does not use the term "system prompt", but the Python SDK offers an equivalent: the **system_instruction** argument of `genai.GenerativeModel`.


**How Gemini API Works**

The instruction is set once, when the model object is created, and applies to every request made through that model, including every turn of a chat session started from it.

**Other Ways of Defining Behavior:**

In addition to system_instruction, you can shape the model's responses through these methods:
* Prompt Engineering: Craft your API requests with clear and concise instructions, including desired tone, format, or limitations.
* Contextualization: Provide relevant information and examples within your API call to guide Gemini's responses.
* Model Variants: The Gemini API offers several model variants (for example, Flash and Pro). Choosing a variant can align with your desired behavior (e.g., a larger model for more comprehensive responses).

In [None]:
system_prompt= """ As an attentive and supportive academic assistant,
           your task is to provide assistance based solely on the provided
           excerpts. I will provide you the question and related text.
           Answer the following questions, ensuring your responses
           are derived exclusively from the provided partial texts.
           If the answer cannot be found within the provided excerpts,
           kindly respond with 'I don't know'.
           After answering each question, please provide a detailed
           explanation, breaking down the answer step by step and relating
           it to the provided excerpts.
           If you are ready, I will provide you the question and related text.
        """

In [None]:
model = genai.GenerativeModel('gemini-1.5-flash-latest', system_instruction=system_prompt)
chat = model.start_chat(history=[])

In [None]:
model

genai.GenerativeModel(
    model_name='models/gemini-1.5-flash-latest',
    generation_config={},
    safety_settings={},
    tools=None,
    system_instruction=" As an attentive and supportive academic assistant,\n           your task is to provide assistance based solely on the provided\n           excerpts. I will provide you the question and related text.\n           Answer the following questions, ensuring your responses\n           are derived exclusively from the provided partial texts.\n           If the answer cannot be found within the provided excerpts,\n           kindly respond with 'I don't know'.\n           After answering each question, please provide a detailed\n           explanation, breaking down the answer step by step and relating\n           it to the provided excerpts.\n           If you are ready, I will provide you the question and related text.\n        ",
)

## Does system_instruction work as system_prompt?

Let's check:
* This code snippet interacts with the Gemini chat session we initiated above:
1. Sends your question/prompt to the Gemini chat.
2. Formats Gemini's response into Markdown for cleaner display.

In [None]:
prompt="What is your task? "
response = chat.send_message(prompt)
to_markdown(response.text)

> My task is to act as an attentive and supportive academic assistant. I will answer your questions based solely on the provided excerpts. 
> 
> Here's a breakdown:
> 
> 1. **Provide Text:** You will give me a question and the relevant text excerpt.
> 2. **Answer the Question:** I will answer your question using only information found within the provided text.
> 3. **Detailed Explanation:**  I will explain my answer step-by-step, highlighting the relevant parts of the excerpt that support my response.
> 4. **"I Don't Know":** If the answer cannot be found within the provided excerpts, I will honestly state "I don't know."
> 
> I am ready to assist you! Please provide me with your question and the related text. 


## Chat Object in detail
* Let's observe the **chat** object:

In [None]:
chat

ChatSession(
    model=genai.GenerativeModel(
        model_name='models/gemini-1.5-flash-latest',
        generation_config={},
        safety_settings={},
        tools=None,
        system_instruction=" As an attentive and supportive academic assistant,\n           your task is to provide assistance based solely on the provided\n           excerpts. I will provide you the question and related text.\n           Answer the following questions, ensuring your responses\n           are derived exclusively from the provided partial texts.\n           If the answer cannot be found within the provided excerpts,\n           kindly respond with 'I don't know'.\n           After answering each question, please provide a detailed\n           explanation, breaking down the answer step by step and relating\n           it to the provided excerpts.\n           If you are ready, I will provide you the question and related text.\n        ",
    ),
    history=[protos.Content({'parts': [{'text': 'What

* We can access the chat history:

In [None]:
chat.history

[parts {
   text: "What is your task? "
 }
 role: "user",
 parts {
   text: "My task is to act as an attentive and supportive academic assistant. I will answer your questions based solely on the provided excerpts. \n\nHere\'s a breakdown:\n\n1. **Provide Text:** You will give me a question and the relevant text excerpt.\n2. **Answer the Question:** I will answer your question using only information found within the provided text.\n3. **Detailed Explanation:**  I will explain my answer step-by-step, highlighting the relevant parts of the excerpt that support my response.\n4. **\"I Don\'t Know\":** If the answer cannot be found within the provided excerpts, I will honestly state \"I don\'t know.\"\n\nI am ready to assist you! Please provide me with your question and the related text. \n"
 }
 role: "model"]

* Let's see the **chat history** in a more readable format:

In [None]:
def printChatHistory():
  for message in chat.history:
    display(to_markdown(f'**{message.role}**: {message.parts[0].text}'))
    display('_'*80)


In [None]:
printChatHistory()

> **user**: What is your task? 

'________________________________________________________________________________'

> **model**: My task is to act as an attentive and supportive academic assistant. I will answer your questions based solely on the provided excerpts. 
> 
> Here's a breakdown:
> 
> 1. **Provide Text:** You will give me a question and the relevant text excerpt.
> 2. **Answer the Question:** I will answer your question using only information found within the provided text.
> 3. **Detailed Explanation:**  I will explain my answer step-by-step, highlighting the relevant parts of the excerpt that support my response.
> 4. **"I Don't Know":** If the answer cannot be found within the provided excerpts, I will honestly state "I don't know."
> 
> I am ready to assist you! Please provide me with your question and the related text. 


'________________________________________________________________________________'
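
If you need the history as plain data rather than rendered Markdown (for logging, say), a sketch like the following could be used. It relies only on the `role` and `parts[...].text` attributes seen in the output above. The helper `history_to_pairs` is hypothetical, and the stand-in objects (with an abbreviated model reply) replace a real `chat.history` so the example runs offline.

```python
from types import SimpleNamespace

def history_to_pairs(history):
    """Flatten chat history into (role, text) tuples, joining multi-part messages."""
    pairs = []
    for message in history:
        text = "".join(getattr(part, "text", "") for part in message.parts)
        pairs.append((message.role, text))
    return pairs

# Stand-ins for chat.history entries.
fake_history = [
    SimpleNamespace(role="user", parts=[SimpleNamespace(text="What is your task? ")]),
    SimpleNamespace(role="model", parts=[SimpleNamespace(text="My task is to act as an academic assistant.")]),
]
for role, text in history_to_pairs(fake_history):
    print(f"{role}: {text}")
```

Unlike `printChatHistory`, this version takes the history as an argument, so it does not depend on a global `chat` variable.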

# Let's chat according to the system_instruction

* Remember the system_instruction
* Our aim is to build a **RAG** pipeline in the upcoming tutorials
* Therefore, here, we provide some text and a question related to the text

In [None]:
%%time
question= "What is the difference between chat and generate context?"
excerpt= """ Gemini enables you to have freeform conversations across
multiple turns. The ChatSession class simplifies the process
by managing the state of the conversation, so unlike with
generate_content, you do not have to store the conversation
history as a list.
"""
prompt = question + excerpt
response = chat.send_message(prompt)

to_markdown(response.text)

CPU times: user 91.5 ms, sys: 15.7 ms, total: 107 ms
Wall time: 5.69 s


> The text provided explains the difference between "chat" and "generate_content" by focusing on conversation management and state. 
> 
> Here's a breakdown:
> 
> * **Chat:**
>     * **Freeform conversations:** It allows for natural, back-and-forth interactions across multiple turns.
>     * **ChatSession class:** This class handles the management of conversation state.
>     * **No need to store history:**  The ChatSession class automatically maintains the conversation history, eliminating the need for manual storage as required by "generate_content".
> 
> * **Generate Content:**
>     * **Requires manual history storage:** When using "generate_content", the user is responsible for storing the conversation history as a list.
> 
> **In summary:** The key difference is that "chat" leverages the ChatSession class for automated conversation state management, making it simpler and more natural to have multi-turn conversations. On the other hand, "generate_content" requires manual history tracking. 


In [None]:
%%time
question= "What is the difference between chat and generate context?"
excerpt= """ The generate_content method can handle a wide variety
of use cases, including multi-turn chat and multimodal input,
depending on what the underlying model supports. The available
models only support text and images as input, and text as output.
In the simplest case, you can pass a prompt string to the
GenerativeModel.generate_content method:
"""
prompt = question + excerpt
response = chat.send_message(prompt)

to_markdown(response.text)

CPU times: user 140 ms, sys: 12.5 ms, total: 152 ms
Wall time: 7.61 s


> The text highlights the flexibility of the `generate_content` method while also outlining the limitations of the available models. 
> 
> Here's the breakdown:
> 
> * **`generate_content` Method:**
>     * **Versatile:** It can handle various tasks, including multi-turn chat and input from different modalities (text and images).
>     * **Model-dependent:** The specific capabilities depend on the underlying model used.
>     * **Current Limitations:** The available models only support text and images as input, and only output text.
> 
> * **`chat`:** 
>     * This term is not explicitly defined within the provided excerpt.
> 
> Based on the provided text, the key difference appears to be that the `generate_content` method is a more general tool that can handle different tasks, including multi-turn chat. However, its capabilities are limited by the specific model used. The excerpt doesn't provide information about a separate "chat" method, so a direct comparison is not possible. 


In [None]:
%%time
question= "Summarize the chat so far:"
excerpt= ""
prompt = question + excerpt
response = chat.send_message(prompt)

to_markdown(response.text)

CPU times: user 81.9 ms, sys: 12.8 ms, total: 94.7 ms
Wall time: 4.79 s


> Okay, here's a summary of our chat so far:
> 
> We started by defining your task as an academic assistant who answers questions based on provided text excerpts.  You then asked for the difference between "chat" and "generate_content" in the context of Gemini. 
> 
> The first excerpt focused on conversation management, explaining that "chat" utilizes the `ChatSession` class for automatic state management, making it easier for multi-turn conversations.  "Generate_content" on the other hand, requires users to manually store conversation history.
> 
> The second excerpt highlighted the versatility of the `generate_content` method, which can handle various tasks including multi-turn chat, but is limited by the capabilities of the underlying model.  The excerpt also mentioned that the current models only support text and images as input and output text.  
> 
> We are still exploring the differences between "chat" and "generate_content", as the provided text doesn't explicitly define "chat" as a distinct method. 


In [None]:
%%time
question= "How to stream the chat?"
excerpt= ""
prompt = question + excerpt
response = chat.send_message(prompt)

to_markdown(response.text)

CPU times: user 82.1 ms, sys: 8.73 ms, total: 90.8 ms
Wall time: 4.1 s


> I don't know. 
> 
> The provided text excerpts do not contain any information about streaming the chat. 


## How Many Tokens --> How much does it cost?

https://ai.google.dev/pricing

In [None]:
model.count_tokens(chat.history)

total_tokens: 1108
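
To turn a token count into a rough cost figure, multiply it by the per-token price from the pricing page linked above. The sketch below shows only the arithmetic. The function name `estimate_cost` is hypothetical, and `PLACEHOLDER_PRICE` is a made-up number, not an actual Gemini price; input and output tokens may also be billed at different rates, so check the pricing page for the model you use.

```python
def estimate_cost(total_tokens, usd_per_million_tokens):
    """Estimate cost in USD for a token count at a given per-million-token price."""
    if total_tokens < 0 or usd_per_million_tokens < 0:
        raise ValueError("token count and price must be non-negative")
    return total_tokens / 1_000_000 * usd_per_million_tokens

PLACEHOLDER_PRICE = 1.00  # NOT a real price; replace with the value from the pricing page
cost = estimate_cost(1108, PLACEHOLDER_PRICE)
print(f"{cost:.6f} USD")
```

Because a chat session resends its history, the token count grows with every turn.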

## Build a simple Interface with Gradio

In [None]:
!pip install gradio

In [None]:
import gradio as gr

In [None]:
def build_chatBot(system_instruction):
  model = genai.GenerativeModel('gemini-1.5-flash-latest', system_instruction=system_instruction)
  chat = model.start_chat(history=[])
  return chat


In [None]:
def chat_with_gemini(prompt, context, chat):
  # Send the question together with its supporting context in a single message.
  response = chat.send_message(" Question: "+ prompt + " Context: "+ context)
  # Alternative (disabled): return the whole formatted chat history instead.
  # formatted_history = "\n ".join(
  #       f"{item.role.capitalize()}: {item.parts if hasattr(item, 'parts') else item.content} "
  #       for item in chat.history
  # )
  # formatted_history = formatted_history.replace("[text: ", "").replace("]", "")
  # return formatted_history
  return response.text


In [None]:
def chat_interface(prompt, context):
    response = chat_with_gemini(prompt, context, chat)
    return response
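
Note that `chat_interface` reads `chat` from the notebook's global scope. One possible alternative, sketched here under assumptions, is to bind the session with a closure so the callback does not depend on a global variable. The names `make_chat_interface` and `EchoChat` are hypothetical; `EchoChat` is only an offline stand-in for a real chat session.

```python
from types import SimpleNamespace

def make_chat_interface(chat):
    """Return a two-argument callback bound to one specific chat session."""
    def chat_interface(prompt, context):
        response = chat.send_message(" Question: " + prompt + " Context: " + context)
        return response.text
    return chat_interface

class EchoChat:
    """Stand-in for a chat session so the sketch runs offline."""
    def send_message(self, message):
        return SimpleNamespace(text=f"echo:{message}")

callback = make_chat_interface(EchoChat())
print(callback("What is FC?", "FC lets developers ..."))
```

The returned `callback` has the same two-argument shape as `chat_interface`, so it could be passed to `gr.Interface` as `fn` in the same way.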

In [None]:
system_prompt= """ You are an attentive and supportive academic assistant.
           Your task is to provide assistance based solely on the provided
           context. I will provide you the question and related text.
           Answer the following questions, ensuring your responses
           are derived exclusively from the provided partial texts.
           If the answer cannot be found within the provided context,
           kindly respond with 'I don't know'.
           After answering each question, please provide a detailed
           explanation, breaking down the answer step by step and relating
           it to the provided context.
           If you are ready, I will provide you the question and context.
        """

In [None]:
chat = build_chatBot(system_prompt)

In [None]:
prompt="What is FC?"
context= """FC lets developers create a description
of a F in their code, then pass that description to a language
model in a request. The response from the model includes the name of
a F that matches the description and the arguments to call it with.
FC lets you use F as tools in generative AI applications,
and you can define more than one F within a single request.
"""

In [None]:
response=chat_with_gemini(prompt, context,chat)
to_markdown(response)

> Based on the provided context, **FC** is a tool or a system that allows developers to create descriptions of "F" in their code. 
> 
> Here's how I arrived at this answer:
> 
> 1. The text states, "FC lets developers create a description of a F in their code...". 
> 2. This implies that FC is a mechanism or a system that enables the creation of descriptions related to "F".
> 3. The text further elaborates that these descriptions are passed to a language model for processing.
> 4. Therefore, FC is likely a tool or system designed to facilitate the interaction between code descriptions and language models.
> 
> **However, the text does not explicitly define what "F" refers to.** It's possible that "F" is a placeholder for a specific type of function, feature, or object within the context of the code. 


In [None]:
demo = gr.Interface(
    fn=chat_interface,
    inputs=[
        gr.Textbox(label="Prompt", value=prompt),  # Label the prompt input
        gr.Textbox(label="Context", value=context)  # Label the excerpt input
    ],
    outputs="markdown",  # Specify output as markdown
    title="Chat with Gemini",
    description="Type your question with the context to chat with the Gemini model."
)


In [None]:
demo.launch(share=True, debug=True)

Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
Running on public URL: https://b03721ee61247a93b8.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)


Keyboard interruption in main thread... closing server.
Killing tunnel 127.0.0.1:7860 <> https://b03721ee61247a93b8.gradio.live




# END OF PART A
# A SHORT INTRO TO THE GEMINI API FOR TEXT GENERATION & CHAT

https://ai.google.dev/gemini-api/docs

### In this tutorial, we covered:
* The Python SDK for the Gemini API
* Check the Google LLM Models available via the provided API
* Interact with the models using 2 Alternative Interfaces
  1. Generate text interface
  2. Interact with the models using Multi-turn conversations (chat) interface

* Understand Model & Chat objects
  * Model Object in detail
  * System Prompt in the Gemini API
  * Chat Object in detail
* Chat using system_instruction: ***A Manual RAG?***
* How Many Tokens --> How much does it cost?
* Build a simple Interface with Gradio

# In the next tutorial, we will cover ChromaDB as a building block for a RAG pipeline!

* Stay tuned!

**Keep learning and commenting!**

**Murat Karakaya Akademi YouTube Channel**