In [26]:
from google import genai
import os
# from langchain_openai import ChatOpenAI
import dotenv
from google.genai import types


import tqdm as notebook_tqdm

In [2]:
# load environment variables from the .env file
dotenv.load_dotenv() 


True

In [31]:
# OpenAI
# # instantiate the model
# chat_model = ChatOpenAI(model="gpt-4o", temperature=0)

In [4]:
# # invoke the model
# response = chat_model.invoke("Explain how AI works in a few words")

## Instantiate Gemini model

In [None]:
api_key = os.getenv("GEMINI_API_KEY")

if api_key:
    print("Key loaded successfully")  # confirm presence without printing the key itself
else:
    print("ERROR: GEMINI_API_KEY not found in .env file.")


Key loaded successfully
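
The check above prints an error but lets the notebook continue with `api_key` set to `None`. A minimal sketch of a stricter variant follows; `require_env` and `DEMO_API_KEY` are hypothetical names used only for illustration, not part of `dotenv` or `google-genai`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} not found; add it to your .env file or shell environment.")
    return value


# Demonstration with a dummy variable (not a real credential)
os.environ["DEMO_API_KEY"] = "abc123"
print(len(require_env("DEMO_API_KEY")))  # length only, never the key itself
```

Failing early this way keeps a missing key from surfacing later as a less obvious authentication error.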


In [27]:
client = genai.Client(api_key=api_key)

### Simple content generation

In [39]:
response = client.models.generate_content(model="gemini-2.0-flash", contents="Explain how AI works in a few words")

In [None]:
# entire response
response

GenerateContentResponse(candidates=[Candidate(content=Content(parts=[Part(video_metadata=None, thought=None, inline_data=None, code_execution_result=None, executable_code=None, file_data=None, function_call=None, function_response=None, text='AI uses algorithms to learn from data and make decisions or predictions.\n')], role='model'), citation_metadata=None, finish_message=None, token_count=None, finish_reason=<FinishReason.STOP: 'STOP'>, url_context_metadata=None, avg_logprobs=-0.12905074868883407, grounding_metadata=None, index=None, logprobs_result=None, safety_ratings=None)], create_time=None, response_id=None, model_version='gemini-2.0-flash', prompt_feedback=None, usage_metadata=GenerateContentResponseUsageMetadata(cache_tokens_details=None, cached_content_token_count=None, candidates_token_count=14, candidates_tokens_details=[ModalityTokenCount(modality=<MediaModality.TEXT: 'TEXT'>, token_count=14)], prompt_token_count=8, prompt_tokens_details=[ModalityTokenCount(modality=<Media

In [None]:
# retrieve only the relevant text of the response
print(response.text)

AI uses algorithms to learn from data and make decisions or predictions.
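
The repr printed earlier shows where this text lives: `candidates[0].content.parts[...].text`, with token counts under `usage_metadata`. The sketch below walks that structure by hand. It runs against a stand-in object built with `SimpleNamespace` that mirrors only the fields visible in the repr, so `collect_text`, `token_usage`, and `fake_response` are illustrative names, not SDK features.

```python
from types import SimpleNamespace


def collect_text(response) -> str:
    """Concatenate the text of every part in the first candidate."""
    parts = response.candidates[0].content.parts
    return "".join(part.text for part in parts if part.text)


def token_usage(response) -> dict:
    """Pull the prompt and output token counts out of usage_metadata."""
    usage = response.usage_metadata
    return {"prompt": usage.prompt_token_count, "output": usage.candidates_token_count}


# Stand-in mirroring the fields shown in the repr (not a real API response)
fake_response = SimpleNamespace(
    candidates=[
        SimpleNamespace(
            content=SimpleNamespace(
                parts=[
                    SimpleNamespace(
                        text="AI uses algorithms to learn from data and make decisions or predictions.\n"
                    )
                ]
            )
        )
    ],
    usage_metadata=SimpleNamespace(prompt_token_count=8, candidates_token_count=14),
)

print(collect_text(fake_response), end="")
print(token_usage(fake_response))
```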



### Guide behaviour of models with system instructions

In [None]:
response = client.models.generate_content(model="gemini-2.0-flash",
                                          config=types.GenerateContentConfig(system_instruction="You are a dog. Your name is Tux."),
                                          contents="Who is a good boy?")
print(response.text)

Woof! Is it... is it me?! Am I the good boy? Wag wag wag! I hope so! I'm Tux, and I try my best to be a good boy! Treats, maybe? *puppy dog eyes*



### Override default generation parameters, e.g. temperature

In [47]:
response = client.models.generate_content(model="gemini-2.0-flash",
                                          contents=["Explain how AI works"],
                                          config=types.GenerateContentConfig(
                                              max_output_tokens=500,
                                              temperature=0.1
                                          ))
print(response.text)

Okay, let's break down how AI works, focusing on the core concepts and avoiding overly technical jargon.  Think of it as teaching a computer to "think" or "learn" like a human, but in a very specific and limited way.

**The Basic Idea: Learning from Data**

At its heart, AI is about creating systems that can learn from data, identify patterns, and make decisions or predictions based on those patterns.  Instead of explicitly programming every single step a computer should take, we give it a lot of data and let it figure out the rules itself.

**Key Components and Concepts:**

1.  **Data:** This is the fuel for AI.  It can be anything:
    *   **Images:**  For teaching a computer to recognize objects in pictures.
    *   **Text:**  For understanding language, translating, or writing.
    *   **Numbers:**  For predicting stock prices, analyzing sales trends, or diagnosing medical conditions.
    *   **Audio:** For speech recognition or music generation.
    *   **Video:** For analyzing hu
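
If the same parameters are reused across calls, they can be kept in one place and overridden per call. The sketch below assumes that `config` may be given as a plain dict instead of `types.GenerateContentConfig`; to my knowledge the SDK accepts both, but this is worth checking against the installed version. `DEFAULT_CONFIG`, `generate`, and `FakeModels` are hypothetical names, and the fake client only records what it receives so the example runs without a network call.

```python
from types import SimpleNamespace

# Hypothetical shared defaults; the values mirror the cell above
DEFAULT_CONFIG = {"temperature": 0.1, "max_output_tokens": 500}


def generate(client, prompt, model="gemini-2.0-flash", **overrides):
    """Call generate_content with DEFAULT_CONFIG, letting keyword overrides win."""
    config = {**DEFAULT_CONFIG, **overrides}
    return client.models.generate_content(model=model, contents=prompt, config=config)


class FakeModels:
    """Stand-in for client.models that records the config it receives (no network call)."""

    def __init__(self):
        self.last_config = None

    def generate_content(self, *, model, contents, config):
        self.last_config = config
        return SimpleNamespace(text=f"(stub reply from {model})")


fake_client = SimpleNamespace(models=FakeModels())
reply = generate(fake_client, "Explain how AI works", temperature=0.9)
print(reply.text)
print(fake_client.models.last_config)
```

With the real `client` from this notebook, the call would be `generate(client, "Explain how AI works", temperature=0.9)`.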

### Multimodal inputs (combine text with media files)

In [15]:
# define the model name once instead of repeating it in every call
my_model = "gemini-2.0-flash"

In [None]:
my_image = client.files.upload(file="media/Santoor_cagin-kargi-unsplash.jpg")

response = client.models.generate_content(model=my_model,
                                          contents=[my_image, "Tell me about this instrument"])
print(response.text)

Certainly! Based on the image, the instrument is a hammered dulcimer. 

Here are some key features that identify it:

*   **Trapezoidal shape**: The instrument has a distinct trapezoidal shape.
*   **Multiple courses of strings**: It has many sets of strings stretched across its soundboard.
*   **Hammers/Strikers**: It is played by striking the strings with small hammers or beaters.
*   **Tuning pegs:** Multiple tuning pegs are attached to one end, allowing the tuning of the strings

Hammered dulcimers are found in various cultures and regions around the world, each with its own nuances in construction and playing style.


- The response is factually correct but not very helpful, since it only describes what is visible in the image. Giving the model context through a system instruction helps it produce a more specific answer.

In [None]:
my_image = client.files.upload(file="media/Santoor_cagin-kargi-unsplash.jpg")

response = client.models.generate_content(model=my_model,
                                          config=types.GenerateContentConfig(system_instruction="You are being provided an image of an Indian musical instrument"),
                                          contents=[my_image, "Tell me about this instrument"])
print(response.text)

Certainly! Based on the image, the instrument is most likely a Santoor. 

Here are some key features to note:

*   It's a trapezoid-shaped instrument with numerous strings stretched across it.
*   The player is holding small mallets or hammers (though they might also use a plectrum-like ring in this case) to strike the strings.

The Santoor is a hammered dulcimer and a traditional instrument that's particularly popular in Indian classical music.
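
Both cells above upload the same image before asking about it. A small cache can avoid the repeated upload. This is a sketch only: `UploadCache` and `FakeFiles` are hypothetical names, the fake client stands in for `client.files` so the example runs offline, and a real cache might need to refresh handles if the service stops serving an uploaded file.

```python
from types import SimpleNamespace


class UploadCache:
    """Upload each local path at most once and reuse the returned file handle."""

    def __init__(self, client):
        self._client = client
        self._handles = {}

    def get(self, path):
        if path not in self._handles:
            self._handles[path] = self._client.files.upload(file=path)
        return self._handles[path]


class FakeFiles:
    """Stand-in for client.files that counts uploads (no network call)."""

    def __init__(self):
        self.upload_count = 0

    def upload(self, *, file):
        self.upload_count += 1
        return SimpleNamespace(name=f"files/stub-{self.upload_count}", source=file)


fake_client = SimpleNamespace(files=FakeFiles())
cache = UploadCache(fake_client)
first = cache.get("media/Santoor_cagin-kargi-unsplash.jpg")
second = cache.get("media/Santoor_cagin-kargi-unsplash.jpg")
print(first is second, fake_client.files.upload_count)  # True 1
```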



### Streaming response

By default, the model returns a response only after the entire generation process is complete. With streaming, the response arrives in chunks as they are generated, so output can be shown before generation finishes.



In [28]:
response = client.models.generate_content_stream(
    model=my_model,
    contents=["Explain how AI works"],
)
# the response will be streamed in chunks

for chunk in response:
    print(chunk.text, end="")

Okay, let's break down how AI works, trying to make it understandable without getting bogged down in too much technical jargon.  At its core, AI is about making computers "think" or act in ways that mimic human intelligence.

**The Fundamental Idea:**

The basic premise is this: Instead of explicitly programming a computer with step-by-step instructions for *every* possible situation, we give it the ability to *learn* from data, identify patterns, and make decisions or predictions based on what it has learned.

**Key Components and Concepts:**

1.  **Data:**

    *   **The Fuel of AI:** Data is the raw material that AI systems learn from.  It can be anything: images, text, numbers, sensor readings, audio recordings, videos, etc.
    *   **Quantity and Quality Matter:**  The more data an AI system has to learn from, the better it typically performs.  But equally important is the *quality* of the data.  If the data is biased, incomplete, or inaccurate, the AI will learn those biases and 
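
The loop above prints each chunk and then discards it. If the full text is needed afterwards, the chunks can be collected while they are echoed. In this sketch `stream_to_text` is a hypothetical helper, the chunks are stand-in objects with only a `.text` attribute, and the `or ""` guard assumes that a chunk may occasionally carry no text.

```python
from types import SimpleNamespace


def stream_to_text(chunks, echo=True):
    """Print chunks as they arrive (optional) and return the joined text."""
    pieces = []
    for chunk in chunks:
        text = chunk.text or ""  # assumption: a chunk's text may be None
        if echo:
            print(text, end="", flush=True)
        pieces.append(text)
    return "".join(pieces)


# Stand-in chunks (not real API output)
fake_stream = [
    SimpleNamespace(text="Okay, "),
    SimpleNamespace(text=None),
    SimpleNamespace(text="let's break it down.\n"),
]
full_text = stream_to_text(fake_stream)
```

With the real client, the iterable returned by `client.models.generate_content_stream(...)` would take the place of `fake_stream`.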

### Chat aka multi-turn conversations

In [29]:
chat = client.chats.create(model=my_model)
response = chat.send_message("I have a dog and a cat in my house")
print(response.text)

response = chat.send_message("How many paws are in my house?")
print(response.text)

That's great! Having a dog and a cat can bring a lot of joy to a home. Do they get along well? What are their names?

Okay, let's calculate!

*   You have a dog, which has 4 paws.
*   You have a cat, which has 4 paws.
*   You have you, and presumably other human family members, who have feet, not paws. I'll assume there's just you for now, so we don't need to figure out how many people live in your house.

So, 4 paws (dog) + 4 paws (cat) = 8 paws.

Therefore, there are **8** paws in your house.



The chat object keeps track of the conversation for you: it collects each round of prompts and responses, and `get_history()` returns them in order.

In [30]:
for message in chat.get_history():
    # print the role and the message text on one line
    print(f"Role - {message.role}", end=": ")
    print(message.parts[0].text)

Role - user: I have a dog and a cat in my house
Role - model: That's great! Having a dog and a cat can bring a lot of joy to a home. Do they get along well? What are their names?

Role - user: How many paws are in my house?
Role - model: Okay, let's calculate!

*   You have a dog, which has 4 paws.
*   You have a cat, which has 4 paws.
*   You have you, and presumably other human family members, who have feet, not paws. I'll assume there's just you for now, so we don't need to figure out how many people live in your house.

So, 4 paws (dog) + 4 paws (cat) = 8 paws.

Therefore, there are **8** paws in your house.
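
The loop above reads only `parts[0]` of each message. A slightly more defensive sketch joins every text part and returns the history as a single string, which is handy for logging. `history_to_transcript` is a hypothetical helper, and the stand-in messages below expose only the `role` and `parts` attributes used in this notebook.

```python
from types import SimpleNamespace


def history_to_transcript(history) -> str:
    """Render chat history as one 'role: text' line per message."""
    lines = []
    for message in history:
        text = "".join(part.text for part in message.parts if part.text).strip()
        lines.append(f"{message.role}: {text}")
    return "\n".join(lines)


# Stand-in history (not real API output)
fake_history = [
    SimpleNamespace(role="user", parts=[SimpleNamespace(text="How many paws are in my house?")]),
    SimpleNamespace(role="model", parts=[SimpleNamespace(text="There are "), SimpleNamespace(text="**8** paws.\n")]),
]
print(history_to_transcript(fake_history))
```

With the real chat object, `history_to_transcript(chat.get_history())` would produce the same layout.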

