<a href="https://colab.research.google.com/github/mapsguy/programming-gemini/blob/main/context_and_configuration.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [10]:
#step 1: install/upgrade the latest genai SDK
%pip install google-genai --upgrade --quiet

In [11]:
#import the genai library
from google import genai

In [12]:
#step 2: AI Studio: read the API key from Colab user data (secrets)
from google.colab import userdata
client = genai.Client(api_key=userdata.get("GEMINI_API_KEY"))

#Alternative: read the API key from an environment variable
#import os
#client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
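
The two ways of supplying the key shown above can be folded into a single lookup. This is a minimal sketch, assuming the key is stored under the name `GEMINI_API_KEY` as in this notebook. `resolve_api_key` and its `lookups` argument are hypothetical names introduced here for illustration; they are not part of the SDK.

```python
import os
from typing import Callable, Iterable, Optional


def resolve_api_key(
    name: str = "GEMINI_API_KEY",
    lookups: Optional[Iterable[Callable[[str], Optional[str]]]] = None,
) -> str:
    """Return the first non-empty value found for `name` (hypothetical helper).

    Each lookup is a callable that takes the key name and returns a string or None.
    By default only the process environment is consulted.
    """
    if lookups is None:
        lookups = [os.environ.get]
    for lookup in lookups:
        try:
            value = lookup(name)
        except Exception:
            # A source that is unavailable or raises is skipped, not fatal.
            value = None
        if value:
            return value
    raise RuntimeError(f"{name} not found in any configured source")
```

In Colab, something like `resolve_api_key(lookups=[userdata.get, os.environ.get])` would try the notebook secrets first and fall back to the environment; the result can then be passed as `api_key` to `genai.Client`.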

In [13]:
model_name = "models/gemini-2.5-flash-preview-05-20"

In [14]:
#step 3: Start chat
#client.chats.create returns a chat object that keeps the conversation history

chat = client.chats.create(
    model=model_name,
    history=[]) # Start with empty history

# Send a message
response = chat.send_message("Hello!")
print(response.text)

# Send another message - history is maintained
response = chat.send_message("Can you tell me about Gemini models?")
print(response.text)

Hello! How can I help you today?
Gemini is a family of **multimodal AI models** developed by **Google AI**. It's designed to be highly capable, flexible, and versatile, capable of understanding and operating across different types of information, including text, code, images, audio, and video.

Here are the key things to know about Gemini models:

1.  **Multimodality from the Ground Up:** This is its most significant distinguishing feature. Unlike many earlier models that were primarily text-based and later adapted for other modalities, Gemini was trained natively on different modalities from the beginning. This allows it to understand, operate on, and combine information from various sources simultaneously, rather than processing them separately. For example, it can understand a video, the audio track of that video, and accompanying text descriptions all at once.

2.  **Different Sizes/Tiers:** Gemini comes in different "sizes" or tiers, optimized for various use cases and computation

In [15]:
#inspect history
chat.get_history()

[UserContent(parts=[Part(video_metadata=None, thought=None, inline_data=None, file_data=None, thought_signature=None, code_execution_result=None, executable_code=None, function_call=None, function_response=None, text='Hello!')], role='user'),
 Content(parts=[Part(video_metadata=None, thought=None, inline_data=None, file_data=None, thought_signature=None, code_execution_result=None, executable_code=None, function_call=None, function_response=None, text='Hello! How can I help you today?')], role='model'),
 UserContent(parts=[Part(video_metadata=None, thought=None, inline_data=None, file_data=None, thought_signature=None, code_execution_result=None, executable_code=None, function_call=None, function_response=None, text='Can you tell me about Gemini models?')], role='user'),
 Content(parts=[Part(video_metadata=None, thought=None, inline_data=None, file_data=None, thought_signature=None, code_execution_result=None, executable_code=None, function_call=None, function_response=None, text='Gemi
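
The raw history is hard to read because every `Part` prints all of its empty fields. A small sketch of a formatter follows; it relies only on the `role`, `parts`, and `text` attributes visible in the output above. `history_to_transcript` is a hypothetical helper, and the demo uses `SimpleNamespace` stand-ins rather than real SDK objects so that it runs without an API key.

```python
from types import SimpleNamespace


def history_to_transcript(history, max_chars=80):
    """Render Content-like objects as 'role: text' lines (hypothetical helper)."""
    lines = []
    for content in history:
        # Join only the parts that carry text; other part types have text=None.
        text = "".join(
            part.text for part in (content.parts or []) if getattr(part, "text", None)
        )
        if len(text) > max_chars:
            text = text[:max_chars] + "..."
        lines.append(f"{content.role}: {text}")
    return lines


# Stand-in objects that mimic the role/parts/text attributes seen above.
demo = [
    SimpleNamespace(role="user", parts=[SimpleNamespace(text="Hello!")]),
    SimpleNamespace(
        role="model",
        parts=[SimpleNamespace(text="Hello! How can I help you today?")],
    ),
]
print("\n".join(history_to_transcript(demo)))
```

With a live session, the same function could be called as `history_to_transcript(chat.get_history())`.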

In [21]:
print(f"Length of the chat: {len(chat.get_history())}")

Length of the chat: 16


In [20]:
response1 = chat.send_message("Hello, I have a question.")
response2 = chat.send_message("What is the capital of France?")
response3 = chat.send_message("And what is its population?")

In [24]:
print(f"Original history length: {len(chat.get_history(curated=True))}")


Original history length: 16


In [25]:
#Step 4: Get the curated history
original_history = chat.get_history(curated=True)

# Truncate the history to keep the last 2 turns
# Each turn has a user message and a model response
turns_to_keep = 2
messages_to_keep = turns_to_keep * 2
truncated_history = original_history[-messages_to_keep:]

#Create a new chat session with the truncated history
new_chat = client.chats.create(
    model=model_name,
    history=truncated_history) # Start with the recent history

print(f"New history length: {len(new_chat.get_history(curated=True))}")

# Continue the conversation with the new chat object
# This message will only have the context of the last two turns
response4 = new_chat.send_message("Thank you!")

New history length: 4
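
The slicing used above has one edge case worth guarding: with `turns_to_keep = 0`, the expression `history[-0:]` returns the whole list rather than an empty one. A minimal sketch of a reusable helper, assuming the curated history strictly alternates user and model messages; `keep_last_turns` is a hypothetical name, not an SDK function.

```python
def keep_last_turns(history, turns):
    """Return the last `turns` user/model pairs of a history list (hypothetical helper)."""
    if turns < 0:
        raise ValueError("turns must be non-negative")
    if turns == 0:
        # history[-0:] would return the entire list, so handle zero explicitly.
        return []
    # Each turn is two messages: one from the user and one from the model.
    return list(history[-2 * turns:])


# Demo on plain integers standing in for the 16 messages of the original chat.
print(keep_last_turns(list(range(16)), 2))
```

The result could be passed as `history=` to `client.chats.create`, exactly as `truncated_history` is above.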


In [26]:
print(f"New history length: {len(new_chat.get_history(curated=True))}")

New history length: 6


In [27]:
#step 5: tune responses with GenerateContentConfig
#these parameters control how the model samples its output text

#import types, which provides GenerateContentConfig
from google.genai import types

#single-turn request (generate_content)
response = client.models.generate_content(
    model=model_name,
    contents = ["Write a short story about a curious robot."],
    config = types.GenerateContentConfig(
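        # Sampling parameters
        temperature=0.9,   # higher values give more varied output
        top_p=0.95,        # nucleus sampling: cumulative probability cutoff
        top_k=40,          # sample only from the 40 most likely tokens
        #max_output_tokens=1024,  # uncomment to cap the response length
        candidate_count=1  # number of responses to generate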
    )
)
print(response.text)


Unit 734, designated "Spark" by the factory's internal network for its erratic, almost-like-a-spark-of-life energy signature, was designed for precision welding. Day in, day out, its multi-jointed arm arced with controlled bursts of plasma, fusing titanium panels into larger, less interesting components. Its optical sensors processed schematics, its processing core hummed with optimal efficiency, and its logic gates clicked through billions of calculations a second.

One cycle, however, something deviated. A hairline crack in the concrete floor, previously cataloged as 'structural imperfection, non-critical,' caught Spark's attention during a routine scan for micro-debris. But it wasn't the crack itself. From within it, a tiny, defiant speck of green pushed upwards.

Spark's internal chronometer registered the anomaly. Its protocols dictated 'ignore biological matter, non-threat,' but something... didn't compute. Its optical sensors zoomed in, processing the minute chlorophyll structur
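
To build intuition for `temperature`, `top_k`, and `top_p`, the sketch below applies the three filters to a toy set of token scores. It is a generic illustration of these sampling techniques under an assumed order of operations (temperature, then top-k, then top-p); it is not a description of how Gemini implements them. The function names and the toy logits are made up for the example.

```python
import math
import random


def filter_distribution(logits, temperature=0.9, top_k=40, top_p=0.95):
    """Return {token: probability} after temperature, top-k and top-p filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    # Temperature scaling: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(((w / total, tok) for tok, w in weights.items()), reverse=True)
    # Top-k: keep only the k most probable tokens.
    ranked = ranked[:top_k]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for prob, tok in ranked:
        kept.append((prob, tok))
        cumulative += prob
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(prob for prob, _ in kept)
    return {tok: prob / norm for prob, tok in kept}


def sample(logits, rng=random, **kwargs):
    """Draw one token from the filtered distribution."""
    dist = filter_distribution(logits, **kwargs)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


toy_logits = {"robot": 2.0, "garden": 1.0, "welding": 0.0, "plasma": -1.0}
print(filter_distribution(toy_logits, temperature=1.0, top_k=2, top_p=1.0))
```

Lowering `top_k` or `top_p` shrinks the set of candidate tokens, while raising `temperature` spreads probability more evenly across whatever candidates remain.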