# Vertex AI with Gemini 2.5 Flash using LangChain

This notebook shows how to call Google's Gemini 2.5 Flash model on Vertex AI from LangChain, using the `ChatGoogleGenerativeAI` class from `langchain-google-genai` with `vertexai=True`.

In [16]:
# Install required packages (uncomment if needed)
# !pip install langchain-google-genai google-cloud-aiplatform python-dotenv

In [17]:
import os
from dotenv import load_dotenv
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.messages import HumanMessage, SystemMessage

# Load environment variables from .env file
load_dotenv()

# Configure Vertex AI from environment variables
project_id = os.getenv("GOOGLE_CLOUD_PROJECT_ID")
location = os.getenv("GOOGLE_CLOUD_LOCATION")
credentials_path = os.getenv("GOOGLE_APPLICATION_CREDENTIALS")

# Validate required environment variables
if not project_id:
    raise ValueError("GOOGLE_CLOUD_PROJECT_ID environment variable is not set")
if not location:
    raise ValueError("GOOGLE_CLOUD_LOCATION environment variable is not set")
if not credentials_path:
    raise ValueError("GOOGLE_APPLICATION_CREDENTIALS environment variable is not set")

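# Google client libraries read GOOGLE_APPLICATION_CREDENTIALS from the environment.
# load_dotenv() has already exported it, so this assignment only makes that explicit.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = credentials_path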

print("✅ Vertex AI Configuration:")
print(f"   Project ID: {project_id}")
print(f"   Location: {location}")
print(f"   Credentials: {credentials_path}")
print("   Model: gemini-2.5-flash")  # must match the model passed to ChatGoogleGenerativeAI

✅ Vertex AI Configuration:
   Project ID: notebooklm-clone-483513
   Location: asia-south1
   Credentials: /Users/bhaskars/Developer/Bhaskar/notebooklm-clone/notebooklm-clone-483513-14d51a29f057.json
   Model: gemini-2.5-flash
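
The validation in the cell above stops at the first missing variable. As a minimal sketch, the same check can report every missing variable in one error. The helper name `load_required_env` and its `environ` parameter are hypothetical, introduced only for illustration; the variable names are the ones this notebook already reads.

```python
import os

# The three variables this notebook reads from the environment / .env file.
REQUIRED_VARS = (
    "GOOGLE_CLOUD_PROJECT_ID",
    "GOOGLE_CLOUD_LOCATION",
    "GOOGLE_APPLICATION_CREDENTIALS",
)


def load_required_env(names=REQUIRED_VARS, environ=None):
    """Return {name: value} for every name, or raise listing all missing ones.

    `environ` defaults to os.environ; passing a plain dict makes the helper
    easy to try out without touching the real environment.
    """
    environ = os.environ if environ is None else environ
    # Treat unset and empty values the same way, as the original checks do.
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}
```

After `load_dotenv()`, calling `load_required_env()` with no arguments would perform the same three checks as the `if not ...` statements, but a single run surfaces everything that still needs to be set.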


In [24]:
# Initialize the Gemini 2.5 Flash model.
# Flash is a balanced-performance model with thinking capabilities; it supports
# structured output, function calling, and grounding.
# vertexai=True routes ChatGoogleGenerativeAI through the Vertex AI backend.

llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    project=project_id,
    location=location,
    vertexai=True,  # Use Vertex AI backend
    temperature=0.7,
    max_tokens=8192,
)

print("✅ Initialized ChatGoogleGenerativeAI with gemini-2.5-flash (Vertex AI backend)")

# Example 1: Basic chat interaction
messages = [
    HumanMessage(content="What is the capital of France?")
]

response = llm.invoke(messages)
print("Example 1 - Basic Question:")
print(response.content)
print("\n" + "="*50 + "\n")

✅ Initialized ChatGoogleGenerativeAI with gemini-2.5-flash (Vertex AI backend)
Example 1 - Basic Question:
The capital of France is **Paris**.
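
`invoke()` returns a message object, and `.content` is only one of its fields. LangChain chat messages can also carry a `usage_metadata` mapping with token counts. The sketch below reads it defensively, assuming the integration populates it for this model; `summarize_usage` is a hypothetical helper, and `fake_response` is a stand-in with made-up numbers so the cell runs without an API call.

```python
from types import SimpleNamespace


def summarize_usage(response):
    """Return token counts from a chat response, with None for missing fields."""
    # usage_metadata may be absent or None, so fall back to an empty mapping.
    usage = getattr(response, "usage_metadata", None) or {}
    keys = ("input_tokens", "output_tokens", "total_tokens")
    return {key: usage.get(key) for key in keys}


# Stand-in object with illustrative (not measured) token counts.
fake_response = SimpleNamespace(
    content="The capital of France is **Paris**.",
    usage_metadata={"input_tokens": 7, "output_tokens": 8, "total_tokens": 15},
)
print(summarize_usage(fake_response))
```

With the real model, `summarize_usage(response)` can be called right after `response = llm.invoke(messages)`.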




In [8]:
# Example 2: Conversation with system message
messages = [
    SystemMessage(content="You are a helpful assistant that explains concepts clearly and concisely."),
    HumanMessage(content="Explain quantum computing in simple terms.")
]

response = llm.invoke(messages)
print("Example 2 - System Message + Question:")
print(response.content)
print("\n" + "="*50 + "\n")

Example 2 - System Message + Question:
Imagine a regular computer as a light switch that can be either **ON (1)** or **OFF (0)**.

Quantum computing is like having a light switch that can be:

1.  **ON (1)**
2.  **OFF (0)**
3.  **ON and OFF at the same time (Superposition)**: This is the first mind-bending part. A quantum bit, or **qubit**, can exist in a combination of both states simultaneously, like a spinning coin that hasn't landed yet. This means it can store much more information than a regular bit.

But wait, there's more!

4.  **Entanglement**: This is where things get really weird and powerful. If you have two entangled qubits, they become linked in a special way. Whatever state one qubit is in, the other instantly knows and reacts, no matter how far apart they are. It's like having two magic coins that, even if flipped in different rooms, always land on opposite sides (or the same side, depending on how they're entangled). This allows quantum computers to perform calculation

In [9]:
# Example 3: Multi-turn conversation
conversation = [
    HumanMessage(content="What is Python?"),
]

response1 = llm.invoke(conversation)
print("User: What is Python?")
print(f"Assistant: {response1.content}\n")

# Add the assistant's response and continue the conversation
conversation.append(response1)
conversation.append(HumanMessage(content="Can you give me a simple example?"))

response2 = llm.invoke(conversation)
print("User: Can you give me a simple example?")
print(f"Assistant: {response2.content}")

User: What is Python?
Assistant: Python is a **high-level, interpreted, general-purpose programming language** created by Guido van Rossum and first released in 1991.

It's known for its **simplicity, readability, and versatility**, making it one of the most popular programming languages in the world today.

Here's a breakdown of what that means and why it's so widely used:

1.  **High-Level:**
    *   This means you don't have to worry about low-level details like managing computer memory directly. Python handles many complex tasks automatically, allowing you to focus on solving the problem at hand.
    *   Its syntax is designed to be very readable and often resembles natural English, making it easier to learn and write.

2.  **Interpreted:**
    *   Unlike compiled languages (like C++ or Java), Python code is executed line by line by an interpreter at runtime, rather than being fully translated into machine code before execution.
    *   This makes the development process faster, as
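
Example 3 maintains the history by hand: append the reply, append the next question, invoke again. That pattern can be wrapped in a small helper, sketched below. `chat_turn` and `EchoModel` are hypothetical names; `EchoModel` is a fake with the same `invoke()` shape, used so the sketch runs offline. The sketch relies on LangChain chat models accepting `(role, content)` tuples in place of message objects.

```python
from types import SimpleNamespace


def chat_turn(model, history, user_text):
    """Append the user turn, invoke the model on the full history, append the reply."""
    history.append(("human", user_text))
    reply = model.invoke(history)
    history.append(reply)  # keep the reply so the next turn sees it as context
    return reply


class EchoModel:
    """Offline stand-in for `llm`: reports how many messages it received."""

    def invoke(self, messages):
        return SimpleNamespace(content=f"received {len(messages)} message(s)")


history = []
first = chat_turn(EchoModel(), history, "What is Python?")
second = chat_turn(EchoModel(), history, "Can you give me a simple example?")
print(first.content)
print(second.content)
```

With the real model, `chat_turn(llm, conversation, "Can you give me a simple example?")` would replace the two `append` calls and the `invoke` call in the cell above. The history grows with every turn, so long conversations eventually need trimming or summarizing.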

In [11]:
# Example 4: Streaming response
# Streaming yields the response in chunks as it is generated, which improves
# perceived latency in interactive and real-time applications.

# Define the question once so the printed label always matches the prompt sent.
question = "Explain the concept of machine learning."

print("Example 4 - Streaming Response:")
print("=" * 50)
print(f"\nQuestion: {question}\n")
print("Streaming response:\n")

messages = [
    HumanMessage(content=question)
]

# Use stream() instead of invoke() for streaming responses
for chunk in llm.stream(messages):
    # Print each chunk as it arrives (no newline, so the text flows continuously)
    print(chunk.content, end="", flush=True)

print("\n\n" + "=" * 50)

Example 4 - Streaming Response:

Question: Explain the concept of machine learning.

Streaming response:

Machine Learning (ML) is a subfield of Artificial Intelligence (AI) that enables systems to **learn from data, identify patterns, and make decisions or predictions with minimal human intervention.**

Think of it this way: Instead of explicitly programming a computer with a set of rules for every possible scenario, you give it lots of data and an algorithm, and the computer figures out the rules *itself*.

Here's a breakdown of the core concepts:

1.  **The Problem ML Solves:**
    *   **Traditional Programming:** You write specific, step-by-step instructions for the computer to follow. For example, "If the email subject contains 'VIAGRA' AND the sender is unknown, then mark as spam." This works well for simple, well-defined tasks.
    *   **Limitations:** What if the rules are too complex, constantly changing, or unknown? How would you write rules to recognize a ca
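
The streaming loop above prints each chunk and then discards it. When the full text is also needed afterwards (for logging, or to append to a conversation history), the chunks can be accumulated while they are displayed. The following is a minimal sketch: `collect_stream` is a hypothetical helper, and `fake_chunks` stands in for `llm.stream(messages)` so the cell runs without an API call.

```python
from types import SimpleNamespace


def collect_stream(chunks, echo=True):
    """Print chunks as they arrive and return the concatenated text."""
    parts = []
    for chunk in chunks:
        # Chunk content is usually a string; non-string content (for example
        # a list of content blocks) is skipped in this simplified sketch.
        text = chunk.content if isinstance(chunk.content, str) else ""
        if echo:
            print(text, end="", flush=True)
        parts.append(text)
    if echo:
        print()  # finish the line once the stream ends
    return "".join(parts)


# Offline stand-in for the chunk iterator returned by llm.stream(messages).
fake_chunks = [SimpleNamespace(content=part) for part in ("Machine ", "learning ", "...")]
full_text = collect_stream(fake_chunks)
```

With the real model, `full_text = collect_stream(llm.stream(messages))` would show the same live output as the loop above while keeping the complete response in a variable.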