In [None]:
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Open Source Models (Gemma) as an Agent with Agentspace

<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/search/agentspace/oss_model_with_agentspace.ipynb">
      <img width="32px" src="https://www.gstatic.com/pantheon/images/bigquery/welcome_page/colab-logo.svg" alt="Google Colaboratory logo"><br> Open in Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fsearch%2Fagentspace%2Foss_model_with_agentspace.ipynb">
      <img width="32px" src="https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN" alt="Google Cloud Colab Enterprise logo"><br> Open in Colab Enterprise
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/search/agentspace/oss_model_with_agentspace.ipynb">
      <img src="https://www.gstatic.com/images/branding/gcpiconscolors/vertexai/v1/32px.svg" alt="Vertex AI logo"><br> Open in Vertex AI Workbench
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/search/agentspace/oss_model_with_agentspace.ipynb">
      <img width="32px" src="https://www.svgrepo.com/download/217753/github.svg" alt="GitHub logo"><br> View on GitHub
    </a>
  </td>
</table>



<br>
<br>
<br>

<div style="clear: both;"></div>


| Author |
| --- |
| Parag Mhatre |

## Building Agents with Open-Source Models and Agentspace

This repository contains a comprehensive notebook that provides an end-to-end guide for deploying open-source Large Language Models (LLMs), like Gemma, and building a conversational agent that interacts with the deployed model.

### Overview 📝

This notebook walks you through the entire lifecycle of creating and deploying a custom agent powered by an open-source LLM. The key stages covered are:

1. Model Deployment: Deploying an open-source model (Gemma) to a serving endpoint.

2. Agent Construction: Building an agent that uses the deployed model.

3. Root Agent Configuration: Providing two distinct options for the agent's core logic:

   3.1. Option 1: Using a Google Gemini model as the primary "root agent" to orchestrate tasks and interact with the specialized Gemma model.

   3.2. Option 2: Using LiteLLM to enable the deployed Gemma model itself to function as the root agent, offering a fully open-source-based solution.

4. Testing & Deployment: Testing the agent locally with the Agent Development Kit (ADK), deploying it to the Agent Engine, and finally registering it in Agentspace for user access.

### Background 💡
The demand for specialized, fine-tuned LLMs is rapidly growing. Organizations are increasingly looking to leverage open-source models, training them on proprietary data to create experts in specific domains like customer support, internal documentation, or financial analysis.

However, a key challenge remains: how to make these powerful, fine-tuned models easily accessible to business users? Simply deploying a model isn't enough. You need a robust, interactive, and discoverable interface. This is where agents come in. An agent acts as a smart layer on top of the model, enabling seamless conversation and integration with other tools.

This notebook provides a practical blueprint for bridging that gap—from deploying a custom open-source model to making it available as a fully functional agent in Agentspace.

### Business Scenarios 🏢
This solution is ideal for organizations that want to:

1. Create Expert Chatbots: Fine-tune a model like Gemma on an internal knowledge base (e.g., HR policies, technical documentation, product specs). The resulting agent can then provide instant, accurate answers to employee or customer queries.

2. Develop Specialized Assistants: Build an agent trained on financial reports or market data to assist analysts with data retrieval and summarization.

3. Ensure Data Privacy and Control: Use open-source models hosted within their own cloud environment to maintain full control over sensitive data, instead of relying on external model APIs.

4. Reduce Costs: Leverage powerful, cost-effective open-source models as an alternative to proprietary, closed-source options.

### Notebook Steps 🚀
The notebook is structured into a clear, step-by-step process. Each section contains the necessary code and explanations to guide you through the implementation.

#### Step 1: Setup and Prerequisites
> This initial step involves importing the required libraries and configuring your project environment variables. It ensures all dependencies are in place before you begin.

#### Step 2: Deploying the Open-Source Model (Gemma)
> Here, you will take the open-source Gemma model and deploy it to a serving endpoint (a GPU-backed Cloud Run service in this notebook). This makes the model available to receive requests via an API, a crucial first step for agent integration.

#### Step 3: Configuring the Root Agent
> This is a critical decision point where you choose the "brain" of your agent. The notebook provides two paths:

> **Option 1: Using Gemini as the Root Agent**

> In this configuration, the powerful, multi-modal Gemini model acts as the primary orchestrator. It understands the user's intent and can decide when to call the specialized, deployed Gemma model for specific tasks. This is a great hybrid approach that combines the broad capabilities of Gemini with the specialized knowledge of your fine-tuned model.

> **Option 2: Using Gemma as the Root Agent via LiteLLM**

> This approach uses LiteLLM, a clever library that creates a standardized interface for interacting with various LLMs. We use it to wrap our deployed Gemma model, allowing it to serve as the root agent itself. This is the perfect choice for creating a solution built entirely on open-source components.

#### Step 4: Building and Testing the Agent with ADK
> With the model deployed and the root agent configured, you will now formally build the agent. The notebook then guides you through using the Agent Development Kit (ADK) to run the agent locally. This allows for rapid testing and debugging to ensure the agent behaves as expected before a full deployment.

#### Step 5: Deploying the Agent to Agent Engine
> Once the agent is validated locally, this step shows you how to deploy it to the scalable, managed Agent Engine. This moves your agent from a local test environment to a production-ready platform.

#### Step 6: Registering the Agent in Agentspace
> In the final step, you will register your deployed agent with Agentspace. This makes the agent discoverable and accessible to authorized end-users within your organization, allowing them to easily find and interact with your new, custom-built assistant.

#### Step 1: Setup and Prerequisites

In [None]:
# TODO for Developer: Update the project number.
PROJECT_NUMBER = "[your-project-number]"

# TODO for Developer: Update the project ID.
PROJECT_ID = "[your-project-id]"

GEMMA_PARAMETER = "gemma3-1b"
REGION = "us-central1"
SERVICE_NAME = "gemma-3-1b"

In [None]:
# Set the active Google Cloud project.
!gcloud config set project {PROJECT_ID}
!gcloud config get-value project

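# Enable the services used in this notebook (IAM, Cloud Run, Vertex AI, Discovery Engine).
!gcloud services enable iam.googleapis.com run.googleapis.com aiplatform.googleapis.com discoveryengine.googleapis.com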

#### Step 2: Deploying the Open-Source Model (Gemma)

In [None]:
!gcloud run deploy {SERVICE_NAME} \
   --image us-docker.pkg.dev/cloudrun/container/gemma/{GEMMA_PARAMETER} \
   --concurrency 4 \
   --cpu 8 \
   --set-env-vars OLLAMA_NUM_PARALLEL=4 \
   --gpu 1 \
   --gpu-type nvidia-l4 \
   --max-instances 1 \
   --memory 32Gi \
   --no-allow-unauthenticated \
   --no-cpu-throttling \
   --timeout=600 \
   --region {REGION} \
   --no-gpu-zonal-redundancy
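
The later cells need the HTTPS URL of the deployed service. The sketch below assumes the deterministic Cloud Run URL pattern (`https://SERVICE-PROJECT_NUMBER.REGION.run.app`), which has the same shape as the `api_base_url` used in Step 3. The helper name `cloud_run_url` and the sample project number are hypothetical. If `gcloud run deploy` prints a different URL for your service, prefer that one.

```python
def cloud_run_url(service_name: str, project_number: str, region: str) -> str:
    """Build the assumed deterministic Cloud Run URL for a service."""
    fields = {
        "service_name": service_name,
        "project_number": project_number,
        "region": region,
    }
    for label, value in fields.items():
        # Catch template placeholders such as "[your-project-number]" early.
        if not value or value.startswith("["):
            raise ValueError(f"{label} still looks like a placeholder: {value!r}")
    return f"https://{service_name}-{project_number}.{region}.run.app"


# Hypothetical project number, used only for illustration.
print(cloud_run_url("gemma-3-1b", "123456789012", "us-central1"))
```

Rejecting bracketed placeholders up front turns a forgotten configuration value into an immediate error instead of a confusing authentication failure later.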

#### Step 3: Configuring the Root Agent

In [None]:
%pip install --quiet google-adk==1.7.0 litellm==1.74.7 ollama==0.5.1

In [None]:
from google.adk.agents import LlmAgent
import google.auth.transport.requests
from google.genai import types
import google.oauth2.id_token
from ollama import Client

# Cloud Run service URL, built from the values defined in Step 1.
api_base_url = f"https://{SERVICE_NAME}-{PROJECT_NUMBER}.{REGION}.run.app"
model_name_at_endpoint = "ollama/gemma3:1b"

# Create a Google Auth request object
auth_req = google.auth.transport.requests.Request()

# Generate the ID token
id_token = google.oauth2.id_token.fetch_id_token(auth_req, api_base_url)

# Build the Authorization header that is sent with every request to the service
auth_headers = {"Authorization": f"Bearer {id_token}"}


def get_answer(question: str) -> str:
    """
    Answer the question using the deployed Gemma model.

    Args:
        question: The user's question.

    Returns:
        The model's answer, or a fallback message if the call fails.
    """
    try:
        client = Client(host=api_base_url, headers=auth_headers)
        response = client.chat(
            model="gemma3:1b",
            messages=[
                {
                    "role": "user",
                    "content": question,
                },
            ],
        )

        return response.message.content
    except Exception as e:
        print(e)
        return "Not able to provide answer."


# Option 2: Use Gemma as the root agent through LiteLLM.
# To use it, uncomment the lines below and pass `model=model` to LlmAgent.
# from google.adk.models.lite_llm import LiteLlm
# model = LiteLlm(
#     model=model_name_at_endpoint,
#     api_base=api_base_url,
#     extra_headers=auth_headers,
# )

# Option 1: Use Gemini as the root agent and call the deployed Gemma model through the `get_answer` tool.

gemma_agent = LlmAgent(
    model="gemini-2.5-pro",
    name="gemma_agent",
    instruction="""1. You are a helpful assistant. Your task is to answer the question.
                   2. Only answer the question asked by the user. Do not ask follow-up questions.
                   3. If the tool call produces no answer, reply 'Sorry, answer can't be generated.'.
                   4. Only answer based on the context provided by the tool. Do not add additional text. Preferably return the answer as is.
                    """,
    tools=[get_answer],
)
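
The cell above fetches the ID token once, when the cell runs. Google-issued ID tokens are short-lived (typically about one hour), so a long-running agent may eventually send an expired token. A minimal sketch of one way to handle this, assuming a small cache that refetches the token shortly before it expires, is shown below. The class name `IdTokenCache`, its parameters, and the stand-in fetcher are all hypothetical and not part of the notebook or of any Google library.

```python
import time
from typing import Callable, Optional


class IdTokenCache:
    """Caches an ID token and refetches it shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], str],
        ttl_seconds: float = 3600.0,  # assumed token lifetime
        margin_seconds: float = 300.0,  # refresh this long before expiry
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._margin = margin_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def headers(self) -> dict:
        """Return Authorization headers, refreshing the token when needed."""
        now = self._clock()
        stale = self._token is None or (now - self._fetched_at) >= (self._ttl - self._margin)
        if stale:
            self._token = self._fetch()
            self._fetched_at = now
        return {"Authorization": f"Bearer {self._token}"}


# Demo with a stand-in fetcher so the sketch runs without credentials.
_calls = []


def _fake_fetch() -> str:
    _calls.append(1)
    return f"token-{len(_calls)}"


cache = IdTokenCache(_fake_fetch)
print(cache.headers())
```

In the notebook, the fetcher could be `lambda: google.oauth2.id_token.fetch_id_token(auth_req, api_base_url)`, and `get_answer` could call `cache.headers()` each time it creates the `Client` instead of reusing the module-level `auth_headers`.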

#### Step 4: Building and Testing the Agent with ADK

In [None]:
from vertexai.preview import reasoning_engines

app = reasoning_engines.AdkApp(
    agent=gemma_agent,
    enable_tracing=True,
)

In [None]:
session = app.create_session(user_id="u_1232")
session

In [None]:
query = "Where is India located?"
contents = types.Content(role="user", parts=[types.Part.from_text(text=query)])

In [None]:
contents

In [None]:
for event in app.stream_query(
    user_id="u_1232",
    session_id=session.id,
    message=contents.model_dump(),
):
    print(event)
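
Printing raw events is useful for debugging, but most callers only want the final answer text. The sketch below assumes each streamed event is a dictionary with a `content` entry holding a `role` and a list of `parts`, where text parts carry a `text` key and tool calls carry other keys. That shape is an assumption about the event format, so print one event from your own run and adjust the keys if they differ. The helper name `collect_text` and the sample events are hypothetical.

```python
from typing import Any, Dict, Iterable


def collect_text(events: Iterable[Dict[str, Any]]) -> str:
    """Concatenate the text parts of model-authored events, skipping tool traffic."""
    chunks = []
    for event in events:
        content = event.get("content") or {}
        if content.get("role") != "model":
            continue  # ignore tool responses and other non-model events
        for part in content.get("parts") or []:
            text = part.get("text")
            if text:  # function_call parts have no "text" key
                chunks.append(text)
    return "".join(chunks)


# Hypothetical events that mimic a tool call followed by a final answer.
sample_events = [
    {"content": {"role": "model", "parts": [{"function_call": {"name": "get_answer"}}]}},
    {"content": {"role": "user", "parts": [{"function_response": {"name": "get_answer"}}]}},
    {"content": {"role": "model", "parts": [{"text": "India is located in South Asia."}]}},
]
print(collect_text(sample_events))
```

With the notebook's objects, the same helper could be called as `collect_text(app.stream_query(user_id="u_1232", session_id=session.id, message=contents.model_dump()))`.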

#### Step 5: Deploying the Agent to Agent Engine

In [None]:
import vertexai

LOCATION = "us-central1"  # TODO for Developer: Update the region here.
STAGING_BUCKET = "gs://[bucket-name]"  # TODO for Developer: Update the GCS bucket name here.
vertexai.init(project=PROJECT_NUMBER, location=LOCATION, staging_bucket=STAGING_BUCKET)

In [None]:
from vertexai import agent_engines

remote_app = agent_engines.create(
    display_name="Gemma Agent v7",
    agent_engine=app,
    requirements=[
        "litellm (==1.74.7)",
        "google-adk (==1.7.0)",
        "google-genai (==1.24.0)",
        "pydantic (==2.11.7)",
        "ollama (==0.5.1)",
    ],
)

In [None]:
remote_app.resource_name
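
The resource name printed above is what Step 6 expects in the `reasoning_engine` field of the registration request. A minimal sketch for validating it and pulling out its components follows, assuming the `projects/.../locations/.../reasoningEngines/...` layout that appears in the Step 6 payload. The helper name and the sample IDs are hypothetical.

```python
import re

_ENGINE_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/reasoningEngines/(?P<engine_id>[^/]+)$"
)


def parse_reasoning_engine_name(resource_name: str) -> dict:
    """Split a reasoning engine resource name into its components."""
    match = _ENGINE_NAME.match(resource_name)
    if match is None:
        raise ValueError(f"Unexpected resource name: {resource_name!r}")
    return match.groupdict()


# Hypothetical resource name, used only for illustration.
parts = parse_reasoning_engine_name(
    "projects/123456789012/locations/us-central1/reasoningEngines/9876543210"
)
print(parts["engine_id"])
```

In the notebook, pass `remote_app.resource_name` to the helper to recover the engine ID for the registration request.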

#### Step 6: Registering the Agent in Agentspace

**Configure the OAuth consent screen and create an OAuth client, then fill in its `clientId` and `clientSecret` in the request below.**

In [None]:
%%bash

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-H "X-Goog-User-Project: [your-project-id]" \
"https://discoveryengine.googleapis.com/v1alpha/projects/[your-project-id]/locations/global/authorizations?authorizationId=customhr9893" \
-d '{
"name": "projects/[your-project-id]/locations/global/authorizations/customhr9893",
"serverSideOauth2": {
"clientId": "[UPDATE-CLIENT-ID]",
"clientSecret": "[UPDATE-CLIENT-SECRET]",
"authorizationUri": "https://accounts.google.com/o/oauth2/v2/auth?client_id=[UPDATE-CLIENT-ID]&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform&include_granted_scopes=true&response_type=code&access_type=offline&prompt=consent",
"tokenUri": "https://oauth2.googleapis.com/token"
}
}'
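
The `authorizationUri` in the request above is long and easy to mistype, because the scope has to be URL-encoded by hand. As a sketch, the same URI can be assembled with the standard library, using exactly the query parameters that appear in the request. The function name `build_authorization_uri` and the sample client ID are hypothetical.

```python
from urllib.parse import urlencode

AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"


def build_authorization_uri(client_id: str) -> str:
    """Assemble the authorization URI with URL-encoded query parameters."""
    params = {
        "client_id": client_id,
        "scope": "https://www.googleapis.com/auth/cloud-platform",
        "include_granted_scopes": "true",
        "response_type": "code",
        "access_type": "offline",
        "prompt": "consent",
    }
    # urlencode percent-encodes ":" and "/" in the scope value.
    return f"{AUTH_ENDPOINT}?{urlencode(params)}"


# Hypothetical client ID, used only for illustration.
print(build_authorization_uri("example-client.apps.googleusercontent.com"))
```

The printed value can be pasted into the `authorizationUri` field in place of the hand-written string.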

In [None]:
%%bash

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-H "X-Goog-User-Project: [your-project-number]" \
"https://discoveryengine.googleapis.com/v1alpha/projects/[your-project-number]/locations/global/collections/default_collection/engines/[your-agentspace-engine-id]/assistants/default_assistant/agents" \
-d '{
"displayName": "Gemma Agent v7",
"description": "Gemma Agent to answer questions.",
"adk_agent_definition": {
"tool_settings": {
"tool_description": "Gemma Agent to answer questions."
},
"provisioned_reasoning_engine": {
"reasoning_engine": "projects/[your-project-number]/locations/us-central1/reasoningEngines/[reasoning-engine-id]"
},
"authorizations": ["projects/[your-project-number]/locations/global/authorizations/customhr9893"]
}
}'
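
The registration body has to reference both the reasoning engine deployed in Step 5 and the authorization resource created in the previous request, and a mismatch between those names is easy to miss in a hand-edited JSON string. The sketch below builds the same body in Python, mirroring the field names used in the `curl` request above. The function name `build_agent_registration` and the sample resource names are hypothetical, and the sanity check is illustrative rather than a rule enforced by the API.

```python
import json


def build_agent_registration(
    display_name: str,
    description: str,
    reasoning_engine: str,
    authorization: str,
) -> dict:
    """Build the agent registration body with the same fields as the curl request."""
    for label, name in (("reasoning_engine", reasoning_engine), ("authorization", authorization)):
        # Illustrative sanity check: both values should be full resource names.
        if not name.startswith("projects/"):
            raise ValueError(f"{label} should be a full resource name, got {name!r}")
    return {
        "displayName": display_name,
        "description": description,
        "adk_agent_definition": {
            "tool_settings": {"tool_description": description},
            "provisioned_reasoning_engine": {"reasoning_engine": reasoning_engine},
            "authorizations": [authorization],
        },
    }


# Hypothetical resource names, used only for illustration.
body = build_agent_registration(
    display_name="Gemma Agent v7",
    description="Gemma Agent to answer questions.",
    reasoning_engine="projects/123456789012/locations/us-central1/reasoningEngines/9876543210",
    authorization="projects/123456789012/locations/global/authorizations/customhr9893",
)
print(json.dumps(body, indent=2))
```

The printed JSON can be used as the `-d` payload of the registration request.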

#### After the agent is registered, you can interact with it in Agentspace.

![oss_model_with_agentspace](https://services.google.com/fh/files/misc/oss_model_with_agentspace_v2.png)