# Image Analysis Agent with GCS Artifact Persistence and Vertex AI Agent Engine Deployment

This notebook demonstrates how to build and deploy an image analysis application using the **Google Agent Development Kit (ADK)** and **Vertex AI Agent Engine**. It specifically showcases how to handle persistent image data using **Google Cloud Storage (GCS)** as an artifact service.

---

### Key Components

* **LlmAgent (Root Agent):** Configured with `gemini-2.5-flash`. It is given specific instructions to analyze images and is equipped with the `load_artifacts` tool to retrieve image content when it isn't directly in the prompt context.
* **GcsArtifactService:** Manages the storage of files in a GCS bucket. This allows the application to handle large files (like 4K images) or multiple files without bloating the immediate chat history.
* **SaveFilesAsArtifactsPlugin:** An ADK plugin that automatically intercepts files sent in a message and saves them as managed artifacts in the GCS bucket.
* **AdkApp:** The core application wrapper that integrates the agent, the artifact service, and the plugins into a single deployable unit.

### Workflow Summary

1. **Local Execution:** The user sends a base64-encoded image to the `AdkApp` via a `ChatClient`.
    * The `SaveFilesAsArtifactsPlugin` saves this image to GCS.
    * The agent uses the `load_artifacts` tool to "see" the image, then provides a detailed description.
    * In subsequent turns, the agent maintains context, correctly identifying the number of people in the image without needing the file re-uploaded.
2. **Remote Deployment:**
    * The application is packaged and deployed to **Vertex AI Agent Engine**.
    * The notebook verifies the remote deployment by running the same image analysis query against the cloud-hosted instance.

---

### Why this architecture matters

By using `GcsArtifactService`, you avoid passing massive amounts of raw data back and forth in every turn of a conversation. Instead, the data stays in storage, and the agent only "loads" what it needs, making the application more scalable and cost-effective for multi-modal tasks.
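
To give a feel for the payload sizes involved, here is a rough sketch (not part of the original notebook) of how much base64 encoding inflates an inline image. The 8 MiB size is a hypothetical stand-in for a large image, and `base64_size` is a helper introduced only for this illustration.

```python
import base64


def base64_size(num_bytes: int) -> int:
    """Length of the padded base64 text produced for num_bytes of raw data."""
    return 4 * ((num_bytes + 2) // 3)


raw = bytes(8 * 1024 * 1024)  # hypothetical 8 MiB image payload (all zeros)
encoded = base64.b64encode(raw)

# The closed-form size matches what the encoder actually produces.
assert len(encoded) == base64_size(len(raw))

overhead = len(encoded) / len(raw) - 1
print(f'raw: {len(raw)} bytes, base64: {len(encoded)} bytes, overhead: {overhead:.1%}')
```

Base64 emits four characters for every three input bytes, so an inline image costs roughly a third more than its raw size each time it is sent.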

## Install packages

In [None]:
%pip install --upgrade --user google-adk

In [None]:
# Restart the kernel so that the newly installed packages are picked up
import IPython
app = IPython.Application.instance()
_ = app.kernel.do_shutdown(True)

## Preparation

In [1]:
import base64
import os
import vertexai
from vertexai.agent_engines import AdkApp
from google.adk.agents import LlmAgent
from google.adk.artifacts import GcsArtifactService
from google.adk.plugins.save_files_as_artifacts_plugin import SaveFilesAsArtifactsPlugin
from google.adk.tools import load_artifacts
from google.genai.types import Part, Content

[PROJECT_ID] = !gcloud config list --format 'value(core.project)'
LOCATION = 'us-central1'

vertexai.init(project=PROJECT_ID, location=LOCATION)

os.environ['GOOGLE_CLOUD_PROJECT'] = PROJECT_ID
os.environ['GOOGLE_CLOUD_LOCATION'] = LOCATION
os.environ['GOOGLE_GENAI_USE_VERTEXAI'] = 'True'

BUCKET = f'{PROJECT_ID}_artifacts'
!gsutil ls -b gs://{BUCKET} 2>/dev/null || \
 gsutil mb -l {LOCATION} gs://{BUCKET}

gs://etsuji-15pro-poc_artifacts/


In [2]:
# Chat client to test AdkApp
class ChatClient:
    def __init__(self, app, user_id='default_user'):
        self._app = app
        self._user_id = user_id
        self._session_id = None
        
    async def async_stream_query(self, message):
        if not self._session_id:
            session = await self._app.async_create_session(
                user_id=self._user_id,
            )
            self._session_id = getattr(session, 'id', None) or session['id']

        result = []
        async for event in self._app.async_stream_query(
            user_id=self._user_id,
            session_id=self._session_id,
            message=message,
        ):
            print('====')
            print(event)
            print('====')
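            # Collect text parts only; tool-call events carry no text.
            # Tolerate events whose 'content' or 'parts' is missing or None.
            content = event.get('content') or {}
            response = '\n'.join(
                p['text'] for p in content.get('parts') or [] if p.get('text')
            )
            if response:
                print(response)
                result.append(response)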
        return result
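
The text-extraction step inside `async_stream_query` can be tried without a running app. The sketch below assumes events are plain dicts shaped like the outputs printed in this notebook; `extract_text` and both sample events are hypothetical and exist only for illustration.

```python
def extract_text(event: dict) -> str:
    """Join the text parts of one event; return '' for tool-call-only events."""
    parts = (event.get('content') or {}).get('parts') or []
    return '\n'.join(p['text'] for p in parts if p.get('text'))


# Hypothetical events that mimic the shapes printed by ChatClient.
tool_call_event = {
    'content': {
        'parts': [{'function_call': {'name': 'load_artifacts', 'args': {}}}],
        'role': 'model',
    }
}
text_event = {
    'content': {
        'parts': [{'text': 'There are 3 people in the image.'}],
        'role': 'model',
    }
}

print(repr(extract_text(tool_call_event)))
print(repr(extract_text(text_event)))
```

An event that only carries a `function_call` yields an empty string, which is why the client skips it when collecting results.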

## Define root agent and AdkApp

In [3]:
root_agent = LlmAgent(
    name='image_analyst_agent',
    model='gemini-2.5-flash',
    instruction='''
Your role is to analyze given image files.
Use load_artifacts() if the image content is not in the context.
''',
    tools=[load_artifacts]
)

def artifact_builder():
    return GcsArtifactService(bucket_name=BUCKET)

app = AdkApp(
    agent=root_agent,
    app_name='image_analyzer_app',
    artifact_service_builder=artifact_builder,
    plugins=[SaveFilesAsArtifactsPlugin()],
)

## Test the local AdkApp

In [4]:
def get_image_data(file_path: str):
    with open(file_path, 'rb') as f:
        image_bytes = f.read()
    return base64.b64encode(image_bytes).decode('utf-8')
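
The message dict in the next cell hard-codes `image/png`. One possible generalization, not part of the original notebook, derives the MIME type from the file name with the standard-library `mimetypes` module. `build_image_message` is a hypothetical helper, and the dict layout simply mirrors the one used in this notebook.

```python
import base64
import mimetypes


def build_image_message(file_path: str, prompt: str) -> dict:
    """Build a user message with a text prompt and one inline base64 image."""
    # Guess the MIME type from the file extension (e.g. '.png' -> 'image/png').
    mime_type, _ = mimetypes.guess_type(file_path)
    if mime_type is None or not mime_type.startswith('image/'):
        raise ValueError(f'Not a recognized image file: {file_path}')

    with open(file_path, 'rb') as f:
        data = base64.b64encode(f.read()).decode('utf-8')

    return {
        'role': 'user',
        'parts': [
            {'text': prompt},
            {'inline_data': {'mime_type': mime_type, 'data': data}},
        ],
    }
```

With this helper, `build_image_message('testimage.png', 'describe the image')` would produce the same structure as the hand-written dict.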

In [5]:
client = ChatClient(app)

image_base64 = get_image_data('testimage.png')
message_input = {
        'role': 'user',
        'parts': [
            {'text': 'describe the image'},
            {
                'inline_data': {
                    'mime_type': 'image/png',
                    'data': image_base64
                }
            }
        ]
}

# For the first turn, the agent loads the image content from artifacts into the session context.
_ = await client.async_stream_query(message_input)



====
{'model_version': 'gemini-2.5-flash', 'content': {'parts': [{'function_call': {'id': 'adk-093e1ce0-82cf-4f93-b790-efb0b59aeb96', 'args': {'artifact_names': ['artifact_e-2ee9157b-1ef6-42bb-bd64-43a5ad44786b_1']}, 'name': 'load_artifacts'}, 'thought_signature': 'CtwBAY89a1_RDGvDGuxvpttse7dYSdi7hXMmI1oV34qMvcxptMtIlLPsqzLVFopMBAwWEaSr8lwC65byq-ujYbXL2X-Dy5Pk5mgs0mImE2zYBvHNrrRBsifhlVx69znLdeNob-uInjJRzfITE1a_V-v0W2tIYHE_wC-0QbhiAvye1TWppyG6XYrcR73s9uGIvP74w9l8hepUoNR2PlwcjeSYb5DXCQlVYWOZKK9OMxPM8-BsAik4KBQ5J0fLr51lzVUU7jfF3jula-gLE3fuZB-lepe2j8S_j41OY4DuUQ=='}], 'role': 'model'}, 'finish_reason': 'STOP', 'usage_metadata': {'candidates_token_count': 43, 'candidates_tokens_details': [{'modality': 'TEXT', 'token_count': 43}], 'prompt_token_count': 1540, 'prompt_tokens_details': [{'modality': 'IMAGE', 'token_count': 1290}, {'modality': 'TEXT', 'token_count': 250}], 'thoughts_token_count': 45, 'total_token_count': 1628, 'traffic_type': 'ON_DEMAND'}, 'avg_logprobs': -0.43170449900072677, '

In [6]:
# For the second turn, the agent reuses the image content in the session context.
_ = await client.async_stream_query('The number of people in the image?')

====
{'model_version': 'gemini-2.5-flash', 'content': {'parts': [{'text': 'There are 3 people in the image.', 'thought_signature': 'CuUBAY89a198VpYvKWrt89tmsYSJP4tiUeXnnHmnjCxXdNL6_oF4vZI-0sA0hCi24jGQAKIr0zvGz1ojHam7_GTBQbqVinwNayuTlLHDSfBAWYO7h8DKFEWxiu__im7eFNe-id9VuKvmbgGw4WHTP1eujGIzW4VURWP69-whw412PKWKCjSnckyu1pPWq7gY9ZiT7aT9NTHYtafs4CoQNsc7jY2RH411lG5_hrahuhTHgHnRBVrWuwnU7P9bndRqc6nJ_T5CxdS2BVRKGrroBTK-MKOkzNYrFIH946liS7AzPZ6DAOi7EA=='}], 'role': 'model'}, 'finish_reason': 'STOP', 'usage_metadata': {'cache_tokens_details': [{'modality': 'TEXT', 'token_count': 454}, {'modality': 'IMAGE', 'token_count': 1129}], 'cached_content_token_count': 1583, 'candidates_token_count': 9, 'candidates_tokens_details': [{'modality': 'TEXT', 'token_count': 9}], 'prompt_token_count': 1766, 'prompt_tokens_details': [{'modality': 'TEXT', 'token_count': 519}, {'modality': 'IMAGE', 'token_count': 1290}], 'thoughts_token_count': 41, 'total_token_count': 1816, 'traffic_type': 'ON_DEMAND'}, 'avg_logprobs':
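
The `usage_metadata` in the event above hints at how the second turn reuses earlier context: most of the prompt tokens are reported as cached. The sketch below summarizes that, assuming events are dicts with the fields visible in the output. `summarize_usage` is a hypothetical helper, not part of ADK, and the counters are copied from the second-turn output.

```python
def summarize_usage(usage: dict) -> dict:
    """Summarize prompt-token usage; missing counters are treated as 0."""
    prompt = usage.get('prompt_token_count', 0)
    cached = usage.get('cached_content_token_count', 0)
    return {
        'prompt_tokens': prompt,
        'cached_tokens': cached,
        'cached_fraction': cached / prompt if prompt else 0.0,
    }


# Counters copied from the second-turn output above.
second_turn = {
    'cached_content_token_count': 1583,
    'candidates_token_count': 9,
    'prompt_token_count': 1766,
    'thoughts_token_count': 41,
    'total_token_count': 1816,
}

summary = summarize_usage(second_turn)
print(
    f"{summary['cached_tokens']} of {summary['prompt_tokens']} prompt tokens "
    f"are reported as cached ({summary['cached_fraction']:.1%})"
)
```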

## Deploy AdkApp on Agent Engine

In [7]:
agent_engines = vertexai.Client().agent_engines
display_name = 'Image Analyzer App'

remote_app = None
for item in agent_engines.list():
    if item.api_resource.display_name == display_name:
        remote_app = agent_engines.get(name=item.api_resource.name)
        break

if not remote_app:
    remote_app = agent_engines.create(
        agent=app,
        config={
            'agent_framework': 'google-adk',
            'requirements': ['google-adk==1.21.0'],
            'staging_bucket': f'gs://{PROJECT_ID}',
            'display_name': display_name,
        }
    )

The following requirements are missing: {'google-cloud-aiplatform', 'cloudpickle', 'pydantic'}
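
The deployment cell follows a "reuse if present, otherwise create" pattern keyed on the display name. The lookup half can be sketched in isolation. The objects below are stand-ins built with `SimpleNamespace` that mimic only the `api_resource.display_name` and `api_resource.name` attributes used in the cell; `find_by_display_name` and the resource names are hypothetical.

```python
from types import SimpleNamespace


def find_by_display_name(items, display_name):
    """Return the first item whose api_resource.display_name matches, else None."""
    return next(
        (item for item in items if item.api_resource.display_name == display_name),
        None,
    )


# Stand-in objects that mimic only the attributes read by the deployment cell.
listing = [
    SimpleNamespace(api_resource=SimpleNamespace(
        display_name='Other App', name='hypothetical/resource/1')),
    SimpleNamespace(api_resource=SimpleNamespace(
        display_name='Image Analyzer App', name='hypothetical/resource/2')),
]

match = find_by_display_name(listing, 'Image Analyzer App')
print(match.api_resource.name if match else 'not found; create a new deployment')
```

Because the lookup returns the first match, this pattern is only safe as long as each display name is used for a single deployment in the project.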


## Test the deployed AdkApp

In [8]:
client = ChatClient(remote_app)

image_base64 = get_image_data('testimage.png')
message_input = {
        'role': 'user',
        'parts': [
            {'text': 'describe the image'},
            {
                'inline_data': {
                    'mime_type': 'image/png',
                    'data': image_base64
                }
            }
        ]
}

_ = await client.async_stream_query(message_input)

====
{'model_version': 'gemini-2.5-flash', 'content': {'parts': [{'function_call': {'id': 'adk-74b5ffca-3a82-4b08-a45b-3a865f63dcd7', 'args': {'artifact_names': ['artifact_e-0b70f13a-fb6f-45f6-8b19-d1edbf0fcb9b_1']}, 'name': 'load_artifacts'}, 'thought_signature': 'Cr4BAY89a1_161HeqW9ZuwCIwpZ945-ijH2jA5yrtRfB8rQKklFMyXHSI5toun-lwPn9sB43VWtosZ6pyp8eXKKuaIYVCaAtW2y1G9MOsXALGPxQHwjQtTpVnEDCqrxEtTu6seRzkWbwjBqDAU2kxfDUkD9ez2chWHFSy1Q5Vf18wlg0unDTTroTKARwcIl1wRM4kqC0_ToJNCz0SURacU6S15pGYVRnNh-yVnN4wa4SBlEmFbPYLMS1Q6SzC1Dq1A=='}], 'role': 'model'}, 'finish_reason': 'STOP', 'usage_metadata': {'candidates_token_count': 44, 'candidates_tokens_details': [{'modality': 'TEXT', 'token_count': 44}], 'prompt_token_count': 1542, 'prompt_tokens_details': [{'modality': 'IMAGE', 'token_count': 1290}, {'modality': 'TEXT', 'token_count': 252}], 'thoughts_token_count': 36, 'total_token_count': 1622, 'traffic_type': 'ON_DEMAND'}, 'avg_logprobs': -0.25456666946411133, 'invocation_id': 'e-0b70f13a-fb6f-45f6-8b

In [9]:
_ = await client.async_stream_query('The number of people in the image?')

====
{'model_version': 'gemini-2.5-flash', 'content': {'parts': [{'text': 'There are 3 people in the image.', 'thought_signature': 'CrMCAY89a19ilsBp3IrrKrig1DPdQiztK2U42y7qaX5IMsHS-Qcq_AB76kr9dZqLn16Mr9jiULTpmiDT4cwO7ckJD7bObiwJ2cGUaDz_ZMNz6N4vLVyHTvyyeHrpekcJfnE23nCnm7bnNrCfiLifIrWK6vw_Cvgf9fx9Vq-9-eQ4bfUAeJrJACi8o6RRKH6ES-Ets3aHsK2Of6Bphc3Ng8F1ogvGBjsGzP48N0SldLUqhNMuSx4uuK2Lca65G7hYgYZmozXA6lIUzx_gjJIbdMUtHUzDvVhCRcGHifghbjGlcnOQ6nFEEC1m-aEE5mBWndZk_FGX5JRhbNZM4mtt4WcWRateTO6MhYNpNyIYlO9bTX9hHjeyvzcCxgt7G8ibJ-ez_sFekFiRxSx2Mi73SaOQKEfyMA=='}], 'role': 'model'}, 'finish_reason': 'STOP', 'usage_metadata': {'cache_tokens_details': [{'modality': 'TEXT', 'token_count': 489}, {'modality': 'IMAGE', 'token_count': 1102}], 'cached_content_token_count': 1591, 'candidates_token_count': 9, 'candidates_tokens_details': [{'modality': 'TEXT', 'token_count': 9}], 'prompt_token_count': 1769, 'prompt_tokens_details': [{'modality': 'IMAGE', 'token_count': 1290}, {'modality': 'TEXT', 'token_count': 573