
Conversation

@miyannishar commented Dec 9, 2025

Please ensure you have read the contribution guide before creating a pull request.

Link to Issue or Description of Change

1. Link to an existing issue (if applicable):

Problem:

ADK currently provides GcsArtifactService for Google Cloud Storage and FileArtifactService for local development, but there is no native artifact service for Amazon S3. This creates friction for developers and organizations on AWS infrastructure, who must do one of the following:

  • Use FileArtifactService (not production-ready, limited scalability)
  • Use GcsArtifactService (requires cross-cloud setup, additional complexity)
  • Implement their own S3 service from scratch (time-consuming, error-prone)

For AWS-native deployments (EC2, ECS, Lambda, EKS), developers need a first-class S3 artifact storage solution that integrates seamlessly with their existing infrastructure and IAM policies.

Solution:

Implemented S3ArtifactService as a production-ready artifact storage backend that:

  • Extends BaseArtifactService, following the same patterns as GcsArtifactService
  • Supports session-scoped artifacts: {app}/{user}/{session}/{filename}/{version}
  • Supports user-scoped artifacts: {app}/{user}/user/{filename}/{version} (using the user: prefix; see the sketch after this list)
  • Manages versions automatically (0, 1, 2, ...)
  • Supports custom metadata via S3 object metadata
  • URL-encodes special characters in filenames
  • Uses an async-over-sync pattern via asyncio.to_thread
  • Works with S3-compatible services (MinIO, DigitalOcean Spaces, etc.)
  • Registers an s3:// URI scheme in the service registry
  • Adds boto3 as an optional dependency in extensions
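
To make the key layout concrete, here is a minimal, hypothetical sketch of how such object keys could be constructed; the helper name and details are illustrative, not the PR's actual code:

from urllib.parse import quote

_USER_PREFIX = "user:"

def object_key(app_name: str, user_id: str, session_id: str,
               filename: str, version: int) -> str:
    """Builds an S3 object key per the layouts above (illustrative sketch)."""
    if filename.startswith(_USER_PREFIX):
        # User-scoped: the session segment is replaced by the literal "user".
        name = quote(filename[len(_USER_PREFIX):], safe="")
        return f"{app_name}/{user_id}/user/{name}/{version}"
    # Session-scoped layout.
    name = quote(filename, safe="")
    return f"{app_name}/{user_id}/{session_id}/{name}/{version}"

# URL encoding keeps "/", ":" and spaces from colliding with key separators:
print(object_key("my-app", "u1", "s1", "2024 Q1: report.pdf", 0))
# my-app/u1/s1/2024%20Q1%3A%20report.pdf/0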

Key Files Changed:

  • src/google/adk/artifacts/s3_artifact_service.py (NEW - ~612 lines)
  • src/google/adk/artifacts/__init__.py (export S3ArtifactService)
  • src/google/adk/cli/service_registry.py (register s3:// URI scheme)
  • pyproject.toml (add boto3>=1.28.0 to extensions)
  • tests/unittests/artifacts/test_artifact_service.py (add S3 tests)

Testing Plan

Unit Tests

  • I have added or updated unit tests for my change.
  • All unit tests pass locally. (Run tests using guide below)

Unit Test Summary:

Added comprehensive unit tests for S3ArtifactService:

  1. Mock Infrastructure (a simplified sketch follows this section):

    • MockS3Client: Simulates boto3 S3 client
    • MockS3Bucket: Simulates S3 bucket
    • MockS3Object: Simulates S3 objects with metadata
    • mock_s3_artifact_service(): Factory function for tests
  2. Test Coverage:

    • test_load_empty[S3] - Loading non-existent artifacts
    • test_save_load_delete[S3] - Basic CRUD operations
    • test_list_keys[S3] - Listing artifact keys
    • test_list_versions[S3] - Version management
    • test_list_keys_preserves_user_prefix[S3] - User-scoped artifacts
    • test_list_artifact_versions_and_get_artifact_version[S3] - Metadata operations
    • test_list_artifact_versions_with_user_prefix[S3] - User-scoped metadata
    • test_get_artifact_version_artifact_does_not_exist[S3] - Error handling
    • test_get_artifact_version_out_of_index[S3] - Edge cases
  3. Test Execution:

    # Run S3-specific tests
    pytest tests/unittests/artifacts/test_artifact_service.py -v -k "S3"
    
    # Expected: 9 tests pass
  4. Test Results: (Fill this in after running tests)

[test results screenshot]
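
For reviewers, a highly simplified, hypothetical sketch of the mock infrastructure named above might look like this; the PR's actual mocks are more complete:

import io

class MockS3Client:
    """In-memory stand-in for a boto3 S3 client (subset of operations)."""

    def __init__(self):
        # Maps (bucket, key) -> {"Body": bytes, "Metadata": dict}
        self._objects = {}

    def put_object(self, Bucket, Key, Body, Metadata=None, **kwargs):
        data = Body if isinstance(Body, bytes) else Body.read()
        self._objects[(Bucket, Key)] = {"Body": data, "Metadata": Metadata or {}}
        return {}

    def get_object(self, Bucket, Key, **kwargs):
        obj = self._objects[(Bucket, Key)]
        return {"Body": io.BytesIO(obj["Body"]), "Metadata": dict(obj["Metadata"])}

    def list_objects_v2(self, Bucket, Prefix="", **kwargs):
        keys = sorted(k for b, k in self._objects
                      if b == Bucket and k.startswith(Prefix))
        return {"Contents": [{"Key": k} for k in keys], "KeyCount": len(keys)}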

Manual End-to-End (E2E) Tests

E2E Test 1: Direct Instantiation

Setup:

from google.adk.artifacts import S3ArtifactService
from google.genai import types
import asyncio

async def test():
    service = S3ArtifactService(bucket_name="test-bucket")
    artifact = types.Part.from_text(text="Hello S3!")  # from_text takes a keyword-only argument
    
    # Save
    version = await service.save_artifact(
        app_name="test-app",
        user_id="test-user",
        session_id="test-session",
        filename="test.txt",
        artifact=artifact
    )
    print(f"Saved version: {version}")
    
    # Load
    loaded = await service.load_artifact(
        app_name="test-app",
        user_id="test-user",
        session_id="test-session",
        filename="test.txt"
    )
    print(f"Loaded: {loaded.text}")

asyncio.run(test())

Expected Output:

Saved version: 0
Loaded: Hello S3!

E2E Test 2: Service Registry URI Configuration

Setup:

from google.adk.cli.service_registry import ServiceRegistry

registry = ServiceRegistry()
service = registry.create_artifact_service("s3://my-bucket")
print(f"Service type: {type(service).__name__}")

Expected Output:

Service type: S3ArtifactService
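
For context, this is roughly what the s3:// registration implies; the actual wiring in service_registry.py may differ, and the helper below is illustrative only:

from urllib.parse import urlparse

def create_artifact_service_from_uri(uri: str):
    """Maps s3://<bucket> to an S3ArtifactService (hypothetical helper)."""
    parsed = urlparse(uri)
    if parsed.scheme == "s3":
        from google.adk.artifacts import S3ArtifactService
        # For "s3://my-bucket", parsed.netloc is "my-bucket".
        return S3ArtifactService(bucket_name=parsed.netloc)
    raise ValueError(f"Unsupported artifact service URI: {uri}")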

E2E Test 3: Integration with Runner

Setup:

import asyncio

from google.adk import Agent, Runner
from google.adk.artifacts import S3ArtifactService

artifact_service = S3ArtifactService(bucket_name="my-bucket")
agent = Agent(name="test", model="gemini-2.0-flash")
runner = Runner(agent=agent, artifact_service=artifact_service)

# Test that the runner can use S3 artifacts. The await must run inside a
# coroutine, so it is wrapped in an async entry point.
async def main():
    result = await runner.run_async(
        app_name="test",
        user_id="user1",
        session_id="session1",
        new_message="Test message"
    )

asyncio.run(main())

Console Output: (Include actual output here)


Checklist

  • I have read the CONTRIBUTING.md document.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes. (Run tests - see guide)
  • I have manually tested my changes end-to-end. (Optional but recommended)
  • Any dependent changes have been merged and published in downstream modules. (N/A - new feature)

Additional Context

Code Quality

  • ✅ Follows Google Python Style Guide (2-space indent, 80-char lines)
  • ✅ Uses @override decorators
  • ✅ Comprehensive docstrings
  • ✅ Type hints with from __future__ import annotations
  • ✅ Proper error handling and logging
  • ✅ No linting errors

Design Decisions

  1. Optional Dependency: boto3 lives in extensions to avoid breaking existing installations
  2. URL Encoding: Handles special characters in filenames (e.g., /, :, spaces)
  3. Async Pattern: Uses asyncio.to_thread, matching the GcsArtifactService pattern (see the sketch after this list)
  4. Service Registry: Follows the same pattern as gs:// for consistency
  5. Feature Parity: All methods from BaseArtifactService are implemented
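
As an illustration of the async-over-sync pattern in item 3 (method names are hypothetical; the PR's actual code may differ), each blocking boto3 call runs in a worker thread so the event loop stays responsive:

import asyncio

class _AsyncOverSyncSketch:
    """Illustrative only; mirrors the asyncio.to_thread pattern described above."""

    def __init__(self, s3_client, bucket_name: str):
        self.s3_client = s3_client
        self.bucket_name = bucket_name

    def _save_artifact_sync(self, key: str, data: bytes, metadata: dict) -> None:
        # Blocking boto3 call; Metadata becomes custom S3 object metadata.
        self.s3_client.put_object(
            Bucket=self.bucket_name, Key=key, Body=data, Metadata=metadata
        )

    async def save_artifact(self, key: str, data: bytes, metadata: dict) -> None:
        # Offload the blocking call to a worker thread.
        await asyncio.to_thread(self._save_artifact_sync, key, data, metadata)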

Comparison with Existing Services

Feature            S3ArtifactService   GcsArtifactService   FileArtifactService
Cloud Provider     AWS                 GCP                  Local
Durability         99.999999999%       99.999999999%        Disk-dependent
Scalability        Unlimited           Unlimited            Disk-limited
Production Ready   ✅ Yes              ✅ Yes               ⚠️ Dev/Test Only
URL Encoding       ✅ Yes              ❌ No                ❌ No

Installation

Users can install with:

pip install "google-adk[extensions]"  # Includes boto3 (quoted for shells like zsh)
# Or separately
pip install boto3

Usage Examples

Direct Instantiation:

from google.adk.artifacts import S3ArtifactService

service = S3ArtifactService(
    bucket_name="my-adk-artifacts",
    region_name="us-east-1"
)
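
The PR also advertises compatibility with S3-compatible services such as MinIO, which presumably requires overriding the boto3 endpoint. The endpoint_url parameter below is an assumption about the constructor, not a confirmed option; check s3_artifact_service.py for the real signature:

# Hypothetical: assumes the constructor forwards an endpoint override to boto3.
service = S3ArtifactService(
    bucket_name="my-adk-artifacts",
    endpoint_url="http://localhost:9000",  # e.g., a local MinIO server
)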

URI Configuration:

from google.adk import App, Agent

app = App(
    name="my_app",
    root_agent=Agent(...),
    artifact_service_uri="s3://my-adk-artifacts"
)

Environment Variable:

export ADK_ARTIFACT_SERVICE_URI="s3://my-adk-artifacts"
adk run my_agent

Related Documentation

  • Implementation follows patterns from GcsArtifactService
  • Service registry integration matches gs:// scheme
  • Tests follow existing artifact service test patterns

Future Enhancements (Not in this PR)

  • App-scoped artifacts (app: prefix)
  • Multipart upload for large files (>5GB)
  • Presigned URLs for direct browser access
  • Built-in encryption support

google-cla bot commented Dec 9, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

gemini-code-assist bot (Contributor) commented

Summary of Changes

Hello @miyannishar, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the application's artifact management capabilities by introducing native support for Amazon S3 as a storage backend. This integration provides users with greater flexibility in choosing their artifact storage solutions, complementing existing file-based and Google Cloud Storage options. The changes encompass the implementation of a dedicated S3 artifact service, its seamless registration within the application's framework, and thorough unit testing to guarantee its reliability and correct operation.

Highlights

  • New S3 Artifact Service: Introduces a new S3ArtifactService class, enabling the storage and retrieval of application artifacts directly within Amazon S3 buckets. This service supports both session-scoped and user-namespaced artifacts.
  • Dependency Addition: The boto3 library, a Python SDK for AWS, has been added as a new dependency to facilitate interaction with Amazon S3.
  • Service Registration: The new S3 artifact service is integrated into the application's service registry, allowing it to be instantiated and used by specifying an 's3://' URI scheme.
  • Comprehensive Testing: Existing unit tests for artifact services have been extended to include the new S3 artifact service, along with the addition of mock classes for S3 interactions, ensuring robust functionality.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature               Command               Description
Code Review           /gemini review        Performs a code review for the current pull request in its current state.
Pull Request Summary  /gemini summary       Provides a summary of the current pull request in its current state.
Comment               @gemini-code-assist   Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help                  /gemini help          Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  [1] Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.

adk-bot (Collaborator) commented Dec 9, 2025

Response from ADK Triaging Agent

Hello @miyannishar, thank you for creating this PR!

Before we can review your contribution, we need you to address a few items from our contribution guidelines:

  • Sign our Contributor License Agreement (CLA): It looks like the CLA check is failing. Please visit https://cla.developers.google.com/ to sign the agreement.
  • Link to an Issue: Could you please link to an existing issue or describe the bug or feature in the PR description?
  • Testing Plan: Please provide a "testing plan" section in your PR description to detail how you've tested these changes.

You can find more details in our contribution guidelines.

This information will help us to review your PR more efficiently. Thanks!

gemini-code-assist bot left a comment

Code Review

This pull request introduces a new S3ArtifactService for storing artifacts in Amazon S3, along with the necessary configuration and tests. The implementation is well-structured and follows the async-over-sync pattern seen in other artifact services in the project.

A major point of concern is that the pull request title, 'fix: pass context to client inceptors', and the empty description do not reflect the content of the changes. It is crucial to update them to accurately describe the addition of the S3 artifact service for clarity and history tracking.

I've added a few comments with suggestions for improving maintainability, robustness, and code clarity.

Comment on lines +393 to +396
except Exception as e:
logger.error(
"Failed to list session artifacts for %s: %s", session_id, e
)
Severity: high

Catching a broad Exception can hide unexpected errors and make debugging difficult, especially when the exception is not re-raised. It's better to catch more specific exceptions that you expect boto3 to raise, such as botocore.exceptions.ClientError. This makes the error handling more robust. This same issue is present in several other try...except blocks in this file where exceptions are not re-raised:

  • _list_artifact_keys_sync (lines 393-396 and 413-414)
  • _delete_artifact_sync (lines 440-441)
  • _list_versions_sync (lines 483-485)
  • _get_artifact_version_sync (lines 538-545)

Please narrow the exception type in all these locations. You'll need to import ClientError from botocore.exceptions.

Suggested change:

-except Exception as e:
+except ClientError as e:
   logger.error(
       "Failed to list session artifacts for %s: %s", session_id, e
   )

"""
if self._file_has_user_namespace(filename):
# Remove "user:" prefix before encoding
actual_filename = filename[5:] # len("user:") == 5
Severity: medium

Using a magic number 5 for the length of "user:" can be brittle and harms readability. It's better to calculate the length directly using len("user:"). For further improvement and consistency with other artifact services (e.g., file_artifact_service.py), consider defining a module-level constant _USER_NAMESPACE_PREFIX = "user:" and using it here and in _file_has_user_namespace.

Suggested change:

-actual_filename = filename[5:]  # len("user:") == 5
+actual_filename = filename[len("user:"):]  # len("user:") == 5

Bucket=self.bucket_name, Key=object_key
)

metadata = response.get("Metadata", {}) or {}
Severity: medium

The or {} is redundant here. response.get("Metadata", {}) will never return a falsy value like None that would cause the or to be evaluated; it will return the value for "Metadata" if it exists, or the default value {} if it doesn't. Removing the redundant part makes the code cleaner.

Suggested change:

-metadata = response.get("Metadata", {}) or {}
+metadata = response.get("Metadata", {})

if service_type == ArtifactServiceType.GCS:
uri = f"gs://test_bucket/{app_name}/{user_id}/user/{user_scoped_filename}/{i}"
elif service_type == ArtifactServiceType.S3:
uri = f"s3://test_bucket/{app_name}/{user_id}/user/document.pdf/{i}"
Severity: medium

The filename document.pdf is hardcoded here. It's better to derive it from the user_scoped_filename variable to make the test more robust and less reliant on magic values. This ensures that if user_scoped_filename changes, this test won't break unexpectedly.

Suggested change:

-uri = f"s3://test_bucket/{app_name}/{user_id}/user/document.pdf/{i}"
+uri = f"s3://test_bucket/{app_name}/{user_id}/user/{user_scoped_filename.split(':', 1)[1]}/{i}"

adk-bot added the "services" label on Dec 9, 2025
@miyannishar changed the title from "fix: pass context to client inceptors" to "feat: added s3 artifact service" on Dec 9, 2025
@DeanChensj (Collaborator) commented

Hi @miyannishar , thanks for the PR, I think it is a great add to the adk-python-community repo, would you mind moving your change there?


Labels

services [Component] This issue is related to runtime services, e.g. sessions, memory, artifacts, etc


Development

Successfully merging this pull request may close these issues.

Add S3ArtifactService for Amazon S3 artifact storage support
