[Feature]: Support for AWS S3 Vectors (vector buckets, indexes, and similarity search) #3047

@j-monteiro

Feature Category

Other

S3 API Operation

S3 Vectors: CreateVectorBucket, CreateIndex, PutVectors, QueryVectors, and 15 additional operations

Description

First, a genuine thank you for maintaining S3Mock as an open source project — it has been a great tool for our team that saved us time and infrastructure complexity. We really appreciate the work that goes into keeping it aligned with the AWS API.

We would like to request support for AWS S3 Vectors, a new AWS service purpose-built for storing and querying vector embeddings.

S3 Vectors is a distinct service from S3 (service namespace s3vectors, separate SDK client S3VectorsClient, JSON-based RPC API), but is positioned as a natural companion to S3 object storage. Teams adopting S3 Vectors for AI/ML workloads typically test S3 and S3 Vectors together, so a mock that covers both would be immediately useful for integration testing.

The full API surface is 19 operations across five areas:

| Area | Operations |
| --- | --- |
| Vector Bucket management | CreateVectorBucket, DeleteVectorBucket, GetVectorBucket, ListVectorBuckets |
| Vector Index management | CreateIndex, DeleteIndex, GetIndex, ListIndexes |
| Vector CRUD | PutVectors, GetVectors, DeleteVectors, ListVectors |
| Similarity search | QueryVectors |
| Policy & tagging | PutVectorBucketPolicy, GetVectorBucketPolicy, DeleteVectorBucketPolicy, TagResource, UntagResource, ListTagsForResource |

All operations use HTTP POST with JSON bodies (no XML), which would require a parallel JSON serialization configuration alongside the existing XML stack.
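
To make the request shape concrete, here is a minimal sketch of a JSON-over-POST call built with the JDK's own HTTP client types. Several details are assumptions for illustration and should be checked against the AWS API reference: that the operation name forms the last path segment, the JSON field names in the sample body, and the base URL, which is a hypothetical local mock endpoint. The class and method names are made up for this example, and a real SDK client would also sign the request.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class S3VectorsRequestSketch {

    // Builds (but does not send) a JSON POST request for one operation.
    // Assumption: the operation name is appended to the base URL as a path segment.
    static HttpRequest buildRequest(String baseUrl, String operation, String jsonBody) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/" + operation))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
            .build();
    }

    public static void main(String[] args) {
        // Field names below are assumptions for illustration, not a verified schema.
        String body = "{\"vectorBucketName\":\"demo-bucket\",\"indexName\":\"demo-index\","
            + "\"queryVector\":{\"float32\":[0.1,0.2,0.3]},\"topK\":3}";
        // Hypothetical mock endpoint; host, port, and prefix are placeholders.
        HttpRequest request =
            buildRequest("http://localhost:9090/s3vectors", "QueryVectors", body);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

On the server side, this is the shape the new controllers would need to accept: a POST per operation with an application/json body and no XML involved.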

Use Case

At Salsify we migrated from FakeS3 to S3Mock for our integration test suite and have been very happy with it — it covers our S3 object storage needs reliably. We are now adopting AWS S3 Vectors for AI/ML workloads that require vector similarity search and need a corresponding mock for local development and CI integration tests.

Without a local mock for S3 Vectors we cannot run our full integration test suite in isolation, which affects both developer productivity and CI reliability.

Proposed Solution

We are evaluating contributing this feature ourselves. Our current thinking:

  • Serve S3 Vectors operations under a configurable path prefix (e.g. /s3vectors) on the same port as the main S3Mock, so users can point an S3VectorsClient at the mock via endpointOverride(...) on the client builder without running a second process.
  • Add a parallel JSON ObjectMapper bean for the new endpoints, leaving the existing XML stack untouched.
  • Implement QueryVectors as brute-force exact search (euclidean / cosine) — sufficient for a test mock, no ANN library required.
  • Follow the existing Controller → Service → Store layered architecture.
  • Add integration tests using the real S3VectorsClient against the Docker container, consistent with how all existing integration tests are structured.
  • Extend the testsupport module with a createS3VectorsClient() factory, similar to the existing createS3Client() helpers.
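
The brute-force QueryVectors idea from the list above could look roughly like the sketch below. It is not S3Mock code: the class, the `Match` type, and the in-memory `Map` standing in for a Store are all hypothetical. It also assumes that cosine results are reported as a distance (1 minus cosine similarity) and euclidean results as the plain L2 distance. The values the real service returns may be scaled differently, so that would need to be checked before matching responses exactly.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BruteForceQuerySketch {

    enum Metric { EUCLIDEAN, COSINE }

    /** One search hit: the vector key and its distance to the query vector. */
    static final class Match {
        final String key;
        final double distance;

        Match(String key, double distance) {
            this.key = key;
            this.distance = distance;
        }
    }

    /** Distance between two vectors of equal dimension under the given metric. */
    static double distance(float[] a, float[] b, Metric metric) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0, squaredDiff = 0;
        for (int i = 0; i < a.length; i++) {
            double x = a[i];
            double y = b[i];
            squaredDiff += (x - y) * (x - y);
            dot += x * y;
            normA += x * x;
            normB += y * y;
        }
        if (metric == Metric.EUCLIDEAN) {
            return Math.sqrt(squaredDiff);
        }
        if (normA == 0 || normB == 0) {
            throw new IllegalArgumentException("cosine distance is undefined for a zero vector");
        }
        // Assumption: cosine is reported as a distance, i.e. 1 - similarity.
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Exact search: score every stored vector, sort ascending, keep the first topK. */
    static List<Match> query(Map<String, float[]> vectors, float[] queryVector,
                             int topK, Metric metric) {
        if (topK < 0) {
            throw new IllegalArgumentException("topK must not be negative");
        }
        List<Match> matches = new ArrayList<>();
        for (Map.Entry<String, float[]> entry : vectors.entrySet()) {
            matches.add(new Match(entry.getKey(),
                distance(queryVector, entry.getValue(), metric)));
        }
        // List.sort is stable, so ties keep the map's iteration order.
        matches.sort(Comparator.comparingDouble((Match m) -> m.distance));
        return new ArrayList<>(matches.subList(0, Math.min(topK, matches.size())));
    }

    public static void main(String[] args) {
        Map<String, float[]> index = new LinkedHashMap<>();
        index.put("a", new float[] {1f, 0f});
        index.put("b", new float[] {0f, 1f});
        index.put("c", new float[] {2f, 0f});
        for (Match m : query(index, new float[] {1f, 0f}, 2, Metric.EUCLIDEAN)) {
            System.out.println(m.key + " " + m.distance);
        }
    }
}
```

Scoring every vector costs O(n · d) per query, which should be fine for the index sizes typical of integration tests and keeps results deterministic, unlike an approximate index.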

Before investing the effort, we want to confirm this is a direction the maintainers are open to and to align on any architectural preferences.

One question on process: we noticed CONTRIBUTING.md already contains a For Agents section. Would AI-assisted contributions be acceptable to the maintainers, and are there any preferences or additional guidelines beyond what is already documented there?

Alternatives Considered

  • local-s3: Another S3 mock that already supports S3 Vectors. Since we are already invested in S3Mock and it better fits our existing test infrastructure, we would strongly prefer to contribute here rather than switch projects — but local-s3 with working S3 Vectors support is our fallback if this feature isn't viable in S3Mock.
  • Testcontainers with real AWS: Not viable for offline or cost-sensitive CI environments.

Additional Context

A few things worth noting before we proceed: we are aware of the Adobe CLA requirement and will sign it before submitting any PRs. We would target the main branch and follow all CI gates (build, unit tests, integration tests, ktlint, checkstyle, Docker).
