Skip to content

refactor(storage): per-bucket S3 profiles instead of s3+<name> scheme#5

Merged
papaharry merged 1 commit into
mainfrom
feature/s3-bucket-keyed-profiles
Jun 30, 2026
Merged

refactor(storage): per-bucket S3 profiles instead of s3+<name> scheme#5
papaharry merged 1 commit into
mainfrom
feature/s3-bucket-keyed-profiles

Conversation

@papaharry

Copy link
Copy Markdown
Collaborator

Summary

Reworks S3-compatible multi-backend support per review feedback: instead of embedding a config label in the URI scheme (s3+<name>://), which gets persisted into the metastore, route by bucket name via per-bucket profiles under storage.s3.profiles.<bucket>. URIs stay canonical s3://bucket/path — nothing synthetic is persisted; routing is resolved at runtime from node-local config.

Why

The previous s3+<name>:// design baked a config-time label into durable metastore state (the index_uri and per-split metadata). That made profiles un-renameable, tied environments to a shared label convention, and required teaching the Uri type a non-canonical scheme. Keying on bucket name (already intrinsic to the object location) avoids all of that.

Changes

  • quickwit-common/uri.rs: revert all s3+<name> scheme handling to upstream — Uri stays canonical.
  • quickwit-config: namedprofiles (keyed by bucket), NamedS3StorageConfigS3ProfileConfig, is_named_backendis_profile. Drop URL-scheme name validation (a bucket name isn't a scheme).
  • quickwit-storage: resolver extracts the bucket via parse_s3_uri and looks it up in profiles — match → that profile's config + per-bucket client cache; no match → primary backend.
  • docs: rewrite as "Per-bucket S3 profiles", including the bucket-name uniqueness caveat (a bucket name maps to exactly one backend).

Trade-off

Bucket-keyed routing cannot disambiguate the same bucket name across two different endpoints (S3-compatible namespaces are per-endpoint). Documented; acceptable since bucket names are under our control.

Test plan

  • cargo +nightly fmt --all -- --check
  • cargo clippy -p quickwit-config -p quickwit-storage --all-features --tests (clean)
  • cargo test -p quickwit-config storage_config (14 passed) + -p quickwit-common uri (9 passed)
  • End-to-end through the REST stack: ingest → split → upload → publish → search, validated against an in-cluster SeaweedFS profile and an external OVH profile (own credentials, routed purely by bucket name), plus unlisted-bucket fallback to the primary backend.

🤖 Generated with Claude Code

…> scheme

Replaces the named-backend design (which embedded a config label in the
`s3+<name>://` URI scheme, persisting it into the metastore) with per-bucket
profiles keyed by bucket name under `storage.s3.profiles.<bucket>`. URIs stay
canonical `s3://bucket/path`, so nothing synthetic is persisted: routing is by
bucket name, resolved at runtime from node-local config.

- quickwit-common/uri.rs: revert all s3+<name> scheme handling to upstream;
  the Uri type no longer needs to carry a non-canonical scheme.
- quickwit-config: `named` -> `profiles` (keyed by bucket), `NamedS3StorageConfig`
  -> `S3ProfileConfig`, `is_named_backend` -> `is_profile`. Drop the URL-scheme
  name validation (a bucket name is not a scheme).
- quickwit-storage: resolver extracts the bucket via parse_s3_uri and looks it
  up in `profiles`; match -> that profile's config + per-bucket client cache,
  no match -> primary backend.
- docs: rewrite as "Per-bucket S3 profiles", including the bucket-name
  uniqueness caveat (a bucket name maps to exactly one backend).

Validated end-to-end through the REST stack: ingest -> split -> upload ->
publish -> search against in-cluster SeaweedFS and external OVH backends.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@papaharry papaharry merged commit 2a8ab12 into main Jun 30, 2026
@papaharry papaharry deleted the feature/s3-bucket-keyed-profiles branch June 30, 2026 15:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant