Skip to content

feat: migrate S3 from rust-s3 to rusty-s3#71

Merged
digizeph merged 14 commits intomainfrom
dev/migrate-to-rusty-s3
May 1, 2026
Merged

feat: migrate S3 from rust-s3 to rusty-s3#71
digizeph merged 14 commits intomainfrom
dev/migrate-to-rusty-s3

Conversation

@digizeph
Copy link
Copy Markdown
Member

@digizeph digizeph commented May 1, 2026

Migrates the S3 implementation from rust-s3 to rusty-s3 (sans-IO signing + reqwest HTTP).

Key changes:

  • Shared reqwest HTTP client (no duplicate HTTP stack)
  • Single PUT for small files, multipart for large files
  • Added s3_copy with proper AWS SigV4 Authorization header
  • Added s3_exists and s3_delete
  • New ONEIO_S3_MULTIPART_THRESHOLD env var
  • Added R2 integration tests (ignored on CI)

Breaking changes:

  • s3_bucket() returns S3Bucket
  • s3_stats() returns S3ObjectMetadata
  • Removed rust-s3 type re-exports

Tested against Cloudflare R2.

Fixes CopyObject failures on R2 caused by incorrect presigned URL signing.

digizeph added 2 commits May 1, 2026 07:26
Replace rust-s3 with rusty-s3 (sans-IO signing + reqwest HTTP transport).

Key changes:
- S3 operations use reqwest for HTTP (shared with HTTP operations)
- Single PUT for files < multipart threshold (default: 8MB)
- Multipart upload with auto-calculated part sizing for large files
- Added s3_copy() with proper AWS SigV4 Authorization header
- Added s3_exists(), s3_delete() functions
- Added S3Config/S3Credentials public types
- Added ONEIO_S3_MULTIPART_THRESHOLD env var
- Added integration tests for R2 (require credentials)

Breaking changes:
- s3_bucket() returns S3Bucket instead of rust_s3::Bucket
- s3_stats() returns S3ObjectMetadata instead of HeadObjectResult
- Removed rust-s3 type re-exports
- Fix CompleteMultipartUpload false-success: parse response body for embedded <Error>
- Fix CopyObject false-success: read and validate response body for errors
- Fix multipart upload leak: abort on any part-upload or completion failure
- Fix dotenvy filesystem hit: load .env once via OnceLock instead of every call
- Fix s3_exists masking bucket errors: only NoSuchKey maps to false, not all not-found
- Fix S3 feature without TLS: s3 feature now depends on https
- Fix S3 client bypassing OneIo config: honor ONEIO_CA_BUNDLE and ONEIO_ACCEPT_INVALID_CERTS
- Fix missing rustls crypto provider init: call ensure_default_provider() before S3 client build
- Fix Debug leaking secrets: implement custom Debug that redacts secret_key and session_token
- Fix s3_copy dropping non-default ports: include port in Host header when non-standard
- Fix VirtualHost for dotted buckets: fall back to path-style when bucket name contains dots
- Fix upload_single buffering entire file: stream File directly into reqwest body
- Change default multipart threshold from 8MB to 5MB (S3 minimum part size)
- Add 5GB CopyObject limit check: reject larger objects with clear error
- Add 5-minute request timeout to S3 HTTP client
- Include raw response body in non-XML S3 error messages
@digizeph
Copy link
Copy Markdown
Member Author

digizeph commented May 1, 2026

Addressed MAGI review issues in commit 211dc0f.

Critical fixes:

  • Fixed CompleteMultipartUpload and CopyObject false-success bugs: now parse response body for embedded even on HTTP 200
  • Fixed multipart upload leak: abort multipart on any part-upload or completion failure
  • Fixed dotenvy filesystem hit: load .env once via OnceLock instead of on every S3 call

High priority fixes:

  • Fixed s3_exists masking bucket errors: only NoSuchKey maps to false
  • Fixed S3 feature without TLS: s3 now depends on https
  • Fixed S3 client bypassing OneIo config: now honors ONEIO_CA_BUNDLE and ONEIO_ACCEPT_INVALID_CERTS
  • Fixed missing rustls crypto provider init
  • Fixed Debug leaking secrets: custom Debug redacts secret_key and session_token
  • Fixed s3_copy dropping non-default ports
  • Fixed VirtualHost breaking dotted buckets: fall back to path-style for buckets with dots

Medium priority fixes:

  • upload_single now streams File directly instead of buffering entire file into memory
  • Changed default multipart threshold from 8MB to 5MB (S3 minimum part size)
  • Added 5GB CopyObject limit check
  • Added 5-minute request timeout to S3 HTTP client
  • Non-XML S3 errors now include raw response body text

digizeph added 4 commits May 1, 2026 08:39
- Mark rusty-s3-migration spec as Complete
- Move Spec Index from AGENTS.md to specs/README.md
- Update implementation checklist with completed items
- Remove 300s request timeout from S3 client to prevent breaking streaming downloads
- Add HTTP status validation before body parsing in CompleteMultipartUpload
- Add HTTP status validation before body parsing in CopyObject
- Replace fragile body.contains("<Error>") with direct parse_s3_error_xml call
- Fix multipart upload short-read bug: use take().read_to_end() instead of read()
- Reuse chunk buffer across multipart iterations instead of reallocating per part
- Fix s3_exists string matching fragility: check HTTP status code directly (200/404)
- Fix content_length == 0 rejecting empty S3 objects: removed incorrect zero check
- Bump version to 0.22.0 for breaking API changes
- Map S3 HTTP errors to OneIoError::Status instead of NotSupported
- Add TODO to upstream CopyObject to rusty-s3
@digizeph
Copy link
Copy Markdown
Member Author

digizeph commented May 1, 2026

Round 3 fixes pushed (commit 121f69b).

  • Fix multipart upload short-read bug: use take().read_to_end() instead of read() which could return fewer bytes than requested
  • Reuse chunk buffer across multipart iterations instead of allocating 8MB per part
  • Fix s3_exists string matching fragility: now checks HTTP status code directly (200 = true, 404 = false)
  • Fix content_length == 0 incorrectly rejecting empty S3 objects
  • Bump version to 0.22.0 for breaking API changes
  • Map S3 HTTP errors to OneIoError::Status instead of NotSupported
  • Added TODO to upstream CopyObject to rusty-s3 (rusty-s3 v0.9 does not provide CopyObject action or header-based signing)

digizeph added 2 commits May 1, 2026 09:56
- Add message field to OneIoError::Status variant to preserve S3 XML error detail
- map_parsed_s3_error now includes key/bucket/message in error output
- Replace chunk.clone() with std::mem::take in multipart upload to avoid 8MB copies
- Remove redundant extract_etag fallback for uppercase ETag (reqwest normalizes to lowercase)
- Remove unnecessary authorization.clone() in s3_copy
@digizeph
Copy link
Copy Markdown
Member Author

digizeph commented May 1, 2026

Round 4 fixes pushed (commit d9533b7).

  • Add message field to OneIoError::Status variant to preserve S3 XML error detail
  • map_parsed_s3_error now includes key/bucket/message context in error output (e.g., "Object not found: my-key.txt")
  • Replace chunk.clone() with std::mem::take in multipart upload to avoid copying 8MB per part
  • Remove redundant extract_etag fallback for uppercase ETag (reqwest normalizes to lowercase)
  • Remove unnecessary authorization.clone() in s3_copy

All 13 S3 integration tests pass against R2.

digizeph added 2 commits May 1, 2026 10:13
- Revert version from 0.22.0 back to 0.21.0 (release will be a separate commit)
- Add version reference check to AGENTS.md release checklist
- Use std::mem::replace with pre-allocated Vec in multipart upload to preserve capacity across iterations
- Remove extra HEAD request in s3_copy; rely on S3 error response for oversized objects
- Remove unused S3_COPY_MAX_SIZE constant
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates OneIO’s S3 backend from rust-s3 to a rusty-s3 (signing) + reqwest (HTTP transport) implementation, adds missing S3 operations, and updates docs/tests to validate Cloudflare R2 compatibility.

Changes:

  • Replaced the legacy src/s3.rs implementation with a new src/s3/ module using rusty-s3 signing + shared reqwest::blocking::Client.
  • Added/updated S3 operations (multipart upload thresholding, copy with SigV4 Authorization header, exists/delete/list/stats) and improved S3 error parsing/messaging.
  • Added ignored R2 integration tests and updated CLI/docs/CHANGELOG for the new behavior and breaking API changes.

Reviewed changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
Cargo.toml Swaps rust-s3 for rusty-s3 + XML/signing deps; wires feature flags and adds an S3 integration test target.
src/s3/mod.rs New S3 implementation (signing, reqwest transport, multipart upload, copy/delete/exists/list/stats, XML error parsing).
src/s3/config.rs New S3 config/credential loading and endpoint normalization.
src/s3.rs Removes the previous rust-s3-based implementation.
src/error.rs Extends OneIoError::Status to carry an optional message and removes old rust-s3 conversions.
src/client.rs Updates S3 content-length handling to match new s3_stats() return type.
src/bin/oneio.rs Fixes a short-flag collision for the S3 list delimiter option.
tests/s3_integration.rs Adds ignored Cloudflare R2 integration tests covering uploads/downloads/list/head/copy/delete/error cases.
examples/s3_operations.rs Updates the S3 example to exercise the expanded API and print detailed failures.
src/lib.rs Updates docs/examples and documents S3 environment variables.
README.md Updates version snippets and expands S3 usage/config docs.
CHANGELOG.md Documents the migration, breaking changes, and behavior changes.
specs/rusty-s3-migration/README.md Adds the migration spec and implementation notes.
specs/README.md Adds a specs index.
AGENTS.md Adds repository development/process guidelines for contributors/agents.
.gitignore Ignores .env.local.

Comment thread src/s3/mod.rs
Comment thread src/s3/mod.rs
Comment thread src/s3/mod.rs Outdated
Comment thread src/s3/mod.rs Outdated
Comment thread Cargo.toml Outdated
Comment thread tests/s3_integration.rs Outdated
digizeph added 4 commits May 1, 2026 13:10
- Changed s3 feature to depend on http instead of https for native-tls compatibility
- Fixed s3_copy to error on missing host instead of producing invalid Host header
- Fixed s3_stats to error on missing content-length header instead of defaulting to 0
- Replaced create_temp_file with TempFile RAII wrapper for automatic cleanup

All 13 S3 integration tests pass against Cloudflare R2.
HTTP headers are case-insensitive per RFC 7230. Using reqwest::header::ETAG
instead of hardcoded etag ensures ETag extraction works regardless of
the case returned by the S3 service (ETag, etag, Etag, etc.).

Fixes review issue: Case-sensitive ETag header extraction
Adds a test-s3 job that runs the s3_integration test binary against
Cloudflare R2 using repository secrets. The job runs after build and
test pass, and is skipped on forks to avoid credential exposure.
@digizeph digizeph merged commit 9db52d2 into main May 1, 2026
6 checks passed
@digizeph digizeph deleted the dev/migrate-to-rusty-s3 branch May 1, 2026 20:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants