
feat(proxy): Introduce an S3-aware proxy mode in dfdaemon #1779

Open
YQ-Wang wants to merge 1 commit into dragonflyoss:main from YQ-Wang:feat/s3-aware-proxy

YQ-Wang commented Apr 10, 2026

Description

Add an S3-aware proxy mode to dfdaemon that classifies incoming S3 requests and routes them
accordingly:

  • GetObject (full and ranged): accelerated through Dragonfly P2P. If the caller's SigV4
    signature explicitly covers the Range header, dfdaemon preserves that original signed
    Range on the source request instead of rewriting it.
  • HeadObject and ListObjectsV2: direct passthrough.
  • Other S3 APIs: passthrough, not accelerated.
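
The routing rules above can be sketched as a small classifier. This is an illustrative sketch only, not the code in this PR: the function name `classify`, the `SUBRESOURCES` set, and the assumption of virtual-hosted-style URLs (bucket in the host name, key in the path) are all hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

# Query keys that turn "GET /key" into a different S3 API (GetObjectAcl,
# GetObjectTagging, ListParts, ...). Illustrative list, not exhaustive.
SUBRESOURCES = {"acl", "tagging", "torrent", "retention", "legal-hold",
                "attributes", "uploadId", "uploads"}


def classify(method: str, url: str) -> tuple[str, str]:
    """Return (operation, route) for a virtual-hosted-style S3 URL."""
    parts = urlsplit(url)
    # keep_blank_values so that a bare "?acl" is still seen as a key.
    query = parse_qs(parts.query, keep_blank_values=True)
    key = parts.path.lstrip("/")
    has_subresource = any(name in SUBRESOURCES for name in query)

    if method == "GET" and key and not has_subresource:
        return ("GetObject", "p2p")          # full or ranged download
    if method == "HEAD" and key:
        return ("HeadObject", "passthrough")
    if method == "GET" and not key and query.get("list-type") == ["2"]:
        return ("ListObjectsV2", "passthrough")
    return ("Other", "passthrough")          # never accelerated
```

Presigned GET URLs carry only `X-Amz-*` query parameters, so under these assumptions they fall into the GetObject branch as well.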

Related Issue

#1778

Motivation and Context

Organizations storing large ML models, datasets, or media files in S3 and distributing them
across many hosts cannot leverage Dragonfly P2P today without rewriting download logic to use
Dragonfly's API directly. With S3-aware proxy mode, those workloads are accelerated transparently:
applications only need to set HTTPS_PROXY and trust the proxy CA; no code changes are required.
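
As a rough illustration of the client-side setup, the two environment variables can be injected into a child process without touching the application itself. The helper name `proxied_env`, the proxy address, and the CA path below are placeholders, not values defined by this PR.

```python
import os


def proxied_env(proxy_url: str, ca_bundle: str, base=None) -> dict:
    """Return a copy of the environment that routes HTTPS through the proxy."""
    env = dict(os.environ if base is None else base)
    env["HTTPS_PROXY"] = proxy_url     # where dfdaemon's proxy listens (placeholder)
    env["AWS_CA_BUNDLE"] = ca_bundle   # CA that signed the proxy's certificates
    return env


# Placeholder address and path; substitute the values of your deployment.
env = proxied_env("http://127.0.0.1:4001", "/etc/dragonfly/ca.crt", base={})
```

The resulting mapping can be passed as `env=` to `subprocess.run` when launching an unmodified S3 client.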

The proxy preserves caller-provided SigV4 headers and presigned URLs end-to-end. dfdaemon does
not discover AWS credentials or re-sign origin requests, keeping the security model unchanged.
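
Because requests are never re-signed, any header the caller's signature covers has to reach the origin unchanged. A minimal sketch of how one might detect whether Range is among the signed headers, for both header-based SigV4 and presigned URLs, follows. The function names are hypothetical and this is not the parser used in the PR; it relies only on the documented `SignedHeaders=` field of the SigV4 Authorization header and the `X-Amz-SignedHeaders` query parameter.

```python
from urllib.parse import parse_qs, urlsplit

ALGORITHM = "AWS4-HMAC-SHA256"


def signed_headers(url: str, headers: dict) -> set:
    """Collect the lowercase header names covered by the SigV4 signature."""
    auth = next((v for k, v in headers.items() if k.lower() == "authorization"), "")
    if auth.startswith(ALGORITHM):
        # "AWS4-HMAC-SHA256 Credential=..., SignedHeaders=a;b, Signature=..."
        for field in auth[len(ALGORITHM):].split(","):
            name, _, value = field.strip().partition("=")
            if name == "SignedHeaders":
                return set(value.lower().split(";"))
    # Presigned URLs carry the same list as a query parameter.
    value = parse_qs(urlsplit(url).query).get("X-Amz-SignedHeaders", [""])[0]
    return set(value.lower().split(";")) if value else set()


def range_is_signed(url: str, headers: dict) -> bool:
    """True when rewriting the Range header would invalidate the signature."""
    return "range" in signed_headers(url, headers)
```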

Testing

Tested end-to-end against a real S3 object (s3://bucket-name/test_dragonfly/file.mp4,
~50 MB) through a locally deployed Dragonfly cluster (manager, scheduler, seed-client, client via
Docker Compose). All tests ran inside the client container with HTTPS_PROXY and AWS_CA_BUNDLE
configured to route traffic through the patched dfdaemon.

boto3 test matrix (all passed with correct proxy routing confirmed via dfdaemon logs):

| Test | Method | Route |
| --- | --- | --- |
| HeadObject | `s3.head_object()` | passthrough |
| ListObjectsV2 | `s3.list_objects_v2()` | passthrough |
| GetObject (full) | `s3.get_object()` | P2P via dfdaemon |
| GetObject (small range) | `s3.get_object(Range='bytes=0-1023')` | P2P via dfdaemon |
| GetObject (nonzero range) | `s3.get_object(Range='bytes=1048576-2097151')` | P2P via dfdaemon |
| Multipart download | `s3.download_file()` | P2P via dfdaemon |
| Presigned HEAD | presigned URL + HTTP HEAD | passthrough |
| Presigned GET (full) | presigned URL + HTTP GET | P2P via dfdaemon |
| Presigned GET (ranged) | presigned URL + HTTP GET + Range header | P2P via dfdaemon |
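
For reference, the Range values in the matrix are inclusive byte ranges, so the small range should return 1024 bytes and the nonzero range exactly 1 MiB. The helper below is a hypothetical checker, not part of the test suite in this PR, and handles only the closed `bytes=first-last` form used above.

```python
import re

_RANGE = re.compile(r"^bytes=(\d+)-(\d+)$")


def range_length(value: str) -> int:
    """Number of bytes selected by a closed 'bytes=first-last' range."""
    match = _RANGE.match(value)
    if match is None:
        raise ValueError(f"unsupported Range value: {value!r}")
    first, last = int(match.group(1)), int(match.group(2))
    if last < first:
        raise ValueError(f"last byte precedes first byte: {value!r}")
    return last - first + 1  # both ends are inclusive
```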

s5cmd tests (all passed):

| Test | Command | Route |
| --- | --- | --- |
| Default download (full-object signed range) | `s5cmd cp s3://... .` | P2P via dfdaemon |
| Forced multipart (multiple signed partial ranges) | `s5cmd cp --part-size 5 --concurrency 3 s3://... .` | P2P via dfdaemon |
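
To give a sense of what the forced-multipart case exercises, the sketch below lists the partial ranges a client would request when splitting an object into fixed-size parts. It assumes `--part-size` is expressed in MiB and, purely for illustration, an object of exactly 50 MiB (the test object is only described as ~50 MB); `part_ranges` is a hypothetical helper and does not reproduce s5cmd's actual scheduling.

```python
def part_ranges(object_size: int, part_size_mib: int) -> list[str]:
    """Split [0, object_size) into inclusive byte ranges of at most part_size_mib MiB."""
    part = part_size_mib * 1024 * 1024
    return [
        f"bytes={start}-{min(start + part, object_size) - 1}"
        for start in range(0, object_size, part)
    ]


# Assumed size: exactly 50 MiB, split into 5 MiB parts.
ranges = part_ranges(50 * 1024 * 1024, 5)
```

Under these assumptions the download is issued as ten ranged GETs, each of which dfdaemon would see as a separate GetObject request.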

Presigned URL test: A regional presigned URL (bucket-name.s3.us-east-1.amazonaws.com) was tested with both a full GET and a ranged GET; both were routed correctly.

Recursive folder download: Downloaded s3://bucket-name/test_dragonfly/ with nested subdirectories; all objects arrived with correct sizes and were routed correctly.

cargo test: Passed after fixing unrelated pre-existing doctest failures in dragonfly-client-util.

codecov bot commented Apr 10, 2026

Codecov Report

❌ Patch coverage is 86.55139% with 140 lines in your changes missing coverage. Please review.
✅ Project coverage is 45.08%. Comparing base (bea655e) to head (f286edc).
⚠️ Report is 10 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| dragonfly-client/src/proxy/mod.rs | 86.32% | 51 Missing ⚠️ |
| dragonfly-client/src/proxy/s3.rs | 91.98% | 23 Missing ⚠️ |
| dragonfly-client/src/resource/task.rs | 75.00% | 22 Missing ⚠️ |
| dragonfly-client/src/resource/piece.rs | 0.00% | 17 Missing ⚠️ |
| dragonfly-client/src/proxy/header.rs | 92.19% | 11 Missing ⚠️ |
| dragonfly-client-util/src/request/mod.rs | 92.50% | 6 Missing ⚠️ |
| dragonfly-client-metric/src/lib.rs | 66.66% | 5 Missing ⚠️ |
| dragonfly-client-backend/src/http.rs | 80.00% | 4 Missing ⚠️ |
| dragonfly-client/src/grpc/dfdaemon_download.rs | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1779      +/-   ##
==========================================
+ Coverage   44.43%   45.08%   +0.64%     
==========================================
  Files          91       92       +1     
  Lines       26054    27862    +1808     
==========================================
+ Hits        11578    12562     +984     
- Misses      14476    15300     +824     

| Files with missing lines | Coverage Δ |
| --- | --- |
| dragonfly-client-config/src/dfdaemon.rs | 89.51% <100.00%> (+0.19%) ⬆️ |
| dragonfly-client-util/src/cgroups/mod.rs | 0.00% <ø> (ø) |
| dragonfly-client-util/src/sysinfo/cpu.rs | 6.19% <ø> (ø) |
| dragonfly-client/src/grpc/dfdaemon_download.rs | 4.79% <0.00%> (ø) |
| dragonfly-client-backend/src/http.rs | 96.70% <80.00%> (-0.41%) ⬇️ |
| dragonfly-client-metric/src/lib.rs | 72.35% <66.66%> (-0.10%) ⬇️ |
| dragonfly-client-util/src/request/mod.rs | 40.48% <92.50%> (+7.65%) ⬆️ |
| dragonfly-client/src/proxy/header.rs | 88.07% <92.19%> (+2.15%) ⬆️ |
| dragonfly-client/src/resource/piece.rs | 37.59% <0.00%> (-1.27%) ⬇️ |
| dragonfly-client/src/resource/task.rs | 14.04% <75.00%> (+7.98%) ⬆️ |

... and 2 more

... and 7 files with indirect coverage changes

YQ-Wang force-pushed the feat/s3-aware-proxy branch from 99d127c to 3de8859 on April 10, 2026 07:13
YQ-Wang changed the title from "Introduce an S3-aware proxy mode in dfdaemon that accelerates" to "feat(proxy): Introduce an S3-aware proxy mode in dfdaemon that accelerates" on Apr 10, 2026
YQ-Wang changed the title from "feat(proxy): Introduce an S3-aware proxy mode in dfdaemon that accelerates" to "feat(proxy): Introduce an S3-aware proxy mode in dfdaemon" on Apr 11, 2026
YQ-Wang force-pushed the feat/s3-aware-proxy branch 2 times, most recently from 12751b1 to de944f8 on April 13, 2026 19:02
Signed-off-by: Yiqing Wang <yiqingwang@roblox.com>
YQ-Wang force-pushed the feat/s3-aware-proxy branch from de944f8 to f286edc on April 13, 2026 22:36