Use presigned URL for delivering logs from worker to central S3 bucket #64254
Description
Add support for pre-signed URL based log I/O as a RemoteLogIO implementation, where:
- API server (reader): Uses the built-in S3RemoteLogIO with direct IAM access for reading logs (no change needed).
- Worker (uploader): Uses a new RemoteLogIO that requests a pre-signed PUT URL from the API server and uploads via plain HTTP. No AWS credentials needed on the worker.
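The worker side described above could be sketched roughly as follows. This is a minimal illustration using only the standard library, not the proposed implementation: the `/presigned-url` path and the `presigned_url` response field are taken from the diagram in this issue, while the function names, the `log_path` request field, and the bearer-token header are assumptions made for the example.

```python
import json
import urllib.request


def request_presigned_url(api_base, log_path, token, opener=urllib.request.urlopen):
    """Ask the API server for a pre-signed PUT URL for one log file.

    The request body field `log_path` and the bearer token are hypothetical.
    """
    body = json.dumps({"log_path": log_path}).encode("utf-8")
    req = urllib.request.Request(
        f"{api_base}/presigned-url",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
    with opener(req) as resp:
        return json.load(resp)["presigned_url"]


def upload_log(presigned_url, log_bytes, opener=urllib.request.urlopen):
    """Upload the log with a plain HTTP PUT; no AWS credentials are involved."""
    req = urllib.request.Request(presigned_url, data=log_bytes, method="PUT")
    with opener(req) as resp:
        return resp.status
```

The `opener` parameter exists only so the two HTTP calls can be replaced in tests; a real worker would read the task log from disk and pass its bytes to `upload_log`.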
How it works
Worker (after task)          API Server              S3
|                            |                       |
|-- POST /presigned-url ---->|                       |
|                            |-- generate PUT URL -->|
|<-- { presigned_url } ------|                       |
|                                                    |
|------------- HTTP PUT log file ------------------->|
The API server endpoint that generates pre-signed URLs can enforce custom authorization rules before issuing the URL - e.g. verifying the worker's service account is only allowed to upload logs for DAGs in its bundle.
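A rough sketch of what the API server side might do is shown below. It assumes an S3 client with boto3's `generate_presigned_url` method is passed in; the `issue_presigned_put_url` and `bundle_prefix_rule` names, the `Forbidden` exception, and the idea of representing a worker's permissions as a set of allowed key prefixes are all hypothetical and only illustrate the "authorize first, then sign" order described above.

```python
class Forbidden(Exception):
    """Raised when the caller may not upload to the requested key."""


def bundle_prefix_rule(allowed_prefixes, log_path):
    """Hypothetical rule: the key must sit under one of the caller's prefixes."""
    return any(log_path.startswith(prefix) for prefix in allowed_prefixes)


def issue_presigned_put_url(s3_client, bucket, log_path, allowed_prefixes,
                            authorize=bundle_prefix_rule, expires_in=300):
    """Run the authorization rule, then sign a short-lived PUT URL."""
    if not authorize(allowed_prefixes, log_path):
        raise Forbidden(f"upload to {log_path!r} is not allowed")
    # With boto3 this would be boto3.client("s3"); only the API server
    # needs credentials for the central bucket.
    url = s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": log_path},
        ExpiresIn=expires_in,
    )
    return {"presigned_url": url}
```

Keeping the expiry short (300 seconds here, an arbitrary choice) limits how long a leaked URL stays usable, and the URL is only valid for the single key it was signed for.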
The only change a worker deployment needs is to enable the new functionality:
[logging]
remote_log_io_role = worker
Optional: custom auth hook
By default, the presigned URL endpoints use standard Airflow authentication (the requesting user must be authenticated). For deployments that need additional authorization logic (e.g. bundle-scoped access, tenant isolation), an optional callable can be configured:
[logging]
presigned_url_auth_hook = mypackage.auth.validate_log_access
I'm deploying this solution to production on our side and would gladly contribute it if it would be accepted. (I don't want to go to the trouble of getting approval to open-source it if nobody is interested :))
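To make the custom auth hook idea more concrete, here is a small sketch of how a dotted path such as `mypackage.auth.validate_log_access` could be resolved to a callable, together with an example hook. The hook signature `(tenant, log_path)` and the tenant-prefix rule are assumptions for illustration; the issue does not specify what arguments the hook would receive.

```python
import importlib


def load_hook(dotted_path):
    """Resolve 'package.module.attr' to a callable."""
    module_name, _, attr = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"{dotted_path!r} is not a dotted path")
    hook = getattr(importlib.import_module(module_name), attr)
    if not callable(hook):
        raise TypeError(f"{dotted_path!r} does not point to a callable")
    return hook


def validate_log_access(tenant, log_path):
    """Hypothetical hook: a tenant may only touch keys under its own prefix."""
    return log_path.startswith(f"{tenant}/")
```

An endpoint could call the loaded hook after standard authentication succeeds and refuse to issue a URL when the hook returns a falsy value.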
Use case/motivation
Airflow 3.x introduced RemoteLogIO as the protocol for remote log upload/download from the supervisor process. Currently, the only built-in implementation uses direct S3 access (S3RemoteLogIO), which requires the worker to have S3 credentials.
In multi-account or zero-trust deployments, workers run on a separate AWS account and should not be trusted with S3 write credentials. There is no way to plug in a custom log upload/download mechanism that uses pre-signed URLs instead of direct S3 access.
Related issues
No response
Are you willing to submit a PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct