-
Notifications
You must be signed in to change notification settings - Fork 16.4k
Decouple task sdk from airflow core for remote logging #60826
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Decouple task sdk from airflow core for remote logging #60826
Conversation
jason810496
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! LGTM overall.
potiuk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
With tests passing of course and the few comments by @jason810496 |
08cd558 to
43319c3
Compare
43319c3 to
58e0bc1
Compare
|
Sorry for the noise folks! Bad rebase |
|
@jason810496 are you OK with the changes here? |
jason810496
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jason810496 are you OK with the changes here?
Yeah, LGTM once the CI pass.
|
CI is finally green, merging this. |
Nice! Thanks Amogh. |
Was generative AI tooling used to co-author this PR?
Used Cursor with Claude Sonnet 4.5, mostly for tests
Problem
task-sdk had several imports from airflow-core for remote logging, for instance
from airflow.logging_config import RemoteLogIO, get_remote_task_log, get_default_remote_conn_idfrom airflow.configuration import conf(used atleast 4 times inlog.py)Due to this coupling, client-server separation is difficult where task sdk dfoesn't import from core airflow.
Proposal
To the existing shared library (
logging), I am adding utilities containing the remote logging protocols and discovery logic. Both core and sdk now use this shared code, but each uses its own configuration source (conf)For context as to what's being moved:
The
RemoteLogIOandRemoteLogStreamIOprotocols define the interface that S3, GCS, CloudWatch and other remote log handlers implement. These protocols are now in the shared library where both core and sdk (and providers) can reference them.The discovery logic that found and was responsible for loading the remote log handler from the logging config module is extracted into a helper
discover_remote_log_handler()function. This function takes the config paths and import function as parameters, so core and sdk can each inject their own config.Summary of changes
Airflow core's:
airflow/logging/remote.pyis just a backcompat shim now that re-exports from the shared library. Thelogging_config.pymodule uses the shared discovery function instead of it's earlier logic.For task-sdk,
sdk/log.pygets its own_ActiveLoggingConfigclass and discovery logic exports. All imports from core are replaced with sdk or shared imports. The earlier TODO's are removed too.Impact on remote logging
No breaking changes.
For confidence, entire testing flow is here:
breeze start-airflow --integration localstackawslocaltooNOTE:
While I was at it, I realised that sdk conf didn't have support for expansion variables like
AIRFLOW_HOMEleading to remote log tests failing with:This was because
AIRFLOW_HOMEwas not being expanded:Fixed in 43319c3
{pr_number}.significant.rstor{issue_number}.significant.rst, in airflow-core/newsfragments.