Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix broken task_log_prefix_template by adding it to FileTaskHandler itself. #38709

Closed
wants to merge 14 commits into from

Conversation

abhishekbhakat
Copy link
Contributor

@abhishekbhakat abhishekbhakat commented Apr 3, 2024

As of now the config task_log_prefix_template does not work. Although the below code is specifically written for this purpose, the class is extending a StreamHandler seems obsolete:

class TaskHandlerWithCustomFormatter(logging.StreamHandler):

I recommend removing this StreamHandler. This PR implements the prefix to FileTaskHandler itself. Have also written tests for the same.

The only use-case to discuss will be elasticsearch's write_stdout.

Please recommend if we are good to drop the TaskHandlerWithCustomFormatter.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@abhishekbhakat
Copy link
Contributor Author

CC @pankajkoti

Copy link
Member

@pankajkoti pankajkoti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @abhishekbhakat are we able to test this with well including remote loggers like S3, GCP, Azure, Elasticsearch?

@abhishekbhakat
Copy link
Contributor Author

I've verified this with S3 yet. Will check on others today.

@pankajkoti
Copy link
Member

pankajkoti commented Apr 4, 2024

I've verified this with S3 yet. Will check on others today

Thanks @abhishekbhakat . GCP and Azure would be similar to test like S3. For Elasticsearch, I have detailed one of the ways to test it here #32438 (comment)

@abhishekbhakat
Copy link
Contributor Author

I'm done verifying it on AWS, Azure and GCP. Will confirm for ES in some time.

@abhishekbhakat
Copy link
Contributor Author

So the code changes I made does not accommodate Elasticsearch's write_stdout. It was not working earlier, and so the behavior remains the same.

@eladkal eladkal added the type:bug-fix Changelog: Bug Fixes label Apr 4, 2024
@eladkal eladkal added this to the Airflow 2.9.1 milestone Apr 4, 2024
@abhishekbhakat
Copy link
Contributor Author

To make it working with write_stdout I think I need to fix that StreamHandler (¬_¬;)

@abhishekbhakat
Copy link
Contributor Author

Also, I'm not sure if prefix is even needed on Elasticsearch. If we use ElasticsearchJSONFormatter it already includes fields for dag_id, task_id, etc.
¯\_(ツ)_/¯

Shall we ignore ES for the scope of this PR then ?

@abhishekbhakat
Copy link
Contributor Author

Moving discussions to Issue #39019.

@jedcunningham
Copy link
Member

Instead of adding prefix support to all our of logging contexts, I think we should just keep it scoped to TaskHandlerWithCustomFormatter (it's already documented that way fwiw). My 2c.

@abhishekbhakat
Copy link
Contributor Author

I would say to strip it out completely. That TaskHandlerWithCustomFormatter is unreachable. We don't use StreamHandler for task logs anywhere. That's why asked for a decision on #39019.

@jedcunningham jedcunningham removed this from the Airflow 2.9.1 milestone Apr 26, 2024
@jedcunningham
Copy link
Member

Sorry, didn't mean to split the conversation, but left some more thoughts on #39019.

I've removed this from 2.9.1 since we don't know how we will proceed yet.

@dstandish
Copy link
Contributor

@abhishekbhakat I think you would be able to use TaskHandlerWithCustomFormatter but you'd have to modify your airflow_local_settings.py to reference it in the logging config. to me it feels like we should deprecate this feature and then leave it as is.

@abhishekbhakat
Copy link
Contributor Author

I tried, but couldn't get it working without just using an entirely different Handler.

@abhishekbhakat
Copy link
Contributor Author

Closing this PR in favor of deprecating this and more discussions at #39019.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:logging type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants