Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create TaskHandlersMovedRule to ease upgrade to Airflow 2.0 #11051

Closed
turbaszek opened this issue Sep 21, 2020 · 10 comments · Fixed by #11265
Closed

Create TaskHandlersMovedRule to ease upgrade to Airflow 2.0 #11051

turbaszek opened this issue Sep 21, 2020 · 10 comments · Fixed by #11265
Assignees
Labels
area:upgrade Facilitating migration to a newer version of Airflow good first issue kind:feature Feature Requests

Comments

@turbaszek
Copy link
Member

This issue is part of #8765

Rule

Create TaskHandlersMovedRule which corresponds to

GCSTaskHandler has been moved, WasbTaskHandler has been moved, StackdriverTaskHandler has been moved , S3TaskHandler has been moved, ElasticsearchTaskHandler has been moved, CloudwatchTaskHandler has been moved

entry in UPDATING.md. This rule should allow users to check if their current configuration needs any adjusting
before migration to Airflow 2.0.

How to guide

To implement a new rule, create a class that inherits from airflow.upgrade.rules.base_rule.BaseRule.
It will be auto-registered and used by airflow upgrade-check command. The custom rule class has to have title,
description properties and should implement check method which returns a list of error messages in case of
incompatibility.

For example:

class ConnTypeIsNotNullableRule(BaseRule):
title = "Connection.conn_type is not nullable"
description = """\
The `conn_type` column in the `connection` table must contain content. Previously, this rule was \
enforced by application logic, but was not enforced by the database schema.
If you made any modifications to the table directly, make sure you don't have null in the conn_type column.\
"""
@provide_session
def check(self, session=None):
invalid_connections = session.query(Connection).filter(Connection.conn_type.is_(None))
return (
'Connection<id={}", conn_id={}> have empty conn_type field.'.format(conn.id, conn.conn_id)
for conn in invalid_connections
)

Remember to open the PR against v1-10-test branch.

@turbaszek turbaszek added area:upgrade Facilitating migration to a newer version of Airflow kind:feature Feature Requests good first issue labels Sep 21, 2020
@turbaszek turbaszek added this to the Airflow 1.10.13 milestone Sep 21, 2020
@ephraimbuddy
Copy link
Contributor

I'm interested in this issue

@ephraimbuddy
Copy link
Contributor

But it seems like this could be added to the import changes rule PR #11056?

@turbaszek
Copy link
Member Author

@ephraimbuddy this is more like about checking that users configuration in airflow.cfg uses old task handlers in task_log_reader. Similar to #11067

@ephraimbuddy
Copy link
Contributor

Got it

@turbaszek turbaszek moved this from To do to In progress in Airflow upgrade check Sep 22, 2020
@ephraimbuddy
Copy link
Contributor

The remote logging documentation for the stable release is the same as the master documentation.

2.0 doc: https://airflow.readthedocs.io/en/latest/logging-monitoring/logging-tasks.html
stable doc: https://airflow.readthedocs.io/en/stable/howto/write-logs.html

I'm finding it difficult looking for what to check in previous version that has changed in 2.0.
What I have found so far is the change in import path as can be seen here :
https://github.com/apache/airflow/blob/master/UPDATING.md#airflowcontributilslog-has-been-moved.

What do you think?

@turbaszek
Copy link
Member Author

What I have found so far is the change in import path as can be seen here :
https://github.com/apache/airflow/blob/master/UPDATING.md#airflowcontributilslog-has-been-moved.

That's exactly the change. I see that the UPDATING.md has changed a bit since we created the list of rules... We should check if users configuration uses one of those handlers and warn them that this will change in 2.0

@ephraimbuddy
Copy link
Contributor

I'm suggesting that we add it in import change PR #11056 just like we added it in the deprecated classes here:

ALL = HOOKS + OPERATORS + SECRETS + SENSORS + TRANSFERS + UTILS + LOGS
,

This will help us not to duplicate code. What do you think?

LOGS = [
(
"airflow.providers.amazon.aws.log.s3_task_handler.S3TaskHandler",
"airflow.utils.log.s3_task_handler.S3TaskHandler"
),
(
'airflow.providers.amazon.aws.log.cloudwatch_task_handler.CloudwatchTaskHandler',
'airflow.utils.log.cloudwatch_task_handler.CloudwatchTaskHandler'
),
(
'airflow.providers.elasticsearch.log.es_task_handler.ElasticsearchTaskHandler',
'airflow.utils.log.es_task_handler.ElasticsearchTaskHandler'
),
(
"airflow.providers.google.cloud.log.stackdriver_task_handler.StackdriverTaskHandler",
"airflow.utils.log.stackdriver_task_handler.StackdriverTaskHandler"
),
(
"airflow.providers.google.cloud.log.gcs_task_handler.GCSTaskHandler",
"airflow.utils.log.gcs_task_handler.GCSTaskHandler"
),
(
"airflow.providers.microsoft.azure.log.wasb_task_handler.WasbTaskHandler",
"airflow.utils.log.wasb_task_handler.WasbTaskHandler"
)
]

@turbaszek
Copy link
Member Author

The difference is that the #11056 is for DAGs, task handlers are used in airflow.cfg that is not validated in #11056 So we need to look for problems in two different places

@ephraimbuddy
Copy link
Contributor

Ok. It's clear now

@turbaszek
Copy link
Member Author

Closed via #11265

Airflow upgrade check automation moved this from In progress to Done Oct 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:upgrade Facilitating migration to a newer version of Airflow good first issue kind:feature Feature Requests
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

3 participants