Map `File` & `Table` classes to `Dataset` (AIP-48) #786
The current test failures are because we exceeded the rate limit; I will re-run in a few hours.
@kaxil I think we need to pin `attrs` to the correct version.
Ah yes, looks like Airflow has a limit set for 2.3.3 - https://github.com/apache/airflow/blob/2.3.3/setup.cfg#L88-L89 but not for 2.3.4. If you upgrade to Airflow 2.3.4 it will work, but yeah, I will change it to using `attr`.
@utkarsharma2 Updated in ee1389f
Not yet, check the PR description - "As a follow-up to this PR, I will create a PR to automatically add "inlets" and "outlets" to all of the operators in SDK so that users can see the datasets that are produced or consumed by tasks."
Codecov Report
@@            Coverage Diff             @@
##             main     #786      +/-   ##
==========================================
- Coverage   93.31%   93.30%   -0.01%
==========================================
  Files          43       44       +1
  Lines        1869     1898      +29
  Branches      234      235       +1
==========================================
+ Hits         1744     1771      +27
- Misses         97       99       +2
  Partials       28       28
@@ -51,6 +51,10 @@
    "show-inheritance",
    "show-module-summary",
]

suppress_warnings = [
    "autoapi.python_import_resolution",
@kaxil why are we suppressing this warning?
Yes, I should have added a comment. This is because `airflow.datasets` is not yet available in a released Airflow version, so Sphinx fails to resolve it, which is why I am suppressing that warning.
@@ -0,0 +1,9 @@
try:
Are we planning to introduce more things inside the `airflow` package inside Astro?
Since this is, to some extent, a configuration (`DATASET_SUPPORT`), have you considered adding it to `settings.py` or `constants.py` (as `AIRFLOW_DATASET_SUPPORT`)?
Yeah, I am thinking of adding `kwargs_with_datasets` there as well, same as https://github.com/astronomer/astro-sdk/blob/aip-48-mapping/src/astro/utils/compat.py. Putting it in `settings.py` feels wrong, and I am trying to avoid `utils`.
I made that `airflow` module to hold "custom" Airflow-related things, which in the future might include a custom XCom backend or a custom Dataset Event Manager - probably 🤷
    # Airflow >= 2.4
    from airflow.datasets import Dataset

    DATASET_SUPPORT = True
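The hunk above is cut off at the review comment, so the `except` branch is not visible here. A minimal sketch of how such an optional-import shim is commonly completed is shown below. The `except ImportError` branch and the `Dataset = object` fallback are assumptions for illustration, not the code from this PR.

```python
# Sketch of an optional-import shim. Only the `try` branch is taken from the diff;
# the `except` branch below is an assumption about how the module could finish.
try:
    # Airflow >= 2.4
    from airflow.datasets import Dataset

    DATASET_SUPPORT = True
except ImportError:
    # Older Airflow (or Airflow not installed): assumed fallback so that
    # classes written as `class File(Dataset)` can still be defined.
    Dataset = object

    DATASET_SUPPORT = False

print(f"Dataset support available: {DATASET_SUPPORT}")
```

Callers can then branch on `DATASET_SUPPORT` instead of repeating the version check at every import site.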
It is probably worth running a test in Nox so that we cover this branch.
What sort of test are you thinking about for it?
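One possible shape for such a test, as a sketch only: both branches of the import check can be exercised in a single environment by patching `sys.modules`. The helper `detect_dataset_support` and the stub modules are hypothetical names introduced for this example. They are not part of the SDK, and this does not replace running the suite against several Airflow versions.

```python
import sys
import types
from unittest import mock


def detect_dataset_support() -> bool:
    """Hypothetical helper: True when `airflow.datasets.Dataset` is importable."""
    try:
        from airflow.datasets import Dataset  # noqa: F401
    except ImportError:
        return False
    return True


def test_without_dataset_module():
    # A `None` entry in sys.modules makes the import raise ImportError,
    # which simulates Airflow < 2.4.
    with mock.patch.dict(sys.modules, {"airflow.datasets": None}):
        assert detect_dataset_support() is False


def test_with_dataset_module():
    # Stub modules simulate Airflow >= 2.4 without requiring it to be installed.
    fake_airflow = types.ModuleType("airflow")
    fake_datasets = types.ModuleType("airflow.datasets")
    fake_datasets.Dataset = type("Dataset", (), {})
    fake_airflow.datasets = fake_datasets
    stubs = {"airflow": fake_airflow, "airflow.datasets": fake_datasets}
    with mock.patch.dict(sys.modules, stubs):
        assert detect_dataset_support() is True
```

`mock.patch.dict` restores `sys.modules` when each `with` block exits, so the stubs do not leak into other tests.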
@tatiana To identify the type of connection from astro-sdk/python-sdk/tests/sql/test_table.py Lines 78 to 107 in f8cb259
Force-pushed from a783557 to 5fdd872.
Part of #611
This PR will map `File` and `Table` classes to inherit from `Dataset` object (if it is available - will be released in Airflow 2.4). This will allow users to schedule DAGs on a `File` and `Table` object when used with Airflow 2.4 and above.
As a follow-up to this PR, I will create a PR to automatically add "inlets" and "outlets" to all of the operators in SDK so that users can see the datasets that are produced or consumed by tasks.
Fix doc build error
Import from `attr` instead of `attrs`
Looks like Airflow has a limit set for 2.3.3 - https://github.com/apache/airflow/blob/2.3.3/setup.cfg#L88-L89 but not for 2.3.4. This was fixed in Airflow 2.3.4 but I will change it to using attr. `attrs` is a new namespace while `attr` is the old but working one
Part of #611
This PR will map `File` and `Table` classes to inherit from `Dataset` object (if it is available - will be released in Airflow 2.4). This will allow users to schedule DAGs on a `File` and `Table` object when used with Airflow 2.4 and above.
As a follow-up to this PR, I will create a PR to automatically add "inlets" and "outlets" to all of the operators in SDK so that users can see the datasets that are produced or consumed by tasks.
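To illustrate the idea of the mapping, here is a simplified sketch. It uses a stand-in `Dataset` (assumed only to carry a `uri`, as `airflow.datasets.Dataset` does) so it runs without Airflow. The `File` class, its constructor arguments, and the bucket path are hypothetical and much simpler than the SDK's real classes.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Stand-in for `airflow.datasets.Dataset`, which is identified by a `uri`."""

    uri: str


class File(Dataset):
    """Hypothetical, simplified File: its path doubles as the dataset URI."""

    def __init__(self, path: str, conn_id: str = ""):
        # Passing the path as `uri` is what makes the file usable as a Dataset.
        super().__init__(uri=path)
        self.path = path
        self.conn_id = conn_id


# Example values are made up for illustration.
report = File("s3://my-bucket/report.csv", conn_id="aws_default")
print(isinstance(report, Dataset), report.uri)
```

With the real `Dataset` class on Airflow 2.4 and above, an object like `report` could be passed wherever Airflow expects a dataset, for example in a DAG's `schedule` list or a task's `outlets`.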
Does this introduce a breaking change?
No
Checklist