Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LoadFile should return a dataframe if not using XCom backend #1348

Merged
merged 2 commits into from
Dec 5, 2022

Conversation

kaxil
Copy link
Collaborator

@kaxil kaxil commented Dec 1, 2022

Currently, the LoadFileOperator converts dataframe to a File object when returning. However, this should be handled by the Custom XCom backend and not in LoadFileOperator. Otherwise, we won't be consistent with dataframe and transform decorators who, too returns dataframes.

From Airflow 2.5 and above, this should be handled automatically, for older versions, users should use our Custom XCom backend so that their XCom table isn't fully of GBs of data.

part of #1337

Does this introduce a breaking change?

No, if you are using our XCom backend. Yes, if you are not! but this fixes the behaviour and makes it consistent with other decorators and operators that return dataframes

Checklist

  • Created tests which fail without the change (if possible)
  • Extended the README / documentation, if necessary

Currently, the `LoadFileOperator` converts dataframe to a File object when returning. However, this should be handled by the Custom XCom backend and not in `LoadFileOperator`. Otherwise, we won't be consistent with dataframe and transform decorators who, too returns dataframes.

From Airflow 2.5 and above, this should be handled automatically, for older versions, users should use our Custom XCom backend so that their XCom table isn't fully of GBs of data.
@codecov
Copy link

codecov bot commented Dec 1, 2022

Codecov Report

Base: 95.71% // Head: 94.70% // Decreases project coverage by -1.01% ⚠️

Coverage data is based on head (8f6c5a3) compared to base (0b7dc13).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1348      +/-   ##
==========================================
- Coverage   95.71%   94.70%   -1.02%     
==========================================
  Files          19       73      +54     
  Lines         677     3513    +2836     
  Branches       68      402     +334     
==========================================
+ Hits          648     3327    +2679     
- Misses         18      114      +96     
- Partials       11       72      +61     
Impacted Files Coverage Δ
python-sdk/src/astro/sql/operators/load_file.py 96.70% <100.00%> (ø)
python-sdk/src/astro/utils/typing_compat.py 100.00% <0.00%> (ø)
...hon-sdk/src/astro/files/locations/google/gdrive.py 97.61% <0.00%> (ø)
python-sdk/src/astro/custom_backend/serializer.py 85.18% <0.00%> (ø)
python-sdk/src/astro/dataframes/pandas.py 87.09% <0.00%> (ø)
python-sdk/src/astro/utils/table.py 94.28% <0.00%> (ø)
python-sdk/src/astro/lineage/facets.py 100.00% <0.00%> (ø)
python-sdk/src/astro/utils/dataframe.py 91.66% <0.00%> (ø)
python-sdk/src/astro/databases/snowflake.py 95.52% <0.00%> (ø)
python-sdk/src/astro/files/locations/google/gcs.py 97.29% <0.00%> (ø)
... and 45 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@feluelle feluelle merged commit 1283394 into main Dec 5, 2022
@feluelle feluelle deleted the fix-loadfile branch December 5, 2022 10:23
kaxil added a commit that referenced this pull request Dec 13, 2022
Currently, the `LoadFileOperator` converts dataframe to a File object
when returning. However, this should be handled by the Custom XCom
backend and not in `LoadFileOperator`. Otherwise, we won't be consistent
with dataframe and transform decorators who, too returns dataframes.

From Airflow 2.5 and above, this should be handled automatically, for
older versions, users should use our Custom XCom backend so that their
XCom table isn't fully of GBs of data.

part of #1337

## Does this introduce a breaking change?
No, if you are using our XCom backend. Yes, if you are not! but this
fixes the behaviour and makes it consistent with other decorators and
operators that return dataframes

### Checklist
- [x] Created tests which fail without the change (if possible)
- [x] Extended the README / documentation, if necessary

(cherry picked from commit 1283394)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants