Fix bug with datetime not being serialized #1334

Merged
merged 2 commits into from
Dec 1, 2022

Conversation

kaxil
Collaborator

@kaxil kaxil commented Nov 30, 2022

Since Airflow still doesn't support serializing sets or datetime, it would be better to serialize dataframes to string instead of dict.
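
A minimal sketch of the failure mode described above, using only Python's standard `json` module as a stand-in for a JSON-based serializer (it is not Airflow's actual XCom code path). The `record` dict and the `is_json_serializable` helper are hypothetical names made up for illustration, and `default=str` is only one lossy way to get a string.

```python
import json
from datetime import datetime

# Hypothetical row: the datetime and set values stand in for DataFrame cells.
record = {"id": 1, "created_at": datetime(2022, 11, 30, 12, 0), "tags": {"a"}}


def is_json_serializable(value):
    """Return True if the standard json module can encode `value`."""
    try:
        json.dumps(value)
        return True
    except TypeError:
        return False


# A dict holding datetime or set values is rejected by a plain JSON encoder.
plain_ok = is_json_serializable(record)

# Rendering the payload to a string first leaves only a str to pass along.
# default=str is lossy: the receiver gets text, not datetime or set objects.
as_string = json.dumps(record, default=str)
string_ok = is_json_serializable(as_string)
```

The trade-off is that the consumer has to parse the string and restore any types it cares about.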

This PR also adds a pre-commit hook so we don't miss the test_ prefix in test file names.

codecov bot commented Nov 30, 2022

Codecov Report

Base: 95.71% // Head: 94.68% // Decreases project coverage by -1.03% ⚠️

Coverage data is based on head (36202b5) compared to base (1c506c8).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1334      +/-   ##
==========================================
- Coverage   95.71%   94.68%   -1.04%     
==========================================
  Files          19       74      +55     
  Lines         677     3516    +2839     
  Branches       68      401     +333     
==========================================
+ Hits          648     3329    +2681     
- Misses         18      114      +96     
- Partials       11       73      +62     
Impacted Files Coverage Δ
python-sdk/src/astro/dataframes/pandas.py 87.09% <100.00%> (ø)
python-sdk/src/astro/files/types/csv.py 100.00% <0.00%> (ø)
python-sdk/src/astro/files/__init__.py 83.33% <0.00%> (ø)
python-sdk/src/astro/sql/operators/transform.py 87.09% <0.00%> (ø)
python-sdk/src/astro/sql/operators/drop.py 100.00% <0.00%> (ø)
python-sdk/src/astro/sql/table.py 100.00% <0.00%> (ø)
python-sdk/src/astro/table.py 100.00% <0.00%> (ø)
python-sdk/src/astro/sql/operators/merge.py 95.83% <0.00%> (ø)
...ython-sdk/src/astro/sql/operators/base_operator.py 100.00% <0.00%> (ø)
...thon-sdk/src/astro/sql/operators/base_decorator.py 95.00% <0.00%> (ø)
... and 46 more

Contributor

@pankajkoti pankajkoti left a comment

We will need to correct the assert in the tests here:

assert s_df == {"id": {0: 1}, "name": {0: "xyz"}}

to assert s_df == {"data": {"id": {0: 1}, "name": {0: "xyz"}}}

Also, it looks like the tests are not running for that file because the file name is missing the test_ prefix. We will need to rename it to test_pandas.py; this is also why Codecov is reporting that the patched lines are not covered.

@pankajkoti
Contributor

pankajkoti commented Nov 30, 2022

I'm also seeing another issue: the output of self.to_json() differs from that of self.to_dict(). self.to_json() has the inner keys as strings, whereas self.to_dict() has the inner keys as integers. Will this have an impact somewhere?

Screenshot 2022-11-30 at 11 46 12 PM

@kaxil kaxil force-pushed the fix-df-error branch 2 times, most recently from af92bbc to d45598f Compare November 30, 2022 23:57
@kaxil
Collaborator Author

kaxil commented Dec 1, 2022

I'm also seeing another issue: the output of self.to_json() differs from that of self.to_dict(). self.to_json() has the inner keys as strings, whereas self.to_dict() has the inner keys as integers. Will this have an impact somewhere?

Airflow's XCom still can't serialize datetime, set, and some other types, so we can't use self.to_dict(). self.to_json() converts the dataframe to a string, so we are good.
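
The key-type difference raised in this thread is a general property of JSON, where object keys are always strings, so it can be reproduced with the standard `json` module alone. The sketch below assumes a column-oriented payload shaped like the one in the test assertion; `restore_int_keys` is a hypothetical helper for illustration, not part of the SDK.

```python
import json

# Shape taken from the test assertion in this PR: column -> {row index -> value}.
as_dict = {"id": {0: 1}, "name": {0: "xyz"}}

# JSON object keys are always strings, so integer row labels come back as "0".
round_tripped = json.loads(json.dumps(as_dict))


def restore_int_keys(columns):
    """Hypothetical helper: turn the string row labels back into integers."""
    return {
        column: {int(label): value for label, value in rows.items()}
        for column, rows in columns.items()
    }


restored = restore_int_keys(round_tripped)
```

Any consumer that compares the deserialized payload against the original dict therefore has to convert the keys back first, or compare against string keys.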

Also, it looks like the tests are not running for that file because the file name is missing the test_ prefix. We will need to rename it to test_pandas.py; this is also why Codecov is reporting that the patched lines are not covered.

Added pre-commit hook so that we detect such cases :)
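
As a rough illustration of what such a check can do (this is not the hook added in this PR, whose configuration is not shown here), the sketch below flags Python files under a `tests` directory whose names lack the `test_` prefix. The function name, the ignore list, and the example paths are all assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Assumed exceptions: support files that pytest does not collect as test modules.
IGNORED_NAMES = {"__init__.py", "conftest.py"}


def misnamed_test_files(paths):
    """Return the paths under a tests/ directory that lack the test_ prefix."""
    flagged = []
    for raw in paths:
        path = PurePosixPath(raw)
        if path.suffix != ".py" or path.name in IGNORED_NAMES:
            continue
        if "tests" not in path.parts[:-1]:
            continue  # only files that live under a tests directory are checked
        if not path.name.startswith("test_"):
            flagged.append(raw)
    return flagged


# Hypothetical paths, chosen only to exercise each branch.
flagged = misnamed_test_files(
    [
        "tests/dataframes/pandas.py",
        "tests/dataframes/test_pandas.py",
        "tests/conftest.py",
        "src/astro/dataframes/pandas.py",
    ]
)
```

A pre-commit hook built on a check like this would exit with a non-zero status whenever the returned list is not empty.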

@kaxil kaxil added this to the 1.3.0 milestone Dec 1, 2022
@kaxil kaxil merged commit 8fd6cdd into main Dec 1, 2022
@kaxil kaxil deleted the fix-df-error branch December 1, 2022 01:12
sunank200 pushed a commit that referenced this pull request Dec 1, 2022
Since Airflow still doesn't support serializing sets or datetime, it
would be better to serialize dataframes to string instead of dict.

This PR also adds pre-commit hook so we don't miss `test_` prefix in the
test files

(cherry picked from commit 8fd6cdd)