fix: from_dataframe with numpy==1.26.1 and type handling in python 3.9 #1823

Merged
merged 6 commits into main from fix-from-dataframe
Oct 23, 2023

Conversation

@JohannesMessner JohannesMessner commented Oct 18, 2023

We perform a check that no longer works with new versions of numpy, because they switched equality semantics to broadcast over every element in the array.

This PR fixes that by treating np arrays as a special case in the affected check. It does the same for tf, torch, and jax tensors, because == semantics on tensor-like objects can always be a bit surprising.
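
The idea of special-casing tensor types before comparing can be sketched as follows. This is a minimal illustration, not the code from this PR: `ElementwiseArray` is a dependency-free stand-in for an array type whose `==` compares element by element, and `is_none_like` is a hypothetical helper name.

```python
from typing import Any


class ElementwiseArray:
    """Hypothetical stand-in for a tensor type whose == is elementwise."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Broadcast the comparison over every element, as array libraries do.
        return ElementwiseArray(v == other for v in self.values)

    def __bool__(self):
        # Refuse to collapse several elements into a single truth value.
        if len(self.values) != 1:
            raise ValueError("truth value of a multi-element array is ambiguous")
        return bool(self.values[0])


def is_none_like(value: Any, tensor_types: tuple = (ElementwiseArray,)) -> bool:
    """Return True for None or the string 'None' (hypothetical check)."""
    # Check the type first so that == is never evaluated on a tensor.
    if isinstance(value, tensor_types):
        return False
    return value is None or value == "None"
```

With the type check in place, a one-dimensional array containing only 0 is not treated as None, and a multi-element array no longer raises when the result of `==` is used in a boolean context.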

The numpy version in question (1.26.1) doesn't support Python 3.8 or lower, but we do, so poetry can't lock to that version. To work around this, I manually install the "problematic" numpy version in the CI. For the same reason, I am bumping some tests to run on Python 3.9.

No new tests are added, since the existing tests already fail with the numpy version in question and pass with this fix (see the failing check of the CI run after the new numpy version was installed, but before the fix was pushed, here).

This PR also includes a fix for typing changes in Python 3.9 and above: it uses the get_args helper instead of .__args__.
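
As a rough sketch of why `typing.get_args` is the safer accessor: it returns an empty tuple for types without parameters, whereas reading `.__args__` directly raises `AttributeError` on a plain class such as `int`. The helper name `inner_types` below is hypothetical and is not taken from this PR.

```python
from typing import Any, Dict, List, Optional, Tuple, get_args


def inner_types(tp: Any) -> Tuple[Any, ...]:
    """Return the type parameters of `tp`, dropping NoneType (hypothetical helper)."""
    # get_args returns () for unparameterized or non-generic types,
    # so no hasattr/try-except guard around __args__ is needed.
    return tuple(arg for arg in get_args(tp) if arg is not type(None))


print(inner_types(Optional[int]))   # parameters of Optional[int] without NoneType
print(inner_types(Dict[str, int]))  # key and value types
print(inner_types(int))             # plain class: empty tuple, no exception
```

Because the helper goes through `get_args`, the same call works for parameterized generics and for plain classes alike.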

closes #1821

Signed-off-by: Johannes Messner <messnerjo@gmail.com>
Signed-off-by: Johannes Messner <messnerjo@gmail.com>

codecov bot commented Oct 18, 2023

Codecov Report

Attention: 4 lines in your changes are missing coverage. Please review.

Comparison is base (7479f59) 84.48% compared to head (b70a669) 85.04%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1823      +/-   ##
==========================================
+ Coverage   84.48%   85.04%   +0.55%     
==========================================
  Files         135      135              
  Lines        9006     9028      +22     
==========================================
+ Hits         7609     7678      +69     
+ Misses       1397     1350      -47     
Flag Coverage Δ
docarray 85.04% <86.66%> (+0.55%) ⬆️


Files Coverage Δ
docarray/base_doc/mixins/io.py 90.73% <100.00%> (+0.48%) ⬆️
docarray/display/document_summary.py 86.25% <100.00%> (ø)
docarray/base_doc/doc.py 92.18% <60.00%> (ø)
docarray/helper.py 93.00% <90.90%> (-0.59%) ⬇️

... and 6 files with indirect coverage changes


Signed-off-by: Johannes Messner <messnerjo@gmail.com>
Signed-off-by: Johannes Messner <messnerjo@gmail.com>
Signed-off-by: Johannes Messner <messnerjo@gmail.com>
@JohannesMessner JohannesMessner changed the title from "fix: from_dataframe with numpy==1.26.1" to "fix: from_dataframe with numpy==1.26.1 and type handling in python 3.9" Oct 18, 2023
@JohannesMessner JohannesMessner marked this pull request as ready for review October 18, 2023 13:49
@JoanFM JoanFM left a comment

Will a np.array of one dimension with 0 be considered a None?

@JohannesMessner
Member Author

Will a np.array of one dimension with 0 be considered a None?

No, it will not. It also wasn't before.

JoanFM commented Oct 18, 2023

Will a np.array of one dimension with 0 be considered a None?

No, it will not. It also wasn't before.

can we add a test for it?

Signed-off-by: Johannes Messner <messnerjo@gmail.com>
@github-actions

📝 Docs are deployed on https://ft-fix-from-dataframe--jina-docs.netlify.app 🎉

JoanFM commented Oct 20, 2023

We need to add a matrix and test with Python 3.8. Since it is the minimum version we maintain, we need to make sure we test it.

@JoanFM JoanFM merged commit 98d1f1f into main Oct 23, 2023
38 checks passed
@JoanFM JoanFM deleted the fix-from-dataframe branch October 23, 2023 07:37
@JoanFM JoanFM mentioned this pull request Oct 23, 2023
Successfully merging this pull request may close these issues.

bug: deserializing from dataframe with numpy==1.26.1
2 participants