[REVIEW] Change default dtype of all nulls column from float to object #9803

Merged
merged 27 commits into from Dec 14, 2021

Conversation

galipremsagar

@galipremsagar galipremsagar commented Nov 30, 2021

Fixes: #9337

  • This PR changes the default dtype of an all-nulls column from float64 to object.
  • To have np.nan values read as a float column, nan_as_null=False must be passed to the cudf.DataFrame constructor. This is in line with what the cudf.Series constructor already supports.
  • Adds the has_nans and nan_count properties, which are needed for some of the checks.
  • Caches nan_count, since it is used repeatedly in math operations, and clears that cache in the regular _clear_cache call.
  • Fixes the pytests that would otherwise break because of this breaking change in dtypes.
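
The caching described above can be sketched in plain Python. This is only an illustration of the pattern (compute nan_count once, reuse it, and reset it in _clear_cache), not cuDF's implementation: the ToyColumn class, its set_values method, and the scans counter are hypothetical names introduced here, and the data lives in an ordinary list rather than on the GPU.

```python
import math
from typing import List, Optional


class ToyColumn:
    """Hypothetical stand-in for a column; NOT cuDF's actual class."""

    def __init__(self, values: List[Optional[float]]) -> None:
        self._values = list(values)
        # None means "not computed yet".
        self._nan_count: Optional[int] = None
        # Counts full passes over the data, purely for illustration.
        self.scans = 0

    @property
    def nan_count(self) -> int:
        # Scan the data only when no cached value is available.
        if self._nan_count is None:
            self.scans += 1
            self._nan_count = sum(
                1
                for v in self._values
                if isinstance(v, float) and math.isnan(v)
            )
        return self._nan_count

    @property
    def has_nans(self) -> bool:
        # Reuses the cached count, so repeated checks cost no extra scan.
        return self.nan_count != 0

    def _clear_cache(self) -> None:
        # Drop the cached count so the next access recomputes it.
        self._nan_count = None

    def set_values(self, values: List[Optional[float]]) -> None:
        # Hypothetical mutator: any change to the data invalidates the cache.
        self._values = list(values)
        self._clear_cache()


col = ToyColumn([1.0, float("nan"), None, float("nan")])
print(col.nan_count, col.has_nans, col.scans)
```

In this sketch None (a null) is deliberately not counted as NaN, mirroring the distinction the PR draws between nulls and np.nan values. Reading nan_count and has_nans repeatedly scans the data once; only a call that goes through _clear_cache forces a recount.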

@github-actions github-actions bot added the cuDF (Python) Affects Python cuDF API. label Nov 30, 2021
@galipremsagar galipremsagar added this to PR-WIP in v22.02 Release via automation Dec 1, 2021
@galipremsagar galipremsagar added breaking Breaking change bug Something isn't working labels Dec 2, 2021

codecov bot commented Dec 3, 2021

Codecov Report

Merging #9803 (d5404c1) into branch-22.02 (967a333) will decrease coverage by 0.08%.
The diff coverage is n/a.


@@               Coverage Diff                @@
##           branch-22.02    #9803      +/-   ##
================================================
- Coverage         10.49%   10.40%   -0.09%     
================================================
  Files               119      119              
  Lines             20305    20501     +196     
================================================
+ Hits               2130     2134       +4     
- Misses            18175    18367     +192     
Impacted Files                            Coverage Δ
python/dask_cudf/dask_cudf/sorting.py     92.30% <0.00%> (-0.61%) ⬇️
python/cudf/cudf/__init__.py              0.00% <0.00%> (ø)
python/cudf/cudf/core/frame.py            0.00% <0.00%> (ø)
python/cudf/cudf/core/index.py            0.00% <0.00%> (ø)
python/cudf/cudf/io/parquet.py            0.00% <0.00%> (ø)
python/cudf/cudf/core/series.py           0.00% <0.00%> (ø)
python/cudf/cudf/utils/utils.py           0.00% <0.00%> (ø)
python/cudf/cudf/utils/ioutils.py         0.00% <0.00%> (ø)
python/cudf/cudf/core/dataframe.py        0.00% <0.00%> (ø)
python/cudf/cudf/core/multiindex.py       0.00% <0.00%> (ø)
... and 11 more

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b3b299a...d5404c1. Read the comment docs.

@galipremsagar galipremsagar changed the title [WIP] Change default dtype of all nulls column from float to object [REVIEW] Change default dtype of all nulls column from float to object Dec 7, 2021
@galipremsagar galipremsagar self-assigned this Dec 7, 2021
@galipremsagar galipremsagar moved this from PR-WIP to PR-Needs review in v22.02 Release Dec 7, 2021
@galipremsagar galipremsagar marked this pull request as ready for review December 7, 2021 17:06
@galipremsagar galipremsagar requested a review from a team as a code owner December 7, 2021 17:06

@brandon-b-miller brandon-b-miller left a comment

Couple q's otherwise LGTM

v22.02 Release automation moved this from PR-Needs review to PR-Reviewer approved Dec 9, 2021
@galipremsagar galipremsagar added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 4 - Needs cuDF (Python) Reviewer labels Dec 14, 2021
@galipremsagar
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 2627153 into rapidsai:branch-22.02 Dec 14, 2021
v22.02 Release automation moved this from PR-Reviewer approved to Done Dec 14, 2021
rapids-bot bot pushed a commit to rapidsai/cuspatial that referenced this pull request Dec 18, 2021
Following the breaking change in `cuDF` [here](rapidsai/cudf#9803), this test needs to be updated.

Authors:
  - Jordan Jacobelli (https://github.com/Ethyling)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Paul Taylor (https://github.com/trxcllnt)

URL: #472
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge breaking Breaking change bug Something isn't working cuDF (Python) Affects Python cuDF API.
Development

Successfully merging this pull request may close these issues.

[BUG] Assigning scalar boolean to a Series w/ nulls results in wrong data type
3 participants