Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider numpy scalars and 0d arrays as scalars when padding #9653

Merged
merged 6 commits into from Mar 7, 2023

Conversation

keewis
Copy link
Contributor

@keewis keewis commented Nov 14, 2022

As discussed in #9645, this adds getattr(pad_value, "ndim", None) == 0 to the condition that checks if the pad value is scalar.

While implementing the code, I noticed that while 0d arrays are definitely not instances of Number, for numpy scalars this is a bit more tricky: scalars with int / float / complex / timedelta64 dtypes are already considered as scalars (the isinstance check evaluates to True), while datetime64, bool_, str_ and probably others are not.

As such, I'm not sure what the best way to test scalars would be. I'd be tempted to just write new tests specifically for expand_pad_value, but that might not fit the existing test strategy (which for dask.array.creation seems to just check the high-level functions).

@GPUtester
Copy link
Collaborator

Can one of the admins verify this patch?

Admins can comment ok to test to allow this one PR to run or add to allowlist to allow all future PRs from the same author to run.

@github-actions github-actions bot added the array label Nov 14, 2022
@ncclementi
Copy link
Member

add to allowlist

@keewis
Copy link
Contributor Author

keewis commented Dec 15, 2022

since I couldn't come up with a better way to test this, I decided to just add a new test that checks a bunch of different dtype / scalar or 0d array combinations for mode="constant". Not sure if we'd need to check other modes as well, but then testing expand_pad_value directly would probably be a better idea.

@keewis keewis closed this Jan 18, 2023
@keewis keewis reopened this Jan 18, 2023
@keewis
Copy link
Contributor Author

keewis commented Jan 18, 2023

gentle ping. I think the failing tests are unrelated (some FutureWarning for dask.dataframe), so this should be ready for review.

@keewis
Copy link
Contributor Author

keewis commented Feb 13, 2023

gentle ping. @jrbourbeau, this is ready for a review.

@keewis
Copy link
Contributor Author

keewis commented Mar 7, 2023

gentle reminder that this PR is still waiting for a review (and the only actual change is a single line). It would be really great if this could be merged any time soon, because that would allow fixing a few bugs in downstream libraries (most notably xarray and datatree).

@jrbourbeau
Copy link
Member

Apologies for the lack of response @keewis. Taking another look now

@jrbourbeau jrbourbeau changed the title consider numpy scalars and 0d arrays as scalars when padding Consider numpy scalars and 0d arrays as scalars when padding Mar 7, 2023
Copy link
Member

@jrbourbeau jrbourbeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @keewis -- this is in. Again, apologies for the delayed response

@jrbourbeau jrbourbeau merged commit a6a57b8 into dask:main Mar 7, 2023
@keewis keewis deleted the expand_tab_value-scalar-0d branch March 7, 2023 19:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

padding datetime arrays with missing values
4 participants