Split Quantity into scalar and sequence classes #764

andrewgsavage · 2019-02-12T22:49:59Z

No description provided.

andrewgsavage · 2019-02-12T22:56:52Z

So that's with BaseQuantity having the methods/attributes shared across sequence and scalar, with a mixin (is that the right name?) having the sequence specific ones.

Quantity inherits from BaseQuantity and returns either a QuantityScalar or QuantitySequence when initiated, depending on the value.

QuantitySequence and QuantityScalar both inherit from Quantity, with QuantitySequence also inheriting the QuantitySequenceMixin

hgrecco · 2019-02-13T00:46:19Z

Great start. I do have a lot of questions (I put them within the code).

hgrecco · 2019-02-13T00:48:06Z

pint/quantity.py

@@ -78,7 +78,7 @@ def wrapped(self, *args, **kwargs):


 @fix_str_conversions
-class _Quantity(PrettyIPython, SharedRegistryObject):
+class BaseQuantity(PrettyIPython, SharedRegistryObject):


Why is this classe "public" (i.e. no underscore)? Is this the class users should check against when using isinstance?

Hadn't thought of this when going through it!

It should be used when the user does know the UnitRegistry. In the current version users should be checking with _Quantity (as done in pint-pandas)
Using isinstance with a Quantity from one UnitRegistry and another UnitRegistry yields false:
'''
ureg=pint.UnitRegistry()

ureg2 = pint.UnitRegistry()

isinstance(my_array,ureg2.Quantity)
False
'''
Should this be public?

hgrecco · 2019-02-13T00:49:26Z

pint/quantity.py


+def build_quantity_class(registry, force_ndarray=False):
+
+    class Quantity(BaseQuantity):


Or this is class to be used when calling isinstance

hgrecco · 2019-02-13T01:03:02Z

Shall we split format between scalar and array code here
Shall we change all the inplace code to be really in place for the array as well (e.g. here
Is all these needed for scalar here? I guess yes because you can still do numpy.cos(34). Is there another, cleaner, way?
Can we move fill, clip, etc to QuantitySequence?
Will this be reached at all? code
Is __iter__ the only thing to test for? code

I guess that all the questions point to the same thing: How much can we simplify the code and making it more maintainable with this change?

andrewgsavage · 2019-02-13T08:15:55Z

So this was very much a 'get it behaving as you expect' before spending time on those more specific things - I was just moving lines around before spending time working out what paths change/need changing.

1, 2, If the behavior for ndarrays is what's expected from a sequence then yes. I guess this applies everywhere ndarrays are treated differently.

Haven't looked at this much. I think array_function may serve us better for some functions, eg add, but won't be all that different when units aren't relevant

4.Yea, missed them - are there any others?

Haha no.
The main thing that slips through that is strings, which are handled previously.

There should be little difference with this change, from here onwards I can only see moving things around / removing checks like 2.

hgrecco · 2019-02-13T21:22:20Z

And how about the connection to pandas ... will this make pint-pandas easier to implement?

andrewgsavage · 2019-02-13T22:53:11Z

Came across this when looking for numpy functions
#484

y = np.arange(0, 1 * ureg.meter, 0.1 * ureg.meter)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: setting an array element with a sequence.

The main reason I'd like this was because of that error message - I hit it many times in pint-pandas and spent a while working around it, so it'll make pint-pandas easier, along with checking for scalar/sequence.
Although I note at present to do means comparing to QuantitySequenceMixin, which I guess needs renaming if it's intended for that use..

Back to that error message - I think it's because somewhere in numpy there's a check for iter or getitem, which makes it think it's a sequence and take a wrong path. Splitting into QuantityScalar fails that check so takes a different path:

y = np.arange(0, 1 * ureg.meter, 0.1 * ureg.meter)
y
array([0, <Quantity(0.1, 'meter')>, <Quantity(0.2, 'meter')>,
       <Quantity(0.30000000000000004, 'meter')>, <Quantity(0.4, 'meter')>,
       <Quantity(0.5, 'meter')>, <Quantity(0.6, 'meter')>,
       <Quantity(0.7, 'meter')>, <Quantity(0.7999999999999999, 'meter')>,
       <Quantity(0.8999999999999999, 'meter')>], dtype=object)

I'm surprised that works with 0 and not 0*ureg.m

andrewgsavage · 2019-02-20T20:53:11Z

So that's the tests passing again, except the uncertainties ones. What's the plan there - are they being split into another module?

I'm in two minds as to how to go forward with __array_function. Functions like arrange
numpy.arange([start, ]stop, [step, ]dtype=None)
look tricky to implement quantity conversions/checks accounting for the various optional args/ optional first arg.

Alternatively we could convert only the args that are quantities to consistent/root units. This would make it easy to implement many functions https://github.com/hgrecco/pint/pull/764/files#diff-d9924213798d0fc092b8cff13928d747R115 but prevents function specific checks.

The other thing I've noticed is the double underscored attributes, eg __used have become _BaseQuantity__used with the inheritance. Not sure if this is an issue.

andrewgsavage · 2019-09-03T01:07:11Z

Overall, I would think maintaining version and configuration compatibility is worth the complexity of supporting the differing interfaces, especially since that's what other libraries in the community seem to be doing.

Are there any libraries that see such a change between supporting/not supporting __array_function__, where users will have to change their code?

This is looking really good. Maybe we can try to define a roadmap and put a date for a release.

Assuming people are happy to only support numpy >=1.16 in the future, I'd suggest the following roadmap:

0.10 - Split Quantity into scalar and sequence classes (would split this PR into one splitting classes and another implementing __array_function__. )
0.11 - Support __array_function__, extas_require numpy>=1.16
We could backport changes to 0.10.x if necessary to maintain support for numpy <1.16

andrewgsavage · 2019-09-03T01:21:15Z

Based on tests in pydata/xarray#3238, it looks like there is inconsistent behavior in pint between numpy functions that go through __array_function__ and those that go through __array_ufunc__, especially with regards to non-explicitly-handled functions. In that case, the __array_function__ implementation returns NotImplemented, whereas the __array_ufunc__ implementation returns a result but strips units without a warning.

Should __array_ufunc__ be brought into consistency with how __array_function__ is implemented here by only operating on explicitly-handled ufuncs?

Maybe. This will break many functions that work but aren't explicitly handled. Are people likely to submit PRs to handle these?

jthielen · 2019-09-03T02:30:55Z

Are there any libraries that see such a change between supporting/not supporting __array_function__, where users will have to change their code?

I think it depends on how much you consider "such a change." With any numpy function that internally calls np.asarray() or np.asanyarray(), an overriding __array_function__ implementation that preserves non-subclass type will be a breaking change in any library supporting __array_function__. Where pint may be different from other non-ndarray-subclass implementers I've seen (dask, sparse, cupy, etc.) is that casting to a plain ndarray and reconstructing afterwards is not an expensive operation, so downstream users are more likely to try working around numpy functions that cast to ndarray by assuming the casting occurs and correcting afterwards, rather than implementing a custom function (this is what MetPy assumed and is now trying to fix, xref Unidata/MetPy#1144).

Assuming people are happy to only support numpy >=1.16 in the future, I'd suggest the following roadmap:

0.10 - Split Quantity into scalar and sequence classes (would split this PR into one splitting classes and another implementing __array_function__. )
0.11 - Support __array_function__, extas_require numpy>=1.16
We could backport changes to 0.10.x if necessary to maintain support for numpy <1.16

Now that you mention it, I like the idea of splitting this PR into one splitting classes and another implementing __array_function__. Right now all the complications relating to __array_function__ do seem to be derailing and holding back the original titled focus of the PR 😁. Let me know if you'd like me to submit the new __array_function__ PR based on my changes in andrewgsavage#6 (which I can close) after you remove the __array_function__ implementation here.

My one concern with putting off __array_function__ until v0.11 is the delay in the __array_function__ implementation making a release. Based on engagement with pydata/xarray#525, #849, and related issues, it seems people are anxious to see this in action (I know I'm definitely one of them). For example, it seems like this would really help MetPy for its upcoming v1.0 release, which is scheduled to be out by the end of 2019. So, if v0.11 is going to be the target release for __array_function__ support, would it be okay to commit to it being released on a relatively short timeline (within the next couple months)? I'd be glad to do what I can to help make it happen.

Otherwise, this roadmap makes sense to me!

Maybe. This will break many functions that work but aren't explicitly handled. Are people likely to submit PRs to handle these?

I would think this will break non-explicitly-handled ufuncs just like the __array_function__ implementation breaks non-explicitly-handled functions. I believe it's crucial that the behavior be consistent for the sake of downstream libraries/users (xref pydata/xarray#3238). What that consistent behavior is (fail if not implemented or strip units and raise UnitStrippedWarning), however, I could be convinced either way. As far as PRs to handle these, I think it is the same situation between numpy array ufuncs and functions, and I don't think it would be difficult for people to submit implementations where any are missing.

keewis · 2019-10-30T21:54:03Z

what is the status on this? Now that both pydata/xarray#3238 and pydata/xarray#3447 are either merged or close to being merged, the support for __array_function__ in pint is one of the major blocks to getting pint to work with xarray.

I would be interested in and/or willing to help with getting this moving again or starting a new PR that only works on __array_function__ support.

hgrecco · 2019-10-30T22:06:58Z

If all the interested parts are sattisfied with the currenbt solution, I think we should merge it. (after fixing the pending conflict)

jthielen · 2019-10-30T22:13:22Z

There are a couple reasons I think this shouldn't be merged yet:

there is some strong opposition to the Quantity class split in its current state in Split in two Quantity Classes for Scalar and Array objects #753 / Split quantity (No __array_function__) #875 by @shoyer and @crusaderky
there are some needed changes in Additional numpy __array_function__ implementations andrewgsavage/pint#6 that haven't been included here yet

Instead, I'd be in favor of taking the __array_function__ implementation here and in andrewgsavage#6 and putting in a new PR separate from the Quantity class split. However, to avoid merge conflicts, I've been waiting on #880 to be merged into master before finalizing my implementation of it and submitting that PR.

shoyer · 2019-10-30T22:33:08Z

To be clear, I am happy to advise based off my cursory understanding of pint, but don't take my advice too seriously -- I'm not the one actually implementing this! You can certainly make __array_function__ overrides work with either a single class or separate classes for scalars and arrays. Ultimately this is a decision for pint's maintainers and whoever is implementing this.

…

On Wed, Oct 30, 2019 at 3:13 PM keewis ***@***.***> wrote: if I remember correctly there was some discussion about whether the split into scalars / sequences was a good idea, so I think we should probably remove all code relating to that first (see #753 <#753> and #875 <#875>). — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#764?email_source=notifications&email_token=AAJJFVSD5ISP5SCJ6QOOE63QRIBINA5CNFSM4GXA3XUKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECV6CGY#issuecomment-548135195>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAJJFVSAPITG7ZUXUY3BTVLQRIBINANCNFSM4GXA3XUA> .

hgrecco · 2019-10-30T22:43:20Z

I just made my comments on #880 Short version, I am happy to merge it after a minor change.

jthielen · 2019-10-30T23:58:58Z

In case it is useful to anyone at this point, a branch I have with a rough implementation of __array_function__ based on #880 and what was done here is up at https://github.com/jthielen/pint/tree/nep18-1.

Current To-Do List:

Definitely address now
- Merge/modify previous changes
- Address np.power
- Address np.searchsorted (see tests for arrays with units pydata/xarray#3238 (comment))
- Verify ndarray-returning functions don't mistakenly have units attached (see Additional numpy __array_function__ implementations andrewgsavage/pint#6 (comment))
Maybe address now, maybe follow-up PR
- Mixed argument types (Split Quantity into scalar and sequence classes #764 (comment))
  - np.compress failing test
  - np.where, np.histogram (Split Quantity into scalar and sequence classes #764 (comment))
- Ensure consistent behavior between __array_function__ and __array_ufunc__ (Split Quantity into scalar and sequence classes #764 (comment))
- Wrapping non-ndarray NEP18-compatible arrays (Insuring interoperability of pint with other (non-unit) array-like types #845 / NEP18 trouble when pint is being wrapped #878)

keewis · 2019-11-06T16:58:02Z

given that #880 has been merged, how do we want to continue with this? Would it be better to wait on a decision on #753 and #875 or should we go ahead and implement support for __array_function__ and worry about compatibility with QuantityScalar / QuantitySequence afterwards should that become necessary? If implementing directly is the consent: @jthielen, do you have the time to convert your branch to a PR?

jthielen · 2019-11-06T17:00:52Z

@keewis My apologies for the delays this past week! I'd like to submit the branch I have as a PR, I just haven't had the time to complete the remaining checklist items yet. I should though have the time in the next few days (just probably not today).

keewis · 2019-11-06T17:13:56Z

I don't think you have to have completed those items when you create the PR (but good to hear you are working on this)

jthielen · 2019-11-11T08:34:23Z

A status update: unfortunately, the current implementation of this PR (with implement_func) has not been amendable to handling mixed argument types, so I am still working on re-implementing alternatives that can be more signature-aware. So, it is probably a good thing I didn't submit the PR with what I had, since much of it is still in flux. I've also been finding small errors in function behavior that I've been slowly correcting along the way as well.

Also, a couple side questions:

While I've been making do without it by writing many unique @implements implementations, it would be nice to try using inspect.signature to reduce repeated code that only differs by the signature. Would this be enough justification to drop Python 2.7 now, given that it still hasn't been dropped in master yet?
Some of the NumPy functions of interest have signatures that vary across versions, so before I get too deep into ensuring compatibility for those, what should pint target as its minimum supported NumPy version? I would think following NEP-29 would be best (so, v1.14 until January 7, 2020), but I wanted to ask before jumping to any conclusions.

keewis · 2019-11-27T22:36:16Z

@jthielen, I tried to use your branch to run the tests from xarray and ran into some issues (among others something about having "dtype" as dtype or unexpected kwargs). What do you think about opening a PR using the current state so we can discuss these and other issues? I think it doesn't matter if you decide to iterate on the code design, but it would allow others to provide feedback on the choices (which could potentially save you some effort). I'd also like to help with investigating and squashing bugs, but right now I'm not sure where to suggest changes (should I create PRs against your branch?).

Considering official python 2.7 support expires in a little more than a month, I'd say let's drop it, at least on master (but that's only my personal opinion). @hgrecco, what do you think?

jthielen · 2019-11-29T17:05:06Z

@keewis I'm sorry for the continued delays! I've opened #905 as a draft PR with my current status and some notes about what I see as remaining high-level tasks for it. I really need to be making progress on it (if there is still to be hope of having this in a Pint release by the end of the year), but I'm not sure when I'll get the time needed with all my other end-of-semester tasks...hopefully soon though.

As far as suggesting changes, in absence of a better system, I think PRs to my branch would be best for large changes, whereas suggested changes in a review to #905 would be best for small changes?

hgrecco · 2019-12-02T18:04:06Z

See #906

create quantityscalar and sequence classes

3ad6e39

andrewgsavage mentioned this pull request Feb 12, 2019

Split in two Quantity Classes for Scalar and Array objects #753

Open

hgrecco reviewed Feb 13, 2019

View reviewed changes

andrewgsavage added 13 commits February 14, 2019 20:00

add __array_function__

0d0097d

Merge branch 'master' into split_quantity

cfc5d0d

replace _Q with BaseQuantity

431d5b4

change np ufuncs to check x2 is a BaseQuantity

f62e946

revert changes in test_quantity

bbb44d6

revert some more

d76171f

some more reverts

30d542a

revert more

5b9e473

revert getattr

c21bfbd

__used -> _used

def8348

fix path without _used

65905e0

fix setitem

7fb3800

fix put

e07e237

andrewgsavage added 6 commits February 21, 2019 22:26

some array manip tests

f6373fb

remove broadcast

0aea9bc

atleast nd

077d12b

squuze

da9dcb8

uprev numpy, disable less useful versions

2727d21

downrev numpy to one that may exist

fef52dd

jthielen mentioned this pull request Sep 2, 2019

Quantity.prod() result unit is incorrect with axis or where argument #867

Closed

andrewgsavage mentioned this pull request Sep 10, 2019

Split quantity (No __array_function__) #875

Open

hgrecco mentioned this pull request Sep 12, 2019

NEP18 trouble when pint is being wrapped #878

Open

This was referenced Sep 13, 2019

Sphinx errors #881

Closed

Integration with Dask (add tests; implement the Dask collection interface on Quantity) #883

Closed

This was referenced Oct 13, 2019

[Release] 0.14 pydata/xarray#3380

Closed

setting warnings to errors changes the behaviour of np.asarray on quantities #892

Closed

jthielen mentioned this pull request Nov 11, 2019

NEP 18, physical units, uncertainties, and the scipp library? pydata/xarray#3509

Closed

jthielen mentioned this pull request Nov 29, 2019

NEP-18 Compatibility #905

Merged

bors bot closed this in 43fbae2 Dec 11, 2019

jthielen mentioned this pull request Jan 8, 2020

IndexError on assigning units to empty Numpy array (Pint 0.10) #973

Closed

znicholls mentioned this pull request Jul 2, 2020

Shall we create a QuantityArray class? #1128

Open

burnpanck mentioned this pull request Dec 14, 2023

avoid setting force_ndarray_like = True xarray-contrib/pint-xarray#216

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Split Quantity into scalar and sequence classes #764

Split Quantity into scalar and sequence classes #764

andrewgsavage commented Feb 12, 2019

andrewgsavage commented Feb 12, 2019

hgrecco commented Feb 13, 2019

hgrecco Feb 13, 2019

andrewgsavage Feb 13, 2019

hgrecco Feb 13, 2019

hgrecco commented Feb 13, 2019

andrewgsavage commented Feb 13, 2019

hgrecco commented Feb 13, 2019

andrewgsavage commented Feb 13, 2019

andrewgsavage commented Feb 20, 2019

andrewgsavage commented Sep 3, 2019 •

edited

andrewgsavage commented Sep 3, 2019

jthielen commented Sep 3, 2019 •

edited

keewis commented Oct 30, 2019 •

edited

hgrecco commented Oct 30, 2019

jthielen commented Oct 30, 2019

shoyer commented Oct 30, 2019 via email

hgrecco commented Oct 30, 2019

jthielen commented Oct 30, 2019 •

edited

keewis commented Nov 6, 2019

jthielen commented Nov 6, 2019

keewis commented Nov 6, 2019

jthielen commented Nov 11, 2019

keewis commented Nov 27, 2019 •

edited

jthielen commented Nov 29, 2019

hgrecco commented Dec 2, 2019


		def build_quantity_class(registry, force_ndarray=False):

		class Quantity(BaseQuantity):

Split Quantity into scalar and sequence classes #764

Split Quantity into scalar and sequence classes #764

Conversation

andrewgsavage commented Feb 12, 2019

andrewgsavage commented Feb 12, 2019

hgrecco commented Feb 13, 2019

hgrecco Feb 13, 2019

Choose a reason for hiding this comment

andrewgsavage Feb 13, 2019

Choose a reason for hiding this comment

hgrecco Feb 13, 2019

Choose a reason for hiding this comment

hgrecco commented Feb 13, 2019

andrewgsavage commented Feb 13, 2019

hgrecco commented Feb 13, 2019

andrewgsavage commented Feb 13, 2019

andrewgsavage commented Feb 20, 2019

andrewgsavage commented Sep 3, 2019 • edited

andrewgsavage commented Sep 3, 2019

jthielen commented Sep 3, 2019 • edited

keewis commented Oct 30, 2019 • edited

hgrecco commented Oct 30, 2019

jthielen commented Oct 30, 2019

shoyer commented Oct 30, 2019 via email

hgrecco commented Oct 30, 2019

jthielen commented Oct 30, 2019 • edited

keewis commented Nov 6, 2019

jthielen commented Nov 6, 2019

keewis commented Nov 6, 2019

jthielen commented Nov 11, 2019

keewis commented Nov 27, 2019 • edited

jthielen commented Nov 29, 2019

hgrecco commented Dec 2, 2019

andrewgsavage commented Sep 3, 2019 •

edited

jthielen commented Sep 3, 2019 •

edited

keewis commented Oct 30, 2019 •

edited

jthielen commented Oct 30, 2019 •

edited

keewis commented Nov 27, 2019 •

edited