Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't instantiate Quantity with logarithmic units. #5851

Closed
weaverba137 opened this issue Mar 1, 2017 · 19 comments
Closed

Can't instantiate Quantity with logarithmic units. #5851

weaverba137 opened this issue Mar 1, 2017 · 19 comments

Comments

@weaverba137
Copy link
Member

I'm trying to create a Quantity that represents units of wavelength in log_10(Angstrom), so I figured dex(Angstrom) would work fine. It only sort of does.

>>> foo = np.linspace(3., 5., 10) * u.Unit("dex(Angstrom)")
>>> foo
<Dex [ 3.        , 3.22222222, 3.44444444, 3.66666667, 3.88888889,
       4.11111111, 4.33333333, 4.55555556, 4.77777778, 5.        ] dex(Angstrom)>
>>> foo.physical
<Quantity [   1000.        ,   1668.1005372 ,   2782.55940221,
              4641.58883361,   7742.63682681,  12915.49665015,
             21544.34690032,  35938.13663805,  59948.42503189, 100000.        ] Angstrom>
>>> bar = Quantity(np.linspace(3., 5., 10), unit=u.Unit("dex(Angstrom)"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/weaver/Documents/local/products/anaconda/envs/py3/lib/python3.5/site-packages/astropy/units/quantity.py", line 332, in __new__
    value._set_unit(value_unit)
  File "/Users/weaver/Documents/local/products/anaconda/envs/py3/lib/python3.5/site-packages/astropy/units/quantity.py", line 722, in _set_unit
    .format(type(self).__name__, UnitBase, type(unit)))
astropy.units.core.UnitTypeError: Quantity instances require <class 'astropy.units.core.UnitBase'> units, not <class 'astropy.units.function.logarithmic.DexUnit'> instances.

So why can't DexUnit instances inherit from UnitBase? This is a pretty serious problem for me because I use dex(Angstrom) units all the time. For example, basically every SDSS spectrum file uses these units.

@pllim pllim added the units label Mar 1, 2017
@mhvk
Copy link
Contributor

mhvk commented Mar 1, 2017

@weaverba137 - before anything else, very happy to hear the logarithmic units actually get used!

The problem you encounter is actually not as much about logarithmic units not being subclasses of regular ones, but more directly about logarithmic and normal units having different rules about how they combine (e.g., normal units can be multiplied and divided together, while logarithmic ones have to be added or subtracted, so the two cannot share methods for operations). The same holds for quantities, which is why one cannot construct a Quantity using a logarithmic unit: it's methods would all do the wrong thing.

Now, the regular way to construct a logarithmic quantity would be to either use the multiplication as you did in your first example, or, if you want to construct it directly, use the appropriate class, i.e., do

u.Dex(np.linspace(3., 5., 10), u.dex(u.AA))

Is there a reason this does not work for you?

Note that, in principle, it is possible to make the initialization with Quantity work, if we give Quantity the option to return a Dex instance (it all goes through Quantity.__new__, so one does not have to return a Quantity instance). I've not done that so far, since I feared the risk of confusion outweighed the benefit. But I might be convinced otherwise. Alternatively, if you'd want something like that, for now you could simply make a function yourself like the following:

def q(value, unit=None, **kwargs):
    if unit is not None:
        unit  = u.Unit(unit)
    return getattr(unit, '_quantity_class', u.Quantity)(value, unit, **kwargs)

q(np.linspace(3., 5., 10), u.AA)
# <Quantity [ 3.        , 3.22222222, 3.44444444, 3.66666667, 3.88888889,
#             4.11111111, 4.33333333, 4.55555556, 4.77777778, 5.        ] Angstrom>
q(np.linspace(3., 5., 10), u.dex(u.AA))
# <Dex [ 3.        , 3.22222222, 3.44444444, 3.66666667, 3.88888889,
#       4.11111111, 4.33333333, 4.55555556, 4.77777778, 5.        ] dex(Angstrom)>

I've played with the idea of implementing this more generally (i.e., one would have u.q(value, unit, ...)), but wasn't sure that this would be useful overall, since it seemed that in general one better was sure whether something had a regular or logarithmic unit.

@weaverba137
Copy link
Member Author

The problem you encounter is actually not as much about logarithmic units not being subclasses of regular ones, but more directly about logarithmic and normal units having different rules about how they combine (e.g., normal units can be multiplied and divided together, while logarithmic ones have to be added or subtracted, so the two cannot share methods for operations). The same holds for quantities, which is why one cannot construct a Quantity using a logarithmic unit: it's methods would all do the wrong thing.

All right, I understand that, so can you point me to the documentation on this? I've looked through the Quantity class API, and can't find anywhere it says that

q = Quantity(value, unit=u.Unit('dex(Angstrom)')

is forbidden. Naively, one would assume that would just do the right thing. In fact that is exactly what the intermediate code that triggered this is doing.

@mhvk
Copy link
Contributor

mhvk commented Mar 1, 2017

At present, the documentation indeed does little more than point you to what would be the right way to do it, both in the getting started section and in the actual section of logarithmic units. It does indeed say little about mixing; I obviously didn't think about that when I wrote the documentation, perhaps reflecting that, generally, documentation written by developers is often not quite what someone new to the project would like to see. Of course, this also means I'm not the right person to try to change it -- PRs most welcome! (Especially if you can find a way to phrase this not just as a "warning, don't do this", which is pretty useless for documentation.)

p.s. Just in case you weren't aware, we do support for serializing to/from yaml (#5486)

@eteq
Copy link
Member

eteq commented Mar 1, 2017

@mhvk - I think I agree with the fundamental point @weaverba137 is making here. That is to say, most users I know of (myself included) have in their head that:

somenumber * unit  === u.Quantity(somenumber, unit)

I think it's counter-intuitive to violate that in some cases and not others. I didn't quite understand what you were saying is confusing about that in #5851 (comment) ? (Or am I mis-understanding and you're not saying that?)

documentation written by developers is often not quite what someone new to the project would like to see ... PRs most welcome

100% agree there!

@mhvk
Copy link
Contributor

mhvk commented Mar 1, 2017

My worry is simply about breaking the expectation (i.e., another thing that users likely have in their head) that when one uses a class to initialize an instance, the result is an instance of that class, i.e., type(cls(...)) is cls. I think this expectation is fulfilled pretty much anywhere in python, so one should have a good reason to break it.

Now it is true that in our case one would still have that isinstance(cls(...), cls) be true, since Dex, Magnitude, etc., are Quantity subclasses. Furthermore, since we subclass ndarray and thus use __new__ rather than __init__, it is not difficult in principle to return Quantity subclasses as dictated by the unit. My hesitation remains that while these are Quantity subclasses, they still behave quite differently, have different arithmetic, etc., not unlikely to break other code. I.e., I see a benefit in forcing users to think about the type of quantity they're dealing with, and not try too hard to magically make everything work. In this respect, it is worth noting that while multiplication with a logarithmic unit and a regular one works the same way, the two are not the same thing -- the logarithmic unit is really a container of a regular one, with different behaviour.

Given this, if we do want a standard way to initialize, my tendency would be to make a new factory function and update our documentation to mostly use q(value, unit) instead of Quantity(value, unit). One could then even think of, e.g., using Angle if the unit were angular.

@mhvk
Copy link
Contributor

mhvk commented Mar 1, 2017

Thinking a bit further: Right now, Quantity(Angle(...)) returns a Quantity. If we allow the constructor to produce subclasses, we would probably have to break this. But we could also go the other route, insisting that we get a Quantity out whenever possible. In that case, Quantity(Dex(...)) should perhaps just return Dex(...).physical. From the perspective of a user who does feel a clear physical difference between a logarithmic and regular unit, this may even be more logical than using a property on the class.

Just to see what works now:

w = u.Dex(1*u.AA)
w
# <Dex 0.0 dex(Angstrom)>
u.Quantity(w)
# ... TypeError
u.Quantity(w, u.AA)
# <Quantity 1.0 Angstrom>

Here, the TypeError is raised because if no unit is passed in, it is taken from the value (i.e., is dex(AA) in this case). It could be changed to be the physical unit of the value. Of course, this would not solve problems like those of @weaverba137, but might more easily lead one to find the solution.

@pllim
Copy link
Member

pllim commented Mar 2, 2017

I feel like this is the same issue I brought up some time ago, which incidentally interrupted Marten's vacation... #5178 (comment) 😅

@eteq
Copy link
Member

eteq commented Mar 2, 2017

My worry is simply about breaking the expectation (i.e., another thing that users likely have in their head) that when one uses a class to initialize an instance, the result is an instance of that class, i.e., type(cls(...)) is cls. I think this expectation is fulfilled pretty much anywhere in python, so one should have a good reason to break it.

@weaverba137, what do you think about this? (Asking persuent to @mhvk's point that the non-expert on Quantity is best posed to provide context of "user expectations")

@weaverba137
Copy link
Member Author

Well, to be honest, I think that is "Python developer"-oriented thinking, rather than "End user"-oriented thinking. I am both, and I can see the viewpoint of both, but I think that the majority of users would tend to want Quantity to just do the right thing, rather than have class/type purity.

If you want class/type purity and want to just do the right thing, you could replace Quantity with a factory function that would return a pure Quantity for regular units, and some other class for logarithmic units.

@mhvk
Copy link
Contributor

mhvk commented Mar 2, 2017

I don't think "the right thing" is all that obvious, but it seems we're converging on that it would be nice to have a factory function that, e.g., avoids the copy that one gets by just multiplying a value with a unit. I note that, including changes to the documentation, this is not that minor an endeavour. It may be best to raise a separate issue for it where we can collect a list of things this function should do and what the ramiifcations are for it, so that someone who is interested could tackle it.

@mhvk
Copy link
Contributor

mhvk commented Mar 2, 2017

Another thought: in context of the zen of python -- There should be one-- and preferably only one --obvious way to do it -- yet another suggestion is simply to state that multiplication with a unit is the right way to instantiate a Quantity (with perhaps still that long-promised context manager that allows one to avoid copies).

@eteq
Copy link
Member

eteq commented Mar 3, 2017

@mhvk - hmm, I see the argument there... But that doesn't work for the use case of "I have a thing that I want to convert into a Quantity but I'm not sure what it is" use case. But maybe that already is a "developer" case?

So along those lines... @weaverba137, what was your need of the Quantity constructor that led to this in the first place?

@weaverba137
Copy link
Member Author

Good question. I first encountered this when trying to plot SDSS data in specviz. SDSS almost always stores wavelength values in log(Angstrom), so it seemed like telling specviz that the units were dex(Angstrom) would just work. Until a very recent bug fix, specviz was directly instantiating the wavelength array using Quantity(wavelength, unit=wavelength_units) and wavelength_units = dex(Angstrom) broke that.

@eteq
Copy link
Member

eteq commented Mar 9, 2017

Ah, ok, and I'm pretty sure the specviz use case is indeed the "I have a thing that I want to convert into a Quantity but I'm not sure what it is" use case... so that's a pretty legit concern then.

So it might be that the factory function should be something explicitly tailored to that sort of as a developer tool - e.g. create_quantity_from_any(...), but tell users that the typical case should just be val * unit? Then it's not violating the maxim @mhvk rightly pointed out, because it's two ways to do two different things?

@pllim
Copy link
Member

pllim commented Mar 9, 2017

@eteq , it wouldn't be so hard for SpecViz to refactor a little to just use the * in their parser, right?

@eteq
Copy link
Member

eteq commented Mar 9, 2017

@pllim - I think it's a subtly different use case, because it's a case of "if it's already a quantity in that unit, it's OK as-is, but if it's an array, assign it the units of unit", I think. And once you roll in the possibility of various other numpy arguments like ensuring a certain dimension, it becomes necessary to have more than two arguments (which is, of course, the limitation of value * unit).

But maybe @nmearl can comment more on that use case?

@mhvk
Copy link
Contributor

mhvk commented Apr 1, 2017

@weaverba137 - somehow I only remembered now that Quantity has the ability to deliver subclasses, if one uses subok=True. Now, in current master this does not work quite the way you would want it, but there doesn't seem to be a particular reason not to extend that usage to your case. So... see #5928.

@weaverba137
Copy link
Member Author

This seems reasonable. I'll take a look at the PR.

@mhvk
Copy link
Contributor

mhvk commented Apr 19, 2017

a solution was merged with #5928

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants