Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Update `units` docstrings and doc pages to reflect new None and UnrecognizedUnit stuff #419

Merged
merged 2 commits into from Nov 13, 2012

Conversation

Projects
None yet
6 participants
Member

mdboom commented Oct 12, 2012

The docstrings and doc pages for the units package needs to be updated to reflect the changes made in syntax from #397 et al.

@ghost ghost assigned mdboom Oct 10, 2012

Owner

eteq commented Oct 14, 2012

@mdboom - can you also update the documentation itself? I think it's important that it be very obviously (and perhaps multiple times) emphasized that UnrecognizedUnit, Unit("") and None have subtly different meanings. (Or let me know if you don't have time, and I can try to take a whack at it.)

@demitri demitri referenced this pull request in astropy/astropy-api Oct 15, 2012

Merged

Add quantity API #2

Contributor

demitri commented Oct 15, 2012

Replying to astropy/astropy-api#2 (comment)

@astrofrog Sorry, I don't see that at all. No one would ever write

5 '' * 5 m

, so we shouldn't make people code that. Again, the FITS standard has no relevance on how Unit is implemented, but the empty string is used as such since they didn't bother to define a value that represents no unit. I feel that Unit("") isn't clear to someone who isn't intimately familiar with these classes (i.e. a greener coders).

Can someone identify a need for a Unit that has no unit? In NDData, there is a unit property, so it can be set to null in the event there is no unit. That's logically and code-consistent. A Quantity object that has no unit is not a Quantity by definition; it's a scalar value. I can't define an Angle with no value.

Contributor

demitri commented Oct 15, 2012

@mdboom I noticed in this pull request there are two functions get_equivalent_units and print_equivalent_units.

  • get_equivalent_units : I propose this be a method on Unit, not a standalone function. I don't see a reason to have

    get_equivalent_units(u.m)

over

u.m.equivalent_units

The latter is more OO. (Also, the "get_" should be dropped.)

  • print_equivalent_units : Similarly, there is no need for this function at all. The method above returns a set (though I propose the last line (1068) should be)

    return list(equivs)

If you want to print it out, use

print("\n".join(u.m.equivalent_units)

or

print(" ".join(u.m.equivalent_units)

The docs say the point of the function is to print to the console, but if you're on the console you can simply type:

>>> u.m.equivalent_units
Member

adrn commented Oct 15, 2012

@demitri @astrofrog Hm, actually you could have a quantity without a unit -- what distinguishes it from a scalar is that it may have some uncertainty (for example, fractional mass). I have to agree though that unit=None reads as english, whereas unit="" is a bit ambiguous.

Contributor

demitri commented Oct 15, 2012

@adrn That's a measurement (or Measurement :), not a Quantity.

Member

mdboom commented Oct 15, 2012

This discussion is helpful as to how the proposed documentation section is written.

As I see it, a unit of 'None' reads as "a unit was not provided" or "undefined". A dimensionless unit is something else: it is for quantities that are defined to be unitless -- for example the result of the question "how many meters in a lightyear"? In that case, we are certain that has a unit, but that unit is dimensionless.

Also as an implementation detail, a dimensionless unit is also required for arithmetic operations to work. Think of the question "how long does it take to travel 100 kilometers at 30 kilometers an hour?" You end up with the unit expression (km / (km / h). Unless the order of operations is carefully done, you end up with a temporary of (km / km) (i.e. dimensionless). This should not be invalid or raise an exception.

I agree that it might be better to have some token indicate the dimensionless unit, that is not how the FITS standard defines it. The FITS standard has a great deal of bearing on the behavior of the unit framework. One of its important design virtues is that is comforms and supports to an existing standard that is well-used (for better or worse) in astronomy. Inventing our own language will only hurt interoperability.

Member

mdboom commented Oct 15, 2012

As for get_equivalent_units and print_equivalent_units: @demitri: I'm sure you're not saying that

In [2]: u.get_equivalent_units(u.m)
Out[2]: 
set([Unit("pc"),
     Unit("inch"),
     Unit("AU"),
     Unit("m"),
     Unit("solRad"),
     Unit("mi"),
     Unit("yd"),
     Unit("micron"),
     Unit("ft"),
     Unit("Angstrom"),
     Unit("lyr")])

is preferable to:

In [3]: u.print_equivalent_units(u.m)
Primary name | Unit definition | Aliases     
AU           | 1.495979e+11 m  | au          
Angstrom     | 1.000000e-10 m  | AA, angstrom
ft           | 3.048000e-01 m  | foot        
inch         | 2.540000e-02 m  |             
lyr          | 9.460730e+15 m  |             
m            | irreducible     | meter       
mi           | 1.609344e+03 m  | mile        
micron       | 1.000000e-06 m  |             
pc           | 3.085678e+16 m  | parsec      
solRad       | 6.955080e+08 m  |             
yd           | 9.144000e-01 m  | yard        

There is precedence in the stdlib (e.g. the traceback module) and within astropy (e.g. the table module) for convenience functions for printing things in a more convenient way at the console.

If we want to roll this into one function, this could be handled with the "trick" of returning a class that inherits from list and overrides __repr__. That would be acceptable, but I generally don't care much for "tricks" like this except where absolutely necessary.

Owner

astrofrog commented Oct 15, 2012

@demitri - I agree you'll never write 5 '' but I was just trying to make the point, as @mdboom did more clearly after, that we're talking about dimensionless quantities, not values with no units. One could almost argue that Unit('') looks more like a dimensionless unit, whereas Unit(None) is ambiguous... Another way to look at this is that the string representation of a Unit should be able to initialize Unit, and str(Unit("km") / Unit("km")) is '' so '' should initialize dimensionless units. In any case, I'm not saying Unit('') should be the only way, but I don't think we should be debating about that one, since it's actually the only one that's not ambiguous.

Member

adrn commented Oct 15, 2012

@astrofrog I agree +1

Owner

eteq commented Oct 15, 2012

I agree with @astrofrog that Unit(None) is ambiguous. That's one of the points of this issue, to clear up that confusion in the docs, and make it apparent that None by itself means no unit, and Unit(None) is invalid because it's confusing whether or not you meant to say Unit(None) or just None. I also agree with @mdboom that we do need to continue supporting FITS units, as much as it pains me to say so. @mdboom, does VOTable have a similar rule that the empty string means dimensionless? I couldn't figure out for sure after looking at the spec...

That said, I also see @demitri's point that Unit("") is also a bit ambiguous, as it could also be interpreted as "no unit" (e.g., "empty unit"). Unit(1), as @wkerzendorf suggested, seems less ambiguous to me. I see it is currently already supported - i.e., Unit("") == Unit(1) in the currently existing code. Perhaps we should agree that all examples and internal Astropy code should use Unit(1) instead of Unit("")? If so, perhaps repr(Unit("")) should give 'Unit(1)'instead of 'Unit("")' as it does now?

Contributor

demitri commented Oct 15, 2012

Look, this whole thing is being made far more complicated than it need be. Modern code doesn't do things like "a blank string means x" or "-999 means undefined" or "1 means nothing". What we want is a dimensionless Unit, so make one:

class UnitDimensionless(Unit):
    ...singleton object...

It's a special object, we recognize and use it as such. If you really want this:

u1 = Unit("") # returns u.UnitDimensionless object

then fine, but there's so much functionality being shoehorned into the same field. That's why there's so much ambiguity / lack of a clear solution here.

Since it's a subclass, it can be used everywhere a Unit can be.

Member

mdboom commented Oct 15, 2012

There's simply no need for another subclass. A CompositeUnit contains a list of bases and powers. If those lists are empty, it's a dimensionless unit by definition. I see no need to add additional classes for things that are already handled naturally by the existing design. It's like doing this:

   class Zero(int):
      pass

The dimensionless unit is a value, not a type.

Member

embray commented Oct 15, 2012

@mdboom +1

Contributor

demitri commented Oct 15, 2012

"zero" (a value) lives on the continuum of ints. "undefined" or "null" (a lack of a value) does not. I don't see it as the same thing. You must be Roman. :)

But regardless, Unit(1) makes no sense (reading it) to me. Unit("") isn't clear either.

Owner

eteq commented Oct 15, 2012

@demitri - your example illustrates exactly the point - "zero" is indeed an int... In the same way that a dimensionless unit is in the set of all possible units. After all, it's equivalent to e.g. "meters per meter", which is another way of phrasing what @mdboom said. In general, people expect their common concepts to have closure (in the mathematical sense), and that's all that we're invoking here. Thus, to me, Unit(1) makes perfect sense. I agree with @mdboom that adding a subclass seems unnecessary and possibly confusing because it implies that dimensionless is somehow "beyond" units.

That said, it probably does make sense to put dimensionless = Unit(1) in the units package so that people can do Quantity(1/137., u.dimensionless) in their code. And maybe even change the __str__ and/or __repr__ to explicitly yield "Unit(dimensionless)" or something like that?

The dimensionless unit is distinct from having no unit, though. That's exactly why the None unit was replaced by just None. It is perfectly reasonable to define an array of raw numbers without a unit. There are entire sub-fields of mathematics that deal with numbers completely independent of whether they have units. Perhaps units will later be attached, but the operations on those objects are still valid even independent of the pysical meaning of the values (a key insight of group theory!). There's no reason to ignore those applications.

Member

mdboom commented Oct 15, 2012

@eteq wrote:

That said, it probably does make sense to put dimensionless = Unit(1) in the units package so that people can do Quantity(1/137., u.dimensionless) in their code. And maybe even change the __str__ and/or __repr__ to explicitly yield "Unit(dimensionless)" or something like that?

That is a fantastic idea, IMHO.

Owner

astrofrog commented Oct 15, 2012

Contributor

demitri commented Oct 15, 2012

I'm happy with u.dimensionless since it's clear and has no abstraction (which was my real complaint). I advocate for using u.dimensionless wherever possible in tutorials and examples rather than Unit(""). That form is perfectly fine to support in the internals of whatever (e.g. FITS table reading), but I'd like to encourage the use of the singleton everywhere else.

Owner

eteq commented Oct 15, 2012

@demitri - yeah, I definitely agree that using u.dimensionless for example code is better than any of the Unit(1) or Unit("") or whatever cases.

Member

mdboom commented Oct 24, 2012

FWIW: I think this is good to go now and seems to address everyone's comments above.

Owner

eteq commented Oct 25, 2012

Oh, there's now a conflict (probably from my merging #431)

Owner

eteq commented Oct 25, 2012

In the new documentation section on the dimensionless unit, I think it might be wise to specifically mention that "dimensionless" is distinct from "no unit", and that we recommend the latter should be represented using None.

Oh, and perhaps also add an extra line in the "getting started" examples along the lines of:

u.m/u.m
>>> Unit(dimensionless)

Other than that, looks good to me.

Contributor

demitri commented Oct 25, 2012

The docs say

~astropy.units.equivalencies.sp is a function that returns an equivalency list to handle conversions

But what does "sp" mean or stand for? I assume it stands for something; it should be spelled out so that people at least have a mnemonic. Maybe the whole function name should be spelled out. I don't know if passing a function like that is the most natural way to do things - it's certainly not intuitive to me. I think adding a third unlabelled parameter to this

u.nm.to(u.Hz, [1000, 2000], u.sp())

is losing the Python-ness - it's not at all clear without reading the docs what's going on. At the very least make it something like this:

u.nm.to(u.Hz, [1000, 2000], equiv=u.sp())

or

u.nm.to(u.Hz, [1000, 2000], unit_equiv=u.sp())

Can that term be a lambda function? That would make a good example.

And I still find the unit-unit-value non-intuitive as well. It's like thinking in RPN. We think of conversions in terms of value-unit-unit, e.g. "12 inches to centimeters", not "inches, centimeters, 12". This is where the Quantity class came in:

(12*u.in).to(u.cm)

is more like how we think. While the code supports both, I think the docs should emphasize this form first and foremost, and mention afterwards the unit-unit-value format. If you just want the conversion value, it's clear that

(1*u.in).to(u.cm)

naturally becomes

u.in.to(u.cm)

In this example:

>>> liters_water = [
   (u.l, u.g, lambda x: 1000.0 * x, lambda x: x / 1000.0)
]

please explain in more detail what's going on here. You're building a list, but of one object, a tuple... I don't know what you're doing here. And also please change:

For example, in an old definition of the metric liter, it was defined as the volume of 1000 grams of water at 4∘C, a standard pressure and a bunch of other assumptions.

to

For example, until 1964 the metric liter was defined as the volume of 1kg of water at 4°C at 760mm mercury pressure.

I might also suggest choosing a better example than a half-century old definition of the liter.

Contributor

demitri commented Oct 25, 2012

>>> u.m.is_equivalent(u.foot)
True

Well, this is not true, is it? :) I propose changing this to

>>> u.m.convertible_to(u.foot)
True
>>> u.m.convertible_to("second")
False

Note that the language that you yourself use in the documentation is "convert" and "incompatible"; you don't use the word "equivalent" at all when describing it.

I'd expect

>>> u.m.is_equivalent(u.km/1000)
True

but then, we use "=" for equivalency.

Contributor

demitri commented Oct 25, 2012

Under "Obtaining a Conversion Function", I think this can be made more clear. speed_unit is ambiguous since both cm/s and m/h are both "speed units". It seems more natural to do something like this instead:

speed_conversion = u.converter(from=(u.cm/u.s), to=(u.m/u.hour))

And this is better named

cms_to_mph = u.converter(from=(u.cm/u.s), to=(u.m/u.hour))
cms_to_mph(100)
cms_to_mph([1000,2000])

This makes it more clear as to what's happening.

But honestly, this example doesn't demonstrate why this is more useful than

([1000, 2000]*u.cm/u.s).to(u.m/u.hour)

If it's something that you intend to be reusable, then explicitly say that. You mention "for example when there are equivalencies in use", but don't actually provide an example where I think that would be illustrative.

Contributor

demitri commented Oct 25, 2012

Why not make these

print fluxunit.to_string('console')
print u.Ry.decompose().to_string('unicode')

functions?

print fluxunit.console()
print u.Ry.decompose().unicode()
Contributor

demitri commented Oct 25, 2012

u.Unit("m/s/s")

I would argue that this is unambiguous, and so should not raise an error. The fact that the FITS format doesn't allow it doesn't mean that we should be constrained by that. I'd propose:

u.Unit("m/s/s", format="fits")

if you want to be strict to a particular format. You mention three different ones in the docs, so it makes sense that you can choose a particular one. Who's to say that there may not be incompatible interpretations of strings between these formats and future ones? Then when you specify a certain format, if there is a problem you raise a specific exception. I think the raise, warn, silent options are still odd - the Pythonic way is to throw an exception. The user (i.e. developer) has the option to either handle it, issue a warning, or pass. How Units are read by a FITS package is not a concern of Unit.

The "generic" format should be what astropy handles (default), not mapped strictly to FITS (which is almost but already isn't).

Member

mdboom commented Oct 25, 2012

@eteq wrote:

In the new documentation section on the dimensionless unit, I think it might be wise to specifically mention that "dimensionless" is distinct from "no unit", and that we recommend the latter should be represented using None.

Oh, and perhaps also add an extra line in the "getting started" examples along the lines of:

   u.m/u.m
   >>> Unit(dimensionless)

Other than that, looks good to me.

Ok. That all makes sense.

@demitri said:

But what does "sp" mean or stand for?

See #412

Can we do "u.nm.to(u.Hz, [1000, 2000], equivs=u.sp())"

Yes -- the kwarg in the examples can be made explicit. Note it's pluralized for a reason, too. We could also show equivs=u.sp() + u.sd().

Can that term be a lambda function?

No. The kwarg requires a set of mappings, not a function. There are builtin functions to generate those mappings, one of which takes an argument, which is why the equivalency generators are functions, not constant values.

And I still find the unit-unit-value non-intuitive as well. It's like thinking in RPN. We think of conversions in terms of value-unit-unit, e.g. "12 inches to centimeters", not "inches, centimeters, 12". This is where the Quantity class came in.

That should be part of #445. We don't even have a Quantity class finished yet.

In [the liter to gram example] please explain in more detail what's going on here. You're building a list, but of one object, a tuple... I don't know what you're doing here.

It's explained in the paragraphs above it. Please elaborate on why it's unclear.

I might also suggest choosing a better example than a half-century old definition of the liter.

Please suggest an alternative. Wavelength would have been another good choice, but it's built-in, and providing a long-form (custom) example runs the risk of suggesting users do it that way all the time. I chose an example that I thought was well-known from high school physics classes.

Note that the language that you yourself use in the documentation is "convert" and "incompatible"; you don't use the word "equivalent" at all when describing it.

I'd like some consensus around this. This language was borrowed from pynbody, where it's been there a long time, and has passed many rounds of review here.

speed_unit is ambiguous since both cm/s and m/h are both "speed units".

Yes -- renaming those variables makes sense.

It seems more natural to do something like this instead: speed_conversion = u.converter(from=(u.cm/u.s), to=(u.m/u.hour))

It's only more natural if one expects to find this functionality in a function rather than a method. I'm not sure that's the case given that everything else is a method -- including things you've asked to be converted to methods. Also, there's no way to force the use of kwargs in Python, so naming them doesn't actually force improved readability.

You mention "for example when there are equivalencies in use", but don't actually provide an example where I think that would be illustrative.

Sure. An explicit example of this can be added.

Why not make these fluxunit.to_string('console') functions.

Sure. They could be -- though I would suggest having them be autogenerated since the available set of output formatters is pluggable.

u.Unit("m/s/s") I would argue that this is unambiguous, and so should not raise an error.

This isn't from the docs, is it? I can't find it by grepping. Just do u.m/u.s/u.s. Python defines a left-to-right ordering of equivalent-level operations. The FITS unit standard does not and explicitly disallows this.

We need to have a standard and well-defined format for "units as strings". The most common one in use is FITS so an extended version of that is what we use as the default, though, as you point out, it has some limitations. It makes no sense to go down the road of inventing our own unit language. I suggest you read the proposed IVOA standard http://www.ivoa.net/Documents/VOUnits/ for background on this.

@demitri: Can you in future avoid hijacking pull request discussions with unrelated items? The github discussion form is not threaded, so it is difficult to discuss more than 2 or 3 things at a time, and it also only serves to hold back progress by pushing PRs to be about many more things than they were intended. To the extent that any of these items are not related to the header of the PR, they should in most cases be created as new issues.

Owner

eteq commented Nov 1, 2012

@mdboom @demitri - Am I correct in saying that all the issues specific to this PR (narrowly defined as just dealing with the dimensionless/None unit clarifications) have been addressed? It looks fine to me.

If so we should merge this and other specific issues should get their own PR or issue.

Contributor

demitri commented Nov 1, 2012

@eteq Sure. I'll move my comments (on my list of things to do) elsewhere. I didn't mean to "hijack" this thread; I was just commenting on what I was seeing on the doc pages. There are too many threads going for me to keep track of.

Owner

eteq commented Nov 1, 2012

Ok, cool. @mdboom, if you're done making changes, feel free to merge this - it looks great to me.

Owner

eteq commented Nov 13, 2012

@mdboom, is this all set to go?

Member

mdboom commented Nov 13, 2012

Yes, I believe so.

Owner

eteq commented Nov 13, 2012

Alright, merging. Thanks @mdboom !

eteq added a commit that referenced this pull request Nov 13, 2012

Merge pull request #419 from mdboom/unit/docstring_fixes
Update `units` docstrings and doc pages to reflect new None and UnrecognizedUnit stuff

@eteq eteq merged commit edcee79 into astropy:master Nov 13, 2012

1 check passed

default The Travis build passed
Details

@mdboom mdboom deleted the mdboom:unit/docstring_fixes branch May 21, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment