numpy.spacing documentation inaccuracies #15331

mdickinson · 2020-01-15T09:49:51Z

The current numpy.spacing documentation seems inaccurate. At the top, it says:

Return the distance between x and the nearest adjacent number.

But this isn't right for powers of 2: for example, if x = 1.0, the nearest representable float to x is 1.0 - 2.0**-53, so the distance would be 2.0**-53. But in this case, spacing gives 2.0**-52.

It's also not immediately clear from the description what the expected sign of the result is. From experimentation, it looks as though np.spacing(-x) is -np.spacing(x), except in the case of zeros, where the result is the same for both negative and positive zeros.
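A small sketch of the two observations above, assuming IEEE 754 binary64 and the NumPy behaviour described in this report (the variable names are only illustrative):

```python
import numpy as np

x = 1.0

# Gap to the neighbouring float on each side of 1.0.
gap_toward_zero = x - np.nextafter(x, 0.0)
gap_away_from_zero = np.nextafter(x, np.inf) - x

# The nearest adjacent float is the one below, at distance 2.0**-53 ...
print(gap_toward_zero == 2.0**-53)
# ... but np.spacing reports the larger gap, 2.0**-52.
print(np.spacing(x) == gap_away_from_zero == 2.0**-52)

# Sign behaviour: the result carries the sign of the input,
# except that both zeros give the same result.
spacing_neg = np.spacing(-1.0)
spacing_pos_zero = np.spacing(0.0)
spacing_neg_zero = np.spacing(-0.0)
print(spacing_neg == -np.spacing(1.0))
print(spacing_pos_zero == spacing_neg_zero)
```

At a power of 2 the gap on the lower side is half the gap on the upper side, which is why "nearest adjacent number" and the value returned by spacing disagree there.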


mdickinson · 2020-01-15T09:56:51Z

One more corner case: I'm not sure whether this counts as a documentation issue or an implementation issue (or both, or neither):

>>> np.spacing(np.finfo(np.float64).max)
inf

It could be argued that the right result here is the distance between that max value and the next float down. (That distance being 2.0**971, for np.float64, assuming IEEE 754 binary64.)
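To make the two candidate answers concrete, here is a minimal sketch, again assuming IEEE 754 binary64. The overflow warning is silenced with np.errstate only to keep the output clean:

```python
import numpy as np

fmax = np.finfo(np.float64).max

# Distance from the largest finite float to the next float down.
gap_below = fmax - np.nextafter(fmax, 0.0)

# np.spacing looks toward the next-larger magnitude, which overflows to inf.
with np.errstate(all="ignore"):
    gap_reported = np.spacing(fmax)

print(gap_below == 2.0**971)
print(np.isinf(gap_reported))
```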

seberg · 2020-01-15T14:55:01Z

So I guess it actually is the distance to the nearest "larger" floating point number (by absolute value)... I agree that "distance" should really be an absolute value. I have never used this function, so I am not sure whether there is some use in it, in that a + np.spacing(a) gives something more useful than if it returned the actual distance.

I think the inf example follows from the other examples, and is just a slightly more extreme case. As long as spacing is the larger distance between both neighbours (so to speak), the inf seems consistent to me (plus it does give a warning for me).

The function is implemented as a bunch of bit-fiddling, could be easily adapted to return the absolute value, but for backward compatibility concerns.... The internal function could also be changed to return your value (it has a flag whether to return the larger or the smaller spacing and it is set to the larger spacing), but I would think the larger is just as well?

mdickinson · 2020-01-15T18:22:41Z

On the sign, having the sign of the result match the sign of the input seems a perfectly reasonable choice; I think it's potentially valuable that a + np.spacing(a) gives the next-away-from-zero value from a. I'm not suggesting any behaviour change here - even without backwards compatibility concerns it doesn't seem a clear choice either way (and as you say, with backwards compatibility concerns, it is clear that this shouldn't be changed).

The -0.0 corner case is a bit surprising, in that it's the only case where the sign of the result doesn't match the sign of the input. OTOH, there's an argument for having -0.0 and 0.0 behave identically in almost all numeric contexts, with a few well-known exceptions. Both behaviours seem reasonable, but again it would be good to document.

For the np.finfo(np.float64).max case, I guess I find this surprising because I don't think of the output as being a difference so much as being the value of the least significant place in the given float. OTOH, the "spacing" name clearly indicates that this should be thought of as a difference.

But I agree that all the current choices seem reasonable, and that this is really just a documentation issue.

For context, I was looking at this mostly because CPython just implemented math.ulp, with a similar purpose but a slightly different set of choices: https://bugs.python.org/issue39310.

eric-wieser · 2020-01-15T21:03:14Z

I think it's potentially valuable that a + np.spacing(a) gives the next-away-from-zero value from a.

Well, you can spell that as np.nextafter(a, np.inf), so I'm not certain that provides all that much value.

mattip · 2020-01-16T00:18:51Z

Are the new math.nextafter and math.ulp going to cause us a whole new set of edge-case incompatibilities with the numpy definitions?

mdickinson · 2020-01-16T18:38:45Z

@mattip I hope not. math.nextafter has identical semantics to numpy.nextafter (not really surprisingly, since they're both thin wrappers around C's nextafter).

I'd hope that math.ulp and numpy.spacing are sufficiently differently named that people won't assume without checking that they do the same thing. But even if they do, the most common use-case is presumably positive finite floats, and there the two agree (with the exception of the largest finite positive float).
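A quick comparison of the two functions, as a sketch that assumes Python 3.9+ (where math.ulp is available) and IEEE 754 binary64. The handling of negative inputs shown for math.ulp follows its documentation, which returns ulp(-x) for negative x:

```python
import math
import sys

import numpy as np

fmax = sys.float_info.max

# Positive finite floats: the two functions agree.
agree_at_one = math.ulp(1.0) == np.spacing(1.0)

# Largest finite float: math.ulp stays finite, np.spacing overflows.
ulp_at_max = math.ulp(fmax)
with np.errstate(all="ignore"):
    spacing_at_max = np.spacing(fmax)

# Negative inputs: math.ulp is always non-negative, np.spacing keeps the sign.
ulp_neg = math.ulp(-1.0)
spacing_neg = np.spacing(-1.0)

print(agree_at_one, ulp_at_max == 2.0**971, np.isinf(spacing_at_max))
print(ulp_neg > 0, spacing_neg < 0)
```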

@eric-wieser I guess my point was that you can't spell that as np.nextafter(a, np.inf): for negative a you need np.nextafter(a, -np.inf) instead, so to cover both cases you'd want something like np.nextafter(a, np.copysign(np.inf, a)). As I commented on the CPython issue for nextafter, what I commonly seem to need is next_away_from_zero(a), and it's a minor nice-to-have to be able to spell that as simply as a + np.spacing(a).
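The spellings discussed above can be compared directly. This is only a sketch: next_away_from_zero is a hypothetical helper written for illustration, not an existing NumPy function, and the sample values are nonzero and finite because the zero and max cases behave differently, as noted earlier in the thread:

```python
import numpy as np


def next_away_from_zero(a):
    """Hypothetical helper: next representable float after a, away from zero."""
    a = np.asarray(a, dtype=np.float64)
    return np.nextafter(a, np.copysign(np.inf, a))


a = np.array([1.0, -1.0, 0.5, -3.0, 1e-300])

via_nextafter = next_away_from_zero(a)
via_spacing = a + np.spacing(a)

# Always stepping toward +inf moves negative values toward zero instead.
toward_pos_inf = np.nextafter(a, np.inf)

print(np.array_equal(via_nextafter, via_spacing))
print(np.array_equal(toward_pos_inf, via_spacing))
```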

seberg · 2020-01-16T19:38:35Z

Well, right now we still have a chance to change Python. E.g. if we think that our definition for the largest representable float is more reasonable (inf rather than the smaller spacing), python can still change it.
The name spacing is actually pretty nice, and I wonder if python thought of it ;), so something that is not ulp but can be guessed to mean ulp, OTOH, python choosing a different name is better for us ;).

I guess, we could at some point add np.ulp and once we do that discourage np.spacing, in either case, I will make a note on the bpo to link here, I think they should at least be aware...

seberg · 2020-01-16T19:43:38Z

Sorry, nvm. I see you already commented about the existence of np.spacing. So the only question would be if someone here disagrees with the choice there to say that ulp(float_max) != inf, which seems strange, but the special case also seems a bit strange to me.

cournape · 2020-03-23T05:19:59Z

So just for context, IIRC, I initially implemented those functions to implement some test functions in np.testing, that was the main use case.

mrmbernardi · 2023-11-16T13:27:39Z

As of writing this comment the documentation is still wrong about the nearest adjacent number:

https://numpy.org/devdocs/reference/generated/numpy.spacing.html

miccoli · 2023-12-02T18:32:53Z

I was recently bitten by this inaccuracy, falsely assuming that np.spacing would return a positive value, according to the common meaning of distance.

If nobody else is working on this, I can open a PR:

DOC
- explicitly state that np.spacing returns an oriented distance pointing away from zero
- define edge cases based on current behaviour
- state differences with math.ulp
testing: check that what is stated in the docs is true (if not already checked).

A maybe useful invariant for visualizing current behaviour could be

np.all(np.abs(np.spacing(a) + a) > np.abs(a))

which holds for all finite a.
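A minimal check of this invariant over a handful of finite edge cases could look like the following. The sample values are chosen for illustration and are not taken from the NumPy test suite; warnings are silenced because the largest finite values overflow to inf:

```python
import numpy as np

info = np.finfo(np.float64)

# Zeros, ordinary values, the smallest normal and subnormal, and the extremes.
a = np.array([
    0.0, -0.0,
    1.0, -1.0,
    info.tiny, -info.tiny,
    5e-324, -5e-324,
    info.max, -info.max,
])

with np.errstate(all="ignore"):
    holds = np.all(np.abs(np.spacing(a) + a) > np.abs(a))

print(holds)
```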

seberg added the 04 - Documentation label Jan 15, 2020
