Possible bugs with NaN? #23

Closed
ghost opened this issue Mar 15, 2020 · 7 comments
Labels
invalid This doesn't seem right

Comments

@ghost

ghost commented Mar 15, 2020

I am also writing an interval library for C#, and while searching for design inspiration I came across portion. So I tried it out (I have never used Python before) and got this: math.nan in (-inf, inf) == true. Is that a bug? It doesn't seem to be the case for (-inf, 0) or (0, inf).

Also, (math.nan, 0).empty == false. This definitely seems like a bug.

@AlexandreDecan
Owner

AlexandreDecan commented Mar 15, 2020

Hello,

This is not a bug, since math.nan cannot be meaningfully compared to other values, by definition:

>>> import math
>>> math.nan < 0
False
>>> math.nan > 0
False
>>> math.nan == 0
False

There is nothing we can do about this, since math.nan defines/overrides <, >, etc., which prevents libraries from defining appropriate (or meaningful) behaviour or raising an appropriate error.
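
To see why the result can differ between (-inf, inf) and (-inf, 0), here is a minimal sketch. It assumes the infinities are sentinel objects that define their own comparisons; the classes `_PosInf` and `_NegInf` and the function `in_open` are hypothetical and are not taken from portion's code.

```python
import math


class _PosInf:
    """Hypothetical sentinel: greater than every value except itself."""

    def __gt__(self, other):
        return not isinstance(other, _PosInf)

    def __lt__(self, other):
        return False


class _NegInf:
    """Hypothetical sentinel: smaller than every value except itself."""

    def __lt__(self, other):
        return not isinstance(other, _NegInf)

    def __gt__(self, other):
        return False


POS_INF, NEG_INF = _PosInf(), _NegInf()


def in_open(x, lower, upper):
    """Naive containment test for the open interval (lower, upper)."""
    return lower < x and x < upper


# float cannot compare itself to the sentinels, so Python falls back to the
# sentinels' reflected methods, which answer True even for NaN.
print(in_open(math.nan, NEG_INF, POS_INF))  # True
print(in_open(math.nan, NEG_INF, 0))        # False, because nan < 0 is False
print(in_open(math.nan, 0, POS_INF))        # False, because 0 < nan is False
```

With numeric bounds, NaN's own comparisons decide the outcome and always return False; with sentinel bounds, the sentinel's comparison decides it instead.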

@AlexandreDecan AlexandreDecan added the invalid This doesn't seem right label Mar 15, 2020
@ghost
Author

ghost commented Mar 15, 2020

But it should be detectable with math.isnan, shouldn't it? This would allow turning any atomic interval initialized with NaN into an empty one, which would in turn enable proper containment checking (NaN could also be checked for explicitly during containment, of course).
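
A minimal sketch of that suggestion, using only the standard library. The helpers `is_nan` and `normalize_bounds` are hypothetical names, and returning `None` for an empty interval is an assumption made for illustration; none of this is portion's API.

```python
import math


def is_nan(value):
    """Return True only when math.isnan accepts the value and reports NaN."""
    try:
        return math.isnan(value)
    except (TypeError, ValueError, OverflowError):
        # Bounds that cannot be converted to float (strings, dates, huge
        # integers, custom objects) are treated as "not NaN".
        return False


def normalize_bounds(lower, upper):
    """Return (lower, upper), or None (standing for 'empty') if a bound is NaN."""
    if is_nan(lower) or is_nan(upper):
        return None
    return (lower, upper)


print(normalize_bounds(math.nan, 0))  # None
print(normalize_bounds(0, 1))         # (0, 1)
print(normalize_bounds("a", "b"))     # ('a', 'b')
```

The try/except is needed because bounds can be arbitrary objects, and math.isnan raises TypeError for anything it cannot convert to a float.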

@AlexandreDecan
Owner

AlexandreDecan commented Mar 15, 2020

Of course, I could check whether one of the bounds is a math.nan object, but in that case, I'd have to do the same for numpy.nan, pandas.NA, etc. Since I cannot be exhaustive (because I'm not aware of all the classes that override comparisons in a non-meaningful way), I prefer not to try to detect those cases, following Python philosophy (especially given that it adds a slight overhead, and Python is... erm... not the fastest :-D).
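
The difficulty of being exhaustive can be illustrated with a small sketch. `Unordered` is a hypothetical class invented for this example: its ordering comparisons always return False, like NaN's, yet it is still equal to itself, so a generic "differs from itself" test does not catch it.

```python
import math


class Unordered:
    """Hypothetical type whose ordering comparisons are never True."""

    def __lt__(self, other):
        return False

    def __le__(self, other):
        return False

    def __gt__(self, other):
        return False

    def __ge__(self, other):
        return False


def looks_like_nan(value):
    """Generic check: a float NaN is the only float that differs from itself."""
    return value != value


u = Unordered()

print(looks_like_nan(math.nan))  # True
print(looks_like_nan(u))         # False: u == u falls back to identity
print(u < 0, u > 0)              # False False, just like math.nan
```

math.isnan would not help here either, since it raises TypeError for an object that cannot be converted to a float.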

Btw, it makes little sense to create an interval with math.nan, numpy.nan, etc. :-)

I'm sure this looks weird to someone not used to Python, but it's quite usual not to handle "special cases" except when they are not "so special" (e.g. infinities are special cases that are valid to support, but NaNs are special cases that are not that special, since they correspond to invalid usage) ;-)

@ghost
Author

ghost commented Mar 15, 2020

Oh well, I guess I fell victim to thinking there is only one true Python NaN out there, which turns out not to be the case. Good luck developing portion!

@AlexandreDecan
Owner

One official nan but many non-official ones ;-)

Keep me informed of your progress! At some point, if performance is really in favor of the C++ implementation, it could be interesting to convert portion into a wrapper around your implementation.

@ghost
Author

ghost commented Mar 15, 2020

I doubt .NET Python interop is great, to be honest, plus some semantics are going to be different. However, if I am reading your #21 benchmarks right, then my intersects are about 4000 times faster with 500 atomics (which doesn't seem right).

Also, your unions and complements seem to be much faster than your intersects and differences, while mine all take about the same time. Maybe there is some low-hanging fruit somewhere (algorithm-wise) with them? Or maybe my implementation of unions is just really bad? IDK.

Maybe a new issue should be opened?

@AlexandreDecan
Owner

I know nothing about .NET interop with Python. On the other hand, I know it's "quite easy" to integrate C/C++ code with Python, so... but anyway, we're not there yet ;-)

The benchmark in #21 is an informal one. I quickly ran it on my laptop, mainly to get an idea of the speedup obtained after changing most of the algorithms. I wouldn't be surprised to see that any implementation of those operations in another language is faster (to some extent) than mine, since Python induces quite a large overhead in general (and since I cannot "specialize" these operations, as the bounds can be any kind of object).
