
ENH: Implement `np.floating.as_integer_ratio` #10741

Merged: 2 commits merged into numpy:master from eric-wieser:as_integer_ratio on Apr 19, 2019

Conversation

@eric-wieser (Member) commented Mar 14, 2018

The original incentive here was to port tests from CPython for #9963

Fixes #9969

@mhvk (Contributor) left a comment

Beyond the comment about the test, I think this needs an entry in the release notes - I presume this is mostly there for compatibility with CPython float?

[review comment on numpy/core/tests/test_scalar_methods.py (outdated, resolved)]

eric-wieser force-pushed the eric-wieser:as_integer_ratio branch from 38924fb to 49455c9 on Mar 15, 2018

@eric-wieser (Member Author) commented Mar 15, 2018

Updated with docstring and release notes

@njsmith (Member) commented Mar 16, 2018

I assume this is inspired by the debate raging in Python core.

The argument there is actually about adding as_integer_ratio to int, which might suggest doing the same for the np.integer types. Of course our ints are less duck-compatible with floats than Python's ints, because if you try to treat say a np.int8 as if it were a float then you're going to have a bad time with wraparound.
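
A minimal sketch of that wraparound hazard (output assumes NumPy's default wrap-on-overflow behavior for fixed-width scalar ints; newer NumPy versions also emit a RuntimeWarning):

>>> import numpy as np
>>> 100 + 100                      # Python ints never overflow
200
>>> np.int8(100) + np.int8(100)    # fixed-width ints wrap modulo 2**8
-56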

The whole topic seems very unimportant to me, because who even cares about as_integer_ratio on floats or ints? It seems like a parlor trick. If you want to find out the exact value of a float there are more useful ways. But OTOH I guess this isn't necessarily an argument against it either.
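
For instance, float.hex() and decimal.Decimal both expose the exact stored value of a Python float directly:

>>> from decimal import Decimal
>>> (2.1).hex()
'0x1.0cccccccccccdp+1'
>>> Decimal(2.1)
Decimal('2.100000000000000088817841970012523233890533447265625')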

I'm wincing at adding more incompatibilities between scalars and 0d arrays. OTOH I don't know that eliminating those is actually realistic, and we already have this on float64, and this is unlikely to add major new obstacles since it seems unlikely anyone will use it much.

It's also unusual and surprising that this returns Python ints rather than numpy ints, but OTOH I guess this is necessary because the returned values might not fit into an int64.
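
For example:

>>> n, d = (1e300).as_integer_ratio()
>>> d
1
>>> n.bit_length()   # the numerator alone needs ~997 bits, far beyond int64
997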

So kinda mixed feelings about this overall. I guess my overall impression is an emphatic meh?

Am I missing important use cases?

@njsmith (Member) commented Mar 16, 2018

Oh, on further examination it looks like that started out as a debate about the float.is_integer method, which then dragged in as_integer_ratio as it went. Sorry, this is confusing...

is_integer is much more of an attractive nuisance than as_integer_ratio, since people might actually think they want it, but they usually don't (because of the usual floating point rounding issues).
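
The classic trap:

>>> 0.3 / 0.1
2.9999999999999996
>>> (0.3 / 0.1).is_integer()   # mathematically 0.3/0.1 = 3, but not in binary floats
False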

@eric-wieser (Member Author) commented Mar 16, 2018

> The argument there is actually about adding as_integer_ratio to int, which might suggest doing the same for the np.integer types

I saw something like that elsewhere, but decided to wait to see what CPython does first. As they point out, most of the value is in consistency - and what we do right now is consistent with CPython!

> If you want to find out the exact value of a float there are more useful ways.

I don't think this is true for np.longdouble. You could use Decimal(str(f)), but that loses precision.

> I guess this is necessary because the returned values might not fit into an int64.

Yep; otherwise we may as well implement longdouble.as_integer_ratio as float(ld).as_integer_ratio(), which drops precision, defeating the point of the function.

> I'm wincing at adding more incompatibilities between scalars and 0d arrays

I'm hoping that eventually we'll solve the dtype-specific-methods-on-arrays problem

@njsmith (Member) commented Mar 17, 2018

>> If you want to find out the exact value of a float there are more useful ways.
>
> I don't think this is true for np.longdouble. You could use Decimal(str(f)), but that loses precision.

Wait, but wasn't that the whole point of the Dragon4 work, that str(f) shouldn't lose precision? Does it not work for longdouble? This seems like a bug to fix regardless.

> I'm hoping that eventually we'll solve the dtype-specific-methods-on-arrays problem

I don't see how any solution is really possible, since if we allow arbitrary methods to be added to arrays then it means we can never add any new methods to arrays ourselves without potentially breaking backcompat :-/.

@seberg (Member) commented Mar 17, 2018

Sorry, I realized this is probably just spitting out noise, but since I had started writing, I did not want to delete it again ;).

My (probably naive) gut feeling on dtype specific methods is to allow something like an elementwise property which could give access to ufuncs (or at least unary ufuncs) and could mimic such methods. But then I also am of the opinion that scalars are a good idea, and I seem to be in the minority there :) (not in the sense of code duplication of course, but because of things like mutability and because I actually believe that in almost all cases it would be "obvious" if the result will be scalar or array).

The main "oddity" about compatibility is in my opinion not the 0D arrays do not behave like scalars, but the part that scalars have to pretend they are 0D arrays because we convert 0D arrays to scalars for good reason almost everywhere. It seems to me that this incompatibility is the other way around though, so I do not really see much of an issue ;).

I would almost like something like:

class MyFloat(np.float64):
    # EDIT: Defining casting might be really annoying :(

    add_method_from_ufunc(ufunc_implementation, 'possibly_method_name')

    # Or for convenience, define it all in python (slow, but....):
    @as_ufunc(out_dtype=object)
    def as_integer_ratio(self):
        # Numpy can wrap it into a ufunc:
        return integer_ratio

# EDIT: To explain:

val = MyFloat(0.1)
val.as_integer_ratio()  # Would work

val_arr = np.array([val])  # Create array of dtype MyFloat, possibly need dtype=MyFloat
val_arr.elementwise.as_integer_ratio()  # Could work

MyFloat.elementwise(val_arr)  # Something in this direction should likely also work

Now, will people be able to block our methods? Sure, but only if they subclass our types, and then we could possibly even implement a warning that a new dtype method is being/will be shadowed.

@eric-wieser (Member Author) commented Mar 17, 2018

> Wait, but wasn't that the whole point of the Dragon4 work, that str(f) shouldn't lose precision?

No, I think the point was that ftype(str(f)) shouldn't lose precision. Dragon4 produces a unique representation for each float, but it doesn't produce an exact decimal representation.

For example:

>>> import numpy as np
>>> from fractions import Fraction
>>> f16 = np.float16(2.1)
>>> str(f16)
'2.1'
>>> str(np.longdouble(f16))
'2.099609375'  # the exact value; we got '2.1' above because that was enough for uniqueness

>>> f_bad = Fraction(str(f16)); f_bad
Fraction(21, 10)
>>> f_bad == f16
False

>>> f_good = Fraction(*f16.as_integer_ratio()); f_good
Fraction(1075, 512)
>>> f_good == f16
True

@eric-wieser (Member Author) commented Apr 11, 2018

One of the incentives behind this PR was to be able to write better tests for longdouble types - in particular, the CPython tests for math.remainder fall back on .as_integer_ratio(), so it would be nice to be able to duplicate them to test remainder_ieee (#9963).
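
As a sketch of that technique (the helper name exact_remainder is illustrative, not from CPython or this PR): as_integer_ratio lets a test compute the exact expected IEEE remainder through Fraction, with no rounding anywhere.

    from fractions import Fraction
    import math

    def exact_remainder(x, y):
        # Reconstruct the mathematically exact inputs, then compute the
        # IEEE remainder r = x - n*y, with n the integer nearest to x/y.
        fx = Fraction(*x.as_integer_ratio())
        fy = Fraction(*y.as_integer_ratio())
        n = round(fx / fy)  # Python's round() ties to even, matching IEEE 754
        return fx - n * fy

    assert math.remainder(5.0, 3.0) == float(exact_remainder(5.0, 3.0))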

@eric-wieser (Member Author) commented May 25, 2018

Any more thoughts on this? I don't think that thinking about dtype-specific array methods is relevant here - we already have np.float64.as_integer_ratio, and adding np.floatXX.as_integer_ratio seems like a non-contentious extension.

The only argument I can see against this is maintenance cost - and I think that's paid for by a better ability to debug precision issues with longdouble.

@mhvk (Contributor) commented May 25, 2018

I think we might as well have it.

@eric-wieser (Member Author) commented Jun 9, 2018

@charris, any opinions?

CI failure is garbage here

@mattip (Member) commented Jun 13, 2018

Should/Did this hit the mailing list? Seems like a simple case of adapting to CPython's float type. In any case, it needs updating for 1.16.

eric-wieser force-pushed the eric-wieser:as_integer_ratio branch from 69fd573 to 0a015e5 on Aug 20, 2018

@eric-wieser (Member Author) commented Aug 20, 2018

Conflicts resolved

@eric-wieser (Member Author) commented Aug 25, 2018

Python is now getting int.as_integer_ratio() too (python/cpython#8750), so it seems like the decision is that all numeric types should have this. @njsmith, is this convincing enough for you?

@eric-wieser (Member Author) commented Nov 10, 2018

Conflicts resolved

@eric-wieser (Member Author) commented Mar 21, 2019

Guess I now need to update this for 1.17...

@mhvk (Contributor) commented Mar 21, 2019

I'd be quite happy to merge this once rebased.

ENH: Implement `np.floating.as_integer_ratio`
This matches the builtin `float.as_integer_ratio` and (in recent Python versions) `int.as_integer_ratio`.

eric-wieser force-pushed the eric-wieser:as_integer_ratio branch from 47c8700 to 79799b3 on Apr 11, 2019

@eric-wieser (Member Author) commented Apr 11, 2019

Rebased, let's see if tests still pass

eric-wieser requested a review from mhvk on Apr 12, 2019

@mhvk (Contributor) left a comment

Implementation looks all OK. For the tests, I'd prefer replacing the 1000 always-the-same random samples with a few well-chosen ones.
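
As a sketch of what "a few well-chosen ones" could look like, reusing the fixed values from the CPython test quoted later in this thread (the parametrization is illustrative, not the PR's final code):

    import numpy as np
    import pytest

    @pytest.mark.parametrize("ftype", [np.half, np.single, np.double, np.longdouble])
    @pytest.mark.parametrize("f, ratio", [
        (0.875, (7, 8)),    # exactly representable dyadic fraction
        (-0.875, (-7, 8)),  # sign goes on the numerator
        (0.0, (0, 1)),      # zero normalizes to 0/1
        (11.5, (23, 2)),    # integer part plus a dyadic fraction
    ])
    def test_small_values(ftype, f, ratio):
        assert ftype(f).as_integer_ratio() == ratio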

[two review comments on numpy/core/tests/test_scalar_methods.py (outdated, resolved)]
@tylerjereddy (Contributor) left a comment

I started refactoring the test a little locally based on Marten's feedback. Maybe I'll push a separate commit for Eric to revise.

[five review comments on numpy/core/tests/test_scalar_methods.py (resolved; four outdated)]
@tylerjereddy (Contributor) commented Apr 12, 2019

I pushed a draft commit that breaks the huge nested-loop test into smaller tests, for clearer failure reporting and less nesting in a single test. Not sure if Eric will like that, but we can always just gut the commit.

I didn't try to tackle the case generation yet, though.

@eric-wieser (Member Author) commented Apr 13, 2019

I think a lot of the test structure was designed to resemble the upstream CPython tests it emulates. I'll have to check how much I modified them.

@eric-wieser (Member Author) commented Apr 13, 2019

This is the test I copied from:

    def test_floatasratio(self):
        for f, ratio in [
                (0.875, (7, 8)),
                (-0.875, (-7, 8)),
                (0.0, (0, 1)),
                (11.5, (23, 2)),
            ]:
            self.assertEqual(f.as_integer_ratio(), ratio)

        for i in range(10000):
            f = random.random()
            f *= 10 ** random.randint(-100, 100)
            n, d = f.as_integer_ratio()
            self.assertEqual(float(n).__truediv__(d), f)

        R = fractions.Fraction
        self.assertEqual(R(0, 1),
                         R(*float(0.0).as_integer_ratio()))
        self.assertEqual(R(5, 2),
                         R(*float(2.5).as_integer_ratio()))
        self.assertEqual(R(1, 2),
                         R(*float(0.5).as_integer_ratio()))
        self.assertEqual(R(4728779608739021, 2251799813685248),
                         R(*float(2.1).as_integer_ratio()))
        self.assertEqual(R(-4728779608739021, 2251799813685248),
                         R(*float(-2.1).as_integer_ratio()))
        self.assertEqual(R(-2100, 1),
                         R(*float(-2100.0).as_integer_ratio()))

        self.assertRaises(OverflowError, float('inf').as_integer_ratio)
        self.assertRaises(OverflowError, float('-inf').as_integer_ratio)
        self.assertRaises(ValueError, float('nan').as_integer_ratio)
@mhvk (Contributor) commented Apr 13, 2019

Hmm, yes, I think I complained about the random number test before ;-), which is probably why you have only 1000, not 10000. I still do not see any point in it.

@eric-wieser (Member Author) commented Apr 13, 2019

> I pushed in a draft commit that breaks the huge / nested loop test into smaller tests

If we're breaking up the test, it would make sense to me to create a class TestAsIntegerRatio to group them together.

tylerjereddy force-pushed the eric-wieser:as_integer_ratio branch from 229fecc to ddea169 on Apr 15, 2019

@tylerjereddy (Contributor) commented Apr 15, 2019

> it would make sense to me to create a class TestAsIntegerRatio to group them together.

I've done this too now. I suspect Eric or Marten will still have some thoughts on the hard-coded test examples I've added, though. At first I thought the integer cases didn't seem well-sampled, but I think that is just deceptive because of the minexp / maxexp limits; the magnitudes of the ldexp values produced seem to be on par with the bigger loop originally in place.

@tylerjereddy (Contributor) commented Apr 15, 2019

Ah, Windows failures on the hardcoded test values. np.longdouble looks like the suspect, I think.

tylerjereddy force-pushed the eric-wieser:as_integer_ratio branch from ddea169 to 92f0187 on Apr 16, 2019

@tylerjereddy (Contributor) commented Apr 16, 2019

Well, at least we're seeing something useful from the ppc64le CI here

@mattip (Member) commented Apr 17, 2019

The case that fails: f = np.ldexp(0.4277180662199366, 14266, dtype='float128')

eric-wieser force-pushed the eric-wieser:as_integer_ratio branch from 92f0187 to 4bebf05 on Apr 18, 2019

@eric-wieser (Member Author) commented Apr 18, 2019

Thanks for the test cleanup @tylerjereddy - I should do some more reading on the hypothesis module you're using to help with these.

I amended your last commit to copy the skipif style I found elsewhere in umath tests, and fixed some variable names that I screwed up in the original test.

Feel free to tweak them some more, but I think either way this looks good to me.
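
For readers unfamiliar with hypothesis: it generates inputs for property-style assertions. A minimal sketch of the kind of round-trip property at stake here (names illustrative, not the PR's actual tests):

    from fractions import Fraction
    import numpy as np
    from hypothesis import given
    from hypothesis import strategies as st

    @given(st.floats(allow_nan=False, allow_infinity=False))
    def test_float64_ratio_roundtrip(x):
        # The returned ratio must reconstruct the scalar's exact value.
        n, d = np.float64(x).as_integer_ratio()
        assert Fraction(n, d) == Fraction(*float(x).as_integer_ratio())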

@tylerjereddy (Contributor) commented Apr 19, 2019

OK, the code changes were already approved by a core dev modulo testing refinements, which are now complete. So in it goes; thanks, Eric!

tylerjereddy merged commit ddd59a9 into numpy:master on Apr 19, 2019

15 checks passed: LGTM analysis (C/C++, JavaScript, Python), Shippable run 3613, azure-pipelines (main build, Linux Python 3.6 32-bit, Windows Python 3.5-3.7 in 32- and 64-bit, macOS), CircleCI, codecov/project (85.57%), and Travis CI.
@eric-wieser (Member Author) commented Apr 20, 2019

Thanks Tyler!
