Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fast check for C long #24244

Closed
jdemeyer opened this issue Nov 19, 2017 · 36 comments
Closed

Fast check for C long #24244

jdemeyer opened this issue Nov 19, 2017 · 36 comments

Comments

@jdemeyer
Copy link

Implement a fast variant of pyobject_to_long, which could be used to replace checks of the form isinstance(x, int).

The goal of these functions is to provide a single place to check for integers and convert them to a C long (at least in Cython code, where performance matters most). Various places in Sage check for integers in different ways. First of all, this is an inconsistency which should be fixed.

In Python 3, the types int/long are changed, so many of the existing checks will need to be changed anyway. So we might as well provide one function which should be able to replace most of these checks. And ideally, we want to do this without losing too much performance. For that reason, I avoid PyLong_FromLong in favour of a hand-written implementation.

The function pyobject_to_long has signature

cdef bint integer_check_long(x, long* value, int* err)

It does several things: first of all, it checks whether x is "integer-like". This means that it is either a Python int or long, a Sage Integer or it supports __index__. The return value of the function is whether or not it is an integer. If it is an integer, it tries to convert the value to a C long and stores it in *value. Errors are passed through the *err variable.

This interface looks overcomplicated on first sight, but the complexity is needed to support the various use cases of this function. Also note that it is an inline function, so the compiler should be able to optimize away the unused variables.

Use case 1: a function can take different kinds of input and want to check whether the input is an integer. Typically, this would be done with a long if chain like

if isinstance(x, ...):
    ...
elif isinstance(x, ...):
    ...
elif integer_check_long(x, &value, &err):
    # Process value and err as needed
elif isinstance(x, ...):
    ...

Use case 2: convert an integer to a C long purely for performance. In this case, you want to treat overflow the same as the input not being an integer:

cdef int err = -1
integer_check_long(x, &value, &err)
if err == 0:
    # Conversion successful, use value

Depends on #24252

Component: cython

Author: Jeroen Demeyer

Branch: 70592a8

Reviewer: Travis Scrimshaw

Issue created by migration from https://trac.sagemath.org/ticket/24244

@jdemeyer jdemeyer added this to the sage-8.1 milestone Nov 19, 2017
@jdemeyer
Copy link
Author

Branch: u/jdemeyer/fast_check_for_c_long

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 20, 2017

Commit: dd27cd5

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 20, 2017

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

eeee3deMove long.pxd to arith
dd27cd5New function integer_check_long()

@jdemeyer
Copy link
Author

Dependencies: #24245

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 20, 2017

Changed commit from dd27cd5 to b1ea838

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 20, 2017

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

b1ea838New functions integer_check_long() and integer_check_long_py()

@jdemeyer
Copy link
Author

Changed dependencies from #24245 to #24252

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 20, 2017

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

bb114c9Fake Integer interface
4d76bc1New functions integer_check_long() and integer_check_long_py()

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 20, 2017

Changed commit from b1ea838 to 4d76bc1

@tscrim
Copy link
Collaborator

tscrim commented Nov 20, 2017

comment:9

I think it would be better if integer_check_long and integer_check_long_py returned the error and 0 if it was completely successful. Then pyobject_to_long would then interpret the error using the enum and do the correct Python error raising. I feel like you are basically doing this, but in a roundabout way by passing err.

@tscrim
Copy link
Collaborator

tscrim commented Nov 20, 2017

comment:10

Also, in #24248, you do not even use the return bint of integer_check_long_py.

@jdemeyer

This comment has been minimized.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 21, 2017

Changed commit from 4d76bc1 to 70592a8

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 21, 2017

Branch pushed to git repo; I updated commit sha1. New commits:

70592a8Fix compiler warning

@tscrim
Copy link
Collaborator

tscrim commented Nov 22, 2017

comment:13

Use case 2 could just become:

if integer_check_long(x, &value) == 0:
    # Conversion successful, use value

which feels much more natural to me.

However, I agree that use case 1 is a bit more of a hassle because you could not subsequently assign the error message (this wouldn't be a problem in, e.g., C because of the looser syntax rules). Although you can get around this by expanding out the elif as an else-if (but it looks fugly). Do you have a specific case where this does come up?

@jdemeyer
Copy link
Author

comment:14

Replying to @tscrim:

Use case 2 could just become:

if integer_check_long(x, &value) == 0:
    # Conversion successful, use value

which feels much more natural to me.

Sure, I can simplify use case 1 and use case 2 individually. But I see no way to simplify things in such a way that both use cases still work. Alternatively, I could simply add another layer for use case 2 and make a new function

if integer_check_long_no_overflow(x, &value):
    # Conversion successful, use value

Finally, let me stress that this complexity shouldn't affect performance since we are dealing with inline functions and compilers are good at optimization.

@jdemeyer
Copy link
Author

comment:15

Replying to @tscrim:

Do you have a specific case where this does come up?

I don't know exactly what you mean, but you can look at my work in progress at #24247. There are various calls to integer_check_long(_py) there.

@jdemeyer
Copy link
Author

comment:16

Let me know what you would suggest to do with this ticket.

@tscrim
Copy link
Collaborator

tscrim commented Nov 22, 2017

Reviewer: Travis Scrimshaw

@tscrim
Copy link
Collaborator

tscrim commented Nov 22, 2017

comment:17

Replying to @jdemeyer:

Replying to @tscrim:

Do you have a specific case where this does come up?

I don't know exactly what you mean, but you can look at my work in progress at #24247. There are various calls to integer_check_long(_py) there.

Do you have a place on a future ticket where use case 1 applies? I see use case 2 on #24247 and this one:

            if integer_check_long_py(right, &value, &err):
                if not err:
                    return (<Element>left)._pow_long(value)
                else:
                    return (<Element>left)._pow_int(right)

which could be simplified to

            err = integer_check_long_py(right, &value):
            if not err:
                return (<Element>left)._pow_long(value)
            elif err == ERR_OVERFLOW:
                return (<Element>left)._pow_int(right)

However, if you see a clear benefit to take on the more complicated function description and return values, then you can set this to a positive review. I just am thinking a little bit about the potential future maintenance of this function (I also don't think there will be any [noticeable] performance difference) and the places where it is used.

@jdemeyer
Copy link
Author

comment:18

Here is a concrete proposal: we keep the complicated signature for now (if anything, simply to prevent conflicts with current tickets depending on this one). Once we have a clearer idea on how these functions are used, we can revisit and change things if needed.

@jdemeyer
Copy link
Author

comment:19

Replying to @tscrim:

I also don't think there will be any [noticeable] performance difference

Fun observation: compiler warnings can be used to get data on how the compiler is able to optimize the code. For example, in #24247 I'm writing essentially

        cdef long value
        cdef int err
        if integer_check_long(y, &value, &err):
            if not err:
                return (<Element>x)._pow_long(value)

In this case, the compiler does not give a warning that err and value might be uninitialized. This means that the compiler is able to inline and optimize all code paths of integer_check_long to prove that err and value are set to a known value whenever they are accessed in the code snippet above.

@embray
Copy link
Contributor

embray commented Nov 28, 2017

comment:21

Oh, I just saw this from #24293 (I don't look at every ticket opened...)

As for "integer_fake" is there maybe a name for it that better reflects its purpose"? I was thinking something like "integer_proto" (for prototype) since providing early access to the Integer prototype seems to be its primary purpose for existing. "integer_fake" to me sounds like something for testing, or some other odd purpose.

@jdemeyer
Copy link
Author

comment:22

Replying to @embray:

As for "integer_fake" is there maybe a name for it that better reflects its purpose"? I was thinking something like "integer_proto" (for prototype) since providing early access to the Integer prototype seems to be its primary purpose for existing.

I think that integer_fake or integer_proto are equally meaningless. What would "prototype" mean in this context? I don't mind changing the name, but I'm not convinced with integer_proto either.

"integer_fake" to me sounds like something for testing, or some other odd purpose.

"odd purpose" totally applies here :-)

@jdemeyer
Copy link
Author

comment:23

Thinking more about it, I actually like how the name sounds like it should not be used. integer_proto sounds too official somehow.

@tscrim
Copy link
Collaborator

tscrim commented Nov 29, 2017

comment:24

When I read proto(type), my first thought was the prototype design pattern, but that's right, in C it is called prototyping. However, since this is only mimicking that, I think fake is a little more accurate as to the class's purpose.

@embray
Copy link
Contributor

embray commented Nov 29, 2017

comment:25

I mean, it should be used, just for a specific narrow case. Alright I won't push on it; just something called "fake" is a little jarring to me to see outside of, say, test code.

@embray
Copy link
Contributor

embray commented Nov 29, 2017

comment:26

BTW, I should add, this is really cool. Definitely something we should have done in the first place, in retrospect.

@jdemeyer
Copy link
Author

comment:27

Replying to @embray:

BTW, I should add, this is really cool. Definitely something we should have done in the first place, in retrospect.

Thanks!

@embray
Copy link
Contributor

embray commented Nov 29, 2017

comment:28

Okay okay, what about "integer_decl", since it's basically a forward declaration IIUC.

@jdemeyer
Copy link
Author

jdemeyer commented Dec 5, 2017

comment:29

Replying to @embray:

Okay okay, what about "integer_decl", since it's basically a forward declaration IIUC.

It's a fake forward declaration which shouldn't be used.

@vbraun
Copy link
Member

vbraun commented Dec 13, 2017

Changed branch from u/jdemeyer/fast_check_for_c_long to 70592a8

@jm58660
Copy link
Mannequin

jm58660 mannequin commented Apr 24, 2018

Changed branch from 70592a8 to u/jmantysalo/70592a850d7677928592c022760d6183bd81364f

@jm58660
Copy link
Mannequin

jm58660 mannequin commented Apr 24, 2018

Changed branch from u/jmantysalo/70592a850d7677928592c022760d6183bd81364f to 70592a8

@jm58660
Copy link
Mannequin

jm58660 mannequin commented Apr 24, 2018

comment:32

Arghs, my typo.

@jm58660
Copy link
Mannequin

jm58660 mannequin commented Apr 24, 2018

Changed commit from 70592a8 to none

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants