Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimise range() iteration despite type casts on the step value #2519

Open
jakirkham opened this issue Jul 25, 2018 · 20 comments
Open

Optimise range() iteration despite type casts on the step value #2519

jakirkham opened this issue Jul 25, 2018 · 20 comments

Comments

@jakirkham
Copy link
Contributor

Given the code below, all of the functions compile except for mysum2.

cdef double mysum0(double[:] a) nogil:
    cdef size_t i
    cdef double r = 0
    for i in range(<size_t>len(a)):
        r += a[i]
    return r


cdef double mysum1(double[:] a) nogil:
    cdef size_t i
    cdef double r = 0
    for i in range(<size_t>0, <size_t>len(a)):
        r += a[i]
    return r


cdef double mysum2(double[:] a) nogil:
    cdef size_t i
    cdef double r = 0
    for i in range(<size_t>0, <size_t>len(a), <size_t>1):
        r += a[i]
    return r

The errors generated by Cython for mysum2 are as follows.

Error compiling Cython file:
------------------------------------------------------------
...


cdef double mysum2(double[:] a) nogil:
    cdef size_t i
    cdef double r = 0
    for i in range(<size_t>0, <size_t>len(a), <size_t>1):
            ^
------------------------------------------------------------

test_range.pyx:20:13: Coercion from Python not allowed without the GIL

Error compiling Cython file:
------------------------------------------------------------
...


cdef double mysum2(double[:] a) nogil:
    cdef size_t i
    cdef double r = 0
    for i in range(<size_t>0, <size_t>len(a), <size_t>1):
            ^
------------------------------------------------------------

test_range.pyx:20:13: Iterating over Python object not allowed without gil

Error compiling Cython file:
------------------------------------------------------------
...


cdef double mysum2(double[:] a) nogil:
    cdef size_t i
    cdef double r = 0
    for i in range(<size_t>0, <size_t>len(a), <size_t>1):
                 ^
------------------------------------------------------------

test_range.pyx:20:18: Calling gil-requiring function not allowed without gil

Error compiling Cython file:
------------------------------------------------------------
...


cdef double mysum2(double[:] a) nogil:
    cdef size_t i
    cdef double r = 0
    for i in range(<size_t>0, <size_t>len(a), <size_t>1):
                 ^
------------------------------------------------------------

test_range.pyx:20:18: Constructing Python tuple not allowed without gil

Error compiling Cython file:
------------------------------------------------------------
...


cdef double mysum2(double[:] a) nogil:
    cdef size_t i
    cdef double r = 0
    for i in range(<size_t>0, <size_t>len(a), <size_t>1):
                  ^
------------------------------------------------------------

test_range.pyx:20:19: Converting to Python object not allowed without gil

Error compiling Cython file:
------------------------------------------------------------
...


cdef double mysum2(double[:] a) nogil:
    cdef size_t i
    cdef double r = 0
    for i in range(<size_t>0, <size_t>len(a), <size_t>1):
                             ^
------------------------------------------------------------

test_range.pyx:20:30: Converting to Python object not allowed without gil

Error compiling Cython file:
------------------------------------------------------------
...


cdef double mysum2(double[:] a) nogil:
    cdef size_t i
    cdef double r = 0
    for i in range(<size_t>0, <size_t>len(a), <size_t>1):
                                             ^
------------------------------------------------------------

test_range.pyx:20:46: Converting to Python object not allowed without gil

Details about the environment used below. This was run on my Mac, but expect the error to occur independently of the OS used.

name: cython
channels:
  - conda-forge
  - defaults
dependencies:
  - appdirs=1.4.3=py_1
  - appnope=0.1.0=py36_0
  - asn1crypto=0.24.0=py_1
  - attrs=18.1.0=py_1
  - automat=0.7.0=py36_0
  - backcall=0.1.0=py_0
  - blas=1.1=openblas
  - bleach=2.1.3=py_0
  - ca-certificates=2018.4.16=0
  - certifi=2018.4.16=py36_0
  - cffi=1.11.5=py36_0
  - constantly=15.1.0=py_0
  - cryptography=2.3=py36hdffb7b8_0
  - cryptography-vectors=2.3=py36_0
  - cython=0.28.4=py36hfc679d8_0
  - decorator=4.3.0=py_0
  - entrypoints=0.2.3=py36_1
  - html5lib=1.0.1=py_0
  - hyperlink=17.3.1=py_0
  - idna=2.7=py36_2
  - incremental=17.5.0=py_0
  - ipykernel=4.8.2=py36_0
  - ipython=6.4.0=py36_0
  - ipython_genutils=0.2.0=py_1
  - jedi=0.12.0=py36_0
  - jinja2=2.10=py_1
  - jsonschema=2.6.0=py36_1
  - jupyter_client=5.2.3=py_1
  - jupyter_core=4.4.0=py_0
  - libffi=3.2.1=3
  - libgfortran=3.0.0=0
  - libsodium=1.0.16=0
  - line_profiler=2.1.2=py36h470a237_1
  - markupsafe=1.0=py36_0
  - mistune=0.8.3=py36_1
  - nbconvert=5.3.1=py_1
  - nbformat=4.4.0=py_1
  - ncurses=6.1=0
  - notebook=5.6.0=py36_0
  - numpy=1.14.5=py36_blas_openblashd3ea46f_201
  - openblas=0.2.20=8
  - openssl=1.0.2o=0
  - pandoc=2.2.2=hde52d81_1
  - pandocfilters=1.4.2=py_1
  - parso=0.3.0=py_0
  - pexpect=4.6.0=py36_0
  - pickleshare=0.7.4=py36_0
  - prometheus_client=0.3.0=py_0
  - prompt_toolkit=1.0.15=py36_0
  - ptyprocess=0.6.0=py36_0
  - pyasn1=0.4.3=py_0
  - pyasn1-modules=0.2.1=py_0
  - pycparser=2.18=py_1
  - pygments=2.2.0=py_1
  - pyhamcrest=1.9.0=py_2
  - pyopenssl=18.0.0=py36_0
  - python=3.6.5=1
  - python-dateutil=2.7.3=py_0
  - pyzmq=17.1.0=py36hae99301_0
  - readline=7.0=haf1bffa_1
  - scipy=1.1.0=py36_blas_openblashd3ea46f_201
  - send2trash=1.5.0=py_0
  - service_identity=17.0.0=py_0
  - setuptools=40.0.0=py36_0
  - simplegeneric=0.8.1=py_1
  - six=1.11.0=py36_1
  - sqlite=3.20.1=0
  - terminado=0.8.1=py36_0
  - testpath=0.3.1=py36_0
  - tk=8.6.8=0
  - tornado=5.1=py36_0
  - traitlets=4.3.2=py36_0
  - wcwidth=0.1.7=py_1
  - webencodings=0.5.1=py36_0
  - xz=5.2.3=0
  - zeromq=4.2.5=hfc679d8_4
  - zlib=1.2.11=h470a237_3
  - zope.interface=4.5.0=py36h470a237_0
  - twisted=17.5.0=py36_0
@jakirkham
Copy link
Contributor Author

Should add that if mysum2 is rewritten using the for ... from ... by ... (as shown below), this compiles fine.

cdef double mysum2(double[:] a) nogil:
    cdef size_t i
    cdef double r = 0
    for i from <size_t>0 <= i < <size_t>len(a) by <size_t>1:
        r += a[i]
    return r

@gabrieldemarmiesse
Copy link
Contributor

This is a known issue. When using step, Cython falls back to the python implementation of range. So the GIL must be acquired, which is against the nogil declaration. See #532 to track the progress of this issue.

@jakirkham
Copy link
Contributor Author

To be a bit clearer, dropping the <size_t> from the step parameter let's it compile fine. So having a step in range is not a problem. Having the step be cast to <size_t> (and possibly other types) is problematic.

@scoder
Copy link
Contributor

scoder commented Jul 31, 2018

The problem is that a) Cython has to know the step value at translation time in order to generate the loop comparisons in the right direction, and b) the exact details about size_t are only known at C compilation time, i.e. after Cython is done with all. This might become clearer if you used <char>200 as step – impossible for Cython to guess the sign there.

We could probably special case a couple of more expressions here, but there are always limits to the assumptions that Cython can make.

@jakirkham
Copy link
Contributor Author

Trying unsigned int ran into the same issues. Not sure based on the statement above whether this is expected or not.

Handling size_t and other unsigned types would be nice. Except for pathological cases, expect this should follow using the existing logic fine.

For signed types, would it be reasonable for Cython to generate some code that looks roughly like the following? Or is there something I'm missing?

if (step > 0) {
    for (i = start; i < stop; i += step) {
        ...
    }
} else {
    for (i = start; i > stop; i += step) {
        ...
    }
}

@scoder
Copy link
Contributor

scoder commented Jul 31, 2018

The problem with that example is the duplication of the ..., which can be an arbitrarily large chunk of code. Also, there's the case of step == 0.

And the problem with unsigned is that it's not always 100% clear at the Cython level what is unsigned and what isn't, where wrap-arounds happen, etc. Thus my example with char where the sign is platform dependent. If it's explicit, then we might be able to do something about it.

@jakirkham
Copy link
Contributor Author

That's fair. Let's play with it a bit then. What about this?

for (i = start; (step > 0) ? (i < stop) : (i > stop); i += step) {
    ...
}

Hmm...thought char was always signed. In any event, if it's ambiguous, agree assuming is problematic. Though things like unsigned int or size_t should be clear.

@robertwb
Copy link
Contributor

robertwb commented Aug 1, 2018

That would work--it would be an extra condition on each loop but likely faster than Python (and any decent C compiler should optimize away a step >= 0 if step is actually unsigned (though it may be a warning in that case)). One issue is what we should do if we can't raise an exception if step == 0.

@jakirkham
Copy link
Contributor Author

...it would be an extra condition on each loop but likely faster than Python (and any decent C compiler should optimize away a step >= 0 if step is actually unsigned (though it may be a warning in that case)).

We could always turn this into an inlined cdef function using fused types to specialize the unsigned cases with a simpler check.

One issue is what we should do if we can't raise an exception if step == 0.

Don't think we should do anything. The user can trivially generate the same problematic code with the old for ... from ... by ... syntax even if the step is a constant, 0. Cython compiles this without issues.

@scoder
Copy link
Contributor

scoder commented Aug 1, 2018 via email

@jakirkham
Copy link
Contributor Author

Agreed, but that is an orthogonal issue.

@jakirkham
Copy link
Contributor Author

jakirkham commented Feb 15, 2019

Is this the right function to be looking at? Should one be changing this? Or is there a better way to go about this?

Edit: Should one be changing this block as well?

@scoder
Copy link
Contributor

scoder commented Feb 15, 2019

_transform_carray_iteration() is not the right place. It's for iteration over C-arrays.

The tree restructuring code for the range() loops is in Optimize.py, but that's not the only related code. Sadly, the range() optimisation is sprinkled across a couple of places, including flow control analysis and type inference (actually search for range optimization). Cleaning that up is a long-standing problem, together with #1159. But it shouldn't impact the issue here.

Also see PR #368.

@jimy-byerley
Copy link

Hello,
I'm facing the same issue with unsigned steps using range
isn't still any fix to this ?

I find a bit sad to be obliged to convert from the natural python code to for ... from ... by ... just for a concern about a sign which is anyway absent ...

@thomasahle
Copy link

I'm having this issue too. @jimy-byerley how does the for ... from ... by ... syntax work?

@scoder
Copy link
Contributor

scoder commented Feb 10, 2022

I'm having this issue too. @jimy-byerley how does the for ... from ... by ... syntax work?

It's a deprecated piece of syntax that should no longer be used in new code. It will probably be removed at some point.

https://cython.readthedocs.io/en/latest/src/userguide/language_basics.html#integer-for-loops

@scoder
Copy link
Contributor

scoder commented Feb 10, 2022

The tree restructuring code for the range() loops is in Optimize.py
Also see PR #368.

It's probably not trivial to get this to work, but it first calls for a bunch of tests to see what should work and what shouldn't.
PR #368 is also still worth finishing.

@scoder
Copy link
Contributor

scoder commented Feb 10, 2022

Note that completing PR #368 would probably get us most of the benefit – this issue is then only a small optimisation on top. See #1106.

@scoder scoder changed the title Cython range dislikes size_t step Optimise range() iteration despite type casts on the step value Feb 10, 2022
@thomasahle
Copy link

thomasahle commented Feb 11, 2022

I'm having this issue too. @jimy-byerley how does the for ... from ... by ... syntax work?
It's a deprecated piece of syntax that should no longer be used in new code. It will probably be removed at some point.

So the best choice is to just do a i = 0; while i < n: ...; i += s for now?
(If I want it optimized, that is.)
The code I'm trying to Cythonize is this btw: https://en.wikipedia.org/wiki/Fast_Walsh%E2%80%93Hadamard_transform#Python_example_code

@jimy-byerley
Copy link

That's what I have done so far. 😐 There was nothing else at the time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants