Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Support for C++17 execution policy parameter #3790

Merged
merged 19 commits into from
May 25, 2021

Conversation

shwina
Copy link
Contributor

@shwina shwina commented Aug 25, 2020

C++17 introduces execution policies and overloads of many algorithms that accept an execution policy parameter. Exposing these in Cython gives users yet another way to easily leverage parallelism, and also to do things like fully transparently target GPUs.

This PR is an RFC for adding these to Cython. As a starting point, I have Cython definitions for execution policies and the relevant overloads of the sort algorithm. Before pursuing this further, I'd love to hear if this would be a welcome addition to Cython and if the approach taken looks good. If you would prefer a GitHub issue for discussion, please let me know and I'm happy to open one :-)

Note: the use of forwarding references in the C++ definition of sort makes writing the Cython defintions for them slightly non-trivial -- perhaps there is a better way to do this than the way I am doing right now.

@shwina shwina changed the title [RFC] Support for c++17 execution policy parameter [RFC] Support for C++17 execution policy parameter Aug 25, 2020
@shwina
Copy link
Contributor Author

shwina commented Aug 30, 2020

Any thoughts on moving this forward? @scoder

@da-woods
Copy link
Contributor

da-woods commented Aug 30, 2020

I'd say it needs some testing. I wouldn't bother detecting whether the execution policies are actually used (because that'll be hard) but it should at least compile and run and sort something.

There doesn't look to be a tag for cpp17 in the test-suite, but I imagine you can base a new one off the cpp11 tag. I'd probably create a new test-file alongside https://github.com/cython/cython/blob/master/tests/run/cpp_stl_algo_modifying_sequence_ops.pyx (cpp_std_algo_modifying_sequence_ops_cpp17?) for this batch of tests.

@da-woods
Copy link
Contributor

Writing small wrapper functions around templates that are hard to wrap seems like the way to go, these ones look fairly sensible and cython_std seems to be accepted as the namespace to use so the basic approach looks fine provided that they work.

@shwina
Copy link
Contributor Author

shwina commented Aug 30, 2020

Thanks @da-woods -- one concern I had was that the use of forwarding references make writing the definitions a bit tedious. Doing so for all the algorithms would effectively increase the LOC in algorithms.pxd by 3-4x. Do you have any thoughts on how difficult and/or possible it would be to support some kind of Cython syntax for forwarding references, e.g.,:

cdef void sort[ExecutionPolicy, Iter](ExecutionPolicy&& policy, Iter first, Iter second)

(Just as I was writing this out, I saw your second comment)

@da-woods
Copy link
Contributor

@shwina I don't think having many lines of code is a problem here (except that you have to write them...). I'd guess the difficulty of supporting && references would be "medium" - it involves a few small changes in a number of places (at least the parser and in pyrextypes). I don't think anyone's made any attempt to do it but I don't see why it'd be especially hard.

@shwina
Copy link
Contributor Author

shwina commented Aug 31, 2020

Thanks! Being able to declare functions with forwarding references would also be useful more generally. I'll give it a try (in a separate PR).

@shwina
Copy link
Contributor Author

shwina commented Oct 8, 2020

@da-woods

First, re: testing, we can test using the seq execution policy for all algorithms, which should be supported by all compilers/hardware targets. This ensures the overloads work, at least. Would you want to see tests for all the algorithms or just one?

Second: I proposed using forwarding references in the move declaration here. Shall I include that in this PR?

@da-woods
Copy link
Contributor

da-woods commented Oct 8, 2020

@da-woods

First, re: testing, we can test using the seq execution policy for all algorithms, which should be supported by all compilers/hardware targets. This ensures the overloads work, at least. Would you want to see tests for all the algorithms or just one?

Just testing sequential execution sounds sensible. That's basically testing that you produce runnable C++ code. I'm not sure you could meaningfully test which algorithm has run. I'd think tests of a few of the algorithms would be sufficient. There looks to be quite a few.

Second: I proposed using forwarding references in the move declaration here. Shall I include that in this PR?

I'd probably put move in its own PR. (It has the potential to break stuff, while this probably doesn't, so it probably needs a little more scruitiny)

@shwina
Copy link
Contributor Author

shwina commented Oct 8, 2020

Thanks!

  • Added a cpp17 tag
  • Added tests for a handful of algorithms called with the seq execution policy - please let me know if you'd like to see something else tested

Informally: this works great for our use case which is using the par execution policy for parallelized algorithms.

ext.extra_compile_args.append("-std=c++17")
if sys.platform == "darwin":
ext.extra_compile_args.append("-stdlib=libc++")
ext.extra_compile_args.append("-mmacosx-version-min=10.13")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a reference for this version number: https://cibuildwheel.readthedocs.io/en/stable/cpp_standards/#macos-and-deployment-target-versions

Does that look about right?

@shwina
Copy link
Contributor Author

shwina commented Oct 9, 2020

Hmm, I seem to have hit on an interesting problem. Currently some tests are failing because I've introduced "ambiguous" overloads. But according to cppreference, these overloads aren't supposed to participate in overload resolution: https://en.cppreference.com/w/cpp/algorithm/transform

Thinking about how to handle this.

@da-woods
Copy link
Contributor

da-woods commented Oct 9, 2020

The logs have

=== Got: ===
144:13: ambiguous overloaded method
329:17: Unable to deduce type parameters for void (Iter, Iter) except + nogil given (iterator, <error>)
329:32: Call with wrong number of arguments (expected 3, got 2)

This looks like the failure is happening in Cython, not C++ (thus what cppreference says isn't hugely relevant yet). At least one of the types is being deduced as Cython's inbuilt error type, which suggests something has failed a little earlier.

@da-woods
Copy link
Contributor

da-woods commented Oct 9, 2020

The easiest (i.e. cheating slightly) solution might be to use Cython's "cname" feature to give the execution policy variations a different name in Cython. For example

OutputIt transform_with_execution_policy[ExecutionPolicy, InputIt, OutputIt, UnaryOp](
        ExecutionPolicy&& policy, InputIt first1, InputIt last1, OutputIt d_first, UnaryOp unary_op) "transform" except +

It isn't ideal but it might be easier than really digging into the code that identifies c++ overloads within Cython.

The other option along these lines would be to have to pxd files: algorithms.pxd and algorithms_with_execution_policy.pxd and let the user pick which one to use by cimporting from the right one. (A comment in the files could help them find the c++17 ones)

@shwina
Copy link
Contributor Author

shwina commented Oct 9, 2020

There's does seem to be a third option, which may offer the least friction to end-users.

The problem right now is that there are these two overloads of transform:

OutputIt transform[ExecutionPolicy, InputIt, OutputIt, UnaryOp](
    ExecutionPolicy&& policy, InputIt first1, InputIt last1, OutputIt d_first, UnaryOp unary_op) except +

OutputIt transform[InputIt1, InputIt2, OutputIt, BinaryOp](
    InputIt1 first1, InputIt1 last1, InputIt2 first2, OutputIt d_first, BinaryOp binary_op) except +

When the arguments are of type (T1, T1, T1, T1, T2), Cython's cannot decide between the two. C++ however will use SFINAE to disambiguate at runtime.

So, we could technically have just the one declaration:

# This overload is ambiguos with the next one. We just let C++ disambiguate from the arguments
# OutputIt transform[ExecutionPolicy, InputIt, OutputIt, UnaryOp](
#     ExecutionPolicy&& policy, InputIt first1, InputIt last1, OutputIt d_first, UnaryOp unary_op) except +

OutputIt transform[InputIt1, InputIt2, OutputIt, BinaryOp](
    InputIt1 first1, InputIt1 last1, InputIt2 first2, OutputIt d_first, BinaryOp binary_op) except +

Thoughts?

edit: clarification

@da-woods
Copy link
Contributor

da-woods commented Oct 9, 2020

Thoughts?

My feeling is it won't work (but I'm happy to be proved wrong).

One has a function with arguments (T1, T2, T2, T3, T4) and the other has arguments (T1, T1, T2, T3, T4). I think Cython will at the next step on because it can't make the arguments match.

@shwina
Copy link
Contributor Author

shwina commented Oct 9, 2020

It does work. I believe that right now Cython raises because it sees the actual arguments with types (T1, T1, T1, T1, T2) and it can't choose between the two overloads with declared types (T1, T2, T2, T3, T4) or (T1, T1, T2, T3, T4). I'll push my changes in a bit -- dealing with an orthogonal issue at the moment.

@shwina
Copy link
Contributor Author

shwina commented Oct 9, 2020

@da-woods

I just pushed my changes along with the changes in #3875 since I ran into that bug here.

@da-woods
Copy link
Contributor

Not commenting on your proposed fix yet (because I haven't really looked at it) but given that transform (and possibly min_element, although I'm not sure why) look to have been causing specific problems it might be worth making sure that they're in the test-cases for this.

@shwina
Copy link
Contributor Author

shwina commented Oct 11, 2020

So the tests all pass for me locally but that's because I'm testing with gcc 9.3 (Ubuntu 20.04).

Apparently the earliest gcc version to support the <execution> header is gcc 9.1. See here.

I'm not sure how to proceed here. I could add a new test tag that requires a minimum compiler version for example, but I don't know how intrusive/welcome this would be.

Comment on lines +337 to +341
gcc_version = get_gcc_version(ext.language)
if gcc_version:
if float(gcc_version.group(1)) >= float(version):
return ext
return EXCLUDE_EXT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be more like "IF we're using gcc/g++, THEN check the version – otherwise, rely on the 'cpp17' tag" ? Excluding everything but g++ here seems too broad – to me. (But I didn't look up the details.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could I ask which other compilers Cython tests against? I only have access to gcc at the moment, but for the others, I'd have to track down which versions introduced the <execution> header.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything that can be used to compile Python should be allowed: clang, mingw, MSVC, intel, ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, the changes in this PR don't preclude using any of those to compile Cython. However, they do add Cython definitions for a C++17 only feature, which all compilers and all versions support.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could I ask which other compilers Cython tests against?

The automated tests get run on GCC on Linux, Clang on Mac, and MSVC on Windows.

Putting in version-checks in runtests.py for all known compilers seems a bit of a pain to me (but there isn't an obvious better way to do it, and there's precedent for c++11). I wonder if it'd be better for the tests to be off by default in runtests.py, and enabled by a command-line argument (which would be turned on in the CIs that Cython controls itself)? (But I'm not confident in this opinion)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can see, Clang still does not support this feature (Ctrl-F for "Standardization of Parallelism TS" here), and MSVC supports it as of VS 2017 15.7.

How should we proceed here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to start testing it with one compiler (as long as it's GCC :D ) and add others later, when we feel like it.
This isn't really a feature where I would expect compiler specific breakage being introduced by changes in Cython, and even if we find such problems later, we can also fix them later. And add more compilers at that point, when we hear about one where it's supposed to work but doesn't.

if func.type.is_cfunction and args:
for i, formal_arg in enumerate(func.type.args):
if formal_arg.is_forwarding_reference():
if args[i].is_lvalue():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth writing tests for (static/normal) methods since that changes args. The original version strictly uses self.args.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, do you mean additional tests in cpp_forwarding_ref.pyx for templated methods in addition to free functions? Apologies, it's been a while.. 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead and tried that. Normal methods seem to work as expected, but overloaded static methods don't. Looking into why...

Copy link
Contributor Author

@shwina shwina Apr 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm - it could be unrelated to this PR, since the following fails on master for me:

# mode: run
# tag: cpp, cpp11

cdef extern from *:
    """
    struct Foo
    {
      static const char* bar(int x, int y) {
        return "second";
      }

      static const char* bar(int x) {
        return "first";
      }

    };
    """
    cppclass Foo:
        @staticmethod
        const char* bar(int x)

        @staticmethod
        const char* bar(int x, int y)


def test_static_method_overload():
    """
    >>> test_static_method_overload()
    """
    assert Foo.bar(1) == b"first"
    assert Foo.bar(1, 2) == b"second"

Error:



=== Expected: ===


=== Got: ===
30:18: Call with wrong number of arguments (expected 2, got 1)

======================================================================
ERROR: runTest (__main__.CythonRunTestCase)
compiling (cpp/cy2) and running cpp_static_method_overload
----------------------------------------------------------------------
Traceback (most recent call last):
  File "runtests.py", line 1312, in run
    ext_so_path = self.runCompileTest()
  File "runtests.py", line 977, in runCompileTest
    return self.compile(
  File "runtests.py", line 1233, in compile
    self._match_output(expected_errors, errors, tostderr)
  File "runtests.py", line 1285, in _match_output
    self.assertEqual(None, unexpected)
AssertionError: None != '30:18: Call with wrong number of arguments (expected 2, got 1)'

----------------------------------------------------------------------
Ran 1 test in 0.288s

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks known #1851 so maybe isn't with worrying about too much

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I got curious and opened #4129 to address that :-)

runtests.py Outdated Show resolved Hide resolved
Cython/Compiler/PyrexTypes.py Outdated Show resolved Hide resolved
runtests.py Outdated Show resolved Hide resolved
tests/run/cpp_stl_algo_execpolicies.pyx Outdated Show resolved Hide resolved
@shwina
Copy link
Contributor Author

shwina commented May 25, 2021

Thanks @scoder for the fixes and apologies I couldn't get around to it earlier.

@scoder scoder merged commit a0a105d into cython:master May 25, 2021
@scoder
Copy link
Contributor

scoder commented May 25, 2021

Thanks for your contribution, @shwina

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants