Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow (compile-time) conditional "with nogil" blocks #2579

Closed
scoder opened this issue Aug 25, 2018 · 20 comments
Closed

Allow (compile-time) conditional "with nogil" blocks #2579

scoder opened this issue Aug 25, 2018 · 20 comments

Comments

@scoder
Copy link
Contributor

scoder commented Aug 25, 2018

Especially when using fused types, some code section may or may not require the GIL. Currently, the only way to release the GIL in these cases is to duplicate the code sections and wrap one in a with nogil block but not the other, e.g.

    if fusedtype is object:
        ...  # complicated code here
    else:
        with nogil:
            ... # complicated code repeated here

The normal reflex of moving the code into a cdef function also does not work out because the function then has the same declaration problem.

At least for cases where the distinction is known at compile time (which includes type checks for fused types), there should be a way to base the nogil case on a condition.

Constraints:

The solution should be as obvious from a Python perspective as possible. with nogil is currently special-cased in Cython, but that shouldn't impact its use. An obvious approach in Python would be, for example:

    cm = nogil if condition else dummy
    with cm:
        ...

but this is difficult to match with the need for special casing nogil in Cython at compile time.

Also, while an initial implementation could be restricted to compile time constant conditions, the same syntax should still be applicable when allowing the decision to be taken at runtime (by implicitly generating a duplicate code block in C).

nogil is not only a (special) context manager in Cython but also a decorator, so nogil(True) might not be easy to distinguish from @nogil, especially in pure Python code (Shadow.py). Still, also allowing @nogil(True) as a decorator would be nice.

Approach:

Given that directives like with boundscheck(True) already exist, with nogil(True) is probably the way to go. Thus, allow with nogil(flag) and detect the decorator case in pure Python mode by checking for callables in the dummy implementation in Shadow.py.

The InterpretCompilerDirectives transform in ParseTreeTransforms.py is the place where compiler directives are evaluated. For with nogil and with gil, a Nodes.GILStatNode is created. Note that this is run before transforming normal with-statements (with is not relevant for this special case) and constant folding (in the ConstantFolding tree transform, which determines the compile time value of the condition), and also long before type analysis (AnalyseExpressionsTransform) where fused types are evaluated (see the order in Pipeline.py), so the directive condition will not be finally known until constant folding and type analysis have run, and must therefore be stored as a syntax tree node instead of a boolean flag.

The type analysis step should be able to take the final decision about the condition value in Nodes.GILStatNode. However, it could well be that certain syntax tree nodes in the code block have already decided by then that they are in a nogil context. That might need to be revised to make sure they only take that decision after the compile time condition is evaluated and the final GIL state known.

@noamher
Copy link

noamher commented Jan 17, 2019

@scoder
There is a somewhat similar challenge with prange.

For example:

if use_parallelism:   # use_parallelism is a python boolean object
  with nogil:
    for i in range(K):
       ... # complicated code
else:
  with nogil:
    for i in prange(K):
      ... # complicated code repeated here

I wonder if it's possible to make a solution such as

rng = prange(K) if use_parallelism else range(K)

Another thing is that the schedule strategy of prange also has to be decided at compile time and it would be nice to be able to decide it at runtime.

@scoder
Copy link
Contributor Author

scoder commented Jan 17, 2019

With prange/OpenMP, you can set the thread count to 1 or configure the schedule with the OMP_SCHEDULE environment variable.

@noamher
Copy link

noamher commented Feb 16, 2019

@scoder, I would like to try to take this task on and I would like to hear if you think I'm on the right track.

Boundcheck is first identified as a function used as a decorator but in the InterpretCompilerDirectives it is transform into a CompilerDirectivesNode (which contains the decorated function). Before creating the CompilerDirectivesNode, the expression used as the boundscheck argument is verified to be a BoolNode (so either True or False, without any need to evaluated the expression).

I imagine GILStatNode will have to be modified to accept an optional condition (an ExprNode).
If no condition is provided, the GilStatNode will be used exactly as it is used today. e.g.

with nogil / gil:
    ...

If a condition is provided, e.g.

with nogil / gil(numeric is int):
   ...

Then the condition is stored as the expression of the condition of the GILStatNode.

A later transformation (after the fused type function is "unpacked" into multiple functions in the AnalyseDeclarationsTransform phase) will have to check whether the condition is a valid condition (either a constant boolean or an expression such as "numeric is int".

The logic to check for expressions such as "numeric is int" already exist in the ReplaceFusedTypeChecks visitor which is also called during the AnalyseDeclarationsTransform phase).

ReplaceFusedTypeChecks replaces conditions such as "numeric is int" with a BoolNode (True / False) which I believe is pruned later (the IfClauseNode is removed along with it's body if it the condition is False, or the IfClauseNode is removed but the body remains if the condition is True).

Perhaps ReplaceFusedTypeChecks can also be used to replace the condition of the GILStatNode with True / False.

An alternative is to create another Node to represent the conditional nogil.

What do you think?

@scoder
Copy link
Contributor Author

scoder commented Feb 16, 2019

Perfect analysis. The evaluation of the condition is already implemented, so all the GILStateNode has to do is to raise an error(...) for anything with not self.condition.has_constant_result(), and otherwise look at the truth value of self.condition.constant_result. A GILStatNode with a falsy condition result can be discarded.

Note that ReplaceFusedTypeChecks won't do much in non-fused functions, so this is not the right place to discard GILStatNodes. I think the best place for this is the ConstantFolding transform, which is executed relatively early in the pipeline (where it can cover all trivially constant conditions), and then runs again later for fused functions to resolve type checks etc. So this transform should cover all use cases. Note that, since it runs twice (and since it's generally a "best effort" transform), it's not the right place to raise a compiler error, since the first run will not be able to resolve all cases. It might be enough to raise the error only in the late GilCheck transform, which checks for GIL violations after all other tree transformations have run.

Now, the final question is whether all of this happens in the right order in the compiler pipeline. It might be that we already need to know about nogil sections before we know the truth value of the expression, at least in the case of fused functions. That would be unfortunate, although I guess we might get away with some ambiguity here in the case of fused types, where the code block would usually have to work without the GIL, at least in some cases. Tests will have to tell us if things "just work" or if we need to fix more things up, so quite a bit of work here is to write an extensive set of tests that mixes conditional gil/nogil blocks together with several language features to make sure that we didn't break them. Especially features like try-finally, nested gil/nogil scenarios (with and without conditions), and fused types, also mixed Python/non-Python fused types (e.g. int and object).

All positive tests for this should go into a single new test file under tests/run/, take a look at nogil.pyx and with_gil.pyx. You will also need an error test for invalid conditions and GIL violations. Look at tests/errors/nogil.pyx for that, and add a new test file next to it.

I hope I answered all your questions so far. Keep asking if you need more help.

@noamher
Copy link

noamher commented Feb 16, 2019

Would it be necessary to discard GILStatNodes outside of non fused types functions?

@scoder
Copy link
Contributor Author

scoder commented Feb 16, 2019

Would it be necessary to discard GILStatNodes outside of non fused types functions?

They somehow have to do nothing when disabled, and I think that discarding them is the easiest way to do that. Fused functions are not the only use case for this feature. You could also just imagine a global DEF constant that disables certain nogil usages, e.g. for profiling, debugging, performance comparisons, … I think it's best to not make fused functions special at all in that regard.

@noamher
Copy link

noamher commented Feb 22, 2019

should I use a compile time expression or a (normal) expression as the condition of the gil/nogil statement ?

@scoder
Copy link
Contributor Author

scoder commented Feb 22, 2019

Anything constant (.has_constant_result()). Compile time expressions become regular constant values after parsing.

@noamher
Copy link

noamher commented Feb 22, 2019

so all the GILStateNode has to do is to raise an error(...) for anything with not self.condition.has_constant_result()

Should that occur in the c'tor of GILStateNode, during parsing (in Parsing.py -> p_with_items), or in the GILCheck transform?

@scoder
Copy link
Contributor Author

scoder commented Feb 22, 2019

I'd actually do it in the ConstantFolding transform, when we know the result of the condition. Implement a visit_GILStatNode(self, node) method there and check if there is a condition with a constant result, then either set the condition attribute to None or return the node body instead of the node to discard it. Then we can raise a compile error in GILStatNode itself if it still has a condition expression during the gil checking phase.

@noamher
Copy link

noamher commented Feb 22, 2019

So during Parsing I should use p_compile_time_expr rather than p_test ?

@scoder
Copy link
Contributor Author

scoder commented Feb 22, 2019

No, p_test() is good. Any expression will do, it doesn't have to be constant at parse time. It's enough if it's constant after the ConstantFolding transform has run over it (recursively), i.e. right before you look at the GILStatNode itself.

@noamher
Copy link

noamher commented Feb 22, 2019

In ConstantFolding, no base class of GILStatNode has a visit method implemented so I'm assuming that the children of a GILStatNode (including the statements in the body) are no currently optimized.
Am I getting it wrong?

If they are not currently optimized (using ConstantFolding), should they be?
If so, self.visitchildren(node) is the way?

@noamher
Copy link

noamher commented Feb 22, 2019

Or does the base class of ConstantFolding take care of visiting all nodes (including the body of GILStatNode)?

@noamher
Copy link

noamher commented Feb 22, 2019

Okay, I just saw the line

visit_Node = Visitor.VisitorTransform.recurse_to_children

So I should definitely have the self.visitchildren(node) line.

@scoder
Copy link
Contributor Author

scoder commented Feb 22, 2019

There is a default handler for the node base class Node that matches all nodes and simply traverses to their children. When you add a new visit method, you'd usually visit all children first, and then take a look at what became of the node and its children after that. Whatever you return will replace the node.

@noamher
Copy link

noamher commented Feb 22, 2019

should the children of GILStatNode include the test member I added or just the body (as was until now)?

This question I believe is relevant to all visitors that visit GILStatNode.

@scoder
Copy link
Contributor Author

scoder commented Feb 22, 2019

When you add a new attribute, you have to add it to the child_attrs class attribute in order to have the transforms traverse it. Is that what you meant? (Oh, and call the attribute condition, not test.) And otherwise just pass it into the constructor in the parser.

@noamher
Copy link

noamher commented Feb 22, 2019

@scoder do you have an idea how I can write a test which checks whether the gil is taken?

I guess I can write a test which will fail if the gil is not taken (using some PyObject method for example).

@scoder
Copy link
Contributor Author

scoder commented Feb 23, 2019

Yes, I think a test that uses some arbitrary Python operations in nogil(False) and does not use them in nogil(True) would be a good start. Testing explicitly for the GIL seems difficult. I mean, Python operations can crash there, but that's not a good test … Better just test that Cython does the right GIL checks at the end.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants