Skip to content

enhancements argumentnonechecks

StefanBehnel edited this page Apr 16, 2010 · 11 revisions

CEP 520: None checks for extension type arguments in functions

Currently, Cython allows users to annotate typed Python function arguments with a "not None" syntax, as in

def func(MyExtType ext not None):
    return ext.some_c_method()

In the above example, a None check is required to prevent a crash when accessing the C-level method, as the MyExtType declaration allows None as a valid argument, in addition to all instances of MyExtType or a subtype.

This CEP proposes to

  1. replace (or accompany) the "not None" annotation that forbids None values with an "or None" annotation that allows them, for example:

    {{{ def func(MyExtType ext or None):

    if ext is None:

    doSomethingElse()

    else:

    ext.some_c_method()

    }}}

  1. allow the 'not None' / 'or None' annotations also for builtin types, e.g.

    {{{ def func(list L or None):

    ...

    }}}

  1. change the default behaviour of extension/builtin type arguments in functions to strictly test the type of the argument, i.e. to __reject__ the value None and raise a TypeError for it.
  1. add a new directive allow_none_for_extension_args that allows to control the default behaviour.

Background

  • In version 0.9.9, Pyrex has shifted from a negative "not None" annotation to a positive "or None" annotation for function arguments.
  • In wrapper code, it is extremely common to pass extension type proxies into functions or methods. When accessing the C attributes or methods of these arguments, they must not be None, so programmers are required to make sure the either check the input value themselves or add the "not None" annotation. This is very easy to forget and leads to crash bugs.
  • Especially for new users, it is very easy to miss the fact that an extension type argument allows None as a value, so it is easy to write code that crashes on None input. The type annotation seems to suggest that the value is assured to have the requested type, which it does not in the case of a None value.
  • It is currently impossible to write def func(list a not None):. While this cannot currently lead to a crash (as long as the struct fields of builtin type are not properly supported), it seems inconsistent considering that builtin types are not very different from extension types.

Proposal

The CEP proposes to replace the "not None" annotation by a reversed "or None" annotation, and to change the default behaviour from accepting None as a value for extension type arguments to a strict type test that rejects it.

  • Example 1: {{{ def func(MyExtType ext not None):

    return ext.some_c_method()

    }}} would become {{{ def func(MyExtType ext):

    return ext.some_c_method()

    }}}

  • Example 2: {{{ def func(MyExtType ext):

    if ext is None:

    doSomethingElse()

    else:

    ext.some_c_method()

    }}} would become {{{ def func(MyExtType ext or None):

    if ext is None:

    doSomethingElse()

    else:

    ext.some_c_method()

    }}}

  • Example 3: {{{ def func(MyExtType ext=None):

    if ext is None:

    doSomethingElse()

    else:

    ext.some_c_method()

    }}} would behave as before. Here, the None default value would simply be special cased in the compiler.

  • Example 4: {{{ def func(MyExtType ext not None = None):

    ...

    }}} would become {{{ def func(MyExtType ext = None):

    ...

    }}} and would be a compile time error in both cases.

  • Example 5 (using the proposed syntax): {{{ def func(MyExtType ext or None = None):

    ...

    }}} would be redundant but allowed and is identical to {{{ def func(MyExtType ext = None):

    ...

    }}}

  • Example 6 (using the proposed syntax): {{{ def func(MyExtType ext or None = DEFAULT_VALUE):

    ...

    }}} would be the required syntax to allow None in addition to a non-None default value or one that is not known to be None at compile time.

  • Example 7 (using the proposed syntax): {{{ def func(list L or None):

    print (None if L is None else L[0])

    def func_not_None(list L not None):

    print L[0]

    }}}

  • Example 8 (using the proposed directive): {{{ @cython.allow_none_for_extension_args(True) def func(MyExtType x):

    if ext is None:

    doSomethingElse()

    else:

    ext.some_c_method()

    }}}

  • Example 9: {{{ def func1(object o not None):

    ...

    def func2(o not None):

    ...

    }}} would be allowed and reject None, which would otherwise be allowed by the strict type test (because isinstance(None, object) == True!).

Discussion

Pros:

  • This change touches on input argument checking and makes it easier to write code that doesn't crash on unexpected input. Currently, Cython raises a TypeError for all input that does not match the declared type of a parameter (e.g. when coercing numeric values), except when receiving None for a declared extension type argument. This special case would disappear.
  • Typed argument values that must reject None are more common than those that can safely allow them without being optional arguments. So the current syntax requires more overhead on the user side than the proposed syntax.
  • It is easier to read MyExtType ext or None than to remember that MyExtType ext allows None as input.
  • Functions that require None to be allowed as input value are easy to spot due to the TypeError that they will generate with the proposed change. This is much easier to fix than to debug a function that crashes hard somewhere within a deep call stack because the None case was not properly handled.
  • The automatically generated docstring signature will make it much more explicit where a None value is allowed than the current syntax.
  • Supporting builtin types removes an inconsistency between extension type arguments and builtin type arguments.

Cons:

  • The current semantics for declaring a typed variable (cdef MyExtType x) allows the value None, so that the semantics of arguments would differ in the default behaviour. Changing the behaviour of cdef MyExtType x is not an option as the variable requires an initial value and users must be able to set it to a neutral value to destroy the reference they contain.
  • None values can appear in other places, too (e.g. as return values from function calls). So even if argument types are an important place where Python values from the outside world must be validated, this would make their type checking a special case without removing the problem completely.
  • Working control flow analysis in the compiler would make such an annotation less interesting, as the compiler could then see where potential None values can make operations unsafe and generate a None check in those places. It cannot replace the use case of an initial None check, though, as the usage of the value can well appear within a loop, where a None check can induce a serious performance impact.
  • Passing None for a builtin type cannot currently lead to a crash, so this may not be worth breaking existing code.
Clone this wiki locally