[ENH] Implement "typing.Optional[]" and give it Python semantics ("not None" vs. "or None") #3883

scoder · 2020-10-18T06:32:02Z

In Cython, def func(list arg) allows arg to be None. In Python's PEP-484 typing universe, def func(arg: list) excludes None and passing None requires the spelling def func(arg: Optional[list]).

Given that Optional[…] exists, we could change the type analysis of PEP-484 argument annotations to exclude None by default (i.e. interpret them as if not None was given), since this it is a common source of mistakes (see #2696). The use of Optional[] would then be interpreted as or None.

This would probably not apply to variable annotations. While PEP-526 doesn't seem to mention the topic anywhere explicitly, it seems hard to prevent variables from being set to None, and it has a runtime impact to prevent that case. Also, it does not seem very useful to actively prevent users from setting variables to None.

Also note that there is the issue of subtypes, which Python typing allows but Cython typing rejects. This difference would probably remain.

The text was updated successfully, but these errors were encountered:

da-woods · 2020-10-18T19:58:55Z

Some of the problem here is identifying typing.Optional. It's fairly easy when you have an annotation of typing.Optional but as far as I can tell there isn't an easy mechanism to do it for from typing import Optional or import typing as not_typing; somevar: not_typing.Optional[int].

I'm trying to get this to work for #3400. If I get something satisfactory I'll try to make a PR for it on its own, because it'd probably be useful here too.

0dminnimda · 2021-06-29T11:04:10Z

from typing import Optional or import typing as not_typing; somevar: not_typing.Optional[int].

@da-woods Should we allow this at all? Because it seems that the only real cases are

import typing
var: typing.Optional[...]

from typing import Optional
var: Optional[...]

And we can still just watch the var: x.Optional[...] and give warnings that "this is not something that we will take it as typing.Optional, could you mean typing.Optional?"

da-woods · 2021-06-29T17:34:39Z

@da-woods Should we allow this at all?

I don't see why not. You have to track it anyway because users could name their own global variables typing or Optional and these shouldn't affect the type. Anyway, the code to allow it already exists and already works so why not allow it?

0dminnimda · 2021-06-29T18:08:03Z

Anyway, the code to allow it already exists and already works so why not allow it?

I don't mind, I just suggested an easier way.

matusvalo · 2022-01-29T19:24:09Z

@scoder this should be closed based on PR #3400 right?

da-woods · 2022-01-29T19:49:25Z

@matusvalo yes I think it should

scoder · 2022-01-29T20:12:59Z

Is it complete, though? I also suggested to change the default semantics of func(x: str) to disallow None for x and require users to write func(x: Optional[str]). While I rejected #2696 as such, I'm not sure that this equally applies to type annotations. Providing them with full Python semantics, given that new code has to be written or old code modified in order to use them, seems tempting.

scoder · 2022-01-29T21:24:25Z

Ah, right. It seems that this was also implemented:

cython/Cython/Compiler/Nodes.py

Lines 950 to 973 in 102366d

    
               # inject type declaration from annotations 
        
               # this is called without 'env' by AdjustDefByDirectives transform before declaration analysis 
        
               if (self.annotation and env and env.directives['annotation_typing'] 
        
                       # CSimpleBaseTypeNode has a name attribute; CAnalysedBaseTypeNode 
        
                       # (and maybe other options) doesn't 
        
                       and getattr(self.base_type, "name", None) is None): 
        
                   arg_type = self.inject_type_from_annotations(env) 
        
                   if arg_type is not None: 
        
                       base_type = arg_type 
        
               return self.declarator.analyse(base_type, env, nonempty=nonempty) 
        
           def inject_type_from_annotations(self, env): 
        
               annotation = self.annotation 
        
               if not annotation: 
        
                   return None 
        
               base_type, arg_type = annotation.analyse_type_annotation(env, assigned_value=self.default) 
        
               if base_type is not None: 
        
                   self.base_type = base_type 
        
               if arg_type and arg_type.python_type_constructor_name == "typing.Optional": 
        
                   self.or_none = True 
        
                   arg_type = arg_type.resolve() 
        
               if arg_type and arg_type.is_pyobject and not self.or_none: 
        
                   self.not_none = True 
        
               return arg_type

scoder · 2022-01-29T21:46:34Z

One issue that I found is that (as before) this only applies to Python calls. cdef functions look the same but neither the function nor the caller check for None when it's disallowed. This is because previously, not None was not allowed on cdef function arguments. Now it's intended to be the default.

Should we make this fail on the caller side?

cdef func(x: list): ...

func(None)

It feels a bit mixed. Here, the developer is usually in charge of the calling code. User provided None values should be ruled out at this point. So there doesn't seem to be a strict requirement to change this into a runtime error.

What do you think?

scoder · 2022-01-29T22:05:07Z

Oh, and we probably also need some error tests for things like func(x: Optional[SomeStruct]) or other C types, right? That shouldn't be allowed.

da-woods · 2022-01-29T22:14:03Z

I'm not completely sure what the right thing to do is, but it should definitely be tested and documented!

Having different rules for cdef functions feels a bit obscure to me.

Optional[ctype] probably should be an error (as you say). For structs it obviously can't work. And I think people would find Optional[double] being interpretted as a PyObject to be confusing

scoder · 2022-01-29T22:26:15Z

One more tricky thing: func(x: Optional[double[:]]). This, on the other hand, should probably be allowed. Memory views definitely suffer from the same "didn't expect that to be None" gotcha.

…unc(x: Optional[list])". See cython#3883

scoder · 2022-01-29T23:46:20Z

I have implemented Optional / or None semantics for func(x: list = None) in 13f8cd7.

scoder · 2022-01-29T23:52:13Z

I'm not completely sure what the right thing to do is, but it should definitely be tested and documented!

Yes.

Having different rules for cdef functions feels a bit obscure to me.

Sort-of, yes. I mean, cdef functions already have different semantics, but not when it comes to argument type behaviour.

Optional[ctype] probably should be an error (as you say).

Turns out, it already is, because or None is disallowed for non-Python types. I'll just improve the error message there.

For structs it obviously can't work. And I think people would find Optional[double] being interpreted as a PyObject to be confusing

It looks like Optional[float] and Optional[int] are rejected now, though. That shouldn't be the case. They should be resolved to the Python types instead of the C types, as should generally be the case for Python type annotations.

…d complaining about "or None", which wasn't the actual syntax construct on user side. See #3883

scoder · 2022-01-30T11:16:57Z

It looks like Optional[float] and Optional[int] are rejected now, though.

Turns out that this is difficult to handle. The type analysis converts explicit type references like cython.int to c_int_type, and it does the same for int, because it lacks the context whether C types are expected here. Thus, Optional[int] (which is ok and 'obviously' refers to a Python int) and Optional[cython.int] (which should be rejected) look really the same and cannot easily be distinguished. I think we might have to reject C types with Optional[] at an earlier stage, so that the type declaration stages can rely on valid type declarations already (or at least don't have to second-guess what the type declaration looked like in actual code).

… "Optional[int]" etc. interprets "int" as (valid) Python int type and not (invalid) C int type. See #3883

scoder · 2022-01-30T20:13:51Z

I looked into the test failures that I introduced in the master branch (sorry for that) and one of the issues is this test:

cython/tests/run/pep526_variable_annotations.py

Lines 187 to 189 in 7914e3b

    
           x: typing.Tuple[int, float] = (a[0], a[1]) 
        
           y: Tuple[int, ...] = (1,2.) 
        
           z = a[0]  # should infer to int

As you can see, it assumes that x: Tuple[int, float] refers to a tuple of C int, C float. This is a wrong assumption. It sees a Python type declaration and should infer (Python int, Python float) instead (or C double, for that matter). Otherwise, it would incorrectly constrain the value range of the first tuple item, if this comes in from Python code.

I've reverted my changes in master and will see how I can fix this up in a separate branch.

da-woods · 2022-01-30T20:17:12Z

It was definitely my intent that typing.Tuple should be able to produce C-tuples (and I think this is a reasonable thing to do).

I think here it shouldn't though, because it should require cython.int and cython.float? Sorry about that... I probably should have spotted this.

da-woods · 2022-01-30T20:43:27Z

So my view: AnnotationNode.analyse_type_annotation seems to have a specific workaround for int/long/float which applies directly to the annotation.

if annotation.is_name and not annotation.cython_attribute and annotation.name in ('int', 'long', 'float'):

These subscripted types can end up with fairly deeply nested types, which we want to detect these in. That suggests we'd be better passing an safe_annotation_typing flag to analyse_as_type and then using that flag to decide what types to keep, rather than doing it all in the method of AnnotationNode?

scoder · 2022-01-30T20:47:17Z

Yes, nesting is an issue, but also e.g. cdef Set[int] x vs. x: Set[int]. In the first case, int is a C int, in the second, it needs to be a Python int.

… "Optional[int]" etc. interprets "int" as (valid) Python int type and not (invalid) C int type. See cython#3883

jakirkham · 2022-02-02T00:16:16Z

Yes, nesting is an issue, but also e.g. cdef Set[int] x vs. x: Set[int]. In the first case, int is a C int, in the second, it needs to be a Python int.

How should these be handled in a .pyx/.pxd vs. a .py file?

scoder · 2022-02-02T01:43:46Z

Yes, nesting is an issue, but also e.g. cdef Set[int] x vs. x: Set[int]. In the first case, int is a C int, in the second, it needs to be a Python int.

How should these be handled in a .pyx/.pxd vs. a .py file?

I also had to stop and think about that for a moment, but the file type should make no difference here.

The reason for the different meanings is that cdef Set[int] x should work like cdef int x, where int has always referred to a C int (because there is no reason to refer to a Python int here). If you use Python typing syntax, however, it needs to mimic Python semantics, so x: Set[int] and x: int must both refer to a Python int, simply because we might really be looking at code here that was written for Python and not for Cython. Whether that code appears in a .py file or in a (potentially renamed) .pyx file (or even a .pxd file) is irrelevant.

da-woods · 2022-02-02T07:11:54Z

Set[int] presumably refers to set in both cases (because we can't do much with the container contents). I think Tuple is the only container where we try to make use of the element types (with ctuples).

scoder · 2022-02-02T10:16:28Z

Set[int] presumably refers to set in both cases (because we can't do much with the container contents). I think Tuple is the only container where we try to make use of the element types (with ctuples).

Correct as it stands, but it doesn't need to stay that way. We could certainly start using Set[int] to infer the type of iteration variables and other kinds of item access. (#1979)

scoder · 2022-02-02T10:29:08Z

In fact, we could consider the PythonTypeConstructor type a feature specifically of type inference, and only resolve types before setting node.type. That way, we could use the full set of available type information in the inference (and possibly optimisation) phases. Even after type analysis, whenever we can make use of nested type information, we can let nodes infer their type again instead of just looking at node.type.

jakirkham · 2022-02-02T22:29:26Z

Set[int] presumably refers to set in both cases (because we can't do much with the container contents). I think Tuple is the only container where we try to make use of the element types (with ctuples).

Correct as it stands, but it doesn't need to stay that way. We could certainly start using Set[int] to infer the type of iteration variables and other kinds of item access. (#1979)

Just as another thought, this could be a place where C++ developers might want to use std::set. Whether/how that would be setup is admittedly a whole other discussion, but just want to surface that use case.

To the original question about whether file type should affect type interpretation, this could be posed with Tuple instead of Set. It wasn't really meant to be focused on Set specifically.

da-woods · 2022-02-02T22:37:53Z

I'd be reluctant to start substituting C++ containers for Python types. There's two set implementations for C++ (set and unordered_set), and they have different internal designs and different performance characteristics, and different requirements for what they hold. I believe the Python set is probably closer to unordered_set. It also creates a copy (which often wouldn't be wanted).

I think I agree that file-type shouldn't affect interpretation. Mainly because I think it'd be a complicated rule that'd be hard to explain and end up confusing people (so it'd need a very good justification why it's needed)

jakirkham · 2022-02-02T22:45:34Z

Sorry didn't mean to send us down a rabbit hole with C++. Just thinking of other cases where Set[int] and others might be useful. Though would add there are a lot of ways this could be implemented to handle these differences, but maybe that can be saved for another discussion.

… "Optional[int]" etc. interprets "int" as (valid) Python int type and not (invalid) C int type. See cython#3883

wyfo · 2022-02-10T11:33:57Z

I have implemented Optional / or None semantics for func(x: list = None) in 13f8cd7.

I think it could be a good idea to also add PEP 604 support, i.e. var: dict | None

scoder · 2022-02-10T12:27:34Z

I think it could be a good idea to also add PEP 604 support, i.e. var: dict | None

Right. But writing up #4631, I can see that that's not a small topic.

wyfo · 2022-02-10T12:48:25Z

By the way, I see in 13f8cd7 that you've supported implicit optionals (non-optional but with None default value).
However, this behavior is now deprecated, as stated in PEP 484:

This is no longer the recommended behavior. Type checkers should move towards requiring the optional type to be made explicit.

I think it would be better thus to not support it, but I'm maybe too strict.

…cit "typing.Optional[]" declaration. See #3883

… "Optional[int]" etc. interprets "int" as (valid) Python int type and not (invalid) C int type. See cython#3883

scoder · 2022-02-10T13:15:04Z

By the way, I see in 13f8cd7 that you've supported implicit optionals (non-optional but with None default value). However, this behavior is now deprecated
I think it would be better thus to not support it, but I'm maybe too strict.

Thanks for the notice. I'm sure there's a considerable body of code out there by now that makes use of this 'obvious' simplification, so a warning seems better than a syntax error.
b1b23de

… "Optional[int]" etc. interprets "int" as (valid) Python int type and not (invalid) C int type. See cython#3883

* Check for "Optional[ctype]" earlier because we need to make sure that "Optional[int]" etc. interprets "int" as (valid) Python int type and not (invalid) C int type. See #3883 * Fix typing assumptions in PEP 526 variable annotations test: in a Python type annotation, "int" means Python int and "float" means Python float, not the C types. * Use a context manager to make it explicit in annotation type analysis when C types are allowed, and when Python types are required or expected. * Generalise the concept of equivalent Python and C types for more efficient type inference: PyFloat/double, PyBool/bint, PyComplex/double complex. * Refactor analyse_type_annotation() to prepare the extraction of type modifiers (as opposed to special types). See discussion in #4606 (comment) * Refactor handling of "typing.Optional", "dataclasses.InitVar" etc. annotations to move them into the declared Entry during type analysis and keep only the bare type in the type system. * Force ClassVar[...] types to be object types. * Add a warning when users define a ClassVar[] with a non-Python type. See #4606 (comment) * Provide a helpful warning when users write plain C types in a non-C annotation context. * Only consider Python object item types from list/tuple as self.type in IndexNode since that will be the result of the index access. Coercion needs to happen externally, then based on the type inference. * Ignore Python annotation type "long" since it almost certainly does not refer to PyLong but to C long. Issue a warning to make users aware of it. * Fix PEP-526 test by working around incomplete type inference, but leave FIXME comments.

da-woods · 2022-07-31T09:53:35Z

This issue is largely done I think?

scoder · 2022-07-31T15:04:10Z

I think so, too. Let's close it.

scoder added feature Python Semantics Type Analysis labels Oct 18, 2020

scoder added this to the 3.0 milestone Oct 18, 2020

scoder changed the title ~~[ENH] Implement typing.Optional[] and give it Python semantics (not None vs. or None)~~ [ENH] Implement "typing.Optional[]" and give it Python semantics ("not None" vs. "or None") Oct 18, 2020

da-woods mentioned this issue Nov 15, 2020

typing support #3910

Closed

da-woods mentioned this issue May 30, 2021

[ENH] typing.Union -> fused type? #4201

Open

da-woods closed this as completed Jan 29, 2022

scoder reopened this Jan 29, 2022

scoder added a commit to scoder/cython that referenced this issue Jan 29, 2022

Give "func(x: list = None)" the same "type allows None" meaning as "f…

13f8cd7

…unc(x: Optional[list])". See cython#3883

scoder added a commit that referenced this issue Jan 29, 2022

Improve the error message for "Optional[ctype]". Previously, it faile…

4225d7c

…d complaining about "or None", which wasn't the actual syntax construct on user side. See #3883

scoder added a commit that referenced this issue Jan 30, 2022

Check for "Optional[ctype]" earlier because we need to make sure that…

f73108c

… "Optional[int]" etc. interprets "int" as (valid) Python int type and not (invalid) C int type. See #3883

da-woods mentioned this issue Jan 30, 2022

Start of trying to handle nested annotation_types #4605

Closed

scoder added a commit to scoder/cython that referenced this issue Jan 31, 2022

Check for "Optional[ctype]" earlier because we need to make sure that…

850556d

… "Optional[int]" etc. interprets "int" as (valid) Python int type and not (invalid) C int type. See cython#3883

scoder mentioned this issue Jan 31, 2022

Fix annotation type analysis for Python "typing" types #4606

Merged

scoder added a commit to scoder/cython that referenced this issue Jan 31, 2022

Check for "Optional[ctype]" earlier because we need to make sure that…

0de3933

… "Optional[int]" etc. interprets "int" as (valid) Python int type and not (invalid) C int type. See cython#3883

matusvalo mentioned this issue Feb 3, 2022

[BUG] Cython does not recognise type correctly when typing.Optional is used #4614

Closed

scoder added a commit to scoder/cython that referenced this issue Feb 4, 2022

Check for "Optional[ctype]" earlier because we need to make sure that…

c24efc8

… "Optional[int]" etc. interprets "int" as (valid) Python int type and not (invalid) C int type. See cython#3883

scoder added a commit that referenced this issue Feb 10, 2022

Follow PEP-484 and warn about "func(x: list = None)" without an expli…

b1b23de

…cit "typing.Optional[]" declaration. See #3883

scoder added a commit to scoder/cython that referenced this issue Feb 10, 2022

Check for "Optional[ctype]" earlier because we need to make sure that…

fde83e2

… "Optional[int]" etc. interprets "int" as (valid) Python int type and not (invalid) C int type. See cython#3883

scoder added a commit to scoder/cython that referenced this issue Feb 24, 2022

Check for "Optional[ctype]" earlier because we need to make sure that…

2b3559b

… "Optional[int]" etc. interprets "int" as (valid) Python int type and not (invalid) C int type. See cython#3883

scoder closed this as completed Jul 31, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ENH] Implement "typing.Optional[]" and give it Python semantics ("not None" vs. "or None") #3883

[ENH] Implement "typing.Optional[]" and give it Python semantics ("not None" vs. "or None") #3883

scoder commented Oct 18, 2020 •

edited

Loading

da-woods commented Oct 18, 2020

0dminnimda commented Jun 29, 2021

da-woods commented Jun 29, 2021

0dminnimda commented Jun 29, 2021

matusvalo commented Jan 29, 2022

da-woods commented Jan 29, 2022

scoder commented Jan 29, 2022

scoder commented Jan 29, 2022

scoder commented Jan 29, 2022

scoder commented Jan 29, 2022

da-woods commented Jan 29, 2022

scoder commented Jan 29, 2022

scoder commented Jan 29, 2022

scoder commented Jan 29, 2022

scoder commented Jan 30, 2022

scoder commented Jan 30, 2022

da-woods commented Jan 30, 2022

da-woods commented Jan 30, 2022

scoder commented Jan 30, 2022

jakirkham commented Feb 2, 2022

scoder commented Feb 2, 2022

da-woods commented Feb 2, 2022

scoder commented Feb 2, 2022

scoder commented Feb 2, 2022

jakirkham commented Feb 2, 2022

da-woods commented Feb 2, 2022

jakirkham commented Feb 2, 2022

wyfo commented Feb 10, 2022

scoder commented Feb 10, 2022

wyfo commented Feb 10, 2022 •

edited

Loading

scoder commented Feb 10, 2022

da-woods commented Jul 31, 2022

scoder commented Jul 31, 2022

[ENH] Implement "typing.Optional[]" and give it Python semantics ("not None" vs. "or None") #3883

[ENH] Implement "typing.Optional[]" and give it Python semantics ("not None" vs. "or None") #3883

Comments

scoder commented Oct 18, 2020 • edited Loading

da-woods commented Oct 18, 2020

0dminnimda commented Jun 29, 2021

da-woods commented Jun 29, 2021

0dminnimda commented Jun 29, 2021

matusvalo commented Jan 29, 2022

da-woods commented Jan 29, 2022

scoder commented Jan 29, 2022

scoder commented Jan 29, 2022

scoder commented Jan 29, 2022

scoder commented Jan 29, 2022

da-woods commented Jan 29, 2022

scoder commented Jan 29, 2022

scoder commented Jan 29, 2022

scoder commented Jan 29, 2022

scoder commented Jan 30, 2022

scoder commented Jan 30, 2022

da-woods commented Jan 30, 2022

da-woods commented Jan 30, 2022

scoder commented Jan 30, 2022

jakirkham commented Feb 2, 2022

scoder commented Feb 2, 2022

da-woods commented Feb 2, 2022

scoder commented Feb 2, 2022

scoder commented Feb 2, 2022

jakirkham commented Feb 2, 2022

da-woods commented Feb 2, 2022

jakirkham commented Feb 2, 2022

wyfo commented Feb 10, 2022

scoder commented Feb 10, 2022

wyfo commented Feb 10, 2022 • edited Loading

scoder commented Feb 10, 2022

da-woods commented Jul 31, 2022

scoder commented Jul 31, 2022

scoder commented Oct 18, 2020 •

edited

Loading

wyfo commented Feb 10, 2022 •

edited

Loading