Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

An issue with forward declarations and recursive type #22

Open
mitar opened this issue Dec 19, 2017 · 25 comments
Open

An issue with forward declarations and recursive type #22

mitar opened this issue Dec 19, 2017 · 25 comments

Comments

@mitar
Copy link
Collaborator

mitar commented Dec 19, 2017

Example:

import typing

from pytypes import type_util

Container = typing.Union[
    typing.List['Data'],
]

Data = typing.Union[
    Container,
    str, bytes, bool, float, int, dict,
]

type_util._issubclass(typing.List[float], Container)
Traceback (most recent call last):
  File "<redacted>/lib/python3.6/site-packages/pytypes/type_util.py", line 1387, in _issubclass_2
    return issubclass(subclass, superclass)
TypeError: Forward references cannot be used with issubclass().

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 14, in <module>
    type_util._issubclass(typing.List[float], Container)
TypeError: Invalid type declaration: float, _ForwardRef('Data')
@Stewori
Copy link
Owner

Stewori commented Dec 19, 2017

I think currently forward declarations are only supported directly in annotations. Not sure if a mix of types and forward declarations is supported, but I suppose it's not (yet). Currently you can only use things like

def some_function(a: "a_type_defined_later"): ...

Putting the whole type declaration into a string might work, but I suppose forward declarations are currently parsed in get_type_hints or so. I'm not sure when I'll find time to implement this. PRs welcome...

@Stewori Stewori changed the title An issue with recursive type An issue with forward declarations and recursive type Dec 19, 2017
@Stewori
Copy link
Owner

Stewori commented Dec 19, 2017

Quickly looked into the code. The relevant position is at

val = eval(val, globs)
. There it detects a string eventually and would resolve forward declaration. We could add such a case to _issubclass but then it would resolve the forward declaration again and again for every new subtype check. Or _issubclass is allowed to modify the incoming type. Or we have some kind of cache.

A problem with resolving forward declarations within _issubclass is that _issubclass usually does not know the context, i.e. the module where the declaration is defined. It can maybe guess it from the caller; I suspect it is not possible to get the enclosing module from a type, especially not if it is one pure forward declaration string. If you have an idea how _issubclass should/could retrieve that info, let me know. Or maybe we should add yet another optional arg that names the module to use for resolving forward declarations.

Maybe the best idea would be to have a public service function (e.g. resolve_forward_decl) that resolves forward declarations in a type and also takes the module as explicit argument, returning the resolved type. Then _issubclass simply requires that a type containing forward declarations was previously processed by that function. An appropriate error msg would point the user to this.

Also note that resolving a forward ref, even checking if a type needs to resolve a forward ref somewhere internally is expensive and in most cases not necessary. With an explicit service function the user can actively and transparently resolve types as needed. What do you think?

@mitar
Copy link
Collaborator Author

mitar commented Dec 19, 2017

I think the public service function would be the best. It just has to make sure it resolves recursive types through reusing the same reference instead of copying the type over again and again. But yes, I think this would be the best and could also be done only once/cached on the caller's side.

@mitar
Copy link
Collaborator Author

mitar commented Dec 23, 2017

Any chance on implementing this soon or helping a bit how to do it? I thought I can make a workaround for us but it seems it does not work and it started blocking one validator we have for code and are using this package to check types. :-(

@Stewori
Copy link
Owner

Stewori commented Dec 24, 2017

Sorry, I was rather busy during christmas time. It would be great to have some help on this one. I already spent some thoughts about implementing the resolve function. I'll try to put a draft together we can iterate on...

@mitar
Copy link
Collaborator Author

mitar commented Dec 24, 2017

Yea, I completely understand about holidays. I tried to finish this before them, but it seems I got stuck now. Anyway, don't worry, I will try to see what I can do.

@Stewori
Copy link
Owner

Stewori commented Dec 24, 2017

So, I spontaneously committed some work on this as of 0f454ae. This should help with your issue for now.
However, it maybe does not yet handle callables correctly.
And it might run into recursion issues eventually. The example from above works now, if you do it like this:

Container = Union[Sequence['Data'], int] # All this wouldn't work with List because of invariant arg type
Data = Union[Container, str, bytes, bool, float, int, dict]
pytypes.resolve_fw_decl(Container) # helpful exception test will show up if this line is left out
pytypes.is_subtype(typing.List[float], Container) # true
pytypes.is_subtype(typing.List[complex], Container) # false

Note that you need Sequence still because of the invariant parameter of List discussed in other thread.

@mitar
Copy link
Collaborator Author

mitar commented Dec 24, 2017

You are amazing! I will check it immediately!

@mitar
Copy link
Collaborator Author

mitar commented Dec 27, 2017

Hm, what is the second return value from resolve_fw_decl? Why is that necessary?

@mitar
Copy link
Collaborator Author

mitar commented Dec 27, 2017

Oh, I see, I do not even have to check return values. Cool.

@mitar
Copy link
Collaborator Author

mitar commented Dec 27, 2017

Works like a charm! Thanks.

@mitar mitar closed this as completed Dec 27, 2017
@mitar
Copy link
Collaborator Author

mitar commented Dec 27, 2017

Care to release a version with this improvement?

@Stewori
Copy link
Owner

Stewori commented Dec 27, 2017

The first return value is the type itself. In case the input was one pure forward declaration as a string this is relevant. The second return value is a boolean indicating if a change was made, i.e. it is true if at least one actual forward declaration was found and resolved. This is to help people assess if they should eventually repeat or update operations or hashes or whatever they might have already done with their type. It's also useful for testing.

There are actually some remaining todos on this issue (so reopening it for now...):

  • support callables in resolve_fw_decl (done as of 3a65f6d)
  • use resolve_fw_decl in
    val = eval(val, globs)
    (done as of 3846df7)
  • write doc for resolve_fw_decl
  • write test for resolve_fw_decl
  • write test for is_subtype concerning typing._ForwardRef
  • add support for typing._ForwardRef in type_util.type_str (done as of 3fb6266)
  • do some reasoning and checking regarding potential endless recursion in some typechecking scenarios (done)
  • write test for endless recursion case
  • support typing._ForwrdRef in stubfile_2_converter

@Stewori Stewori reopened this Dec 27, 2017
@Stewori
Copy link
Owner

Stewori commented Dec 27, 2017

Care to release a version with this improvement?

I want to have a new release soon, but would like to get some more stuff in, e.g. at least some of the todos listed above, fixes/resolutions for #23, #18, #20, eventually #21, failures mentioned in python-attrs/attrs#301 (comment) Help is welcome!

@mitar
Copy link
Collaborator Author

mitar commented Dec 29, 2017

Hm, it seems there is no protection against infinite recursion. I get stack overflow with this test:

import typing

from pytypes import type_util

Data = typing.Union['Container', float]
Container = typing.Union[Data, int]

type_util.resolve_fw_decl(Data)
type_util.resolve_fw_decl(Container)

type_util._issubclass(list, Container, bound_typevars={})

Of course it is obvious that definition of types is buggy, but maybe resolve_fw_decl could validate that?

@Stewori
Copy link
Owner

Stewori commented Jan 1, 2018

maybe resolve_fw_decl could validate that?

Is it possible to have a recursion issue with a type not involving Union? I'm spontaneously not entirely sure.

If so, I think it would be better to make _is_subclass_Union recursion proof in general. I'm just wondering what would be the best way to do it (e.g. caching already checked types; maybe allow it to run at first a special _issubclass mode that does not follow forward references to have the succeed-fast potential).

Otherwise we would have to make _issubclass itself recursion proof (doable as well, but more costly regarding caching).

It would be good to have an eventual example for recursion issue not involving Union. For better understanding and later for testing.

@mitar
Copy link
Collaborator Author

mitar commented Jan 2, 2018

Is it possible to have a recursion issue with a type not involving Union? I'm spontaneously not entirely sure.

I think Union + forward declarations.

Or Union types without forward declarations if somebody manually "patch" Union objects in some other way.

If so, I think it would be better to make _is_subclass_Union recursion proof in general.

Are you sure? Keeping track might introduce additional performance cost. But from design perspective this is probably cleaner, yes.

I'm just wondering what would be the best way to do it

I would just have an extra argument to it, similar to bound variables, stored types being in process of resolving.

@Stewori
Copy link
Owner

Stewori commented Jan 2, 2018

I think Union + forward declarations.

I mean are there examples without Union, i.e. is this issue specific to Union checking or do we have to think more general?

Are you sure? Keeping track might introduce additional performance cost. But from design perspective this is probably cleaner, yes.

My philosophy in pytypes is to allow user to opt out of such arguable stuff. Aside that I value correctness higher than performance (in production, one would disable typechecking anyway I suppose). Also, that's why I think of a fast-succeed mode.

@mitar
Copy link
Collaborator Author

mitar commented Jan 2, 2018

I mean are there examples without Union, i.e. is this issue specific to Union checking or do we have to think more general?

I cannot imagine a case without Union. This is the only current type which allows multiple options, no?

@Stewori
Copy link
Owner

Stewori commented Jan 2, 2018

Any solution to this would require at least one more new arg for _issubclass and thus the whole type checking function family. I wonder if it would be better to wrap all these args into kind of class like type_memo in typeguard.

I cannot imagine a case without Union

Me neither. I hope we don't overlook something.

Stewori added a commit that referenced this issue Jan 16, 2018
…d_typevars, c.f. #24. Improved doc of _issubclass and _isinstance.
@Stewori
Copy link
Owner

Stewori commented Jan 16, 2018

As of 74ef4b9 _issubclass performs recursion-checks if a _ForwardRef is encountered. As far as I tested, this solves the recursion issue you observe. Would be good if you could confirm this. Final todo for this issue is to add tests. Help welcome...

@mitar
Copy link
Collaborator Author

mitar commented Jan 16, 2018

I have not encountered this in practice but only during testings of new code. So the example above is the only one I have.

Hm, but _issubclass does not see anymore _ForwardRefs if they were resolved? Or am I misunderstanding your comment? Or resolving does not get rid of _ForwardRefs?

New code seems to work through against my example above.

@Stewori
Copy link
Owner

Stewori commented Jan 17, 2018

No, it actually does not get rid of _ForwardRef. After looking more closely into _ForwardRef I had learned that it provides fields to store the referenced type, i.e. __forward_arg__, __forward_value__ and __forward_evaluated__. On resolving, __forward_value__ shall be filled with the actual type, __forward_arg__ provides the string representation thereof. Using these fields seems to be the official way and avoids sort of hassle with replacing a type within a potentially complex structure of PEP484 types. The downside is, it allows for cycles, but that's not my invention. _issubclass should now be cycle proof regarding this. And I think without much additional overhead.

@mitar
Copy link
Collaborator Author

mitar commented Jan 17, 2018

Awesome!

@mitar
Copy link
Collaborator Author

mitar commented Apr 5, 2018

As of 74ef4b9 _issubclass performs recursion-checks if a _ForwardRef is encountered. As far as I tested, this solves the recursion issue you observe.

So it currently returns False in such case, no? Wouldn't raising an exception be better? Because this is a faulty type to begin with.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants