Docs of typing.get_args: Mention that due to caching of typing generics the order of arguments for Unions can be different from the one of the returned tuple #86483

Closed
Dominik1123 mannequin opened this issue Nov 10, 2020 · 9 comments
Labels
3.9 only security fixes 3.10 only security fixes docs Documentation in the Doc dir type-feature A feature request or enhancement

Comments

@Dominik1123
Mannequin

Dominik1123 mannequin commented Nov 10, 2020

BPO 42317
Nosy @gvanrossum, @ilevkivskyi, @miss-islington, @Dominik1123, @Fidget-Spinner
PRs
  • bpo-42317: Improve docs of typing.get_args concerning Union #23254
  • [3.9] bpo-42317: Improve docs of typing.get_args concerning Union (GH-23254) #23307
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-11-16.01:54:24.997>
    created_at = <Date 2020-11-10.19:35:37.290>
    labels = ['type-feature', '3.9', '3.10', 'docs']
    title = 'Docs of `typing.get_args`: Mention that due to caching of typing generics the order of arguments for Unions can be different from the one of the returned tuple'
    updated_at = <Date 2020-11-16.01:54:24.996>
    user = 'https://github.com/Dominik1123'

    bugs.python.org fields:

    activity = <Date 2020-11-16.01:54:24.996>
    actor = 'gvanrossum'
    assignee = 'docs@python'
    closed = True
    closed_date = <Date 2020-11-16.01:54:24.997>
    closer = 'gvanrossum'
    components = ['Documentation']
    creation = <Date 2020-11-10.19:35:37.290>
    creator = 'Dominik V.'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 42317
    keywords = ['patch']
    message_count = 9.0
    messages = ['380699', '380762', '380784', '380831', '380850', '380851', '381051', '381053', '381054']
    nosy_count = 6.0
    nosy_names = ['gvanrossum', 'docs@python', 'levkivskyi', 'miss-islington', 'Dominik V.', 'kj']
    pr_nums = ['23254', '23307']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue42317'
    versions = ['Python 3.9', 'Python 3.10']

    @Dominik1123
    Mannequin Author

    Dominik1123 mannequin commented Nov 10, 2020

    Due to caching of __getitem__ for generic types, the order of arguments as returned by get_args might be different for Union:

    >>> from typing import List, Union, get_args
    >>> get_args(get_args(List[Union[int, str]])[0])
    (<class 'int'>, <class 'str'>)
    >>> get_args(get_args(List[Union[str, int]])[0])
    (<class 'int'>, <class 'str'>)

    This is because List[Union[int, str]] is List[Union[str, int]].
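
A short script along these lines can be used to reproduce the observation. It is only a sketch: the names first and second are illustrative, and the identity result relies on CPython's internal typing cache, which is an implementation detail rather than a documented guarantee.

```python
from typing import List, Union, get_args

# Whichever alias is created first populates the cache used by List[...].
first = List[Union[int, str]]
second = List[Union[str, int]]

# The two aliases compare equal regardless of caching.
print(first == second)

# With the cache active, both names are expected to refer to the same object,
# so the inner Union reports the argument order of the alias created first.
print(first is second)
print(get_args(get_args(second)[0]))
```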

    I understand that caching is useful to reduce the memory footprint of type hints, so I suggest updating the documentation of get_args. At the moment it reads:

    For a typing object of the form X[Y, Z, ...] these functions return X and (Y, Z, ...).

    This seems to imply that the returned objects are identical to the ones in the form X[Y, Z, ...]. However, that's not the case:

    >>> U1 = Union[int, str]
    >>> U2 = Union[str, int]
    >>> get_args(List[U1])[0] is U1
    True
    >>> get_args(List[U2])[0] is U2
    False

    I'm not so much concerned about the identity, but the fact that a subsequent call to get_args on the Union returns a different type seems to be relevant.

    So I propose to add the following sentence to the get_args docs:

    [...], it gets normalized to the original class.
    If X is a Union, the order of (Y, Z, ...) can be different from the one of the original arguments [Y, Z, ...].

    Or alternatively:

    [...], it gets normalized to the original class.
    If X is a Union, the order of (Y, Z, ...) is arbitrary.

    The second version is shorter but it's not completely accurate (since the order is actually not arbitrary).

    @Dominik1123 Dominik1123 mannequin added the 3.9 only security fixes label Nov 10, 2020
    @Dominik1123 Dominik1123 mannequin assigned docspython Nov 10, 2020
    @Dominik1123 Dominik1123 mannequin added docs Documentation in the Doc dir type-feature A feature request or enhancement 3.9 only security fixes labels Nov 10, 2020
    @Dominik1123 Dominik1123 mannequin added docs Documentation in the Doc dir type-feature A feature request or enhancement labels Nov 10, 2020
    @Fidget-Spinner
    Member

    You're right. Currently this happens for two reasons:

    1. _SpecialGenericAlias (used by List) caches its __getitem__. (As you already pointed out :) )

    2. The hash of _UnionGenericAlias (Union) is hash(frozenset(self.__args__)), i.e. Unions with the same unique args in a different order produce the same hash and therefore hit the same cache entry.
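
The second point can be checked from user code without touching typing internals. The sketch below uses a plain dict as a stand-in for the cache (an assumption made for illustration; the real cache is a functools.lru_cache wrapper inside typing), and the names a, b and cache are hypothetical.

```python
from typing import Union

a = Union[int, str]
b = Union[str, int]

# Equality and hashing both ignore the order of the arguments.
print(a == b)
print(hash(a) == hash(b))

# A mapping keyed on one Union therefore finds the entry stored under the other,
# which is the same mechanism that makes the generic-alias cache return a hit.
cache = {a: "entry created for Union[int, str]"}
print(cache[b])
```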

    I find it mildly sad however that:

    >>> get_args(Union[int, str])
    (<class 'int'>, <class 'str'>)

    >>> get_args(Union[str, int])
    (<class 'str'>, <class 'int'>)

    This is slightly inconsistent with its behavior when nested in List. I don't think there's an easy way to fix this without breaking the cache (and it also makes sense that a Union's args aren't order-dependent). So I'm all for updating the docs with your addition (slightly edited):

    If X is a Union, the order of (Y, Z, ...) may be different from the order of the original arguments [Y, Z, ...].

    @Fidget-Spinner Fidget-Spinner added 3.10 only security fixes labels Nov 11, 2020
    @gvanrossum
    Member

    Agreed, it's mildly sad, and I wish the cache could preserve the order in List[Union[int, str]]. For that to work, though, we'd either have to change how the cache works, which feels complex, or change things so that Union[int, str] != Union[str, int], which seems wrong as well (we've had them equal for many releases, so this would break code).

    Fixing the cache would require adding a new comparison method to all generic type objects, and that just doesn't seem worth the effort (but I'd be open to this solution in the future).

    So for now, let's document that get_args() may swap Union arguments.
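
For code that consumes get_args() output, one way to stay independent of the argument order is to compare Union members as sets. The helper below is a hypothetical sketch, not part of typing; the name union_members is made up for this example.

```python
from typing import List, Union, get_args, get_origin


def union_members(tp):
    """Hypothetical helper: return a Union's members as an order-free frozenset."""
    if get_origin(tp) is not Union:
        raise TypeError(f"expected a Union, got {tp!r}")
    return frozenset(get_args(tp))


# The inner Union may come back in either order, but the member set is stable.
inner = get_args(List[Union[str, int]])[0]
print(union_members(inner) == union_members(Union[int, str]))  # True
```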

    @Fidget-Spinner
    Member

    Dominik, would you like to submit a PR for this :) ?

    @Dominik1123
    Mannequin Author

    Dominik1123 mannequin commented Nov 12, 2020

    Thinking more about it, I came to realize that it's not the Union that sits at the root of this behavior, but rather the caching performed by generic types in general. So if we consider

    L1 = List[Union[int, str]]
    L2 = List[Union[str, int]]
    

    then get_args(L1)[0] is get_args(L2)[0], so get_args itself has no influence on the order of the arguments of the Union objects (they are already the same object for L1 and L2).

    So I think it would be more accurate to add the following sentence instead:

    If X is a generic type, the returned objects (Y, Z, ...) might not be identical to the ones used in the form X[Y, Z, ...] due to type caching.

    Everything else follows from there (including flattening of nested Unions).
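
The flattening mentioned here is another case where the returned objects differ from what was written in X[Y, Z, ...]. A minimal sketch, with illustrative variable names:

```python
from typing import Union, get_args

# A nested Union is flattened when it is created...
nested = Union[int, Union[str, bytes]]
print(get_args(nested))

# ...and duplicate members are dropped, keeping the first occurrence.
repeated = Union[int, str, int]
print(get_args(repeated))
```

In both cases the tuple returned by get_args describes the normalized Union rather than the literal subscript expression.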

    @gvanrossum
    Member

    Exactly!

    @miss-islington
    Contributor

    New changeset c3b9592 by Dominik1123 in branch 'master':
    bpo-42317: Improve docs of typing.get_args concerning Union (GH-23254)
    c3b9592

    @miss-islington
    Contributor

    New changeset 2369759 by Miss Islington (bot) in branch '3.9':
    bpo-42317: Improve docs of typing.get_args concerning Union (GH-23254)
    2369759

    @gvanrossum
    Member

    Thanks!
