Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up isinstance on union types #91603

Closed
kumaraditya303 opened this issue Apr 16, 2022 · 10 comments · Fixed by #91955
Closed

Speed up isinstance on union types #91603

kumaraditya303 opened this issue Apr 16, 2022 · 10 comments · Fixed by #91955
Labels
3.11 only security fixes performance Performance or resource usage topic-typing

Comments

@kumaraditya303
Copy link
Contributor

PEP 604 allows isinstance on union types but it much slower compared to a tuple of types. This can be speed up with a fast path. Currently it is almost 4x slower, so there is room for improvement.

Benchmarks:

@kumaraditya303 ➜ /workspaces/cpython (fast-iter-str) $ ./python -m timeit "isinstance(1, (int, str))"
5000000 loops, best of 5: 54.6 nsec per loop
@kumaraditya303 ➜ /workspaces/cpython (fast-iter-str) $ ./python -m timeit "isinstance(1, int|str)"
1000000 loops, best of 5: 202 nsec per loop
@kumaraditya303 kumaraditya303 added performance Performance or resource usage 3.11 only security fixes labels Apr 16, 2022
@Fidget-Spinner
Copy link
Member

Fidget-Spinner commented Apr 16, 2022

Wow I'm surprised it's that much slower.

Out of curiosity, in the C code, why not pass the tuple (int|str).__args__ into our isinstance code handling the tuple form? That way both will perform the same. And we also get to share code between the second arg tuple form and second arg union form.

@JelleZijlstra
Copy link
Member

JelleZijlstra commented Apr 16, 2022

isinstance() is basically defined as:

def isinstance(inst, cls):
    if type(inst) is cls:
        return True
    if type(cls) is type:
        return type(inst) in cls.mro()
    if tuple in type(cls).mro():
        return any(isinstance(inst, c) for c in cls)
    if hasattr(type(cls), "__instancecheck__"):
        return type(cls).__instancecheck__(cls, inst)
    if type in type(cls).mro():
        ...
    raise TypeError

types.Union is probably slow because it goes through the __instancecheck__ path. Instead, we could add a fast path in object_recursive_isinstance that checks directly for types.UnionType and avoids the method call.

@Fidget-Spinner
Copy link
Member

@JelleZijlstra thanks for the explanation. I was wondering if we can do something like this in C:
Before:

def isinstance(inst, cls):
    if type(inst) is cls:
        return True
    if type(cls) is type:
        return type(inst) in cls.mro()
    if type(cls) is tuple:
        for t in cls:
            isinstance(cls, t)
    ...
    raise TypeError

After:

def isinstance(inst, cls):
    if type(inst) is cls:
        return True
    if type(cls) is type:
        return type(inst) in cls.mro()
    if type(cls) is types.UnionType:
        cls = cls.__args__
    # fall through and let the code below handle this
    if type(cls) is tuple:
        for t in cls:
            isinstance(cls, t)
    ...
    raise TypeError

uriyyo added a commit to uriyyo/cpython that referenced this issue Apr 17, 2022
@uriyyo
Copy link
Member

uriyyo commented Apr 17, 2022

@Fidget-Spinner @JelleZijlstra

Currently, there is a difference in behavior between tuple check and types.UnionType check:

instance(1, (int, list[str]))
instance(1, int | list[str]) # will raise TypeError

Should implementation should take into account such differences?

@JelleZijlstra
Copy link
Member

@uriyyo that's pretty surprising, because isinstance(1, int | Any) succeeds (even though isinstance(1, Any | int) fails). I'd be OK with allowing these calls with parameterized generics too. We could add a sentence to https://docs.python.org/3.10/library/functions.html#isinstance saying that TypeError may not be raised for an invalid type if an earlier check succeeds.

@JelleZijlstra
Copy link
Member

I noticed a big chunk of the difference is actually from the construction of the Union instance, not from the isinstance() call:

% ./python.exe -m timeit "int|str"
2000000 loops, best of 5: 107 nsec per loop
% ./python.exe -m timeit "(int, str)"
10000000 loops, best of 5: 36.8 nsec per loop
% ./python.exe -m timeit "isinstance(1, int|str)"   
2000000 loops, best of 5: 169 nsec per loop
% ./python.exe -m timeit "isinstance(1, (int, str))"
5000000 loops, best of 5: 56.3 nsec per loop
>>> dis.dis("int|str")
              0 RESUME                            0

  1           2 LOAD_NAME                         0 (int)
              4 LOAD_NAME                         1 (str)
              6 BINARY_OP                         7 (|)
             10 RETURN_VALUE
>>> dis.dis("(int, str)")
              0 RESUME                            0

  1           2 LOAD_NAME                         0 (int)
              4 LOAD_NAME                         1 (str)
              6 BUILD_TUPLE                       2
              8 RETURN_VALUE

Reading the code, this is what int|str does:

  • BINARY_OP opcode
  • Call PyNumber_Or
  • Call the nb_or slot on type
  • Call _Py_union_type_or
  • Check that the LHS and RHS are legal as members of a union (they're None, types, generic aliases, or unions)
  • Put the LHS and RHS in a new tuple (which is the only thing that (int, str) ever has to do).
  • Use that tuple to construct a new Union instance, but first call dedup_and_flatten_args to clean things up.

@Fidget-Spinner do you have ideas on how we can speed this up?

A couple of ideas:

  • We do some redundant work in this code path, such as checking that the LHS is a type (we know it is, because we came here from type.__or__), and trying to dedupe the args even though we already know they're both types. Maybe we can skip some of these checks.
  • For slices, we keep a single slot around in the interpreter to store slice objects, so it's really fast to make a new one. Maybe we can do something similar with union objects.

Regardless, it's still helpful to speed up the isinstance() call, so we should merge @uriyyo's change special casing UnionType in isinstance().

@uriyyo
Copy link
Member

uriyyo commented Apr 17, 2022

@JelleZijlstra I have an idea have we can speed up types.UnionType instantiation. I will try to come up with PR today or tomorrow.

serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Apr 26, 2022
Co-authored-by: Yurii Karabas <1998uriyyo@gmail.com>
serhiy-storchaka added a commit that referenced this issue Apr 28, 2022
Reduce the complexity from O((M+N)^2) to O(M*N), where M and N are the length
of __args__ for both operands (1 for operand which is not a UnionType).

As a consequence, the complexity of parameter substitution in UnionType has
been reduced from O(N^3) to O(N^2).

Co-authored-by: Yurii Karabas <1998uriyyo@gmail.com>
Fidget-Spinner pushed a commit that referenced this issue Apr 28, 2022
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
@uriyyo
Copy link
Member

uriyyo commented Apr 28, 2022

@JelleZijlstra @Fidget-Spinner We introduced _Py_union_args function that returns __args__ attribute of UnionType object.

What do you think about using this function instead of accessing by pointer? As for me, it will be more readable and maintainable.

For instance:

    PyObject *a_set = PySet_New(((unionobject*)a)->args); // old
    PyObject *a_set = PySet_New(_Py_union_args(a));       // new

I am happy to make PR with such improvement.

@serhiy-storchaka
Copy link
Member

There is no _Py_union_args in the current code. And I do not see a value of introducing it if it just returns the field of a structure. It would rather make the code less transparent.

@Fidget-Spinner
Copy link
Member

There is no _Py_union_args in the current code.

I just merged a PR using it. But yeah I agree with Serhiy that we don't need it if it's used internally. I think the benefit exists if it's used in an external file with no access to the unionobject struct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.11 only security fixes performance Performance or resource usage topic-typing
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants