Skip to content

Conversation

ilevkivskyi
Copy link
Member

@ilevkivskyi ilevkivskyi commented Sep 5, 2025

Wow, this was quite a ride. Indirect dependencies were always supported kind of on best effort. This PR puts them on some principled foundation. It fixes three crashes and three stale types reported. All tests are quite weird/obscure, they are designed to expose the flaws in current logic (plus one test that passes on master, but it covers important corner case, so I add it just in case ). A short summary of various fixes (in arbitrary order):

  • Update many outdated comments and docstrings
  • Missing transitive dependency is now considered stale
  • Handle transitive generic bases in indirection visitor
  • Handle chained alias targets in indirection visitor
  • Always record original aliases during semantic analysis
  • Replace ad-hoc module_refs logic in type checker with more principled one during semantic analysis
  • Delete qualified_tvars as a concept, they are not needed since long ago
  • Remove ad-hoc handling for TypeInfos from build.py
  • Support symbols with setter type different from getter type

In general the logic should be more simple/straightforward now:

  • Get all symbols we try to access in a module and record the modules they were defined in (note this automatically handles problem with possible excessive get_proper_type() calls).
  • Get all types in a type map, for each type transitively find all named types in them (thus aggregating all interfaces the type depends on)

Note since this makes the algorithm correct, it may also make it slower (most notably because we must visit generic bases). I tried to offset this by couple optimizations, hopefully performance impact will be minimal. On my machine slow down is ~0.6%

@ilevkivskyi ilevkivskyi requested a review from JukkaL September 5, 2025 13:54

This comment has been minimized.

This comment has been minimized.

@ilevkivskyi
Copy link
Member Author

OK, I updated the PR to handle other issues mentioned in #933 as well. FWIW it actually looks more elegant/simple now. I will now try to minimize the performance impact if possible.

@ilevkivskyi
Copy link
Member Author

I made one more change, now I am adding only actual module as a dependency, not all its parents. Not sure why this was needed, as no tests failed when I removed them. IMO this is conceptually correct: indirect dependency, is a dependency on the place where an actual symbol is defined.

Also it terms of performance I measure ~0.6% slow down, which is IMO totally acceptable.

This comment has been minimized.

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, the original logic always seemed a little sketchy. I agree that the performance regression is worth it, and I can't think of a way to do this more efficiently while retaining correctness. It's important to get this right or we risk random false positives and false negatives once we support parallel type checking.

ilevkivskyi added a commit that referenced this pull request Sep 8, 2025
This makes self-check almost 2% faster on my desktop (Python 3.12,
compiled -O2). Inspired by a slight regression in
#19798 I decided to re-think how we
detect/label the recursive types.

The new algorithm is not 100% equivalent to old one, but should be much
faster. The main semantic difference is this:
```python
A = list[B1]
B1 = list[B2]
B2 = list[B1]
```
previously all three aliases where labeled as recursive, now only last
two are. Which is kind of correct if you think about it for some time,
there is nothing genuinely recursive in `A` by itself. As a result:
* We have somewhat more verbose `reveal_type()` for recursive types
after fine-grained increments. Excessive use of `get_proper_type()` in
the daemon code is a known issue. I will take a look at it when I will
have a chance.
* I cleaned up some of relevant visitors to be more consistent with
recursive aliases.
* I also do couple cleanups/speedups in the type queries while I am at
it.

If there are no comments/objections, I will merge it later today. Then I
will merge #19798, and then _maybe_
an equivalent optimization for recursive instances like `class
str(Sequence[str]): ...`
Copy link
Contributor

github-actions bot commented Sep 8, 2025

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@ilevkivskyi ilevkivskyi merged commit ccd290c into python:master Sep 8, 2025
20 checks passed
@ilevkivskyi ilevkivskyi deleted the simplify-indirect branch September 8, 2025 23:29
JukkaL added a commit that referenced this pull request Sep 22, 2025
SCC construction should only consider import dependencies, since
indirect dependencies are not available during non-incremental runs.
I think otherwise SCCs can be different in non-incremental vs incremtal
runs.

Attempt to fix an issue with excessively large SCCs computed
(sometimes) after #19798 when doing incremental runs. I haven't
been able to create a self-contained test case that reproduces
the issue, but this appears to fix this issue with an internal
codebase.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants