Skip to content

Conversation

ilevkivskyi
Copy link
Member

@ilevkivskyi ilevkivskyi commented Sep 6, 2025

Fixes #4772

TBH I am a bit torn here. After some thinking this is definitely needed for cache correctness, OTOH it only affects some rare error messages at a cost of potentially triggering pointless re-checking. I am leanings toward actually doing this, we should start from correctness, and optimize later. Any objections, or other ideas?

@ilevkivskyi ilevkivskyi requested a review from JukkaL September 6, 2025 00:24

This comment has been minimized.

This comment has been minimized.

@JukkaL
Copy link
Collaborator

JukkaL commented Sep 6, 2025

Since the "defined here" note is quite rare, I think we could calculate/estimate the line number when the note is being generated (if the target function is in a cached module), by parsing the source code of the target module + doing semantic analysis pass 1, and then traversing the AST to find the target function. However, this is an optimization we can implement later. We should probably maintain some kind of cache to avoid slowing down error reporting in case there are some massive files, which we don't want o be parsing over and over again.

@ilevkivskyi
Copy link
Member Author

@JukkaL

by parsing the source code of the target module

This is only part of the problem. The other part is that as we discussed elsewhere, we should write the errors to cache meta, instead of always reprocessing files with errors from scratch. In this case we would need to know that whether the previously generated errors (that include line numbers for "defined here" notes) are still "fresh"? I am talking about a very rare scenario of course, but just want to mention this for completeness.

Also to clarify, do you think we should go ahead with this PR for now, or ignore this problem for now?

Copy link
Contributor

github-actions bot commented Sep 6, 2025

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@JukkaL
Copy link
Collaborator

JukkaL commented Sep 8, 2025

In this case we would need to know that whether the previously generated errors (that include line numbers for "defined here" notes) are still "fresh"?

One way to deal with this would be to store the error message locations in a more abstract form, e.g. keep the target fullname instead of file + line number. We'd then convert the full name to file + line number when reporting the error message to the user.

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems acceptable temporarily, but this could slow down incremental builds quite significantly, as most edits will cause line numbers to change and thus will change the public interface of a module even if it's e.g. a comment only change.

@ilevkivskyi
Copy link
Member Author

One way to deal with this would be to store the error message locations in a more abstract form, e.g. keep the target fullname instead of file + line number.

Hm, yeah, I was thinking about something like this. I will hold off merging this until I implement storing error messages, to see how easy it is (there is a chance it will not take much JSON wrangling).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

"<function> defined here" notes omitted when function is loaded from cache
2 participants