Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use absolute path when checking source duplication error #9059

Merged
merged 6 commits into from Oct 18, 2020

Conversation

bonprosoft
Copy link
Contributor

closes #9058

I updated both seen_files and newst to use an absolute path to fix the problem.

mypy/build.py Outdated
@@ -2754,7 +2754,7 @@ def load_graph(sources: List[BuildSource], manager: BuildManager,

# Note: Running this each time could be slow in the daemon. If it's a problem, we
# can do more work to maintain this incrementally.
seen_files = {st.path: st for st in graph.values() if st.path}
seen_files = {os.path.abspath(st.path): st for st in graph.values() if st.path}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is very performance critical, and os.path.abspath is too slow to call here since it can make a syscall. We'd need a way to do this without introducing much overhead (graph may have 100k+ values).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your review!
I understand. Then, how about looking up both absolute path and relative path of newst.path in seen_files at this part:

mypy/mypy/build.py

Lines 2805 to 2807 in d9c5c65

newst_path = os.path.abspath(newst.path)
if newst_path in seen_files:

Copy link
Contributor Author

@bonprosoft bonprosoft Jul 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated my PR: c1f7d1e
Could you please check if it is reasonable?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about going back to the original approach, but adding an abspath field to State that is computed in State's constructor, instead of recomputing it on each load_graph?

(In the daemon, load_graph gets called a lot on a big graph but with only a few sources changed. We do some things that are linear in the graph but we feel bad about it and try to make sure that they aren't expensive things.)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK actually I just went and did that thing. Thank you!

@msullivan msullivan merged commit e4131a5 into python:master Oct 18, 2020
@bonprosoft bonprosoft deleted the fix_source_duplicate_check branch October 23, 2020 03:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

"Source file found twice under different module names" doesn't work in some cases
3 participants