Flush buffered errors after each file is processed #1294
Comments
I'm going to try to take on this bug. I'll update my progress once I've figured out what I think is a good solution and then I'll come back with a reference to my PR once it's ready. |
I wasn't sure at first whether this was about unit tests or the error reporting, so I asked @gmprice for some help on this issue. It looks like, since there's now this concept of strongly connected components, there's no longer a concept of processing one file at a time. So instead of trying to write errors after the end of a file, we print after the end of a strongly connected component.
Changes I intend to make
I'm going to change
At the end of processing a strongly connected component there shouldn't be any new errors. As such we should be able to output all of the error messages in a deduped manner.
Test Plan
I need to ensure that if I have the scenario of
In order to achieve this I'm going to mock the call to
Refactor
It's also been requested that, since I'm changing main.py, I also rename the one-letter variables ( |
Note: there are many passes. You can only flush after the final pass is
done with an SCC. There should be an easy place where to do that.
…--Guido (mobile)
|
@gvanrossum Cool, thank you. I'll look for the line that marks the end of an SCC. I'm going to work on the unit test after lunch. |
The right place should be at the very end of process_graph(), inside the
for-loop, right after it's called either process_fresh_scc() or
process_stale_scc().
One more concern I have (which @JukkaL can probably alleviate) is whether
there are cases where we have A imports B (so B is processed before A) but
somehow during the processing of A another error about B is reported.
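A minimal sketch of that placement, assuming a simplified loop rather than mypy's actual `process_graph()`. The parameters `check_file` and `flush_errors` are hypothetical stand-ins introduced only for illustration; the point is that buffered errors are sorted, de-duped and handed off once per SCC, after all work on that SCC is done.

```python
from typing import Callable, List


def process_graph(sccs: List[List[str]],
                  check_file: Callable[[str], List[str]],
                  flush_errors: Callable[[List[str]], None]) -> None:
    """Process SCCs in dependency order, flushing errors after each one.

    Hypothetical simplification: `check_file` stands in for every pass
    over a module, and `flush_errors` for whatever prints the messages.
    """
    for scc in sccs:
        buffered: List[str] = []
        for module in scc:
            # All passes over this module are assumed to complete here.
            buffered.extend(check_file(module))
        if buffered:
            # Sort and de-dupe within the finished SCC, then hand off.
            flush_errors(sorted(set(buffered)))
```

With one file per SCC this behaves like flushing after each file; the concern about late errors for an already-flushed module is not addressed by this sketch.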
|
So I've been struggling with this problem all day. So there's this problem where the callback way of doing things kind of breaks the way that unit tests are being done.
There's a bunch of tests that rely on the build.build function returning a result. But the problem is that if I want to stream results back I need to remove the errors from the internal manager that's used. This means that the returned result won't have any errors in it and none of the current tests will work. So I can keep the current call graph for the tests (everything just works like it currently does) and have a different call path for printing results. So currently I've changed the test suite to pass callback functions of |
The hardest part of this has just been figuring out how I'm going to actually test this. |
I agree the tests probably could use a different API. Or a flag. Or
something.
Do you have a draft of the code you're thinking of using yet? Maybe you can
push your branch without making it a PR yet?
|
All of the unit tests pass and there's no fundamental change to the code base other than adding the possibility of running the callback function at the end of the SCC loop. Quick note: I had to move build manager in order to get the type checking to pass, which is why the diff is so long. |
@rawrgulmuffins and I chatted for a bit about this. The tricky bit to test is the end-to-end fact that we manage to spit out the errors for file B before doing the work to type-check A, if We don't currently use |
Maybe you can just use a custom output_callback that records something the
--Guido van Rossum (python.org/~guido) |
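One way the recording-callback idea could look in a test, as a sketch only. `make_recorder` and `fake_build` are hypothetical helpers, not part of mypy; `fake_build` merely imitates a build that checks files in order and calls the callback after each one, so the log shows whether B's errors were emitted before work on A began.

```python
from typing import Callable, Dict, List


def make_recorder(log: List[str]) -> Callable[[List[str]], None]:
    """Return an output callback that records each flushed batch in `log`."""
    def output_callback(messages: List[str]) -> None:
        log.append("flush: " + "; ".join(messages))
    return output_callback


def fake_build(order: List[str],
               errors_by_file: Dict[str, List[str]],
               output_callback: Callable[[List[str]], None],
               log: List[str]) -> None:
    """Hypothetical stand-in for a build that flushes after each file."""
    for name in order:
        log.append("check: " + name)
        output_callback(errors_by_file.get(name, []))


log: List[str] = []
fake_build(["b", "a"], {"b": ["b.py:1: error: example"]},
           make_recorder(log), log)
# The flush for b should appear in the log before the check of a starts.
assert log.index("flush: b.py:1: error: example") < log.index("check: a")
```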
As an update, I have a failing unit test. I've set things up such that I'm temporarily setting I haven't made a PR for it but all of my code is on the branch that I referenced earlier. |
Great progress! Looking forward to the PR.
|
So this is still open and rawrgulmuffins' efforts have come to naught. It's a simple win though for large codebases like we have at Dropbox, where it's often frustrating to have to wait several extra minutes for the rest of the codebase to be processed when a simple (non-blocking) error is detected. In an abundance of caution it may be best to flush errors only after each SCC has been processed. For codebases with no or few import cycles this comes down to the same thing (each file will be in its own SCC, or perhaps there may be a few SCCs containing a handful of files), and due to the way SCCs are processed (see |
Note that if we do this, |
In order to avoid duplicate error messages for errors produced in both load_graph() and process_graph() and to prevent misordered error messages in a number of places, lists of error messages are now tracked per-file. These lists are collected and printed out when a file is complete. To maintain consistency with clients that use .messages() (namely, tests), messages are generated file-at-a-time even when not printing them out incrementally. Fixes #1294
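The per-file bookkeeping described above might be sketched as follows. This is an illustration under assumptions, not the code from the change: the class `PerFileErrors` and its methods `report` and `file_messages` are hypothetical names, and only `messages()` echoes a name mentioned in the description.

```python
from typing import Dict, List, Tuple


class PerFileErrors:
    """Hypothetical per-file error buffer with de-duplication."""

    def __init__(self) -> None:
        # Insertion order of the dict is the order files were first seen.
        self._by_file: Dict[str, List[Tuple[int, str]]] = {}

    def report(self, path: str, line: int, message: str) -> None:
        self._by_file.setdefault(path, []).append((line, message))

    def file_messages(self, path: str) -> List[str]:
        """Sorted, de-duped messages for one completed file."""
        entries = sorted(set(self._by_file.get(path, [])))
        return [f"{path}:{line}: error: {msg}" for line, msg in entries]

    def messages(self) -> List[str]:
        """All messages, generated one file at a time."""
        result: List[str] = []
        for path in self._by_file:
            result.extend(self.file_messages(path))
        return result
```

Because `messages()` is built from the same per-file lists, callers that collect everything at the end see the same ordering as callers that print each file as it completes.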
Currently we only flush errors at the very end of a run. It might be nicer if we flushed errors after each file is fully processed (i.e. once the type checking pass has finished). We buffer errors so we can sort them and de-dupe them, but there shouldn't be any new errors for a file once it's been processed.