Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Registering many errors blocks emacs for a noticeable amount of time #1168

Closed
jyp opened this issue Nov 21, 2016 · 23 comments
Closed

Registering many errors blocks emacs for a noticeable amount of time #1168

jyp opened this issue Nov 21, 2016 · 23 comments

Comments

@jyp
Copy link

jyp commented Nov 21, 2016

This is not so much a bug report, more of a question and perhaps a feature request.

My situation is that I sometimes flycheck big files, which may contain a lot of warnings (maybe a thousand). In this situation, flycheck sometimes blocks emacs for half a second or so. This happens even when the checker is run asynchronously. (I have written my own checker, so I am familiar with the internals. The blocking happens when the checker function calls its callback with the list of warnings.)

There could be two ways to solve the problem:

  1. Speed up the reporting of errors by a factor of 10 or more
  2. Make the reporting of errors fully asynchronous. That is, the checker function (that which is stored in the :start property) could call its callback several times. Say, in addition of
‘finished’
     The syntax checker has finished with a proper error report
     for the current buffer.  DATA is the (potentially empty)
     list of ‘flycheck-error’ objects reported by the syntax
     check.

     This report finishes the current syntax check.

we would have

‘partial’
     The syntax checker has some errors to report 
     for the current buffer.  DATA is the (probably non-empty)
     list of ‘flycheck-error’ objects reported by the syntax
     check in this partial run. 

     This report does not finish the current syntax check.

Perhaps this would be easy to implement? Perhaps there is even a way today to report errors asynchronously? I can offer help implementing the feature.

@swsnr
Copy link
Contributor

swsnr commented Nov 21, 2016

@jyp Thank you for this issue, but with all due respect I think you're underestimating the issue.

Speed up the reporting of errors by a factor of 10 or more

It's not as if we kind of deliberately slowed down error reporting to make someone find that hidden potential. Flycheck uses overlays (they are the only reasonable API for our purposes) which are known to have poor performance. As far as I know they even have quadratic complexity, i.e. adding 1000 overlays is a hundred times slower than adding 100 overlays.

Please go ahead and profile our code, but I do think that you'll find a tenfold speedup in there.

As for asynchronous reporting, I'm not sure whether that'd fix the issue. We can't add overlays asychronously, so all we'd achieve is splitting a 0.5s freeze into a couple of smaller freezes.

@cpitclaudel
Copy link
Member

As far as I know they even have quadratic complexity, i.e. adding 1000 overlays is a hundred times slower than adding 100 overlays.

Spooky. There was some discussion on emacs-devel of migrating overlays to the interval tree used by text properties. @jyp could you try adding (overlay-recenter (point-max)) to the beginning of flycheck-report-current-errors, and seeing if it makes any difference?

@lunaryorn Do you think the issue actually comes from overlay placement, though? IIUC each newly placed overlay requires calling flycheck-error-thing-region, which itself calls flycheck-error-column-region, which does (goto-char (point-min)) and (forward-line (- (flycheck-error-line err) 1)). Isn't that part quadratic?

@swsnr
Copy link
Contributor

swsnr commented Nov 21, 2016

@cpitclaudel Overlays in and by themselves are, but that part surely doesn't improve performance. That said, it's the only idea I got to move from lines which is what we get from syntax checkers to buffer positions which is what we need to add overlays. This will always involve scanning over lines.

So yes, we could even go as far as to say that adding a Flycheck error has cubic complexity to the number of lines in the buffer.

@cpitclaudel
Copy link
Member

Makes sense :) We could shave of a linear factor, though: if errors are on line 1, 5, 9, 153, and 1044, we should be able to just scan from 0 to 1, then 1 to 5, then 5 to 9, then 9 to 153, and finally 153 to 1044, instead of scanning from 0 to 1, then 0 to 5, then 0 to 9, then 0 to 153, then 0 to 1044.

This should provide significant time savings; though indeed the quadratic overlays remain

@cpitclaudel
Copy link
Member

@cpitclaudel
Copy link
Member

@jyp, do you have a concrete example that I could look at?

cpitclaudel added a commit that referenced this issue Nov 22, 2016
@cpitclaudel
Copy link
Member

I've implemented the change above in branch 1168-faster-overlays. On a 1500-errors python file, overlay placement goes down from 0.24 seconds to 0.08 seconds.

@jyp, Can you try that branch out and see whether it helps? @lunaryorn, I've opened a PR#1169 to discuss the change there.

cpitclaudel added a commit that referenced this issue Nov 22, 2016
cpitclaudel added a commit that referenced this issue Nov 22, 2016
cpitclaudel added a commit that referenced this issue Nov 22, 2016
@cpitclaudel
Copy link
Member

cpitclaudel commented Nov 23, 2016

Ok, I've collected more data on a large buffer. All examples below are with caching on, except the last one:

  • Recentering helps with deletion:
    2.40s [delete all overlays, no recenter]
    2.33s [delete all overlays, recenter at point-min]
    0.63s [delete all overlays, recenter at point-max]
  • Recentering overlays also helps with adding overlays:
    2.65s [add-overlay with no recenter]
    12.93s [add-overlay with no recenter]

    2.58s [add-overlay with recenter on point-min]
    10.91s [add-overlay with recenter on point-min]

    1.60s [add-overlay with recenter on point-max]
    6.43s [add-overlay with recenter on point-max]
  • Profiling shows that line-end-position is a culprit (perversely, line-end-position itself doesn't appear in the profile. But + inside of min in flycheck-error-column-region does appear, and it takes 30% of the time). And indeed:
    0.62s [add-overlay with recenter on point-max and inhibit-field-text-motion set to t]
    1.52s [add-overlay with recenter on point-max and inhibit-field-text-motion set to t]

Interestingly, the line number cache of #1169 only makes a small difference here; if I disable it, I get:

    2.80s [add-overlay with no recenter, inhibit-field-text-motion set to t, and no cache]
    13.39s [add-overlay with no recenter, inhibit-field-text-motion set to t, and no cache]

I also ran measurements on a more realistic buffer, with 468 errors (just above our limit of 400 errors):

    0.0041s [add-overlay with no optimizations]
    0.0144s [add-overlay with no optimizations]

    0.0038s [add-overlay with caching]
    0.0135s [add-overlay with caching]

    0.0033s [add-overlay with caching and recenter]
    0.0133s [add-overlay with caching and recenter]

    0.0021s [add-overlay with caching, recenter, and inhibit-field-text-motion]
    0.0083s [add-overlay with caching, recenter, and inhibit-field-text-motion]

So the caching in fact makes relatively little difference here. I looked at the C sources, and indeed forward-line already uses a cache (!). So maybe my quick measurements of earlier were not too relevant (?).

In any case, all these optimizations together yield an 8 to 9-fold speedup for buffers with huge numbers of errors, and a close-to-2-fold speedup for more reasonable buffers :) It's still far from ideal, because the sheer number of overlays makes Emacs itself slow (very slow in the first case, and a bit in the second one). But at least with these optimizations Flycheck doesn't contribute quite as much :)

Do I update PR #1169 to include all optimizations, or do I make separate PRs? And do I remove the line cache, since further testing suggests that it makes at best a small difference compared to the the other optimizations?

@cpitclaudel
Copy link
Member

A new profile with all these optimizations on a reasonable buffer suggests that 40% of the time is spent in delete-overlay, 30% in line-end-position, and the rest in various places, so I don't think there's much we can do beyond this.

@jyp
Copy link
Author

jyp commented Nov 23, 2016

@cpitclaudel Impressive work! I'll try it when I get the chance.

cpitclaudel added a commit that referenced this issue Nov 23, 2016
* Add appropriate overlay-recenter calls
* Cache line numbers to speed up forward-line
* Bind inhibit-field-text-motion to t to speed up line-end-position

This is a partial fix for GH-1168 (which see for timings).
Related issues: GH-153, GH-154, GH-179, GH-476.
cpitclaudel added a commit that referenced this issue Nov 23, 2016
* Add appropriate overlay-recenter calls
* Cache line numbers to speed up forward-line
* Bind inhibit-field-text-motion to t to speed up line-end-position

This is a partial fix for GH-1168 (which see for timings).
Related issues: GH-153, GH-154, GH-179, GH-476.
cpitclaudel added a commit that referenced this issue Nov 23, 2016
* Add appropriate overlay-recenter calls
* Bind inhibit-field-text-motion to t to speed up line-end-position

This is a partial fix for GH-1168 (which see for timings).
Related issues: GH-153, GH-154, GH-179, GH-476.
@swsnr
Copy link
Contributor

swsnr commented Nov 23, 2016

@cpitclaudel 😍 Thanks so much, that's amazing work.

Just a quick recap to make sure that I understand things correctly:

  1. With few errors the result is indifferent: It's faster, but was fast enough before, so we really optimise for is buffers with many errors.
  2. The line number cache from Improve overlay performance #1169 makes only a small difference, which is expected since forward-line implements its own internal cache, so we're actually just caching the same data twice.
  3. overlay-recenter makes a huge difference with many errors.
  4. inhibit-field-text-motion contributes much as well.

Does that more or less nail down your results?

@cpitclaudel
Copy link
Member

Absolutely. Based on this, I'd suggest dropping the line cache, and merging 3 and 4. I'll post the files I used for testing so that we can compare results (I'm surprised by the impact of inhibit-field-text-motion).

@cpitclaudel
Copy link
Member

cpitclaudel added a commit that referenced this issue Nov 23, 2016
* Add appropriate overlay-recenter calls
* Bind inhibit-field-text-motion to t to speed up line-end-position

This is a partial fix for GH-1168 (which see for timings).
Related issues: GH-153, GH-154, GH-179, GH-476.
cpitclaudel added a commit that referenced this issue Nov 23, 2016
* Add appropriate overlay-recenter calls
* Bind inhibit-field-text-motion to t to speed up line-end-position

This is a partial fix for GH-1168 (which see for timings).
Related issues: GH-153, GH-154, GH-179, GH-476.
@swsnr
Copy link
Contributor

swsnr commented Nov 23, 2016

@cpitclaudel Thanks for confirming.

I'd suggest to make a separate pull request for recentering and inhibit text field motion and merge that first. Let's keep #1169 open meanwhile; we can reevaluate it after merging the other optimisations.

I wonder whether we should look into integrating automated profiles for these critical code paths in our testing on Travis CI 😎 I don't have time to investigate whether and how that's possible, but I hope to find a quiet evening or weekend to go down that road: Looks really interesting, and a thorough profile of Flycheck's critical code paths would probably reveal some other potentials. I didn't particularly care for performance when I wrote Flycheck 😊


Besides, I love how the ELisp reference documents this:

A loop that scans the buffer forwards, creating overlays, can run faster if you do (overlay-recenter (point-max)) first.

"Can run faster" 😂 Indeed it does; lesson learned: Always read the manual very carefully and to the very end, for the best things seem to be documented in subordinate clauses down at the very bottom of the screen. Gems are always hardest to find 😎

@cpitclaudel
Copy link
Member

I'd suggest to make a separate pull request for recentering and inhibit text field motion and merge that first. Let's keep #1169 open meanwhile; we can reevaluate it after merging the other optimisations.

Ah, blast, I already edited #1169; let's do it the other way around (I'll open a new PR with the line cache reintroduced after we merge the clear-win optimizations).

@cpitclaudel
Copy link
Member

Besides, I love how the ELisp reference documents this:

A loop that scans the buffer forwards, creating overlays, can run faster if you do (overlay-recenter (point-max)) first.

Yup :) When I saw this post I had a vague feeling of "I wonder if I read something about this somewhere" :)

@cpitclaudel
Copy link
Member

I wonder whether we should look into integrating automated profiles for these critical code paths in our testing on Travis CI

This sounds like a fun task — and a useful thing to have, too :)

@swsnr
Copy link
Contributor

swsnr commented Nov 23, 2016

@cpitclaudel Whatever way you see fit, as long as we've got the commit implementing the line cache available somewhere. Would be a shame if it gets lost, and we later realize that it would have been a good idea ☺️

@fmdkdd
Copy link
Member

fmdkdd commented Jul 17, 2017

@cpitclaudel What's the status on that? #1169 has been merged, so has it improved the situation for you @jyp, or are there other venues for optimization?

@cpitclaudel
Copy link
Member

I think things are a lot better now. There's ongoing work on the Emacs side to improve overlay performance, and i can't think of much that we could do on our side.

That being said, there's still space for improvement for cases where too many errors are reported. We could e.g.

  • Stop parsing errors after parsing more than the threshold (instead of parsing them all and then disabling the checker for future runs)
  • Add a threshold on output size (in lines, or even more easily in bytes), and just disable the checker if it produces more than that
  • Add a threshold on input size (in bytes), and not send files that are larger than this threshold

The last two in particular would make things much better when using Flycheck on e.g. large minified javascript files (right now we freeze Emacs for a while just writing the whole file to the checker's stdin)

@fmdkdd
Copy link
Member

fmdkdd commented Jul 17, 2017

@cpitclaudel All of this looks good and doable. I've opened an issue for that. Thanks!

@cpitclaudel
Copy link
Member

Thanks!

@cpitclaudel
Copy link
Member

All remaining work for this issue is tracked in #1284 and #1471

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants