Memoize syntax-ppss in a couple places #629

Closed
wants to merge 5 commits into from


aaronjensen
Contributor

These places are called many times each operation with the same point.
syntax-ppss can be slow in some modes, like elixir-mode.

Before

+ flyspell-post-command-hook                                      694  64%
- command-execute                                                 291  26%
 - call-interactively                                             291  26%
  - funcall-interactively                                         291  26%
   - self-insert-command                                          275  25%
    - sp--post-self-insert-hook-handler                           261  24%
     - condition-case                                             261  24%
      - if                                                        261  24%
       - progn                                                    261  24%
        - if                                                      261  24%
         - let                                                    261  24%
          - let                                                   261  24%
           - cond                                                 261  24%
            - if                                                  259  24%
             - setq                                               259  24%
              - progn                                             259  24%
               - let                                              200  18%
                - cdr                                             200  18%
                 - sp--all-pairs-to-insert                        200  18%
                  - let                                           200  18%
                   - let                                          199  18%
                    - let                                         166  15%
                     - while                                      166  15%
                      - let                                       166  15%
                       - if                                       166  15%
                        - sp--do-action-p                         165  15%
                         - let*                                   133  12%
                          - if                                     81   7%
                           - sp-point-in-string                    81   7%
                            - condition-case                       78   7%
                             - progn                               78   7%
                              - save-excursion                     77   7%
                               - nth                               77   7%
                                + syntax-ppss                      73   6%
                          + cond                                   47   4%
                          + sp-get-pair                             3   0%
                          + sp--get-closing-regexp                  1   0%
                         + setq                                    32   2%
                        + progn                                     1   0%
                    + if                                           33   3%
               + sp-insert-pair                                    53   4%
               + sp-skip-closing-pair                               6   0%
            + sp--char-is-part-of-closing                           2   0%
    + flycheck-handle-change                                        8   0%
    + smie-blink-matching-open                                      2   0%
      jit-lock-after-change                                         2   0%
    + expand-abbrev                                                 1   0%
   + newline-and-indent                                             7   0%
   + evil-normal-state                                              5   0%
   + profiler-report                                                4   0%
+ redisplay_internal (C function)                                  28   2%
+ evil-escape-pre-command-hook                                     22   2%
+ ...                                                              20   1%
+ timer-event-handler                                              12   1%
+ sp--save-pre-command-state                                        5   0%
+ winner-save-old-configurations                                    3   0%
+ alchemist-company-filter                                          2   0%
+ company-post-command                                              1   0%
+ which-key--hide-popup                                             1   0%

After

+ flyspell-post-command-hook                                      917  75%
- command-execute                                                 217  17%
 - call-interactively                                             217  17%
  - funcall-interactively                                         217  17%
   + newline-and-indent                                           202  16%
   + evil-normal-state                                              6   0%
   + profiler-report                                                4   0%
   + evil-open-below                                                3   0%
   - self-insert-command                                            1   0%
      jit-lock-after-change                                         1   0%
+ redisplay_internal (C function)                                  26   2%
+ ...                                                              13   1%
+ timer-event-handler                                              12   0%
+ winner-save-old-configurations                                    7   0%
+ evil-repeat-post-hook                                             4   0%
+ evil-escape-pre-command-hook                                      4   0%
+ evil-repeat-pre-hook                                              3   0%
+ alchemist-company-filter                                          3   0%
+ hl-paren-initiate-highlight                                       1   0%

These profiles were taken with #628 included as well.

@@ -8062,6 +8072,10 @@ the opening delimiter or before the closing delimiter."

(defvar sp-show-pair-enc-overlays nil)

(defvar-local sp-last-point nil)
Owner


Heh, we had this variable before; I got rid of it at some point (but it was used for a different purpose).

To save the state I've added the defstruct called sp-state. Please extend that. I'll slowly refactor all the code to use it as the only source of buffer-local state (fewer buffer-local variables = better performance, and we have quite a few).

Also, I think we could use a better name: this is only related to the point at which we test syntax-ppss, whereas, at least to me, it evokes the last position of point before the current command.

Otherwise this seems like a fine idea!
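
A minimal sketch of what "extend the defstruct" could look like. The struct and variable names below (`my-sp-state`, `my-sp-state-ensure`) are hypothetical stand-ins; the real `sp-state` struct has slots that are not shown in this thread, and only the two slot names come from the snippet further down.

```elisp
(require 'cl-lib)

;; Hypothetical stand-in for smartparens' `sp-state' struct.
(cl-defstruct my-sp-state
  ;; Position at which `syntax-ppss' was last evaluated, or nil if empty.
  last-syntax-ppss-point
  ;; Parse state returned by that evaluation.
  last-syntax-ppss-result)

;; One buffer-local struct instead of one buffer-local variable per field.
(defvar-local my-sp-state nil
  "Hypothetical buffer-local state object for this sketch.")

(defun my-sp-state-ensure ()
  "Return the current buffer's state struct, creating it on first use."
  (or my-sp-state
      (setq my-sp-state (make-my-sp-state))))
```

Each new piece of per-buffer state then becomes a slot rather than another `defvar-local`.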

@aaronjensen
Contributor Author

@Fuco1 Updated, thanks for the pointer. Please let me know if you'd like me to squash the commits; otherwise I think GitHub's squash and merge would probably do a fine job of it.

      (let ((result (syntax-ppss p)))
        (setf (sp-state-last-syntax-ppss-point sp-state) p
              (sp-state-last-syntax-ppss-result sp-state) result)
        result))))
Contributor Author


Apparently I could inline the let and drop the final result here, and it would behave the same, because setf returns the value of its last assignment. That would be more concise, but maybe less clear if that behavior is not well known:

(defun sp--syntax-ppss (&optional p)
  "Memoize the last result of syntax-ppss."
  (let ((p (or p (point))))
    (if (eq p (sp-state-last-syntax-ppss-point sp-state))
        (sp-state-last-syntax-ppss-result sp-state)
      (setf (sp-state-last-syntax-ppss-point sp-state) p
            (sp-state-last-syntax-ppss-result sp-state) (syntax-ppss p)))))

Let me know if you prefer this form.

Owner


I use that often with setq so I don't think it will be a problem.
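
For readers less familiar with the idiom, here is a small sketch of the behavior being relied on: a multi-pair `setf` (like `setq`) evaluates to the value of its last assignment. The struct `my-cache` and the function `my-cache-store` are hypothetical and exist only to provide `setf`-able places.

```elisp
(require 'cl-lib)

;; Hypothetical two-slot struct, used only to have `setf'-able places.
(cl-defstruct my-cache pos result)

(defun my-cache-store (cache p value)
  "Store P and VALUE in CACHE and return VALUE.
No explicit return form is needed: `setf' with several pairs
returns the value assigned by the last pair."
  (setf (my-cache-pos cache) p
        (my-cache-result cache) value))
```

Calling `(my-cache-store (make-my-cache) 5 'state)` therefore hands back `state`, which is why the trailing `result` in the longer form can go.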

Contributor Author


Awesome, be8e135

@Fuco1
Owner

Fuco1 commented Jul 12, 2016

I don't see the squash option, the only button available is "Merge pull request". Do I need to enable it somewhere?

@aaronjensen
Contributor Author

I don't see the squash option, the only button available is "Merge pull request". Do I need to enable it somewhere?

Once you click "Merge pull request" the button changes to one with a dropdown where you can select between the two options. If you don't see the squash option, it can be enabled here: https://github.com/Fuco1/smartparens/settings in the "Merge button" section

@aaronjensen
Contributor Author

@Fuco1 is something in this branch causing the builds to time out? I may need a hand diagnosing that; it seems tricky.

@Fuco1
Owner

Fuco1 commented Jul 12, 2016

Probably just travis glitches, it sometimes happens.

One possible cause I can think of is that while the point might remain the same, the state of the buffer might change (e.g. some character is replaced without the point ever changing position). Then the parse state might change but you never catch it.

I don't think hashing the buffer content would be a great idea, as that can take quite some time. Maybe check the buffer-modified flag as well? (I don't know how that behaves with respect to atomicity, though: is it only set after the entire series of functions called from a command has run, or does it really change right after each buffer content change?)
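
One cheap alternative to hashing, sketched under assumptions: key the memo on both the position and `buffer-chars-modified-tick`, a built-in counter that changes whenever the buffer's characters change. This is not what the pull request does; the names `my-ppss-cache` and `my-syntax-ppss` are hypothetical.

```elisp
(defvar-local my-ppss-cache nil
  "Hypothetical cache cell: (POS TICK . PPSS), or nil when empty.")

(defun my-syntax-ppss (&optional pos)
  "Like `syntax-ppss', but reuse the last result when nothing has changed.
The cached value is reused only if POS and the buffer's character
modification tick both match the previous call."
  (let ((pos (or pos (point)))
        (tick (buffer-chars-modified-tick)))
    (if (and my-ppss-cache
             (= pos (nth 0 my-ppss-cache))
             (= tick (nth 1 my-ppss-cache)))
        ;; Cache hit: everything after POS and TICK is the parse state.
        (nthcdr 2 my-ppss-cache)
      (let ((ppss (syntax-ppss pos)))
        (setq my-ppss-cache (cons pos (cons tick ppss)))
        ppss))))
```

With this key, replacing a character elsewhere in the buffer invalidates the memo even though point has the same numeric value.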

@aaronjensen
Contributor Author

What about clearing on idle? Seems like a super edge case

@aaronjensen
Contributor Author

I guess it's not an edge case, there are probably things like revert that could do this?

The other option is to build it into a wrapping macro/function like sp--with-case-sensitive, where we clear it upon entry. That'd be the safest and would ensure we only memoize within a given function invocation. Of course, that complicates things, and if we're not careful with its usage (we'd probably want yet another wrapping function so it isn't cleared on every call to sp--looking-*), we could lose the benefit because the memo would be cleared too often.
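
A rough sketch of that wrapping-macro idea, with hypothetical names throughout (`my-with-ppss-memo`, `my-memo-syntax-ppss`). The memo lives in a dynamic binding created on entry, and a nested use keeps the outer memo so that inner helpers do not clear it on every call.

```elisp
(defvar my-ppss-memo nil
  "Hypothetical memo cell: (POS . PPSS) while inside `my-with-ppss-memo'.")

(defvar my-ppss-memo-active nil
  "Non-nil while a `my-with-ppss-memo' body is running.")

(defmacro my-with-ppss-memo (&rest body)
  "Run BODY with a fresh, dynamically scoped `syntax-ppss' memo.
A nested use keeps the outer memo instead of starting a new one."
  (declare (indent 0) (debug t))
  `(if my-ppss-memo-active
       (progn ,@body)
     (let ((my-ppss-memo-active t)
           (my-ppss-memo nil))
       ,@body)))

(defun my-memo-syntax-ppss (&optional pos)
  "Return `syntax-ppss' at POS, memoized only inside `my-with-ppss-memo'."
  (let ((pos (or pos (point))))
    (cond
     ;; Outside the wrapper there is no memo, so always recompute.
     ((not my-ppss-memo-active) (syntax-ppss pos))
     ((and my-ppss-memo (= pos (car my-ppss-memo)))
      (cdr my-ppss-memo))
     (t (let ((ppss (syntax-ppss pos)))
          (setq my-ppss-memo (cons pos ppss))
          ppss)))))
```

The trade-off is the one described above: the memo cannot outlive the outermost wrapper, but every entry point that should benefit has to be wrapped.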

@aaronjensen
Contributor Author

I reset the memoization in the post command hook 56c5b74, do you think that is sufficient?

@dgutov

dgutov commented Jul 20, 2016

syntax-ppss can be slow in some modes, like elixir-mode.

Is it really? How does that happen? Even if elixir-mode has a relatively slow syntax-propertize-function, if you're calling syntax-ppss multiple times from one position, it should just use the value saved to syntax-ppss-last, and not propertize anything.

@Fuco1
Owner

Fuco1 commented Jul 20, 2016

@dgutov Thanks for pointing that out. Seems like it already does some form of caching by default.

I've checked the code of syntax-ppss and I can't say I'm any wiser. But the profiles above clearly show that caching helps. Which leads me to two conclusions:

  • either the caching in emacs's own implementation sucks
  • or we cache more than we should (i.e. we don't invalidate properly)

It is probably safer to assume the bug is on our side then. But where could the problem lie (if there even is one)? Tests are green at least.

@Fuco1
Owner

Fuco1 commented Jul 20, 2016

@aaronjensen re timeouts, it looks like an infinite loop to me. Aren't you doing some recursion somewhere? Because about half of the tests seem to run and then it suddenly stops.

@aaronjensen
Contributor Author

@dgutov syntax-ppss appears to do parse-partial-sexp from the old position to the current position. In elixir, the old position is typically 1, which is probably because it is the outermost form.

As a matter of fact, there is a related comment:

                 ;; If `pt-min' is too far from `pos', we could try to use
                 ;; other positions in (nth 9 old-ppss), but that doesn't
                 ;; seem to happen in practice and it would complicate this
                 ;; code (and the before-change-function code even more).
                 ;; But maybe it would be useful in "degenerate" cases such
                 ;; as when the whole file is wrapped in a set
                 ;; of parentheses.

I wonder if elixir presents this exact "degenerate" case.

Either way, the slow bit is the parse-partial-sexp which ultimately is called about 100 times or so for every keypress by smartparens.
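
To check a call count like this on your own setup, one option is to count calls with advice and report the total after each command. This is only a sketch: all `my-ppss-*` names are hypothetical, and it counts `syntax-ppss` calls rather than `parse-partial-sexp` itself.

```elisp
(require 'nadvice)

(defvar my-ppss-call-count 0
  "Hypothetical counter of `syntax-ppss' calls since the last report.")

(defun my-ppss-count-call (&rest _args)
  "Increment `my-ppss-call-count'; meant as :before advice."
  (setq my-ppss-call-count (1+ my-ppss-call-count)))

(defun my-ppss-report-count ()
  "Log and reset the call counter; meant for `post-command-hook'."
  (when (> my-ppss-call-count 0)
    (message "syntax-ppss called %d times by %s"
             my-ppss-call-count this-command))
  (setq my-ppss-call-count 0))

(defun my-ppss-counting-start ()
  "Start counting `syntax-ppss' calls per command."
  (advice-add 'syntax-ppss :before #'my-ppss-count-call)
  (add-hook 'post-command-hook #'my-ppss-report-count))

(defun my-ppss-counting-stop ()
  "Stop counting and remove the advice and hook."
  (advice-remove 'syntax-ppss #'my-ppss-count-call)
  (remove-hook 'post-command-hook #'my-ppss-report-count))
```

Run `(my-ppss-counting-start)`, type a character in the affected buffer, and read the count in `*Messages*`; `(my-ppss-counting-stop)` undoes it.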

@aaronjensen
Contributor Author

aaronjensen commented Jul 20, 2016

@Fuco1 The tests actually pointed to a problem. Thanks to syntax-ppss's code I found the right place to clear the memoization. This should be good to go as the tests now pass aside from one 24.3 ruby test (does that usually pass?)
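
The comment above does not say which place that is. One plausible reading, and only an assumption here, is the hook that `syntax-ppss` itself uses to flush its cache, `before-change-functions`. A sketch of a memo cleared that way, with hypothetical names:

```elisp
(defvar-local my-ppss-memo-cell nil
  "Hypothetical memo cell: (POS . PPSS), or nil when empty.")

(defun my-ppss-memo-flush (&rest _ignored)
  "Drop the memoized parse state.
Accepts and ignores the BEG and END arguments that
`before-change-functions' passes to its members."
  (setq my-ppss-memo-cell nil))

(defun my-cached-syntax-ppss (&optional pos)
  "Return `syntax-ppss' at POS, reusing the last result for the same POS.
A cache miss also installs the buffer-local flush hook."
  (let ((pos (or pos (point))))
    (if (and my-ppss-memo-cell (= pos (car my-ppss-memo-cell)))
        (cdr my-ppss-memo-cell)
      ;; Third argument appends; fourth makes the hook buffer-local.
      (add-hook 'before-change-functions #'my-ppss-memo-flush t t)
      (let ((ppss (syntax-ppss pos)))
        (setq my-ppss-memo-cell (cons pos ppss))
        ppss))))
```

Any edit to the buffer clears the memo before the change happens, so a stale parse state cannot survive a modification made with change hooks enabled.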

@Fuco1
Owner

Fuco1 commented Jul 20, 2016

That one test failed before as well. It was introduced by some change in advices which I can't track down. It should be completely unrelated.

@dgutov

dgutov commented Jul 20, 2016

@Fuco1 Maybe the bug is here. A bug in Emacs is also possible (I've only recently found a problem in jit-lock where re-fontifying an already fontified region was an order of magnitude more expensive than it has to be). Finding and fixing it would be a great result.

@dgutov

dgutov commented Jul 20, 2016

@aaronjensen

In elixir, the old position is typically 1

When does this happen? Normally, old-pos is taken from syntax-ppss-last, and syntax-ppss always assigns it at the end.

As a matter of fact, there is a related comment:

IIUC, that branch is only taken when old-pos is nil, i.e. when you're calling syntax-ppss from an earlier position than you called it before. So you're calling it from many different positions?

@aaronjensen
Contributor Author

@dgutov

When does this happen? Normally, old-pos is taken from syntax-ppss-last, and syntax-ppss always assigns it at the end.

It doesn't assign it at the end if this is true, which it is in elixir when you're inside a module.

        (if (and old-pos (< (- pos old-pos)
                            ;; The time to use syntax-begin-function and
                            ;; find PPSS is assumed to be about 2 * distance.
                            (* 2 (/ (cdr (aref syntax-ppss-stats 5))
                                    (1+ (car (aref syntax-ppss-stats 5)))))))
            (progn
              (cl-incf (car (aref syntax-ppss-stats 0)))
              (cl-incf (cdr (aref syntax-ppss-stats 0)) (- pos old-pos))
              (parse-partial-sexp old-pos pos nil nil old-ppss))

I added logging for pos and old-pos and got:

2410 1 [97 times]

IIUC, that branch is only taken when old-pos is nil, i.e. when you're calling syntax-ppss from an earlier position than you called it before. So you're calling it from many different positions?

Yes, I think that's correct, that branch isn't taken, the comment just seemed related. Probably it's not.

@dgutov

dgutov commented Jul 20, 2016

It doesn't assign it at the end if this is true, which it is in elixir when you're inside a module.

What do you mean? Judging by the code, it's always assigned.

I added logging for pos and old-pos and got:

Does it take the syntax-ppss-stats branch?

@aaronjensen
Contributor Author

What do you mean? Judging by the code, it's always assigned.

Not in the true branch of the if I pasted, only in its false branch AFAICT. Am I missing something? I'm looking at Emacs 25 code, if that matters.

Does it take the syntax-ppss-stats branch?

Yes, the progn which ends in parse-partial-sexp

@aaronjensen
Contributor Author

Here is the full code w/ the meaty cond snipped and some comments:

(condition-case nil
        (if (and old-pos (< (- pos old-pos)
                            ;; The time to use syntax-begin-function and
                            ;; find PPSS is assumed to be about 2 * distance.
                            (* 2 (/ (cdr (aref syntax-ppss-stats 5))
                                    (1+ (car (aref syntax-ppss-stats 5)))))))
            (progn
              (cl-incf (car (aref syntax-ppss-stats 0)))
              (cl-incf (cdr (aref syntax-ppss-stats 0)) (- pos old-pos))
              (parse-partial-sexp old-pos pos nil nil old-ppss))

          ;; begin else
          (cond snipped)

          (setq syntax-ppss-last (cons pos ppss)) ;; end if
          ppss)
      (args-out-of-range
       ;; If the buffer is more narrowed than when we built the cache,
       ;; we may end up calling parse-partial-sexp with a position before
       ;; point-min.  In that case, just parse from point-min assuming
       ;; a nil state.
       (parse-partial-sexp (point-min) pos)))

@dgutov

dgutov commented Jul 21, 2016

Not in the true branch of the if I pasted, only in its false branch AFAICT. Am I missing something? I'm looking at Emacs 25 code, if that matters.

You're right. That looks like a bug, BTW.

Yes, the progn which ends in parse-partial-sexp

And here, it seems, the performance heuristic misfires. Not really sure about that: maybe it'll be sufficient to set syntax-ppss-last in both branches, or maybe the check needs to be adjusted as well (I don't really understand it).

Do you mind filing a "syntax-ppss is slow" bug with a reproducible scenario? What mode to install, what buffer contents to have, that sort of steps.

Thanks!

@aaronjensen
Contributor Author

@dgutov http://debbugs.gnu.org/cgi/bugreport.cgi?bug=24048

@aaronjensen
Contributor Author

Something in this pull breaks automatic insertion of end after a do in elixir, so I need to investigate that.

@aaronjensen
Contributor Author

Something in this pull breaks automatic insertion of end after a do in elixir, so I need to investigate that.

This was a red herring. It was actually spacemacs that broke this: syl20bnr/spacemacs#6660

I've rebased this pull request and simplified the condition for adding the reset hook. This should be ready to merge now, @Fuco1.

@aaronjensen
Contributor Author

@Fuco1 this is ready for a merge if you're good with it. Any additional feedback? Thanks!

@Fuco1
Owner

Fuco1 commented Aug 28, 2016

I rebased and merged this, thanks!

@Fuco1 Fuco1 closed this Aug 28, 2016
@aaronjensen
Contributor Author

aaronjensen commented Aug 28, 2016

@Fuco1 np, thanks for the merge. Btw, Squash and merge does a similar thing as rebasing manually but it marks the pull request as merged, so that's nice for tracking.

It won't maintain the commit history, but I wasn't intending to maintain that anyway 😄

@aaronjensen aaronjensen deleted the memoize-syntax-ppss branch August 28, 2016 18:40
@Fuco1
Owner

Fuco1 commented Aug 28, 2016

Yeah, but it doesn't create a merge commit. I just wanted to move the branch to the top. I don't know why they don't provide that option :/

But I guess I could've just squashed it anyway. shrug
