org-ref-get-labels taking very long time #647
Comments
It is slow with large files, and I think it's because the function parses the buffer several times when collecting labels. Perhaps we could make it faster by finding all labels with a single regular expression rather than calling several functions as we do now. We could also collect each label's position and context in the same pass. I haven't tested this enough, but this is roughly what I mean:
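(require 'cl-lib) ; provides `cl-pushnew'

(defun org-ref-get-labels ()
  "Return a list of unique labels found in the buffer in a single pass."
  (save-excursion
    (save-restriction
      (widen)
      (let (matches)
        (goto-char (point-min))
        ;; Group 1 matches the label prefix, group 2 the label itself.
        (while (re-search-forward
                (concat "\\(#\\+label:\s\\|"
                        "#\\+t?b?l?name:\s\\|"
                        "\\\\label{\\|"
                        ":custom_id:\s\\|"
                        "[^#+]label:\\|<<\\)\\([^\s\t\n>}]*\\)")
                nil t)
          (cl-pushnew (match-string-no-properties 2) matches
                      :test 'equal))
        matches))))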
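The idea of collecting each label's position in the same pass could look roughly like the sketch below. It is only an illustration: `my/org-ref-collect-labels-with-positions` is a hypothetical name, not part of org-ref, and the regexp covers just three of the label forms above (`#+label:`, `\label{` and `<<`) to keep the example short.

```elisp
(defun my/org-ref-collect-labels-with-positions ()
  "Return an alist of (LABEL . POSITION) in buffer order.
Hypothetical helper, not part of org-ref.  Only `#+label:',
`\\label{' and `<<' style labels are recognised in this sketch."
  (save-excursion
    (save-restriction
      (widen)
      (goto-char (point-min))
      (let ((re (concat "\\(?:#\\+label:[ \t]*\\|"
                        "\\\\label{\\|"
                        "<<\\)"
                        "\\([^ \t\n>}]+\\)"))
            ;; A hash table keeps duplicate detection cheap on big files.
            (seen (make-hash-table :test #'equal))
            result)
        (while (re-search-forward re nil t)
          (let ((label (match-string-no-properties 1)))
            (unless (gethash label seen)
              (puthash label t seen)
              ;; Record where the label text itself starts.
              (push (cons label (match-beginning 1)) result))))
        (nreverse result)))))
```

Returning the entries in buffer order keeps the first occurrence of a duplicated label, and `(mapcar #'car ...)` on the result gives a plain list of labels.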
|
That is a good idea. I am sure this was done early in org-ref development, in a way that tests each kind of label separately. I think there are benefits to the parsing, but if used it should probably only be done once, and maybe there could be a flag that either switches for large buffers, or is a user preference. I won't have time to work on this until probably late in May, so if anyone wants to tackle it and open a pull request, I would be happy to take a look at it. |
Thanks for all the comments. Unfortunately, I do not know much about Emacs Lisp. I have found a workaround for now that mitigates the problem. Because the function is called from jit-lock-function, disabling font-lock-mode in the large org-mode buffer and fontifying only the current block with font-lock-fontify-block significantly reduces CPU time and memory use. Hopefully this problem can be solved in the future by modifying the org-ref-get-labels function in org-ref. |
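For anyone who wants to try the same workaround, it might be wrapped in a small command like the one below. This is a sketch under assumptions: `my/org-fontify-block-only` is a made-up name, and the command simply combines the two built-in pieces mentioned above (`font-lock-mode` and `font-lock-fontify-block`).

```elisp
(defun my/org-fontify-block-only ()
  "Turn off `font-lock-mode' here and fontify only the block around point.
Hypothetical convenience command for very large Org buffers."
  (interactive)
  ;; Stop automatic fontification in this buffer.
  (when font-lock-mode
    (font-lock-mode -1))
  ;; Fontify just the text around point, on demand.
  (font-lock-fontify-block))
```

Run it with `M-x my/org-fontify-block-only` in the large buffer, and again whenever the text around point needs refreshing.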
To add to the above discussion, when I use org-ref with poly-org I experience a lag of several seconds after each keypress (see this issue). Using either mode by itself seems to work fine. I believe the issue on org-ref's side is related to the one here: the function proposed by @jagrg helps, but editing is still too slow to be usable. |
If you add (setq org-ref-colorize-links nil) to your init file, does this issue go away? You might be interested in trying https://github.com/jkitchin/org-ref/tree/cache-parse. I am trying a version where I cache the parse data so it gets re-used. It might still be slow for heavy editing, but I think it is a step in the right direction. |
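To illustrate the general idea of re-using scan results (this is not the code in the cache-parse branch), one simple approach is to remember the buffer's modification tick alongside the labels and rescan only when the tick changes. All `my/` names are hypothetical, and `my/scan-labels` is a simplified stand-in for the real, more expensive scan.

```elisp
(defvar-local my/label-cache nil
  "Cons cell (TICK . LABELS) from the last scan.  Hypothetical variable.")

(defun my/scan-labels ()
  "Simplified stand-in for an expensive scan: collect `#+label:' values."
  (save-excursion
    (save-restriction
      (widen)
      (goto-char (point-min))
      (let (labels)
        (while (re-search-forward "^#\\+label:[ \t]*\\([^ \t\n]+\\)" nil t)
          (push (match-string-no-properties 1) labels))
        (nreverse labels)))))

(defun my/cached-labels ()
  "Return the labels, rescanning only if the buffer text has changed."
  (let ((tick (buffer-chars-modified-tick)))
    ;; The tick only advances when characters are inserted or deleted,
    ;; so repeated calls during one redisplay share a single scan.
    (unless (eql (car my/label-cache) tick)
      (setq my/label-cache (cons tick (my/scan-labels))))
    (cdr my/label-cache)))
```

This removes repeated scans between edits, but every edit still triggers one full scan, which is the limitation discussed further down the thread.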
Strangely, |
So the font-lock triggers the re-parsing of the entire file, maybe multiple times, right? I have zero familiarity with the org/org-ref internals, but could the |
@matthuszagh you get colored links even after restarting emacs? |
Another option could be try @vspinu the buffer does get parsed frequently. This is the most accurate way to ensure you get all the relevant information. I have been thinking of some ways to do it incrementally, but they all sacrifice some accuracy. For example, you could incrementally add labels to a variable, but there isn't a simple way to remove them from that list if you delete the label. Eventually, the variable will need to be remade by parsing the buffer again. It looks possible to narrow the scope of org-element-parse-buffer with narrowing, so that might be a path towards incremental updates, if I can figure out how to narrow to the visible region. I might work on an approach like this next. |
@jkitchin I do. I've tried with:
and with setting Here's the version info: I've tested this with only straight.el and org-ref (plus dependencies if it has them), so there shouldn't be any conflicts with other packages. |
I think adding a
What is the purpose to narrow to the "visible region"? If it's only for the sake of font-lock then the font-lock/jit-lock machinery already has all pieces in place. You can add a function to In fact if |
The point of narrowing is to limit what org-element-parse-buffer does. By default it scans the whole buffer, unless narrowing is in place. This function gets called during fontification, I think. You may be correct about what needs to be parsed, but you still have to use narrowing to limit the scope of org-element-parse-buffer. org-element-parse-buffer provides the most accurate, up-to-date and easy to use source of data for this, but it is the slowest. I think you can avoid that if you The slowness comes about because org-ref is checking the validity of a ref link, and doing that requires making sure there is a label it refers to. That is currently done with a full parse of the buffer, eventually inside of it would probably further help performance to replace the parse-buffer code with regexp searches. The parse-buffer code is simpler and works, and regexp code is harder to write and debug I think, but it might be worth exploring. It feels like it would be a lot of work to make this work in an incrementally updating way that also used a before-change-hook to remove deleted labels. I would like to see performance be better for large documents though; it even has an effect on me when I write them! |
I actually was wondering about "visible region" not the narrowing. From what you said it looked that only the "visible region" is of importance for the user experience.
What I was proposing is that you still use |
LOL. It isn't that simple. Many times the region defined in the before-change-hook is either empty (an insertion), or a single character (a delete), and otherwise a set of characters that may or may not include a full label. org-element-parse-buffer cannot deal with any of these scenarios because they don't contain full labels. Depending on the situation you have to look back and forward to see if the change is on a changing label. You can't rely on storing the positions, because they change as you edit the document, e.g. after the change, all labels after it have new positions. Another issue I have run into with this overall approach is that when you first open a document, it is not fully fontified, and if there is some folding of headlines, some labels are missed. I have a mostly working draft of this in https://github.com/jkitchin/org-ref/tree/font-lock-labels. |
Ok. let's think aloud.
Well, you have to extend the region of course. Most of the tooling relying on after-change does that (font-lock, syntax) and I bet you can just re-use font-lock-extend-region-functions for what you need.
So what? All you have to know is the range of the deleted region. Before-change-functions are used to clean up the region from the cache. The actual updating of the cache is done in the after-change-functions. In fact you can use only
Either store markers or (better) update the entries in the cache by shifting the positions by the length of the inserted text.
You would need to have an original pass if that's what is required.
AFAICS the change caches the results between the modifications. I bet it's a good improvement but it doesn't solve the core issues that the entire buffer is parsed after each character insertion. Hopefully @matthuszagh could give it a try on his document. I am not an org-ref user. I just want it to be working with poly-org. BTW, just a hunch. Do you add text properties outside of font-lock? Those should be protected with |
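As a rough illustration of the "shift the positions" suggestion, the sketch below adjusts an alist of `(LABEL . POSITION)` entries using the three arguments Emacs passes to `after-change-functions`. All `my/` names are hypothetical and none of this is org-ref code. It also assumes each entry stores only the label's start position, so it cannot detect an edit in the middle of a label that starts before the change, which is the kind of case raised earlier in the thread.

```elisp
(defvar-local my/label-positions nil
  "Alist of (LABEL . POSITION) for this buffer.  Hypothetical cache.")

(defun my/shift-label-positions (cache beg end old-len)
  "Return CACHE adjusted for one buffer change.
BEG, END and OLD-LEN have the meaning they have in
`after-change-functions': the new text spans BEG..END and replaced
OLD-LEN characters of old text."
  (let ((delta (- (- end beg) old-len)) ; net change in buffer length
        (old-end (+ beg old-len))       ; end of the replaced text
        result)
    (dolist (entry cache (nreverse result))
      (let ((pos (cdr entry)))
        (cond
         ;; Label starts before the change: position is unaffected.
         ((< pos beg) (push entry result))
         ;; Label starts after the replaced text: shift by DELTA.
         ((>= pos old-end)
          (push (cons (car entry) (+ pos delta)) result)))
        ;; Otherwise the label started inside the replaced text and is
        ;; dropped; a rescan of BEG..END would be needed to re-add it.
        ))))

(defun my/label-positions-after-change (beg end old-len)
  "Keep `my/label-positions' in step with the edit just made."
  (setq my/label-positions
        (my/shift-label-positions my/label-positions beg end old-len)))

;; To enable in the current buffer only:
;; (add-hook 'after-change-functions #'my/label-positions-after-change nil t)
```

Storing markers instead of integer positions would let Emacs do the shifting itself, at the cost of keeping many markers alive in a large buffer.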
@vspinu have you tried |
@vspinu would it be possible for you to send me a poly-org setup and example file that you use that is a problem? |
@jkitchin You can use the same test file I provided to @vspinu, example.zip. Here's my setup for polymode and org-ref.
I'm still not seeing any effect from setting |
👍
|
@matthuszagh if you can do some profiling similar to the one at the beginning of this thread it might help identify where the performance lag is. |
This is what the
thanks for this suggestion. I might try something like this, as it would allow a sorted list of labels to be returned, e.g. in the order they appear in the buffer.
this is also done in the branch.
This isn't the case, all changes are incremental after the initial buffer parse as far as I know.
Thanks for this suggestion too; this branch was not doing that until now, but I have wrapped it in that macro to protect it. I am reasonably sure this branch should substantially improve the performance of org-ref-get-labels, but there might still be other places in org-ref that are problematic on big files. |
I have run some timing experiments. On the current master branch, it takes 13-14 seconds to open the example/info.org file, and editing is irritatingly slow, with delays of a second between keypresses as reported. But, if you set with the with the I plan to test this locally for a few more days, but it looks like I will be merging the |
Amazing job! Thanks! |
I have merged this into master. I am going to close it. If there are remaining problems with this, feel free to reopen it. |
By profiling Emacs, I found that the org-ref-get-labels function takes a lot of the CPU time and memory. Emacs is almost unresponsive when the org file is large.
I am running the Spacemacs develop branch, version 0.300.0; the org-ref version is 20190318-1558 and the Emacs version is 25.2.2. Is there anything that I can do to improve the speed?
Here is the screenshot of the profiling of cpu time:
- run-hook-wrapped 13520 88%
- org-activate-links 13492 88%
- org-element--parse-elements 11776 77%
org-ref-get-latex-labels 16 0%
org-ref-get-names 4 0%
org-ref-get-org-labels 4 0%
+ org-do-latex-and-related 4 0%
Here is the screenshot for profiling the memory:
- evil-line-move 655,529,820 85%
- jit-lock-function 655,264,803 85%
- font-lock-fontify-region 655,257,523 85%
- org-ref-get-labels 654,521,457 85%
- save-excursion 654,042,741 85%
org-ref-get-org-labels 2,048 0%
org-ref-get-names 2,048 0%
org-activate-dates 9,212 0%
org-fontify-drawers 1,024 0%
org-activate-tags 1,024 0%
org-fontify-macros 1,024 0%
org-font-lock-add-priority-faces 1,024 0%
org-activate-code 1,024 0%
+ turn-on-pangu-spacing 6,224 0%
- eval 34,004 0%
Thanks!