Parsing long files #791
Comments
Hum, that's odd. 4.8M doesn't look very large, and certainly nothing that would require 5 minutes to process. Could you provide the file so we can check what's going on? Also, which version of Geany are you using, and on which OS? Anyway, to work around this you can try a few things:
|
If it's a 5MB source code file, I'm guessing it's auto-generated (or else Java is way more verbose than I imagined). If so, it could be all on one line, which would make it really slow. If it's not all on one line but has many long lines and long-line wrapping is on, it could also take a long time. |
It is mainly hand-written, so most of the lines are 40-120 chars long. I appreciate your time and thoughts on this matter. Thanks, Phil
|
Well, Scintilla is known not to handle really long lines well (e.g. minified JS), but also Edit: NVM, I didn't notice you said "Line wrapping is off". |
@philiprbrenan did you try the suggestions by @b4n to identify whether it's the symbol parser or the highlighting lexer? |
Following your suggestions: Disabling real-time symbol parsing prevents the problem from occurring.
Disabling all file type-specific features works as suggested - this was I appreciate your help! Thanks, Phil
|
When I save the attached Java file of 2K lines it takes about 30 seconds to save, during which time Geany is unresponsive. Lint of course finds lots of errors very quickly and bails out after less than a second. The brackets match correctly, but class test1 is defined multiple times. If I shorten the file, the parse blackout problem disappears at around 600 lines - although there is still a noticeable pause before the file is reported as saved in the Messages tab. |
Looks like the particular file is encountering pathological worst case performance of the ctags parser or symbol handling software. Improvements are welcome. |
It's not that slow here, maybe 0.5-0.75 seconds to save, but my computer is really fast. It seems to perform better if replacing Edit: I didn't notice you didn't attach the whole 2K line file as said. It takes about 3-4 seconds when I expand the file to ~2500 lines, and while pasting copies of it, I got it to lock up for about 10 seconds. |
Maybe one more suggestion - do you experience the same slowness if you switch from the "Symbols" tab in the sidebar to something else? IMO the parser should be fast in this case but I suspect the symbols tree generation is the slow one here. |
I have tried clicking in the Symbols window and then clicking in the Java
source and back again: this does not seem to cause any problems, nor does
expanding or collapsing the symbols tree. The effect appears to be
nonlinear - when I save 1K lines the delay is perhaps 2 seconds, but when I
try to save 2K lines it is 30 seconds. Checking the system monitor shows
100% CPU, memory appears to be fine.
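
To put a rough number on how nonlinear these timings are, here is a minimal sketch that fits a power law (time proportional to size**k) to the two measurements quoted above. The function name is made up for illustration, and two data points only give a crude estimate.

```python
import math

def growth_exponent(n1, t1, n2, t2):
    """Estimate k in t ~ n**k from two (size, time) measurements."""
    return math.log(t2 / t1) / math.log(n2 / n1)

# Timings reported in this thread: ~2 s for 1K lines, ~30 s for 2K lines.
k = growth_exponent(1000, 2.0, 2000, 30.0)
print(f"estimated exponent: {k:.2f}")
```

An exponent near 1 would mean linear scaling; doubling the file size while the time grows 15-fold gives an exponent close to 4, which is consistent with something worse than quadratic going on.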
|
Sorry, maybe I said it in a confusing way - in the sidebar select e.g. Documents instead of Symbols. The thing is that when the Symbols tree isn't shown, it isn't rebuilt when you type. I believe this might fix the problem you mentioned in your first post about the freeze when typing { at the beginning of the file. But I doubt it will have any effect on saving. Could you provide some bigger file for testing? The file you provided is just 100 lines and is insufficient to trigger the issue for me. |
OK nevermind, I can reproduce it when copy-pasting the lines in your file several times. I'll have a look at the profiler output if I can see something. |
Yes - that is what I have been doing to see where the effect kicks in.
Whether the Symbols or Documents tab is open does not seem to have any
effect - the blackout effect is easily triggered by doing a save regardless.
The fact that I am reusing the same names over and over again might be part
of the problem. If helpful I could construct a more meaningful example
using unique names?
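
For anyone else trying to reproduce this, the copy-paste approach described above can be scripted. This is only a sketch: `repeat_snippet` is a hypothetical helper, and the three-line `sample` is a stand-in for the Java file attached to the issue, not its real contents.

```python
def repeat_snippet(snippet: str, copies: int) -> str:
    """Return `snippet` repeated `copies` times, each copy newline-terminated."""
    if not snippet.endswith("\n"):
        snippet += "\n"
    return snippet * copies

# Hypothetical stand-in for the sample file; reusing the same class name
# mirrors the repeated names mentioned in this thread.
sample = "class test1 {\n    int a;\n}\n"

for copies in (10, 20, 40):
    text = repeat_snippet(sample, copies)
    print(copies, "copies ->", text.count("\n"), "lines")
```

Writing each `text` to its own `.java` file and saving it from Geany at a few different sizes should show where the delay starts to grow.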
|
Alright, just tried with the profiler and a big part of the problem should be solved by the patch here: It doesn't fix the slowness completely, but at least it should fix the non-linear part of it. With about 4000 lines from your example, about 75% of the time was spent rehighlighting the document. The remaining 25% was spent by the parser (I'm afraid we cannot do much about this). If you are able to recompile, could you try the patch and see whether it helps? |
This is absolutely marvellous - I will try to do this - as this is my first One of the (many) reasons I use Geany for Java in preference to using the Thanks, Phil
|
By the way, I've been playing with the file a bit more and I can see quite some time spent in the symbol tree building too (switch to the Documents tab, edit the file so some symbols get added/removed and switch back to the Symbols tree - it takes quite some time to rebuild it). In the past I suggested we should limit the number of entries in the tree to some sane number, say 10000 entries (your file contains 234 entries per line, with 1000 lines it becomes 234000 entries), because the current implementation doesn't scale very well: I think we should introduce some limit. Yes, scalability is one of my favorite Geany features too (not only for big files but also for big projects with thousands of files). So I'm definitely interested in improving any code that doesn't scale well. |
@philiprbrenan By the way, the patch removes a single line from the code, so instead of pulling the patch you can just get Geany from master and comment out the single line, which might be easier for you. If you are on Debian, just run
which should install all the dependencies and then run
|
The symbol tree slowness may also be caused by this: The repeated symbol names make things worse for the tree generation - this might be fixable though. Note to self: learn the difference between multiplication and addition: in the numbers above there are just 35 tags per line. |
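
With the corrected figure of 35 tags per line, the arithmetic behind the proposed limit looks like this. A sketch only: `MAX_TREE_ENTRIES` and `tree_entries` are illustrative names, and the 10000 cap is the number floated earlier in this thread, not an existing Geany setting.

```python
MAX_TREE_ENTRIES = 10_000  # limit suggested in this thread (assumption, not a Geany option)

def tree_entries(tags_per_line: int, lines: int, limit: int = MAX_TREE_ENTRIES):
    """Return (total tags, tags that would be shown under the cap)."""
    total = tags_per_line * lines
    shown = min(total, limit)
    return total, shown

# 35 tags per line over 1000 lines of the sample file.
total, shown = tree_entries(35, 1000)
print(f"{total} tags in total, {shown} shown in the tree")
```

Even at 35 tags per line, a 1000-line copy of the sample already exceeds the suggested cap several times over.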
A huge improvement - down from tens of seconds to too fast to notice. Thank you!
|
A bit off-topic, but ... @techee are you using the Xcode profiler, GNU gprof, or other? I tried to do a profile build to test this previously, but I was unable to get gprof to produce any output in the report, I suspect because of splitting libgeany out of the main app. All I did was to put |
@codebrainz I guess you missed the announcement of the wiki page about profiling Geany: |
@philiprbrenan Great to hear! Even though the original patch wasn't quite right as Colomban noticed, the updated version should fix the performance issue too. |
Yep, I missed that. Will give it a read, thanks. |
Closing as the issue seems to be resolved according to the comments. Feel free to reopen if it is still an issue. |
I have a large Java file (4.8MB). If I insert a new { near the start of this file, the run time for the parser becomes very long (5 minutes or so) - while the parser is running the editor is unusable. Is there some way to prevent this from happening? By parser I mean the code that determines the coloration of the keywords, strings, etc. Thanks!