crash on multiline regexp #1586

Closed
Chayyoo opened this issue Aug 24, 2017 · 16 comments
Labels
bug, confirmed (a developer reproduced this issue, or it affects enough users)

Comments

@Chayyoo

Chayyoo commented Aug 24, 2017

My Geany crashes on the following multiline regexp (while searching in find/replace dialogue):
^(\d+);(\d+);(\d+);SCT\n(.?\n)?\1;47429007;\3

Geany version 1.30.1 (built from source on 2017-06-25 with GTK 2.24.30, GLib 2.48.2)

@codebrainz
Copy link
Member

codebrainz commented Aug 24, 2017

Doesn't crash for me with 1.32 (i.e. from Git) built earlier this month. If it's not a bug that has since been fixed, maybe it depends on the contents of the file?

@elextr added the "can't reproduce" (a developer couldn't reproduce the issue) and "waiting for information" labels Aug 25, 2017
@Chayyoo
Author

Chayyoo commented Aug 25, 2017 via email

@elextr
Member

elextr commented Aug 25, 2017

@Chayyoo then you need to post a gist with a small file where it does happen because, as @codebrainz said, we can't reproduce it with any of our files.

@Chayyoo
Author

Chayyoo commented Aug 28, 2017 via email

@elextr
Member

elextr commented Aug 28, 2017

@Chayyoo that doesn't really help to track it down. Can you run Geany under a debugger and post a backtrace after the crash? (Hope you built it with -g :)

@Chayyoo
Author

Chayyoo commented Aug 28, 2017 via email

@codebrainz
Member

> Can you give me a command to do what you want (something like gdb geany .... I presume)?

See here under "Getting a backtrace": http://www.geany.org/Support/Bugs

@elextr
Member

elextr commented Aug 29, 2017

Note that you may need to keep pressing return to get the whole backtrace.

@Chayyoo
Author

Chayyoo commented Aug 29, 2017 via email

@elextr
Member

elextr commented Aug 29, 2017

As the instructions linked above say, type "run -v" at the gdb prompt.

@Chayyoo
Author

Chayyoo commented Aug 30, 2017 via email

@elextr
Member

elextr commented Aug 30, 2017

> I attach the output of strace (last part) and ltrace (last lines of the file) after the crash. Can that help? I also verified that the crash is not due to a particular pattern in the input file between lines 5500 (where it doesn't crash) and 5600 (where it crashes). Apparently, only size matters.

Actually, either you didn't attach it or GitHub ignored the attachments, but it probably won't help anyway, so don't waste time trying another way. Just follow the instructions to run gdb and get the backtrace:

gdb geany

at the (gdb) prompt:

run -v

do whatever crashes it and after it crashes and returns to the gdb prompt:

bt

and keep pressing return for as long as it prompts you to press return to continue.

Paste the output.

@Chayyoo
Author

Chayyoo commented Aug 30, 2017 via email

@elextr
Member

elextr commented Aug 30, 2017

Ok, that shows that the crash is inside the PCRE library that GLib uses for regular expression handling. But what is the actual crash? Is it a segmentation violation? If it depends on size, are you running out of memory?

@Chayyoo
Author

Chayyoo commented Aug 30, 2017 via email

@b4n
Member

b4n commented Sep 1, 2018

I can reproduce the issue with your sample repeated until it reaches 1010445 lines, and with the pattern ^(\d+?);(\d+?);(\d+?);[BCST]+\n(.*?\n)*\1. The crash seems to be due to excessive recursion (leading to stack overflow) inside libpcre, over which we have no control.
However, your pattern is likely to do this, as you allow for any number of optional lines between two identical lines; this basically means that the regex can capture the whole file, and has to be matched against the whole file if there is no match. It's sad to ever find a program crashing, but in this case there isn't much we can do, and your regular expression is fairly dangerous per se, in terms of performance and memory usage at the very least -- and most distros' libpcre are built to use recursion because it's faster (or so they say), but that can lead to unavoidable crashes in extreme cases like this.
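
To illustrate the recursion point, here is a toy sketch in Python. It is not libpcre's actual code, and the function name `last_match_index` and the inputs are made up for illustration. Like a greedy `(.*?\n)*` followed by a backreference, it consumes one line per call before trying the tail, so the call depth grows with the number of lines. Python stops this with a `RecursionError`; a recursion-based C engine would overflow its stack instead.

```python
def last_match_index(lines, target, i=0):
    """Toy greedy matcher: skip as many lines as possible, then require a
    line starting with `target`. Uses one stack frame per line skipped."""
    if i < len(lines):
        # Greedy: first try to consume one more line (recursing deeper).
        found = last_match_index(lines, target, i + 1)
        if found is not None:
            return found
        # Backtrack: try to match the "backreference" at this line instead.
        if lines[i].startswith(target):
            return i
    return None


print(last_match_index(["foo", "1;x", "bar", "1;y"], "1"))  # 3

try:
    # No line matches, so every line is visited at full recursion depth.
    last_match_index(["x"] * 100_000, "1")
except RecursionError:
    print("recursion limit hit")
```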

BTW, I'm not sure it's what you want, but if you choose to be ungreedy on the number of lines allowed between the start and end, it is a lot less likely to crash (^(\d+?);(\d+?);(\d+?);[BCST]+\n(.*?\n)*?\1) -- but it still will if there is no match for one of the first lines.
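
A small sketch of the greedy versus ungreedy difference, using Python's `re` module on a made-up five-line input (the reporter's real file is not shown in this thread). Python's engine is not PCRE, so this only shows which text each pattern matches, not the crash itself.

```python
import re

# Hypothetical miniature input, shaped only to fit the patterns above.
text = "1;2;3;SCT\nfoo\n1;x\nbar\n1;y\n"

# Greedy line count: consumes as many lines as possible, then backtracks.
greedy = re.compile(r"^(\d+?);(\d+?);(\d+?);[BCST]+\n(.*?\n)*\1", re.MULTILINE)
# Ungreedy line count: stops at the first later line that starts with \1.
lazy = re.compile(r"^(\d+?);(\d+?);(\d+?);[BCST]+\n(.*?\n)*?\1", re.MULTILINE)

greedy_match = greedy.search(text)
lazy_match = lazy.search(text)

print(repr(greedy_match.group(0)))  # '1;2;3;SCT\nfoo\n1;x\nbar\n1'
print(repr(lazy_match.group(0)))    # '1;2;3;SCT\nfoo\n1'
```

The greedy form runs to the last line that starts with the captured number, while the ungreedy form stops at the first one, so it only walks far into the file when no nearby line matches.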

IMO, this is a case of "wontfix", both because we can't do anything about it, and it's caused by a pathological match -- which is exactly what is asked for.

@b4n b4n closed this as completed Sep 1, 2018
@b4n added the "bug" and "confirmed" (a developer reproduced this issue, or it affects enough users) labels and removed the "can't reproduce" (a developer couldn't reproduce the issue) and "waiting for information" labels Sep 1, 2018