Multi-line matching with regular expression should be default #1588

Closed
Johndeep opened this issue Aug 24, 2017 · 8 comments

@Johndeep
Johndeep commented Aug 24, 2017

Story:
Besides searching for keywords like variables or unique words in code, one sometimes just wants to search for special characters like "\t" or "\n". One would activate "regular expression" for this, but "\n" still will not be found unless "multi-line matching" is activated.

Suggesting:
Having a separate option for multi-line matching seems rather pointless; it should be the default when searching with regular expressions. This would also simplify the search/replace dialog a bit, unless someone has a use case for the extra option.

@codebrainz
Member

Personally I agree, this trips me up too, I've never found a use for single-line matching. For a bit of the rationale, see the pull request that implemented this.

@vstepaniuk
vstepaniuk commented May 21, 2019

  1. This is the same in almost all regex flavors (for historical reasons), so for consistency I think it's better to keep an available "multi-line" option, rather than to make the user wonder, or to go to the documentation to find it out.
    See also: https://www.regular-expressions.info/dot.html
  2. If you want to search for some non-printing characters, like a tab or a newline, you could just enable the "Use escape sequences" option and use the corresponding escape sequences.
  3. If, besides that, you need the full regex capability, you could just enable the "Use multi-line matching" option once and it will remain enabled until you disable it.
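To illustrate the point in item 1 about the dot, here is a minimal sketch using Python's `re` module. Python is chosen only for illustration and is not the engine Geany uses; the sample text and variable names are made up. It shows the common default where `.` does not match a newline unless a flag is set.

```python
import re

text = "first line\nsecond line"

# By default "." stops at a newline, so the match ends with the first line.
default_match = re.search(r"first.*", text).group(0)

# With re.DOTALL the dot also matches "\n", so the match runs to the end.
dotall_match = re.search(r"first.*", text, re.DOTALL).group(0)

print(repr(default_match))
print(repr(dotall_match))
```

Note that this dot behavior is a property of the regex flavor, which is a separate matter from whether the editor hands the engine one line or the whole document.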

@elextr
Member

elextr commented May 22, 2019

It's not just a regex option: when the multi-line option is cleared, Geany actually searches line by line.

This is also a premature optimization, since it can save having to coalesce the Scintilla buffer, which can result in moving the entire document in memory if the cursor is near the start of the document. Whether that's really an issue requires benchmarking, but it's likely to apply only to very large documents (and what's large depends on your machine).
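A rough sketch of the difference between the two modes, written in Python rather than Geany's C code. Both function names are hypothetical, and it is an assumption here that the line terminator is not part of the text handed to the regex engine in line-by-line mode.

```python
import re

def search_per_line(pattern, text):
    """Hypothetical single-line mode: every line is searched on its own."""
    regex = re.compile(pattern)
    hits = []
    for line_no, line in enumerate(text.split("\n")):
        # Assumption: the newline itself is never part of the subject string.
        for m in regex.finditer(line):
            hits.append((line_no, m.group(0)))
    return hits

def search_whole_buffer(pattern, text):
    """Hypothetical multi-line mode: the whole document is one subject string."""
    return [m.group(0) for m in re.finditer(pattern, text)]

doc = "alpha\nfoobar\nbeta"
per_line_hits = search_per_line(r"\n", doc)
whole_buffer_hits = search_whole_buffer(r"\n", doc)
```

Under these assumptions the per-line search never sees a newline, so `per_line_hits` stays empty, while the whole-buffer search finds both newlines.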

As to which should be the default, myeh.

@Johndeep
Author

Johndeep commented May 24, 2019

Hmm, ah, so the main purpose for single line matching is to save memory, since a 5-6 GB txt file might cause some serious trouble, and multi-line matching is just not as optimized as searching line by line, no?

But again, regexes like "(.+)\nfoobar", or anything with "\n", will fail without multi-line search. Anyway, I see why there are both options. But at least there should be something like a warning that "\n cannot be found in single-line mode" or something, unless someone has a better solution that is just as optimized while still matching every symbol and regex correctly (one less option to care about).
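The kind of warning suggested here could be driven by a simple check on the pattern text. The following is only a heuristic sketch in Python, not anything Geany implements; the function name `pattern_has_newline_escape` is hypothetical, and it only looks for an unescaped `\n` escape, so it would miss other ways of matching a newline (for example `\s` or a negated character class).

```python
import re

# Matches backslash-n where the backslash is not itself escaped:
# r"\n" qualifies, r"\\n" (a literal backslash followed by "n") does not.
_NEWLINE_ESCAPE = re.compile(r"(?<!\\)(?:\\\\)*\\n")

def pattern_has_newline_escape(pattern: str) -> bool:
    """Return True if the regex source contains an unescaped newline escape."""
    return bool(_NEWLINE_ESCAPE.search(pattern))

if pattern_has_newline_escape(r"(.+)\nfoobar"):
    print(r'Warning: "\n" cannot be found in single-line mode')
```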

@elextr
Member

elextr commented May 24, 2019

since a 5-6 GB txt file might cause some serious trouble

Remember Geany is intended to be fairly fast and lightweight, so it's useful on old wimpy laptops, where much less than that is noticeably slow. And people insist on opening log files of hundreds of megabytes and searching them.

But at least there should be something like a warning that "\n cannot be found in single-line mode" or something

See the tooltip when you hover over the option.

@codebrainz
Member

Hmm, ah, so the main purpose for single line matching is to save memory

@Johndeep, I don't think that's the (main) reason, just a side-effect. According to the PR I linked above, where it was implemented:

This is the "simple" version of regular expressions, where it cannot match newlines characters in the middle of the expression. This is the behavior of most CLI tools, like grep or sed. Notably a negative range (like [^s]) won't match a newline.
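The remark about negative ranges can be shown with a small Python sketch. This is an illustration only: Python's `re` stands in for the real engine, and splitting on "\n" stands in for the line-by-line search described earlier in the thread.

```python
import re

doc = "abc\ndef"
negated = re.compile(r"[^s]+")

# Whole-buffer search: "\n" is just another character that is not "s",
# so a single match spans both lines.
whole_buffer = negated.findall(doc)

# Line-by-line search: the newline is never in the subject string,
# so each line produces its own match.
per_line = [m for line in doc.split("\n") for m in negated.findall(line)]

print(whole_buffer)
print(per_line)
```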

@Johndeep
Author

I see, I must have missed that line at the time, while I desperately wanted to match the newline without looking. But I didn't know that this behavior is normal for CLI tools like grep or sed, since many editors like gedit or vim match newlines by default (the behavior I actually expected).

Because this post is quite old already, I had even forgotten about it myself. Since this seems to be a feature rather than an issue, I'll just close this issue for good then. Nevertheless, thank you for the insight into this option, and sorry for triggering this as an issue.

@codebrainz
Member

codebrainz commented May 25, 2019

sorry for triggering this as an issue

No need to apologize, it's a perfectly valid issue, it's just unlikely to be acted upon since changing the default was intentional, for better or worse.
