Word boundary issue with a generic subpattern next to it #1404

Closed
stribizhev opened this issue Jan 25, 2016 · 5 comments

@stribizhev

Description:

The word boundary \b, when used as a BOW (beginning of word) anchor, does not work properly in Notepad++.

Current Behavior:

Using \b\w as a regex pattern, I'd expect Notepad++ to match only h and w in the string hello world. However, every letter is matched.

The issue also reproduces with the following patterns:

\b[a-z]
\b\pL
\<[[:alpha:]]
\b[[:alpha:]]
\b[[:alnum:]]

I suspect the BOW meaning is lost when the word boundary is followed only by a generic subpattern (a character class, a shorthand character class, or a POSIX character class).

If there is a literal symbol following the word boundary, all works as expected:

\bw

highlights w in hello world.

Expected behavior: \b\w should only match h and w in hello world.

Note: The end of word (EOW) meaning works right: \w\b matches o and d in hello world as expected.
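For comparison, here is a minimal sketch of the expected results, using Python's re module as a stand-in reference engine. This is an assumption for illustration only: Notepad++ uses a different regex engine, so the snippet shows what the patterns should match, not what Notepad++ does.

```python
import re

text = "hello world"

# \b\w: a word character that sits right after a word boundary (beginning of word).
bow = [m.group() for m in re.finditer(r"\b\w", text)]

# \w\b: a word character that sits right before a word boundary (end of word).
eow = [m.group() for m in re.finditer(r"\w\b", text)]

# \bw: a literal w at the beginning of a word.
literal = [m.group() for m in re.finditer(r"\bw", text)]

print(bow)      # ['h', 'w']
print(eow)      # ['o', 'd']
print(literal)  # ['w']
```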

Notepad++ version: v.6.8.8, build time: Dec 2015 - 01:11:37

@MetaChuh

we are closing this issue due to multiple newer notepad++ versions since it was filed.

please update your notepad++ to the latest version Notepad++ 7.6.6 and migrate or reinstall your plugins.
if your issue still persists, and it is still of relevance to you, please feel free to file a new issue as detailed and reproducible as possible.

thank you for your help and best regards.


general notes:

please close your issues as soon as you are satisfied, to help
keep the notepad++ issue tracker clean.

when creating a new issue, please provide all information as shown at the issue template.
your notepad++ debug info is part of the minimum requirements.
(? > Debug Info... > Copy debug info into clipboard).

please also conduct a search at the issue tracker beforehand, to check if a similar issue
already exists.

for general questions, or if you are not sure whether your issue is directly related to
the notepad++ source code, please visit us at the notepad++ community forum and
search for related topics. you are welcome to post in a similar
topic, or to create a new topic at Help Wanted or General Discussion.
(no extra account is needed, just use your github account to sign in)


@stribizhev
Author

Tried in Notepad++ v.7.7 - the issue still exists. There is no word boundary between h and e, so why does \b\w find a match at e? Try replacing \b\w with x in the string he: the result is xx instead of the expected xe.
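A quick way to check the claim about the he string, again using Python's re module as an assumed reference engine (not Notepad++ itself): list where \b matches, then run the replacement.

```python
import re

text = "he"

# Positions where \b matches: before h (0) and after e (2), but not between h and e (1).
boundaries = [m.start() for m in re.finditer(r"\b", text)]

# Only the h is preceded by a word boundary, so only h should be replaced.
replaced = re.sub(r"\b\w", "x", text)

print(boundaries)  # [0, 2]
print(replaced)    # xe
```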

@ghost

ghost commented Feb 12, 2020

Issue still present in v7.8.4 64-bit on Windows.

@sasumner
Contributor

Related: #6194

@sasumner
Contributor

@OctaneTwisted This has to do with how multiple searches happen: each search is independent, and its starting point is the ending point of the previous search.
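The explanation above can be modeled with a rough sketch. It is not taken from the Notepad++ source: the function names find_all_sliced and find_all_in_context are hypothetical, and Python's re module stands in for the real engine. The assumption being illustrated is that if each search treats its starting point as the start of the text, the character before that point is invisible to \b, so every start position looks like a word boundary.

```python
import re

def find_all_sliced(pattern, text):
    """Hypothetical model of independent searches: each one sees only text[pos:]."""
    regex = re.compile(pattern)
    matches, pos = [], 0
    while pos <= len(text):
        m = regex.search(text[pos:])  # context before pos is lost
        if m is None:
            break
        matches.append(m.group())
        # m.end() is relative to the slice; step one extra character on an empty match.
        pos += m.end() if m.end() > m.start() else m.end() + 1
    return matches

def find_all_in_context(pattern, text):
    """Same loop, but the engine can still see the character before pos."""
    regex = re.compile(pattern)
    matches, pos = [], 0
    while pos <= len(text):
        m = regex.search(text, pos)  # pos is an offset into the full string
        if m is None:
            break
        matches.append(m.group())
        pos = m.end() if m.end() > m.start() else m.end() + 1
    return matches

sliced = find_all_sliced(r"\b\w", "hello world")
in_context = find_all_in_context(r"\b\w", "hello world")

print(sliced)      # ['h', 'e', 'l', 'l', 'o', 'w', 'o', 'r', 'l', 'd']
print(in_context)  # ['h', 'w']
```

The sliced variant reproduces the reported symptom (every letter matches), while the in-context variant gives the expected h and w. The same model would also explain why \w\b is unaffected: it depends on the character after the match, which a search that starts at the previous end point can still see.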
