Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fallback to PCRE2 if match whole word used with regexp #82072

Merged
merged 1 commit into from Oct 8, 2019

Conversation

IllusionMH
Copy link
Contributor

Fixes #82071

This PR ensures that --auto-hybrid-regex will be added to Regexp searches regardless of Match Whole Word state.

Changed order of operations a bit to keep related checks and CLI arguments nearby.

@roblourens
Copy link
Member

roblourens commented Oct 8, 2019

Thanks for the PR! But if it is a non-regex whole word search, it won't be added, right? Maybe it should just be duplicated and put next to the --regexp switch to make sure it always gets set for regex searches

@roblourens roblourens added this to the October 2019 milestone Oct 8, 2019
@IllusionMH
Copy link
Contributor Author

But if it is a non-regex whole word search, it won't be added, right?

Correct. If isWordMatch is true but isRegExp is false - it won't be added [1].

Maybe it should just be duplicated and put next to the --regexp switch to make sure it always gets set for regex searches

That was my first intention, but if it is added unconditionally then it will be also applied to all non-regex whole word searches. If condition added - that there will be 2 similar conditions which looks confusing as for me. Therefore I moved args.push('--auto-hybrid-regex'); in check right above blocks where ---regexp will be added.

Example with duplicate + condition.

if (query.isWordMatch) {
	const regexp = createRegExp(query.pattern, !!query.isRegExp, { wholeWord: query.isWordMatch });
	const regexpStr = regexp.source.replace(/\\\//g, '/'); // RegExp.source arbitrarily returns escaped slashes. Search and destroy.
	args.push('--regexp', regexpStr);
	if (query.isRegExp) { // same as below
		args.push('--auto-hybrid-regex');
	}
} else if (query.isRegExp) { // same as above
	let fixedRegexpQuery = fixRegexNewline(query.pattern);
	fixedRegexpQuery = fixNewline(fixedRegexpQuery);
	args.push('--regexp', fixedRegexpQuery);
	args.push('--auto-hybrid-regex');
}

[1] only exception is special case when search query is exactly --. In this case isRegExp will be overridden to true, and --auto-hybrid-regex will be added even if user doesn't use regexp search.

@roblourens
Copy link
Member

then it will be also applied to all non-regex whole word searches

But we convert all whole word searches to regex searches, even if they are not originally regex searches, that is my point.

@IllusionMH
Copy link
Contributor Author

IllusionMH commented Oct 8, 2019

Yes.
It was my assumption that adding --auto-hybrid-regex for cases when it won't be used is undesirable (I would expect that whole word search should always be valid input for non-PCRE2 implementation).

I'll happily update PR to simply duplicate line if it's OK to always include it.

@roblourens
Copy link
Member

I would expect that whole word search should always be valid input for non-PCRE2 implementation

Ok, that's a good point.

@roblourens roblourens merged commit 8f3b739 into microsoft:master Oct 8, 2019
@IllusionMH IllusionMH deleted the pcre2-for-whole-word-82071 branch October 8, 2019 22:34
@IllusionMH
Copy link
Contributor Author

Yes, so it's just about preference. I don't think that it has any performance impact in any of two options (this PR or include for every regexp search).

Thanks for reviewing!

@github-actions github-actions bot locked and limited conversation to collaborators Mar 27, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Find in Files: Lookbehind with Match Whole Word leads to "Regex parse error"
2 participants