Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIMILAR TO does not return result when invalid pattern is used (with two adjacent special characters that should be escaped but aren't) [CORE5931] #6188

Closed
firebird-issue-importer opened this issue Oct 1, 2018 · 9 comments

Comments

@firebird-issue-importer
Copy link

@firebird-issue-importer firebird-issue-importer commented Oct 1, 2018

Submitted by: @pavel-zotov

Attachments:
difference-on-patterns-that-leads-to-POOR-SIMILAR-TO.png
firebird_1820.20181001_012003.stack_trc.txt.7z
firebird_1820.20181001_012034.stack_trc.txt.7z

(yes, i know about tickets CORE3858, CORE4893 & CORE5664 that also relate to SIMILAR TO performance; perhaps this new sample can help to solve them so i decided to create separate ticket).

Consider following command:

set count on;
set list on;
select 1 from rdb$database where
'
group by
f01,
f02,
f03,
f04,
f05,
f06,
f07,
f08,
f09,
f10,
f11,
f12,
f13,
'
similar to
'%group[[:WHITESPACE:]]+by[[:WHITESPACE:]]+([[:ALNUM:]])+([[:WHITESPACE:]]*,[[:WHITESPACE:]]*[[:ALNUM:]]+){12,}[[:WHITESPACE:]]*%'
;

It will issue result instantly:

CONSTANT 1

Records affected: 1

Now let's change pattern by adding characters "|" and "_" (PIPE and UNDERSCORE) after 1st ALNUM, but with missed escaping (and this means that pattern will be INVALID):

select 1 from rdb$database where
'
group by
f01,
f02,
f03,
f04,
f05,
f06,
f07,
f08,
f09,
f10,
f11,
f12,
f13
'
similar to
'%group[[:WHITESPACE:]]+by[[:WHITESPACE:]]+([[:ALNUM:]]|_)+([[:WHITESPACE:]]*,[[:WHITESPACE:]]*[[:ALNUM:]]+){12,}[[:WHITESPACE:]]*%'
;

This statement will HANGS on 2.5, 3.0 and 4.0.
Difference between patterns - see in attached .pgn
Two stack traces (for WI-V3.0.4.33054 ) are also attached.

NB: This trouble with SIMILAR TO seems to be CRITICAL: no new attachment can be established while engine "does something inside itself" while working with parsing such statement!

Commits: 28e1874

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Oct 1, 2018

Modified by: @pavel-zotov

Attachment: difference-on-patterns-that-leads-to-POOR-SIMILAR-TO.png [ 13304 ]

Attachment: firebird_1820.20181001_012003.stack_trc.txt.7z [ 13305 ]

Attachment: firebird_1820.20181001_012034.stack_trc.txt.7z [ 13306 ]

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Oct 1, 2018

Commented by: @pavel-zotov

PPS.

Links for two full memory dumps (relates to attached stack traces for WI-V3.0.4.33054):

https://drive.google.com/open?id=1zGWHp0Eakl1pL4SlULqOQk6UvZyhsSV6
https://drive.google.com/open?id=1cs1Ny6tUAjpZW7YncADbL_xM07RHSPt-

And this is link to FB build WI-V3.0.4.33054 with corresponding PDB files:

https://drive.google.com/open?id=1vI-hOn9-OzB33xkfJuKoc49G6zIFvrmm

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Oct 1, 2018

Modified by: @pavel-zotov

description: (yes, i know about tickets CORE3858, CORE4893 & CORE5664 that also relate to SIMILAR TO performance; perhaps this new sample can help to solve them so i decided to create separate ticket).

Consider following command:

set count on;
set list on;
select 1 from rdb$database where
'
group by
f01,
f02,
f03,
f04,
f05,
f06,
f07,
f08,
f09,
f10,
f11,
f12,
f13,
'
similar to
'%group[[:WHITESPACE:]]+by[[:WHITESPACE:]]+([[:ALNUM:]])+([[:WHITESPACE:]]*,[[:WHITESPACE:]]*[[:ALNUM:]]+){12,}[[:WHITESPACE:]]*%'
;

It will issue result instantly:

CONSTANT 1

Records affected: 1

Now let change pattern by adding characters "|" and "_" (PIPE and UNDERSCORE) after 1st ALNUM, but with missed escaping (and this means that pattern will be INVALID):

select 1 from rdb$database where
'
group by
f01,
f02,
f03,
f04,
f05,
f06,
f07,
f08,
f09,
f10,
f11,
f12,
f13
'
similar to
'%group[[:WHITESPACE:]]+by[[:WHITESPACE:]]+([[:ALNUM:]]|_)+([[:WHITESPACE:]]*,[[:WHITESPACE:]]*[[:ALNUM:]]+){12,}[[:WHITESPACE:]]*%'
;

This statement will HANGS on 2.5, 3.0 and 4.0.
Difference between patterns - see in attached .pgn
Two stack traces (for WI-V3.0.4.33054 ) are also attached.

NB: This trouble with SIMILAR TO seems to be CRITICAL: no new attachment can be established while engine "does something inside itself" while working with parsing such statement!

=>

(yes, i know about tickets CORE3858, CORE4893 & CORE5664 that also relate to SIMILAR TO performance; perhaps this new sample can help to solve them so i decided to create separate ticket).

Consider following command:

set count on;
set list on;
select 1 from rdb$database where
'
group by
f01,
f02,
f03,
f04,
f05,
f06,
f07,
f08,
f09,
f10,
f11,
f12,
f13,
'
similar to
'%group[[:WHITESPACE:]]+by[[:WHITESPACE:]]+([[:ALNUM:]])+([[:WHITESPACE:]]*,[[:WHITESPACE:]]*[[:ALNUM:]]+){12,}[[:WHITESPACE:]]*%'
;

It will issue result instantly:

CONSTANT 1

Records affected: 1

Now let's change pattern by adding characters "|" and "_" (PIPE and UNDERSCORE) after 1st ALNUM, but with missed escaping (and this means that pattern will be INVALID):

select 1 from rdb$database where
'
group by
f01,
f02,
f03,
f04,
f05,
f06,
f07,
f08,
f09,
f10,
f11,
f12,
f13
'
similar to
'%group[[:WHITESPACE:]]+by[[:WHITESPACE:]]+([[:ALNUM:]]|_)+([[:WHITESPACE:]]*,[[:WHITESPACE:]]*[[:ALNUM:]]+){12,}[[:WHITESPACE:]]*%'
;

This statement will HANGS on 2.5, 3.0 and 4.0.
Difference between patterns - see in attached .pgn
Two stack traces (for WI-V3.0.4.33054 ) are also attached.

NB: This trouble with SIMILAR TO seems to be CRITICAL: no new attachment can be established while engine "does something inside itself" while working with parsing such statement!

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Oct 1, 2018

Commented by: @mrotteveel

That is not an invalid pattern (or at least: it is a syntactically correct pattern).

That this syntactically correct pattern then leads to a undecidable pattern (where the regex engine is probably stuck backtracking) is in itself an undecidable problem (in this case you probably can, but for more complex patterns, you can't).

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Oct 1, 2018

Commented by: @mrotteveel

Specifically ([[:ALNUM:]]|_)+

Breaks down to:

<regular expression> = <regular term> = <regular factor> = ([[:ALNUM:]]|_)+ (ignoring quantifier in next steps)
<regular primary> = ([[:ALNUM:]]|_)
<regular expression> = [[:ALNUM:]]|_ (splits up in a|b)
a. <regular term> = <regular factor> = <regular primary> <character class> = [[:ALNUM:]]
a. <member> = <predefined class> = [:ALNUM:]
b. <regular term> = <regular factor> = <regular primary> <character class> = _

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Aug 10, 2019

Modified by: @asfernandes

assignee: Adriano dos Santos Fernandes [ asfernandes ]

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Sep 6, 2019

Modified by: @asfernandes

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 4.0 Beta 2 [ 10888 ]

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Sep 8, 2019

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Resolved [ 5 ]

QA Status: No test => Done successfully

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Sep 8, 2019

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Closed [ 6 ]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants