Added support for escape character strings to Postgres #1409

Merged: 5 commits, Sep 15, 2021

Conversation

WittierDinosaur
Contributor

Brief summary of the change made

Added support for the different escape characters and string types in Postgres.
Fixes #1069

Are there any other side effects of this change that we should be aware of?

Creating the correct regex may have cost me my sanity.

Pull Request checklist

Test cases added
Added a comment block to explain the regex
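
As a rough illustration of what such an explanation might cover, here is a sketch that spells out the `single_quote` pattern (copied from the script quoted later in this conversation) using `re.VERBOSE`. The inline comments are one reading of the regex, not the comment block the PR actually added, and the names `COMPACT`, `ANNOTATED` and `SAMPLES` are made up for this example.

```python
import re

# Compact form, copied from the script quoted later in this conversation.
COMPACT = re.compile(r"(?s)('')+?(?!')|('.*?(?<!')(?:'')*'(?!'))")

# The same pattern spelled out with re.VERBOSE. The comments are one reading
# of the regex, not the comment block that the PR itself added.
ANNOTATED = re.compile(
    r"""
    ('')+?      # one or more doubled quotes, e.g. the empty string ''
    (?!')       # ... not followed by yet another quote
    |
    (
        '       # opening quote
        .*?     # body, as short as possible
        (?<!')  # the body itself must not end in a quote
        (?:'')* # optional run of doubled (escaped) quotes before the close
        '       # closing quote
        (?!')   # a following quote would mean this one was half of a pair
    )
    """,
    re.DOTALL | re.VERBOSE,  # DOTALL stands in for the inline (?s)
)

# Hypothetical sample inputs: each starts with a string literal.
SAMPLES = ["'' rest", "'abc' rest", "'it''s' rest"]

for sample in SAMPLES:
    compact = COMPACT.match(sample).group(0)
    annotated = ANNOTATED.match(sample).group(0)
    # Both spellings should consume exactly the same prefix.
    print(repr(sample), "->", repr(annotated), compact == annotated)
```

The verbose form trades compactness for readability; whether it is worth using in the dialect file itself is a separate judgment call.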

@codecov

codecov bot commented Sep 15, 2021

Codecov Report

Merging #1409 (11eb9ed) into main (178e2a1) will not change coverage.
The diff coverage is 100.00%.


@@            Coverage Diff            @@
##              main     #1409   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files          127       127           
  Lines         8553      8581   +28     
=========================================
+ Hits          8553      8581   +28     
Impacted Files Coverage Δ
src/sqlfluff/dialects/dialect_postgres.py 100.00% <100.00%> (ø)
src/sqlfluff/rules/L010.py 100.00% <100.00%> (ø)
src/sqlfluff/rules/L011.py 100.00% <0.00%> (ø)
src/sqlfluff/rules/L012.py 100.00% <0.00%> (ø)
src/sqlfluff/rules/L016.py 100.00% <0.00%> (ø)
src/sqlfluff/cli/commands.py 100.00% <0.00%> (ø)
src/sqlfluff/core/rules/config_info.py 100.00% <0.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 398ed74...11eb9ed.

Member

tunetheweb left a comment

Some questions and minor nits, but in general, assuming that horrendous regex is accurate, the change LGTM.

Comment on lines +45 to +49
RegexLexer(
"unicode_single_quote",
r"(?s)U&(('')+?(?!')|('.*?(?<!')(?:'')*'(?!')))(\s*UESCAPE\s*'[^0-9A-Fa-f'+\-\s)]')?",
CodeSegment,
),
Member

Not even going to pretend to understand this regex! But great that you added all the explanation above!

Any concerns about the performance impact of such a complex regex, or about it getting into some infinite loop?

On the plus side, at least it will only impact Postgres, and we have test cases for this...
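
For readers who want to see what the pattern consumes, here is a minimal sketch that runs the `unicode_single_quote` regex from the snippet above against a few hand-written inputs. The helper `lex_prefix` and the sample strings are assumptions made for illustration; they are not part of the PR or its test fixtures.

```python
import re

# Pattern copied verbatim from the RegexLexer definition under review.
UNICODE_SINGLE_QUOTE = re.compile(
    r"(?s)U&(('')+?(?!')|('.*?(?<!')(?:'')*'(?!')))(\s*UESCAPE\s*'[^0-9A-Fa-f'+\-\s)]')?"
)


def lex_prefix(sql: str) -> str:
    """Return the text the pattern consumes at the start of `sql` ('' if none)."""
    match = UNICODE_SINGLE_QUOTE.match(sql)
    return match.group(0) if match else ""


# Hypothetical inputs, not taken from the PR's fixtures.
samples = [
    "U&'d!0061t!+000061' UESCAPE '!' AS x",  # custom escape character '!'
    "U&'x' UESCAPE 'a'",  # 'a' is a hex digit, so the UESCAPE clause is skipped
    "U&'' || 'tail'",  # empty Unicode string
    "'plain'",  # no U& prefix, so no match at all
]

for sample in samples:
    print(repr(sample), "->", repr(lex_prefix(sample)))
```

In this sketch the optional `UESCAPE` group is only consumed when the escape character falls outside the excluded set (hex digits, quote, plus, minus, whitespace), which mirrors the character class in the pattern.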

Member

I ran all these regexes through a tool called dlint, which checks for the possibility of catastrophic backtracking, where a regex runs basically forever.

Reference:

The example script below includes an example regex known to have this issue, along with the 4 regexes from the PR.

import re

subject = 'x' * 64
re.search(r'(x+x+)+y', subject)  # Boom

# "unicode_single_quote",
re.search(r"(?s)U&(('')+?(?!')|('.*?(?<!')(?:'')*'(?!')))(\s*UESCAPE\s*'[^0-9A-Fa-f'+\-\s)]')?", subject)
# "escaped_single_quote",
re.search(r"(?s)E(('')+?(?!')|'.*?((?<!\\)(?:\\\\)*(?<!')(?:'')*|(?<!\\)(?:\\\\)*\\(?<!')(?:'')*')'(?!'))", subject)
# "unicode_double_quote",
re.search(r'(?s)U&".+?"(\s*UESCAPE\s*\'[^0-9A-Fa-f\'+\-\s)]\')?', subject)
# "single_quote"
re.search(r"(?s)('')+?(?!')|('.*?(?<!')(?:'')*'(?!'))", subject)

Here's the command I ran plus the output. Note that only the test regex is flagged:

(sqlfluff-3.7.9) ➜  sqlfluff git:(bhart-issue_845_l016_compute_line_length_before_template_expansion) ✗ python -m flake8 --select=DUO t.py
t.py:4:1: DUO138 catastrophic "re" usage - denial-of-service possible
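
The static check above can be complemented with a rough empirical one. The sketch below times the known-bad pattern and the `single_quote` pattern on small, growing inputs. The input sizes, the unterminated-quote subject, and the helper `timed_search` are assumptions chosen for illustration; a fast run here is only a hint, not proof that no pathological input exists.

```python
import re
import time

# Known-bad pattern from the script above, plus one of the PR's patterns.
KNOWN_BAD = re.compile(r"(x+x+)+y")
SINGLE_QUOTE = re.compile(r"(?s)('')+?(?!')|('.*?(?<!')(?:'')*'(?!'))")


def timed_search(pattern, subject):
    """Run pattern.search(subject) and return (match, elapsed_seconds)."""
    start = time.perf_counter()
    match = pattern.search(subject)
    return match, time.perf_counter() - start


# Sizes are kept small on purpose: the known-bad pattern's running time grows
# very quickly with input length, so large sizes would appear to hang.
for n in (8, 12, 16):
    _, bad_seconds = timed_search(KNOWN_BAD, "x" * n)
    # An unterminated quote forces the lazy body to scan the whole subject.
    _, pr_seconds = timed_search(SINGLE_QUOTE, "'" + "x" * n)
    print(f"n={n:2d}  known-bad: {bad_seconds:.6f}s  single_quote: {pr_seconds:.6f}s")
```

Timings vary by machine, so the numbers are best compared against each other as `n` grows rather than read as absolute values.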

test/fixtures/parser/postgres/postgres_single_quote.sql (review thread outdated and resolved)
tunetheweb merged commit 05a6f18 into sqlfluff:main Sep 15, 2021