-
-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix percent escape not working when not at the beginning of the line #383
Conversation
Hi, @zzzeek can you please take a look? The fix passed all the tests on my local machine。 |
this does change the output of rendering so I'm a little concerned about old templates relying on the broken behavior. I can just put it out there as 1.3.1 and see what happens |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, this is sqla-tester setting up my work on behalf of zzzeek to try to get revision 773b6bd of this pull request into gerrit so we can run tests and reviews and stuff
New Gerrit review created for change 773b6bd: https://gerrit.sqlalchemy.org/c/sqlalchemy/mako/+/5111 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some simplifications to the regex, can you try what I suggested ? the tests all pass
mako/lexer.py
Outdated
@@ -357,10 +357,12 @@ def match_text(self): | |||
r""" | |||
(.*?) # anything, followed by: | |||
( | |||
(?<=\n)(?=[ \t]*(?=%|\#\#)) # an eval or line-based | |||
((?<=\n)|^)(?=[ \t]*(?=%|\#\#)) # an eval or line-based |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i dont think the |^
is needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also we can remove the "%" from this line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review ! I think the '%' here cannot be removed. Doing so would cause the regex in match_text
to fail to consume the newline character '\n' in cases such as \n %for
. Removal would result in the match_control_line
's regular expression failing to recognize these control keywords.
Lines 424 to 429 in dc66614
def match_control_line(self): | |
match = self.match( | |
r"(?<=^)[\t ]*(%(?!%)|##)[\t ]*((?:(?:\\\r?\n)|[^\r\n])*)" | |
r"(?:\r?\n|\Z)", | |
re.M, | |
) |
mako/lexer.py
Outdated
# comment preceded by a | ||
# consumed newline and whitespace | ||
| | ||
(?<!%)(?=%%+) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can read:
(?<!%)(?=%%+) # consume the first percent sign out of a group of percent signs
so the overall block looks like:
match = self.match(
r"""
(.*?) # anything, followed by:
(
(?<=\n)(?=[ \t]*(?=\#\#)) # an eval or line-based
# comment, preceded by a
# consumed newline and whitespace
|
(?<!%)(?=%%+) # consume the first percent sign out of a group of percent signs
|
(?=\${) # an expression
|
(?=</?[%&]) # a substitution or block or call start or end
# - don't consume
|
(\\\r?\n) # an escaped newline - throw away
|
\Z # end of string
)""",
re.X | re.S,
)
At the same time, I found that the current fix is incorrect and generates results that are missing spaces in some escape percent cases. case: from mako.template import Template
# 4 spaces before %
template = """
%%%% do something
"""
print(Template(template).render()) result only has 3 spaces before "%":
|
This regex block fixes the issues mentioned above and works correctly. The regex block now: match = self.match(
r"""
(.*?) # anything, followed by:
(
(?<=\n)(?=[ \t]*(?=%(?!%)|\#\#)) # an eval or line-based
# comment preceded by a
# consumed newline and whitespace
|
(?<!%)(?=%%+) # consume the first percent sign out of a group of percent signs
|
(?=\${) # an expression
|
(?=</?[%&]) # a substitution or block or call start or end
# - don't consume
|
(\\\r?\n) # an escaped newline - throw away
|
\Z # end of string
)""",
re.X | re.S,
) However, there might be room for improvement, as I am not very familiar with regular expressions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you make those changes that you suggested and then also add a test to confirm the case that you found wasn't working? update here and I can then update the gerrit, thanks
The test code and regex has been updated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, this is sqla-tester setting up my work on behalf of zzzeek to try to get revision 5d3e742 of this pull request into gerrit so we can run tests and reviews and stuff
Patchset 5d3e742 added to existing Gerrit review https://gerrit.sqlalchemy.org/c/sqlalchemy/mako/+/5111 |
you're working at expert level already so consider yourself familiar! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, this is sqla-tester setting up my work on behalf of zzzeek to try to get revision f282029 of this pull request into gerrit so we can run tests and reviews and stuff
Patchset f282029 added to existing Gerrit review https://gerrit.sqlalchemy.org/c/sqlalchemy/mako/+/5111 |
for the formatting, use the tooling we have :
|
758ab44
to
ab8e747
Compare
Thanks, it has been updated! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, this is sqla-tester setting up my work on behalf of zzzeek to try to get revision ab8e747 of this pull request into gerrit so we can run tests and reviews and stuff
Patchset ab8e747 added to existing Gerrit review https://gerrit.sqlalchemy.org/c/sqlalchemy/mako/+/5111 |
Michael Bayer (zzzeek) wrote: thank you! View this in Gerrit at https://gerrit.sqlalchemy.org/c/sqlalchemy/mako/+/5111 |
Gerrit review https://gerrit.sqlalchemy.org/c/sqlalchemy/mako/+/5111 has been merged. Congratulations! :) |
mako 1.3.1 is released. now we'll find out if anyone was relying on two percent signs rendering as two of them |
Hi - I've reverted the change. Can we please try again, adding new tests that test the case in #384
The change needs to be limited to only percent signs as the first non-whitespace character, not any percent signs. |
Mako 1.3.1 is yanked |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, this is sqla-tester setting up my work on behalf of zzzeek to try to get revision ab8e747 of this pull request into gerrit so we can run tests and reviews and stuff
New Gerrit review created for change ab8e747: https://gerrit.sqlalchemy.org/c/sqlalchemy/mako/+/5141 |
Michael Bayer (zzzeek) wrote: please add tests for double percents in the middle of non-whitespace lines View this in Gerrit at https://gerrit.sqlalchemy.org/c/sqlalchemy/mako/+/5141 |
Oh sorry, I will revise and test this part again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, this is sqla-tester setting up my work on behalf of zzzeek to try to get revision db93097 of this pull request into gerrit so we can run tests and reviews and stuff
Patchset db93097 added to existing Gerrit review https://gerrit.sqlalchemy.org/c/sqlalchemy/mako/+/5141 |
Michael Bayer (zzzeek) wrote: wow a whole new method. OK. I dont have a lot of time today so ill try to look more closely at this soon. View this in Gerrit at https://gerrit.sqlalchemy.org/c/sqlalchemy/mako/+/5141 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi , here's a new patch that preserves the line numbers more accurately and produces fewer artifacts....can you test this and use this for the patch if it's OK?
diff --git a/mako/lexer.py b/mako/lexer.py
index 64ad491..15975b4 100644
--- a/mako/lexer.py
+++ b/mako/lexer.py
@@ -355,23 +355,11 @@ class Lexer:
return True
def match_percent(self):
- match = self.match(r"([^%]*?)%(%+)", re.S)
+ match = self.match(r"(?<=^)(\s*)%%(%*)", re.M)
if match:
- text = match.group(1)
- if text:
- self.append_node(parsetree.Text, text)
- for char in text[::-1]: # Look back, check wheither '%'
- # is the first non-whitespace character in a line
- if char == "\n":
- break
- elif char in ("\t", " "):
- continue
- else:
- self.append_node(parsetree.Text, "%")
- break
- self.append_node(parsetree.Text, match.group(2))
- else:
- self.append_node(parsetree.Text, "%")
+ self.append_node(
+ parsetree.Text, match.group(1) + "%" + match.group(2)
+ )
return True
else:
return False
diff --git a/test/test_lexer.py b/test/test_lexer.py
index 5bf148d..362ac70 100644
--- a/test/test_lexer.py
+++ b/test/test_lexer.py
@@ -200,10 +200,9 @@ class LexerTest(TemplateTest):
TemplateNode(
{},
[
- Text("\n\n", (1, 1)),
- Text("%", (1, 1)),
- Text(" some whatever.\n\n ", (3, 3)),
- Text("%", (3, 3)),
+ Text("\n\n%", (1, 1)),
+ Text(" some whatever.\n\n", (3, 3)),
+ Text(" %", (5, 1)),
Text(" more some whatever\n", (5, 7)),
ControlLine("if", "if foo:", False, (6, 1)),
ControlLine("if", "endif", True, (7, 1)),
@@ -226,9 +225,9 @@ if <some condition>:
[
Text("%", (1, 1)),
Text(" do something\n", (1, 3)),
- Text("%%", (1, 3)),
- Text(" do something\nif <some condition>:\n ", (2, 4)),
- Text("%%%", (2, 4)),
+ Text("%%", (2, 1)),
+ Text(" do something\nif <some condition>:\n", (2, 4)),
+ Text(" %%%", (4, 1)),
Text(" do something\n ", (4, 9)),
],
),
@@ -248,8 +247,7 @@ if <some condition>:
[
Text("\n", (1, 1)),
ControlLine("for", "for i in [1, 2, 3]:", False, (2, 1)),
- Text(" ", (3, 1)),
- Text("%", (3, 1)),
+ Text(" %", (3, 1)),
Text(" do something ", (3, 7)),
Expression("i", [], (3, 21)),
Text("\n", (3, 25)),
@@ -269,12 +267,8 @@ bar %% baz
TemplateNode(
{},
[
- Text("\n", (1, 1)),
- Text("%", (1, 1)),
- Text(" foo\nbar ", (2, 3)),
- Text("%", (2, 3)),
- Text("%", (2, 3)),
- Text(" baz\n", (3, 7)),
+ Text("\n%", (1, 1)),
+ Text(' foo\nbar %% baz\n', (2, 3))
],
),
)
db93097
to
ff8ce2e
Compare
All has been updated . Thanks for taking the time to review ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, this is sqla-tester setting up my work on behalf of zzzeek to try to get revision ff8ce2e of this pull request into gerrit so we can run tests and reviews and stuff
Patchset ff8ce2e added to existing Gerrit review https://gerrit.sqlalchemy.org/c/sqlalchemy/mako/+/5141 |
Michael Bayer (zzzeek) wrote: thank you again! we try again. can keep yanking/reverting til it works :) View this in Gerrit at https://gerrit.sqlalchemy.org/c/sqlalchemy/mako/+/5141 |
Gerrit review https://gerrit.sqlalchemy.org/c/sqlalchemy/mako/+/5141 has been merged. Congratulations! :) |
That positive lookbehind appears to do nothing. Your tests still pass if you use just |
Ahh, it seems that there is no difference between the two in MULTILINE mode, if the positive lookbehind will affect performance, maybe it should be changed here. |
I believe the positive lookbehind does not get exercised because a pair of percent signs preceded by characters on a line would have been consumed by the "text" regexp. However, if that were not the case, then this regexp would still need to confirm the percent signs occur subsequent to the beginning of a line with only whitespace in between.
it would be negligible, and in fact the lookbehind version is slightly faster (note this is one million calls to the match):
the overhead of a new python call to "match_percent()" is going to be much more significant than the reg itself, and this is fine. compilation performance is not really at stake here these are all minimal changes.
definitely not |
Fixes: #323
The
Lexer
now will generate wrong result when there is no\n
before%%
.Case:
The result before fix:
The result after fix: