Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix some issues with painless's strings #22393

Merged
merged 3 commits into from Jan 6, 2017

Conversation

Projects
None yet
3 participants
@nik9000
Copy link
Contributor

commented Dec 30, 2016

  1. Escape sequences we're working. For example \\ is now correctly
    interpreted as \ instead of \\. Same with \' being ' and
    \" being ".
  2. ' delimited strings weren't allowed to contain "s but it looked
    like they were intended to support it. Now they do.
  3. Improves the error message when the script contains an invalid
    escape sequence inside a string to include a list of the valid
    escape sequences.

Closes #22372

// `'` inside of a string delimited with `"` should be ok
assertEquals("test'", exec("\"test'\""));
// `"` inside of a string delimited with `'` should be ok
assertEquals("test\"", exec("'test\"'"));

This comment has been minimized.

Copy link
@nik9000

nik9000 Dec 30, 2016

Author Contributor

This one didn't work without the lexer change. I think we intended it to work.

throw location.createError(new IllegalArgumentException("unexpected character [" + getErrorDisplay(text) + "].", lnvae));
String message = "unexpected character [" + getErrorDisplay(text) + "].";
char firstChar = text.charAt(0);
if ((firstChar == '\'' || firstChar == '"') && text.length() - 2 > 0 && text.charAt(text.length() - 2) == '\\') {

This comment has been minimized.

Copy link
@nik9000

nik9000 Dec 30, 2016

Author Contributor

I thought about using fancy lexer changes to make this more correct or allowing all escapes in the lexer but detecting invalid escapes in the walker. I'm not sure which is better but I'd already written this one so I figured I start with it.

modules/lang-painless/src/main/java/org/elasticsearch/painless/antlr/Walker.java Outdated
boolean doubleQuoted = string.charAt(0) == '"';

// Strip the leading and trailing quotes
string = string.substring(1, ctx.STRING().getText().length() - 1);

This comment has been minimized.

Copy link
@rjernst

rjernst Dec 30, 2016

Member

Can we do all of these things (stripping last char, backslash escape and quote escape) in one pass with a string builder? Right now 3 copies are made of the string.

This comment has been minimized.

Copy link
@nik9000

nik9000 Dec 30, 2016

Author Contributor

We can. I figured the replacements would mostly end up as noops so it wasn't worth it. Especially with this on the compile side. If you'd prefer it I'm happy to do it though.

@nik9000

This comment has been minimized.

Copy link
Contributor Author

commented Jan 2, 2017

@rjernst, I replaced the replaces with walking a string builder. I expect I could save some CPU by copying the strings in longer chunks if possible but I figure this performance is adequate and more readable.

@rjernst

rjernst approved these changes Jan 3, 2017

Copy link
Member

left a comment

LGTM

StringBuilder string = new StringBuilder(ctx.STRING().getText());

// Strip the leading and trailing quotes and replace the escape sequences with their literal equivalents
int src = 1;

This comment has been minimized.

Copy link
@rjernst

rjernst Jan 3, 2017

Member

Add asserts for the leading and trailing quotes?

This comment has been minimized.

Copy link
@nik9000

nik9000 Jan 3, 2017

Author Contributor

Can do.

nik9000 added some commits Dec 30, 2016

Fix some issues with painless's strings
1. Escape sequences we're working. For example `\\` is now correctly
interpreted as `\` instead of `\\`. Same with `\'` being `'` and
`\"` being `"`.
2. `'` delimited strings weren't allowed to contain `"`s but it looked
like they were intended to support it. Now they do.
3. Improves the error message when the script contains an invalid
escape sequence inside a string to include a list of the valid
escape sequences.

Closes #22372

@nik9000 nik9000 force-pushed the nik9000:painless_string_escapes branch to 6648937 Jan 6, 2017

@nik9000 nik9000 merged commit f24ca51 into elastic:master Jan 6, 2017

1 of 2 checks passed

elasticsearch-ci Build started sha1 is merged.
Details
CLA Commit author has signed the CLA
Details

@nik9000 nik9000 added v5.3.0 and removed v5.2.0 labels Jan 6, 2017

@nik9000

This comment has been minimized.

Copy link
Contributor Author

commented Jan 6, 2017

Thanks for reviewing @rjernst!

master: f24ca51
5.x: 55aed74

nik9000 added a commit that referenced this pull request Jan 6, 2017

Fix some issues with painless's strings (#22393)
1. Escape sequences we're working. For example `\\` is now correctly
interpreted as `\` instead of `\\`. Same with `\'` being `'` and
`\"` being `"`.
2. `'` delimited strings weren't allowed to contain `"`s but it looked
like they were intended to support it. Now they do.
3. Improves the error message when the script contains an invalid
escape sequence inside a string to include a list of the valid
escape sequences.

Closes #22372
@clintongormley

This comment has been minimized.

Copy link
Member

commented Jan 9, 2017

@nik9000 shouldn't this be backported to 5.2?

@nik9000

This comment has been minimized.

Copy link
Contributor Author

commented Jan 9, 2017

@clintongormley, probably. I'm not sure why I didn't do it automatically. The workaround I mention in
#22372 works but it isn't great. I'll backport.

nik9000 added a commit that referenced this pull request Jan 9, 2017

Fix some issues with painless's strings (#22393)
1. Escape sequences we're working. For example `\\` is now correctly
interpreted as `\` instead of `\\`. Same with `\'` being `'` and
`\"` being `"`.
2. `'` delimited strings weren't allowed to contain `"`s but it looked
like they were intended to support it. Now they do.
3. Improves the error message when the script contains an invalid
escape sequence inside a string to include a list of the valid
escape sequences.

Closes #22372
@nik9000

This comment has been minimized.

Copy link
Contributor Author

commented Jan 9, 2017

5.2: 3a80f74

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.