Generic Lexer: StringDelimiter always determined by last GenericToken.String #175

Closed
CP3088 opened this issue May 9, 2020 · 3 comments

CP3088 commented May 9, 2020

Hello again,

Noticed some strange behavior with the LexerBuilder...
Let string A -> `PL_9848` [delimited by backticks]
Let string B -> "Test String"

Given these two Lexemes:

[Lexeme(GenericToken.String, "`", "\\")] PLACE_HOLDER = 99,
[Lexeme(GenericToken.String, "'", "\\")] [Lexeme(GenericToken.String, "\"", "\\")] STR_LITERAL = 100,

A.StringWithoutQuotes = `PL_9848` [the backtick delimiters are not stripped]
A.StringDelimiter = " (34)

B.StringWithoutQuotes = Test String
B.StringDelimiter = " (34)

When swapped:

[Lexeme(GenericToken.String, "'", "\\")] [Lexeme(GenericToken.String, "\"", "\\")] STR_LITERAL = 99,
[Lexeme(GenericToken.String, "`", "\\")] PLACE_HOLDER = 100,

A.StringWithoutQuotes = PL_9848
A.StringDelimiter = ` (96)

B.StringWithoutQuotes = "Test String"
B.StringDelimiter = ` (96)

Sorry it took so long to explain the issue; I was not sure how else to put it.
Eventually I will use the SUB char (26) instead of `.

I looked through the code, but did not see the problem right away.

Thanks,
CP3088

CP3088 commented May 14, 2020

I noticed that the lexer holds the string delimiter globally, and that it is set by the last string lexeme declared. I might be able to patch it, but for now I've found a temporary solution!

When parsing the production rules, you can simply set the token's StringDelimiter yourself:

if (value.TokenID == PineToken.STR_LITERAL && value.Value.Length > 0)
{
    value.StringDelimiter = value.Value[0]; // Hotfix cSly #175
}

This works for a string lexeme which can be delimited by ' or " (Java like)
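
To make the reported behavior easier to reason about, here is a minimal toy model of it. It does not use csly at all: `ToyToken`, `SharedDelimiterLexer` and `PerTokenDelimiterLexer` are hypothetical names invented for this sketch, and it only mirrors the description above (one delimiter shared by the whole lexer, overwritten by each string lexeme). It is not a reproduction of csly's internals, and the actual fix may well look different.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a lexer token; NOT csly's Token<T>.
public sealed class ToyToken
{
    public string Value { get; }
    public char StringDelimiter { get; }

    public ToyToken(string value, char delimiter)
    {
        Value = value;
        StringDelimiter = delimiter;
    }

    // Strip the delimiter only when the raw value really starts and ends with it.
    public string StringWithoutQuotes
    {
        get
        {
            bool wrapped = Value.Length >= 2
                && Value[0] == StringDelimiter
                && Value[Value.Length - 1] == StringDelimiter;
            return wrapped ? Value.Substring(1, Value.Length - 2) : Value;
        }
    }
}

// Models the reported behavior: a single delimiter for the whole lexer.
public sealed class SharedDelimiterLexer
{
    private char delimiter;

    // Each registration overwrites the previous one, so the last one wins.
    public void AddStringLexeme(char d) { delimiter = d; }

    public ToyToken Lex(string raw) { return new ToyToken(raw, delimiter); }
}

// One possible alternative: remember every delimiter, pick it per token.
public sealed class PerTokenDelimiterLexer
{
    private readonly HashSet<char> delimiters = new HashSet<char>();

    public void AddStringLexeme(char d) { delimiters.Add(d); }

    // Assumes raw is a complete string literal that starts with its delimiter.
    public ToyToken Lex(string raw)
    {
        if (raw.Length == 0 || !delimiters.Contains(raw[0]))
            throw new ArgumentException("not a known string literal", nameof(raw));
        return new ToyToken(raw, raw[0]);
    }
}

public static class Program
{
    public static void Main()
    {
        var shared = new SharedDelimiterLexer();
        var perToken = new PerTokenDelimiterLexer();
        foreach (char d in new[] { '`', '\'', '"' }) // '"' is registered last
        {
            shared.AddStringLexeme(d);
            perToken.AddStringLexeme(d);
        }

        // Shared delimiter: the backticks survive because '"' is expected.
        Console.WriteLine(shared.Lex("`PL_9848`").StringWithoutQuotes);
        // Per-token delimiter: the backticks are stripped.
        Console.WriteLine(perToken.Lex("`PL_9848`").StringWithoutQuotes);
    }
}
```

With the shared field, the declaration order of the lexemes decides which delimiter every string token reports, which matches the two orderings shown in the first comment. Choosing the delimiter per token removes that dependency on ordering.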

b3b00 commented May 15, 2020

I am not sure I understand it fully, but I get the idea. Your fix, although it works, is a bit hacky and you should not have to do it: as a general principle, parsing and lexing are two separate concerns, and one must not care about the other.
I will look at it if you can provide some code to test it. A unit test would be perfect.

b3b00 added a commit that referenced this issue May 15, 2020

b3b00 commented May 15, 2020

Issue solved with PR #176. The fix will be available on the dev branch. I would rather not publish a new NuGet package just for such a small fix.

b3b00 closed this as completed May 15, 2020
b3b00 added a commit that referenced this issue May 15, 2020
b3b00 added a commit that referenced this issue May 25, 2020
b3b00 added a commit that referenced this issue May 26, 2020
…om:b3b00/csly into feature/#177-error-messages-and-EOL-tokens

* 'feature/#177-error-messages-and-EOL-tokens' of github.com:b3b00/csly:
  bugfix #175 :  column counter