[Bug]: Parser breaks on documents with special characters #162

Open
EliotVU opened this issue Feb 10, 2023 · 6 comments
Labels
bug (Something isn't working), help wanted (Extra attention is needed)

Comments

@EliotVU
Owner

EliotVU commented Feb 10, 2023

Describe the bug

When the parser stumbles on special characters it will fail with the following error:

Processing pending document "file:///c%3A/.../Engine/Classes/PlaylistParserBase.uc":1, source:change.
Invalidating document "PlaylistParserBase".
building document PlaylistParserBase
PredictionMode SLL has failed, rolling back to LL.
An error was thrown while parsing document: "file:///c%3A/.../Engine/Classes/PlaylistParserBase.uc" Error: cannot consume EOF
    at UCTokenStream.consume (c:\Projecten\UnrealScriptLang\out\server.js:28905:19)
    at UCParser.skipLine (c:\Projecten\UnrealScriptLang\out\server.js:5417:25)
    at UCParser.directive (c:\Projecten\UnrealScriptLang\out\server.js:5544:22)
    at UCParser.member (c:\Projecten\UnrealScriptLang\out\server.js:5711:30)
    at UCParser.program (c:\Projecten\UnrealScriptLang\out\server.js:5581:42)
    at UCDocument.build (c:\Projecten\UnrealScriptLang\out\server.js:20658:34)
    at indexDocument (c:\Projecten\UnrealScriptLang\out\server.js:24237:18)
    at Object.next (c:\Projecten\UnrealScriptLang\out\server.js:26656:41)

This appears to be caused by the following code, where an unescaped string literal is eventually followed by a hash character:

     ...
     SpecialChars(1)=(Plain=""",Coded="&quot;")
     ...
     SpecialChars(6)=(Plain="�",Coded="&#8482;")

Screenshots

No response

@EliotVU added the bug label Feb 10, 2023
@EliotVU
Owner Author

EliotVU commented Feb 10, 2023

Weird, as far as UT2004 goes, the string is actually escaped:

     SpecialChars(0)=(Plain="&",Coded="&amp;")
     SpecialChars(1)=(Plain="\"",Coded="&quot;")
     SpecialChars(2)=(Plain=" ",Coded="&nbsp;")
     SpecialChars(3)=(Plain="<",Coded="&lt;")
     SpecialChars(4)=(Plain=">",Coded="&gt;")
     SpecialChars(5)=(Plain="©",Coded="&copy;")
     SpecialChars(6)=(Plain="™",Coded="&#8482;")
     SpecialChars(7)=(Plain="®",Coded="&reg;")
     SpecialChars(8)=(Plain="'",Coded="&apos;")

@PolaricEntropy

I had this happen in the following file, that does not seem to contain special characters: DeusExText.zip

@Shtoyan
Contributor

Shtoyan commented Oct 13, 2023

> I had this happen in the following file, that does not seem to contain special characters: DeusExText.zip

Did a quick check out of interest: the extension starts working when you comment out the exec directive line.

@EliotVU
Owner Author

EliotVU commented Oct 13, 2023

@Shtoyan Appending function test(); after the exec line seems to work too. I suppose this is because ANTLR is then able to find an alternative rule to match :|

FYI: This is caused by the hacky code in the ANTLR parser:

skipLine(i = this._input.index): void {

: SHARP { const i = this.getIndex(); } identifier? { this.skipLine(i); }
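For illustration, here is a minimal sketch of how a line-skipping helper could avoid the "cannot consume EOF" error from the stack trace above. It does not use the extension's or ANTLR's real types: `SketchTokenStream`, `SketchToken`, and this version of `skipLine` are hypothetical stand-ins, and the assumption is that the directive is the last thing in the file, so the loop runs into EOF before it sees a token on a new line.

```typescript
// Hypothetical minimal token model; NOT the extension's or ANTLR's real classes.
const EOF = -1;

interface SketchToken {
    type: number;
    line: number;
}

class SketchTokenStream {
    index = 0;
    private tokens: SketchToken[];

    // The token list is assumed to end with a single EOF token.
    constructor(tokens: SketchToken[]) {
        this.tokens = tokens;
    }

    // Token at an absolute index, clamped to the trailing EOF token.
    get(i: number): SketchToken {
        return this.tokens[Math.min(i, this.tokens.length - 1)];
    }

    // Look ahead k tokens (1-based) from the current position.
    LT(k: number): SketchToken {
        return this.get(this.index + k - 1);
    }

    // Mirrors the behaviour seen in the stack trace: consuming EOF throws.
    consume(): void {
        if (this.LT(1).type === EOF) {
            throw new Error('cannot consume EOF');
        }
        this.index++;
    }
}

// Consume the remaining tokens on the line of the token at `start`,
// stopping at EOF instead of trying to consume it.
function skipLine(input: SketchTokenStream, start: number = input.index): number {
    const line = input.get(start).line;
    let skipped = 0;
    while (input.LT(1).type !== EOF && input.LT(1).line === line) {
        input.consume();
        skipped++;
    }
    return skipped;
}
```

Whether a guard like this is enough for the real grammar is untested; it only shows where the exception comes from when a directive is the final line of a document.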

@EliotVU added the help wanted label Oct 13, 2023
@EliotVU
Owner Author

EliotVU commented Oct 13, 2023

Unfortunately I have to admit I'm unable to fix this issue in a way that ANTLR would understand.

A proper solution to this #directive parsing problem would be to switch the lexer's channel once a statement directive has been detected, but as far as I know ANTLR is incapable of this.
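To make the channel idea concrete, here is a rough, hand-written sketch that is independent of ANTLR. Everything in it is hypothetical (`lexWithChannels`, `ChannelToken`, the channel numbers), and it simplifies "statement directive" to "a line whose first non-blank character is `#`". Directive lines become a single token on a separate channel, and the parser would only be fed the default-channel tokens. ANTLR 4 lexer rules do have a `-> channel(...)` command, but whether it can express this context-dependent switch for this grammar is not something this sketch establishes.

```typescript
// Hypothetical channel numbers for this sketch only.
const DEFAULT_CHANNEL = 0;
const DIRECTIVE_CHANNEL = 1;

interface ChannelToken {
    text: string;
    line: number;
    channel: number;
}

// Very small tokenizer: a line starting with '#' becomes one token on the
// directive channel; every other line is split into default-channel tokens.
function lexWithChannels(source: string): ChannelToken[] {
    const tokens: ChannelToken[] = [];
    const lines = source.split(/\r?\n/);
    for (let i = 0; i < lines.length; i++) {
        const content = lines[i].trim();
        if (content.charAt(0) === '#') {
            tokens.push({ text: content, line: i + 1, channel: DIRECTIVE_CHANNEL });
            continue;
        }
        // Escaped string literals, identifiers, integers, or any single symbol.
        const pattern = /"(?:[^"\\]|\\.)*"|[A-Za-z_]\w*|\d+|\S/g;
        let match: RegExpExecArray | null;
        while ((match = pattern.exec(content)) !== null) {
            tokens.push({ text: match[0], line: i + 1, channel: DEFAULT_CHANNEL });
        }
    }
    return tokens;
}

// What a parser would see: default-channel tokens only.
function parserTokens(tokens: ChannelToken[]): ChannelToken[] {
    return tokens.filter((t) => t.channel === DEFAULT_CHANNEL);
}
```

With this split, a `#` inside a properly escaped string such as "&#8482;" stays part of the string token, and the parser never needs a skipLine-style hack because directive text never reaches it.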

@Shtoyan
Contributor

Shtoyan commented Oct 14, 2023

Is there no way to just ignore all #directives during parsing/lexing?
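One way to picture "ignoring" directives is a pre-pass over the source text before it reaches the lexer. The sketch below is an assumption, not something the extension does: the hypothetical `blankDirectiveLines` overwrites every line whose first non-blank character is `#` with spaces, keeping the text length and line breaks unchanged so that token positions in the rest of the document would still line up.

```typescript
// Replace the text of each directive line with spaces of equal length.
// A "directive line" here means: optional indentation, then '#'.
function blankDirectiveLines(source: string): string {
    return source.replace(
        /^([ \t]*)(#[^\r\n]*)/gm,
        (_match: string, indent: string, directive: string) =>
            // '.' never matches a line break, so the line count is preserved.
            indent + directive.replace(/./g, ' ')
    );
}
```

Because only lines that begin with `#` are touched, a hash that appears later in a line (for example inside "&#8482;") is left alone. The obvious cost is that directives would no longer be visible to any later stage that wants to inspect them.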
