Fix for the inconsistent internal parser state bug #54

Open

@pithub pithub commented Nov 14, 2022

This code change fixes the following issues:

and makes the following pull requests superfluous:


As described in issue #53, there's a bug in the Elm.Kernel.Parser.findSubString function
that leads to inconsistent internal parser positions, where:

  • the offset into the source string is positioned before a token
  • the row and column are positioned after the token

This code change fixes that bug, so that both offset and row/column are consistently positioned after the token.
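
To make the inconsistency concrete, here is a standalone JavaScript sketch of the pre-fix behaviour. It is not the kernel code itself: plain arrays stand in for the kernel's tuple helper, the loop that advances row and column is an assumption about the lines not shown in the diff (surrogate pairs are ignored for brevity), and the names findSubStringOld and positionAt are hypothetical.

```javascript
// Sketch of the pre-fix logic. findSubStringOld and positionAt are
// hypothetical names; arrays stand in for the kernel's tuples.
function findSubStringOld(smallString, offset, row, col, bigString) {
  var newOffset = bigString.indexOf(smallString, offset);
  var target = newOffset < 0 ? bigString.length : newOffset + smallString.length;

  // Assumed loop: advance row/col over every character up to `target`,
  // i.e. up to the position AFTER the token (or the end of the string).
  while (offset < target) {
    var code = bigString.charCodeAt(offset++);
    if (code === 0x0A) { col = 1; row++; } else { col++; }
  }

  // Bug: the returned offset points BEFORE the token,
  // while row/col point AFTER it.
  return [newOffset, row, col];
}

// Recompute [row, col] for a given offset by walking the source from the start.
function positionAt(source, offset) {
  var row = 1, col = 1;
  for (var i = 0; i < offset; i++) {
    if (source.charCodeAt(i) === 0x0A) { col = 1; row++; } else { col++; }
  }
  return [row, col];
}

var src = "Is 42 the answer?";
var result = findSubStringOld("42", 0, 1, 1, src);
console.log(result);                     // [ 3, 1, 6 ]
console.log(positionAt(src, result[0])); // [ 1, 4 ]
```

The returned column is 6, but offset 3 actually corresponds to column 4, so the offset and the row/column pair describe two different positions.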


There are two reasons why I chose the after-token position and not the before-token position:

import Parser exposing ((|.), (|=), Parser)

testParser : Parser { row : Int, col : Int, offset : Int }
testParser =
    Parser.succeed (\row col offset -> { row = row, col = col, offset = offset })
        |. Parser.multiComment "{-" "-}" Parser.Nestable
        |= Parser.getRow
        |= Parser.getCol
        |= Parser.getOffset

Parser.run testParser "{- -}"
--> Ok { row = 1, col = 6, offset = 5 }

The real bug fix is on this line:

src/Elm/Kernel/Parser.js
@@ -133 +133 @@ var _Parser_findSubString = F5(function(smallString, offset, row, col, bigString
-       return __Utils_Tuple3(newOffset, row, col);
+       return __Utils_Tuple3(index < 0 ? -1 : target, row, col);

where we either return -1 to signal the "subString not found" case, or the offset after the subString, which is stored in variable target.
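
As a rough illustration of the changed return value, the following standalone JavaScript sketch mirrors the fixed logic. It is a simplification under stated assumptions, not the kernel source: arrays replace the kernel's tuple helper, the row/column loop is assumed from context (surrogate pairs are ignored), and findSubStringFixed is a hypothetical name.

```javascript
// Sketch of the fixed logic. findSubStringFixed is a hypothetical name;
// arrays stand in for the kernel's tuples.
function findSubStringFixed(smallString, offset, row, col, bigString) {
  var index = bigString.indexOf(smallString, offset);
  var target = index < 0 ? bigString.length : index + smallString.length;

  // Assumed loop: advance row/col over every character up to `target`.
  while (offset < target) {
    var code = bigString.charCodeAt(offset++);
    if (code === 0x0A) { col = 1; row++; } else { col++; }
  }

  // Either -1 ("subString not found") or the offset AFTER the subString,
  // which now agrees with row/col.
  return [index < 0 ? -1 : target, row, col];
}

var source = "Is 42 the answer?";

// Found: offset 5 and column 6 both point just past "42".
console.log(findSubStringFixed("42", 0, 1, 1, source)); // [ 5, 1, 6 ]

// Not found when searching from offset 7 (column 8): the offset is -1,
// and row/col have advanced to the end of the source.
console.log(findSubStringFixed("42", 7, 1, 8, source)); // [ -1, 1, 18 ]
```

In the not-found case the caller still receives the end-of-input row and column, so only the offset needs special handling.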

So what about all the other changes?


First, I decided to rename the newOffset variable:

src/Elm/Kernel/Parser.js
@@ -122,2 +122,2 @@ var _Parser_findSubString = F5(function(smallString, offset, row, col, bigString
-       var newOffset = bigString.indexOf(smallString, offset);
-       var target = newOffset < 0 ? bigString.length : newOffset + smallString.length;
+       var index = bigString.indexOf(smallString, offset);
+       var target = index < 0 ? bigString.length : index + smallString.length;

The variable no longer holds the new offset (we now return either -1 or target), so the old name would be misleading. I renamed it to "index" because it holds the result of the indexOf call.


In the Parser.Advanced module, there's a wrapper function for every Kernel function. The comment on the findSubString wrapper was already wrong before this code change (see #37 "Fix comment in findSubString"), and it would still have been wrong after it, so I updated it to document that we return the position after the subString:

src/Parser/Advanced.elm
@@ -1125,7 +1125,7 @@ isAsciiCode =
     findSubString "42" offset row col "Is 42 the answer?"
         --==> (newOffset, newRow, newCol)
 
-If `offset = 0` we would get `(3, 1, 4)`
+If `offset = 0` we would get `(5, 1, 6)`
 If `offset = 7` we would get `(-1, 1, 18)`
 -}
 findSubString : String -> Int -> Int -> Int -> String -> (Int, Int, Int)

The wrapper functions hide from the rest of the module the fact that they are implemented in JavaScript rather than in Elm. The rest of the code uses only the wrapper functions, just as if they had been implemented in Elm.

The only exception to this rule was a line that called the findSubString Kernel function directly. Since we are modifying findSubString itself, I thought it appropriate to change this line too, for consistency with the rest of the code:

src/Parser/Advanced.elm
@@ -913 +913 @@ chompUntilEndOr str =
-        Elm.Kernel.Parser.findSubString str s.offset s.row s.col s.src
+        findSubString str s.offset s.row s.col s.src

@rupertlssmith

If this patch fixes #20, does that mean it should be preferred to #21, the PR that fixes just that issue?

This PR seems more general than that one and fixes a number of issues together.

rupertlssmith pushed a commit to elm-janitor/parser that referenced this pull request Feb 17, 2023
fixes elm#53

Fix bug in Elm.Kernel.Parser.findSubString
@pithub

pithub commented Feb 18, 2023

Hi Rupert, I think you addressed the question to me, but as the author I'm biased, of course. If I didn't prefer this PR to #21, I wouldn't have opened it.
