Parse whitespace more precisely #1483

sjakobi · 2019-10-28T16:38:05Z

This is preparatory work for #1454.

sjakobi · 2019-10-28T16:39:56Z

So far, the parser is still quite broken. For example:

      ./dhall-lang/tests/parser/success/unit/MergeXYZ:                        FAIL
        Exception: 
        Error: Invalid input
        
        1:9:
          |
        1 | merge x y z
          |         ^^
        unexpected "y "
        expecting operator or whitespace
      ./dhall-lang/tests/parser/success/unit/MergeParenAnnotation:            FAIL
        Exception: 
        Error: Invalid input
        
        1:10:
          |
        1 | (merge x y) : t
          |          ^^
        unexpected "y)"
        expecting operator or whitespace
      ./dhall-lang/tests/parser/success/unit/Merge:                           FAIL
        Exception: 
        Error: Invalid input
        
        1:9:
          |
        1 | merge x y
          |         ^^
        unexpected "y<newline>"
        expecting operator or whitespace
      ./dhall-lang/tests/parser/success/unit/ListLitEmptyPrecedence:          FAIL
        Exception: 
        Error: Invalid input
        
        1:11:
          |
        1 | [] : List T U
          |           ^
        unexpected 'T'
        expecting ->, :, end of input, operator, or whitespace
      ./dhall-lang/tests/parser/success/unit/ListLitNonEmptyAnnotated:        FAIL
        Exception: 
        Error: Invalid input
        
        1:15:
          |
        1 | [x, y] : List T
          |               ^
        unexpected 'T'
        expecting ->, :, end of input, operator, or whitespace

Gabriella439 · 2019-10-29T03:13:40Z

@sjakobi: I pushed a change with some fixes that illustrates what went wrong. Many of the remaining fixes involve changing the import parser so that it no longer consumes trailing whitespace.
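For readers following along, here is a minimal sketch of why a parser that swallows trailing whitespace can break a later "whitespace required here" check. It is not Dhall's actual parser: the `Parser` type and the names `labelLexeme`, `labelExact`, `whitespace1` and `pair` are hypothetical and exist only for illustration.

```haskell
module Main where

import Data.Char (isAlpha, isSpace)

-- Toy parser type (an assumption for this sketch, not Dhall's real one):
-- consume a prefix of the input and return a value plus the remaining input.
type Parser a = String -> Maybe (a, String)

-- A label that also swallows trailing whitespace (lexeme style).
labelLexeme :: Parser String
labelLexeme s = case span isAlpha s of
  ([], _)      -> Nothing
  (name, rest) -> Just (name, dropWhile isSpace rest)

-- A label that stops right after its last character.
labelExact :: Parser String
labelExact s = case span isAlpha s of
  ([], _)      -> Nothing
  (name, rest) -> Just (name, rest)

-- Mandatory whitespace between two tokens.
whitespace1 :: Parser ()
whitespace1 s = case span isSpace s of
  ([], _)   -> Nothing
  (_, rest) -> Just ((), rest)

-- Parse `<label> <mandatory whitespace> <label>` with the given label parser.
pair :: Parser String -> Parser (String, String)
pair label s = do
  (a, s1)  <- label s
  ((), s2) <- whitespace1 s1
  (b, s3)  <- label s2
  Just ((a, b), s3)

main :: IO ()
main = do
  -- The lexeme-style label already ate the space, so whitespace1 fails.
  print (pair labelLexeme "x y")  -- Nothing
  -- The exact label leaves the space for whitespace1 to consume.
  print (pair labelExact "x y")   -- Just (("x","y"),"")
```

Once the grammar spells out whitespace explicitly, every sub-parser has to agree on who consumes it, which is presumably why the import parser needs the same treatment.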

sjakobi · 2019-10-31T18:53:01Z

Thanks a lot @Gabriel439! :)

This is the last parser test failure:

      ./dhall-lang/tests/parser/success/largeExpression:                                     FAIL
        Exception: 
        Error: Invalid input
        
        67:3:
           |
        67 |   (   λ ( x
           |   ^
        unexpected '('
        expecting end of input or whitespace

There are a bunch of other test failures remaining though.

sjakobi · 2019-10-31T19:20:55Z

CI should turn green now. I'll do some cleanup.

sjakobi · 2019-10-31T20:07:13Z

A few test failures in dhall-lsp-server:

  Completion
    Dhall.Completion
      suggests user defined types:     FAIL (0.03s)
        expected: "Config"
         but got: "toMap"
      suggests user defined functions: FAIL (0.03s)
        expected: "makeUser"
         but got: "toMap"
      suggests user defined bindings:  FAIL (0.03s)
        expected: "bob"
         but got: "toMap"
      suggests functions from imports: FAIL (0.03s)
        uncaught exception: ErrorCall
        Prelude.head: empty list
  Hovering
    Dhall.Hover
      reports types on hover:          FAIL (0.03s)
        expected: "{ home : Text, name : Text }"
         but got: "Type"

sjakobi · 2019-10-31T22:52:11Z

dhall/src/Dhall/Parser/Expression.hs

            _arrow
+            whitespace


I wonder whether we should add aliases whsp and whsp1 to reduce the noise a bit.
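A rough sketch of what such aliases could look like, using a toy parser type rather than Dhall's real one. Only the names `whsp` and `whsp1` come from the suggestion above; `nonemptyWhitespace` and the `Parser` type are assumptions for this example, and the real whitespace parsers would also have to handle comments.

```haskell
module Main where

import Data.Char (isSpace)

-- Toy stand-in for a parser of (): returns the remaining input, or fails.
type Parser = String -> Maybe String

-- Zero or more whitespace characters; never fails.
whitespace :: Parser
whitespace = Just . dropWhile isSpace

-- One or more whitespace characters; fails if none are present.
nonemptyWhitespace :: Parser
nonemptyWhitespace s = case span isSpace s of
  ([], _)   -> Nothing
  (_, rest) -> Just rest

-- The proposed short aliases.
whsp, whsp1 :: Parser
whsp  = whitespace
whsp1 = nonemptyWhitespace

main :: IO ()
main = do
  print (whsp  "x")    -- Just "x"
  print (whsp1 "x")    -- Nothing
  print (whsp1 "  x")  -- Just "x"
```

The aliases add no behaviour; they only shorten call sites where whitespace now has to be spelled out after nearly every token.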

sjakobi · 2019-11-01T17:01:42Z

@EggBaconAndSpam Do you have some advice on how to best update dhall-lsp-server? Should I attempt to fix the parsers in Dhall.LSP.Backend.Parsing in analogy to the changes in Dhall.Parser.Expression or could I potentially completely get rid of them now?

sjakobi · 2019-11-01T21:17:29Z

It took me way too long to figure out why my changes to dhall-lsp-server wouldn't change the test results from stack test dhall-lsp-server:tests:

dhall-lsp-server:tests is an integration test suite that depends on the executable dhall-lsp-server.
While stack happily reports

Installing executable dhall-lsp-server in <path>

every time I change the library, it will only re-compile the executable if it's one of the targets!

So you have to include the executable in the targets and run

stack test dhall-lsp-server:tests dhall-lsp-server:dhall-lsp-server

😱 😱 😱

sjakobi · 2019-11-01T22:56:45Z

I've now modified dhall-lsp-server just enough that the tests pass. It is very likely still broken for other inputs though.


sjakobi · 2019-11-02T03:40:43Z

While looking at dhall-lsp-server's tests I was surprised by this expectation:

dhall-haskell/dhall-lsp-server/tests/Main.hs

Line 40 in 1b46f18

getValue functionContent `shouldBe` "{ home : Text, name : Text }"

My understanding is that this should be the type of mkUser here:

dhall-haskell/dhall-lsp-server/tests/fixtures/hovering/Types.dhall

Lines 3 to 9 in 1b46f18

    
    let mkUser =
            λ(_isAdmin : Bool)
          →       if _isAdmin
            then  { name = "admin", home = "/home/admin" }
            else  { name = "default", home = "/home/user" }

Shouldn't the type be

Bool -> { home : Text, name : Text }

then?

(CC @mujx)


Gabriella439 · 2019-11-02T18:45:43Z

@sjakobi: It's a bug in dhall-lsp-server. I can reproduce:

sjakobi · 2019-11-03T17:32:29Z

@Gabriel439 I have made a new issue for this: #1510

Would you mind giving this a final review, so we can get this merged?

Gabriella439 · 2019-11-03T18:05:55Z

dhall/tests/Dhall/Test/Parser.hs

-              parseDirectory </> "failure/unit/ImportEnvWrongEscape.dhall"
-
-              -- Other spacing related unexpected successes:
-            , parseDirectory </> "failure/spacing/AnnotationNoSpace.dhall"


It's great that these are finally fixed! 🙂


sjakobi · 2019-11-03T19:18:37Z

I remembered that we should look at the performance impact:

I've benchmarked dhall resolve --immediate-dependencies for cpkg's pkg-set.dhall:

`master`

$ bench "dhall resolve --immediate-dependencies --file pkgs/pkg-set.dhall"
benchmarking dhall resolve --immediate-dependencies --file pkgs/pkg-set.dhall
time                 154.7 ms   (151.9 ms .. 156.1 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 155.9 ms   (154.9 ms .. 156.7 ms)
std dev              1.217 ms   (701.7 μs .. 1.891 ms)
variance introduced by outliers: 12% (moderately inflated)

This branch

$ bench "dhall resolve --immediate-dependencies --file pkgs/pkg-set.dhall"
benchmarking dhall resolve --immediate-dependencies --file pkgs/pkg-set.dhall
time                 232.9 ms   (226.7 ms .. 246.0 ms)
                     0.999 R²   (0.993 R² .. 1.000 R²)
mean                 228.1 ms   (225.2 ms .. 233.7 ms)
std dev              4.848 ms   (1.233 ms .. 6.609 ms)
variance introduced by outliers: 14% (moderately inflated)

This is an increase of 50.5% (154.7 ms to 232.9 ms)!

In the micro-benchmarks I've noticed particular increases for the comment parsing:

dhall-haskell/dhall/benchmark/parser/Main.hs

Lines 95 to 96 in cc1814b

    
    , benchExprFromText "Line comment" ("x -- " <> T.replicate 1000000 " ")
    , benchExprFromText "Block comment" ("x {- " <> T.replicate 1000000 " " <> "-}")

master

benchmarked Line comment
time                 11.86 ms   (11.69 ms .. 11.98 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 11.84 ms   (11.79 ms .. 11.89 ms)
std dev              129.4 μs   (107.2 μs .. 164.1 μs)

benchmarked Block comment
time                 13.20 ms   (13.00 ms .. 13.41 ms)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 13.59 ms   (13.41 ms .. 13.94 ms)
std dev              600.0 μs   (142.2 μs .. 953.7 μs)
variance introduced by outliers: 15% (moderately inflated)

This branch

benchmarked Line comment
time                 228.6 ms   (225.6 ms .. 231.1 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 230.3 ms   (228.9 ms .. 233.8 ms)
std dev              3.631 ms   (1.668 ms .. 5.820 ms)

benchmarking Block comment ... took 14.31 s, total 56 iterations
benchmarked Block comment
time                 260.9 ms   (234.1 ms .. 298.2 ms)
                     0.976 R²   (0.940 R² .. 0.997 R²)
mean                 255.3 ms   (243.2 ms .. 273.0 ms)
std dev              25.54 ms   (17.55 ms .. 35.68 ms)
variance introduced by outliers: 29% (moderately inflated)

This got about 20x slower!

Gabriella439 · 2019-11-03T19:28:17Z

@sjakobi: It is slower because of backtracking. After the parser consumes the x, there are several candidate whitespace parsers guarded by try (I'm guessing roughly 20 of them!), so it has to attempt all of them before it gives up and falls through to the final whitespace after completeExpression.

This can be fixed by left-factoring the parsers so that the whitespace parsers no longer need to be wrapped in try, although that will be a little tricky since there isn't really a systematic way to do it.
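To make the cost concrete, here is a toy model of the situation described above. It is only a sketch: the `Parser` type, `orElse`, and the list of `suffixes` are made up for illustration and are not the Dhall or megaparsec definitions. The parser counts how many whitespace characters it scans, so work repeated by backtracking stays visible even when a branch fails.

```haskell
module Main where

import Data.Char (isSpace)

-- Toy parser (hypothetical): besides the usual result it reports how many
-- whitespace characters were scanned, including in branches that failed.
newtype Parser a = Parser { runParser :: String -> (Int, Maybe (a, String)) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s ->
    let (w, r) = p s in (w, fmap (\(a, rest) -> (f a, rest)) r)

instance Applicative Parser where
  pure a = Parser $ \s -> (0, Just (a, s))
  Parser pf <*> Parser pa = Parser $ \s ->
    case pf s of
      (w, Nothing) -> (w, Nothing)
      (w, Just (f, rest)) ->
        let (w', r) = pa rest
        in (w + w', fmap (\(a, rest') -> (f a, rest')) r)

-- Backtracking choice, analogous to `try p <|> q`: on failure, restart q
-- from the original input and add up the work done by both branches.
orElse :: Parser a -> Parser a -> Parser a
orElse (Parser p) (Parser q) = Parser $ \s ->
  case p s of
    (w, Nothing) -> let (w', r) = q s in (w + w', r)
    ok           -> ok

whitespace :: Parser ()
whitespace = Parser $ \s ->
  let (ws, rest) = span isSpace s in (length ws, Just ((), rest))

keyword :: String -> Parser String
keyword k = Parser $ \s ->
  case splitAt (length k) s of
    (pre, rest) | pre == k -> (0, Just (k, rest))
    _                      -> (0, Nothing)

endOfInput :: Parser ()
endOfInput = Parser $ \s -> if null s then (0, Just ((), s)) else (0, Nothing)

-- Hypothetical tokens that might follow an expression.
suffixes :: [String]
suffixes = ["->", ":", "//", "&&", "||"]

-- Unfactored: every alternative re-parses the leading whitespace.
unfactored :: Parser String
unfactored =
  foldr orElse (whitespace *> ("" <$ endOfInput))
    [ whitespace *> keyword k | k <- suffixes ]

-- Left-factored: whitespace is parsed once, then the alternatives are tried.
factored :: Parser String
factored =
  whitespace *> foldr orElse ("" <$ endOfInput) (map keyword suffixes)

main :: IO ()
main = do
  let input = replicate 1000 ' '
  print (fst (runParser unfactored input))  -- 6000
  print (fst (runParser factored input))    -- 1000
```

With five failing alternatives the unfactored version scans the same 1000 spaces six times; with roughly 20 alternatives the factor would be correspondingly larger, which is in line with the slowdown measured on the comment benchmarks.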

sjakobi · 2019-11-03T19:37:47Z

Right. Is this good to merge anyways or do you want to try left-factoring it first? I suspect that I won't be of much help with the parser performance. :/

Gabriella439 · 2019-11-03T19:38:37Z

@sjakobi: You can go ahead and merge. I can work on the parsing performance


* Partially fix whitespace parsing performance regression

  This undoes some of the performance regression introduced in #1483

  Before #1483:

  ```
  benchmarked Line comment
  time                 11.86 ms   (11.69 ms .. 11.98 ms)
                       0.999 R²   (0.999 R² .. 1.000 R²)
  mean                 11.84 ms   (11.79 ms .. 11.89 ms)
  std dev              129.4 μs   (107.2 μs .. 164.1 μs)

  benchmarked Block comment
  time                 13.20 ms   (13.00 ms .. 13.41 ms)
                       0.999 R²   (0.998 R² .. 1.000 R²)
  mean                 13.59 ms   (13.41 ms .. 13.94 ms)
  std dev              600.0 μs   (142.2 μs .. 953.7 μs)
  ```

  After #1483:

  ```
  benchmarked Line comment
  time                 288.7 ms   (282.8 ms .. 294.7 ms)
                       1.000 R²   (0.999 R² .. 1.000 R²)
  mean                 292.3 ms   (290.8 ms .. 294.6 ms)
  std dev              3.156 ms   (2.216 ms .. 4.546 ms)

  benchmarked Block comment
  time                 286.2 ms   (280.9 ms .. 292.6 ms)
                       0.999 R²   (0.998 R² .. 1.000 R²)
  mean                 290.6 ms   (288.3 ms .. 292.9 ms)
  std dev              3.875 ms   (2.866 ms .. 5.500 ms)
  ```

  After this change:

  ```
  benchmarked Line comment
  time                 61.44 ms   (60.37 ms .. 63.03 ms)
                       0.999 R²   (0.997 R² .. 1.000 R²)
  mean                 61.41 ms   (60.74 ms .. 62.25 ms)
  std dev              1.341 ms   (945.0 μs .. 1.901 ms)

  benchmarked Block comment
  time                 61.83 ms   (60.97 ms .. 63.14 ms)
                       0.999 R²   (0.998 R² .. 1.000 R²)
  mean                 61.16 ms   (60.33 ms .. 61.85 ms)
  std dev              1.396 ms   (1.011 ms .. 1.907 ms)
  ```

* Correctly parse `https://example.com usingBla`

  ... as caught by @sjakobi

sjakobi force-pushed the sjakobi/precise-whitespace branch from 1575a46 to 10740cc Compare October 28, 2019 16:43


This was referenced Nov 1, 2019

dhall-format removes comments #145

Open

Hydra should test dhall-lsp-server #1503

Open

sjakobi added a commit that referenced this pull request Nov 2, 2019

dhall-lsp-server: Document the integration tests gotcha

bfc4021

Context: #1483 (comment)

sjakobi mentioned this pull request Nov 2, 2019

dhall-lsp-server: Document the integration tests gotcha #1506

Merged

mergify bot pushed a commit that referenced this pull request Nov 2, 2019

dhall-lsp-server: Document the integration tests gotcha (#1506)

5c0c1f4

Context: #1483 (comment)

Gabriella439 approved these changes Nov 3, 2019


Parse whitespace more precisely

42e7240

This is preparatory work for #1454.

This also fixes some cases where dhall would previously accept malformatted inputs.

The changes to dhall-lsp-server are mostly untested. See #1510.

Co-authored-by: Gabriel Gonzalez <Gabriel439@gmail.com>

sjakobi force-pushed the sjakobi/precise-whitespace branch from 7c622cb to 42e7240 Compare November 3, 2019 18:29

sjakobi changed the title from "WIP: Parse whitespace more precisely" to "Parse whitespace more precisely" Nov 3, 2019

sjakobi added the merge me label Nov 3, 2019

mergify bot merged commit 7eec31d into master Nov 3, 2019

mergify bot deleted the sjakobi/precise-whitespace branch November 3, 2019 19:43

Gabriella439 mentioned this pull request Nov 4, 2019

Partially fix whitespace parsing performance regression #1512

Merged

sjakobi mentioned this pull request Nov 5, 2019

(missing) is parsed as a variable #1454

Closed
