Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve formatting of operator chains #830

Merged
merged 1 commit into from
Dec 16, 2021

Conversation

tbagrel1
Copy link
Member

@tbagrel1 tbagrel1 commented Nov 3, 2021

Close #826, close #785, close #690, close #825.

@tbagrel1 tbagrel1 changed the title Use scrapped fixity info to reassociate OpTree Improve formatting of operator chains Nov 3, 2021
@tbagrel1 tbagrel1 linked an issue Nov 3, 2021 that may be closed by this pull request
@tbagrel1 tbagrel1 force-pushed the tbagrel1/826-operator-chain-improvement branch from 6f93427 to 200b651 Compare November 8, 2021 16:06
@tbagrel1
Copy link
Member Author

tbagrel1 commented Nov 9, 2021

When we have something like this:

a =
  1 + 2 + do
    3

Should it stay as it is, or because the layout of the whole tree is multiline, should it be formatted to

a =
  1
    + 2
    + do
      3

?

@mrkkrp
Copy link
Member

mrkkrp commented Nov 10, 2021

I would say it should stay as is, it is also the status quo.

@tbagrel1 tbagrel1 force-pushed the tbagrel1/826-operator-chain-improvement branch 2 times, most recently from fe2363e to 47c896f Compare November 12, 2021 12:30
@tbagrel1 tbagrel1 marked this pull request as ready for review November 12, 2021 12:31
Copy link
Member Author

@tbagrel1 tbagrel1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About examples where operator fixities cannot be ordered properly:

At the moment, with the non-configurable context-unaware "default" fixity map, we loose a lot of information about operators because of conflicting definitions. However, in a few days, I will improve the fixity map construction, by allowing

  • user custom fixity declarations
  • package detection from .cabal files
  • manual package selection
  • and more...

which should give way better results requiring little to no explicit configuration.
The fixity map generation will also be rewritten in Haskell.

format.sh Outdated Show resolved Hide resolved
src/Ormolu/Printer/Meat/Type.hs Outdated Show resolved Hide resolved
Copy link
Member

@mrkkrp mrkkrp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making such wonderful progress on this! I did the first pass and left some comments. I did not look at the implementation yet. At this point we want to make sure that the behavior is right. Once we are happy with that we can focus more on how this is implemented.

@mrkkrp
Copy link
Member

mrkkrp commented Nov 16, 2021

Also please note that we have merged #779 and this PR needs to be rebased on master.

@mrkkrp
Copy link
Member

mrkkrp commented Nov 17, 2021

I think we should incorporate your ideas from the #832 in this PR. In particular:

  • Perform generation of the fixity/precedence map in Haskell.
  • Store info for different packages but do not make a choice at compile time regarding what operator is more popular.
  • Rename --cabal-default-extensions to --cabal which should, in addition to what it is doing now, be used to extract the list of direct dependencies. Do not try to resolve transitive dependencies. Also add -p / --package option that can be repeated many times to provide "manually" a list of packages. These packages (either extracted from Cabal or supplied with -p) should be used to select the correct operator info. Otherwise, use a heuristic. An explicit config file can be added in a subsequent PR.

@tbagrel1
Copy link
Member Author

tbagrel1 commented Nov 17, 2021

I'm currently working on

@tbagrel1 tbagrel1 force-pushed the tbagrel1/826-operator-chain-improvement branch 2 times, most recently from 0575795 to 95db2c6 Compare November 17, 2021 16:15
@tbagrel1 tbagrel1 force-pushed the tbagrel1/826-operator-chain-improvement branch from e6fc8e3 to df8f3a5 Compare November 23, 2021 16:01
default.nix Outdated Show resolved Hide resolved
@tbagrel1
Copy link
Member Author

When trying a full nix-build, I encounter an error for the weeder target:

[nix-shell:~/tweag/ormolu]$ nix-build -A weeder
trace: No index state specified for haskell-project, using the latest index state that we know about (2021-11-16T00:00:00Z)!
trace: No index state specified for weeder, using the latest index state that we know about (2021-11-16T00:00:00Z)!
these derivations will be built:
  /nix/store/zh9nsakyl5sn18f7bky9ymjb10c4l8rv-ormolu-weeder.drv
building '/nix/store/zh9nsakyl5sn18f7bky9ymjb10c4l8rv-ormolu-weeder.drv'...
src/Ormolu/Fixity.hs:129: defaultFixityMap
src/Ormolu/Fixity.hs:65: hoogleHackageInfoFile
builder for '/nix/store/zh9nsakyl5sn18f7bky9ymjb10c4l8rv-ormolu-weeder.drv' failed with exit code 1
  • defaultFixityMap is used by ormolu-live , but not by ormolu directly (so this error is expected, but I don't know how to silence it)
  • hoogleHackageInfoFile contains the embedded JSON fixity file as a ByteString, and is actually used by the rest of the ormolu code, so I don't really know why weeder spots it :thinking_face:

@amesgen
Copy link
Member

amesgen commented Nov 25, 2021

  • We currently build Ormolu Live with GHCJS only (which does not support HIE files used by weeder), so we could either add defaultFixityMap as an exception (by adding "^Ormolu.Fixity.defaultFixityMap\$" to roots in weeder.dhall), or inline the definition of defaultFixityMap in Ormolu Live (and then remove it).
  • This appears to be a bug in weeder (I have a minimal reproducible example), I will report (edit: False positive (and possibly negative) with top level pattern matching ocharles/weeder#92 to fix this upstream. Until then, it can also be added to roots in weeder.dhall.

@tbagrel1
Copy link
Member Author

tbagrel1 commented Nov 25, 2021

We finally have a version that passes all tests (thanks again @amesgen)! But there is still some work to do:

TODO list:

  • Check and complete comments/documentation
  • Rename operator-chain-xxx.hs files with a more descriptive name
  • Add tests to check that fixity config is correctly parsed from command line no longer relevant
  • Add tests to check that fixity config gives the correct fixity map (with cabal dependencies taken into account)
  • Decide what to do with (duplicated sometimes) Could not find a .cabal file for ... no longer relevant

@tbagrel1 tbagrel1 force-pushed the tbagrel1/826-operator-chain-improvement branch from df8f3a5 to af9f7e4 Compare November 25, 2021 14:43
@amesgen
Copy link
Member

amesgen commented Nov 25, 2021

Decide what to do with (duplicated sometimes) Could not find a .cabal file for ...

Do you mean the ones in expected-failures? It is expected that these warnings appear from time to time, as many projects have .hs files which do not appear in a .cabal file (e.g. Setup.hs, a file mostly present for legacy cabal reasons, or other example/test files).

@tbagrel1
Copy link
Member Author

tbagrel1 commented Nov 25, 2021

Decide what to do with (duplicated sometimes) Could not find a .cabal file for ...

Do you mean the ones in expected-failures? It is expected that these warnings appear from time to time, as many projects have .hs files which do not appear in a .cabal file (e.g. Setup.hs, a file mostly present for legacy cabal reasons, or other example/test files).

I enabled by default the autodetection of cabal dependencies to build the fixity map. If a cabal file is not found, a warning is emitted (in the same way detectCabalExtensions does, but this option is disabled by default). As a result, a warning can be displayed in contexts where there wasn't any before. Moreover, when both detectCabalExtensions and detectCabalDependencies options are enabled, the same warning can be emitted twice (because atm, each function is responsible for finding the cabal file associated to a given source file).

How do you think we can improve this?

@amesgen
Copy link
Member

amesgen commented Nov 25, 2021

One possible thing would be to try (if necessary) to find the .cabal files for all input files in the beginning, so warnings would only appear once (and all in one place), and the .cabal files would also only be parsed once (and again and again not for every file formatted, even though I don't think that this is very relevant for performance). While I don't think it is hard to do, it might require quite some refactoring, so I would suggest to leave it as is for now?

@tbagrel1
Copy link
Member Author

tbagrel1 commented Nov 25, 2021

Also, the CI still fails, but I don't know why -_-
It works with GHC 8.x, but not 9.x. It seems related to some dependency for extract-hoogle-hackage-info not available with the correct versions in 9.x, but I added the fix you suggested to default.nix to indicate that extract-hoogle-hackage-info should only be built with 8.10...

@amesgen
Copy link
Member

amesgen commented Nov 25, 2021

This is due to weird cabal behavior in multi package projects (it tries to resolve dependencies for packages it does not build). You can fix this by adding these two lines to the extract-hoogle .cabal file:

if !impl(ghc == 8.10.*)
buildable: False

@tbagrel1
Copy link
Member Author

What would I do without you 😅
Thank you, I'll try that!

@tbagrel1 tbagrel1 force-pushed the tbagrel1/826-operator-chain-improvement branch from af9f7e4 to d99998f Compare November 25, 2021 15:22
@tbagrel1
Copy link
Member Author

tbagrel1 commented Dec 6, 2021

Now that FixityDirection is no longer the GHC defined one, why do we implement FromJSON and ToJSON manually? Is there a drawback to just deriving them thanks to Generic?

@tbagrel1
Copy link
Member Author

tbagrel1 commented Dec 6, 2021

@tbagrel1
Copy link
Member Author

tbagrel1 commented Dec 7, 2021

I just finished most of my assigned tasks (see d6fb306a41cb75ec18995eec443b233baaaff72c ); but I have a few questions:

  • do you think we need more comments/documentation in extract-hackage-info/src/Mains.hs?
  • do you think we need tests for the n-ary tree reassociation?
  • should we get rid of the manual implementation of FromJSON and ToJson for FixityDirection and FixityInfo (and derive it instead)?
  • should we get rid of Data.HashMap.Strict and just use Data.Map instead?

Also, I think that I will have time to implement the "only use NaryOpTree, get rid of OpTree" feature, if it is ok for you.

@amesgen
Copy link
Member

amesgen commented Dec 7, 2021

should we get rid of Data.HashMap.Strict and just use Data.Map instead?

I did this in issue-826-fixities-th, but it is not isolated there.

src/Ormolu/Fixity.hs Outdated Show resolved Hide resolved
@mrkkrp
Copy link
Member

mrkkrp commented Dec 7, 2021

@tbagrel1 Thanks, I'll review your work later this week.

do you think we need more comments/documentation in extract-hackage-info/src/Mains.hs?

Not at the moment. I'll need to find time to look in more detail at that file anyway, which make cause it to change :)

do you think we need tests for the n-ary tree reassociation?

Let's wait till we simplify the situation with trees.

should we get rid of the manual implementation of FromJSON and ToJson for FixityDirection and FixityInfo (and derive it instead)?

I like to control JSON instances manually because I'm not always happy with auto-generated instances. Let's keep the manual definitions for now.

should we get rid of Data.HashMap.Strict and just use Data.Map instead?

Let's not do this in this branch, instead let's take it from @amesgen 's branch when it is ready.

I will have time to implement the "only use NaryOpTree, get rid of OpTree" feature

Feel free to attempt this if you have time. Like we discussed last week, the idea would be to get rid of OpTree by i) directly converting AST to NaryOpTree and ii) printing NaryOpTree directly after it has been reassociated properly. We also want to get rid of SubTreeInfo if possible and instead do span calculations, etc. directly.

@amesgen
Copy link
Member

amesgen commented Dec 8, 2021

Here are the benchmarks for fixity parsing at compile time: https://gist.github.com/amesgen/64e86711d21fb7e2c4fa5b597438a349

@tbagrel1
Copy link
Member Author

tbagrel1 commented Dec 8, 2021

Note to myself: maybe we can get rid of the OpInfo wrapper at printing time

@tbagrel1
Copy link
Member Author

tbagrel1 commented Dec 9, 2021

Another round of questions:

  • Should we rename OpBranches to OpBranch now that OpTree is actually a N-ary tree? (the idea behind that would be to reduce diff size, maybe)?
  • In p_cmdOpTree and p_tyOpTree, there is no "hard splitter operator" ($-like) special case, so I don't need every operator to be wrapped in OpInfo op. Should I keep OpInfo to stay consistent with p_exprOpTree, or should I get rid of it? Or maybe there are operators in cmd and types contexts that would benefit from being placed in a trailing position?
  • With a really minor change, we could display operator priorities in multiline contexts (like I wanted initially when I started this PR, but with no known blockers this time). Here is a link to a before/after preview of what it would give: https://gist.github.com/tbagrel1/d86e067fbdfea317713c561743a6dbe1. I think that it might be interesting to reconsider this idea now that the implementation is way more robust.

@mrkkrp
Copy link
Member

mrkkrp commented Dec 9, 2021

Should we rename OpBranches to OpBranch now that OpTree is actually a N-ary tree?

I think we can rename NaryOpTree to OpTree. Not sure about OpBranches because the way it is, it reflects the possibility of having multiple branches there, which is nice.

Should I keep OpInfo to stay consistent with p_exprOpTree, or should I get rid of it?

If this is going to require an extra function that strips OpInfo, I'd say let's keep it even if it is not used.

display operator priorities in multiline contexts

I'd say let's not do this because this is not how Haskellers write code. For example here

type PermuteRef =
         "a"
           :> ( "b" :> "c" :> End
                  :<|> "c" :> "b" :> End
              )
           :<|> "b"
             :> ( "a" :> "c" :> End
                    :<|> "c" :> "a" :> End
                )

No one would actually indent :> one level more just to show that it has higher priority. I think it's going to look a bit strange and perhaps even provoke bug reports.

Copy link
Member Author

@tbagrel1 tbagrel1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did another pass on every changed file, and I added a few questions along the way

src/Ormolu/Printer/Internal.hs Show resolved Hide resolved
src/Ormolu/Printer/Meat/Declaration/Warning.hs Outdated Show resolved Hide resolved
src/Ormolu/Printer/Meat/Pragma.hs Show resolved Hide resolved
@tbagrel1 tbagrel1 force-pushed the tbagrel1/826-operator-chain-improvement branch from 7934fd1 to 82638bc Compare December 10, 2021 09:48
default.nix Outdated Show resolved Hide resolved
weeder.dhall Show resolved Hide resolved
extract-hackage-info/src/Main.hs Show resolved Hide resolved
extract-hackage-info/src/Main.hs Show resolved Hide resolved
extract-hackage-info/src/Main.hs Show resolved Hide resolved
extract-hackage-info/src/Main.hs Show resolved Hide resolved
extract-hackage-info/src/Main.hs Outdated Show resolved Hide resolved
extract-hackage-info/src/Main.hs Outdated Show resolved Hide resolved
extract-hackage-info/src/Main.hs Outdated Show resolved Hide resolved
@mrkkrp mrkkrp force-pushed the tbagrel1/826-operator-chain-improvement branch from 17784e3 to f9e1660 Compare December 14, 2021 16:53
src/Ormolu/Fixity.hs Show resolved Hide resolved
src/Ormolu/Fixity.hs Outdated Show resolved Hide resolved
tests/Ormolu/HackageInfoSpec.hs Show resolved Hide resolved
tests/Ormolu/HackageInfoSpec.hs Outdated Show resolved Hide resolved
src/Ormolu/Printer/Operators.hs Show resolved Hide resolved
@mrkkrp
Copy link
Member

mrkkrp commented Dec 14, 2021

I played with profiling today a little bit and it looks like if we delay construction of remainingFixityMap until we really don't find an operator in the "more common" maps we can spend in buildFixityMap less than 1 % of time in most cases (e.g. when all operators come from dependencies or are built-in). We could define the resulting fixity map as a type that wraps two maps perhaps and then make a two-stage lookup function. I think this might solve the performance issue.

@tbagrel1
Copy link
Member Author

I played with profiling today a little bit and it looks like if we delay construction of remainingFixityMap until we really don't find an operator in the "more common" maps we can spend in buildFixityMap less than 1 % of time in most cases (e.g. when all operators come from dependencies or are built-in). We could define the resulting fixity map as a type that wraps two maps perhaps and then make a two-stage lookup function. I think this might solve the performance issue.

Who is/will be working on this? Should I take care of it?

@mrkkrp
Copy link
Member

mrkkrp commented Dec 15, 2021

Who is/will be working on this? Should I take care of it?

I think we should stick to our plan that we outlined on Monday:

  • @tbagrel1 works on the tests for reassociation logic and addresses my feedback as I do the final pass of my review.
  • @amesgen Tries to improve the performance.

@mrkkrp
Copy link
Member

mrkkrp commented Dec 15, 2021

I have finished reviewing the PR. There are two things that we still need to address before we merge it:

  • Tests for operator chain reassociation.
  • Improve performance.

Other than that, it looks really good. Great job @tbagrel1 !

tests/Ormolu/OpTreeSpec.hs Outdated Show resolved Hide resolved
@tbagrel1 tbagrel1 force-pushed the tbagrel1/826-operator-chain-improvement branch from 4aa9bbd to 469e7b9 Compare December 16, 2021 13:56
@tbagrel1
Copy link
Member Author

I think that we also need to fix this before merging: #845 (comment)
I'll do it right now

See the changelog for details.

Co-authored-by: Alexander Esgen <alexander.esgen@tweag.io>
Co-authored-by: Mark Karpov <mark.karpov@tweag.io>
@mrkkrp mrkkrp force-pushed the tbagrel1/826-operator-chain-improvement branch from 313c39c to c57544b Compare December 16, 2021 15:18
@mrkkrp
Copy link
Member

mrkkrp commented Dec 16, 2021

@tbagrel1 Let's do it in a separate PR. I'm going to merge this once it is green.

@mrkkrp mrkkrp merged commit 1f63136 into master Dec 16, 2021
@mrkkrp mrkkrp deleted the tbagrel1/826-operator-chain-improvement branch December 16, 2021 15:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants