Split out happy-codegen-common #221

Ericson2314 · 2022-01-14T00:33:32Z

There was some extra information in Grammar which had nothing to
do with the grammar, but was simply there because Grammar was also
playing the role of capturing all information from the abstract syntax.

I don't think that's good. As we really try to make libraries out
of this stuff we should be stricter and stricter about separating
concerns. Grammar should really just be that, the grammar, and the code
in happy-tabular should not be privy to information that is just for
the backend.

Splitting out a code generation CommonOptions type is a first step to rectifying this. I
hope we can do a few more refactors like this to really make each data
type shine on its own.

I don't want to sound too harsh though, since we purposely made the split
low impact with more cleanups -- such as this -- left for later. It is
of course easier to see what's good and what's bad once the code is
split up!

Also, this change calls into question my previous BookendedAbsSyn.
We're now acknowledging the abstract syntax mixes "middle" and back end
concerns, and those are not properly separated until the next step.
Given that, there isn't much use of making BookendedAbsSyn when we
could just stick the header and footer in CommonOptions.
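
The split described above can be sketched roughly as follows. This is a minimal illustration, not the code in the patch: the first four `CommonOptions` fields and their types are copied from the diff under review, while the `header` and `footer` fields, the simplified `Grammar` record, and the helper functions `countProductions` and `describeParser` are assumptions made up for the example.

```haskell
-- Hypothetical sketch of the split; not the actual happy source.

-- Settings that only code generation needs. The first four fields
-- mirror the ones removed from Grammar in this PR's diff.
data CommonOptions = CommonOptions
  { token_type        :: String
  , imported_identity :: Bool
  , monad             :: (Bool, String, String, String, String)
  , expect            :: Maybe Int
  , header            :: Maybe String  -- assumption: replaces BookendedAbsSyn
  , footer            :: Maybe String  -- assumption: replaces BookendedAbsSyn
  } deriving (Eq, Show)

-- Grammar reduced to grammar-only data (heavily simplified, assumed shape).
data Grammar = Grammar
  { productions :: [(String, [String])]  -- (nonterminal, right-hand side)
  , starts      :: [String]              -- start symbols
  } deriving (Eq, Show)

-- "Middle" code, in the spirit of happy-tabular, sees only the grammar.
countProductions :: Grammar -> Int
countProductions = length . productions

-- Code generation receives both values and may consult the options.
describeParser :: Grammar -> CommonOptions -> String
describeParser g opts =
  "parser over " ++ token_type opts ++ " with "
    ++ show (countProductions g) ++ " productions"
```

With this shape, a function that takes only a `Grammar` cannot depend on backend settings by accident, because the type gives it no access to them.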

int-index · 2022-01-14T01:40:46Z

I’m fine with splitting out a data type, but I think that making it a separate package is overkill. Nothing stops us from making a separate package for every data type and every function, and then we could be really explicit about what depends on what, but there’s a cost to this: uploading more packages to Hackage, maintaining correct version bounds, cluttering the dependency graph for end-users, etc.

I actually think the separation between happy-grammar and happy-tabular is also gratuitous. We don’t have use cases where one would be used without the other.

There’s a trade-off between fine-grained and coarse-grained dependencies, and I’d like our decisions to be informed by (at least imaginary) use cases:

- happy-cli can be a separate package because I can imagine using happy as a library (e.g. GHC could add built-in support for .y files)
- happy-frontend can be a separate package because with a TH-based approach we wouldn’t be parsing AbsSyn
- happy-backend-lalr and happy-backend-glr can be separate packages because in any given application, only one backend is likely to be used
- happy-tabular must be separated out to provide a common interface for happy-frontend and happy-backend-* to communicate

happy-grammar can be easily merged into happy-tabular. Yes, that would mean that happy-frontend depends on more definitions than the code in it requires, but: would anyone ever use happy-grammar without happy-tabular? I don’t think so.

int-index · 2022-01-14T02:00:00Z

To elaborate a bit more on this, packages are a means of code distribution, and should be motivated by distribution needs.

As long as we ship happy as a single binary, it’s fine to have everything in a single package. I want to ship a TH-based version, so I’d like to factor out anything related with .y-files and CLI into their own packages, so that TH users wouldn’t depend on those components.

The backends go into their own packages because we could easily ship happy-lalr, happy-glr, and happy-rad as separate binaries, and I think most users would just pick and use one of those.

Now, what this patch is doing is related to separation of concerns rather than distribution. And I’m more than happy to separate concerns: dedicated functions, data types, and modules, are all very good (that is also the stated motivation: “make each data type shine on its own”). But there’s no need for separate .cabal packages.

Ericson2314 · 2022-01-14T04:11:05Z

@int-index I agree we don't want to go overkill on the packages, and indeed it's already unwieldy. But I do think there is some utility in ripping things into too many pieces on purpose, just so you have a clean slate and more flexibility to decide how to put them back together.

I agree with your breakdown of concerns and survey of potential distributions. I agree too that happy-grammar and happy-tabular should likely be recombined. But I also think that while this type certainly doesn't deserve its own package, it doesn't belong in any of the currently existing ones either.

As far as concerns go, this stuff is really backend concerns that need an interface, so we shoehorn them into the frontend. The core middle parts of the compiler (which I think are destined to become the most general-purpose librar(y|ies)) don't care about them at all.

As far as distributions go, I think use cases along these lines are useful and plausible:

- Work with plain grammars, not ornamented with fused elimination rules in the YACC tradition.
- Something that just reports diagnostics about your grammar, like --info, and doesn't actually compile it into anything. Hell, we could have a full-on language server!
- Interpret a grammar. Dispense with the phase restrictions and just interpret a (possibly ornamented) grammar right into a Haskell function. Slow, but might have advantages. Also can do some ridiculously context-dependent shenanigans.

All of these would avoid the frontend, avoid the CLI, avoid the backends, and avoid this Directives type. In other words, they would just use happy-grammar and happy-tabular.

I think a likely happy ending for happy-directives is for it to evolve into some sort of happy-backend-common. (And I am also thinking s/backend/codegen because codegen is intrinsic to the interface whereas backend is merely how it relates to other parts of existing use-cases.) This would contain Directives (which should be something like CommonCodegenOptions), and also your new syntax-building combinators. This makes it a meatier library more worthy of existing!

How does that sound?

int-index · 2022-01-14T14:16:09Z

> How does that sound?

Yes, sounds reasonable overall. But if you put the directives in happy-codegen-common, then happy-frontend will depend on it (since it needs to produce the grammar+directives that it parses from .y), and that would be quite strange.

Ericson2314 · 2022-01-14T17:58:54Z

I think that's not so bad, because the current frontend is specific to the use-cases that do codegen.


int-index · 2022-01-16T07:46:42Z

packages/grammar/src/Happy/Grammar.lhs

->               token_type        :: String,
->               imported_identity :: Bool,
->               monad             :: (Bool,String,String,String,String),
->               expect            :: Maybe Int,
 >               attributes        :: [(String,String)],


Shouldn’t attribute and attributetypes also move to CommonOptions? They’re not used by happy-tabular.

Ericson2314 · 2022-01-16

Yes, they should be moved out, but I figured I would deal with all attribute things in the next PR.

Also, I vaguely recall only one backend supported attribute grammars?

int-index · 2022-01-16

Yeah, I think GLR doesn’t.

Ericson2314 · 2022-01-16

OK, yeah, so the attributes stuff should be moved out to a different data type, and with that change we should get some better errors if one tries to do attribute stuff with the GLR backend.

int-index · 2022-01-16

I don’t know if GLR and attributes are fundamentally incompatible or if it’s a limitation of the current backend. In the latter case, we should probably still pass the attribute stuff to the backend and throw “NotImplemented” or something of this sort.

int-index · 2022-01-16

happy’s documentation is unhelpfully vague on this point:

> Currently, attribute grammars cannot be generated for GLR parsers (It's not exactly clear how these features should interact...)

Ericson2314 · 2022-01-17

Also, I think if we had a separate datatype, the lack of attribute grammars during bootstrapping could perhaps become a little cleaner.

Ericson2314 · 2022-01-19

@int-index between bootstrapping and the uncertainty around GLR, are you fine leaving it as-is for now?

int-index · 2022-01-20

Yes, fine with me. I was concerned that happy-directives is too fine-grained, but happy-codegen-common is reasonable.
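
The follow-up idea from this thread (attributes in their own data type, with a clear error when a backend cannot handle them) might look something like the sketch below. Everything here is hypothetical and was not part of this PR: the `AttributeOptions` record, the `Backend` and `CodegenError` types, and the `checkAttributes` function are invented names; only the `[(String,String)]` type of `attributes` comes from the diff above.

```haskell
-- Hypothetical sketch; none of these names are taken from happy itself.

-- Attribute-grammar settings kept apart from both Grammar and CommonOptions.
data AttributeOptions = AttributeOptions
  { attributes    :: [(String, String)]  -- type as shown in the diff
  , attributeType :: String              -- assumed field
  } deriving (Eq, Show)

data Backend = LALR | GLR
  deriving (Eq, Show)

newtype CodegenError = NotImplemented String
  deriving (Eq, Show)

-- The attribute data is still handed to every backend; a backend that
-- lacks support rejects it explicitly instead of silently ignoring it.
checkAttributes :: Backend -> Maybe AttributeOptions -> Either CodegenError ()
checkAttributes GLR (Just _) =
  Left (NotImplemented "attribute grammars with the GLR backend")
checkAttributes _ _ = Right ()
```

Using `Maybe AttributeOptions` also gives a bootstrapping build, which has no attribute grammars, a natural way to pass `Nothing`.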

Ericson2314 mentioned this pull request Jan 14, 2022

Extract frontend work into frontend-cli #219

Closed

Ericson2314 force-pushed the remove-off-topic-from-grammar-type branch from edd8d35 to 71edaaf Compare January 16, 2022 05:19

Ericson2314 added 3 commits January 16, 2022 01:32

Regenerate CI

9e3e765

This temporarily undoes the patch

Update and reapply CI patch

b6910c3

Ericson2314 force-pushed the remove-off-topic-from-grammar-type branch from 71edaaf to b6910c3 Compare January 16, 2022 06:34

Ericson2314 changed the title ~~WIP: Split out happy-directives~~ Split out happy-directives Jan 16, 2022

int-index reviewed Jan 16, 2022

Ericson2314 changed the title ~~Split out happy-directives~~ Split out happy-codegen-common Jan 20, 2022

Ericson2314 merged commit 9347634 into master Jan 20, 2022

Ericson2314 deleted the remove-off-topic-from-grammar-type branch January 20, 2022 07:13

sgraf812 mentioned this pull request Sep 13, 2024

Vendor happy-codegen-common into happy-grammar #286

Closed

int-index mentioned this pull request Sep 13, 2024

Move contents of happy-codegen-common into happy-grammar; delete #287

Merged
