Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modularize happy #191

Closed
wants to merge 82 commits into from
Closed

Modularize happy #191

wants to merge 82 commits into from

Conversation

knothed
Copy link
Contributor

@knothed knothed commented May 31, 2021

Following #167 (comment) and #187, this PR aims to split up happy into several components. Two main goals are accomplished:

  • Introducing a clear and sensible separation between different parts of happy like frontend, middleend and backend.
  • Making it easy for anyone to extend happy or to create and distribute new, possibly experimental, happy-features and -components (like a TemplateHaskell-frontend or a recursive ascent-descent-backend).

Ideally, happy programmers should be able to combine happy packages almost arbitrarily: Each frontend should, in theory, be able to work with each middleend, which in turn should work with each backend.

To achieve this goal, we (@sgraf812 and me) have decided to split happy into six packages – five libraries and one executable – as follows:

  • happy-frontend is responsible for reading .y-files and parsing them into the Grammar datatype.
  • happy-middleend transforms the Grammar into an ActionTable and a GotoTable.
  • happy-backend takes the these tables and generates template-based Haskell code.
  • happy-core defines the interfaces (IRs) between the frontend, middleend and backend packages. Also, it contains core utilities like option parsing.
  • happy-test hosts the test files and contains logic to execute the test suite.
  • happy itself just puts the above packages together and contains little additional logic.

Why six packages?

We want users to be able to combine different frontend-, middleend- and backend-packages together. Therefore, happy-core is vital as it defines data types that every such package should be able to work with; in particular Grammar, ActionTable and GotoTable.

We'll look at happy-test shortly.

How does a package look like?

The three packages happy-frontend, happy-middleend and happy-backend are all built similarly: they have a public runFrontend/runMiddleend/runBackend function which performs the package's main task.
Also, they have a common interface for providing and parsing command line arguments.

Let's take a look at happy–backend (packages/backend). It has the following exposed components:

  1. A main entry point function, runBackend, which performs the code-gen:

    runBackend :: BackendArgs -> Grammar -> ActionTable -> GotoTable -> IO ()
    
    data BackendArgs = BackendArgs {
      outFile :: String,
      templateDir :: Maybe String,
      magicName :: Maybe String,
      strict :: Bool,
      ghc :: Bool,
      coerce :: Bool,
      target :: Target,
      debug :: Bool,
      glr :: Bool,
      glrDecode :: Bool,
      glrFilter :: Bool
    }

    BackendArgs are user-provided arguments, while Grammar, ActionTable and GotoTable come from the frontend and middleend (as results of runFrontend and runMiddleend).

    happy could call runBackend with arbitrary BackendArgs. There is no requirement for a command line interface per se.

  2. But ultimately, we do want to perform happy via a CLI, meaning we want to create BackendArgs from parsing the command line arguments. Just as previously in happy, we define the CLI arguments using GetOpt, but on package-level:

    data BackendFlag =
        OptOutputFile String |
        OptTemplateDir String |
        OptMagicName String |
    		...
    
    backendOptions :: [OptDescr BackendFlag] = [
        Option "o" ["outfile"] (ReqArg OptOutputFile "FILE"),
        Option "t" ["template"] (ReqArg OptTemplate "DIR"),
        Option "m" ["magic-name"] (ReqArg OptMagicName "NAME"),
        ...
    ]
    
    parseBackendFlags :: [BackendFlag] -> String -> IO BackendArgs

So, each package defines its own CLI Flags and options and, in addition, a parseFlags function which converts these CLI flags into PackageArgs.

Then, option parsing takes place as follows:

  1. The main happy executable sticks the options from each package together and calls getOpt.
  2. The resulting Flags are then used to call parseFlags and execute each package's respective main entry point function: runFrontend, runMiddleend and runBackend.

In this way, option parsing is fully modularized such that new packages can define and use their own CLI options.

We just have to pay attention to one thing: with getOpt, it is not possible to have multiple options with the same option character or option string. This means, if there are two different packages which both define an -o option, these cannot be used together in the same happy executable. Package authors have to watch out for such option collisions.

Example: happy-rad

As an example for how this modularization can be used to create a new executable, let's take a look at happy-rad. happy-rad extends happy by a recursive ascent-descent (RAD) backend. It contains two packages:

  • rad-backend: a backend which produces RAD code. This package is completely independent of happy-backend. Still, it works with the same data types (Grammar, ActionTable, GotoTable) as happy-backend.
  • happy-rad: it depends (a.o.) on happy-frontend, happy-middleend, happy-backend (which are remote packages) and rad-backend (local package). First, frontend and middleend are executed normally. Then, depending on whether the --rad option is set, either rad-backend or happy-backend is invoked.

As happy-frontend, happy-middleend and happy-backend are (will be) remote packages, we can just reuse their functionality. In addition, we get testing for free:

testing happy and happy-rad

How does the happy-test package help package authors in testing their packages?

We've moved all of the existing 26 test grammars (e.g. ParGF.y) into happy-test. happy-test exposes the following function:

test :: TestSetup -> IO a

data TestSetup = TestSetup {
  happyExec :: String, -- e.g. happy or happy-rad
  defaultTests :: [String],
  customTests :: [String],
  customDataDir :: String,
  allArguments :: [String],
  stopOnFailure :: Bool
}

In their test suite, the package author can call test and customize the testing procedure: they can specify which of the 26 grammars should or should not be tested and which argument combinations their executable should be tested with. In addition, they can specify further, custom, testing grammars.

For example, happy-rad could provide [--rad] for the set of arguments it should be tested with, while normal happy could provide [-a, -ag, -cg, -acg].

happy-test can also do something different: The functionality of make sdist-test, which was previously in the Makefile, has now moved into happy-test. This way, every extension author can test their sdists. What does make sdist-test do?

  1. call cabal sdist all to create an sdist of all local packages
  2. unzip the sdists and move them into an umbrella directory
  3. create a cabal.project referring to these local packages
  4. build and test the executable inside this umbrella directory.

More

For the end user of happy, not much changes: the CLI stays identical, the functionality remains the same. Just the compile time of happy increases a bit.

Template Files

As the template files are backend-specific, these are now data-files of happy-backend, not of happy itself. Just like the tests, which are data-files of happy-test instead of happy itself.

Bootstrapping

Bootstrapping is a process which is purely frontend-specific. Therefore, the bootstrap flag only exists in happy-frontend, but still exists. A next step could be to outsource the bootstrapping parser into its own package, as described in #187 (comment).

CI

Travis and appveyor used make sdist; make sdist-test-only. We have replaced this with our new make sdist-test.
make sdist-test currently doesn't work on MinGW.

@knothed knothed force-pushed the modularization branch 2 times, most recently from 0ca1d45 to a73c187 Compare June 8, 2021 12:01
@Ericson2314
Copy link
Collaborator

@knothed Still trying to find the time to properly sit down and review this, but I wanted to reaffirm I'm very happy to see this PR!

@knothed
Copy link
Contributor Author

knothed commented Jun 8, 2021

@Ericson2314 perfect, I‘m quite happily looking forward to hearing your opinion!

.appveyor.yml Show resolved Hide resolved
.travis.yml Outdated Show resolved Hide resolved
happy.cabal Outdated
happy-core == 1.21.0,
happy-frontend == 1.21.0,
happy-middleend == 1.21.0,
happy-backend == 1.21.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably maintain a script that changes all these fixed version bounds to e.g. 1.22.0 once we want to release the next version

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use configure to do the job and also check in the generated cabal files. Alternatively just put up with having to grep for and replace 10ish references to the pretty unique version string 1.21.0...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generated cabal files

Hmm? What is that?

packages/frontend/src/Happy/Frontend/AbsSyn.lhs Outdated Show resolved Hide resolved
packages/frontend/src/Happy/Frontend/Mangler.lhs Outdated Show resolved Hide resolved
packages/test/data/monad002.ly Outdated Show resolved Hide resolved
packages/test/src/Happy/Test.hs Outdated Show resolved Hide resolved
packages/test/src/Happy/Test/SDist.hs Show resolved Hide resolved
import Control.Monad.IO.Class
import Control.Applicative

type TypedShell a = MaybeT IO a
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is all a bit ad-hoc and could probably take inspiration in e.g. shelly or turtle. Then again, if it works it's probably good enough. Although we already had experiences that were hard to debug...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah as long as this doesn't need to change too often I don't care too much.

@sgraf812
Copy link
Collaborator

sgraf812 commented Jul 14, 2021

I didn't review much code, because I trust you on not having modified the core happy logic. Plus, the tests seem to pass, so your refactoring seems to be successful.

I quite like the new package organisation, but ultimately it's up to the maintainers to decide.

Edit: Although, I wonder: Did CI really run the tests? I can see no output difference between https://travis-ci.org/github/simonmar/happy/jobs/774274988 (where they should run) and https://travis-ci.org/github/simonmar/happy/jobs/774274987 (where they shouldn't run).

@Ericson2314
Copy link
Collaborator

For CI maybe try GitHub workflows. See Alex, which has made the switch, though I think is prefer a manually written one like I attempted in haskell/alex#190 as last I checked haskell-ci didn't support macOS with GitHub workflows yet.

@Ericson2314
Copy link
Collaborator

Ah just merge in #196 and get it working with what you got. Hopefully we can merge that one soon and fix CI so this can be safe to merge.

@knothed
Copy link
Contributor Author

knothed commented Jul 20, 2021

Edit: Although, I wonder: Did CI really run the tests? I can see no output difference between https://travis-ci.org/github/simonmar/happy/jobs/774274988 (where they should run) and https://travis-ci.org/github/simonmar/happy/jobs/774274987 (where they shouldn't run).

Yes. In the second link, the tests have been run normally, while in the first link, make sdist-test was performed. Compare e.g. the number of lines from the output.

@sgraf812
Copy link
Collaborator

Ah just merge in #196 and get it working with what you got

Hmm... Isn't GHA-based CI an entirely different issue? As I see it, Travis CI seems to work with this PR: https://travis-ci.org/github/simonmar/happy/builds/774274977. And it indeed runs the tests.

@Ericson2314
Copy link
Collaborator

Ericson2314 commented Jul 21, 2021

@knothed Oh when I clicked "all checks have passed" Travis was not there, so I assumed it was broken. Nevermind, then.

ChangeLog.md Outdated Show resolved Hide resolved
@Ericson2314
Copy link
Collaborator

OK I would like this to get merged soon :). It is very good and has sat open too long.

A few more things

  1. Can we put happy itself in packages/ too? This will make filtering the sources per package from the repo easier, because every package gets its own subtree.
  2. I would be open to giving the swappable packages names based on the algorithm they currently use. e.g. maybe we would switch the default backend from this one to a different one at some point :).
  3. Can the attribute grammar stuff leverage this some how? I see you have isBootstrapped to leak out whether to do the attribute grammar tests.

@Ericson2314
Copy link
Collaborator

If you check the "maintainers can push to PR" box I can help fix conflicts.

@Ericson2314
Copy link
Collaborator

Also piknotech#1

@knothed
Copy link
Contributor Author

knothed commented Jul 22, 2021

OK I would like this to get merged soon :). It is very good and has sat open too long.

Nice, I'd also like that :)

  1. Can we put happy itself in packages/ too? This will make filtering the sources per package from the repo easier, because every package gets its own subtree.

Yes we could do that. This would however require all of happys data-files to be inside this subdirectory. These include the readme, changelog, makefile etc., which would be much better placed in the top directory instead of being inside packages/happy.
We could probably solve this using links? But I don't know whether that is cross-system-proof.

  1. I would be open to giving the swappable packages names based on the algorithm they currently use. e.g. maybe we would switch the default backend from this one to a different one at some point :).

Sounds good - what would you propose?

  1. Can the attribute grammar stuff leverage this some how? I see you have isBootstrapped to leak out whether to do the attribute grammar tests.

Bootstrapping is a frontend-specific thing, and the frontend exposes isBootstrapped. The other packages don't care about bootstrapping.

Using happy with attribute grammars requires the bootstrapped happy however which is why the attribute grammar tests are listed separately inside Tests.hs.

@Ericson2314
Copy link
Collaborator

We could probably solve this using links? But I don't know whether that is cross-system-proof.

I'm told windows support for symlinks is continuously improving. And I think git had some tricks even before that, so that's fine with me.

@Ericson2314
Copy link
Collaborator

  1. I would be open to giving the swappable packages names based on the algorithm they currently use. e.g. maybe we would switch the default backend from this one to a different one at some point :).

Sounds good - what would you propose?

I guess first I would need to understand how/why your happy rad still uses regular happy-backend.

@Ericson2314
Copy link
Collaborator

Well isBootstrapped should be called something like supportsAttributeGrammars for starters. The former is an implementation detail, the latter isn't.

@knothed
Copy link
Contributor Author

knothed commented Jul 22, 2021

I guess first I would need to understand how/why your happy rad still uses regular happy-backend.

My rad-backend is independent of happy-backend. The executable happy-rad however does include regular happy-backend so the user can decide which backend they wanna use.

@knothed
Copy link
Contributor Author

knothed commented Jul 22, 2021

Should we remove the TODO file? It seems outdated.

@Ericson2314
Copy link
Collaborator

Ericson2314 commented Jul 22, 2021

I guess first I would need to understand how/why your happy rad still uses regular happy-backend.

My rad-backend is independent of happy-backend. The executable happy-rad however does include regular happy-backend so the user can decide which backend they wanna use.

Oh! So it could be skipped to force them into the rad backend? That's great news.

@Ericson2314
Copy link
Collaborator

Could you make a happy-backend-glr? And make it optional just like you made happy-rad support two backends?

Perhaps that is a better splitting than whatever I was thinking of with the attribute grammars :).

@knothed
Copy link
Contributor Author

knothed commented Jul 22, 2021

Yes, I‘ve also thought about splitting the GLR backend off. I‘ll do that.

It could however be that the GLR backend has a dependency to the normal backend, as it uses, I think, some of the normal backend logic. But that shouldn’t be a problem – the normal backend (obviously) doesn’t depend on the GLR backend.

@knothed
Copy link
Contributor Author

knothed commented Jul 22, 2021

Oh! So it could be skipped to force them into the rad backend? That's great news.

Exactly!

@knothed
Copy link
Contributor Author

knothed commented Aug 25, 2021

I decided for the following path:

  1. Split off happy-grammar and happy-tabular
  2. Split off happy-frontend, happy-backend and happy-backend-glr
  3. Create happy and happy-cli
  4. Create happy-test.

Changing all the CLI stuff at once and at the end is the most convenient, both for doing so and for reviewing.

@int-index The first MR is #200 which can be reviewed now.

Maybe it makes more sense to merge all the MRs into an intermediate branch and then merge that branch into master only at the end. Maybe someone could create a branch (e.g. modularisation).

- Put "mapDollarDollar" in Grammar
- Put "die" and "dieHappy" in Happy.CLI.Dying
- Put the other stuff where it is used.
@knothed
Copy link
Contributor Author

knothed commented Aug 27, 2021

I've removed GenUtils and distributed its members to sensible places:

  • mapDollarDollar to Grammar as it is a grammar-level feature and every backend working with Grammar uses it
  • dieHappy and friends to Happy.CLI.Dying. This makes Happy.*.CLI modules import Happy.CLI.Dying, which means frontend and backend now also depend on happy-cli. If this is unwanted, frontend and backend could outsource their CLI module into frontend-cli and backend-cli.
  • The remaining members are put directly in the packages where they're used.

@knothed knothed closed this Mar 23, 2022
@knothed knothed deleted the modularization branch March 23, 2022 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants