Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reexported modules, take 2 #385

Merged
merged 6 commits into from Aug 6, 2018
Merged

Reexported modules, take 2 #385

merged 6 commits into from Aug 6, 2018

Conversation

mboes
Copy link
Member

@mboes mboes commented Aug 5, 2018

This PR is effectively a reopen of #381. Starting from the same code,
I made three changes:

  1. rebased on top of current master,
  2. revert the stuff that tries to ensure haskell_imports provide
    a pkg_id like a haskell_library does,
  3. switch from ghc-pkg register to ghc-pkg cache.

The key observation that made this PR possible is that while ghc-pkg register is very picky about how we name packages e.g. in depends
and exposed-modules fields, GHC itself is much less so. GHC is quite
happy for us to not specify a version number, let alone a package
hash. It's probably a bug that ghc-pkg register is this picky. See
https://ghc.haskell.org/trac/ghc/ticket/15478.

We side step this bug by simply not using ghc-pkg register. This is
a temporary workaround, because that command does some useful sanity
checks we're not getting the benefit of anymore.

@mboes mboes requested a review from mrkkrp August 5, 2018 18:05
lunaris and others added 4 commits August 5, 2018 22:02
Cabal's `reexported-modules` field allows one to reexport and optionally
rename modules from a library. This commit adds an `exports` attribute
to `haskell_library` which accomplishes the same, accepting a dictionary
mapping dependencies to strings using the same syntax. For example:

```
haskell_library(
    ...,
    exports = {
        ":dependency": "A, B as BNew",
    },
    ...
)
```
Registration of a new package consists in,

1. copying the registration file into the package db,
2. performing some validation on the registration file content,
3. recaching, i.e. regenerating the package db cache file.

Normally, this is all done by `ghc-pkg register`. But in our case,
`ghc-pkg register` is painful, because the validation it performs is
slow, somewhat redundant but especially, too strict (see e.g.
https://ghc.haskell.org/trac/ghc/ticket/15478). So we now do (1) and
(3) manually, by copying then calling `ghc-pkg recache` directly.

The downside is that we do lose the few validations that `ghc-pkg
register` was doing that was useful. e.g. when reexporting modules,
validation checks that the source module does exist. We'll want to
come back to using `ghc-pkg register` eventually, but for now
https://ghc.haskell.org/trac/ghc/ticket/15478 is a blocker.
Copy link
Member

@mrkkrp mrkkrp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable, but REPLs do not work now, as CI reports.

@mboes
Copy link
Member Author

mboes commented Aug 6, 2018

Yes, the exported-modules list needs some post processing now to make it cromulent to the REPL.

@lunaris
Copy link
Collaborator

lunaris commented Aug 6, 2018

Looks good to me. I still don't think it'd be the end of the world to have ghc-pkg resolve the names to unit IDs at some point, but this works now 😃

Copy link
Member

@mrkkrp mrkkrp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mboes mboes merged commit 2a28637 into master Aug 6, 2018
@mboes mboes deleted the reexported-modules branch August 6, 2018 13:12
@mboes mboes mentioned this pull request Aug 6, 2018
mboes added a commit that referenced this pull request Aug 9, 2018
In #385, we claimed to exploit the fact that GHC only needs package
names in reexport declarations to obviate the need for knowing the
versions (and dependency hashes) of all prebuilt packages. We provided
a test to substantiate this indeed works. But it turns that the test
was not a good test.

The test was flawed because it featured libraries reexporting `base`.
It turns out that `base` (along with `rts`) are magical in GHC.
Because they are so-called "wired-in packages", they are assumed
unique and are therefore not located using a version number. So while
the scheme in #385 worked for `base`, it didn't work in the general
case, as observed by @lunaris.

The fix consists qualifying all package reexports with full package
id's, not package names. How do we derive a package id given only
a package name? The solution is to have the toolchain generate a dump
of the global package database that ships with the compiler. This is
only needed once for the entire workspace. From this dump, we can
lookup the full package id given any particular name.

Fixes #357.
@mboes mboes mentioned this pull request Aug 9, 2018
mboes added a commit that referenced this pull request Aug 10, 2018
In #385, we claimed to exploit the fact that GHC only needs package
names in reexport declarations to obviate the need for knowing the
versions (and dependency hashes) of all prebuilt packages. We provided
a test to substantiate this indeed works. But it turns that the test
was not a good test.

The test was flawed because it featured libraries reexporting `base`.
It turns out that `base` (along with `rts`) are magical in GHC.
Because they are so-called "wired-in packages", they are assumed
unique and are therefore not located using a version number. So while
the scheme in #385 worked for `base`, it didn't work in the general
case, as observed by @lunaris.

The fix consists qualifying all package reexports with full package
id's, not package names. How do we derive a package id given only
a package name? The solution is to have the toolchain generate a dump
of the global package database that ships with the compiler. This is
only needed once for the entire workspace. From this dump, we can
lookup the full package id given any particular name.

Fixes #357.
mboes added a commit that referenced this pull request Aug 10, 2018
In #385, we claimed to exploit the fact that GHC only needs package
names in reexport declarations to obviate the need for knowing the
versions (and dependency hashes) of all prebuilt packages. We provided
a test to substantiate this indeed works. But it turns that the test
was not a good test.

The test was flawed because it featured libraries reexporting `base`.
It turns out that `base` (along with `rts`) are magical in GHC.
Because they are so-called "wired-in packages", they are assumed
unique and are therefore not located using a version number. So while
the scheme in #385 worked for `base`, it didn't work in the general
case, as observed by @lunaris.

The fix consists qualifying all package reexports with full package
id's, not package names. How do we derive a package id given only
a package name? The solution is to have the toolchain generate a dump
of the global package database that ships with the compiler. This is
only needed once for the entire workspace. From this dump, we can
lookup the full package id given any particular name.

Fixes #357.
mboes added a commit that referenced this pull request Aug 10, 2018
In #385, we claimed to exploit the fact that GHC only needs package
names in reexport declarations to obviate the need for knowing the
versions (and dependency hashes) of all prebuilt packages. We provided
a test to substantiate this indeed works. But it turns that the test
was not a good test.

The test was flawed because it featured libraries reexporting `base`.
It turns out that `base` (along with `rts`) are magical in GHC.
Because they are so-called "wired-in packages", they are assumed
unique and are therefore not located using a version number. So while
the scheme in #385 worked for `base`, it didn't work in the general
case, as observed by @lunaris.

The fix consists qualifying all package reexports with full package
id's, not package names. How do we derive a package id given only
a package name? The solution is to have the toolchain generate a dump
of the global package database that ships with the compiler. This is
only needed once for the entire workspace. From this dump, we can
lookup the full package id given any particular name.

Fixes #357.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants