Add support for hrepl. #1210
Conversation
`hrepl` is a standalone binary that runs REPLs for Bazel Haskell targets. Its repository is located at: https://github.com/google/hrepl

`hrepl` takes a list of labels on the command line, in contrast to `haskell_repl()`, which requires editing BUILD files manually. For example:

```
$ hrepl tests/binary-with-lib:lib
```

This change adds specific output groups to `rules_haskell` which allow `hrepl` to understand its rule types. Internally, `hrepl` runs the following steps:

- Run `bazel build --output_groups=... haskell:toolchain_info` and read the resulting protobuf file to get the path to GHC as well as any relevant command-line options.
- Use `bazel query` to figure out exactly which targets to build.
- Run `bazel build --output_groups=...` to compile the immediate dependencies of the targets.
- Read the "LibraryInfo" protobuf file, which contains information about those dependencies (e.g., the paths of the package DBs).
- Run `bazel build --output_groups=...` on the targets to be interpreted. This makes sure their `srcs`, runfiles, C++ dependencies, etc. are available, but doesn't actually compile them. Instead, it writes the necessary information to a "CompileInfo" protobuf file.
- Run `ghc --interactive`, combining the flags obtained from all the previous steps.

So this change adds two kinds of output groups:

- Library info: For targets that `hrepl` should recognize as dependencies, such as `haskell_library`, `haskell_toolchain_library` and `haskell_cabal_library`.
- Compilation info: For targets that `hrepl` should load directly, such as `haskell_library` and `haskell_binary`.

This change also makes the `protoc` binary a dependency of all Haskell build rules, rather than just `haskell_proto_library`. It writes files in the human-readable text format, and then uses the `protoc` binary to convert them to the binary proto format. The text format itself doesn't have a good forwards/backwards compatibility story, which matters since `hrepl` is a separate, standalone executable. (For example, a text format parser will fail if it encounters a field name that it doesn't understand, which makes it more difficult to deprecate old fields.)
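The steps above can be sketched as a sequence of commands. This is only an illustration of the flow; the output group names and the query expression are placeholders, not the actual names `hrepl` uses:

```
# 1. Locate GHC and its flags via the toolchain info proto.
bazel build --output_groups=<toolchain-info-group> <toolchain-info-target>

# 2. Work out the full set of targets to build.
bazel query 'deps(tests/binary-with-lib:lib)'

# 3. Build the immediate dependencies and read their LibraryInfo protos.
bazel build --output_groups=<library-info-group> <dependency-targets>

# 4. "Build" the targets to be interpreted; this produces CompileInfo
#    protos instead of compiled output.
bazel build --output_groups=<compile-info-group> tests/binary-with-lib:lib

# 5. Launch GHCi with the flags accumulated from all the protos above.
ghc --interactive <flags-from-the-protos-above>
```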
Thank you for open sourcing `hrepl` and for opening this PR! I've added a couple of review comments with questions or change requests.
```
}

// Information about a built library.
message LibraryInfo {
```
`LibraryInfo` only mentions shared libraries. How does `hrepl` handle static libraries?

There are two situations where this is relevant.

- A static-mode GHC toolchain (e.g. on Windows) where Haskell libraries are only compiled to static archives. GHCi can load these, provided they are built with `-fPIC` (and `-fexternal-dynamic-refs` on Unix).
- A C library dependency that only provides a `pic_static_library`. GHCi's loader can usually load these.
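For the first case, the build of a GHCi-loadable static archive might look roughly like this. This is only a sketch of the flag combination being discussed, not the exact invocation `rules_haskell` generates:

```
# Compile as position-independent code with external dynamic references,
# so GHCi's loader can load the resulting static archive on Unix.
ghc -c -fPIC -fexternal-dynamic-refs Lib.hs
ar rcs libHSlib.a Lib.o
```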
It doesn't currently handle them. We originally used static archives, but switched a year ago to dynamically-linked GHC to work around bugs in the static loader. It seems pretty reasonable to me that it could support both.
I filed google/hrepl#7 to track this, though as you've noted it would also require changes to rules_haskell itself.
@judah do you remember what bugs those were? We're likewise interested in making the static linker work better for everyone. It would be quite a win: in the common case we could avoid building dynamic libraries altogether.
> though as you've noted it would also require changes to rules_haskell itself.

@judah which required changes are you referring to? `rules_haskell` supports static RTS GHC, sets `-fPIC` and `-fexternal-dynamic-refs` as required, and uses the `pic_static_library` of dependencies where available. On Windows that's the default, as the Windows bindist uses static RTS mode. On Unix it's optional, currently exposed for nixpkgs-provisioned GHC through the `is_static` attribute, since there are no static RTS GHC bindists available for Unix.
For example, Digital Asset's open source daml repository uses static linking in this way. As a test, I've run `hrepl` with this PR's patches on that repository and was able to load a large target with static C library dependencies into GHCi using

```
LIBRARY_PATH=$PWD/bazel-out/k8-opt/bin/compiler/damlc/_ghci_libs-damlc hrepl damlc
```

Looking at `cc_libs` in `compile_info_output_groups`, it doesn't seem to distinguish static and dynamic libraries, so `transitive_cc_shared_libs` will end up holding both. So, maybe just the name is inaccurate.
Ah, I didn't realize that static libraries got passed through the same field of `cc_libs`. I've renamed the field to `transitive_cc_libs`. (I don't have access to a static GHC at the moment, so we may need to leave that to future work.)
We got hit by a few bugs around the static linker in the transition from 8.0->8.4 and 8.4->8.6, including multiple instances of segfaults on ANN pragmas. One example was:
https://gitlab.haskell.org/ghc/ghc/issues/14675
But there were a couple others we never tracked down well enough to file a useful bug report (in particular we weren't able to rule out effects of our internal toolchain).
At the time (early 2019), the existing comments on the GHC issues didn't give us a lot of confidence in the static RTS. But it's good to hear you've taken it up in earnest, and your use of `-fexternal-dynamic-refs` sounds promising; we never discovered that flag, and the problem mentioned in that link (`R_X86_64_PC32`) does sound familiar. I'll make a note to try using that in our codebase.
FWIW my understanding is that interest in the static linker from GHC HQ has varied over the years. When support for dynamic libraries was first introduced, I suppose the vision was to eventually retire the static linker, because it duplicates so much OS functionality. However, the story around dynamic libraries turned out less than ideal. And at this point, I believe even Facebook (who employs @simonmar) might be using a static toolchain, or so https://engineering.fb.com/security/fighting-spam-with-haskell/ would have me believe.
At any rate, the static toolchain is what Digital Asset uses too.
Actually I believe we're not using the static runtime linker any more at Facebook. The static runtime linker requires all linked objects to fit in the low 2GB of the address space, and we eventually ran into that limit and had to switch to building a standard `.so` and using the system dynamic linker to load it.
@simonmar very interesting. And thanks for chiming in! So out of curiosity, do you now only use standard `.so` during your build, or do you still use `-dynamic-too` when building? If the former, that would be nice to speed up the build, but it's not an option for our macOS users, where the Mach-O header size limits are intentionally very low (hence, there are only so many dynamic dependencies one can have).
This sounds weird to me, can’t we make that optional?
One way that comes to mind is to factor out the rule info output groups into an aspect. That aspect would then depend on protoc, and the Haskell rules wouldn't have to. @judah What do you think? Though, my understanding is that
Looking at the logs,
If it’s only lazily requested, then it’s fine.
Yeah, my preference would be to keep using output groups rather than aspects. The former is a little more straightforward, while the latter would also require us to make new providers to pass the relevant information between the regular rules and the aspect(s). (It could also be a bit tricky to integrate with
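For context, attaching such output groups inside the existing rules is roughly this shape. This is a Starlark-style sketch with illustrative names (the helper functions are hypothetical), not the actual `rules_haskell` implementation:

```python
# Starlark-style sketch: the rule emits its info protos as extra output
# groups, so `bazel build --output_groups=...` can request them lazily,
# without an aspect or any new providers.
def _haskell_library_impl(ctx):
    library_info_proto = _write_library_info(ctx)  # hypothetical helper
    compile_info_proto = _write_compile_info(ctx)  # hypothetical helper
    return [
        DefaultInfo(...),
        OutputGroupInfo(
            library_info = depset([library_info_proto]),
            compile_info = depset([compile_info_proto]),
        ),
    ]
```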
We'll pass those automatically in hrepl itself.
That's a good point. Let's stick to the output groups then. I'm also fine with a lazy dependency on protoc. (@judah Sorry, I somehow managed to edit your comment instead of replying. I tried to fix it again.)
The CI failures should be fixed by a rebase on
It may contain either static or shared libraries.
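The renamed field might look roughly like this in the `.proto` schema. This is a hypothetical sketch; the field number and the surrounding message contents are illustrative:

```
message LibraryInfo {
  // Transitive C/C++ library files needed at load time.
  // May contain either static or shared libraries.
  repeated string transitive_cc_libs = 1;  // field number illustrative
}
```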
@judah It would be good if we could also test the `hrepl` integration in `rules_haskell`, so that we don't accidentally break it. What do you think would be a good way to do this? Do you think there's a way to do this in a `bazel test`, or will that have to be tested outside of Bazel?

That doesn't have to be included in this PR, but can go into a separate one. Just wanted to start the discussion.
Since it will also hold static libraries (google/hrepl#7).
/azp run
@aherrmann yes, let's do the testing in a follow-up PR. I can think of a few possibilities:
`protoc` depends on `@zlib` and that collides with the `@zlib` workspace in the `examples`. We cannot directly use `protoc`'s `@zlib` as that library's name is mangled and Cabal won't recognize it.
I took a look at CI
I've pushed fixes to this PR.
The buildkite pipeline is either called `buildkite/rules-haskell/pr` or `buildkite/rules-haskell`. See for example #1215 or #1210. The latter seems to be the case for external PRs, where the buildkite pipeline needs to be initiated manually by pushing to a branch within the repository. However, I have also observed it for internal PRs, e.g. #1214. In any case, we want to accept both pipeline names as mergify merge criteria. I have tested the configuration in the [mergify simulator](https://dashboard.mergify.io/).
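A mergify rule accepting both names could look something like this. This is a hypothetical sketch: the rule name, action, and any other conditions are illustrative, and the regex status condition is the part being relied on:

```yaml
pull_request_rules:
  - name: automatic merge on CI success
    conditions:
      # Accept either pipeline name, with or without the "/pr" suffix.
      - "status-success~=^buildkite/rules-haskell(/pr)?$"
    actions:
      merge:
        method: merge
```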