Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Self-contained haddocks with unified index #249

Merged
merged 1 commit into from
May 17, 2018
Merged

Conversation

mrkkrp
Copy link
Member

@mrkkrp mrkkrp commented May 15, 2018

Close #235, close #232.

This PR solves the following problems:

  • Documentation generated by a single haskell_doc rule is located in a predictable location corresponding to Bazel package in which the rule instance is located and name of doc target. This way it's easy to archive docs or collect them as an artifact on CI.
  • Documentation can be moved anywhere and it's still viewable. That's because we use relative links and all Bazel targets that are transitive deps of the targets specified in deps attribute of haskell_doc are copied to the directory.
  • We use a composite approach by still relying on aspects for efficiency and collecting all docs in single directory for convenience. This is achieved by telling Haddock to use relative links like ../{package_id} when generating per-package docs in aspects and then assembling them in such directory layout that the links turn out to be correct.
  • Unified index is generated the same way as stack does it. The index contains all direct deps, i.e. things from deps attribute. We however dump all docs for Bazel transitive dependencies, so if those top-level deps link to something else, it'll be reachable.

Unsolved:

  • There is no way to get version of a prebuilt dependency from normal rules. Docs for built-in packages that come with GHC link to each other using ../{package-version} scheme. So knowing exact version number is necessary to be able to copy those over and link to them from other docs. This doesn't seem possible. Symlinking like doc-root/package -> doc-root/package-version is not a solution because we don't know package-version part on Skylark side and so we cannot specify it as output, as so it is always lost on moving from sandbox.
  • Another problem is that non-built-in prebuilt deps, like text use full path to other package or ghc in nix store. So these are not portable.

The above issues forced me to abandon the original idea of copying docs of prebuilt dependencies as well. So to view the docs generated using prebuilt dependencies provided by Nix, one must build those packages on his/her system to nix store is populated with the same stuff. I hope that it's mostly non-issue. Although it maybe an issue if we're to serve something online and not browse locally. If packages are built by Bazel, as with Hazel, they should be deps and not prebuilt_deps and so their docs should be fully portable.

@mrkkrp mrkkrp changed the title Self-contained haddocks with unifified index Self-contained haddocks with unified index May 15, 2018
Copy link
Collaborator

@judah judah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool!


all_caches = set.empty()

for dep in ctx.attr.deps:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should pass --read-interface for every transitive dependency, not just the immediate deps. As you have it now, the contents and index files don't contain information about the transitive dependencies.

The individual packages have their own contents/index locations (e.g., bazel-bin/tests/haddock/haddock//testsZShaddockZShaddock-lib-deep-1.0.0/index.html). So I think the top-level files are more useful when they contain everything.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue that we should not by default include everything. Imagine, if we go with Hazel, almost all dependencies will be in the index that way, and so they will distract from actual documentation of packages of interest (go find them first in this long list!). Currently platform-core docs include only platform-* packages, which is what I kept in mind here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But just the immediate dependencies is a fairly arbitrary slice of everything that could be of interest in the repo. In the Hazel use case, when we start using it, we could imagine adding an extra attribute to haskell_library signaling whether the library is a "snapshot package" (in Stack parlance). The Haddock rules could then be taught to ignore those packages for the index.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just create a haskell_doc target and list there all e.g. platform-* packages? You have quite a bit of control: you can create different doc packs with indices including exactly what you want.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a point of comparison, stack haddock always generates both kinds of index (just local packages, and local packages + deps).

For Bazel, using the attribute to toggle between behaviors sounds reasonable to me, although it does mean that to get both kinds of index you need two copies of all the per-rule haddocks.

@@ -182,6 +258,14 @@ haskell_doc = rule(
aspects = [haskell_doc_aspect],
doc = "List of Haskell libraries to generate documentation for.",
),
"include_transitive_deps": attr.bool(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see where this attribute is used? Is it still needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option should actually allow to switch between "include everything" and "include only direct deps" modes. I pushed an update that makes it actually work.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable. I suggest changing the name of this attribute though; at first glance it makes it seem like documentation for transitive deps isn't generated at all (rather than just the index).

Maybe index_transitive_deps instead?

paths.join(doc_dir_raw, "{0}.haddock".format(pkg_id)),

)
hoogle_file = ctx.actions.declare_file(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you intentionally stopping support for generating the hoogle file? If so, we should mention it in the PR description.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not intentional, I've restored Hoogle support.

Copy link

@dleprovost dleprovost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot!


# Ensure that top-level doc directory exists.

"$RULES_HASKELL_MKDIR" -p "$RULES_HASKELL_DOC_DIR"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pattern we use elsewhere too, as an alternative to building a PATH that could introduce impurities. However, I'd be curious to know, what's the idiomatic way to do this in Bazel best practices? Are there established best practices already (in particular by ex-Blaze users) on how to feed scripts with their dependencies? Would it be better to use runfiles instead? Question for @judah.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, if a shell script depends on another binary or file, they'll declare it as a runfile and then reference it within the script using a hard-coded relative path. The path to the script's directory (and thus to the runfiles) is obtained using something like:
$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
which is more robust to symlinks.

However, for binaries like "mkdir" or "cp", all the other scripts I've seen have just called them directly, assuming they would be on the default path. For example:
https://github.com/bazelbuild/bazel/search?utf8=%E2%9C%93&q=mkdir&type=
I guess that only really becomes an issue when you're relying on Nix to supply those binaries.

Honestly, though, for something as simple as this script I'd feel it simpler to just call ctx.actions.run_shell. Both approaches seem defensible though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that only really becomes an issue when you're relying on Nix to supply those binaries.

Right. On NixOS, @mrkkrp has no alternative but to do what he did or to use runfiles. /bin/ just doesn't exist.

Copy link
Collaborator

@judah judah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one naming suggestion

@mrkkrp mrkkrp merged commit 63aa698 into master May 17, 2018
@mrkkrp mrkkrp deleted the self-contained-haddock branch May 17, 2018 09:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants