nix derivation for all build dependencies #898

Open
cmoog opened this issue May 10, 2022 · 4 comments

cmoog commented May 10, 2022

The existing Nix package only includes the CLI itself. For hermetic Nix builds of LaTeX documents, though, we'd need an additional derivation that includes the necessary LaTeX packages. That way, we could have reproducible document builds without network dependencies.

Is this possible with the current architecture? I imagine we'd just need a derivation containing every package; the user would then override TECTONIC_CACHE_DIR during the build step to point at that cache derivation.
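As a rough sketch of the override described above, a build step could pass a modified environment to the `tectonic` process. The store path and the helper name `tectonic_env` are made up for illustration, and whether Tectonic is happy with a cache it cannot write to (the Nix store is read-only) is not something this sketch checks.

```python
import os


def tectonic_env(cache_dir, base_env=None):
    """Return a copy of the environment with TECTONIC_CACHE_DIR set.

    `base_env` defaults to the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["TECTONIC_CACHE_DIR"] = str(cache_dir)
    return env


# Hypothetical store path, used only for illustration.
env = tectonic_env("/nix/store/example-tectonic-cache", base_env={"PATH": "/usr/bin"})
print(env["TECTONIC_CACHE_DIR"])
```

The resulting dictionary could then be handed to something like `subprocess.run(["tectonic", "main.tex"], env=env)` inside the build step.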

Thoughts?

Collaborator

pkgw commented May 11, 2022

Hmm. I'm not really familiar with how nix does things, so I may be missing a few items here, but I'll do my best to answer.

The most direct way to get a fully network-free document build would be to provide a local copy of the bundle file (several gigabytes) and point your document builds at it using the -b argument in the V1 CLI. There is a wrinkle: the -b argument expects a bundle stored as a Zip file, while the online bundle is in a different "indexed tar" format. One can be generated from the other, though, or I could just upload the Zip version to our cloud storage. The other wrinkle is that, as far as I know, the V2 CLI doesn't support local bundle files, but that would be pretty easy to add.

To get a build that doesn't rely on all of those gigabytes of data, you could bootstrap a build as above on a clean cache, export the populated cache directory in some fashion, and then "import" it using $TECTONIC_CACHE_DIR or some kind of manual copy.
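The "export" half of that idea could be as simple as copying the populated cache tree somewhere stable. Here is a minimal sketch, assuming the cache is an ordinary directory of files; the `files/article.cls` layout below is an invented stand-in, not Tectonic's real cache layout, and `export_cache` is a hypothetical helper.

```python
import shutil
import tempfile
from pathlib import Path


def export_cache(populated_cache, export_dir):
    """Copy a populated cache tree to export_dir and list the copied files."""
    src = Path(populated_cache)
    dst = Path(export_dir)
    shutil.copytree(src, dst)  # export_dir must not exist yet
    return sorted(p.relative_to(dst).as_posix() for p in dst.rglob("*") if p.is_file())


# Demo with a stand-in cache; a real one would come from a bootstrap build.
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "cache"
    (cache / "files").mkdir(parents=True)
    (cache / "files" / "article.cls").write_text("stand-in contents")
    exported = export_cache(cache, Path(tmp) / "exported")
    print(exported)
```

The "import" half would then amount to pointing $TECTONIC_CACHE_DIR at the exported copy.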

Finally, I'll mention that the repo tectonic-texlive-bundles contains the infrastructure used to generate those bundle files from the TeXLive upstream, which is done by doing a bunch of processing in Docker containers that point to a checkout of Norbert Preining's Git mirror of the TeXLive SVN repo (which is like 60 gigs or something silly since they commit oodles of binaries to the SVN).

Does that help?

Neved4 commented Oct 6, 2022

ping @cmoog

Author

cmoog commented Oct 6, 2022

Thanks for this detailed response @pkgw, great info here. The trouble is that (to my knowledge) tectonic doesn't provide an easily parsable lockfile from which a Nix expression could determine and download the minimum set of required dependencies. The next best solution would be a way to generate a Nix expression, similar to node2nix, but even that would require hooking into tectonic's dependency parsing/resolution logic.

Finally, you're right that downloading the entire archive of all dependencies would work. I agree that the quickest solution would be a nix derivation that contains a populated cache dir, which could be used at build time by setting $TECTONIC_CACHE_DIR to the derivation path in the nix store.

Collaborator

pkgw commented Oct 24, 2022

Ah, yes. Right now Tectonic doesn't have anything like a lockfile because it doesn't manage dependencies and packages in a fine-grained manner. During document builds, there's no dependency resolution; all Tectonic does is pull files from the bundle upon request. The bundle is built from TeXLive packages but the information about specific packages is (intentionally) erased once the bundle is assembled. (Sorry, I feel like I'm not explaining clearly here.)

Tectonic could definitely emit a very simpleminded "lockfile" with the list of files needed to build a given document. That could be used to pull down the subset of files from the bundle needed to build that document without the network.
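To make the idea concrete, here is a minimal sketch of what such a simpleminded lockfile might contain: one entry per bundle file, with a hash and size so a fetcher could verify it. The JSON shape, the field names, and `make_lockfile` are all assumptions for illustration; Tectonic does not emit anything like this today.

```python
import hashlib
import json


def make_lockfile(files):
    """Build a deterministic JSON lockfile from {file name: file bytes}."""
    entries = [
        {"name": name, "sha256": hashlib.sha256(data).hexdigest(), "size": len(data)}
        for name, data in sorted(files.items())
    ]
    return json.dumps({"version": 1, "files": entries}, indent=2, sort_keys=True)


# Stand-in contents; a real run would record the files a build actually requested.
lock = make_lockfile({"latex.ltx": b"abc", "article.cls": b"abcdef"})
parsed = json.loads(lock)
names = [entry["name"] for entry in parsed["files"]]
print(names)
```

Sorting the entries keeps the output stable across runs, which matters if the lockfile is itself an input to a Nix expression.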

From a very superficial look at what node2nix does, I think one question would be whether the fetchurl fetcher supports HTTP byte-range requests. If it does, I think we could use that sort of lockfile to create a Nix expression that depends only on the pieces of the bundle required by the specific document.
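For reference, a byte-range request just needs a `Range` header computed from where a file sits inside the bundle. The sketch below assumes each index line has the form `name offset length`; that format and both helper names are assumptions here, not a confirmed description of the bundle index.

```python
def parse_index_line(line):
    """Split an assumed 'name offset length' index line into typed fields."""
    name, offset, length = line.split()
    return name, int(offset), int(length)


def range_header(offset, length):
    """Return the HTTP Range header covering `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


name, offset, length = parse_index_line("article.cls 1024 20480")
header = range_header(offset, length)
print(name, header)
```

A lockfile that also recorded offset and length would give a fetcher everything it needs to issue one such request per file.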

If it doesn't, one could get something to work by having a fetchurl expression that depends on the whole bundle. That could be combined with a lockfile, but as long as you're pulling down the whole bundle anyway the details of the lockfile aren't saving you any work.

So I guess either way, the tectonic CLI program would need to provide whatever low-level operations are required to go from these sorts of fetchurl inputs, to a populated cache, to an actual build, I think?
