How to find available bundles? #685

Open
hugobuddel opened this issue Nov 20, 2020 · 14 comments
Labels
bundles and files Bundle issues, finding fonts, etc. docs Documentation edits & improvements

Comments

@hugobuddel

Different bundles can be specified with -w or -b, but the documentation does not say where to find these bundles.

According to https://tectonic-typesetting.github.io/en-US/ :

The underlying “bundle” technology allows for completely reproducible document compiles.
Thanks to the Dataverse Project for hosting the large LaTeX resource files!

But I can't find the previous bundles, so my documents are not reproducible 😢. How can we find which bundles are available?

After some investigation, I've found these 3 urls:

The first two redirect to the Dataverse Project, and indeed can be found there:

This is enough for me now, and I'll make sure to store these bundles locally. But maybe we can formalize this a bit and add these references to the documentation.


For what it's worth, the specific problem I'm trying to solve: The TeXLive 2020.0 bundle (#669) produces a biblatex control file with a version that is newer than my biber can handle (see also #35, #53):

ERROR - Error: Found biblatex control file version 3.7, expected version 3.5.
This means that your biber (2.12) and biblatex (3.14) versions are incompatible.

I install tectonic through conda and there is no newer biber version on conda.

@pkgw
Collaborator

pkgw commented Nov 20, 2020

Yeah, it wouldn't hurt to document this somewhere.

For the record, my current vision is to make compilation more cargo-like than rustc-like, if you will, and introduce a Tectonic.toml file defining how a document is compiled, and a Tectonic.lock file recording the various pieces of state needed to yield reproducible builds. For each document, the lockfile would record the resolved bundle URL (among other things), ensuring reproducibility even as the default bundle gets upgraded over time.

@faywong

faywong commented Feb 8, 2021

@pkgw Why are these bundles so large (for example, the 2020.0 bundle is a huge 2.6G)?

A 2.6G file may not be affordable in some countries, as bandwidth is expensive.

Is there a minimal bundle that is suitable for bootstrapping basic TeX (for example, \begin{document} hello world! \end{document})?

@hugobuddel
Author

hugobuddel commented Feb 8, 2021

One nice thing about tectonic is that it is not necessary to download the bundles. Write that document and compile with tectonic and it should fetch just the packages you need. (I guess that's why they are tar files and not tar.gz files, because it would allow better indexing.)

Tectonic caches those packages, so you only have to download them once. When you include a new package in your document, it will be fetched. So tectonic as-is should be very well suited for your goals.

Downloading these bundles manually is not the normal way to use tectonic. I created this issue because I had some problems with the latest bundle, so wanted to see whether the problems disappeared with the older bundles. (Answer: well, kinda but not really and it wasn't a problem with the bundle per se.)

(Side note: if you install LaTeX through your linux distribution, you should not install the documentation. The documentation seems to be much larger than the packages themselves. Which is a good thing, but not when you are on expensive bandwidth.)

Edit: I did some tests: the document you describe puts about 40 megabytes in ~/.cache/Tectonic.

@faywong

faywong commented Feb 9, 2021

One nice thing about tectonic is that it is not necessary to download the bundles. Write that document and compile with tectonic and it should fetch just the packages you need. (I guess that's why they are tar files and not tar.gz files, because it would allow better indexing.)

I have done a test to verify this (the above conclusion doesn't apply on my side):

  1. Create a doc.tex with the following content (it doesn't use any extra packages):
\documentclass{article}
\begin{document}
	\paragraph{hello, world}
\end{document}
  2. Compile it with tectonic:
tectonic -X compile doc.tex
note: "version 2" Tectonic command-line interface activated
note: connecting to https://archive.org/services/purl/net/pkgwpub/tectonic-default
  3. As I need a proxy to reach the website, I can't download the bundle from the CLI, so I tried to download it with the Chrome browser (it says I need another 21h to download the 2.6G bundle).

So on my side, the 2.6G bundle is a must to bootstrap tectonic for the first time.

@hugobuddel
Author

OK, that's a bummer. Tectonic.zip (7MB) is a zip file of my ~/.cache/Tectonic directory that was sufficient for me to compile that test document without an internet connection.

You could try to bypass the bundle altogether by unpacking that file and placing the contents in your ~/.cache directory, such that you have ~/.cache/Tectonic/files/00/c1e9c387b218901ebe736f6b06f8505669ed714934482d9c08cfc6f74c4caf files etc. The file is actually a tar-gz-zip file (because apparently that leads to the smallest file size, and github only accepts zip files), so you first need to unzip, and then untar.

However, even the smallest change to the document can lead tectonic to fetch more files. E.g. adding a \section requires it to download extra fonts (I've included those). So using the attached tarball is not a proper solution and I don't recommend it. However, given this bootstrap you might be able to figure out yourself how to manually download and add those needed files. Because it seems the files are just sorted by their sha256sum:

~/.cache/Tectonic/files/00$ sha256sum c1e9c387b218901ebe736f6b06f8505669ed714934482d9c08cfc6f74c4caf 
00c1e9c387b218901ebe736f6b06f8505669ed714934482d9c08cfc6f74c4caf  c1e9c387b218901ebe736f6b06f8505669ed714934482d9c08cfc6f74c4caf

So if tectonic needs a file, you could try to find that file manually yourself and then put it in the right place with the right name. I cannot really help you further, but I know the frustration of having bad internet, so maybe you can get started with this. (I'm just a tectonic user like you.)

The best way forward for you would perhaps be to figure out a way to let tectonic on the command line use the same proxy that Chrome uses. But I wouldn't know how to get started with that. (Hmm, since tectonic is mostly Rust, wouldn't it be cool to get it running in the browser directly? E.g. #166.)

Other than that, it seems that tectonic might not be a good fit for your situation. You would probably do better by finding a local proxy with a recent TeX Live distribution that you can download. Maybe there are Linux installation DVDs that you can order. (Or well, maybe find someone that can mail you a DVD / USB stick with the tectonic bundle on it; it is free software.)

@hugobuddel
Author

To dig a bit deeper on how to add files yourself. Say tectonic complains that it needs loadhyph-pl.tex (I'm not sure whether it does that though). Then you can look in ~/.cache/Tectonic/indexes/5131b19b08f5628f7a5ccfb7d408f43dc8265c9c50eeb9686c43657265a2f4e4.txt to find this line:

loadhyph-pl.tex 2764703744 1160

The first number is the byte offset at which that file starts in the bundle, the second number is its length. The byte range passed to curl is inclusive, so add the two and subtract 1 to get the last byte: 2764703744 + 1160 - 1 = 2764704903. Then download only that part of the bundle and pipe it into the file:

curl -r 2764703744-2764704903 https://ttassets.z13.web.core.windows.net/tlextras-2020.0r0.tar > loadhyph-pl.tex

(Adapt curl to use your proxy, I don't know how to do that. Maybe you can even do it in chrome directly or with a plugin, I don't know.)
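The range arithmetic above can be sketched in a few lines of Python. This is only an illustration, assuming index lines have the "name offset length" layout shown in this thread; the function name range_for_index_line is made up for the example and is not part of tectonic.

```python
def range_for_index_line(line: str) -> tuple[str, str]:
    """Return (file name, HTTP Range header value) for one index line.

    Assumes the "name offset length" layout seen in the cached index file.
    """
    # Split from the right so only the last two fields are read as numbers.
    name, offset, length = line.rsplit(maxsplit=2)
    start = int(offset)
    end = start + int(length) - 1  # HTTP byte ranges are inclusive
    return name, f"bytes={start}-{end}"


name, header = range_for_index_line("loadhyph-pl.tex 2764703744 1160")
print(name, header)
```

The part after "bytes=" is the same start-end pair that is passed to curl -r above.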

Then use sha256sum to find the directory and file name you should use:

$ sha256sum loadhyph-pl.tex 
00c1e9c387b218901ebe736f6b06f8505669ed714934482d9c08cfc6f74c4caf  loadhyph-pl.tex

Take the first two characters for the directory, and the rest as the filename, so loadhyph-pl.tex would become

~/.cache/Tectonic/files/00/c1e9c387b218901ebe736f6b06f8505669ed714934482d9c08cfc6f74c4caf
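The naming scheme can be written down as a small Python sketch. It assumes the cache layout is exactly what is observed in this thread (files/<first two hex characters>/<remaining characters> of the SHA-256 digest); cache_path_for is a hypothetical helper, not a tectonic API.

```python
import hashlib
from pathlib import Path


def cache_path_for(data: bytes, cache_root: Path) -> Path:
    """Compute where a file with this content would sit in the cache.

    Assumption: the layout is files/<2 hex chars>/<other 62 hex chars>,
    as observed in ~/.cache/Tectonic in this thread.
    """
    digest = hashlib.sha256(data).hexdigest()
    return cache_root / "files" / digest[:2] / digest[2:]


# Example with placeholder content; use the bytes of the downloaded file instead.
print(cache_path_for(b"example content", Path.home() / ".cache" / "Tectonic"))
```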

Hope you can use this information to your advantage.

@hugobuddel
Author

Tectonic does not directly tell you what files you need, but you can still figure it out from the logs.

E.g. adding a \tiny to the document will give this error without internet:

$ tectonic --keep-logs test.tex 
note: this is a BETA release; ask questions and report bugs at https://tectonic.newton.cx/
Running TeX ...
note: downloading SHA256SUM
warning: failure requesting "SHA256SUM" from network
caused by: https://ttassets.z13.web.core.windows.net/tlextras-2020.0r0.tar: error trying to connect: failed to lookup address information: Temporary failure in name resolution
...
note: connecting to https://archive.org/services/purl/net/pkgwpub/tectonic-default
error: test.tex:3: Font TU/lmr/m/n/5=[lmroman5-regular]:mapping=tex-text; at 5.0pt not loadable: Metric (TFM) file or installed font not found
Writing `test.log` (2.04 KiB)
error: something bad happened inside TeX; its output follows:

===============================================================================
(test.tex
LaTeX2e <2020-02-02> patch level 5
L3 programming layer <2020-03-06> (article.cls
Document Class: article 2019/12/20 v1.4l Standard LaTeX document class
(size10.clo)) (l3backend-xdvipdfmx.def)
No file test.aux.
(ts1cmr.fd)
! Font TU/lmr/m/n/5=[lmroman5-regular]:mapping=tex-text; at 5.0pt not loadable:
 Metric (TFM) file or installed font not found.
<to be read again> 
                   relax 
l.3 \tiny
         
No pages of output.
Transcript written on test.log.
===============================================================================
error: the TeX engine had an unrecoverable error
caused by: halted on potentially-recoverable error as specified

From this you can extract that it is looking for lmroman5-regular, and lmroman5-regular.otf is in the index file, so that's the file you need. Hope this is useful.
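As a rough sketch, the lookup can be automated in Python. It assumes that missing fonts show up in the log in the "=[name]:" form seen above and that the index lists file names with an extension; missing_font_candidates is a hypothetical helper name, and other kinds of missing files would need different patterns.

```python
import re


def missing_font_candidates(log_text: str, index_names: list[str]) -> list[str]:
    """List index entries whose base name matches a font named in the log."""
    # Font names appear in square brackets, e.g. "=[lmroman5-regular]:".
    wanted = set(re.findall(r"=\[([^\]]+)\]", log_text))
    # Compare against each index name with its extension stripped.
    return [n for n in index_names if n.rsplit(".", 1)[0] in wanted]


log = "! Font TU/lmr/m/n/5=[lmroman5-regular]:mapping=tex-text; at 5.0pt not loadable:"
print(missing_font_candidates(log, ["lmroman5-regular.otf", "loadhyph-pl.tex"]))
```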

@faywong

faywong commented Feb 9, 2021

Fortunately I can access https://ttassets.z13.web.core.windows.net/tlextras-2020.0r0.tar without any proxy and bootstrap tectonic (I suppose that is the case; I will verify it when I'm at home :)).
Also many thanks to @hugobuddel for providing me with so many details about how the tectonic bundle works.

I think the index file works like metadata (just like a map in real life, or the package list of an rpm/deb repo for Red Hat/Debian Linux), and its size is more acceptable for most people. On my side it is:

-rw-r--r--@ 1 faywong staff 4.8M 2 8 16:11 indexes/5131b19b08f5628f7a5ccfb7d408f43dc8265c9c50eeb9686c43657265a2f4e4.txt

This index file should be distributed separately from the overall bundle, so as to reduce the bootstrap cost of tectonic.

Tectonic can also be used as a library, so an initial 2.6G download is a huge cost for a standalone application that integrates it. In my opinion, a more flexible package download strategy would be a good improvement.

@hugobuddel
Author

For clarity: normally tectonic does not download the full 2.6GB file. It only downloads the parts of the file it actually needs, probably using the mechanism I described, but automatically instead of manually.

From your comments it seems that https://archive.org/services/purl/net/pkgwpub/tectonic-default is blocked by your proxy, but https://ttassets.z13.web.core.windows.net/tlextras-2020.0r0.tar is not (they are the same). If so, then you can manually specify to use the latter:

tectonic -w https://ttassets.z13.web.core.windows.net/tlextras-2020.0r0.tar test.tex

Maybe this would be the best option for you.

@pkgw it seems that the Internet Archive is blocked in China. So a billion people cannot use tectonic with the default settings. Maybe it would be possible to have several default URLs to fetch the packages?

@pkgw
Collaborator

pkgw commented Feb 9, 2021

@hugobuddel It's a good point. Better than having multiple default URLs, it might be good to change to one that is simply more globally accessible. The only reason I use archive.org is that their PURL service gives me an easily reprogrammable HTTP 302 redirection, without the sysadmining having to be my responsibility. When I've looked in the past, it has been surprisingly difficult to find such a service, at least for free. But at this point I'm OK with non-free services so there are probably more options now.

(I say that this is better than having multiple default URLs because with multiple options, there are issues of making sure to update both of them at once, etc. Seeing as it's a simple redirection service I think it is probably better to have a single source of truth, even if it is a single point of failure as well.)

Anyway, this is all a bit outside of the main topic of this issue; discussion of this topic should probably go to a new one.

@faywong

faywong commented Feb 10, 2021

For clarity: normally tectonic does not download the full 2.6GB file. It only downloads the parts of the file it actually needs, probably using the mechanism I described, but automatically instead of manually.

@hugobuddel Got it, thanks for such a good demonstration of the bundle mechanism tectonic uses.

(I say that this is better than having multiple default URLs because with multiple options, there are issues of making sure to update both of them at once, etc. Seeing as it's a simple redirection service I think it is probably better to have a single source of truth, even if it is a single point of failure as well.)

@pkgw I agree with you. Multiple default URLs bring in more complicated behavior (failures and version mismatches become harder to locate); they reduce the possibility of failure but are not a root solution.

@faywong

faywong commented Feb 10, 2021

Also, I noticed that the CTAN archive is mirrored across Chinese educational institutions (for example, the Tsinghua University mirror).

If the tectonic bundle can be accepted by the CTAN archive, it will be mirrored and synced on more servers. I'm just concerned about the CDN expense of hosting the bundles. :)

@pkgw
Collaborator

pkgw commented Feb 12, 2021

@faywong The core hosting of the bundle files actually isn't, or at least shouldn't be, an issue at the moment. The issue here is having a reliable service that redirects a globally stable URL (embedded in the source code and distributed executables) to the latest version of those bundle files. This is basically the functionality of URL shortener services, which are of course quite common, but I've had trouble finding something that meets all of the requirements that I'd like to uphold for Tectonic: mainly that the ownership is transferrable, that I can update the redirection target, and that (ideally) the service is free.

(I've also had issues with the hosting in the past, because it can be tricky to find a place that hosts large files and supports HTTP Range requests, but I decided to bite the bullet and start paying to host on an Azure Storage service, and that should solve that problem until/unless the costs grow hugely.)
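For anyone evaluating a host for this, a response to a request that carries a Range header can be classified with a small check. This is only a sketch of the general HTTP behaviour (status 206 Partial Content plus a Content-Range header), not code from tectonic; honors_range_request is a hypothetical name.

```python
def honors_range_request(status: int, headers: dict[str, str]) -> bool:
    """Return True if a response to a ranged request was actually partial.

    A server that ignores the Range header typically answers 200 with the
    whole file, which is what makes a host unsuitable for partial fetches.
    """
    # Header names are case-insensitive in HTTP, so normalise them first.
    lowered = {key.lower(): value for key, value in headers.items()}
    return status == 206 and "content-range" in lowered


print(honors_range_request(206, {"Content-Range": "bytes 0-0/1160"}))
print(honors_range_request(200, {"Content-Length": "1160"}))
```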

@mirrorinf

@pkgw It would be ideal if the latex packages (.cls .sty files etc.) can be downloaded from a user specified CTAN mirror.
