epub or mobi version #287

Open
slacka opened this Issue Jun 21, 2013 · 42 comments

slacka commented Jun 21, 2013

Would be great for those of us with e-book readers. Thanks!

mdnahas commented Jun 21, 2013

When I looked into this in the past, "mobi" and "ePub" output was
difficult to produce and the results were faulty. LaTeX just isn't meant to generate them.

I use PDFs on my iPad Mini. The book looks fine, but the font is small.
If I have the time, I'll look into tuning the page format for a smaller
screen (smaller margins, larger font, but keep images the same size).

http://tex.stackexchange.com/questions/16374/effort-to-make-latex-ebook-friendly

http://tex.stackexchange.com/questions/3070/preparing-pdf-files-for-ebook-readers-etc


Owner

andrejbauer commented Jun 21, 2013

In the meantime we prepared hott-ebook.pdf with small margins. Please use that until someone figures out how to typeset math for ebooks.

mdnahas commented Jun 21, 2013

Dandy!


Contributor

loopspace commented Jun 21, 2013

I've figured out how to typeset maths for ebooks ...

But it's definitely non-trivial to convert a LaTeX file to ePub. I got something that compiled, but needs some work to make it a true ePub version.

Owner

andrejbauer commented Jun 21, 2013

I will buy you a beer if you can do this.

Contributor

loopspace commented Jun 21, 2013

To give you a flavour of what it might look like (and also to show where work would need to be done) here's the introduction as an ePub:

http://www.math.ntnu.no/~stacey/documents/HoTT-introduction.epub

This is a valid ePub3 with mathematics embedded as MathML. It opens in iBooks, but some of the glyphs are missing (as are the references, but that's because it's just the first chapter). There are probably lots of other oddities that someone familiar with the text would spot.

The big issue (which is common to any LaTeX-to-non-fixed-format conversion) is that for some things you have to work out "What do I want this to be like in the final document?" The answer isn't always the same for PDF as for XHTML (including ePub), because PDF is a visual, fixed-layout format and XHTML isn't.

But fortunately, this is perhaps something that is easy to do via a collaborative effort. If I can get the backbone working then others can put the flesh on it by figuring out the figures and replacements for missing glyphs and so forth.

Owner

andrejbauer commented Jun 21, 2013

Would it make sense to have a separate branch for this? How much will we have to "pollute" the source to get epub working?

Contributor

loopspace commented Jun 21, 2013

Inevitably some messing with the main text will be necessary (particularly with diagrams). At the moment I'm trying my best not to mess with the actual text and just handle things at the macro level.

Though I'm coming up against some issues with your ... creative ... use of the equation environment! I might have to rewrite those as I've not found a way to embed ordinary text inside mathematics and still have the full range of text formatting.

Technical bit: the equation environment signals maths, so we start a MathML piece. Text within that gets put into an mtext tag, but you then want emphasised text within that, and the naive approach doesn't work. I don't yet know how to fix this. Easier would be to devise an alternative environment that didn't start off in mathematics mode. Here's an example from preliminaries.tex:

\begin{equation}\label{eq:tautology1}
\text{\emph{``If not $A$ and not $B$, then not ($A$ or $B$)''}.}
\end{equation}

Technically here, the outer mathematical environment is doing nothing. But an automated system still sees it.

An alternative would be to have a textequation environment which centres its contents and attaches the equation number exactly as the equation environment does, but doesn't go into maths mode.

Owner

andrejbauer commented Jun 21, 2013

For that sort of thing we probably don't care whether it really is a math environment. It just has to look kind of the same.

Contributor

loopspace commented Jun 21, 2013

A very simple version would be:

\newenvironment{textequation}{\equation\hbox\bgroup}{\egroup\endequation}
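
As a rough sketch of how this might be used (the surrounding document is only scaffolding for illustration, and this assumes the definition above sits in the preamble), the example from preliminaries.tex would become:

```latex
\documentclass{article}
% The simple definition from above: it keeps the equation numbering
% but typesets the body as ordinary text inside an \hbox.
\newenvironment{textequation}{\equation\hbox\bgroup}{\egroup\endequation}

\begin{document}
% The body is in text mode, so \emph works directly and the
% \text{...} wrapper is no longer needed.
\begin{textequation}\label{eq:tautology1}%
\emph{``If not $A$ and not $B$, then not ($A$ or $B$)''.}
\end{textequation}

The statement is numbered (\ref{eq:tautology1}) like any other equation.
\end{document}
```

Because the body sits in an \hbox, it will not break across lines, so this simple version only suits one-line statements.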

Owner

andrejbauer commented Jun 21, 2013

I am inclined to start this off as a fork first, and then pull it in when we can be sure it's working. I imagine you already have a fork. I'll follow you there, and maybe you can post a link to it here for people who are interested in cooperation.

Or we make a branch. I am not sure which is The Correct Thing to do in this case.

Contributor

mikeshulman commented Jun 21, 2013

I guess the reason to make this a math environment was to get it an
equation number.

Contributor

dvanduzer commented Jun 21, 2013

Here are some samples generated using pandoc (v1.11.1 compiled with texmath 0.6.1.3):

$ pandoc -t epub3 -f latex -o ~/Dropbox/HoTT/hott-intro-pandoc.epub introduction.tex

https://www.dropbox.com/s/saxm6cdqnjccd8g/hott-intro-pandoc.epub

$ pandoc -t epub3 -f latex -R -o ~/Dropbox/HoTT/hott-intro-pandoc-raw.epub introduction.tex

https://www.dropbox.com/s/hmoan0vzio2u5o9/hott-intro-pandoc-raw.epub

Any macro debugging will help all of these efforts. The best github way to do this would probably be to create an "epub" feature branch in your project. Contributors would still use forks and pull requests to submit changes to the "epub" branch, and whichever work pans out can later be merged into the master branch on the official project.

Contributor

loopspace commented Jun 21, 2013

The textequation that I gave above will get an equation number.

I foresee two types of changes needed to get an epub version. Some changes are things that can be done on the main version and have no effect there but make life easier for conversion scripts (such as mine or pandoc). These are things like the textequation environment, and changing underscores in reference names to hyphens.
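
To illustrate the label renaming, here is a sketch with a made-up label name (not one taken from the book). The label name does not appear in the output, so the PDF is unaffected either way:

```latex
\documentclass{article}
\begin{document}
% Before (hypothetical label): \section{Paths}\label{sec:path_induction}
% After: the same label with a hyphen. Every \ref or \cref that
% points at it has to be renamed in the same way.
\section{Paths}\label{sec:path-induction}
Section~\ref{sec:path-induction} is referenced here.
\end{document}
```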

Others will be things that are specific to an epub version.

I'm currently working just on my own machine. I'll create a github fork sometime soon and make it a bit more official. I wanted to see if it were even possible before doing anything serious.

Right, here's what I've gotten today. It's everything up to the appendix, apart from the fact that I've blanked out a few things that I know are going to be tricky (diagrams mainly) and there are a few symbols that I can't find in the unicode list. It opens in Azardi, I haven't tried it in iBooks yet. There are validation errors, mostly due to references but a few due to maths in the section titles.

http://www.math.ntnu.no/~stacey/documents/HoTT-main.epub

Contributor

loopspace commented Jun 21, 2013

I've uploaded a new version to the above link which does open in iBooks on an iPad. It contains all the sections up to the appendix.

Owner

andrejbauer commented Jun 22, 2013

This looks very promising! On my iPad in iBooks there seem to be missing math fonts.

Contributor

loopspace commented Jun 22, 2013

@andrejbauer Yes, it would appear that iBooks doesn't come with enough fonts to cover the Unicode range. Fortunately, it is possible to embed fonts in an epub. I've just replaced the above version with one with the main STIX fonts embedded (the STIX licence allows this, I believe) and that fixes a lot of the missing glyphs. There are still some missing (integral signs and the like) because I didn't embed the full STIX family on the first go as I didn't know if it would work.

Anyway, try it again from http://www.math.ntnu.no/%7Estacey/documents/HoTT-main.epub

Owner

andrejbauer commented Jun 23, 2013

It is using German gothic letters instead of mathsf, weird. I am still amazed at how quickly you got this far.

Contributor

loopspace commented Jun 23, 2013

Oh, that's because I was lazy and couldn't remember off the top of my head what mathsf looked like so I set it to mathfrak for the time being.

Hmm, the bit that actually does the conversion to MathML doesn't yet know about mathsf. I've set it to mathrm for the time being and will look into adding that later.

Contributor

loopspace commented Jun 24, 2013

New version uploaded (same link as above), now includes bibliography.

Things that need work:

  • Indexing will take a bit to get right.
  • tikzpicture and xymatrix stuff needs converting to proper diagrams
  • need to implement tables
  • the stuff used in formal.tex needs implementing (mathpartir) which means figuring out how all those diagrams should look in MathML, so for now that file isn't included

Contributor

loopspace commented Jun 25, 2013

I have created a proper fork for this project now. I'm new to git/github so I may get things horribly wrong!

Contributor

loopspace commented Jun 25, 2013

The stuff in formal didn't take too much (though it could probably be made better). Latest version has that in as well. Tables also (though the colours in the homotopy groups of spheres aren't as yet). So indexing and figures are the main missing bits.

Member

guillaumebrunerie commented Jun 27, 2013

On a Cybook e-reader, there are still a lot of symbols missing, for instance \to, \mathbb{N}, \equiv, \simeq, \top, \bot, \vee, \wedge …
Not all of them are missing, for instance greek letters, \times, \infty, \int are present (yes, there are some integral signs in the book ^^).
Also, indices and exponents do not seem to work (but maybe this e-reader does not understand MathML)

Contributor

loopspace commented Jun 28, 2013

@guillaumebrunerie There aren't that many ebook readers that can cope with MathML so it wouldn't surprise me if it didn't work. I know that iBooks has rudimentary support, and there are some desktop readers that work as well (Azardi does, Calibre has MathJax embedded). Sorry about that.

Contributor

dvanduzer commented Jun 28, 2013

If you're seeing integral signs, then the markup you're getting might just be a symptom of it being a work in progress.

MathML is part of the newer ePub 3.0 standard, so look for e-reader software that supports it.

Contributor

loopspace commented Jun 28, 2013

(New version just uploaded.)

Yes, definitely a work in progress! I've just uploaded a screenshot to G+ showing that at certain font sizes it renders very curiously. I suspect that I need to embed more fonts, probably the entire STIX family.

Contributor

loopspace commented Jun 28, 2013

I just tried it out in the Firefox ePub reader add-on and it looks much better than in iBooks (superiority of Gecko's MathML rendering over Webkit's I guess) which suggests that there is a limit to how good this is going to get.

Member

guillaumebrunerie commented Jun 28, 2013

It’s a bit strange that MathML is mandatory for ePub 3 but not the mathematical fonts. What is MathML good for if you can’t display any mathematical symbol?

Contributor

loopspace commented Jun 28, 2013

My understanding is that MathML isn't mandatory for ePub3 but that ePub3 works on a modular system so it's possible for an ebook reader to say "We support the core of ePub3" without actually supporting MathML.

For iBooks, MathML support is almost an accident. iBooks is based on Webkit and so when something is supported in Webkit then it automatically gets supported in iBooks. But that doesn't mean that they've deliberately thought about it in great detail (I'm not saying they haven't, just that it needn't follow). Given that its support in Webkit is nowhere near Gecko's, I guess the rationale is that there's no point embedding a slew of fonts until it is more developed.

But I'm guessing on most of this. It might just be that I don't yet know how to make it use the fonts correctly. It certainly looks okay in the Firefox epub reader with no missing glyphs (well, hardly any).

I have downloaded the epub version from the link that loopspace posted 8 days ago to my Sony PRS T2 ebook reader. Most formulas look o.k. but
a) no subscripts or superscripts, e.g. in the formula for ISO(A,B) in the section on constructivity from the introduction, everything is on one line. \pi_1 or x^2 is rendered on one line, like "x 2"
(But footnotes ARE shown as superscripts.)
b) certain missing symbols are shown as a crossed rectangle, for example mathcal-U (in particular) and mathbb-S (but not mathbb-Z).

Besides
c) The math fonts are upright and not italic.
d) displayed equations are left-aligned.
e) equation numbers appear as 1.2.1 instead of (1.2.1) in the book, both where they are defined and where they are cited (but there they are highlighted as hyperlinks)
f) there are boxes around theorems, lemmas and proofs etc.
g) there is no table of contents
h) figures are missing.

I have also looked at it with the FBReader program on a Linux computer, and there it did not look better (the layout is even worse, equation numbers are on the left, tables are not usable at all).

Contributor

loopspace commented Jul 1, 2013

Let me take your comments one-by-one. (Incidentally, I keep updating the book but the link stays the same. The current version dates from June 28th.)

a) This is a sure sign that your ebook reader doesn't support MathML. There's not a lot I can do about that! The superscripting for footnotes is actually done via normal CSS not MathML so I'd actually be more surprised if they didn't work.
b) I was having a few issues with fonts and how font selection should work. I'm not sure that fixing this will fix the missing glyphs for you, but just in case I've uploaded the most recent version. Some of the more popular mathematical symbols (such as mathbb-Z) have made their way into "ordinary" fonts, but for the full range you need something like the STIX fonts. As they tend not to be available on most devices (are there any that have them?), I've embedded them in the book. But if your reader doesn't support embedded fonts then there's nothing I can do.
c) Might be the font selection issue, might be lack of support for MathML. MathML puts single tokens in italics by default so the CSS doesn't need to specify that. If the MathML is being ignored, all the reader sees is, say "x" not "x-in-italics" so renders it in normal font.
d) Should be centred.
e) Where they are defined they should have parentheses around them (not sure about cited). But the parentheses are inserted by CSS (the :before and :after pseudo-elements), so if the reader doesn't support them then it won't render them.
f) Yes, I put those in as an experiment. I find it helps me see where proofs and theorems end.
g) Should be there, but ePub3 and ePub2 have different methods for tables of contents so if your reader is basically saying "Well, it's sort of like ePub2 so I'll have a go" then I'm not surprised this is missing.
h) Not yet done.

On a computer, you can try either Azardi or the epub add-on for Firefox to see what it ought to look like.

Contributor

loopspace commented Jul 4, 2013

New version uploaded (same link as above). The symbol index and regular index are now in place.

Question for anyone still reading: in the PDF, the index links back to the page on which the index entry was made. In an eBook we can link to the same place but we don't know the actual page number since that can change depending on user settings. What should the link text be in this situation? That is, in the actual index it might say:

category 126

where 126 is the page number and is hyperlinked to where the index entry was defined. But in the ebook, it might not be page 126. We can't know what page it will be on but we need some text there to provide something to click on. What should that text be?

> need /some/ text there to provide something to click on. What should that text be?

Could the link be the entry itself? (without anything additional to click on)

Contributor

loopspace commented Jul 4, 2013

Hadn't thought of that.

I'm not sure it would work. For one thing, one index entry might have several places to link back to (∞-category has about a dozen). For another, the indexed entry might be a bit complicated (for example, contain mathematics) and I'm not sure how well that would interact with being a link.

> I'm not sure it would work. For one thing, one index entry might have several places to
> link back to (∞-category has about a dozen).

That's a point. How about paragraph or section numbers, like §1.2, §1.2, §3.2.1?
Like page numbers, they give the reader a hint whether several occurrences are close
together or spread over different parts.

(And remember there is always the fall-back option of writing "click here", "click here",
"click here", "click here".)

Contributor

mikeshulman commented Jul 4, 2013

I think section numbers seems like the best choice.

Contributor

loopspace commented Jul 5, 2013

It turned out that the easiest approach was to use \cref to reference the "enclosing block", whatever that might be. So if the indexed entry was in running text you might get "Chapter 2" but if in a definition you'll get "Definition 3.1.7". The link goes to wherever the label was, though.

I need to do the same for the symbol index.

Contributor

loopspace commented Jul 5, 2013

Symbol index done (I changed the definition of \pg to \cref).
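
In a plain LaTeX build, that change might look roughly like the sketch below. It assumes \pg was a one-argument wrapper around \pageref in the PDF version (the original definition shown in the comment is a guess, not taken from the source) and that cleveref is loaded; the section and label are made up for illustration.

```latex
\documentclass{article}
\usepackage{cleveref}
% Assumed PDF-build definition (hypothetical): link text is a page number.
% \newcommand{\pg}[1]{\pageref{#1}}
% E-book variant: the link text names the labelled block instead,
% which stays meaningful when the pages reflow.
\newcommand{\pg}[1]{\cref{#1}}

\begin{document}
\section{Categories}\label{sec:categories}
Symbol index entry: see \pg{sec:categories}.
\end{document}
```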

This is not about an epub version but a PDF version for ebook readers. (related to issue #290)
I have prepared one for myself and put it at http://page.mi.fu-berlin.de/rote/Kram/hott-EB.pdf

It has the following features.

  • Changed the page aspect ratio to 4:3 (approximately)
  • Smaller page area, reduced margins (and therefore larger display size)
  • Overlapping pages

I found myself frustrated over flipping back and forth over a theorem or formula that straddles page boundaries. The bottom third of the screen is now devoted to a preview of the next page. I remember seeing mediaeval books or manuscripts where the first syllable or word of a page was repeated at the end of the previous page. I wonder how long it will take me to get used to reading in this mode; for pure text it is not so usual. The ideal would rather be a long continuous scroll, but that does not work on my reader. But maybe math is not good for ebook readers anyway.
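
For anyone wanting to try something similar, the first two items in the list above can be approximated with the geometry package. This is only a sketch: the dimensions are illustrative assumptions, not the ones used for hott-EB.pdf, and it does not attempt the overlapping-pages preview.

```latex
\documentclass[11pt]{book}
% A 3:4 (width:height) page with narrow margins. All lengths are
% illustrative and would need tuning for the target screen.
\usepackage[paperwidth=120mm,paperheight=160mm,margin=4mm]{geometry}

\begin{document}
With these settings the text block fills almost the whole screen.
\end{document}
```

Changing the text width this way moves every line break, which is why each extra size means another pass over the whole book.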

Contributor

mikeshulman commented Jul 6, 2013

I guess you were unsatisfied with hott-ebook, which is like the printed ustrade version without the margins? We (particularly @andrejbauer) decided that we're not willing to "officially" maintain versions with more than two different text width settings, as each new such setting requires going through the entire book to adjust the line breaks and so on.

The repetition is an interesting idea! I'm not sure what I think about that.

It was a quick hack, and it has some quirks: When an html link goes across a page boundary, the whole page becomes the clickable area. References from TOC or index jump to the previous page of the intended target, and the heading on the top of the page is the appropriate one for the next page. Still, I like it better than hott-ebook, but it is definitely not suitable as a "production" version. (What about adapting the height of hott-ebook? You don't have a lot of figures and such about whose placement you need to worry. I don't know on what other devices people read hott-ebook, but 4:3 seems to be a typical aspect ratio.)

What I would still like is to jump from the index to the right place, not to the top of the page. (like in loopspace's ebook version)

I just noted that the lines of hott-ebook are just a tiny bit longer than in my version; I will look into that.
(And there is an index entry for "anger". Is that a joke?)

Contributor

mikeshulman commented Jul 7, 2013

I'm not very skilled with formatting issues, but maybe @andrejbauer can have a look when he gets back from his vacation.

(yes.)

JasonGross added a commit to JasonGross/book that referenced this issue Mar 28, 2014

Update cleveref from 0.17.9 to 0.19
I'm trying to get the book to compile with htlatex so that we can have a
nice kindle-compatible epub, and it requires updating cleveref.
(Although I am a fan of MathML and the epubs for #287, e.g., of
@loopspace and @dvanduzer, Kindle doesn't currently support MathML at
all, and htlatex uses embedded images for the math.)
03b8e1c