HTML Export #721

Open
Nitwel opened this issue Apr 10, 2023 · 84 comments
Labels
feature request New feature or request html Related to HTML export

Comments

@Nitwel

Nitwel commented Apr 10, 2023

These are very much draft notes for now and will be worked on in the coming days, so take everything with a grain of salt.

tl;dr This already looks really promising in terms of translatability. The main challenge will likely be rewriting the layout process into separate steps, as discussed.

Comparison Typst => HTML

How easily most functions could be converted into an HTML/CSS counterpart.

Text

  • #lorem() -> #text
  • #emph -> <em>
  • #linebreak -> <br>
  • #lower -> <span class="lower"> or similar with text-transform: lowercase
  • #overline -> <span class="overline"> with text-decoration: overline
  • #raw -> <code> with syntax highlighting done over a <div> instead (possibly highlight.js)
  • #smallcaps -> <span class="smallcaps">
  • #smartquote -> #text
  • #strike -> <s>
  • #strong -> <strong>
  • #sub -> <sub>
  • #super -> <sup>
  • #text -> <span>
  • #underline -> <span class="underline"> with text-decoration: underline
  • #upper -> <span class="upper"> or similar with text-transform: uppercase
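
To make the list above concrete, here is a minimal sketch of how such a mapping could be expressed as a lookup table. Everything in it is hypothetical: TEXT_MAPPING, STYLESHEET and render_text are illustration names, not part of Typst, and the small-caps rule is an assumption since the list does not name a CSS property for it.

```python
from html import escape

# Hypothetical lookup table transcribed from the list above:
# Typst function name -> (HTML tag, CSS class or None).
TEXT_MAPPING = {
    "emph": ("em", None),
    "strong": ("strong", None),
    "strike": ("s", None),
    "sub": ("sub", None),
    "super": ("sup", None),
    "text": ("span", None),
    "lower": ("span", "lower"),
    "upper": ("span", "upper"),
    "overline": ("span", "overline"),
    "underline": ("span", "underline"),
    "smallcaps": ("span", "smallcaps"),
}

# CSS that the class-based entries would rely on. The smallcaps rule is an
# assumption; the list above only names the class, not the property.
STYLESHEET = """\
.lower { text-transform: lowercase; }
.upper { text-transform: uppercase; }
.overline { text-decoration: overline; }
.underline { text-decoration: underline; }
.smallcaps { font-variant: small-caps; }
"""


def render_text(func: str, body: str) -> str:
    """Wrap plain-text `body` in the element mapped to the Typst function `func`."""
    tag, css_class = TEXT_MAPPING[func]
    attr = f' class="{css_class}"' if css_class else ""
    # Escape the body so characters like & and < stay valid in HTML.
    return f"<{tag}{attr}>{escape(body)}</{tag}>"
```

Calling render_text("emph", "important") would produce an em element, while render_text("upper", "shout") would produce a span that depends on the stylesheet for its effect.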

Math

Good sauce: https://fred-wang.github.io/TeXZilla/

  • accent -> <mover>
  • attach -> <munderover> / <msubsup>
  • scripts -> <msubsup>
  • limits -> <munderover>
  • binom -> <mrow><mo>(</mo><mfrac linethickness="0px"><mi>n</mi><mi>k</mi></mfrac><mo>)</mo></mrow>
  • cases -> <mrow><mo>{</mo><mtable columnalign="left left" displaystyle="false"></mtable></mrow>
  • equation -> <math>
  • frac -> <mfrac>
  • lr -> <mrow> with <mo>
  • mat -> <mtable>
  • root -> <mroot> and <msqrt>
  • round -> Same as lr
  • styles -> with css
  • op -> <mo> or <mi>
  • under/over -> <munderover>
  • variants -> #text
  • vec -> <mtable>
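
As a rough illustration of the binom and frac entries above, the following sketch builds the MathML strings by hand. The function names (binom, frac, equation) are made up for this example, and the arguments are assumed to be single identifiers so that wrapping them in mi is reasonable; a real exporter would need to handle arbitrary sub-expressions.

```python
def frac(num: str, denom: str) -> str:
    """Fraction of two identifiers as an <mfrac> element."""
    return f"<mfrac><mi>{num}</mi><mi>{denom}</mi></mfrac>"


def binom(upper: str, lower: str) -> str:
    """Binomial coefficient: a fraction without a bar, wrapped in parentheses."""
    return (
        "<mrow><mo>(</mo>"
        f'<mfrac linethickness="0px"><mi>{upper}</mi><mi>{lower}</mi></mfrac>'
        "<mo>)</mo></mrow>"
    )


def equation(body: str, block: bool = False) -> str:
    """Wrap MathML content in the <math> root, optionally as a display equation."""
    display = ' display="block"' if block else ""
    return f"<math{display}>{body}</math>"
```

For example, equation(binom("n", "k"), block=True) yields the markup from the binom entry inside a display-style math element.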

Layout

  • #align -> <span class="center"> with text-align: center
  • #block -> <div>
  • #box -> <div class="box"> with display: inline-block
  • #list -> <ul>
  • #colbreak -> maybe possible with css break-before: ...
  • #columns -> possible with column-count: 2
  • #grid -> <div class="grid"> with display: grid and so on
  • #hide -> either with opacity: 0 or using a <div> with a fixed width
  • #measure -> maybe possible with js?
  • #move -> with position: relative; top: y, left: x
  • #enum -> <ol> tbh, the name could maybe be improved
  • #pad -> with padding: 10pt 2pt ...
  • #page -> Doesn't make sense in context of web media
  • #pagebreak -> Same as #page
  • #par -> <p>
  • #parbreak -> beginning of new <p>
  • #place -> with position: absolute; top: ...
  • #repeat -> Could maybe be possible with width: 100%; overflow: hidden but very hacky
  • #rotate -> with transform: rotate(...)
  • #scale -> with transform: scale(...)
  • #h -> possibly with display: inline-block; width: ...
  • #v -> same as #h but with height instead of width
  • #stack -> with display: flex
  • #table -> <table>
  • #terms -> <dl>

Visualize

Easily possible using SVG

  • #circle -> <circle>
  • #ellipse -> <ellipse>
  • #image -> <image>
  • #line -> <line>
  • #path -> <path>
  • #polygon -> <polygon>
  • #rect -> <rect>
  • #square -> <rect>

Meta

  • #bibliography -> Likely with <a href="#el-id"> but has to be looked into
  • #cite -> same as #bibliography
  • #counter -> #text
  • #document -> should be used to insert meta information into the HTML Document.
  • #figure -> <figure>
  • #heading -> <h1> /<h2> / ...
  • #link -> <a>
  • #locate -> Doesn't make much sense in the context of web media / possibly with JS
  • #numbering -> #text
  • #outline -> #text
  • #query -> can be ignored
  • #ref -> <a>
  • #state -> might be ignored
  • #style -> possibly can be ignored
@Enivex
Collaborator

Enivex commented Apr 11, 2023

#repeat -> Doesn't make sense in the context of web media

I fail to see why that is

#hide -> with opacity: 0

Documentation says "It may also be useful to redact content because its arguments are not included in the output.", so simply using opacity is not an option

#page -> Doesn't make sense in context of web media
#pagebreak -> Same as #page

It should at least be possible to get separate pages for chapters/sections, with a menu for navigation.

@ararunaufc

I'll always vote in favor of semantic tags. That means em and strong instead of the alternatives, for example.

There are some more tags that may be used with this in mind, like section, figure...

@lingo

lingo commented Apr 11, 2023

Re this:

#page -> Doesn't make sense in context of web media
#pagebreak -> Same as #page

There are the CSS page-break-* properties (break-before and break-after in newer CSS) and the @page at-rule, which could be useful (mostly if other tools are going to process the HTML in some way).

Providing for something like <br class="page-break"> would be useful, at least, or, as mentioned by @ararunaufc, using semantic tags like <section class="page"> might be better?

@Nitwel
Author

Nitwel commented Apr 11, 2023

Thanks for the comments, updated the comparison above! ❤️

On the note of pages, we could have some form of metadata in the HTML document noting where we want a page break, though that might miss the point: exporting to a printable document should go through the PDF export directly instead of exporting to HTML and then printing that.

But yes, we can look into supporting that too.

@laurmaedje
Member

Math should use MathML!

@ararunaufc

Math should use MathML!

That would be ideal in principle, but maybe not great in practice...

Quoting MDN:

the subset focusing on semantics has never been implemented in browsers while the subset focusing on math layout led to incomplete and inconsistent browser implementations.

Maybe MathML Core, but I don't know how that would work out.

Note: It is highly recommended that developers and authors switch to MathML Core, perhaps relying on other web technologies to cover missing use cases.

@Nitwel
Author

Nitwel commented Apr 11, 2023

We can definitely focus on using MathML Core, my current translation should also only contain MathML Core elements as far as I'm aware.

@reknih
Member

reknih commented Apr 11, 2023

Hey! Thanks for this issue. I wanted to chip in with a few thoughts here:

  • emph and strong are intended as semantic functions and should be translated to em and strong tags respectively.
  • Arguably, strike should be translated to the s tag. However, the semantics of s are quite narrow and there is also the del element to cover other semantic cases. Because there is no alternative way of achieving a "non-semantic" strikethrough in Typst like there is for emph and strong, I can be swayed either way as to whether these functions should result in a semantic tag or CSS.
  • document shall not be ignored. Its title property should set the title tag's content and the author property might be used to populate Open Graph meta tags
  • state should not be ignored.
  • hide must keep its content private. A solution that @laurmaedje will likely hate me for is to measure its content's dimensions using the normal print layout code paths and then create an inline-block of these dimensions. This solution does not guarantee that the space reserved is the same as if the user agent had rendered hide's contents. Alternatively, hide could be ignored for the moment.
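
A minimal sketch of the hide idea from the last bullet, assuming the width and height have already been measured by the print layout code and are handed over in points. hidden_placeholder is a made-up name, and the aria-hidden attribute is an extra assumption so that assistive technology skips the empty box. The point is that the hidden text never reaches the HTML at all.

```python
def hidden_placeholder(width_pt: float, height_pt: float) -> str:
    """Return an empty inline-block box that reserves the measured space.

    The hidden content itself is deliberately not a parameter, so it cannot
    leak into the output.
    """
    return (
        '<span aria-hidden="true" style="display: inline-block; '
        f'width: {width_pt:g}pt; height: {height_pt:g}pt"></span>'
    )
```

As noted in the bullet, the reserved space comes from Typst's own measurement, so it may differ from what the browser would have needed to render the same content.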

In general, you carry over quite a bit of styling! We should ask ourselves what the goal of HTML output will be: Is it to strive for a pixel-perfect reproduction of the PDF, or to produce a document that feels "web native" and can easily be styled with downstream CSS or be used as an artifact in, e.g., a static site generator? Or do we want something in between (eBooks)?

The answer to these questions should inform the design with respect to the more invasive layout functions like move and place. They are built for a series of fixed-height viewports instead of a continuous one, so does applying them to HTML output make sense? How many of the text styles should be applied as CSS? Should a Typst document be able to conditionally decide what to do depending on the export target? It is true that we could also conditionally apply a lot of these styles only when the HTML document is printed with media queries, the CSS paged media module, or ePub extensions.

I, personally, think that Typst's HTML output should be more semantic than a pixel-perfect reproduction of the PDF. However, I do not yet have an opinion on the extent to which we should "bake in" styles as CSS and apply functions like place.

@Nitwel
Author

Nitwel commented Apr 11, 2023

Updated emph, strong, strike and document.

On the note of state: As far as I understood that feature, it will compile into text or something else and won't matter for the HTML export. I used #text in those cases, meaning, HTML export doesn't have to worry about it.

I'm definitely on board with keeping the output HTML file as semantic as possible, so I propose the following balance between semantics and styling.

  1. If there is a 1 to 1 translation between Typst and HTML, we use it.
    like emph and <em>
  2. Typst functions that have no direct translation but whose functionality can be reproduced in HTML should at least share the exact behaviour.
    like #parbreak and starting a new paragraph in HTML or #lower with a class that sets its contents to lowercase
  3. For Typst functions that will have different behaviour depending on their context, at the moment either PDF or HTML, we will implement something like media queries in CSS that allow applying different functionality depending on their environment.
    This could look something like:
if context == "HTML" {
  #place(...)
} else {
  #place(...)
}
  4. Functions that are not possible when the layout phase comes after the export phase should be ignored for now and revisited later: their only solution would have to be hacked into HTML and thus needs further in-depth discussion on what we should do in each case.
    like #hide or #measure

Let me know if that aligns with your thoughts on an HTML export or if I should think this through further.

@laurmaedje
Member

On the note of state: As far as I understood that feature, it will compile into text or something else and won't matter for the HTML export.

That's correct. There are some things in your list which are already resolved by the time HTML export will start, including query, state, and counter.

@reknih reknih added the html Related to HTML export label Apr 12, 2023
@Dherse Dherse mentioned this issue Apr 12, 2023
45 tasks
@laurmaedje laurmaedje added the rfc label Apr 12, 2023
@laurmaedje laurmaedje changed the title [RFC] HTML Export [RFC #3] HTML Export Apr 12, 2023
@ararunaufc

3. For Typst functions that will have different behaviour depending on their context, at the moment either PDF or HTML, we will implement something like media queries in CSS that allow applying different functionality depending on their environment.

Personally, I would prefer this to be something to do outside of the document, and I would avoid making the document "know" about existing environment options.

What I mean is that the document would only describe what it would like to do and something else would provide the how. (Not a proposal, but picture something like CSS custom properties that can only be set outside of their usage.)

@bluebear94
Contributor

Note that there are multiple HTML tags that have the same style by default but different semantic connotations: cite, em, and i for italics and b and strong for bold. Perhaps Typst could pick a reasonable default (namely, em and strong) but provide commands for the other tags for those who care about the distinction.

@Enivex
Collaborator

Enivex commented Apr 18, 2023

Note that there are multiple HTML tags that have the same style by default but different semantic connotations: cite, em, and i for italics and b and strong for bold. Perhaps Typst could pick a reasonable default (namely, em and strong) but provide commands for the other tags for those who care about the distinction.

The semantic distinction is important for accessibility, is it not?

@drewcassidy

I think this would be very useful. The typst language looks perfect for a static site builder, being both a templating language and a markup language all in one.

@drewcassidy

I'm currently investigating forking and adding HTML support myself now that I've learned how typst works internally. It would require making a non-fixed-position version of the typeset function, and possibly modifying compile to be generic so it can output using either the fixed-position PDF-style builder or the freeform HTML-style builder. It looks like it would be easiest to have the free-form typeset directly emit HTML instead of the 2-step method PDF uses.

I'm wondering, since I've never contributed to a large project like this, how much I can modify the library? Typst seems to make almost everything public, so I'm hesitant to rename or modify builtin function signatures. Some feedback on what's acceptable would be appreciated.

@laurmaedje
Member

I wouldn't care too much about breaking changes and HTML export should indeed be 1-step instead of 2-step. However, we have planned to split up the current layout phase into two, where the first results in a fully styled, semantic document model. This would then serve as the source for HTML export and layouting.

@drewcassidy

Understood. I'll hold off in my efforts for now.

#page -> Doesn't make sense in context of web media

I disagree. For example if using for a static site builder it might be useful to paginate things, like with a list of articles.

@janosh
Sponsor

janosh commented Apr 30, 2023

Afaik KaTeX uses MathML under the hood where possible but is more fully featured and might be less hassle than MathML itself.

@Enivex
Collaborator

Enivex commented Apr 30, 2023

Afaik KaTeX uses MathML under the hood where possible but is more fully featured and might be less hassle than MathML itself.

That would require converting typst math to LaTeX math instead, and KaTeX supports an even smaller subset of that than MathJax.

MathML Core is the way to go.

@janosh
Sponsor

janosh commented Apr 30, 2023

I see. Out of curiosity, how big of an undertaking is HTML export? Doable in weeks/few months/many months?

@BirnadinErick

BirnadinErick commented May 1, 2023

what if we convert the typst to markdown, then let markdown ecosystem take care of the rest? am I not thinking correct?

@Enivex
Collaborator

Enivex commented May 1, 2023

what if we convert the typst to markdown, then let markdown ecosystem take care of the rest? am I not thinking correct?

That conversion would be very lossy.

@laurmaedje
Member

I see. Out of curiosity, how big of an undertaking is HTML export? Doable in weeks/few months/many months?

We first need to rework some internals. I'd say a few months.

@silvergasp
Contributor

This is more of an architectural thought (that I'm naively suggesting without fully understanding typst internals), but I wonder if it would make sense to create an intermediate representation. Something akin to LLVM-IR but for documents. For example, rustc (clang and many other compilers) will first compile to LLVM-IR and then use LLVM tooling to convert the IR to some sort of machine code.

This sort of pipeline would look something like:

  • typst AST -> Document-IR (intermediate representation)
  • Document-IR -> PDF... OR
  • Document-IR -> html... OR
  • Document-IR -> insert your format here.

This would be nice for a couple of reasons.

  1. Someone could create an entirely new front end for Document-IR similar to how clang and rust are front ends for LLVM-IR.
  2. You could decouple the final format from the intermediate representation so adding a new format (e.g. EPUB) would become significantly easier.
  3. Any exporters for the document-IR format could be shared between front ends.
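
The pipeline above can be sketched with a toy tree as the "Document-IR" and two independent exporters walking it. This is only an illustration under stated assumptions: Node, HTML_TAGS, to_html and to_plain are invented names, the node kinds are a tiny made-up subset, and the plain-text exporter merely stands in for a second backend such as PDF.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Node:
    """One node of a hypothetical Document-IR tree."""
    kind: str  # e.g. "doc", "heading", "par", "emph", "text"
    text: str = ""  # only used by "text" leaves
    children: list["Node"] = field(default_factory=list)


# Backend-specific knowledge lives outside the IR.
HTML_TAGS = {"doc": "article", "heading": "h1", "par": "p", "emph": "em"}


def to_html(node: Node) -> str:
    """HTML backend: map each IR node kind to a tag."""
    if node.kind == "text":
        return escape(node.text)
    inner = "".join(to_html(child) for child in node.children)
    tag = HTML_TAGS[node.kind]
    return f"<{tag}>{inner}</{tag}>"


def to_plain(node: Node) -> str:
    """Second backend (plain text), standing in for PDF or any other format."""
    if node.kind == "text":
        return node.text
    inner = "".join(to_plain(child) for child in node.children)
    # Block-level kinds end with a newline; inline kinds do not.
    return inner + "\n" if node.kind in ("heading", "par") else inner
```

Adding a new output format in this sketch means writing one more function over Node, without touching whatever front end produced the tree.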

@FelixBenning

FelixBenning commented Mar 12, 2024

I wanted to contribute some references since I have thought about targeting html with maths for some time:

Tools which try to convert LaTeX to html

These tend to be a bit brittle, because they never support the entirety of LaTeX, so when you for example use the package aligned-overset to put text arguments over equations without messing up alignment, you will break LaTeXML. They do allow you to write a .sty.ltxml file to replace the functionality provided by packages, but the syntax is not well documented and you have to learn a new language.

Starting from a new language and targeting both LaTeX and html

This is the category typst falls into, so I am surprised that nobody has mentioned PreTeXt so far. It is a very interesting approach, which essentially uses XML templates to convert XML to both HTML and LaTeX.
The good:

  • Since it is starting from scratch, compilation will always work for html
  • it has XML schemas to statically check the correctness of your source file and give you tooltips, autocompletion support (which tags are supported in which block, e.g. subsections inside of sections)
  • It is geared towards interactivity (collapsible proofs, embedded videos and interactive plots, integration to exercise platforms (WeBWorK) to submit exercises,...)

The bad:

  • XML is verbose, especially the necessary paragraph tags <p> are super annoying.
  • Its approach to citation and bibliographies is weak.

The existing content in PreTeXt consists mostly of textbooks, i.e. people who wanted their textbooks to be more interactive than pdf pages, but still wanted to be able to publish the book in print. Articles on the other hand are rare (bad citation functionality might either be a cause or effect of this).

Starting from html or markdown

Interesting projects in this space are

  • Quarto, the continuation of RMarkdown for other languages, which allows interleaving programming and explanations such that the plots are generated in the files.
  • geared towards dashboards: Observable 2.0 recently became open source, allowing for the generation of data-heavy static webpages. More dashboard frameworks include Dash and Shiny
  • geared towards blogs: Jekyll, Hugo, Gatsby
  • notebooks: Jupyter, Pluto.jl, Observable again

None of them is perfect for presenting maths, none of them has bibliography support, and it is typically difficult to target LaTeX from them, so you do not want to write in them even though the end result might be perfect for interactivity.

Personal musings about typst

You are not going to capture article writing in the short term because journals have styleguides geared towards LaTeX and want LaTeX. Since academics then have to use LaTeX, they will need a reason to use a different tool for the textbooks, lecture scripts and journals they want to write. So you need a unique selling point. For me this is interactivity.

I spent a lot of time trying to find a tool which allows for more interactivity. Other people want this interactivity too (which is why people bother with PreTeXt to write textbooks, and why the journal distill existed). I hope that typst will eventually fill this niche. So this would be the selling feature for me: I have to learn a new syntax over LaTeX, but I get interactivity in return.

So you aim for the textbooks and lecture scripts first (unfortunately eating PreTeXt's lunch instead of filling a new niche but such is life), since you are not going to capture articles early on.

Interactivity is a nice buzzword, but you have limited time so what does that mean step by step?

Step 1: Capture textbooks

  1. Collapsible proofs. This is such a simple feature, which can be implemented with laughably little javascript (or no js at all if you use the <details> and <summary> tags), but even LaTeXML does not do it natively yet (see Issue and js snippet here)
  2. Allow for the inclusion of interactive javascript plots (with a static default for pdf).
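
For the collapsible proofs in item 1, here is a small sketch of the no-JavaScript variant using details and summary. collapsible_proof is a hypothetical helper and the "proof" class name is an assumption; the body is expected to be HTML that has already been rendered.

```python
from html import escape


def collapsible_proof(proof_html: str, label: str = "Proof") -> str:
    """Wrap already-rendered proof HTML in a natively collapsible element.

    Browsers render <details> collapsed by default and toggle it when the
    <summary> is clicked, so no script is needed.
    """
    return (
        '<details class="proof">'
        f"<summary>{escape(label)}</summary>"
        f"{proof_html}"
        "</details>"
    )
```

A PDF export could simply ignore the wrapper and print the proof body in full, which fits the "static default for pdf" idea in item 2.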

Step 2: Capture articles

  1. The interactive journal distill had to implement their own citation library in javascript, since there is no such functionality for html. This generated a bibliography and provided hover-over support for the references in the text, allowing for the numeric citation style without it being annoying like in pdf form. This is a feature which should probably be a separate javascript library (something which I started but abandoned with bibcite based on custom html tags; someone else seemed to have the same idea and even gave it the same name, bibcite, based on markdown syntax).

Step 3: Long term

To get all the conservative holdouts in the long run:

  1. It would be really nice to not only collapse the entire proof, but add and remove details from proofs interactively. I.e. click on an equal sign and get another step in-between.
  2. Or highlight the terms which change from line to line in color.

Since the above probably causes more work during the writing process, this will probably only become reality if you reduce the workload with either or both of these approaches:

  1. Allow collaborative editing (like github allows marking certain lines of code and adding a code snippet in a review), you could allow for comments in a document which could be expanded (i.e. people explain equation to each other)
  2. Integrate with AI: since you don't want it to hallucinate stupid proofs, this requires proof checking (e.g. integration with lean). To enable a theorem prover such as lean to check proofs, the notation needs to be as semantic as possible. It is unclear how doable this is, since even f(x+y) could be misinterpreted as f * (x+y) with an implicit multiplication sign (which is why semantic MathML is so verbose). But maybe it will be possible to use a language model to differentiate between these semantic meanings in the future. It would certainly be easier if there were fewer ambiguities, so maybe keep that in mind?

LaTeXML is blocking on the collapsible proofs because they think about all the other collapsible things and try to come up with a more general solution. I get the sentiment, but I also think that you need that easy selling point first and have to come up with possibly breaking long term solutions later.

@vsheg

vsheg commented Mar 13, 2024

@FelixBenning The PDF engine meets 70% of my needs, and a static HTML engine could cover another 20%. I don't plan to use Typst for interactive plots soon, as I use many tools for this and don't think Typst will have advanced features. I believe Typst will rely on extensions for these tasks. Just like I don't expect the Python core team to build plotting or machine learning into the interpreter.

@FelixBenning

@vsheg I mean, that is completely valid. And if you want to be like vim to vscode, you can just have better usability without cool features and be a niche product. But if you want to capture larger userbases, you need a unique selling point, and I explained why interactivity would be that selling point for me. I also didn't mean implementing interactive plots yourself, but rather integrating with D3.js or other javascript libraries.

@ju6ge

ju6ge commented Mar 13, 2024

@FelixBenning I am not so sure about your argument. For me it is the other way around, typst is a bliss to use compared to latex. Talking about vscode, typst already has lsp support; that alone is a very good selling point for usability and adoption. If anything, interactive plots seem more like a niche use case. Also, typst has no direct concept of a plot; there are plugins for that, so your request seems more like a plugin feature than a typst feature.

@vsheg

vsheg commented Mar 13, 2024

@FelixBenning I think text pre-processing and post-processing, an 'include' directive, and a well-designed extensions API are enough for interoperability with HTML/JS/WASM, etc., or other languages.

I prefer not to use Typst for data science workflows that involve plotting and interactivity, as Python is more appropriate for these tasks. Consequently, your suggestions are unclear to me.

@FelixBenning

FelixBenning commented Mar 13, 2024

For me it is the other way around, typst is a bliss to use compared to latex. -@ju6ge

That is exactly what people would say about vim: "It is so much easier to use, I don't have to move my fingers away from the keyboard, the combinations make sense, I am so much more productive". But they are a niche audience who care about text editors and are willing to spend the time to learn the vim language.

I don't know what you use typst for, but you cannot credibly tell me that you can write journal articles in it, because journals want LaTeX. And it matters to the average user whether you can use it everywhere (including journals). They need a very good reason to choose another tool over a "good enough" tool which they can use everywhere. Trying to get academics to use git for collaboration on .tex files is similarly impossible. And I can repeat "it makes it so much easier, it is a bliss" as much as I like; it is not going to make them learn git. What would make them learn it is being able to do something with it that they can't do without it.

And I never suggested using typst for data science workflows, but for interactive textbooks and articles (like the ones produced by PreTeXt), the technologies to make interactive textbooks simply intersect with blogging and dashboard frameworks.

And I care 0% whether that is an (easy to install!) extension, or whether that is typst native. That is an implementation detail. The average user views the python programming extension for vscode as part of the vscode experience in python (which is why many of the larger extensions are maintained by microsoft themselves). I have no idea what the typst ecosystem looks like, because I am not going to get into another language right now. I do not have the capacity if there isn't greater payback than "it is marginally nicer to use than latex, but has the same output". I simply wanted to tell you what it would take for me to change my mind on this. What you do with that is your choice.

And I will try to stop having this argument now.

@ju6ge

ju6ge commented Mar 13, 2024

@FelixBenning Anyone using vim and who is halfway honest about the amount of effort required to get it configured the way vscode works would tell you that it is not easy and that there are tradeoffs.

To make my point clear: Typst already has killer features (in my opinion) that make a good argument for switching to it. Interactive plots are not the feature that will make everybody switch (in my opinion). Arguing about opinions is not really productive and not the point. Journals requiring latex will not make people use typst. But typst has more use cases than journals. I use it for presentations, notes, cvs. Things that I do for myself/work, where not having to deal with latex and having great editor support (via lsp) makes it frictionless for me.

Typst having great text editor support and having native html export (in the future) might make journals provide typst as an option in the future, but that is pure speculation. Again I think those are better reasons to use typst than having interactive plots, that may be fancy but journals do not care about that one bit.

Anyway nothing of this has anything to do with the issue at the top, I just wanted to share my view on things because I think you are misunderstanding my reasoning.

Maybe one last comment, for the maintainers: Thanks for creating typst I really love using it. Keep up the good work!

@gnull

gnull commented Mar 13, 2024

Interactivity seems orthogonal to HTML generation. Once you've generated static HTML, you could make a JS library that you can load and have it traverse your HTML, parse the static stuff and make it interactive with event listeners and such (kind of what MathJax does).

Typst could make it easier to write such libraries if it makes the statically generated HTML predictable and nice for parsing, but that's something for distant future; we can safely ignore it until all of static stuff is done and works well. One problem at a time.

@stancl
Sponsor

stancl commented Mar 15, 2024

I think the suggestion mentioning the possibility of some type of IR format would be the way to go here: #721 (comment).

To me it seems best to view Typst as two separate parts, one that processes the template (simple example being substituting #foo with what I defined as #let foo) and one that outputs the PDF after this type of processing is done.

For Typst to have good support in other tools (things like pandoc) having an intermediate format that has none of the programming/templating stuff, and is purely a markup language, seems like a huge help. We cannot expect other tools to implement a full parser/compiler for the Typst language just to output a different format.

I have little clue about Typst internals or how difficult this would be to implement, so take my opinion here with a huge grain of salt. It just seems like something that would make sense to add as part of this undertaking.

@xentec

xentec commented Mar 15, 2024

There is already an effort to integrate a VM into Typst: #3307 (Huge respect @Dherse!)
So to me the design of an IR is only a question of time.

@Dherse
Sponsor Collaborator

Dherse commented Mar 15, 2024

There is already an effort to integrate a VM into Typst: #3307 (Huge respect @Dherse!) So to me the design of an IR is only a question of time.

Hey, thanks for the ping, I'll be working on it soon™️ but I've just started working "for real" and I am still adapting to real life after uni 😂

@bluebear94
Contributor

For Typst to have good support in other tools (things like pandoc) having an intermediate format that has none of the programming/templating stuff, and is purely a markup language, seems like a huge help. We cannot expect other tools to implement a full parser/compiler for the Typst language just to output a different format.

I have little clue about Typst internals or how difficult this would be to implement, so take my opinion here with a huge grain of salt. It just seems like something that would make sense to add as part of this undertaking.

AFAIK Typst does have a concept of elements, but since it supports contextual expressions that use functions such as locate, I believe that third-party tools will still need to evaluate Typst code (albeit in bytecode form after #3307).

@Amelia-Mowers

Amelia-Mowers commented Mar 15, 2024 via email

@memeplex

memeplex commented Mar 15, 2024

I believe there should exist some primitive structural elements that are expressive enough, like for example tables with headers and col/rowspans. HTML may provide good guidance in this regard. This is not only related to the procedural/declarative distinction, because even if third parties could declare structural elements, it's unrealistic to assume that tools like pandoc will understand any structure out there. Of course, it would be useful to keep this extended structure, but at the same time it should reduce to something primitive albeit not excessively low level / presentational. If tables were always reduced to bare grids, there won't be much structure remaining at the end. To put an example, the merging of tablex into the core introduces more meaningful building blocks which pandoc could take advantage of, but I still think that concepts like the header of a table are being overlooked, they can be inferred from some parameters but they are there because of different concerns like being able to replicate the header in multiple pages, not because of an explicit intention to preserve some basic structure/semantics.

@0x5c

0x5c commented Mar 15, 2024

This feature request is called "HTML export". While all that talk about IRs and making typst code suitable for external engines is interesting and maybe worth its own issue/discussion thread, it's really orthogonal to the idea of having typst natively produce HTML output.
Typst already has native PDF output, and the devs have previously confirmed that native HTML output is already on the roadmap.
I doubt that I'm alone in being subscribed to this issue for the topic of native HTML output, and not for the discussions of external tools.

@silvergasp
Contributor

This feature request is called "HTML export". While all that talk about IRs and making typst code suitable for external engines is interesting and maybe worth its own issue/discussion thread, it's really orthogonal to the idea of having typst natively produce HTML output.

When I originally started the discussion on IRs it was meant more as a discussion on how to implement HTML export in a way that was modular and future proof for other similar formats. So I'd argue that discussions around IRs are "HTML export" adjacent rather than "orthogonal". They are still potentially relevant to how HTML export is implemented, depending on the design direction that the core typst team chooses to go in. That being said, I agree that the conversation around how an IR may enable usage of XYZ external tools has drifted beyond what I would consider relevant in this issue.

@osmano807

Deleted my comment, as previous experience with the problem of typesetting, including HTML support, was not seen as welcome.
Nevertheless, Typst will need to decide how much decoupling of the PDF typesetting from the evaluation engine is needed in the face of the assumed HTML output promises. An IR is the general solution, which may not be the project's goal.
Merging the needs of these two output formats isn't easy. Even with HTML as a second-class target, some adaptations would be needed to fit #show into the HTML paradigm.

@0x5c

0x5c commented Mar 16, 2024

When I originally started the discussion on IRs it was meant more as a discussion on how to implement HTML export in a way that was modular and future proof for other similar formats.

Nevertheless, Typst will need to decide how much decoupling of the PDF typesetting from the evaluation engine is needed in the face of the assumed HTML output promises. An IR is the general solution, which may not be the project's goal.

According to architecture.md, typst already seems to be built that way, just with an internal IR (consisting of "Frames") that isn't exposed to the outside world. (Specifically, between the Layout and Export steps)

To go further, the Export section of the document mentions there are already other exporters: an SVG one and an internal one (raster).

@astrale-sharp
Contributor

According to architecture.md, typst already seems to be built that way, just with an internal IR (consisting of "Frames") that isn't exposed to the outside world. (Specifically, between the Layout and Export steps)

To go further, the Export section of the document mentions there are already other exporters: an SVG one and an internal one (raster).

It's a bit more subtle than that: currently the Frames hold precise positioning information, which is not compatible with semantic export to HTML.

Interactive content

I also think typst can be a great fit for this, albeit not directly.

To be "interactive friendly", Typst's output needs to be externally stylable.

When you use #strong, you will be able to customize the look by changing a CSS rule.

We should ensure the same is possible with other custom functions, or use #metadata and have it appear in the CSS.

Let's assume we're using #metadata.

Fictional case : transitions between slides of polylux

In polylux, every page represents a slide. If you want animations between them, you could:

  • add metadata to elements indicating how they should be animated
  • hook up some JavaScript to read the CSS and interpret the metadata however you'd like

With this you could create something manim-like (manim is a Python library for generating animations).

What about SVG

That's all well and good, but maybe you don't want to work with HTML. In that case the metadata information could also be present in the SVG exports, so that we could work with those instead.

I think metadata is a good fit for being externally styled and with this addition, you could use typst to generate websites with animations etc.

If you want to have a REST API for your website, I would suggest generating a typst file that includes the functions used to tag your elements so they use these API calls. This generation process should use a description of your API (one source of truth, no bullet in the foot :) )

@CaveNightingale

The problem is likely about fixed positioning.
A Module generated by eval::eval meets the requirement of being both semantic and free of precise positioning.
But a problem is that Func is quite black-boxed and tricky.
I think it would be a good idea to have an IR with all #set rules applied to elements and removed from the document, and all Func evaluated.
As for interactive content, it's easy to add things like #svg(...), #html(...), #pdf(...), which generate an element handled by the specified backend and ignored by other backends.
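
To make the "set rules applied and removed" idea concrete, here is a toy sketch in Python. It is not Typst's actual data model: `Element`, `SetRule` and `resolve` are hypothetical names, and the only assumption carried over from Typst is that a set rule affects the elements that follow it within the same scope.

```python
from dataclasses import dataclass, field


@dataclass
class SetRule:
    target: str  # element kind the rule applies to, e.g. "text" (hypothetical)
    props: dict


@dataclass
class Element:
    kind: str
    props: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # Elements, SetRules or str


def resolve(node, active=None):
    """Return a copy of `node` with set rules folded into element props."""
    active = dict(active or {})  # kind -> props currently in scope
    defaults = active.get(node.kind, {})
    # Explicit props on the element win over props coming from set rules.
    out = Element(node.kind, {**defaults, **node.props})
    for child in node.children:
        if isinstance(child, SetRule):
            # The rule affects later siblings and their descendants only.
            active[child.target] = {**active.get(child.target, {}), **child.props}
        elif isinstance(child, Element):
            out.children.append(resolve(child, active))
        else:
            out.children.append(child)
    return out


doc = Element("doc", children=[
    Element("text", children=["a"]),
    SetRule("text", {"fill": "red"}),
    Element("text", children=["b"]),
    Element("text", {"fill": "blue"}, ["c"]),
])
ir = resolve(doc)
```

The resulting `ir` contains no `SetRule` nodes, so a backend would only ever see elements with their final properties. Evaluating `Func` values is a separate problem that this sketch does not touch.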

@Amelia-Mowers

Amelia-Mowers commented Apr 27, 2024 via email

@CaveNightingale

CaveNightingale commented Apr 27, 2024

You can get precise positioning in html by modifying the x and y props of a wrapping span component.

Disagree.
When deciding the precise location (i.e. line breaking), typst needs to know the width of the page, but this cannot be known before the user actually loads the page in HTML output. Of course you can compile it to JavaScript, but it may be tricky and inefficient.
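
A toy illustration of why the width matters, in Python. This is a naive greedy breaker that counts characters; it is not Typst's real line-breaking algorithm, and `break_lines` is a made-up helper. The same text yields different line breaks, and therefore different positions, for every width.

```python
def break_lines(text, width):
    """Greedily pack words into lines of at most `width` characters."""
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        # A word longer than `width` still gets a line of its own.
        if len(candidate) <= width or not current:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


narrow = break_lines("the quick brown fox", 10)
wide = break_lines("the quick brown fox", 20)
```

With a width of 10 the text needs two lines, with a width of 20 it fits on one, so any positions computed at compile time would only be valid for one viewport.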

@Adhalianna

Adhalianna commented Apr 28, 2024 via email

@timjs

timjs commented Apr 29, 2024

Great idea to share some use cases @Adhalianna! I've another one.

In my use case, I have teaching materials for students which I currently distribute as Pdf's. I'd love to use the same Typst source to generate a website containing the same information. The styling of the website would be completely different from the styling of the Pdf. I.e. I'd like to have sidebars similar to the current Typst documentation website and I'd like to add an option to search through the materials.

I see two possibilities to achieve this:

  1. One possibility would be to configure this style in Typst the same way as we currently do for Pdf. However, I think that's a hard one. Typst is tailored to typesetting and Pdf is the most natural backend for this.

  2. Another one would be to write some frontend logic for a single page application, loading Html chapters exported from Typst. It would be enough for me if Typst were converted semantically to Html, like Pandoc currently does.
    The only thing I'm currently missing is something like a class argument to block, which gets translated to a class attribute in Html (and ignored in Pdf). This way I can style anything externally from Css.

@timjs

timjs commented May 17, 2024

I've extended the Pandoc reader and writer to support class attributes on blocks. This is an experiment.

Now

<div class="warning">
  <p>This is a warning</p>
</div>

is equivalent to

#block(class: "warning")[This is a warning]

in both directions (Typst ⇒ Html and Html ⇒ Typst). This also works for other formats than Html.

This is not officially supported by Typst currently. It will give you an error on "an unexpected attribute: class".

Try it out, let me know if it's useful. The code is here.
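
For the Typst ⇒ Html direction, the mapping described above amounts to roughly the following Python sketch. It is purely illustrative and is neither the Pandoc patch nor anything Typst provides: `block_to_html` is a hypothetical helper that assumes a block holding a single paragraph of plain text.

```python
from html import escape


def block_to_html(body, class_name=None):
    """Render a one-paragraph block as a <div>, with an optional class."""
    attr = f' class="{escape(class_name, quote=True)}"' if class_name else ""
    return f"<div{attr}>\n  <p>{escape(body)}</p>\n</div>"


print(block_to_html("This is a warning", "warning"))
```

A Pdf backend would simply ignore `class_name`, which is what makes the argument harmless for non-Html output.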

@bluebear94
Contributor

I think we’ll eventually want to have support for class attributes (chiefly for HTML, though it’ll probably be useful for general styling as well), but I don’t like the idea of using strings, which are prone to name clashes between different packages. Instead, styling classes could be their own type, though unfortunately, this doesn’t change the fact that CSS class names aren’t namespaced. If nothing else, Typst could warn about name clashes between different class objects, and the user could remap the offending classes using something akin to a show rule.

We would also need a way to specify the id attribute, but that can be done through labels.
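
A minimal Python sketch of what such clash detection might look like. `StyleClass` and `css_names` are hypothetical names, and prefixing with the package name is just one possible automatic remapping, not a proposal for how Typst should behave.

```python
import warnings


class StyleClass:
    """A class object owned by a package, compared by identity."""

    def __init__(self, package, name):
        self.package = package
        self.name = name


def css_names(classes):
    """Map each StyleClass to a CSS class name, prefixing on clashes."""
    by_name = {}
    for c in classes:
        by_name.setdefault(c.name, []).append(c)
    result = {}
    for name, group in by_name.items():
        owners = {c.package for c in group}
        if len(owners) > 1:
            warnings.warn(
                f"class name {name!r} is used by several packages: {sorted(owners)}"
            )
            for c in group:
                result[c] = f"{c.package}-{c.name}"
        else:
            for c in group:
                result[c] = c.name
    return result


a = StyleClass("pkg-a", "warning")
b = StyleClass("pkg-b", "warning")
c = StyleClass("pkg-a", "note")
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    names = css_names([a, b, c])
```

Because the two `warning` objects are distinct values, the clash only becomes visible at export time, when both have to be flattened into the same CSS namespace.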
