katex server side stem math rendering in HTML5 (PROTOTYPE) #3338

Open
wants to merge 1 commit into
base: main
from


cirosantilli

@cirosantilli cirosantilli commented Jun 13, 2019

WORKING EXTENSION WITH SOME TODOs: https://github.com/cirosantilli/asciidoctor-katex-2

er on the browser, so the page doesn't keep reflowing if you have a ton of formulas

This is currently just a prototype. First install katex with:

npm install -g katex

then convert the a.adoc test from this PR.

Outcome: the block katex works, inline not yet.

If people think that this is of interest, I am willing to clean it up into a proper version with some help, since this is relatively high interest to me. Otherwise I'll likely just extract into an extension.

You can test katex on the CLI with:

echo '\sqrt{2+2} = 4' | katex

TODOs:

  • inline not working: I don't know yet how to make def convert get an inline_katex block
  • stem: katex not working yet; it should make katex the default for stem: throughout the document
  • error handling if katex is not installed / other errors
  • only include katex if stem: katex is set or if there is at least one katex: in the document. Likewise for mathjax.
  • maybe it would be cleaner to implement katex calls with: https://github.com/glebm/katex-ruby which uses https://github.com/rails/execjs to call the node code, instead of spawning processes with popen3 as I do here. But that would add more dependencies to this project.
  • proper testing
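
A minimal sketch of the popen3-style approach with the missing error handling from the TODO list, assuming the `katex` CLI reads a formula on stdin and writes HTML to stdout as in the `echo ... | katex` example above. `MathRenderError`, `render_via_cli`, and `render_or_passthrough` are hypothetical names used only for illustration; they are not part of this PR or of Asciidoctor.

```ruby
require 'open3'

# Hypothetical error type for this sketch; not part of Asciidoctor or KaTeX.
class MathRenderError < StandardError; end

# Pipes one formula to a command-line renderer and returns its stdout.
# `command` is an argv array (default: the `katex` CLI), so the formula is
# passed on stdin and never interpolated into a shell command line.
def render_via_cli(tex, command: ['katex'])
  stdout, stderr, status = Open3.capture3(*command, stdin_data: tex)
  raise MathRenderError, "renderer failed: #{stderr.strip}" unless status.success?
  stdout
rescue Errno::ENOENT
  # Raised when the executable cannot be found on PATH.
  raise MathRenderError, "#{command.first} not found; try `npm install -g katex`"
end

# Falls back to the raw formula source when rendering is not possible,
# so a missing renderer degrades the output instead of aborting conversion.
def render_or_passthrough(tex, command: ['katex'])
  render_via_cli(tex, command: command)
rescue MathRenderError => e
  warn "math rendering skipped: #{e.message}"
  tex
end
```

Falling back to the raw source is only one possible policy; a converter could equally abort or emit a visible error marker.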


@cirosantilli cirosantilli changed the title PROTOTYPE katex server side stem math rendering in HTML5 katex server side stem math rendering in HTML5 (PROTOTYPE) Jun 13, 2019
@mojavelinux
Member

I haven't taken a deep look, but is this a better approach than https://github.com/jirutka/asciidoctor-katex?

@cirosantilli
Author

@mojavelinux ah thanks, I hadn't seen that one.

After a quick look, the only thing it can be doing fundamentally better is using https://github.com/glebm/katex-ruby, which uses https://github.com/rails/execjs to call katex instead of popen3 as I do here (this relates to the https://github.com/Shopify/schmooze point above, but the katex Ruby gem is even better).

My approach avoids adding a lot of dependencies to this project. Their approach likely runs faster on a document with a ton of math, since it should not start a process for every formula like I do here.

I will benchmark this on a huge test document against the existing mathjax to see if I can observe a significant performance difference. If not, I would recommend just starting with pipes due to their simplicity.

@cirosantilli
Author

cirosantilli commented Jun 14, 2019

Benchmarks

I did a benchmark as follows:

# N equations.
n=1000

# asciidoctor-katex: asciidoctor 0.2.10, asciidoctor-katex 0.3.0, katex@0.10.2
(printf '= bla\n:stem:\n:docinfo: shared\n\n'; i=0; while [ $i -lt $n ]; do printf '[latexmath]\n++++\n\sqrt{1+1} = 2\n++++\n\n'; i=$((i+1)); done) > bench.adoc
printf '<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.10.2/dist/katex.min.css" integrity="sha384-yFRtMMDnQtDRO8rLpMIKrtPCD5jdktao2TV19YiZYWMDkUR5GQZR/NOVTdquEx1j" crossorigin="anonymous">' > docinfo.html
env time --append --format '%e' asciidoctor -r asciidoctor-katex bench.adoc

# this patch
(printf '= bla\n:stem:\n\n'; i=0; while [ $i -lt $n ]; do printf '[katexmath]\n++++\n\sqrt{1+1} = 2\n++++\n\n'; i=$((i+1)); done) > bench2.adoc
env time --append --format '%e' ./bin/asciidoctor bench2.adoc

# asciidoctor 0.2.10
(printf '= bla\n:stem:\n\n'; i=0; while [ $i -lt $n ]; do printf '[latexmath]\n++++\n\sqrt{1+1} = 2\n++++\n\n'; i=$((i+1)); done) > bench3.adoc
env time --append --format '%e' ./bin/asciidoctor bench3.adoc

# pdflatex, pdfTeX 3.14159265-2.6-1.40.19
(printf '\\documentclass[12pt]{article}\n\\begin{document}\n'; i=0; while [ $i -lt $n ]; do printf '$$ \sqrt{1+1} = 2 $$\n'; i=$((i+1)); done; printf '\\end{document}\n') > bench.tex
env time --append --format '%e' pdflatex bench.tex

Results:

  • asciidoctor-katex: 362 seconds
  • this patch: 82 seconds
  • asciidoctor: 0.1 second
  • pdflatex: 0.07 seconds

So to my surprise, my naive pipe approach was about 4.5x faster!

Then it is interesting to open all the output documents:

  • asciidoctor-katex and "this patch" look exactly the same and load immediately
  • asciidoctor: takes 2 seconds to render all math

I also did a quick katex benchmark in isolation just to confirm that all the slowness is there:

time (for i in `seq 1000`; do echo '\sqrt{1+1} = 2' | katex > f; done)

and it takes about the same as the previous document rendering on node v10.15.1.

However, if I do:

#!/usr/bin/env nodejs

var katex = require('katex')

for (var i = 0; i < 1000; i++)
  console.log(katex.renderToString("\\sqrt{1+1} = 2", {throwOnError: false}));

then it takes only 0.7 seconds, so all the slowness comes from katex startup time. Avoiding that is what I thought the point of https://github.com/rails/execjs would be; maybe it just spawns multiple node commands?
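
To make the comparison concrete, the totals reported above can be turned into a per-equation cost. This is only arithmetic on the measured numbers (n = 1000); the startup share is approximate, since the 0.7 s figure itself includes one Node startup. The names `TOTALS` and `per_equation_ms` are made up for this sketch.

```ruby
# Wall-clock totals in seconds reported above for N = 1000 equations.
N = 1000
TOTALS = {
  asciidoctor_katex: 362.0, # asciidoctor-katex extension
  piped_cli:          82.0, # this patch: one katex process per equation
  single_node:         0.7  # one node process calling renderToString in a loop
}.freeze

# Average cost of one equation in milliseconds.
def per_equation_ms(total_seconds, n = N)
  total_seconds * 1000.0 / n
end

TOTALS.each do |label, seconds|
  puts format('%-20s %8.2f ms/equation', label, per_equation_ms(seconds))
end

# Fraction of the piped approach's time that is not spent rendering,
# estimated by comparing it against the single-process loop.
startup_share = 1.0 - TOTALS[:single_node] / TOTALS[:piped_cli]
puts format('estimated startup share of piped approach: %.1f%%', startup_share * 100)
```

On these numbers, each process spawn costs roughly 82 ms while a render inside an already running process costs under 1 ms, which is why a long-lived Node process matters far more than the choice of pipe mechanism.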

OK, schmooze integration had good perf!!! https://github.com/Shopify/schmooze

gem install schmooze

main.rb

#!/usr/bin/env ruby

require 'schmooze'

class KatexSchmoozer < Schmooze::Base
  dependencies katex: 'katex'

  method :renderToString, 'katex.renderToString'
end

puts '''
<!DOCTYPE html>
<html lang="en">
<head>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.10.2/dist/katex.min.css" integrity="sha384-yFRtMMDnQtDRO8rLpMIKrtPCD5jdktao2TV19YiZYWMDkUR5GQZR/NOVTdquEx1j" crossorigin="anonymous">
</head>
<body>
'''

katex = KatexSchmoozer.new(__dir__)
(1..1000).each do |i|
  puts '<p>'
  puts katex.renderToString("\\sqrt{1+1} = 2", {throwOnError: false})
  puts '</p>'
end

puts '''
</body>
</html>
'''

Time: 0.5s with schmooze 0.2.0!

For 1M equations: 124s.

OK, I can live with that, I would then recommend a Schmooze integration for now.
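
The gain here comes from keeping a single renderer process alive across calls. A minimal sketch of that idea without Schmooze, assuming a worker program that reads one JSON object per line on stdin and answers with one JSON object per line on stdout: in practice the worker would be a small Node script that loops over its input and calls `katex.renderToString`, but that script, the `PersistentRenderer` class, and the `tex`/`html` field names are all assumptions made for illustration.

```ruby
require 'open3'
require 'json'

# Keeps ONE child process alive and exchanges one JSON line per request,
# so the process startup cost is paid once instead of once per formula.
class PersistentRenderer
  # worker_command: argv array of a program speaking the line protocol
  # described above (hypothetical; not shipped with KaTeX).
  def initialize(worker_command)
    @stdin, @stdout, @wait_thread = Open3.popen2(*worker_command)
    @stdin.sync = true # flush each request immediately
  end

  # Sends one formula and blocks until the worker answers.
  def render(tex)
    # JSON escapes newlines, so each request stays on a single line.
    @stdin.puts(JSON.generate('tex' => tex))
    line = @stdout.gets
    raise IOError, 'worker exited unexpectedly' if line.nil?
    JSON.parse(line).fetch('html')
  end

  # Closes the pipes and returns the worker's Process::Status.
  def close
    @stdin.close
    @stdout.close
    @wait_thread.value
  end
end
```

A caller would create one `PersistentRenderer` per conversion, call `render` for every formula, and `close` it at the end. The worker has to flush its output after every line, otherwise `render` blocks.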

I had a quick look at execjs, but can't easily find how to do multiple calls on a single nodejs instance; maybe there are just too few docs. If I try:

#!/usr/bin/env ruby
require "execjs"
ExecJS.eval "x = 1"
puts ExecJS.eval "x + 1"

it blows up, so .eval must be launching multiple instances.

Tested on: Ubuntu 19.04, Lenovo ThinkPad P51 laptop with CPU: Intel Core i7-7820HQ CPU (4 cores / 8 threads), RAM: 2x Samsung M471A2K43BB1-CRC (2x 16GiB, 2400 Mbps), SSD: Samsung MZVLB512HAJQ-000L7 (512GB, 3,000 MB/s).

@mojavelinux
Member

So to my surprise, my naive pipe approach was about 4.5x faster!

I'm not all that surprised by this because Ruby system integration is very good. You are going to get the best result by using the native calls rather than introducing layers in between.

@mojavelinux
Member

The main issue with this change is going to be security. Right now, Asciidoctor doesn't use any libraries (that the user doesn't supply) and doesn't make any system calls. This is a hard requirement for Asciidoctor working on GitHub.

We could tie this into the safe mode so that it only works in unsafe mode (to start). We will also probably need a very lightweight adapter / manager class for calling katex so that it can be replaced in different environments with a different strategy (perhaps using a service or something). It would probably be something along the lines of what we have for the syntax highlighter integrations.
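
One way to picture such an adapter layer is sketched below: renderers are registered by name, and any renderer that needs system calls is only handed out in unsafe mode, with a passthrough otherwise. Everything here is hypothetical (the `MathRendering` module, its methods, and the adapter classes); it is not the existing syntax highlighter API, and the numeric levels are assumed to mirror Asciidoctor's safe mode values, where 0 is the most permissive.

```ruby
# Hypothetical sketch; none of these names exist in Asciidoctor.
module MathRendering
  UNSAFE = 0   # assumed to mirror Asciidoctor's most permissive safe level
  SECURE = 20  # assumed to mirror the most restrictive safe level

  # Default strategy: leave the source untouched (e.g. for client-side rendering).
  class PassthroughAdapter
    def render(tex)
      tex
    end
  end

  # Wraps any callable so different environments can plug in their own strategy
  # (a CLI call, a long-lived process, a remote service, ...).
  class CallableAdapter
    def initialize(callable)
      @callable = callable
    end

    def render(tex)
      @callable.call(tex)
    end
  end

  Entry = Struct.new(:factory, :requires_unsafe)
  @registry = {}

  # Registers a factory block that builds the adapter on demand.
  def self.register(name, requires_unsafe: false, &factory)
    @registry[name] = Entry.new(factory, requires_unsafe)
  end

  # Returns the adapter for `name`, or a passthrough when the adapter is
  # unknown or not permitted at the current safe level.
  def self.resolve(name, safe_level)
    entry = @registry[name]
    return PassthroughAdapter.new if entry.nil?
    return PassthroughAdapter.new if entry.requires_unsafe && safe_level > UNSAFE
    entry.factory.call
  end
end

# Stand-in for a real KaTeX call so the sketch runs without Node installed.
MathRendering.register('katex', requires_unsafe: true) do
  MathRendering::CallableAdapter.new(->(tex) { "<span class=\"katex\">#{tex}</span>" })
end
```

With this shape, `MathRendering.resolve('katex', MathRendering::SECURE)` silently degrades to the passthrough, so documents converted on a locked-down host keep their source formulas instead of failing.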

@cirosantilli
Author

Yes, this is definitely unsafe only stuff, I'm considering it for GitHub pages.

Although at this speed, I wouldn't even bother supporting it... it would make the dev cycle unbearable.

Shame, since server side math is kind of the holy grail of web maths, so sad.

@cirosantilli
Author

cirosantilli commented Jun 15, 2019

Dan, I've updated the benchmark comment with new findings.

As expected KaTeX slowness is only due to startup: once loaded renders are fast, so it is a "perf bug" on asciidoctor-katex's integration / stack.

Then I tried schmooze instead of execjs, and I got good usable perf!

I understand your concerns about this feature, let me know if you think it is worth merging to master for unsafe only, or if I should just start with an extension.

I think I basically already know how to do the extension :-) A merge to master is a bit more involved, but I will do it with help, as this is math nirvana.

@mojavelinux
Member

mojavelinux commented Jun 16, 2019 via email

@cirosantilli
Author

cirosantilli commented Aug 7, 2019

A quick progress report:

At https://github.com/cirosantilli/cirosantilli.github.io/blob/fd11b321c5e4075509db2c4d52249c94d90040bd/katex.rb I have pushed the plugin as far as I can go without solving "possibly not easy upstream questions" mentioned in the TODO part of that file:

Besides those, however, the extension is already working pretty well, and if those points were implemented, we would have, I believe, the best HTML math typesetting system created so far, opening the way to destroy LaTeX and world domination.

Even without those points I think it is already good enough for me to publish it as a gem.

@mojavelinux mojavelinux deleted the branch asciidoctor:main October 23, 2021 07:57
@mojavelinux mojavelinux reopened this Oct 23, 2021