katex server side stem math rendering in HTML5 (PROTOTYPE) #3338

Open
wants to merge 1 commit into
base: master
from

@cirosantilli

commented Jun 13, 2019

Related issue: #1735

Server side rendering is awesome because the browser has less work to do, so the page doesn't keep reflowing if you have a ton of formulas.

This is just a prototype for now. First install katex with:

npm install -g katex

then convert the a.adoc test from this PR.

Outcome: the block katex works, inline not yet.

If people think this is of interest, I am willing to clean it up into a proper version with some help, since this matters quite a bit to me. Otherwise I'll likely just extract it into an extension.

You can test katex on the CLI with:

echo '\sqrt{2+2} = 4' | katex

TODOs:

  • inline is not working: I don't yet know how to make def convert get an inline_katex block
  • stem: katex is not working yet; it should make katex the default for stem: throughout the document
  • error handling if katex not installed / other errors
  • only include katex if stem: katex is set or if there is at least one katex: in the document. Likewise for mathjax.
  • maybe it would be cleaner to implement katex calls with: https://github.com/glebm/katex-ruby which uses https://github.com/rails/execjs to call the node code, instead of spawning processes with popen3 as I do here. But that would add more dependencies to this project.
  • proper testing
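
The popen3-style call and the missing error handling from the TODO list could look roughly like the sketch below. It is not the code from this PR: `render_via_pipe` and `KatexUnavailableError` are hypothetical names, and the only assumption about the CLI is that `katex` reads TeX on stdin and writes HTML on stdout, as the echo example above suggests.

```ruby
require 'open3'

# Hypothetical error class for this sketch; not part of Asciidoctor or KaTeX.
class KatexUnavailableError < StandardError; end

# Pipe one TeX expression to an external command and return its stdout.
# `command` defaults to the `katex` CLI installed by `npm install -g katex`.
def render_via_pipe(tex, command: 'katex')
  stdout, stderr, status = Open3.capture3(command, stdin_data: tex)
  # A non-zero exit status covers TeX parse errors and other runtime failures.
  raise KatexUnavailableError, "#{command} failed: #{stderr.strip}" unless status.success?
  stdout
rescue Errno::ENOENT
  # Raised by the process spawn when the executable is not on PATH.
  raise KatexUnavailableError, "#{command} not found; is it installed and on PATH?"
end
```

With katex installed, `render_via_pipe('\sqrt{2+2} = 4')` would return the rendered HTML. Note that this still starts one process per formula.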

Bibliography:

@cirosantilli cirosantilli changed the title PROTOTYPE katex server side stem math rendering in HTML5 katex server side stem math rendering in HTML5 (PROTOTYPE) Jun 13, 2019

@mojavelinux

Member

commented Jun 13, 2019

I haven't taken a deep look, but is this a better approach than https://github.com/jirutka/asciidoctor-katex?

@cirosantilli

Author

commented Jun 13, 2019

@mojavelinux ah thanks, I hadn't seen that one.

After a quick look, the only thing they could be doing fundamentally better is using https://github.com/glebm/katex-ruby, which uses https://github.com/rails/execjs to call katex instead of popen3 as I do here (this relates to the https://github.com/Shopify/schmooze point above, but the katex ruby gem is even better).

My approach avoids adding a lot of dependencies to this project. Their approach likely runs faster in a document with a ton of maths, since it should not start a process for every formula like I do here.

I will benchmark this on a huge test document against the existing mathjax to see if I can observe a significant performance difference. If not, I would recommend just starting with pipes due to their simplicity.

@cirosantilli

Author

commented Jun 14, 2019

Benchmarks

I did a benchmark as follows:

# N equations.
n=1000

# asciidoctor-katex: asciidoctor 2.0.10, asciidoctor-katex 0.3.0, katex@0.10.2
(printf '= bla\n:stem:\n:docinfo: shared\n\n'; i=0; while [ $i -lt $n ]; do printf '[latexmath]\n++++\n\sqrt{1+1} = 2\n++++\n\n'; i=$((i+1)); done) > bench.adoc
printf '<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.10.2/dist/katex.min.css" integrity="sha384-yFRtMMDnQtDRO8rLpMIKrtPCD5jdktao2TV19YiZYWMDkUR5GQZR/NOVTdquEx1j" crossorigin="anonymous">' > docinfo.html
env time --append --format '%e' asciidoctor -r asciidoctor-katex bench.adoc

# this patch
(printf '= bla\n:stem:\n\n'; i=0; while [ $i -lt $n ]; do printf '[katexmath]\n++++\n\sqrt{1+1} = 2\n++++\n\n'; i=$((i+1)); done) > bench2.adoc
env time --append --format '%e' ./bin/asciidoctor bench2.adoc

# asciidoctor 2.0.10
(printf '= bla\n:stem:\n\n'; i=0; while [ $i -lt $n ]; do printf '[latexmath]\n++++\n\sqrt{1+1} = 2\n++++\n\n'; i=$((i+1)); done) > bench3.adoc
env time --append --format '%e' ./bin/asciidoctor bench3.adoc

# pdflatex, pdfTeX 3.14159265-2.6-1.40.19
(printf '\\documentclass[12pt]{article}\n\\begin{document}\n'; i=0; while [ $i -lt $n ]; do printf '$$ \sqrt{1+1} = 2 $$\n'; i=$((i+1)); done; printf '\\end{document}\n') > bench.tex
env time --append --format '%e' pdflatex bench.tex

Results:

  • asciidoctor-katex: 362 seconds
  • this patch: 82 seconds
  • asciidoctor: 0.1 second
  • pdflatex: 0.07 seconds

So to my surprise, my naive pipe approach was about 4.5x faster!

Then it is interesting to open all the output documents:

  • asciidoctor-katex and "this patch" look exactly the same and load immediately
  • asciidoctor: takes 2 seconds to render all math

I also did a quick katex benchmark in isolation just to confirm that all the slowness is there:

time (for i in `seq 1000`; do echo '\sqrt{1+1} = 2' | katex > f; done)

and it takes about the same as the previous document rendering on node v10.15.1.

However, if I do:

#!/usr/bin/env nodejs

var katex = require('katex')

for (var i = 0; i < 1000; i++)
  console.log(katex.renderToString("\\sqrt{1+1} = 2", {throwOnError: false}));

then it takes only 0.7 seconds, so all the slowness comes from katex startup time. Avoiding that startup cost is what I thought the point of https://github.com/rails/execjs would be; maybe it just spawns multiple node commands?
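
Since the cost is in process startup, one way to keep the pipe approach is to start a single long-lived child and exchange one line per request over its stdin/stdout. The sketch below shows only that pattern, under loud assumptions: `LineWorker` is a hypothetical name, each request must fit on one line, and a Ruby child stands in for the node process so the example runs without katex. A real worker would be a small node script that reads a line and prints the rendered HTML for it.

```ruby
require 'rbconfig'

# Hypothetical helper: keeps one child process alive and exchanges
# one line per request and one line per response over its pipes.
class LineWorker
  def initialize(*command)
    # An argv array bypasses the shell; 'r+' opens the pipe for read and write.
    @io = IO.popen(command, 'r+')
  end

  # Send one request line and block until one response line comes back.
  def call(line)
    @io.puts(line)
    @io.flush
    @io.gets&.chomp
  end

  # Closing the pipe sends EOF to the child and waits for it to exit.
  def close
    @io.close
  end
end

# Stand-in worker (an assumption for this sketch): wraps each input line in a span.
stand_in = '$stdout.sync = true; $stdin.each_line { |l| puts "<span>#{l.chomp}</span>" }'
worker = LineWorker.new(RbConfig.ruby, '-e', stand_in)
results = ['\sqrt{1+1} = 2', 'x^2'].map { |tex| worker.call(tex) }
worker.close
```

The child is started once, so the startup cost is paid a single time no matter how many formulas are sent.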

OK, schmooze integration had good perf!!! https://github.com/Shopify/schmooze

gem install schmooze

main.rb

#!/usr/bin/env ruby

require 'schmooze'

class KatexSchmoozer < Schmooze::Base
  dependencies katex: 'katex'

  method :renderToString, 'katex.renderToString'
end

puts '''
<!DOCTYPE html>
<html lang="en">
<head>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.10.2/dist/katex.min.css" integrity="sha384-yFRtMMDnQtDRO8rLpMIKrtPCD5jdktao2TV19YiZYWMDkUR5GQZR/NOVTdquEx1j" crossorigin="anonymous">
</head>
<body>
'''

katex = KatexSchmoozer.new(__dir__)
(1..1000).each do |i|
  puts '<p>'
  puts katex.renderToString("\\sqrt{1+1} = 2", {throwOnError: false})
  puts '</p>'
end

puts '''
</body>
</html>
'''

Time: 0.5s with schmooze (0.2.0)!

For 1M equations: 124s.

OK, I can live with that, I would then recommend a Schmooze integration for now.
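
To put the figures quoted in this comment on a common scale, the small sketch below converts each measurement to milliseconds per equation. The numbers are the ones reported above; the labels and variable names are just for illustration.

```ruby
# Wall-clock seconds and equation counts as reported in this comment.
measurements = {
  'asciidoctor-katex'      => [362.0, 1_000],
  'this patch (popen3)'    => [82.0, 1_000],
  'node loop'              => [0.7, 1_000],
  'schmooze, 1k equations' => [0.5, 1_000],
  'schmooze, 1M equations' => [124.0, 1_000_000]
}

# Convert each (seconds, count) pair into milliseconds per equation.
per_equation_ms = measurements.transform_values { |seconds, count| seconds / count * 1000.0 }

per_equation_ms.each do |name, ms|
  puts format('%-24s %10.3f ms per equation', name, ms)
end
```

That works out to about 82 ms per equation for the pipe approach against about 0.12 ms for Schmooze on the 1M run, which is the startup cost being paid once instead of per formula.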

I had a quick look at execjs, but couldn't easily find how to do multiple calls on a single nodejs instance; maybe there are just too few docs. If I try:

#!/usr/bin/env ruby
require "execjs"
ExecJS.eval "x = 1"
puts ExecJS.eval "x + 1"

it blows up, so .eval must be launching multiple instances.

Tested on: Ubuntu 19.04, Lenovo ThinkPad P51 laptop with CPU: Intel Core i7-7820HQ CPU (4 cores / 8 threads), RAM: 2x Samsung M471A2K43BB1-CRC (2x 16GiB, 2400 Mbps), SSD: Samsung MZVLB512HAJQ-000L7 (512GB, 3,000 MB/s).

@mojavelinux

Member

commented Jun 14, 2019

So to my surprise, my naive pipe approach was about 4.5x faster!

I'm not all that surprised by this because Ruby system integration is very good. You are going to get the best result by using the native calls rather than introducing layers in between.

@mojavelinux

Member

commented Jun 14, 2019

The main issue with this change is going to be security. Right now, Asciidoctor doesn't use any libraries (that the user doesn't supply) and doesn't make any system calls. This is a hard requirement for Asciidoctor working on GitHub.

We could tie this into the safe mode so that it only works in unsafe mode (to start). We will also probably need a very lightweight adapter / manager class for calling katex so that it can be replaced in different environments with a different strategy (perhaps using a service or something). It would probably be something along the lines of what we have for the syntax highlighter integrations.
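
A minimal sketch of what such an adapter could look like, to make the idea concrete. Every name here is hypothetical and nothing like it exists in Asciidoctor today. The assumptions: the rendering strategy is any object that responds to `call`, `UNSAFE` has the same numeric value as `Asciidoctor::SafeMode::UNSAFE`, and in any other safe mode the adapter falls back to emitting the `\[...\]` delimiters for client-side rendering.

```ruby
# Hypothetical adapter sketch; none of these names exist in Asciidoctor.
class StemAdapter
  # Same numeric value as Asciidoctor::SafeMode::UNSAFE.
  UNSAFE = 0

  # strategy: any callable taking a TeX string and returning HTML, for example
  # a pipe to the katex CLI, a long-lived node process, or a remote service.
  def initialize(safe_level:, strategy:)
    @safe_level = safe_level
    @strategy = strategy
  end

  def render(tex)
    if @safe_level == UNSAFE
      # Only unsafe mode is allowed to reach the external renderer.
      @strategy.call(tex)
    else
      # Fallback: leave the math for client-side rendering.
      "\\[#{tex}\\]"
    end
  end
end
```

Because the strategy is injected, a hosted environment could plug in a different callable without touching the converter.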

@cirosantilli

Author

commented Jun 14, 2019

Yes, this is definitely unsafe-only stuff. I'm considering it for GitHub Pages.

Although at this speed, I wouldn't even bother supporting it... it would make the dev cycle unbearable.

Shame, since server side math is kind of the holy grail of web maths, so sad.

@cirosantilli

Author

commented Jun 15, 2019

Dan, I've updated the benchmark comment with new findings.

As expected, KaTeX slowness is only due to startup: once loaded, renders are fast, so it is a "perf bug" in asciidoctor-katex's integration / stack.

Then I tried schmooze instead of execjs, and I got good usable perf!

I understand your concerns about this feature, let me know if you think it is worth merging to master for unsafe only, or if I should just start with an extension.

I think I basically already know how to do the extension :-) The master merge is a bit more involved, but I will do it with help, as this is math nirvana.

@mojavelinux

Member

commented Jun 16, 2019
