New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RES could use snudown compiled to JS via emscripten #452
RES could use snudown compiled to JS via emscripten #452
Conversation
What is currently used to render markdown in RES? |
A manual port of snudown to js - https://github.com/gamefreak/snuownd |
Initial testing and benchmarking looks promising. EmSnudown currently passes all Snudown sanity tests (SnuOwnd fails one edge case).
Running through a few thousand random comments has EmSnudown, Snudown and SnuOwnd all producing identical results. Benchmarking shows EmSnudown as actually being faster than SnuOwnd which I found a little surprising. I might play around with this some more and see if minifying SnuOwnd helps it. They are micro-benchmarks though and were run on a virtual machine just to see if EmSnudown would be badly slower (apparently not).
Size of a minified EmSnudown is 235KB. Size of the current unminified SnuOwnd is ~100KB. I can't actually check because closure compiler chokes on variables named 'char', but uglifyjs suggests SnuOwnd might be reducible to about 33KB. The current status is that EmSnudown seems to work, and reasonably well. Given that RES currently (I think) only uses a single simple call to SnuOwnd it would be replacable as-is with a few line changes (and putting in the EmSnudown code instead of SnuOwnd). Todo:
Code at https://github.com/aidanhs/emsnudown, tests at https://github.com/aidanhs/emsnudown/blob/79588aff539102adcc321f4634ca3f6f5d974d2d/test/test.py |
how big is the output? I thought last time we'd looked at emscripten / On Thu, Jun 27, 2013 at 11:08 PM, aidanhs notifications@github.com wrote:
|
@honestbleeps didn't he say 235kb? I'm not surprised its faster, the javascript engine guys have been doing alot of work on optimising this kind of code. You didn't say what browser you tested in (unless I missed it), but Firefox should be particularly good at this with OdinMonkey. Also, I've noticed this when minimising RES myself, char is actually a reserved keyword in javascript so it's not surprising that it jokes. Really, that should be changed. On Fri, Jun 28, 2013 at 5:59 AM, honestbleeps notifications@github.com
|
Yup, 235KB. I'm going to look at with a view to potentially reducing this at the cost of speed...might be interesting to see the tradeoffs. Benchmarks were run in node, which uses v8 - essentially chrome, which is pleasing because they don't even optimise asm.js yet. Asm.js aware firefox should be even faster. That said, I should do some in-browser benchmarks, particularly for Safari, Opera, IE. |
Ok, I messed up with benchmarks. It did seem odd that they were faster than handwritten js (especially as v8 doesn't optimise asm.js)...I guess it's nice to know my intuition was right. Anyway:
As you can see, EmSnudown is actually still faster than SnuOwnd for rendering, it's just the initial setup which makes it slower (which makes a lot of sense given it needs to initialise a view of memory). Given that this initial setup can be deferred until when comment preview is actually loaded (as RES currently does with SnuOwnd), I'm not sure this is actually an issue. As noted above, I'm not putting EmSnudown forward for inclusion yet - I just wanted to correct the benchmarks. |
I'm confused... Wouldn't smaller numbers be faster? And this is the sanitization function I was talking about. As you can see, it overrides the callbacks and specifies custom flags and whitelist. This would be pretty easy to do as a modification at the C end. function sanitizeHTML(htmlStr) {
if (!this.sanitizer) {
var SnuOwnd = window.SnuOwnd;
var redditCallbacks = SnuOwnd.getRedditCallbacks();
var callbacks = SnuOwnd.createCustomCallbacks({
paragraph: function(out, text, options){
if (text) out.s += text.s;
},
autolink: redditCallbacks.autolink,
raw_html_tag: redditCallbacks.raw_html_tag
});
var rendererConfig = SnuOwnd.defaultRenderState();
rendererConfig.flags = SnuOwnd.DEFAULT_WIKI_FLAGS;
rendererConfig.html_element_whitelist = [
'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'span', 'div', 'code',
'br', 'hr', 'p', 'a', 'img', 'pre', 'blockquote', 'table',
'thead', 'tbody', 'tfoot', 'tr', 'th', 'td', 'strong', 'em',
'i', 'b', 'u', 'ul', 'ol', 'li', 'dl', 'dt', 'dd',
'font', 'center', 'small', 's', 'q', 'sub', 'sup', 'del'
];
rendererConfig.html_attr_whitelist = [
'href', 'title', 'src', 'alt', 'colspan',
'rowspan', 'cellspacing', 'cellpadding', 'scope',
'face', 'color', 'size', 'bgcolor', 'align'
];
this.sanitizer = SnuOwnd.getParser({
callbacks: callbacks,
context: rendererConfig
});
}
return this.sanitizer.render(htmlStr);
} |
Sorry, I was trying to be clever by comparing the proportional speedup (i.e. rendering speed) but I've realised that 3 datapoints really isn't enough to do that. Smaller numbers are faster, which is why I was saying that the load speed of emsnudown is quite a bit slower. Interesting, ok, in theory I should be able to just hook into the stuff that's already in the C code. Might I ask where you use this? I had a look in the RES code but can't see it (nor can I actually see anything but the default renderer being used, implying wiki rendering doesn't work). |
This is stalled while I think about emscripten-core/emscripten#1382 It's not very good if the strange unicode faces people like to use don't preview correctly. |
The above is solved in the emscripten unstable branch, just need it to graduate to stable... |
Now waiting on a bug in the actual snudown code - reddit/snudown#54 |
Ok, I’m happy with the rendering. I’ll convert this issue to a PR when I get a bit of time to do the legwork and see what I need to change to fully replace SnuOwnd. Some brief notes:
|
I don't get the jerkiness unless RES is still working on the page. I just tested a ~600 comment thread and it took almost a minute to append all the save-RES/bitcoin links etc. During that time it was still jumpy, but after that it was smooth. For the user docs link on the live page use this instead. Yours is missing the permalink part so you need to dig through the 'load comments' to find it. |
|
Thanks for that @gavin19, I thought I'd done that but managed to fail. @gamefreak they are all minor, to the extent that some of them are literally just notes for interest and would never be an issue in reality. To go through "test - behavior - impact" for each of them:
I absolutely agree about not fixing it (or any of the above). I wouldn't, even if emsnudown didn't exist. WRT replacement, we'll see if I can fill in for the HTML sanitisation functions... |
FYI, Snudown and its upstream Sundown are frozen and will be replaced. See vmg/sundown@37728fb for more info. |
@spladug thanks for that, I actually did notice the plans to replace snudown elsewhere, though it's interesting to now have the details of it being a coordinated effort and the intended roadmap. But: that commit was (almost exactly) 9 months ago and had a stated aim to have a successor to sundown (almost exactly) 8 months ago and had a 'soon' promise made (almost exactly) 3 months ago. This isn't me doubting it'll happen! It's just it's difficult to measure progress with no news. Unless you have some to share? ;) I look forward to sundown's successor - the learning process for emsnudown is coming to an end. And what is life for, if not for learning? |
As promised, this has been converted into a PR. However, the final piece of work (read: hackery) I've done to get a sanitiser working for RES has tipped the balance (in my mind) back to keeping snuownd. The original idea behind emsnudown was to create a thin shim wrapper to easily allow new snudown versions without having to do any work. Unfortunately the hack that I've had to resort to to expose the custom callback functionality has left it more fragile than I would like. In combination with snudown being deprecated (though I imagine we'll see minor changes slipping in...) the original motivation behind emsnudown is mostly lost and all it provides is a larger, slower, more difficult to customise version of snuownd in exchange for very edge case correctness. If you do have a compelling reason to want emsnudown that's fine (it's fairly easy for me to maintain) but I personally suggest rejecting this pull request and letting it be an example for what was attempted and/or starting point for emscripten compilation of the successor of snudown. Long live snuownd! |
No objections, closing. |
Issue to track progress of 'emsnudown', as initially discussed in #434
This issue will eventually be converted to a PR.