fix: Speed up stripping of markdown #2097

marksteve · 2021-04-29T04:58:47Z

We were seeing huge CPU spikes that stalled our Outline server for an hour whenever our wiki users searched for certain keywords. After some digging, I traced the high CPU usage to the search endpoint calling removeMarkdown() to render the search result context.

My fix replaces that package with remark and the strip-markdown plugin. Note that the plugin has no option to disable HTML stripping and has some quirks. I'm not sure which HTML is being whitelisted here.


CLAassistant · 2021-04-29T04:58:52Z

All committers have signed the CLA.

tommoor · 2021-04-29T05:03:08Z

This is a great find, any idea if it's certain keywords and if so which?

Edit: Seems like it's this – stiang/remove-markdown#35 – so a large number of spaces in a searched document would trigger it.
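For illustration, the failure mode can be sketched roughly as below. The pattern is a hypothetical stand-in, not the actual expression from remove-markdown: an unanchored `\s+$` can take quadratic time in a backtracking regex engine when a long run of spaces is followed by a non-space character, because the match is retried from every space. `trimTrailingWhitespace` and `timeMs` are assumed names used only for this sketch.

```typescript
// Hypothetical pattern for illustration only; NOT the regex from remove-markdown.
// On a long run of spaces followed by a non-space character, a backtracking
// engine may retry the match from every space, giving quadratic work.
const slowTrailingWhitespace = /\s+$/;

// Alternative without backtracking: walk back from the end exactly once.
function trimTrailingWhitespace(input: string): string {
  let end = input.length;
  while (end > 0 && /\s/.test(input.charAt(end - 1))) {
    end--;
  }
  return input.slice(0, end);
}

// Measures how long a callback takes, in milliseconds.
function timeMs(fn: () => void): number {
  const start = Date.now();
  fn();
  return Date.now() - start;
}

// A document with many spaces that does not end in whitespace.
const doc = "keyword" + " ".repeat(5000) + "x";

const regexMs = timeMs(() => doc.replace(slowTrailingWhitespace, ""));
const scanMs = timeMs(() => trimTrailingWhitespace(doc));
console.log(`regex: ${regexMs}ms, scan: ${scanMs}ms`);
```

Raising the repeat count makes the gap between the two approaches easier to see; with patterns that nest quantifiers the growth can be far worse than quadratic.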

marksteve · 2021-04-29T05:10:16Z

Ooh. I thought it was just because of big documents. Updating your fork would be a better fix!

tommoor · 2021-04-29T05:14:17Z

I'll pull in the fix from the other repo that hasn't been merged 🙄 – you're right, I think less churn would be good here, as would being able to retain the stripHTML option. Regardless, just finding this is huge.

marksteve · 2021-04-29T05:14:27Z

And I guess the HTML that needs to be whitelisted is for emphasizing the matching terms?

tommoor · 2021-04-29T05:15:21Z

That's right – pg returns html tags for that, lol
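As a rough sketch of what whitelisting means here: assuming the highlight markers are plain `<b>` and `</b>` (the default `StartSel`/`StopSel` of PostgreSQL's `ts_headline`), a stripper could drop every other tag and leave those intact. `stripHtmlExceptHighlight` and `ALLOWED_TAGS` are hypothetical names, not Outline's code.

```typescript
// Tags assumed to come from the database highlighter
// (assumption: the ts_headline defaults, with no attributes).
const ALLOWED_TAGS = new Set(["<b>", "</b>"]);

// Single left-to-right scan, so the cost stays linear in the input length.
// Limitation of this sketch: anything between "<" and the next ">" is treated
// as a tag, so text such as "a < b and c > d" would lose its middle part.
function stripHtmlExceptHighlight(input: string): string {
  let out = "";
  let i = 0;
  while (i < input.length) {
    const open = input.indexOf("<", i);
    if (open === -1) {
      // No more tags: copy the rest verbatim.
      out += input.slice(i);
      break;
    }
    out += input.slice(i, open);
    const close = input.indexOf(">", open);
    if (close === -1) {
      // Unterminated "<": treat the remainder as plain text.
      out += input.slice(open);
      break;
    }
    const tag = input.slice(open, close + 1);
    if (ALLOWED_TAGS.has(tag.toLowerCase())) {
      out += tag; // keep highlight markers, drop every other tag
    }
    i = close + 1;
  }
  return out;
}

console.log(stripHtmlExceptHighlight("<p>find the <b>keyword</b> <i>here</i></p>"));
// prints "find the <b>keyword</b> here"
```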

marksteve · 2021-04-29T05:17:37Z

Got it! Closing this then. Thank you!


tommoor · 2021-04-29T05:48:09Z

By the way, I suppose this was also the issue you were seeing with timeouts when searching from Slack on the cloud-hosted version.

marksteve · 2021-04-29T10:15:50Z

Yep! Most probably

fix: Speed up stripping of markdown in search result context and summaries

fd3ab11

auto-assign bot requested a review from tommoor April 29, 2021 04:58

fix: Remove unsupported argument

ef94d9d

marksteve closed this Apr 29, 2021

tommoor added a commit that referenced this pull request Apr 29, 2021

fix: ReDoS attack vulnerability when searching documents that contain many space characters

4d68a34

see: #2097
see: https://snyk.io/vuln/SNYK-JS-REMOVEMARKDOWN-73635

marksteve deleted the fix/remove-markdown branch April 29, 2021 10:22
