[Vulnerability Report Disclosure: ReDoS] Catastrophic Backtracking in User-supplied Regular Expression #10045
Labels: announcement, core bug, security
The vulnerability has been fixed (refer to the security advisory). According to the consultation with @DIYgod, this is a complete disclosure of the vulnerability report, disclosed 120 hours after the fix commit.
For a better reading experience, download the PDF here.
A brief Simplified Chinese translation is attached in the next reply.
Vulnerability Report (RSSHub)
Regular expression Denial of Service (ReDoS)
Catastrophic Backtracking in User-supplied Regular Expression
Author: Rongrong (@Rongronggg9)
POC
The vulnerability was reported to the repository owner and has since been fixed (refer to the security advisory). To reproduce the vulnerability, a version before 041cfc3 (inclusive) is needed.
diygod/rsshub:2022-06-20 is the last vulnerable Docker image.
curl 'http://example.com/test/complicated?filter=%28%5B%5E%3C%3E%27%22%5D%2B%29%2A%22%3C&filterout=%28%5B%5E%3C%3E%27%22%5D%2B%29%2A%22%3C'
The node index.js process now constantly occupies a whole CPU core. This can last for at least several hours.
EXP
example.com is an RSSHub instance.
/test/complicated can be nearly any "route" of RSSHub; here it is a randomly chosen "test route".
filter, filter_title, filter_description, filter_author, filterout, filterout_title, filterout_description, and filterout_author are all so-called parameters of RSSHub, which can be supplied in the URI query. These parameters accept user-supplied regular expressions with unconditional trust, then call String.match() to perform regular expression matches. All of these parameters are vulnerable.
%28%5B%5E%3C%3E%27%22%5D%2B%29%2A%22%3C is the URL-encoded form of the regular expression /([^<>'"]+)*"</. The POC shows a catastrophic-backtracking-vulnerable regular expression specially designed for any HTML. However, most catastrophic-backtracking-vulnerable regular expressions can probably be used to construct an effective DoS attack (refer to the "Look back" chapter for more details).
Once catastrophic backtracking occurs, String.match() can take several hours or more to finish. There is no timeout enforced, which makes the ReDoS attack "zero-cost": attack once, down for hours. On multi-core servers, the ReDoS attack can probably only affect the RSSHub instance itself, while on single-core servers, especially VPS[1], it can be a disaster.
The component cannot be disabled. Every RSSHub instance is vulnerable "out-of-box" unless additional access control is applied. If an external watchdog or timeout is enforced (e.g. Vercel, GAE), the downtime of each effective attack can be limited. However, the instance must be terminated first in order to resume it, so the attack cost is still too low.
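As a rough illustration of the explanation above, the following Node.js sketch decodes the filter parameter from the POC request and times String.match() against inputs of growing length. The helper name timeMatch and the test inputs (runs of the letter "a" that never contain `"<`) are assumptions made for this example, not taken from RSSHub. The lengths are deliberately small so the script finishes quickly; the point is only that the elapsed time grows steeply as the input gets longer.

```javascript
// Decode the `filter` parameter of the POC request (URL is a Node.js global).
const poc = new URL(
  'http://example.com/test/complicated' +
  '?filter=%28%5B%5E%3C%3E%27%22%5D%2B%29%2A%22%3C' +
  '&filterout=%28%5B%5E%3C%3E%27%22%5D%2B%29%2A%22%3C'
);
const source = poc.searchParams.get('filter');
console.log(source); // ([^<>'"]+)*"<

// Build the pattern with the built-in backtracking engine (RegExp).
const pattern = new RegExp(source);

// Hypothetical helper: time one String.match() call against a string
// of `length` characters that can never satisfy the trailing `"<`.
function timeMatch(length) {
  const input = 'a'.repeat(length);
  const start = process.hrtime.bigint();
  const result = input.match(pattern);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { length, matched: result !== null, elapsedMs };
}

// Keep the lengths small: each extra character makes the search
// considerably more expensive on a backtracking engine.
for (const length of [14, 16, 18, 20]) {
  console.log(timeMatch(length));
}
```

Real feed content is far longer than these inputs, which is why a single request can keep a core busy for hours.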
Vulnerable version
Since Git hash 4671720, Pull Request #3910 on GitHub, which was merged on Feb 9, 2020 (28.5 months ago).
Fixed in 5c41774.
Condition to be vulnerable
Unless additional access control is applied.
If an external watchdog or timeout is enforced, the downtime can be limited but the instance is still vulnerable.
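To make the effect of a timeout concrete, here is a minimal sketch of an in-process time budget using Node's built-in vm module. It is not the fix adopted by RSSHub (this report states that the fix switches to RE2), and the helper name testWithTimeout and the 100 ms budget are assumptions for illustration only. It bounds the duration of one match, but an attacker can still repeat the request, which is consistent with the statement above that a timeout limits downtime without removing the vulnerability.

```javascript
const vm = require('vm');

// Hypothetical helper (not part of RSSHub): run `pattern.test(input)` with a
// wall-clock budget. Returns true/false, or null if the budget was exceeded.
function testWithTimeout(source, input, timeoutMs) {
  // Compile outside the try block so an invalid pattern still throws SyntaxError.
  const pattern = new RegExp(source);
  try {
    return vm.runInNewContext(
      'pattern.test(input)',
      { pattern, input },
      { timeout: timeoutMs }
    );
  } catch (err) {
    // vm reports an exceeded budget with this error code.
    if (err && err.code === 'ERR_SCRIPT_EXECUTION_TIMEOUT') {
      return null;
    }
    throw err;
  }
}

// The regular expression from the POC, in decoded form.
const pocSource = `([^<>'"]+)*"<`;

// A string containing `"<` matches almost immediately.
console.log(testWithTimeout(pocSource, 'title="<b>', 100));

// A string without `"<` triggers heavy backtracking; the call is expected
// to be cut off once the budget is spent instead of running for hours.
console.log(testWithTimeout(pocSource, 'a'.repeat(32), 100));
```

Note that the vm timeout blocks the event loop until it fires, so the budget should be kept short.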
Vulnerability grade
High.
Possible fix
Either:
Timeline
Considering that the fix for the vulnerability is simple, I prefer a 28+7-day timeline (UTC).
Jun 21, 2022 (Day 0)
Jul 5, 2022 (Day 14)
Jun 22, 2022 - Jul 19, 2022 (Day 1 - Day 28)
Jul 20, 2022 - Jul 26, 2022 (Day 29 - Day 35)
Jul 27, 2022 (Day 36)
Additional conditions
Timeline (in reality)
Jun 21, 2022 15:31 UTC (0h)
Jun 21, 2022 15:56 UTC (0.4h)
RE2).
Jun 21, 2022 17:52 UTC (2.4h)
Jun 22, 2022 10:43 UTC (19.2h)
Jun 22, 2022 19:25 UTC (27.9h)
Jun 26, 2022 15:56 UTC (120.4h)
Look back
A ReDoS attack is usually severe yet nearly costless, "defeating the strong with little effort". Carelessness and over-trust in backtracking regular expression engines are the root causes of a ReDoS vulnerability, while the common immediate causes are:
ReDoS vulnerabilities are not very common, since they require a catastrophic-backtracking-vulnerable regular expression, which is usually written by programmers and out of the control of end users. But in this vulnerability of RSSHub, things are different: both the regular expression and the string can be user-supplied. In the POC, I merely showed how to supply a vulnerable regular expression. Why can the string also be user-supplied? Some "routes" (e.g. /twitter/user/:username) grab posts from social media, so anyone can post malicious strings and request RSSHub to grab them. An experienced attacker may simply construct a vulnerable regular expression specially designed for certain strings, just like the one I have demonstrated, while an inexperienced attacker can easily use known combinations of vulnerable regular expressions and malicious strings. As a result, everyone who knows what catastrophic backtracking is can probably construct an effective attack.
The fix adopts an alternative regular expression engine (RE2) which is backtracking-free. It is quite simple but effective, and a nice example of balancing functionality and security.
However, as mentioned, two commits (refer to the previous chapter) have opened up the attack surface again in certain conditions. Those commits re-empower instance maintainers to switch back to the vulnerable regular expression engine, the vanilla one built into JavaScript (RegExp). To be responsible, I warn instance maintainers again not to switch back to RegExp unless:
Revisions
Rev. 1
Initial report.
Rev. 2
Document the chosen fix in the chapter "Possible fix".
Correct some wording and statements.
Add two new chapters: "Timeline (in reality)" and "Look back".
Rev. 3
Minor revision.
Document two commits opening up the attack surface again in certain conditions.
Rev. 4
Fix the statement about /test/complicated.
Rev. 5
Correct some mistakes in the chapter "Timeline (in reality)".
Rev. 6
Document the last vulnerable Docker image.
Footnotes
1. Most VPS providers take advantage of a technology called "CPU credits". If a VPS has a continuously high CPU load, it can consume all remaining CPU credits, causing the VPS provider to limit performance to a fairly low level. Even after the high load ends, it can take at least several hours before there are enough CPU credits for performance to recover.
2. The attack cost is still too low.
3. Still vulnerable if manually enabled and matching the "condition to be vulnerable".
4. What you are reading.