New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JSON.parse optimizations, refactoring, DoS-attack prevention #1485
JSON.parse optimizations, refactoring, DoS-attack prevention #1485
Conversation
* JSON.parse now only allocates about 20% of the previously needed memory * JSON.parse more strict according to the JSON spec * Added JSON.parse recursion limit to prevent StackOverflow-Exception
…N/parse/text-negative-zero.js" test.
return ConstructFast(contents, 0, contents.Length); | ||
} | ||
|
||
internal JsArray ConstructFast(JsValue[] contents, int offset, int length) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this change seems unnecessary, also probably causes more array bounds checks due to calculation, maybe revert this and check my suggestions at call sites
namespace Jint.Native.Json | ||
{ | ||
public sealed class JsonParser | ||
{ | ||
private readonly Engine _engine; | ||
private readonly int _maxDepth; | ||
|
||
/// <summary> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this comment seems to be out-of-place
Jint/Native/Json/JsonParser.cs
Outdated
} | ||
|
||
private Extra _extra = null!; | ||
public JsonParser(Engine engine, uint maxDepth) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uint
seems odd for public API in this case, could take int and throw argument exception if <= 0?
private static bool IsDecimalDigit(char ch) | ||
{ | ||
return (ch >= '0' && ch <= '9'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be more readable without the unchecked and uint
is more common type when doing these overlow-based checks
Jint/Native/Json/JsonParser.cs
Outdated
ch >= 'a' && ch <= 'f' || | ||
ch >= 'A' && ch <= 'F' | ||
; | ||
unchecked |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need for unchecked, uint
is the usual idiom
Jint/Native/Json/JsonParser.cs
Outdated
Expect("{"); | ||
if ((++state.CurrentDepth) > _maxDepth) | ||
ThrowDepthLimitReached(_lookahead); | ||
Expect(ref state, '{'); | ||
|
||
var obj = _engine.Realm.Intrinsics.Object.Construct(Arguments.Empty); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var obj = new JsObject(_engine);
return ParseJsonObject(); | ||
return Lex(ref state).Value; | ||
case Tokens.Punctuator: | ||
if (_lookahead.FirstCharacter == '[') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
braces
Jint/Native/Json/JsonParser.cs
Outdated
@@ -826,14 +728,25 @@ private enum Tokens | |||
EOF, | |||
}; | |||
|
|||
class Token | |||
private class Token |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should mark sealed
Jint/Options.cs
Outdated
/// The maximum depth allowed when parsing JSON files using "JSON.parse", | ||
/// defaults to 64. | ||
/// </summary> | ||
public uint MaxJsonParseDepth { get; set; } = 64u; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it make sense to have separate public JsonOptions Json { get; set} = new JsonOptions();
under Options
so if we later want more behavioral configuration it would all be there?
var engine = new Engine(options => options.Json.MaxParseDepth = 42); // note shorter property name
Jint/Native/Json/JsonParser.cs
Outdated
} | ||
|
||
private Token ScanNumericLiteral() | ||
private string ScanPuncatatorValue(int start, char code) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ScanPunctuatorValue
Quite impressive PR backed with great improvement numbers, nice work! I've added some comments, please take a look. |
Hi, thanks for taking your time to review it. Will go through the comments and adjust the PR accordingly. Might take a day or two, depending on when I find some time to do it. |
I applied all suggestions. Hopefully I didn't miss any braces 😄. I also did a quick Benchmark-Re-Run with the suggested modifications and from a quick glance, it even got a tiny little bit faster now. Not much (and not always) - but at least a little bit 😀. The Here are the results if you are interested: (NewParser means "After I applied your suggestions" and OldParser means "Before applying your suggestions"). BenchmarkDotNet=v0.13.4, OS=Windows 10 (10.0.19042.2251/20H2/October2020Update)
AMD Ryzen 9 5950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK=7.0.102
[Host] : .NET 6.0.13 (6.0.1322.58009), X64 RyuJIT AVX2
DefaultJob : .NET 6.0.13 (6.0.1322.58009), X64 RyuJIT AVX2
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like this outcome and improvement. Thank you for taking the time to refactor the original code and iterate on feedback!
Hi there,
first PR for this project, hopefully got everything right 😃.
I tried to not modify too much to not destroy previous knowledge of the parser code.
Summary
Adjustments to the parser
The old parser did parse the following strings, but they are not valid according to the JSON specification:
"[1,]"
: return value wasJsArray([1, null])
".1"
: return value wasJsNumber(0.1)
These will now throw a JavaScriptException. These cases are baked by tests.
Recursion limit
Previously the parser did not have a recursion limit. A specially crafted JSON string could lead to a non-handlable
StackOverflowException
. I've added a max. recursion limit which is valid for arrays and objects. The default value is 64 (same as the default value of System.Text.Json). The limit can get adjusted per Parser-Instance or globally usingJsonParser.DefaultMaxDepth
.Performance Improvements
I've refactored several areas of the JsonParser to improve the performance. I validated the changes running BenchmarkDotNet using different JSON files. I've used the following sources to get sample JSON files for the benchmark:
The JSON files for the test had a size between 2.7KB and 43.3MB (UTF-8 encoded).
For that I had to remove some old code which either wasn't used at all (like the
_extra
-field) or did not return any useful result with the current implementation (likeParseJsonObject()
which is now private orMarkEnd...
).While the old code was generally about 20% slower compared to the Json.NET parser (Newtonsoft.Json), it now requires about half Json.NET needs when used by calling
JsonConvert.DeserializeObject()
. It is still slower compared to more optimized parsers like System.Text.Json or SpanJson.Test execution
The test was basically just
To make both parsers runnable within the same test run, I simply copied the changes temporarily into a new class
JsonParser2
and kept the other class unchanged.Results
(quite a lot, but the relevant columns are "Ratio" and "Alloc Ratio")
Again, I hope I got everything right.