New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
std.json: Add support for recursive objects to std.json.parse #9307
std.json: Add support for recursive objects to std.json.parse #9307
Conversation
lib/std/json.zig
Outdated
UnexpectedToken, | ||
UnexpectedValue, | ||
UnknownField, | ||
} || SkipValueError || TokenStream.Error || error{OutOfMemory} || std.fmt.ParseIntError; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ParseInternalError
intentionally doesn't have a specified error set, as different errors are only possible/reachable with certain types passed.
e.g. OutOfMemory
cannot be hit when T
isn't a pointer type.
If anything, you need a function ParseInternalError(comptime T: type)
; in future (once comptime field support improves) that may also need to take a options: ParseOptions
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I tried to do your suggested approach. Now I don't know how to avoid infinite recursion. In the test I wrote it goes like this:
- Check union errors, e.g. check all union fields
- i64 - no errors, next
- Array - check child field type
- Child field type is union, check all union fields
- ...
I can't check whether it is going to be a recursion because I need to retrieve a parent of my parent, the sequence is Union -> Array -> Union. And of course more complex scenarios are possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there lies the rub.
That's why the compiler couldn't figure it out either :P
The json parser accepting pointer types opens it up for infinite recursion; and currently this can lead to unbounded stack usage! Given the future goals of zig include limiting unbounded stack usage, perhaps the solution is to remove pointers to complex objects from the accepted input of json.parse
.
Edit: alternative idea: what if ParseOptions
had a field blackListedTypes
that would cause a @compileError
; then when recursing, add the current type to the black listed types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't think I understand your suggestion. Do you suggest to raise a compile error whenever there's a recursive type definition we are trying to parse? This is the exact reason for this PR, I want to be able to parse into recursive definitions. Maybe I misunderstand something.
Also, can I do it like this? commit Don't analyze already inferred types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, maybe I wasn't clear enough. By "infinite recursion" I meant that compiler produces an error that eval quota is overreached (@setEvalBranchQuota
), it already saves from infinite recursion scenario.
2c4ea04
to
da436f7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well done! looks like you found a way to make it work; no API objections from me now :)
Though I'm not sure about some of the style, or if the pattern of inferred_types
will make the stage1 compiler cry.
Will leave it for others to review.
lib/std/json.zig
Outdated
@@ -1530,7 +1532,81 @@ test "skipValue" { | |||
} | |||
} | |||
|
|||
fn parseInternal(comptime T: type, token: Token, tokens: *TokenStream, options: ParseOptions) !T { | |||
fn ParseInternalError(comptime T: type, inferred_types: []const type) type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think inferred_types
should be comptime
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could also be made public as ParseError
so that users can get the error set without using @typeInfo
.
lib/std/json.zig
Outdated
@@ -1530,7 +1532,81 @@ test "skipValue" { | |||
} | |||
} | |||
|
|||
fn parseInternal(comptime T: type, token: Token, tokens: *TokenStream, options: ParseOptions) !T { | |||
fn ParseInternalError(comptime T: type, inferred_types: []const type) type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function call memoization should do the same as your inferred_types
check (and perform better). Did you have some issues with it that led to this? If so it should be documented here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the above comment thread: something like this is required to not blow up on recursion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understood correctly, then yes, recursion is the reason. I can provide a function with a single argument though if this is the desired signature by moving this code to ParseInternalErrorImpl
and doing something like this:
fn ParseInternalError(comptime T: type) type {
const parse_internal_inferred_types = [_]type{};
return ParseInternalErrorImpl(T, &parse_internal_inferred_types);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, because the call hasn't completed yet, missed that.
That helper would work as the ParseError
I mentioned in the other comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I completely understood your comment but implementation should be fine.
lib/std/json.zig
Outdated
token: Token, | ||
tokens: *TokenStream, | ||
options: ParseOptions, | ||
) ParseInternalError(T, std.mem.span(&parse_internal_inferred_types))!T { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
span
isn't doing anything here, &parse_internal_inferred_types
should already coerce properly.
lib/std/json.zig
Outdated
@@ -2457,11 +2574,11 @@ pub fn unescapeValidString(output: []u8, input: []const u8) !void { | |||
} else |err| { | |||
// it might be a surrogate pair | |||
if (err != error.Utf8CannotEncodeSurrogateHalf) { | |||
return error.InvalidUnicodeHexSymbol; | |||
return UnescapeValidStringError.InvalidUnicodeHexSymbol; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return UnescapeValidStringError.InvalidUnicodeHexSymbol; | |
return error.InvalidUnicodeHexSymbol; |
These are semantically equivalent and currently using the error.Foo
syntax is preferred.
a9e6f16
to
59468d5
Compare
Inferred errors don't work for recursive function calls, so we add an explicit error set.
parseInternal
recurses in several places, current error catches this:zig/lib/std/json.zig
Line 1768 in b7da1b2