Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std.json: Add support for recursive objects to std.json.parse #9307

Merged
merged 10 commits into from Aug 20, 2021

Conversation

greenfork
Copy link
Contributor

@greenfork greenfork commented Jul 5, 2021

Inferred errors don't work for recursive function calls, so we add an explicit error set.

parseInternal recurses in several places, current error catches this:

const v = try parseInternal(ptrInfo.child, tok, tokens, options);

lib/std/json.zig Outdated
UnexpectedToken,
UnexpectedValue,
UnknownField,
} || SkipValueError || TokenStream.Error || error{OutOfMemory} || std.fmt.ParseIntError;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ParseInternalError intentionally doesn't have a specified error set, as different errors are only possible/reachable with certain types passed.
e.g. OutOfMemory cannot be hit when T isn't a pointer type.

If anything, you need a function ParseInternalError(comptime T: type); in future (once comptime field support improves) that may also need to take a options: ParseOptions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I tried to do your suggested approach. Now I don't know how to avoid infinite recursion. In the test I wrote it goes like this:

  1. Check union errors, e.g. check all union fields
  2. i64 - no errors, next
  3. Array - check child field type
  4. Child field type is union, check all union fields
  5. ...

I can't check whether it is going to be a recursion because I need to retrieve a parent of my parent, the sequence is Union -> Array -> Union. And of course more complex scenarios are possible.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there lies the rub.

That's why the compiler couldn't figure it out either :P

The json parser accepting pointer types opens it up for infinite recursion; and currently this can lead to unbounded stack usage! Given the future goals of zig include limiting unbounded stack usage, perhaps the solution is to remove pointers to complex objects from the accepted input of json.parse.

Edit: alternative idea: what if ParseOptions had a field blackListedTypes that would cause a @compileError; then when recursing, add the current type to the black listed types?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't think I understand your suggestion. Do you suggest to raise a compile error whenever there's a recursive type definition we are trying to parse? This is the exact reason for this PR, I want to be able to parse into recursive definitions. Maybe I misunderstand something.

Also, can I do it like this? commit Don't analyze already inferred types

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, maybe I wasn't clear enough. By "infinite recursion" I meant that compiler produces an error that eval quota is overreached (@setEvalBranchQuota), it already saves from infinite recursion scenario.

@daurnimator daurnimator added the standard library This issue involves writing Zig code for the standard library. label Jul 5, 2021
@greenfork greenfork force-pushed the support-json-recursive-objects branch from 2c4ea04 to da436f7 Compare July 5, 2021 15:35
Copy link
Collaborator

@daurnimator daurnimator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done! looks like you found a way to make it work; no API objections from me now :)

Though I'm not sure about some of the style, or if the pattern of inferred_types will make the stage1 compiler cry.

Will leave it for others to review.

lib/std/json.zig Outdated
@@ -1530,7 +1532,81 @@ test "skipValue" {
}
}

fn parseInternal(comptime T: type, token: Token, tokens: *TokenStream, options: ParseOptions) !T {
fn ParseInternalError(comptime T: type, inferred_types: []const type) type {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think inferred_types should be comptime?

Copy link
Member

@Vexu Vexu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could also be made public as ParseError so that users can get the error set without using @typeInfo.

lib/std/json.zig Outdated
@@ -1530,7 +1532,81 @@ test "skipValue" {
}
}

fn parseInternal(comptime T: type, token: Token, tokens: *TokenStream, options: ParseOptions) !T {
fn ParseInternalError(comptime T: type, inferred_types: []const type) type {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function call memoization should do the same as your inferred_types check (and perform better). Did you have some issues with it that led to this? If so it should be documented here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the above comment thread: something like this is required to not blow up on recursion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understood correctly, then yes, recursion is the reason. I can provide a function with a single argument though if this is the desired signature by moving this code to ParseInternalErrorImpl and doing something like this:

fn ParseInternalError(comptime T: type) type {
    const parse_internal_inferred_types = [_]type{};
    return ParseInternalErrorImpl(T, &parse_internal_inferred_types);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, because the call hasn't completed yet, missed that.

That helper would work as the ParseError I mentioned in the other comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I completely understood your comment but implementation should be fine.

lib/std/json.zig Outdated
token: Token,
tokens: *TokenStream,
options: ParseOptions,
) ParseInternalError(T, std.mem.span(&parse_internal_inferred_types))!T {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

span isn't doing anything here, &parse_internal_inferred_types should already coerce properly.

lib/std/json.zig Outdated
@@ -2457,11 +2574,11 @@ pub fn unescapeValidString(output: []u8, input: []const u8) !void {
} else |err| {
// it might be a surrogate pair
if (err != error.Utf8CannotEncodeSurrogateHalf) {
return error.InvalidUnicodeHexSymbol;
return UnescapeValidStringError.InvalidUnicodeHexSymbol;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return UnescapeValidStringError.InvalidUnicodeHexSymbol;
return error.InvalidUnicodeHexSymbol;

These are semantically equivalent and currently using the error.Foo syntax is preferred.

@greenfork greenfork requested a review from Vexu July 8, 2021 05:26
@greenfork greenfork force-pushed the support-json-recursive-objects branch from a9e6f16 to 59468d5 Compare July 25, 2021 14:35
@Vexu Vexu merged commit b2e970d into ziglang:master Aug 20, 2021
@greenfork greenfork deleted the support-json-recursive-objects branch December 21, 2021 01:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants