Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release/8.0] Fix to #32235 - Buffering error in JSON deserialization with junk data #32253

Merged
merged 4 commits into from
Nov 14, 2023

Conversation

maumar
Copy link
Contributor

@maumar maumar commented Nov 8, 2023

Fixes #32235
Fixes #32194

Description
When skipping unmapped json property make sure we skip it correctly (i.e. entire property is in the buffer), otherwise expand the buffer, read more and try again. Also, when materializing json collections, throw when unexpected token is encountered.

Customer impact
This can cause bugs when materializing entities mapped to JSON, if the JSON contains unmapped properties (junk data). Because we incorrectly skip those in some cases, we may end up in invalid state when streaming the results.

How found
Found by EF Core team member when implementing support for postgres.

Regression
Yes.

Testing
Added tests.

Risk
Low: we use the same mechanism for skipping as we do for regular reads and that works without issue. Also improved robustness of the code in other places, by throwing exceptions if somehow we end up in bad state regardless. Added quirk just in case.

Copy link
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The below looks OK to me, but I'll let @ajcvickers approve as he's more familiar with this area.

But looking at JsonCollectionReaderWriter.FromTypedJson the loop there looks like it also needs patching. For example, if an unexpected nested array (or object) is found in the middle of the JSON array, it seems like we'd just go inside and start adding its own elements. Ideally we should always have a default case for this kind of switch, etc.

@ajcvickers
Copy link
Member

@maumar @roji I added tests and fixes for the collection reader/writer. I think this is good to go to Tactics now.

Copy link
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK from my end but see some comments.

@@ -46,11 +46,24 @@ public JsonCollectionReaderWriter(JsonValueReaderWriter<TElement> elementReaderW
collection.Clear();
}

while (manager.CurrentReader.TokenType != JsonTokenType.EndArray)
if (manager.CurrentReader.TokenType == JsonTokenType.None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have context and am probably missing something, but... Does this mean that it's "unclear" whether FromJsonTyped is called with the manager already being on a token or not? Shouldn't this be a clear contract of FromJsonTyped, where MoveNext is either always called before calling FromJsonTyped, or never?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I will file a bug on this. The path for handling arrays created by query goes down one thing, while other code does anther.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roji Looks like you already filed this here: #32269 :-)

src/EFCore/Storage/Json/JsonCollectionReaderWriter.cs Outdated Show resolved Hide resolved
@@ -62,6 +75,16 @@ public JsonNullableStructCollectionReaderWriter(JsonValueReaderWriter<TElement>
case JsonTokenType.Null:
collection.Add(null);
break;
case JsonTokenType.None:
case JsonTokenType.StartObject:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noting that this makes it impossible to encode a scalar as e.g. an object or array, although this may be a legitimate representation. For example, it may make sense for a geometry point representation to be an array containing [X,Y], or for some other complex thing to have an object representation.

In practice, we're probably unlikely to do this, since the JSON representation usually needs to be convertible to a relational value (when extracting a value and e.g. comparing it against something), and such a nice, structured JSON representation is unlikely to be convertible (though may it is in some cases?).

But I'm still not sure this is something that the collection reader/writer should block: if we just called into _elementReaderWriter.FromJsonTyped with StartObject, wouldn't that throw anyway if the reader expects something else (e.g. a string or int)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't supported right now without writing a different collection reader/writer, but I think we could make it work in the future. I'll file an issue.

@ajcvickers ajcvickers added this to the 8.0.x milestone Nov 13, 2023
maumar and others added 4 commits November 13, 2023 15:18
When skipping unmapped json property make sure we skip it correctly (i.e. entire property is in the buffer), otherwise expand the buffer, read more and try again.
Also, when materializing json collections, throw when unexpected token is encountered.

Fixes #32235
@AndriySvyryd AndriySvyryd changed the title Fix to #32235 - Buffering error in JSON deserialization with junk data [release/8.0] Fix to #32235 - Buffering error in JSON deserialization with junk data Nov 13, 2023
@ajcvickers ajcvickers merged commit 0db7891 into release/8.0 Nov 14, 2023
7 checks passed
@ajcvickers ajcvickers deleted the fix32235_80 branch November 14, 2023 09:49
@ajcvickers ajcvickers removed this from the 8.0.x milestone Nov 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants