This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Cache polymorphic properties #41753

Merged
merged 6 commits into from
Oct 21, 2019


@svick svick commented Oct 13, 2019

Fixes https://github.com/dotnet/corefx/issues/41638.

When a polymorphic property (i.e. one whose type is object) is serialized, its JsonPropertyInfo is created every time, based on the runtime type of the property. This is a problem, because it means accessing the property will JIT compile the Get delegate every time, which takes a lot of time.

This PR changes that, by caching the JsonPropertyInfo for each encountered runtime type. This should be an okay thing to do, since most polymorphic properties should only use a small number of types.

One case where this could cause an issue is if doing this prevents the type from unloading. But, as far as I can tell, that is already a problem with System.Text.Json, and one that can be solved by switching the instance of JsonSerializerOptions that's used.
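
A minimal sketch of the per-runtime-type caching idea described above, not the code from this PR. RuntimePropertyInfo and PolymorphicCache are hypothetical stand-ins for JsonPropertyInfo and the serializer's cache, and CreateCount exists only to make the effect observable:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for JsonPropertyInfo: metadata that is costly to build.
sealed class RuntimePropertyInfo
{
    public Type RuntimeType { get; }
    public RuntimePropertyInfo(Type runtimeType) => RuntimeType = runtimeType;
}

static class PolymorphicCache
{
    // Counts how often the expensive construction path runs.
    public static int CreateCount;

    private static readonly ConcurrentDictionary<Type, RuntimePropertyInfo> s_cache =
        new ConcurrentDictionary<Type, RuntimePropertyInfo>();

    // Looks up the metadata by the value's runtime type and builds it only on a miss.
    // Under contention the factory may run more than once, but only one result is stored.
    public static RuntimePropertyInfo Get(object value)
    {
        return s_cache.GetOrAdd(value.GetType(), type =>
        {
            Interlocked.Increment(ref CreateCount);
            return new RuntimePropertyInfo(type);
        });
    }
}
```

Calling PolymorphicCache.Get twice with two different int[] values returns the same RuntimePropertyInfo instance, so the construction cost is paid once per runtime type rather than once per serialization.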

After adding a benchmark for this case the results of comparing System.Text.Json.Serialization microbenchmarks show clear improvement (up to 68x) for the new benchmark and no real change for other benchmarks:

summary:
better: 17, geomean: 5.809
worse: 4, geomean: 1.041
total diff: 21

| Slower | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
| --- | --- | --- | --- | --- |
| System.Text.Json.Serialization.Tests.WriteJson<Location>.SerializeToUtf8Bytes | 1.05 | 1855.03 | 1944.15 | |
| System.Text.Json.Serialization.Tests.ReadJson<MyEventsListerViewModel>.Deseriali | 1.05 | 1310030.00 | 1370308.74 | |
| System.Text.Json.Serialization.Tests.ReadJson<Hashtable>.DeserializeFromString | 1.04 | 163849.84 | 171039.26 | |
| System.Text.Json.Serialization.Tests.ReadJson<IndexViewModel>.DeserializeFromStr | 1.03 | 100573.41 | 103365.76 | |

| Faster | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
| --- | --- | --- | --- | --- |
| System.Text.Json.Serialization.Tests.WriteJson<HashSet<String>>.SerializeObjectP | 68.80 | 2275620.83 | 33077.25 | |
| System.Text.Json.Serialization.Tests.WriteJson<ArrayList>.SerializeObjectPropert | 52.81 | 2110367.71 | 39958.59 | |
| System.Text.Json.Serialization.Tests.WriteJson<Dictionary<String, String>>.Seria | 45.75 | 2337101.79 | 51079.52 | |
| System.Text.Json.Serialization.Tests.WriteJson<Hashtable>.SerializeObjectPropert | 30.13 | 2100212.50 | 69707.29 | |
| System.Text.Json.Serialization.Tests.WriteJson<ImmutableSortedDictionary<String, | 24.85 | 2287553.75 | 92056.69 | |
| System.Text.Json.Serialization.Tests.WriteJson<ImmutableDictionary<String, Strin | 20.94 | 2423108.75 | 115724.54 | |
| System.Text.Json.Serialization.Tests.WriteJson<LoginViewModel>.SerializeObjectPr | 17.77 | 21066.59 | 1185.80 | |
| System.Text.Json.Serialization.Tests.WriteJson<BinaryData>.SerializeObjectProper | 12.87 | 21884.38 | 1700.18 | |
| System.Text.Json.Serialization.Tests.WriteJson<Location>.SerializeObjectProperty | 9.01 | 21679.90 | 2404.95 | |
| System.Text.Json.Serialization.Tests.WriteJson<IndexViewModel>.SerializeObjectPr | 1.42 | 79311.34 | 56018.97 | |
| System.Text.Json.Serialization.Tests.WriteJson<Location>.SerializeToStream | 1.04 | 2222.72 | 2127.47 | |
| System.Text.Json.Serialization.Tests.ReadJson<ArrayList>.DeserializeFromStream | 1.04 | 116964.02 | 112564.32 | |
| System.Text.Json.Serialization.Tests.ReadJson<ImmutableSortedDictionary<String, | 1.04 | 374212.71 | 360219.74 | |
| System.Text.Json.Serialization.Tests.WriteJson<HashSet<String>>.SerializeToStrea | 1.04 | 26759.54 | 25804.20 | |
| System.Text.Json.Serialization.Tests.WriteJson<MyEventsListerViewModel>.Serializ | 1.04 | 1186575.00 | 1144969.42 | |
| System.Text.Json.Serialization.Tests.ReadJson<Location>.DeserializeFromStream | 1.03 | 3750.12 | 3632.62 | |
| System.Text.Json.Serialization.Tests.WriteJson<MyEventsListerViewModel>.Serializ | 1.03 | 1191000.89 | 1159996.21 | |

@ahsonkhan
Member

Wow! This is awesome. Thanks for taking this on, @svick, and great test case, @mrlund.

Once this change has been reviewed/merged to master, we should evaluate porting this to 3.1.

@ahsonkhan
Member

ahsonkhan commented Oct 14, 2019

After adding a benchmark for this case the results of comparing System.Text.Json.Serialization microbenchmarks show clear improvement (up to 68x) for the new benchmark and no real change for other benchmarks

Would you mind submitting a PR to dotnet/performance to add that benchmark to the suite of perf tests?
Edit: Looks like you already did :) dotnet/performance#936

Can you also measure and add (and share the results) for other polymorphic test cases outside of object (such as class B inherits from class A)? Presumably, those cases have similar regression. Also can you measure and add the test case similar to the one that @mrlund originally mentioned on the issue. Something like:

public class Foo
{
    public IEnumerable<int> Prop { get; set; } // or some POCO instead of int if needed
}

public class FooObject
{
    public object Prop { get; set; }
}

public void BuildObjects()
{
    var model = new List<int>();
    for (int i = 0; i < 100; i++) // arbitrary length chosen
    {
        model.Add(i);
    }
    _value = new Foo
    {
        Prop = model
    };

    _valueObject = new FooObject
    {
        Prop = model
    };
}

@ahsonkhan
Member

ahsonkhan commented Oct 14, 2019

When a polymorphic property (i.e. one whose type is object) is serialized, its JsonPropertyInfo is created every time, based on the runtime type of the property. This is a problem, because it means accessing the property will JIT compile the Get delegate every time, which takes a lot of time.

This PR changes that, by caching the JsonPropertyInfo for each encountered runtime type. This should be an okay thing to do, since most polymorphic properties should only use a small number of types.

Is there any case for deserialization where we might have a similar problem that needs to be fixed (such as dealing with polymorphic properties other than object)? We treat object properties specially while deserializing because we return a boxed JsonElement instead, so there will be some perf difference there, which is expected. So, I am asking about cases other than object itself.

@steveharter - can you think of other areas in the serializer code that could have similar issues where one-time operations end up running every time, potentially in edge cases?

return runtimeProperty;
}

var cache = LazyInitializer.EnsureInitialized(ref property.RuntimePropertyCache, () => new ConcurrentDictionary<Type, JsonPropertyInfo>());
Member

nit: Don't use var here.


return runtimeProperty;
return cache.GetOrAdd(runtimePropertyType, (type, arg) => CreateRuntimeProperty(type, arg), (property, options, Type));
Member

@ahsonkhan ahsonkhan Oct 14, 2019

@steveharter, unrelated to this PR, but here is some code cleanup we should consider (separately - out of scope for this PR which we should keep focused on the perf fix only):
Using Type as a property name makes code like this quite hard to parse/reason about. I was confused where Type is coming from and how we were passing in System.Type (i.e. a type) as a method parameter.

Member

@steveharter steveharter Oct 15, 2019

Is "ClassType" better? (somewhat redundant) Or just doc? The actual type can be a POCO type or any property type (string, List<string>, etc).

Member

We already have a ClassType property, so that wouldn't work.

public ClassType ClassType { get; private set; }

@@ -584,5 +585,7 @@ private void VerifyWrite(int originalDepth, Utf8JsonWriter writer)
ThrowHelper.ThrowJsonException_SerializationConverterWrite(ConverterBase);
}
}

public ConcurrentDictionary<Type, JsonPropertyInfo> RuntimePropertyCache;
Member

Should this be stored on the JsonSerializerOptions instead with the rest of the caches?

Also, can the following existing cache be used for the runtime properties as well?

private readonly ConcurrentDictionary<Type, JsonPropertyInfo> _objectJsonProperties = new ConcurrentDictionary<Type, JsonPropertyInfo>();

Member

On _objectJsonProperties I believe @layomia is working to get rid of that cache.

Member

Should this be stored on the JsonSerializerOptions instead with the rest of the caches?

Storing the cache on the actual JsonPropertyInfo, as the PR currently does, ties it to a particular class and a particular property, which may have specific attributes (e.g. [JsonIgnore] and [JsonConverter]) that we need to preserve.

If we store the cache on a JsonSerializerOptions instance, then we would need to key on class type + property name + property type.

If we store the cache on a JsonClassInfo instance then we would need to key on property name + property type.

The cache in its current location on a JsonPropertyInfo instance is the most specific, but that also means we are instantiating a new dictionary for every polymorphic property, which probably isn't ideal, especially in the future, when we will likely support polymorphic properties in cases other than System.Object.

So I think JsonClassInfo is the best location, alongside the property cache that already lives there: public volatile Dictionary<string, JsonPropertyInfo> PropertyCache. However, we should be able to just use the existing cache by changing the "string" key to be property name + property type (just for polymorphic properties).

Author

@steveharter

we should be able to just use the existing cache

I don't think that would be a good idea, since it would require allocating a string for every access to serve as the key. So you're likely paying lots of string allocations to avoid a single dictionary allocation.

Just moving the cache to JsonClassInfo (while using a (name, type) tuple as the key) sounds reasonable to me.

Member

@svick yes, allocating a string for the key is not ideal, but it is a small overhead compared to what we have today. A dictionary is heavyweight, so having one instance per Type is much better than one per property.

However, creating a new cache on JsonClassInfo keyed on a propertyName+propertyType tuple, as you suggest, would be OK. A JsonPropertyInfo+propertyType key might be a bit more performant, and it works because the JsonPropertyInfo coming in should be the same instance each time.
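
To make the trade-off in this thread concrete, here is a rough sketch of the two keying strategies. RuntimePropertyKeys and its string values are hypothetical placeholders (a real cache would store JsonPropertyInfo instances); only the shape of the keys matters:

```csharp
using System;
using System.Collections.Concurrent;

static class RuntimePropertyKeys
{
    // Tuple key: a value type, so building the key for a lookup needs no new string.
    private static readonly ConcurrentDictionary<(string, Type), string> s_byTuple =
        new ConcurrentDictionary<(string, Type), string>();

    // String key: every lookup first concatenates a new string, even on a cache hit.
    private static readonly ConcurrentDictionary<string, string> s_byString =
        new ConcurrentDictionary<string, string>();

    public static string GetByTuple(string propertyName, Type runtimeType) =>
        s_byTuple.GetOrAdd((propertyName, runtimeType), key => key.Item1 + ":" + key.Item2.Name);

    public static string GetByString(string propertyName, Type runtimeType) =>
        s_byString.GetOrAdd(propertyName + "_" + runtimeType.FullName, key => key);
}
```

Both variants keep two properties of the same runtime type apart, because the property name is part of the key. The difference is only in what each lookup has to allocate.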

@svick
Author

svick commented Oct 14, 2019

@ahsonkhan

Can you also measure and add (and share the results) for other polymorphic test cases outside of object (such as class B inherits from class A)?

As far as I can tell, such cases don't currently behave polymorphically, so I don't think it makes sense to measure them.

For example, consider the following code:

using System;
using System.Text.Json;

class A
{
    public string P1 => "Base";
}

class B : A
{
    public string P2 => "Derived";
}

class Program
{
    static void Main()
    {
        Console.WriteLine(JsonSerializer.Serialize(new { Prop = (A)new B() }));
        Console.WriteLine(JsonSerializer.Serialize(new { Prop = (object)new B() }));
    }
}

This prints:

{"Prop":{"P1":"Base"}}
{"Prop":{"P2":"Derived","P1":"Base"}}

Which shows that a property statically typed as A does not correctly print the contents of B. There's also an open issue about this: https://github.com/dotnet/corefx/issues/38650.

Or maybe I misunderstood what you meant?

Also can you measure and add the test case similar to the one that @mrlund originally mentioned on the issue. Something like: […]

What is the part that you're missing from the test I added? Non-trivial number of items in the collection? Named types? A POCO type in the collection? All of the above?

I don't think any of those are relevant here.


I'll look into responding to the rest of the comments later, hopefully tomorrow.

@ahsonkhan
Member

ahsonkhan commented Oct 14, 2019

As far as I can tell, such cases don't currently behave polymorphically, so I don't think it makes sense to measure them.

Ah yes. Thanks for reminding me that we don't use the runtime type for serializing polymorphic types (outside of object). Makes sense to defer adding perf tests for that until it's supported.

What is the part that you're missing from the test I added? Non-trivial number of items in the collection? Named types? A POCO type in the collection? All of the above?

I was looking for a test where a collection property is treated as object while serializing (i.e. anything that implements IEnumerable<T>). But we already have that covered by the existing perf tests (like IndexViewModel which has a List<ActiveOrUpcomingEvent>), so we are good there with the new perf test you added.

I don't think any of those are relevant here.

Sounds good.

@steveharter
Member

@steveharter - can you think of other areas in the serializer code that could have similar issues where one-time operations end up running every time, potentially in edge cases?

We only support polymorphic serialization today when the property type is System.Object, so currently this PR only helps with that. When we add additional support then this scenario will be more common.

Can you also measure and add (and share the results) for other polymorphic test cases outside of object (such as class B inherits from class A)? Presumably, those cases have similar regression

So the class B : A case doesn't have a perf issue today, and this PR won't help or change that.

{
string json = JsonSerializer.Serialize(new { Prop = (object)new[] { 0 } });
Assert.Equal(@"{""Prop"":[0]}", json);
}
}
}
Member

Please add tests with different properties with the same Type but with different attributes (e.g. [JsonIgnore]) to ensure the cache's key works as expected.
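
A sketch of what such a check could look like, not the test that was added in this PR. TwoObjectProps and AttributeCacheCheck are hypothetical names; both properties are typed as object and hold the same runtime type (int[]), but only one carries [JsonIgnore]:

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

public class TwoObjectProps
{
    public object Visible { get; set; } = new[] { 0 };

    [JsonIgnore]
    public object Hidden { get; set; } = new[] { 1 };
}

public static class AttributeCacheCheck
{
    // Serializes twice so the second call goes through any warmed-up caches.
    public static string SerializeTwice()
    {
        JsonSerializer.Serialize(new TwoObjectProps());
        return JsonSerializer.Serialize(new TwoObjectProps());
    }
}
```

If the cache were keyed on the runtime type alone, the ignored property could pick up metadata created for the visible one. The expected JSON here contains only Visible: {"Visible":[0]}.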

return runtimeProperty;
}

var cache = LazyInitializer.EnsureInitialized(ref property.RuntimePropertyCache, () => new ConcurrentDictionary<Type, JsonPropertyInfo>());
Member

Is this preferred over Lazy<T>?

Author

@svick svick Oct 15, 2019

I think it's preferred here, since Lazy<T> would require an allocation of an instance of Lazy<T> for every instance of JsonPropertyInfo (even if it didn't have any polymorphic properties).

But that's a moot point anyway if we're going to reuse JsonClassInfo.PropertyCache (more on that later).
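
For illustration, a small sketch of the allocation difference being discussed. PropertyHolder is a hypothetical type, and the string values stand in for cached metadata:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

sealed class PropertyHolder
{
    // Stays null until first use, so holders that never need the cache allocate nothing extra.
    // A Lazy<T> field would itself have to be allocated for every holder up front.
    private ConcurrentDictionary<Type, string> _cache;

    public bool IsCacheCreated => _cache != null;

    // Creates the dictionary on first access. If two threads race, one result wins
    // and the other is discarded.
    public ConcurrentDictionary<Type, string> Cache =>
        LazyInitializer.EnsureInitialized(ref _cache, () => new ConcurrentDictionary<Type, string>());
}
```

The first read of Cache creates the dictionary; later reads return the same instance.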

@steveharter
Member

This is a problem, because it means accessing the property will JIT compile the Get delegate every time, which takes a lot of time.

Currently we JIT the delegates, but there is a fallback that uses reflection (see "MemberAccessorStrategy") for cases that can't JIT. I wonder if reflection is more lightweight up front and would work without having to cache, although I suspect that caching (with a hashtable lookup) would still be faster, though probably not by the full 68x quoted earlier.
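
As a rough illustration of the two getter styles being compared, and not the serializer's actual MemberAccessorStrategy code, the sketch below reads a property through PropertyInfo.GetValue and through a delegate created once with Delegate.CreateDelegate. Location here is a hypothetical POCO and AccessorDemo a hypothetical helper:

```csharp
using System;
using System.Reflection;

sealed class Location
{
    public string City { get; set; } = "Prague";
}

static class AccessorDemo
{
    private static readonly PropertyInfo s_city = typeof(Location).GetProperty(nameof(Location.City));

    // Delegate path: created once up front, then invoked like a normal method.
    private static readonly Func<Location, string> s_getter =
        (Func<Location, string>)Delegate.CreateDelegate(typeof(Func<Location, string>), s_city.GetGetMethod());

    // Reflection path: nothing to build per property, but every read goes through PropertyInfo.GetValue.
    public static object GetViaReflection(Location value) => s_city.GetValue(value);

    public static string GetViaDelegate(Location value) => s_getter(value);
}
```

Either path returns the same value. The question raised above is only about where the cost lands: once when the delegate is built, or on every read.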

Member

@steveharter steveharter left a comment

Thanks for the PR.

See the comment on moving the cache from JsonPropertyInfo to JsonClassInfo (e.g. with a string key of something like "{PropertyName}_{PropertyType}") and the comment about adding additional tests to ensure the cache is keyed properly.

@steveharter
Member

Once this change has been reviewed/merged to master, we should evaluate porting this to 3.1.

@svick our internal cutoff for 3.1 Preview 2 is October 23rd, meaning this PR would likely need to get into master this week for us to have time to port it to 3.1.

@svick
Author

svick commented Oct 17, 2019

@steveharter I think I have addressed all the comments.

I will have limited computer access until Monday, so I probably won't be able to address any new comments until then. Though feel free to add commits to this PR if that becomes necessary.

Co-Authored-By: Ahson Khan <ahkha@microsoft.com>
@ahsonkhan
Member

ahsonkhan commented Oct 17, 2019

https://dev.azure.com/dnceng/public/_build/results?buildId=392129

System\Text\Json\Serialization\JsonClassInfo.AddProperty.cs(233,26): error CS1501: No overload for method 'GetOrAdd' takes 3 arguments

System\Text\Json\Serialization\JsonClassInfo.AddProperty.cs(233,26): error CS1501: No overload for method 'GetOrAdd' takes 3 arguments [D:\a\1\s\src\System.Text.Json\src\System.Text.Json.csproj]
##[error]System\Text\Json\Serialization\JsonClassInfo.AddProperty.cs(233,26): error CS1501: No overload for method 'GetOrAdd' takes 3 arguments

I don't know why this is failing. ConcurrentDictionary has the 3-arg overload:
https://apisof.net/catalog/System.Collections.Concurrent.ConcurrentDictionary%3CTKey,TValue%3E.GetOrAdd%3CTArg%3E(TKey,Func%3CTKey,TArg,TValue%3E,TArg)

Maybe the generic needs to be specified explicitly?

Edit: This API is available in .NET Standard 2.1 but not in .NET Standard 2.0.

Member

@ahsonkhan ahsonkhan left a comment

Otherwise, LGTM.


return runtimeProperty;
return cache.GetOrAdd((property, runtimePropertyType), (key, arg) => CreateRuntimeProperty(key, arg), (options, Type));
Member

@ahsonkhan ahsonkhan Oct 17, 2019

Specify the generic explicitly?

Suggested change
- return cache.GetOrAdd((property, runtimePropertyType), (key, arg) => CreateRuntimeProperty(key, arg), (options, Type));
+ return cache.GetOrAdd<(JsonSerializerOptions, Type)>((property, runtimePropertyType), (key, arg) => CreateRuntimeProperty(key, arg), (options, Type));

Edit: Nevermind

Member

This API isn't available on netstandard 2.0, which is why the build is failing.
Can you wrap the code in #if BUILDING_INBOX_LIBRARY and, in the #else block, use only netstandard2.0-compatible APIs?
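
A sketch of the conditional-compilation pattern suggested above, under the assumption that BUILDING_INBOX_LIBRARY is defined only for builds where the GetOrAdd<TArg> overload exists. GetOrAddDemo and Describe are hypothetical names, and the string values stand in for the cached property metadata:

```csharp
using System;
using System.Collections.Concurrent;

static class GetOrAddDemo
{
    private static readonly ConcurrentDictionary<Type, string> s_cache =
        new ConcurrentDictionary<Type, string>();

    public static string Describe(Type type, string prefix)
    {
#if BUILDING_INBOX_LIBRARY
        // TArg overload: the extra state is passed as an argument, so the lambda captures nothing.
        return s_cache.GetOrAdd(type, (key, arg) => arg + key.Name, prefix);
#else
        // netstandard2.0-compatible fallback: the lambda captures 'prefix',
        // so a closure is allocated on each call.
        return s_cache.GetOrAdd(type, key => prefix + key.Name);
#endif
    }
}
```

Both branches produce the same cached value. The fallback trades an extra allocation per call for compatibility with the older API surface.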

@svick
Author

svick commented Oct 18, 2019

I think I have fixed the error. The CI failures don't seem related to this PR.

@ahsonkhan
Member

Filed an issue for the unrelated test failures:
https://github.com/dotnet/corefx/issues/41905

@ahsonkhan
Member

@svick, can you please resolve the merge conflict so we can get this checked in? Thanks.

@svick
Author

svick commented Oct 21, 2019

@ahsonkhan The conflict should be resolved.

@ahsonkhan ahsonkhan merged commit 4629961 into dotnet:master Oct 21, 2019
@svick svick deleted the json-object-property branch October 21, 2019 23:04
steveharter added a commit to steveharter/dotnet_corefx that referenced this pull request Oct 22, 2019
* Cache polymorphic properties

* Move RuntimePropertyCache to JsonClassInfo

* Added test of RuntimePropertyCache using properties with different attributes

* Fixed typo

Co-Authored-By: Ahson Khan <ahkha@microsoft.com>

* Use allocating overload of GetOrAdd on .Net Standard 2.0
steveharter added a commit to steveharter/dotnet_corefx that referenced this pull request Oct 23, 2019
@bklooste

"This should be an okay thing to do, since most polymorphic properties should only use a small number of types"

What happens when there are many types? Some people will use interfaces of interfaces.

Maybe an LRU list could be used so we don't need to deal with this, or should the "don't use many types" limitation be documented?

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022

Commit migrated from dotnet/corefx@4629961