Fix memory regression in DuplicatePropertyNameChecker #2834
  using System.Collections.Generic;
- #if NETSTANDARD2_0_OR_GREATER
+ #if NETSTANDARD2_0_OR_GREATER || NETCOREAPP3_1_OR_GREATER
Why doesn't NETSTANDARD2_0_OR_GREATER also include NETCOREAPP_3_1+?
In the past, #if NETSTANDARD2_0_OR_GREATER used to apply to .NET Core 3.1 and later. But when we added netcoreapp3.1 as an explicit target framework in our csproj file, NETSTANDARD2_0_OR_GREATER stopped applying to it. I don't actually know why. Since then, we have had to use NETCOREAPP3_1_OR_GREATER explicitly.
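A likely explanation, as far as I understand the .NET SDK: preprocessor symbols are defined per target framework being compiled. The NETSTANDARD* symbols are defined only when building the netstandard target, and the NETCOREAPP* symbols only when building a netcoreapp/net target. Before netcoreapp3.1 was an explicit target, .NET Core 3.1 consumers presumably picked up the netstandard2.0 build, where NETSTANDARD2_0_OR_GREATER was defined. The sketch below shows the combined guard; PoolingSketch and Mode are hypothetical names, not part of the library.

```csharp
// Minimal sketch of a conditional-compilation guard covering both targets.
// PoolingSketch and Mode are illustrative names only.
public static class PoolingSketch
{
#if NETSTANDARD2_0_OR_GREATER || NETCOREAPP3_1_OR_GREATER
    // Compiled when either symbol is defined for the current target framework.
    public const string Mode = "pooled";
#else
    // Compiled for targets where neither symbol is defined.
    public const string Mode = "unpooled";
#endif
}
```

Which branch is compiled depends entirely on the target framework of the build, so the value of Mode can differ between the assemblies produced from the same source file.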
{
    // since this method is called frequently, we implement SingleOrDefault manually
    // to avoid allocating predicate closures.
    using (IEnumerator<ODataProperty> e = resource.NonComputedProperties.GetEnumerator())
Why not do the check at the time of adding to NonComputedProperties? And/or use a data structure that can do this validation at point of adding?
I think validating at the time of adding to NonComputedProperties, or using a better-suited data structure, would indeed be better. But I didn't want to risk breaking existing behaviour, since I hadn't done a thorough analysis to ensure that changing the behaviour here doesn't break any assumptions made by its callers.
One challenge with validating the uniqueness of properties at the time that they are added to NonComputedProperties is that we don't know when the properties are added to the collection. NonComputedProperties refers to the ODataResource.Properties property, which is an IEnumerable<ODataProperty>.
We can detect when the properties collection is set in the setter, and do the validation there, but we can't tell if new entries are added to the collection after the property has been set. For example, if we only validate the properties collection in the setter, we would not be able to catch the following violation:
var properties = new List<ODataProperty>
{
    new ODataProperty { Name = "Foo", Value = "Bar" },
    new ODataProperty { Name = "A", Value = "B" }
};
odataResource.Properties = properties; // duplicate name validation would occur here
properties.Add(new ODataProperty { Name = "Foo", Value = "Baz" }); // this duplicate property name violation would not be caught
It's worth noting that we already perform some other verification in the setter, and one can argue that this verification also fails to catch violations that happen after the property has been set.
My only concern is that taking this route for the uniqueness check effectively changes existing behaviour, where a property is guaranteed to be unique at the time we query its value, even if the collection has changed since ODataResource.Properties was set. This is behaviour that I have not yet verified is safe to break.
That said, my opinion is that we should probably leave it to the user to ensure they gave us unique properties and if they violate that we treat that as unsafe/undefined behaviour with unpredictable results.
It's also worth noting that the duplicate name checker will still verify property uniqueness independently of the check we do here, unless validation is disabled (in which case we probably shouldn't be doing that verification here either anyway). So, this check is sort of redundant.
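The limitation of setter-only validation described above can be sketched as follows. This is a simplified illustration, not the library's code: ResourceSketch and PropertySketch are hypothetical stand-ins for ODataResource and ODataProperty, and the HashSet-based check is an assumption about how such a validation could be written.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for ODataProperty.
public sealed class PropertySketch
{
    public string Name { get; set; }
    public object Value { get; set; }
}

// Hypothetical stand-in for ODataResource that validates names in the setter only.
public sealed class ResourceSketch
{
    private IEnumerable<PropertySketch> properties;

    public IEnumerable<PropertySketch> Properties
    {
        get { return properties; }
        set
        {
            if (value != null)
            {
                // One pass over the collection at assignment time.
                var seen = new HashSet<string>(StringComparer.Ordinal);
                foreach (PropertySketch p in value)
                {
                    if (!seen.Add(p.Name))
                    {
                        throw new InvalidOperationException("Duplicate property name: " + p.Name);
                    }
                }
            }

            // The caller keeps a reference to the same collection, so items
            // added to it afterwards are never seen by the check above.
            properties = value;
        }
    }
}
```

Assigning a list that already contains two "Foo" entries throws, but adding the second "Foo" to the list after the assignment goes unnoticed, which is the gap the comment describes.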
I've removed the duplicate check and no unit test has failed. We also have other cases where we (questionably) use SingleOrDefault in our libraries. There's a chance the original author did not add it with the intention of ensuring a strong uniqueness guarantee, but more as a sanity check.
I have revised the code and removed the duplicate check. The implementation now has FirstOrDefault() semantics instead of SingleOrDefault().
I removed the duplicate-check from this method because:
- it doesn't break existing tests
- the duplicate check adds extra cost on a hot path
I opted not to move the duplicate-check to the ODataResource.Properties setter because:
- we already have duplicate property name checking logic in the writer via DuplicatePropertyNameChecker
- in WebApi and OData Client, we can guarantee in the serializer that the property names are unique. That's also the best place to guarantee uniqueness, since they have access to the original property collection
- for customers using ODL directly who have disabled writer validation, we can assume they know what they're doing and already ensure that properties are unique
However, if the last assumption doesn't hold, this could be a breaking change (though I think chances are extremely low). So, I'm open to restoring the original behaviour or moving the duplicate-check to the ODataResource.Properties setter if you or someone else believes it necessary to do so.
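The difference between the two lookup semantics discussed above can be sketched as follows. This is an illustration under assumptions, not the PR's actual code: NamedProperty and PropertyLookupSketch are hypothetical names, and the loops only mirror the "raw while loop and MoveNext()" approach mentioned in the thread. Neither method takes a lambda, so no closure or delegate is allocated for the name comparison.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for ODataProperty.
public sealed class NamedProperty
{
    public string Name { get; set; }
    public object Value { get; set; }
}

public static class PropertyLookupSketch
{
    // FirstOrDefault semantics: stop at the first match, ignore later duplicates.
    public static NamedProperty FindFirst(IEnumerable<NamedProperty> properties, string name)
    {
        using (IEnumerator<NamedProperty> e = properties.GetEnumerator())
        {
            while (e.MoveNext())
            {
                if (string.Equals(e.Current.Name, name, StringComparison.Ordinal))
                {
                    return e.Current;
                }
            }
        }

        return null;
    }

    // SingleOrDefault semantics: always scans the whole sequence and throws
    // if a second match is found.
    public static NamedProperty FindSingle(IEnumerable<NamedProperty> properties, string name)
    {
        NamedProperty match = null;
        using (IEnumerator<NamedProperty> e = properties.GetEnumerator())
        {
            while (e.MoveNext())
            {
                if (string.Equals(e.Current.Name, name, StringComparison.Ordinal))
                {
                    if (match != null)
                    {
                        throw new InvalidOperationException("More than one property named " + name);
                    }

                    match = e.Current;
                }
            }
        }

        return match;
    }
}
```

FindFirst can return early, while FindSingle has to visit every element to rule out a second match, which is the extra hot-path cost the thread refers to.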
What do alloc results look like after removing the duplicate check?
I had not collected alloc results after removing the duplicate check (I don't expect the result to be much different), but since I'm currently working on an area that calls this method, I've just taken CPU samples before and after applying this change:
You can see that TryGetPrimitiveOrEnumPropertyValue went down from 0.96% to 0.52% CPU. In the before graph, the lion's share came from TryGetSingle (which is called by SingleOrDefault()).
Before
After
Issues
This pull request fixes #2813
Description
This PR fixes a couple of "low-hanging" excessive memory allocations.
DuplicatePropertyNameChecker
The #if flags we used to conditionally compile object-pooling for the DuplicatePropertyNameChecker only applied to .NET Standard 2.0. I've extended them to also cover .NET Core 3.1. The regression was probably introduced when we added netcoreapp3.1 as an explicit framework target.
I also made NullDuplicatePropertyChecker a "global" singleton instead of creating one per request/WriterValidator. The impact here is low, but it doesn't hurt. In any case, I'd argue it makes the code cleaner.
LINQ Func<T,bool> predicate
Found a case of Func<T,bool> lurking on a hot path from calling resource.NonComputedProperties.SingleOrDefault(r => r.Name == propertyName). I reduced the allocation by implementing SingleOrDefault manually (raw while loop and enumerator.MoveNext()).
While this reduced the allocations, I question our use of SingleOrDefault. Personally, I think we should reduce or avoid the use of SingleOrDefault in our library. In cases where we need to validate uniqueness, I think it would be much better to pre-validate the collection before use, or find ways to validate uniqueness only once, or in some cases just consider duplicates undefined behaviour and leave it to the customer to deal with. In this case, we potentially do a full scan of the property list each time we need to fetch the value of a single property. And yet we'll still validate property name uniqueness with the duplicate property name checker inside the writer.
Update
Following this comment thread: #2834 (comment), I made the following change:
I removed the duplicate property name check from the ODataResourceMetadataContext because:
- it doesn't break existing tests
- the duplicate check adds extra cost on a hot path
I opted not to move the duplicate-check to the ODataResource.Properties setter because:
- we already have duplicate property name checking logic in the writer via DuplicatePropertyNameChecker
- in WebApi and OData Client, we can guarantee in the serializer that the property names are unique. That's also the best place to guarantee uniqueness, since they have access to the original property collection
- for customers using ODL directly who have disabled writer validation, we can assume they know what they're doing and already ensure that properties are unique
However, if the last assumption doesn't hold, this could be a breaking change (though I think chances are extremely low). So, I'm open to restoring the original behaviour or moving the duplicate-check to the ODataResource.Properties setter if you or someone else believes it necessary to do so.
Checklist (Uncheck if it is not completed)
Results
Before
After