Switch to unified fully managed Overlapped implementation #74532
Conversation
This change was attempted before in dotnet/coreclr#23029 and was rejected due to performance impact. Things have changed since then that make it feasible now. Sockets and file I/O no longer use the pinning feature of Overlapped; they pin memory on their own using `{ReadOnly}Memory<T>.Pin` instead. This means that the async pinned handles are typically not pinning anything, yet they come with some extra overhead in this common use case. They also cause confusion during GC behavior drill-downs. This change removes the support for async pinned handles from the GC:

- It makes the current most common Overlapped use cheaper. It is hard to measure the impact of eliminating async pinned handles exactly, since they are just a small part of the total GC costs. The unified fully managed implementation enabled simplification of the implementation and reduced allocations.
- It gets rid of the confusing async pinned handle behavior. The change was actually motivated by a recent discussion with a customer who was surprised that the async pinned handles were not pinning anything; they were not sure whether that was expected behavior or a bug in the diagnostic tools.

Micro-benchmarks for the pinning feature of Overlapped will regress with this change. The regression in a micro-benchmark that runs Overlapped.Pack/Unpack in a tight loop is about 20% per pinned object. If third-party code is still using the pinning feature of Overlapped, Overlapped.Pack/Unpack is expected to be a tiny part of the end-to-end async flow, and the regression for end-to-end scenarios is expected to be within the noise range.
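As a rough illustration of the pattern the description refers to (a hedged sketch, not code from this PR or from the Sockets/file I/O implementation), pinning a buffer via `Memory<T>.Pin` looks like this; the names here are illustrative, and the code requires compiling with unsafe support:

```csharp
using System;
using System.Buffers;

class PinSketch
{
    static unsafe void Main()
    {
        // Sockets and file I/O now pin their buffers directly like this,
        // rather than relying on Overlapped's pinning support.
        Memory<byte> buffer = new byte[256];
        using (MemoryHandle handle = buffer.Pin())
        {
            // handle.Pointer is a stable address for the duration of the pin;
            // it can be handed to native I/O APIs.
            byte* p = (byte*)handle.Pointer;
            p[0] = 42;
        }
        // Disposing the MemoryHandle unpins the buffer.
    }
}
```

Because the buffer is pinned through its own `MemoryHandle`, the Overlapped machinery no longer needs to pin anything, which is why the async pinned handles in the common case hold no pinned objects.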
Tagging subscribers to this area: @mangod9
Maoni0
left a comment
I looked at the changes in gc and vm dirs and they LGTM!
object[] objArray = (object[])userData;
for (int i = 0; i < objArray.Length; i++)
{
    GCHandleRef(pNativeOverlapped, (nuint)(i + 1)) = GCHandle.Alloc(objArray[i], GCHandleType.Pinned);
}
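For context on what the quoted loop does, here is a self-contained, hedged illustration (not the PR's actual surrounding code) of the `GCHandle` pinning semantics involved: each userData object gets a pinned handle, and every such handle must later be freed to unpin the object:

```csharp
using System;
using System.Runtime.InteropServices;

class GCHandleSketch
{
    static void Main()
    {
        object[] userData = { new byte[16], new byte[32] };
        var handles = new GCHandle[userData.Length];
        for (int i = 0; i < userData.Length; i++)
        {
            // GCHandleType.Pinned keeps the object at a fixed address
            // so native code can safely hold a pointer into it.
            handles[i] = GCHandle.Alloc(userData[i], GCHandleType.Pinned);
            IntPtr addr = handles[i].AddrOfPinnedObject();
        }

        // The matching cleanup path must free every handle,
        // otherwise the objects remain pinned indefinitely.
        for (int i = 0; i < handles.Length; i++)
            handles[i].Free();
    }
}
```

This alloc/free pair per pinned object is also where the roughly 20%-per-object micro-benchmark regression mentioned in the description comes from.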
I'll be interested to see how the perf of this compares to async pinned handles. For sure a win for reduction in complexity and reduction in allocations. No immediate concerns, just a thought as I read through this.
This will likely show up in our HTTP.sys benchmarks as they do use this support. That'll be inspiration to finally optimize that code 😄
cc @Tratcher @sebastienros @adityamandaleeka as an FYI
This microbenchmark has about the same performance before and after the change. The cost is dominated by an unmanaged memory block alloc/free and GC handle alloc/free that is done both before and after the change, and so the reduced managed allocations have not really moved the needle.
Overlapped ov = new Overlapped();
NativeOverlapped* nativeOverlapped = ov.Pack(null, null);
Overlapped.Unpack(nativeOverlapped);
Overlapped.Free(nativeOverlapped);

This microbenchmark regressed about 20% when userData is one array, 40% when userData is two arrays, and so on:
Overlapped ov = new Overlapped();
NativeOverlapped* nativeOverlapped = ov.Pack(null, userData);
Overlapped.Unpack(nativeOverlapped);
Overlapped.Free(nativeOverlapped);

This will likely show up in our HTTP.sys benchmarks as they do use this support.
I would be curious if you can observe the impact in end-to-end benchmarks.
Agree. We’ll see when the change propagates. I’ll keep an eye out.
brianrob
left a comment