Skip to content

Conversation

@jkotas
Copy link
Member

@jkotas jkotas commented Aug 24, 2022

This change was attempted before in dotnet/coreclr#23029 and rejected due to performance impact. Things have changed since then that makes it feasible now.

Sockets and file I/O do not use pinning feature of Overlapped anymore. They pin memory on their own using {ReadOnly}Memory<T>.Pin instead. It means that the async pinned handles are typically not pinning anything. The async pinned handles come with some extra overhead in this common use case. Also, they cause confusion during GC behavior drill downs. This change removes the support for async pinned handles from the GC:

  • It makes the current most common Overlapped use cheaper. It is hard to measure the impact of eliminating async pinned handles exactly since they are just a small part of the total GC costs. The unified fully managed implementation enabled simplificication of the implementation and reduced allocations.
  • It gets rid of confusing async pinned handles behavior. The change was actually motivated by a recent discussion with a customer who was surprised by the async pinned handles not pinning anything. They were not sure whether it is expected behavior or whether it is a bug in the diagnostic tools.

Micro-benchmarks for pinning feature of Overlapped are going to regress with this change. The regression in a micro-benchmark that runs Overlapped.Pack/Unpack in a tight loop is about 20% for each pinned object. If there is 3rd party code still using the pinning feature of Overlapped, Overlapped.Pack/Unpack is expected to be a tiny part of the end-to-end async flow and the regression for end-to-end scenarios is expected to be in noise range.

This change was attempted before in dotnet/coreclr#23029 and rejected due to performance impact. Things have changed since then that makes it feasible now.

Sockets and file I/O do not use pinning feature of Overlapped anymore. They pin memory on their own using `{ReadOnly}Memory<T>.Pin` instead. It means that the async pinned handles are typically not pinning anything. The async pinned handles come with some extra overhead in this common use case. Also, they cause confusion during GC behavior drill downs. This change removes the support for async pinned handles from the GC:
- It makes the current most common Overlapped use cheaper. It is hard to measure the impact of eliminating async pinned handles exactly since they are just a small part of the total GC costs. The unified fully managed implementation enabled simplificication of the implementation and reduced allocations.
- It gets rid of confusing async pinned handles behavior. The change was actually motivated by a recent discussion with a customer who was surprised by the async pinned handles not pinning anything. They were not sure whether it is expected behavior or whether it is a bug in the diagnostic tools.

Micro-benchmarks for pinning feature of Overlapped are going to regress with this change. The regression in a micro-benchmark that runs Overlapped.Pack/Unpack in a tight loop is about 20% for each pinned object. If there is 3rd party code still using the pinning feature of Overlapped, Overlapped.Pack/Unpack is expected to be a tiny part of the end-to-end async flow and the regression for end-to-end scenarios is expected to be in noise range.
@ghost
Copy link

ghost commented Aug 24, 2022

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

Issue Details

This change was attempted before in dotnet/coreclr#23029 and rejected due to performance impact. Things have changed since then that makes it feasible now.

Sockets and file I/O do not use pinning feature of Overlapped anymore. They pin memory on their own using {ReadOnly}Memory<T>.Pin instead. It means that the async pinned handles are typically not pinning anything. The async pinned handles come with some extra overhead in this common use case. Also, they cause confusion during GC behavior drill downs. This change removes the support for async pinned handles from the GC:

  • It makes the current most common Overlapped use cheaper. It is hard to measure the impact of eliminating async pinned handles exactly since they are just a small part of the total GC costs. The unified fully managed implementation enabled simplificication of the implementation and reduced allocations.
  • It gets rid of confusing async pinned handles behavior. The change was actually motivated by a recent discussion with a customer who was surprised by the async pinned handles not pinning anything. They were not sure whether it is expected behavior or whether it is a bug in the diagnostic tools.

Micro-benchmarks for pinning feature of Overlapped are going to regress with this change. The regression in a micro-benchmark that runs Overlapped.Pack/Unpack in a tight loop is about 20% for each pinned object. If there is 3rd party code still using the pinning feature of Overlapped, Overlapped.Pack/Unpack is expected to be a tiny part of the end-to-end async flow and the regression for end-to-end scenarios is expected to be in noise range.

Author: jkotas
Assignees: -
Labels:

area-System.Threading

Milestone: -

@jkotas jkotas requested review from Maoni0 and brianrob August 24, 2022 22:39
Copy link
Member

@Maoni0 Maoni0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the changes in gc and vm dirs and they LGTM!

object[] objArray = (object[])userData;
for (int i = 0; i < objArray.Length; i++)
{
GCHandleRef(pNativeOverlapped, (nuint)(i + 1)) = GCHandle.Alloc(objArray[i], GCHandleType.Pinned);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll be interested to see how the perf of this compares to async pinned handles. For sure a win for reduction in complexity and reduction in allocations. No immediate concerns, just a thought as I read through this.

Copy link
Member

@davidfowl davidfowl Aug 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will likely show up in our HTTP.sys benchmarks as they do use this support. That'll be inspiration to finally optimize that code 😄

cc @Tratcher @sebastienros @adityamandaleeka as an FYI

Copy link
Member Author

@jkotas jkotas Aug 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This microbenchmark has about the same performance before and after the change. The cost is dominated by an unmanaged memory block alloc/free and GC handle alloc/free that is done both before and after the change, and so the reduced managed allocations have not really moved the needle.

Overlapped ov = new Overlapped();
NativeOverlapped* nativeOverlapped = ov.Pack(null, null);
Overlapped.Unpack(nativeOverlapped);
Overlapped.Free(nativeOverlapped);

This microbenchmark regressed about 20% when usedData is one array, 40% when userData is two arrays, and so on:

Overlapped ov = new Overlapped();
NativeOverlapped* nativeOverlapped = ov.Pack(null, userData);
Overlapped.Unpack(nativeOverlapped);
Overlapped.Free(nativeOverlapped);

This will likely show up in our HTTP.sys benchmarks as they do use this support.

I would be curious if you can observe the impact in end-to-end benchmarks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. We’ll see when the change propagates. I’ll keep an eye out

Copy link
Member

@brianrob brianrob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@jkotas jkotas merged commit 4cf2e3b into dotnet:main Aug 25, 2022
@jkotas jkotas deleted the overlapped branch August 25, 2022 13:10
@ghost ghost locked as resolved and limited conversation to collaborators Sep 24, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants