-
Notifications
You must be signed in to change notification settings - Fork 4.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ArrayPool<T>.Shared.Return should clear buffer when T is reference or contains references #25378
Comments
Return takes a clearArray argument. If it's set to false, since that's the default value (for better or worse), it can't tell the difference between whether that was an explicit choice by the developer ("I don't want to pay for clearing, I know what I'm doing") or whether it was done unknowingly ("Look, I can call Return with just the buffer, that's easy"). Given the design is to leave it up to the caller to decide, it seems strange to me then that we'd override the caller's choice, making the argument effectively meaningless. |
I'm only advocating auto clearing on return for when T is a reference type. Otherwise the references in the array can stick around forever, causing the referenced objects eventually to get promoted to Gen 2, when the buffer was ostensibly for scratch work. |
ArrayPool tends to give you much larger buffers than what you actually need and use. Return does not know how much of the buffer was actually used. It would be pretty unfortunate for Return to clear say 1000 elements when you have actually only used 10 elements. If we want to fix this, we should rather do that by treating array pool caches in a special way by the GC. It is something we will likely need to do anyway to stop array pool from leaking/hoarding too much memory. |
I've updated the repro code to be a bit clearer since this is different from the other "Rent should zero-init" issues I've been opening. @jkotas's suggestion of special-casing this in the GC would also work. We could also use weak references under the covers to mimic this behavior until the GC changes come online. Clearing the buffer is just one alternative. |
The way the API is designed, it's up to the caller of Return. If they opt not for the array to be cleared, who are we to ignore them and do it anyway? They could actually want the behavior, e.g. using the pool not only as a cache of arrays but also as a cache of arrays containing cached objects, and we'd be causing them to incur more cost by clearing automatically. |
That is the fundamental problem I have with this argument. The parameter is optional and defaults to false, so in the majority case the caller did not opt in to or out of anything. I would believe that most callers would be surprised by this "the objects live forever" behavior. |
That's an arguable flaw in the API design, but it's what the API is. |
You could always change the default. It would not affect existing compiled code, and anybody who is passing an expicit parameter anyway would retain the same behavior even after recompilation. |
I stumbled upon this issue a few weeks ago during some perf related work in Regex. I agree with what Levi says that the default is dangerous but also agree with what Jan says that Rent could return larger arrays than what you actually requested and clearing those could cause severe bottlenecks. We could change the current design from: namespace System.Buffers
{
public abstract class ArrayPool<T>
{
// Only clears the array when clearArray: true is explitiely passed in.
public abstract void Return(T[] array, bool clearArray = false);
}
} to namespace System.Buffers
{
public abstract class ArrayPool<T>
{
// Only clears the array if T is a reference type.
public abstract void Return(T[] array);
// Clears the array if clearArray is true for both value and reference types.
public abstract void Return(T[] array, bool clearArray);
}
} and adapt the the existing uses in the BCL. |
@GrabYourPitchforks can you please drive htis to consensus one way or another so we can get it closed by 30th? |
Flagging this as post-zbb. We'll prioritize this once the Memory/OwnedMemory lifetime management discussions conclude. |
Please note that these APIs shipped long ago in .NET Core 1.0. We shouldn't be making breaking changes to the default overload behavior. Let's stick to docs / guidance or analyzers for existing shipped APIs. |
Closing stale issues that aren't actionable. |
The current implementation of
ArrayPool<T>.Shared
is that there are per-thread buckets of arrays waiting to be re-rented. The default implementation ofReturn
is that the arrays aren't cleared before being thrown into the TLS buckets. This leads to situations where if an application uses scratch buffers of reference types, the elements in those scratch buffers can stick around forever once they've been returned to the pool.Example:
This prints "Is alive = True / Generation = 2" to the console. The expected behavior is that the object would be eligible for collection once the containing array has been returned to the pool.
The API could still retain the behavior where the buffer doesn't get cleared if
Retain
is just going to throw away the array without putting it back in the pool.The text was updated successfully, but these errors were encountered: