SqlClient rent stream parameter buffers#35927
Conversation
use cached lazy allocated default text stream encoding
| writer.Flush(); | ||
| } | ||
|
|
||
| Array.Clear(inBuff, 0, constTextBufferSize); |
There was a problem hiding this comment.
Any particular reason for the Clear? Generally we don't bother clearing before returning to the pool. (Although maybe we should test in a mode where we garbelize after return https://github.com/dotnet/coreclr/issues/9885)
There was a problem hiding this comment.
It doesn't provide any security benefit that I see.
There was a problem hiding this comment.
Any particular reason for the Clear? Generally we don't bother clearing before returning to the pool.
We need to get consistent on this in general.
@danmosemsft is correct that this has been our general stance, and we don't clear in methods like Stream.CopyToAsync. We do clear in code known to be sensitive, e.g. in the crypto stack. However, depending on who the dev is and what their philosophy is, there's a bunch of places recently where clears have been added in situations where they in theory fit into the former case, e.g. in all of the new json-related code (cc: @bartonjs, @ahsonkhan, @steveharter).
We should all get in agreement and come up with very clear rules, then audit all of our usage and make sure we're in compliance with those rules.
There was a problem hiding this comment.
Personally, I would prefer if the answer was "always use clearArray: true, except for in specific cases where you want to swear in blood that there's next-to-no benefit to seeing this data" (e.g. JsonDocument's database block, since the most it conveys is the length of a given value -- though I'd be just as happy clearing that one, too). Next happiest is "always clear what you wrote" (which is the part that I've been talked down to 😄).
Essentially, as a security person, I want anything that touches user data to clear it before putting it back into the pool; because one of the best things about .NETs memory model is it's very very difficult to get at "discarded" user data.
The litmus test is "assume some web application has a bug where they share the entire contents of a rented array instead of just the portion they meant to... would it be bad for anyone if this data was in that array?". Unless the answer is "I am certain the answer is no" then treat it as if the answer is yes.
There was a problem hiding this comment.
Assuming that the user might want to pass sensitive data into the sql server at some point i thought it best to err on the side of caution and clear the part of the buffer i've used which is why i clear what i know i've rented and not the entire buffer.
Things like string formatting and other internal uses i wouldn't feel the need to clear the array but for this purpose where i don't know what the data contents means and i can't be certain about the lifetime of the array because the purpose of the rental is to be longer lived it seemed appropriate.
There was a problem hiding this comment.
Perhaps ArrayPool ought to have an overload which takes the size to clear (as a hint)
There was a problem hiding this comment.
When the buffer isn't actually going to be accepted back into the pool.
If the buffer has previously been in the pool it may be in a higher generation than 0 and not be cleaned up for an unknown amount of time. If you're using SqlClient there's a pretty good chance you'll get all generations of GC fairly quickly but i'm working on improving that 😁.
As far as i know the runtime only guarentees that newly allocated memory is zeroed, that doesn't require that collected memory is zeroed so information could hang around in memory for some time. In any case where the buffer contains user information it has the possibility to be sensitive and i think it should be cleared, even with non pooled buffers.
In cpu performance terms it's a non-issue in SqlClient because it spends 80-90% of it's time waiting on network io.
@danmosemsft it's possible though i think that option came up in the ArrayPool design and was decided against.
There was a problem hiding this comment.
@danmosemsft @saucecontrol @bartonjs @stephentoub What's the consensus on this comment? Should we keep it as is or rely on ArrayPool.Shared.Return(inBuff, clearArray: true); ?
There was a problem hiding this comment.
It seems to me that having ArrayPool clear it ought to be perfectly fine and simpler
|
@Wraith2 This PR LGTM. All tests passed as well. Once we have an agreement on the conversation for clearing the array I'll approve and we should be good to merge. |
SqlClient rent stream parameter buffers Commit migrated from dotnet/corefx@397b8cf
SqlClient allows you to stream data to the database by using a Stream, StreamReader or XmlReader as the parameter type for binary, text or xml column types. This PR changes the allocation of buffers used in the write functions used so that rental buffers are used instead of transient ones which clog the GC. IT also changes the default text encoding to be lazy initialised and cached once used.
The changes are minor. All functional and manual tests pass.
Benchmarks for the text case are better, for binary make little difference but will be better with multipacket writes, xml is a wash because the amount of garbage produced by the xml reader and writers is so large you'll never see an improvement there. All test functions insert ~7k of test data repeatedly.
Before:
After:
/cc @afsanehr, @tarikulsabbir, @Gary-Zh , @David-Engel