Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

SqlClient rent stream parameter buffers#35927

Merged
AfsanehR-zz merged 5 commits into
dotnet:masterfrom
Wraith2:sqlperf-streamwrite
Mar 14, 2019
Merged

SqlClient rent stream parameter buffers#35927
AfsanehR-zz merged 5 commits into
dotnet:masterfrom
Wraith2:sqlperf-streamwrite

Conversation

@Wraith2
Copy link
Copy Markdown
Contributor

@Wraith2 Wraith2 commented Mar 10, 2019

SqlClient allows you to stream data to the database by using a Stream, StreamReader or XmlReader as the parameter type for binary, text or xml column types. This PR changes the allocation of buffers used in the write functions used so that rental buffers are used instead of transient ones which clog the GC. IT also changes the default text encoding to be lazy initialised and cached once used.

The changes are minor. All functional and manual tests pass.

Benchmarks for the text case are better, for binary make little difference but will be better with multipacket writes, xml is a wash because the amount of garbage produced by the xml reader and writers is so large you'll never see an improvement there. All test functions insert ~7k of test data repeatedly.

Before:

Method Mean Error StdDev Gen 0/1k Op Gen 1/1k Op Gen 2/1k Op Allocated Memory/Op
SyncWriteTestStream 406.85 ms 8.0222 ms 13.8380 ms - - - 1309.38 KB
SyncWriteBinaryStream 23.08 ms 0.4456 ms 0.5949 ms - - - 478.91 KB
SyncWriteXmlStream 48.82 ms 0.9382 ms 1.1865 ms 1000.0000 - - 4199.22 KB

After:

Method Mean Error StdDev Gen 0/1k Op Gen 1/1k Op Gen 2/1k Op Allocated Memory/Op
SyncWriteTestStream 394.18 ms 2.7001 ms 2.1081 ms - - - 497.66 KB
SyncWriteBinaryStream 22.88 ms 0.2245 ms 0.2100 ms - - - 478.91 KB
SyncWriteXmlStream 48.12 ms 0.2843 ms 0.2374 ms 1000.0000 - - 4199.22 KB

/cc @afsanehr, @tarikulsabbir, @Gary-Zh , @David-Engel

Comment thread src/System.Data.SqlClient/src/System/Data/SqlClient/SqlParameter.cs Outdated
writer.Flush();
}

Array.Clear(inBuff, 0, constTextBufferSize);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason for the Clear? Generally we don't bother clearing before returning to the pool. (Although maybe we should test in a mode where we garbelize after return https://github.com/dotnet/coreclr/issues/9885)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't provide any security benefit that I see.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason for the Clear? Generally we don't bother clearing before returning to the pool.

We need to get consistent on this in general.

@danmosemsft is correct that this has been our general stance, and we don't clear in methods like Stream.CopyToAsync. We do clear in code known to be sensitive, e.g. in the crypto stack. However, depending on who the dev is and what their philosophy is, there's a bunch of places recently where clears have been added in situations where they in theory fit into the former case, e.g. in all of the new json-related code (cc: @bartonjs, @ahsonkhan, @steveharter).

We should all get in agreement and come up with very clear rules, then audit all of our usage and make sure we're in compliance with those rules.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I would prefer if the answer was "always use clearArray: true, except for in specific cases where you want to swear in blood that there's next-to-no benefit to seeing this data" (e.g. JsonDocument's database block, since the most it conveys is the length of a given value -- though I'd be just as happy clearing that one, too). Next happiest is "always clear what you wrote" (which is the part that I've been talked down to 😄).

Essentially, as a security person, I want anything that touches user data to clear it before putting it back into the pool; because one of the best things about .NETs memory model is it's very very difficult to get at "discarded" user data.

The litmus test is "assume some web application has a bug where they share the entire contents of a rented array instead of just the portion they meant to... would it be bad for anyone if this data was in that array?". Unless the answer is "I am certain the answer is no" then treat it as if the answer is yes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming that the user might want to pass sensitive data into the sql server at some point i thought it best to err on the side of caution and clear the part of the buffer i've used which is why i clear what i know i've rented and not the entire buffer.

Things like string formatting and other internal uses i wouldn't feel the need to clear the array but for this purpose where i don't know what the data contents means and i can't be certain about the lifetime of the array because the purpose of the rental is to be longer lived it seemed appropriate.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps ArrayPool ought to have an overload which takes the size to clear (as a hint)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the buffer isn't actually going to be accepted back into the pool.

If the buffer has previously been in the pool it may be in a higher generation than 0 and not be cleaned up for an unknown amount of time. If you're using SqlClient there's a pretty good chance you'll get all generations of GC fairly quickly but i'm working on improving that 😁.

As far as i know the runtime only guarentees that newly allocated memory is zeroed, that doesn't require that collected memory is zeroed so information could hang around in memory for some time. In any case where the buffer contains user information it has the possibility to be sensitive and i think it should be cleared, even with non pooled buffers.

In cpu performance terms it's a non-issue in SqlClient because it spends 80-90% of it's time waiting on network io.

@danmosemsft it's possible though i think that option came up in the ArrayPool design and was decided against.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danmosemsft @saucecontrol @bartonjs @stephentoub What's the consensus on this comment? Should we keep it as is or rely on ArrayPool.Shared.Return(inBuff, clearArray: true); ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that having ArrayPool clear it ought to be perfectly fine and simpler

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, done.

Comment thread src/System.Data.SqlClient/src/System/Data/SqlClient/SqlParameter.cs Outdated
@AfsanehR-zz
Copy link
Copy Markdown
Contributor

@Wraith2 This PR LGTM. All tests passed as well. Once we have an agreement on the conversation for clearing the array I'll approve and we should be good to merge.

@AfsanehR-zz AfsanehR-zz self-assigned this Mar 13, 2019
@AfsanehR-zz AfsanehR-zz added this to the 3.0 milestone Mar 14, 2019
@AfsanehR-zz AfsanehR-zz merged commit 397b8cf into dotnet:master Mar 14, 2019
@Wraith2 Wraith2 deleted the sqlperf-streamwrite branch March 15, 2019 18:26
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
SqlClient rent stream parameter buffers

Commit migrated from dotnet/corefx@397b8cf
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants