Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

SqlClient remove primitive boxing in SqlDataReader.GetFieldValue #34999

Merged
merged 12 commits into from
Mar 19, 2019

Conversation

Wraith2
Copy link
Contributor

@Wraith2 Wraith2 commented Jan 31, 2019

When reading primitive types through a data reader using the GetGuid and GetFieldValue<Guid> methods the guid data was read from the Tds stream into a guid which was then placed in an SqlGuid class which is allocated on the heap. This made reading 16 bytes of data quite expensive and caused a lot of GC noise.

This PR changes guid behaviour by adding guid to the SqlBuffer.Storage union structure allowing the guid to be stored in space already allocated by the SqlBuffer incurring no extra overhead or box. All uses of the StorageType enum have been audited to ensure that Guid and SqlGuid are both handled correctly.

The existing TdsParser code uses a new byte array each time a guid is read from the input stream, I have special cased this behaviour with a stack allocated span in netcore and a rented buffer in other builds.

In SqlDataReader fetching any value through GetFieldValue will assign the value to an object variable causing value types to box. I have used knowledge of two jit behaviours to allow eliminate this boxng behaviour in the case of some common value types.

  • RyuJIT will identify generic type T as a constant at jit time and eliminate branches which cannot be reached allowing the creation of type specific implementations.
  • RyuJIT will also recognise the sequence (T)(object)variable where variable is of type T and remove the leading casts, this allows code to be accepted by the c# compiler because of the object cast but still generate efficient and non-boxing code. I have annotated this in the code so anyone who finds it will understand why it is doing strange looking things. sharplab demo

The auto and manual tests pass in debug and release. The methods affected by these changes are covered by the existing DataStreamTest coverage and I didn't see a need to add more, but let me know if you see a gap.

Some performance measurements with my usual SqlBench project show encouraging improvement. The basic premise of the functions benched is to read 1000 guids with a single data reader and repeat that often enough to get enough time spent that the benchmark is representative. GetField and GetValue are both tested:

Before:

Method Mean Error StdDev Gen 0/1k Op Gen 1/1k Op Gen 2/1k Op Allocated Memory/Op
SyncReadGetField 381.1 ms 0.6447 ms 0.5715 ms 30000.0000 - - 92.41 MB
SyncReadGetGuid 328.5 ms 1.5653 ms 1.3876 ms 20000.0000 - - 61.9 MB

After:

Method Mean Error StdDev Gen 0/1k Op Gen 1/1k Op Gen 2/1k Op Allocated Memory/Op
SyncReadGetField 107.0 ms 2.067 ms 2.123 ms - - - 882.81 KB
SyncReadGetGuid 106.3 ms 1.164 ms 1.031 ms - - - 882.81 KB

I have only benched Guids because that was the target of my original investigation. The code changes include all supported value types so they should show the same behaviour (with appropriately sized memory gains for the box of course, guids is the largest of them).

/cc @afsanehr @saurabh500 @tarikulsabbir @David-Engel and @MarkPflug who brought this to my attention in https://github.com/dotnet/corefx/issues/31595#issuecomment-458789444

add ConstructGuid to TdsParser with netcore/netstd implementations
add generic accessors for SqlBuffer storage fields and consume from SqlDataReader
// the jit will emit all the cast operations as written. this will put the value into a box and then attempt to
// cast it, because it is an object even compatible casts will generate the desired InvalidCastException so users
// cannot widen a short to an int preserving external expectations that requests for any type other than the correct
// one will fail without conversion being considered
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment explains the new generic methods with code which looks wrong.

@MarkPflug
Copy link
Contributor

Another thing I would point out is the oddity of the SqlGuid type itself. I would have expected that it would contain a Guid as a field, but instead it uses a byte array to hold the guid value (so allocation on the heap). I think this should be changed to hold the guid value in value-type fields (like a Guid). I think the reason that it might be implemented that was is because of the odd-ball way that Sql server does guid sorting, it uses a different byte-order than .NET does for comparison. This probably gets even more complicated because Guid itself is endian-ness aware, so there are two levels of byte reshuffling that need to be taken into account, one for endianness at the Guid layer, and one for sql style sorting at the SqlGuid layer. It looks like this change you've made will avoid that SqlGuid allocation. And, I suspect that use of SqlGuid is probably pretty limited, but it is another place that an unneccessary heap allocation could be avoided.

Another potential place for improvement would be to read short length binary values (<= 16 bytes) into the Storage struct as well. Example: rowversion columns are pretty common and are returned in the same way as a binary(8) column, as a byte array. The value is small enough to be stored as a long (or a fixed length buffer) in the Storage struct as well.

@Wraith2
Copy link
Contributor Author

Wraith2 commented Jan 31, 2019

I don't see a way to put an array into a struct without having it as separate fields (in which case accessor perf would hurt) or using Unsafe which isn't appropriate in this library IMO. I thought rowversion could be nop converted to timestamp which can probably be made cheap if this set of changes passes review.

In general I assume that whoever designed and implemented the Sql* types had good reasons for what they're doing and that I shouldn't assume they're wrong or misguided. I'm happy to change perf and work on internal improvements but changes like reversing field orders is likely to give someone somewhere a really bad day and I've been on the receiving end of that too many times to do so lightly. Also, I just made guid's free and you want me to make them backwards now?! 😀

@MarkPflug
Copy link
Contributor

This library already had the unsafe flag set, not sure if it is actually used or not. But yeah, I was thinking a fixed length array, which would require unsafe, but the "unsafety" would all be encapsulated in the Storage struct. My experience dealing with rowversion (which to my understanding timestamp is just a sqlServer-specific alias for) is that it can only be accessed via GetBytes(), GetInt64() throws. To me this implies that it is landing in an array, is that array being reused or allocated? I don't know.

I don't really care about SqlGuid, and I suspect very few people do. I was just commenting that I thought it was odd that it used an array instead of a Guid internally, and speculating why that might have been done.

Those numbers above are pretty impressive!

@Wraith2
Copy link
Contributor Author

Wraith2 commented Jan 31, 2019

If you can give me an example setup I can profile and investigate for sql server I can look into it and see if there's anything I can do to improve it. I'm not working to any particular plan I'm just spending my spare time improving things that I happen to find.

By Unsafe I means the type Unsafe which is used to do various il level trickery that languages don't allow but the runtime does. It's powerful and dangerous, hence the name. The unsafe keyword use in sql client is all to do with interop and while I could mark the Storage struct as unsafe I'd rather investigate if there's a cleaner safe way to accomplish whatever is needed first.

@MarkPflug
Copy link
Contributor

Actually, it seems it should be possible with "safe" code. I just discovered these MemoryMarshal methods recently, I think they use the "Unsafe" class internally to do their work.

using System;
using System.Runtime.InteropServices;

class Program
{
    static void Main()
    {
        // create a span "input" containing 16 bytes of data.
        Span<byte> input = stackalloc byte[16];
        Guid.NewGuid().TryWriteBytes(input);

        Byte16 f;
        // copy the bytes from "input" into "f"
        MemoryMarshal.TryRead(input, out f);

        Span<byte> output = stackalloc byte[16];
        // read the data out of "f" into "output"
        MemoryMarshal.TryWrite(output, ref f);
    }
}

[StructLayout(LayoutKind.Explicit, Size = 16)]
struct Byte16
{
    // totally opaque struct, just a fixed-sized block of memory
}

So, you could put a Byte16 inside the Storage and write the binary results to/from it with the MemoryMarshal.TryRead/TryWrite. It seems that TryRead/TryWrite expect the size of the span to exactly match the size of the struct. Might be necessary to handle specific sizes: 8 bytes and 16 bytes. Those seem likely the most common sizes for short fixed-length binary values. 8 covers RowVersion which is probably the most common, and 16 … because the software I work on uses it extensively 😜.

@MarkPflug
Copy link
Contributor

Nevermind. There doesn't appear to be a problem with binary/rowversion. I thought I was seeing new arrays being allocated for the values that were stored behind the _object field, but I think I was misreading the code, or it got optimized away somehow. I profiled reading millions of binary(16) values and rowversions and it didn't seem to be allocating anything significant (according to the VS diagnostic tools). Here's the gist I was using: https://gist.github.com/MarkPflug/b46cdc748bdf71258a4812dd4d809858
It uses the sql localdb, creates a new empty database, fills a table with a million rows, then reads those rows back out in a loop.

I was basing my beliefs on my reading of the code, not actual diagnostics. Lesson learned.

@Wraith2
Copy link
Contributor Author

Wraith2 commented Jan 31, 2019

MemoryMarshal is netstandard2.1 or core only. This library has to be backwards compatible. The same applies to Unsafe and a lot of other useful new functions. If you look at my changes there are two implementations of ConstructGuid one for core which takes a span and another for all other runtimes which does things the old allocatey way.

I'm not sure the PR discussion is the right place to have a conversation about possible additions, that's more something an issue should be used for. Feel free to open one and tag me.

@Wraith2
Copy link
Contributor Author

Wraith2 commented Feb 2, 2019

Added date and decimal types which completes the valuetypes accepted by GetFieldValue and the async version which calls through it according to the code comments.
I also made the type checks more stringent.

@Wraith2 Wraith2 changed the title SqlClient remove most primative boxing in SqlDataReader.Get* SqlClient remove primative boxing in SqlDataReader.GetFieldValue Feb 3, 2019
if (_typeofINullable.IsAssignableFrom(typeofT))
// this block of type specific shortcuts uses RyuJIT jit behaviours to achieve fast implementations of the primative types
// RyuJIT will be able to determine at compilation time that the typeof(T)==typeof(<primative>) options are constant
// and be able to remove all implementations which cannot be reached. this will eliminate non-specialized code for value types
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Wraith2
Have you checked the assumption?
dataType is known at runtime only, so JIT won't be able to do the optimization.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The typeof(T) == typeof(byte) should eliminate the branches; it will then leave behind dataType == typeof(byte)

I assume that's why the double checks with typeof(T) being first?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, i answered in the main thread not this branch but sharplab confirms the elimination because one of the conditions is constantly false.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Wraith2
Hmm.. yes, you are right.
Wouldn't then be more efficient to compare directly with StorageType enum rather than creating an instance of Type just for comparison? Types are cached but the cash is finite and you could save a call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does the storage type checks first and if a direct match is found will return the value as directly as possible, if there's a mismatch then it falls back to the original path of doing the type check. So yes it's better to do what you said and it does.

I believe since it's generic that typeof(T) is low cost because the jit can fold in the type handle directly not requiring a method call. The other part is a readonly static to avoid the lookup cost.

@Wraith2
Copy link
Contributor Author

Wraith2 commented Feb 14, 2019

Yes, see this sharplab code and note the differences in assembly between the Guid and Int32 versions. The check isn't identical since i can't do explicit layout in sharplab but it proves the point because you can see that int specialization route is present but the short and guid ones aren't.

The result of the typeof(T)=typeof(<primative>) is still constant so even though the instance check is added the combination of the two can never be true for the majority of T's and the ones which can never be true are removed, The one which can be true is left present and the instance check done as expected. The Jit is really very clever 😁

// If the value was actually null, then we should throw a SqlNullValue instead
throw SQL.SqlNullValue();
}
else
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: don't need the else

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, it could just fall throw and have equivalent functionality but it's less obvious imo and this is the original layout so I stuck with it. Given that it's dealing with exceptions so perf is not a concern I didn't see a need to remove it, is there formatting guidance that applies to this case?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your point about existing code makes sense

@grant-d
Copy link

grant-d commented Feb 19, 2019

types are not primates

Ha. Good one

@Wraith2 Wraith2 changed the title SqlClient remove primative boxing in SqlDataReader.GetFieldValue SqlClient remove primitive boxing in SqlDataReader.GetFieldValue Feb 22, 2019
@roji
Copy link
Member

roji commented Feb 27, 2019

Posting this a bit late, but maybe it can still be helpful - see here for how Npgsql constructs its Guids without any allocations. Rather than using a pooled byte array instance as in the current PR a union struct is used instead.

@grant-d
Copy link

grant-d commented Feb 27, 2019

See here for how ...

Heads-up that it may not be a good idea to link to code with different OSS licenses

@roji
Copy link
Member

roji commented Feb 27, 2019

@grant-d thanks for the heads up, I really have no idea how that works... Npgsql is under the PostgreSQL license which should be very compatible with MIT, but I'm not expert...

@grant-d
Copy link

grant-d commented Feb 27, 2019

Neither me - so I rather err on the side of extreme caution

@Wraith2
Copy link
Contributor Author

Wraith2 commented Feb 27, 2019

I'm going to assume it's the explicit struct overlay trick. If so that's used elsewhere in the library and i'm not a fan. I know it works and the layout of Guid is unlikely to ever change but I'd much prefer to rely on public api surface. I haven't degraded perf on netfx and the span path on core gets equivalent perf to the overlay path.

@roji
Copy link
Member

roji commented Feb 28, 2019

Fair enough, and in any case the netcoreapp version is using span directly, which is in any case the best.

@karelz
Copy link
Member

karelz commented Mar 4, 2019

@afsanehr @tarikulsabbir @Gary-Zh @David-Engel this is another "lost PR" - sorry for tagging you late. Can you please check it out and see what are next steps (given it flew under the radar for 1+ month)? Thanks!

@AfsanehR-zz
Copy link
Contributor

@Wraith2 Could you update this pr with the latest from master please?

@AfsanehR-zz AfsanehR-zz added this to the 3.0 milestone Mar 14, 2019
@Wraith2
Copy link
Contributor Author

Wraith2 commented Mar 14, 2019

Addressed feedback and updated to master.

{
return (T)(object)data.Decimal;
}
else if (typeof(T) == typeof(DateTimeOffset) && dataType == typeof(DateTimeOffset) && _typeSystem > SqlConnectionString.TypeSystem.SQLServer2005 && !metaData.IsNewKatmaiDateTimeType)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we checking for !metaData.IsNewKatmaiDateTimeType here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I traced through how datetime was handled I found that there is special casing going on for old database versions and that it's checked using this flag. Katmai datetime values are really stored as strings according to the code so if you want to get a datetime you have to know whether it's directly stored or has to go through reinterpretation.

In that line particularly I took the logic used in GetSqlValueFromSqlBufferInternal and did the check to see if I can tell if I've really got a datetime I can easily return, if I have then do so, if not follow the old path to ensure compatibility.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, now what would be the case if Sql server version is bigger than 2005 & datatype is datetimeoffset.
Doesn't the check for metaData.IsNewKatmaiDateTimeType return true, hence the !metaData.IsNewKatmaiDateTimeType would eventually be false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I've inverted both parts when I only need to invert the version check. It should (and now does) check that the type system supports date/time/offset and that the column is one of those types. This would have forced dates down the compatibility path which wouldn't have been faster but also wouldn't be slower.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I will rerun the CI.

@AfsanehR-zz
Copy link
Contributor

@Wraith2 There is one other question I had on the check for IsNewKatmaiDateTimeType. Otherwise, everything else LGTM. Once that's finalized, we should be good with merging. Tests also passed.

@AfsanehR-zz AfsanehR-zz merged commit 5430d51 into dotnet:master Mar 19, 2019
@Wraith2 Wraith2 deleted the sqlperf-guidread branch March 19, 2019 19:52
JRahnama added a commit to JRahnama/SqlClient that referenced this pull request Sep 5, 2019
Ported changes from [PR 34999](dotnet/corefx#34999) : SqlClient remove primitive boxing in SqlDataReader.GetFieldValue
Ported changes from [PR 35023](dotnet/corefx#35023) : Remove stale warning 420 pragmas
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
SqlClient remove primitive boxing in SqlDataReader.GetFieldValue

Commit migrated from dotnet/corefx@5430d51
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.