New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

requestId ticks seed, reduced GenerateRequestId allocs #385

Merged
merged 1 commit into from Oct 8, 2015

Conversation

Projects
None yet
7 participants
@benaadams
Contributor

benaadams commented Oct 5, 2015

Seed requestId with DateTime.UtcNow.Ticks rather than Random
Reduced allocs in GenerateRequestId

/cc @DamianEdwards

@dnfclas

This comment has been minimized.

Show comment
Hide comment
@dnfclas

dnfclas Oct 5, 2015

Hi @benaadams, I'm your friendly neighborhood .NET Foundation Pull Request Bot (You can call me DNFBOT). Thanks for your contribution!
You've already signed the contribution license agreement. Thanks!

The agreement was validated by .NET Foundation and real humans are currently evaluating your PR.

TTYL, DNFBOT;

dnfclas commented Oct 5, 2015

Hi @benaadams, I'm your friendly neighborhood .NET Foundation Pull Request Bot (You can call me DNFBOT). Thanks for your contribution!
You've already signed the contribution license agreement. Thanks!

The agreement was validated by .NET Foundation and real humans are currently evaluating your PR.

TTYL, DNFBOT;

@benaadams

This comment has been minimized.

Show comment
Hide comment
@benaadams

benaadams Oct 5, 2015

Contributor
BenchmarkDotNet=v0.7.8.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-4720HQ CPU @ 2.60GHz, ProcessorCount=8
HostCLR=MS.NET 4.0.30319.42000, Arch=32-bit
Type=LongToString  Mode=Throughput  .NET=HostFramework
Method Platform Jit AvrTime StdDev op/s
LongToString X86 LegacyJit 48.6370 ns 0.5701 ns 20,560,474.66
RequestIdCharHeap X86 LegacyJit 7.9431 ns 0.1111 ns 125,896,043.99
RequestIdCharStack X86 LegacyJit 9.6136 ns 0.1804 ns 104,019,793.55
LongToString X64 RyuJit 21.9177 ns 0.7149 ns 45,625,348.42
RequestIdCharHeap X64 RyuJit 7.7770 ns 0.0258 ns 128,584,042.38
RequestIdCharStack X64 RyuJit 7.2523 ns 0.0137 ns 137,887,101.48

Allocations aside, is a little slower on x86, but faster on x64 and x64 is faster than x86; and still way way faster than .ToString()

Contributor

benaadams commented Oct 5, 2015

BenchmarkDotNet=v0.7.8.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-4720HQ CPU @ 2.60GHz, ProcessorCount=8
HostCLR=MS.NET 4.0.30319.42000, Arch=32-bit
Type=LongToString  Mode=Throughput  .NET=HostFramework
Method Platform Jit AvrTime StdDev op/s
LongToString X86 LegacyJit 48.6370 ns 0.5701 ns 20,560,474.66
RequestIdCharHeap X86 LegacyJit 7.9431 ns 0.1111 ns 125,896,043.99
RequestIdCharStack X86 LegacyJit 9.6136 ns 0.1804 ns 104,019,793.55
LongToString X64 RyuJit 21.9177 ns 0.7149 ns 45,625,348.42
RequestIdCharHeap X64 RyuJit 7.7770 ns 0.0258 ns 128,584,042.38
RequestIdCharStack X64 RyuJit 7.2523 ns 0.0137 ns 137,887,101.48

Allocations aside, is a little slower on x86, but faster on x64 and x64 is faster than x86; and still way way faster than .ToString()

@stephentoub

This comment has been minimized.

Show comment
Hide comment
@stephentoub

stephentoub Oct 5, 2015

LGTM.

Is performance here really critical? If so, there are other things that could be done to cut down on memory usage further:

  • If instead of using DateTime.UtcNow.Ticks you were to use Environment.TickCount (which is what Random uses internally), then the majority of requests would have a relatively small ID, e.g. the hex encoding of Environment.TickCount on my machine right now is 62EF26 whereas for DateTime.UtcNow.Ticks it's 8D2CDAE45A15844. If you were to go with the smaller value, and if you were to size the resulting string based on the actual number of characters used (as long.ToString does by default), then in this example most of the allocated strings would be only 6-characters long rather than 15-characters long.
  • If instead of hex you were to use a larger set of characters, e.g. 64 of them rather than 16 of them, you could encode 6 bits per character instead of only 4 bits per character, and save a few additional characters that way.

stephentoub commented Oct 5, 2015

LGTM.

Is performance here really critical? If so, there are other things that could be done to cut down on memory usage further:

  • If instead of using DateTime.UtcNow.Ticks you were to use Environment.TickCount (which is what Random uses internally), then the majority of requests would have a relatively small ID, e.g. the hex encoding of Environment.TickCount on my machine right now is 62EF26 whereas for DateTime.UtcNow.Ticks it's 8D2CDAE45A15844. If you were to go with the smaller value, and if you were to size the resulting string based on the actual number of characters used (as long.ToString does by default), then in this example most of the allocated strings would be only 6-characters long rather than 15-characters long.
  • If instead of hex you were to use a larger set of characters, e.g. 64 of them rather than 16 of them, you could encode 6 bits per character instead of only 4 bits per character, and save a few additional characters that way.
@benaadams

This comment has been minimized.

Show comment
Hide comment
@benaadams

benaadams Oct 5, 2015

Contributor

If instead of hex you were to use a larger set of characters, e.g. 64 of them rather than 16 of them

That trumps them all!

Method Platform Jit AvrTime StdDev op/s
LongToString X86 LegacyJit 48.6370 ns 0.5701 ns 20,560,474.66
RequestIdCharHeap X86 LegacyJit 7.9431 ns 0.1111 ns 125,896,043.99
RequestIdCharStack X86 LegacyJit 9.6136 ns 0.1804 ns 104,019,793.55
RequestIdChar64Stack X86 LegacyJit 7.4285 ns 0.0076 ns 134,617,091.36
LongToString X64 RyuJit 21.9177 ns 0.7149 ns 45,625,348.42
RequestIdCharHeap X64 RyuJit 7.7770 ns 0.0258 ns 128,584,042.38
RequestIdCharStack X64 RyuJit 7.2523 ns 0.0137 ns 137,887,101.48
RequestIdChar64Stack X64 RyuJit 6.0165 ns 0.0120 ns 166,209,448.85

Environment.TickCount I'm not sure about as it wraps every 50 days; and this value will be like used in tracing and logging; so with a DateTime.UtcNow.Ticks based start you can have a correlated event stream spanning years; but Environment.TickCount would give a shorter period, which would work for DevOps and diagnostic correlation; but not necessarily for big data (though a timestamp is probably thrown in also so its probably not too important).

The full string also gives simple alpha sort (case sensitive) to get all the events in order; but again not sure is important as will likely be a timestamp somewhere? If not can go with the standard base64 set and its odd order. Obviously the current set of chars can be changed for different ones (needs two extra over 0-9a-zA-Z) without any performance impact.

Contributor

benaadams commented Oct 5, 2015

If instead of hex you were to use a larger set of characters, e.g. 64 of them rather than 16 of them

That trumps them all!

Method Platform Jit AvrTime StdDev op/s
LongToString X86 LegacyJit 48.6370 ns 0.5701 ns 20,560,474.66
RequestIdCharHeap X86 LegacyJit 7.9431 ns 0.1111 ns 125,896,043.99
RequestIdCharStack X86 LegacyJit 9.6136 ns 0.1804 ns 104,019,793.55
RequestIdChar64Stack X86 LegacyJit 7.4285 ns 0.0076 ns 134,617,091.36
LongToString X64 RyuJit 21.9177 ns 0.7149 ns 45,625,348.42
RequestIdCharHeap X64 RyuJit 7.7770 ns 0.0258 ns 128,584,042.38
RequestIdCharStack X64 RyuJit 7.2523 ns 0.0137 ns 137,887,101.48
RequestIdChar64Stack X64 RyuJit 6.0165 ns 0.0120 ns 166,209,448.85

Environment.TickCount I'm not sure about as it wraps every 50 days; and this value will be like used in tracing and logging; so with a DateTime.UtcNow.Ticks based start you can have a correlated event stream spanning years; but Environment.TickCount would give a shorter period, which would work for DevOps and diagnostic correlation; but not necessarily for big data (though a timestamp is probably thrown in also so its probably not too important).

The full string also gives simple alpha sort (case sensitive) to get all the events in order; but again not sure is important as will likely be a timestamp somewhere? If not can go with the standard base64 set and its odd order. Obviously the current set of chars can be changed for different ones (needs two extra over 0-9a-zA-Z) without any performance impact.

@benaadams

This comment has been minimized.

Show comment
Hide comment
@benaadams

benaadams Oct 5, 2015

Contributor

However using Environment.TickCount and an int based RequestId (6 char even without shrinking) is a very significant boost

Method Platform Jit AvrTime StdDev op/s
LongToString X86 LegacyJit 48.6370 ns 0.5701 ns 20,560,474.66
RequestIdCharHeap X86 LegacyJit 7.9431 ns 0.1111 ns 125,896,043.99
RequestIdCharStack X86 LegacyJit 9.6136 ns 0.1804 ns 104,019,793.55
RequestIdChar64Stack X86 LegacyJit 7.4285 ns 0.0076 ns 134,617,091.36
IntChar64Stack X86 LegacyJit 5.9278 ns 0.0069 ns 168,697,749.18
LongToString X64 RyuJit 21.9177 ns 0.7149 ns 45,625,348.42
RequestIdCharHeap X64 RyuJit 7.7770 ns 0.0258 ns 128,584,042.38
RequestIdCharStack X64 RyuJit 7.2523 ns 0.0137 ns 137,887,101.48
RequestIdChar64Stack X64 RyuJit 6.0165 ns 0.0120 ns 166,209,448.85
IntChar64Stack X64 RyuJit 4.2964 ns 0.0117 ns 232,752,905.73
Contributor

benaadams commented Oct 5, 2015

However using Environment.TickCount and an int based RequestId (6 char even without shrinking) is a very significant boost

Method Platform Jit AvrTime StdDev op/s
LongToString X86 LegacyJit 48.6370 ns 0.5701 ns 20,560,474.66
RequestIdCharHeap X86 LegacyJit 7.9431 ns 0.1111 ns 125,896,043.99
RequestIdCharStack X86 LegacyJit 9.6136 ns 0.1804 ns 104,019,793.55
RequestIdChar64Stack X86 LegacyJit 7.4285 ns 0.0076 ns 134,617,091.36
IntChar64Stack X86 LegacyJit 5.9278 ns 0.0069 ns 168,697,749.18
LongToString X64 RyuJit 21.9177 ns 0.7149 ns 45,625,348.42
RequestIdCharHeap X64 RyuJit 7.7770 ns 0.0258 ns 128,584,042.38
RequestIdCharStack X64 RyuJit 7.2523 ns 0.0137 ns 137,887,101.48
RequestIdChar64Stack X64 RyuJit 6.0165 ns 0.0120 ns 166,209,448.85
IntChar64Stack X64 RyuJit 4.2964 ns 0.0117 ns 232,752,905.73
@benaadams

This comment has been minimized.

Show comment
Hide comment
@benaadams

benaadams Oct 5, 2015

Contributor

Would need call on whether to do any of

  • Move int based with Environment.TickCount
  • Size the array to remove "leading zeros" rather than fully filled
  • Change any of _encode64Chars character choices
Contributor

benaadams commented Oct 5, 2015

Would need call on whether to do any of

  • Move int based with Environment.TickCount
  • Size the array to remove "leading zeros" rather than fully filled
  • Change any of _encode64Chars character choices
@bbowyersmyth

This comment has been minimized.

Show comment
Hide comment
@bbowyersmyth

bbowyersmyth Oct 5, 2015

If this is performance critical, seeing it is a string creation method you could do the unsafe index assignment on a pre-sized string itself and ditch the temp buffer and copy. Don't know this repo stance on such things though.

bbowyersmyth commented Oct 5, 2015

If this is performance critical, seeing it is a string creation method you could do the unsafe index assignment on a pre-sized string itself and ditch the temp buffer and copy. Don't know this repo stance on such things though.

@benaadams

This comment has been minimized.

Show comment
Hide comment
@benaadams

benaadams Oct 6, 2015

Contributor

@bbowyersmyth String looks like its implemented in clr native and exposes no accepted way to do this so it is crossing an implementation detail boundary and moving outside the managed implementation; which may be a step to far? Its further than reflecting over private members which is quite dodgy.

I'd prefer to have this as a exposed way of doing things; rather than assume an implementation (e.g. change corefx/coreclr); but I don't know if that's realistic.

Contributor

benaadams commented Oct 6, 2015

@bbowyersmyth String looks like its implemented in clr native and exposes no accepted way to do this so it is crossing an implementation detail boundary and moving outside the managed implementation; which may be a step to far? Its further than reflecting over private members which is quite dodgy.

I'd prefer to have this as a exposed way of doing things; rather than assume an implementation (e.g. change corefx/coreclr); but I don't know if that's realistic.

@benaadams

This comment has been minimized.

Show comment
Hide comment
@benaadams

benaadams Oct 6, 2015

Contributor

@bbowyersmyth maybe something like LockBits that exists on System.Drawing.Bitmap

Contributor

benaadams commented Oct 6, 2015

@bbowyersmyth maybe something like LockBits that exists on System.Drawing.Bitmap

@bbowyersmyth

This comment has been minimized.

Show comment
Hide comment
@bbowyersmyth

bbowyersmyth Oct 6, 2015

@benaadams It may be a step to far but I though I'd mention it if this is still on the hot path. Something as simple as duplicating data can have an impact on the RPS. All good.

bbowyersmyth commented Oct 6, 2015

@benaadams It may be a step to far but I though I'd mention it if this is still on the hot path. Something as simple as duplicating data can have an impact on the RPS. All good.

@benaadams

This comment has been minimized.

Show comment
Hide comment
@benaadams

benaadams Oct 6, 2015

Contributor

@bbowyersmyth zero alloc, zero copy are my watch words ;-)

Contributor

benaadams commented Oct 6, 2015

@bbowyersmyth zero alloc, zero copy are my watch words ;-)

@DamianEdwards

This comment has been minimized.

Show comment
Hide comment
@DamianEdwards

DamianEdwards Oct 6, 2015

Member

This certainly isn't on the hot path anymore. I think that the change that is currently in this PR is a worth step, to reduce the heap allocations but anything beyond that is most likely overkill right now (or at least until this shows up in our profile again 😄).

Member

DamianEdwards commented Oct 6, 2015

This certainly isn't on the hot path anymore. I think that the change that is currently in this PR is a worth step, to reduce the heap allocations but anything beyond that is most likely overkill right now (or at least until this shows up in our profile again 😄).

@benaadams

This comment has been minimized.

Show comment
Hide comment
@benaadams

benaadams Oct 6, 2015

Contributor

need a call on the base64 set though; won't change perf

Contributor

benaadams commented Oct 6, 2015

need a call on the base64 set though; won't change perf

@DamianEdwards

This comment has been minimized.

Show comment
Hide comment
@DamianEdwards

DamianEdwards Oct 7, 2015

Member

@benaadams base64 implies case sensitivity yes? I don't really want these IDs to be case sensitive. Perhaps base36 is enough.

Member

DamianEdwards commented Oct 7, 2015

@benaadams base64 implies case sensitivity yes? I don't really want these IDs to be case sensitive. Perhaps base36 is enough.

@benaadams

This comment has been minimized.

Show comment
Hide comment
@benaadams

benaadams Oct 7, 2015

Contributor

@DamianEdwards base32 would be easier for a clean 5 bit; shave off a couple letters from the end?

Contributor

benaadams commented Oct 7, 2015

@DamianEdwards base32 would be easier for a clean 5 bit; shave off a couple letters from the end?

@DamianEdwards

This comment has been minimized.

Show comment
Hide comment
@DamianEdwards
Member

DamianEdwards commented Oct 7, 2015

Sure

@benaadams

This comment has been minimized.

Show comment
Hide comment
@benaadams

benaadams Oct 7, 2015

Contributor

Done. ~310% faster than calling long.ToString() on x64 and ~600% faster on x32

Method Platform Jit AvrTime StdDev op/s
LongToString X86 LegacyJit 48.6370 ns 0.5701 ns 20,560,474.66
CharHeap X86 LegacyJit 7.9431 ns 0.1111 ns 125,896,043.99
CharStack X86 LegacyJit 9.6136 ns 0.1804 ns 104,019,793.55
Char32Stack X86 LegacyJit 8.1148 ns 0.0441 ns 123,231,165.02
Char64Stack X86 LegacyJit 7.4285 ns 0.0076 ns 134,617,091.36
IntChar64Stack X86 LegacyJit 5.9278 ns 0.0069 ns 168,697,749.18

Method Platform Jit AvrTime StdDev op/s
LongToString X64 RyuJit 21.9177 ns 0.7149 ns 45,625,348.42
CharHeap X64 RyuJit 7.7770 ns 0.0258 ns 128,584,042.38
CharStack X64 RyuJit 7.2523 ns 0.0137 ns 137,887,101.48
Char32Stack X64 RyuJit 6.9184 ns 0.0076 ns 144,542,881.19
Char64Stack X64 RyuJit 6.0165 ns 0.0120 ns 166,209,448.85
IntChar64Stack X64 RyuJit 4.2964 ns 0.0117 ns 232,752,905.73
Contributor

benaadams commented Oct 7, 2015

Done. ~310% faster than calling long.ToString() on x64 and ~600% faster on x32

Method Platform Jit AvrTime StdDev op/s
LongToString X86 LegacyJit 48.6370 ns 0.5701 ns 20,560,474.66
CharHeap X86 LegacyJit 7.9431 ns 0.1111 ns 125,896,043.99
CharStack X86 LegacyJit 9.6136 ns 0.1804 ns 104,019,793.55
Char32Stack X86 LegacyJit 8.1148 ns 0.0441 ns 123,231,165.02
Char64Stack X86 LegacyJit 7.4285 ns 0.0076 ns 134,617,091.36
IntChar64Stack X86 LegacyJit 5.9278 ns 0.0069 ns 168,697,749.18

Method Platform Jit AvrTime StdDev op/s
LongToString X64 RyuJit 21.9177 ns 0.7149 ns 45,625,348.42
CharHeap X64 RyuJit 7.7770 ns 0.0258 ns 128,584,042.38
CharStack X64 RyuJit 7.2523 ns 0.0137 ns 137,887,101.48
Char32Stack X64 RyuJit 6.9184 ns 0.0076 ns 144,542,881.19
Char64Stack X64 RyuJit 6.0165 ns 0.0120 ns 166,209,448.85
IntChar64Stack X64 RyuJit 4.2964 ns 0.0117 ns 232,752,905.73
@benaadams

This comment has been minimized.

Show comment
Hide comment
@benaadams

benaadams Oct 8, 2015

Contributor

Back of paper calc says can reduce it from 13 bytes to 12 bytes but cutting the representable range to: from now (or open sourcing of .net; or start of asp.net 5) to 365 years from now rather than +/- 9999 years

Contributor

benaadams commented Oct 8, 2015

Back of paper calc says can reduce it from 13 bytes to 12 bytes but cutting the representable range to: from now (or open sourcing of .net; or start of asp.net 5) to 365 years from now rather than +/- 9999 years

@DamianEdwards

This comment has been minimized.

Show comment
Hide comment
@DamianEdwards

DamianEdwards Oct 8, 2015

Member

Looks good to me. @davidfowl please review

Member

DamianEdwards commented Oct 8, 2015

Looks good to me. @davidfowl please review

@davidfowl

This comment has been minimized.

Show comment
Hide comment
@davidfowl

davidfowl Oct 8, 2015

Member

Looks good!

Member

davidfowl commented Oct 8, 2015

Looks good!

@davidfowl

This comment has been minimized.

Show comment
Hide comment
@davidfowl

davidfowl Oct 8, 2015

Member

@benaadams can you squash your commits?

Member

davidfowl commented Oct 8, 2015

@benaadams can you squash your commits?

Less allocs in GenerateRequestId
Case insentive base32 encoding
@benaadams

This comment has been minimized.

Show comment
Hide comment
@benaadams

benaadams Oct 8, 2015

Contributor

Done

Contributor

benaadams commented Oct 8, 2015

Done

davidfowl added a commit that referenced this pull request Oct 8, 2015

Merge pull request #385 from benaadams/could-not-resist
requestId ticks seed, reduced GenerateRequestId allocs

@davidfowl davidfowl merged commit e92d325 into aspnet:dev Oct 8, 2015

2 checks passed

continuous-integration/appveyor AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@benaadams benaadams deleted the benaadams:could-not-resist branch Oct 8, 2015

@icanhasjonas

This comment has been minimized.

Show comment
Hide comment
@icanhasjonas

icanhasjonas Oct 15, 2015

You probably meant to write Base32 encoding here.

You probably meant to write Base32 encoding here.

This comment has been minimized.

Show comment
Hide comment
@benaadams

benaadams Oct 15, 2015

Contributor

Code is moving project, but added PR for correction here: aspnet/HttpAbstractions#441

Contributor

benaadams replied Oct 15, 2015

Code is moving project, but added PR for correction here: aspnet/HttpAbstractions#441

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment