Switch DirectoryControl to use AsnWriter, AsnDecoder #101512

Open: wants to merge 10 commits into base: main

edwardneal
Contributor

Relates to #97540.

This PR replaces all references to BerConverter in LDAP directory control generation/parsing to use AsnWriter and AsnDecoder. SortRequestControl didn't use BerConverter directly - it called the OpenLDAP and WLDAP ldap_create_sort_control APIs instead. This class was the only thing in S.DS.P which referenced the ldap_create_sort_control and SortKeyInterop struct, so I deleted them both.

SortRequestControl

The change to SortRequestControl's generation mechanism might also resolve #34679, since there shouldn't be any mechanism for the heap corruption to occur.

Most of the SortRequestControl's new ASN.1 encoding is pretty uncontroversial, but there was a bit of discussion in PR #65548 around the encoding of the sort key's attribute name, and this was marshalled (as part of SortKeyInterop) with different encodings between Windows and Linux. In the RFC, this is defined (indirectly) as an LdapString; this is described as ISO10646 characters, encoded as a UTF-8 string and represented as an OCTET STRING. I'm fairly sure that UTF8Encoding.GetBytes fulfils this, and running the associated test case against a real AD domain controller passes.
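
To make the encoding question concrete, here is a minimal sketch of writing a single sort key with AsnWriter, with the attribute name encoded as UTF-8 and written as an OCTET STRING. `EncodeSortKey` is a hypothetical helper for illustration and is not the code in this PR; the structure follows the SortKeyList definition in RFC 2891, and the context-specific tags 0 and 1 for the optional fields are taken from that RFC.

```csharp
#nullable enable
using System;
using System.Formats.Asn1;
using System.Text;

// Encode a sort request for the "cn" attribute, ascending, default matching rule.
byte[] encoded = EncodeSortKey("cn", matchingRule: null, reverseOrder: false);
Console.WriteLine(Convert.ToHexString(encoded)); // 300630040402636E

// Hypothetical helper (not from the PR): writes a SortKeyList containing one key.
static byte[] EncodeSortKey(string attributeName, string? matchingRule, bool reverseOrder)
{
    // No byte order mark; the LDAPString is just the UTF-8 bytes.
    var utf8 = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);
    var writer = new AsnWriter(AsnEncodingRules.BER);

    using (writer.PushSequence()) // SortKeyList ::= SEQUENCE OF ...
    using (writer.PushSequence()) // ... SEQUENCE { attributeType, orderingRule, reverseOrder }
    {
        // attributeType: UTF-8 bytes in a universal OCTET STRING.
        writer.WriteOctetString(utf8.GetBytes(attributeName));

        // orderingRule [0] is OPTIONAL, so it is only written when supplied.
        if (!string.IsNullOrEmpty(matchingRule))
        {
            writer.WriteOctetString(utf8.GetBytes(matchingRule), new Asn1Tag(TagClass.ContextSpecific, 0));
        }

        // reverseOrder [1] defaults to FALSE, so it is only written when TRUE.
        if (reverseOrder)
        {
            writer.WriteBoolean(true, new Asn1Tag(TagClass.ContextSpecific, 1));
        }
    }

    return writer.Encode();
}
```

The output reads as two nested SEQUENCE headers (30 06, 30 04) followed by the OCTET STRING 04 02 63 6E, which is "cn" in UTF-8.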

Test changes

There are also test changes, but these are largely to change the special-casing of expected byte values between OpenLDAP and WLDAP - .NET now generates these values in a consistent format (the OpenLDAP format) regardless of platform. The .NET Framework tests continued to use the version of S.DS.P from the GAC in my environment, so I've special-cased by the framework version rather than by the platform.

Misc. optimizations

There were a handful of byte-by-byte array copies, which I've switched over to using span-based copies in hopes that they'll benefit slightly from vectorisation. TransformControls and GetValue have a related change: where they used to reference properties returning byte arrays (which took defensive copies) they now reference the property values directly. These should both reduce GC traffic slightly.
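
As a small illustration of the copy change described above (a sketch with made-up data, not the PR's code), the loop and the span-based copy below produce the same result; the span version hands the whole copy to the runtime's optimised memmove-style routine instead of moving one byte per iteration.

```csharp
using System;

byte[] source = { 0x30, 0x03, 0x02, 0x01, 0x05 };

// Before: element-by-element copy.
byte[] loopCopy = new byte[source.Length];
for (int i = 0; i < source.Length; i++)
{
    loopCopy[i] = source[i];
}

// After: a single span-based copy of the whole buffer.
byte[] spanCopy = new byte[source.Length];
source.AsSpan().CopyTo(spanCopy);

Console.WriteLine(loopCopy.AsSpan().SequenceEqual(spanCopy.AsSpan())); // True
```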

Replaced this with the managed AsnDecoder, removing PInvoke from a potential hot path.
Also removed the manual API calls to ldap_create_sort_control - this is now built in managed code.
This then has knock-on effects to eliminate the SortKeyInterop classes.
Most of the Control tests were hardcoded to the output of BerConverter, which uses four-byte lengths in all cases.
This behaviour is now different: the same output is returned across all platforms for .NET, and remains unchanged for .NET Framework.
This should also close issue 34679.
Reduce number of copies required in TransformControls, and enable these copies to take advantage of newer intrinsics where available.
Windows domain controllers may return a distinguished name starting with OU=, rather than ou=.
@PaulusParssinen
Contributor

Out of curiosity, any benchmarks for perf. numbers before/after switching to AsnReader/AsnWriter?

@edwardneal
Contributor Author

I've not got benchmarks right now, but will write some in the next few days. In advance of these, I expect there'll be a modest reduction in managed and unmanaged memory usage, and that execution time will reduce (while remaining within the margin of error for the network request itself.)

Preallocating space for AsnWriter buffers to reduce memory usage.
Correctly handling attribute names in SortControls.
@edwardneal
Contributor Author

edwardneal commented Apr 27, 2024

Benchmarks are below. To summarize:

  • As expected, the performance improvements are working in the margins. I fully expect most of the execution time variations to be lost in the noise of network traffic.
  • 35% median reduction in memory usage. One notable improvement on this is in DirectoryControl.TransformControls, which is on the hot path for processing LDAP responses and reduces memory usage by about 75%.
  • Although the percentage reductions in memory usage are good, the absolute reductions are pretty small - the median reduction was 144 bytes.
  • 88.5% median reduction in execution time, although the absolute reductions are often small - AsqRequest is reduced from 1.869 microseconds to 185.1 nanoseconds.
  • DirectoryControl.TransformControls is another notable exception to this, reducing from 6.596us to 1.588us.
  • Most of the original code's memory allocations stuck around for Gen1 GCs. This GC pressure no longer exists.
  • I've got no data on unmanaged memory usage. This is particularly relevant for SortRequest, which moved from 400 bytes to 416 bytes managed memory usage. I'm assuming that this lack of data is the reason for the increase in memory usage - it's not actually increasing, it's just now trackable in the managed counters.
  • Code size is 15 bytes in most places. The disassembly puts this at the size of the benchmark itself - just enough to return DirectoryControl.GetValue. I think this is just noise from the JIT inlining.
Performance header
BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3296/23H2/2023Update/SunValley3)
Intel Core i7-8565U CPU 1.80GHz (Whiskey Lake), 1 CPU, 8 logical and 4 physical cores
.NET SDK 8.0.200
  [Host]     : .NET 8.0.4 (8.0.424.16909), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.4 (8.0.424.16909), X64 RyuJIT AVX2
AsqRequestControl.GetValue: -90% execution time, -35% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.869 μs | 0.0373 μs | 0.0485 μs | 127 B | 0.1030 | 0.0992 | 432 B |
| PR | 185.1 ns | 3.25 ns | 4.12 ns | 15 B | 0.0668 | - | 280 B |

CrossDomainMoveControl.GetValue: -54% execution time, -62% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 89.81 ns | 1.297 ns | 1.332 ns | 1,039 B | 0.0516 | - | 216 B |
| PR | 41.24 ns | 0.469 ns | 0.438 ns | 2,577 B | 0.0191 | - | 80 B |

DirSyncRequestControl.GetValue: -89% execution time, -35% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.587 μs | 0.0453 μs | 0.1329 μs | 191 B | 0.1221 | 0.1183 | 520 B |
| PR | 162.4 ns | 2.20 ns | 2.06 ns | 15 B | 0.0782 | - | 328 B |

ExtendedDNControl.GetValue: -89% execution time, -30% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.059 μs | 0.0213 μs | 0.0522 μs | 147 B | 0.0877 | 0.0858 | 368 B |
| PR | 115.9 ns | 1.72 ns | 1.44 ns | 15 B | 0.0610 | - | 256 B |

PageResultRequestControl.GetValue: -90% execution time, -40% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.412 μs | 0.0283 μs | 0.0432 μs | 160 B | 0.1030 | 0.1011 | 432 B |
| PR | 134.7 ns | 1.49 ns | 1.32 ns | 15 B | 0.0610 | - | 256 B |

QuotaControl.GetValue: -92% execution time, -28% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.607 μs | 0.0261 μs | 0.0232 μs | 127 B | 0.0916 | 0.0877 | 392 B |
| PR | 120.4 ns | 1.65 ns | 1.29 ns | 15 B | 0.0668 | - | 280 B |

SearchOptionsControl.GetValue: -90% execution time, -30% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.251 μs | 0.0223 μs | 0.0197 μs | 147 B | 0.0877 | 0.0858 | 368 B |
| PR | 114.7 ns | 1.56 ns | 1.39 ns | 15 B | 0.0610 | - | 256 B |

SecurityDescriptorFlagControl.GetValue: -88% execution time, -30% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.021 μs | 0.0188 μs | 0.0460 μs | 147 B | 0.0877 | 0.0858 | 368 B |
| PR | 115.8 ns | 1.47 ns | 1.37 ns | 15 B | 0.0610 | - | 256 B |

SortRequestControl.GetValue: -79% execution time, +4% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.893 μs | 0.0220 μs | 0.0195 μs | 15 B | 0.0954 | - | 400 B |
| PR | 390.5 ns | 5.37 ns | 5.03 ns | 15 B | 0.0992 | - | 416 B |

DirectoryControl.TransformControls: -75% execution time, -76% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 6.596 μs | 0.1310 μs | 0.3164 μs | 3,603 B | 0.8392 | 0.0153 | 3.43 KB |
| PR | 1.588 μs | 0.0208 μs | 0.0255 μs | 12,854 B | 0.1945 | - | 816 B |

VerifyNameControl.GetValue: -87% execution time, -46% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.575 μs | 0.0480 μs | 0.1240 μs | 1,526 B | 0.1488 | 0.1469 | 624 B |
| PR | 195.1 ns | 1.43 ns | 1.27 ns | 15 B | 0.0801 | - | 336 B |

VlvRequestControl.GetValue: -84% execution time, -69% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.459 μs | 0.0276 μs | 0.0650 μs | 15 B | 0.2289 | 0.0038 | 968 B |
| PR | 227.7 ns | 3.21 ns | 3.00 ns | 15 B | 0.0706 | - | 296 B |

I've made a performance adjustment by specifying the initial size of AsnWriter, since this is trivial to calculate (or always static.) AsnWriter grows in 1KB increments, which is much larger than the size of a normal directory control and causes memory usage to balloon.
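
A minimal sketch of the pre-sizing idea, assuming a paged-results style control value (a SEQUENCE of an INTEGER page size and an OCTET STRING cookie). The page size of 512, the empty cookie and the capacity of 16 are illustrative values, not figures from the PR; the two-argument AsnWriter constructor is the overload that takes an initial capacity.

```csharp
using System;
using System.Formats.Asn1;

long pageSize = 512;                       // illustrative value
byte[] cookie = Array.Empty<byte>();       // first page: no cookie yet

// The whole value is 8 bytes here, so a small initial capacity avoids
// the writer's default first allocation being far larger than needed.
var writer = new AsnWriter(AsnEncodingRules.BER, 16);

using (writer.PushSequence())
{
    writer.WriteInteger(pageSize);
    writer.WriteOctetString(cookie);
}

byte[] value = writer.Encode();
Console.WriteLine(Convert.ToHexString(value)); // 3006020202000400
```

If the estimate is too low the writer still grows as needed, so an undersized capacity costs a reallocation rather than an exception.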

One inefficiency which I couldn't eliminate is that when writing strings as ASN.1 octet strings, I want to manually select the encoding to use and encode directly into the AsnWriter buffer. This isn't possible (probably to keep AsnWriter specification-compliant), so I have to reserve/allocate a byte array, encode into that and write that out as an octet string. An example of this behaviour is in VerifyNameControl.GetValue.
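
The two-step pattern looks roughly like the sketch below: encode the string into a scratch buffer, then hand that buffer to WriteOctetString. `EncodeServerName`, the 256-byte threshold and the sample host name are assumptions for illustration, not the PR's actual code; UTF-16 is used because that is the encoding discussed for the server name later in this thread.

```csharp
using System;
using System.Formats.Asn1;
using System.Text;

byte[] encoded = EncodeServerName("dc01.example.com"); // hypothetical host name
Console.WriteLine(encoded.Length); // 34: two header bytes plus 32 bytes of UTF-16

// Hypothetical helper (not from the PR).
static byte[] EncodeServerName(string serverName)
{
    const int StackThreshold = 256; // assumed cut-off for using the stack

    // Step 1: encode into a temporary buffer, on the stack when it is small enough.
    int byteCount = Encoding.Unicode.GetByteCount(serverName);
    Span<byte> scratch = byteCount <= StackThreshold
        ? stackalloc byte[StackThreshold]
        : new byte[byteCount];
    int written = Encoding.Unicode.GetBytes(serverName, scratch);

    // Step 2: copy the encoded bytes into the writer as an OCTET STRING.
    var writer = new AsnWriter(AsnEncodingRules.BER);
    writer.WriteOctetString(scratch.Slice(0, written));
    return writer.Encode();
}
```

The extra copy in step 2 is the inefficiency being described: the bytes are produced once into the scratch buffer and then copied again into the writer's own buffer.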

Edit: the updated build has completed and the test failures are unrelated, so I'm now happy that the benchmarks are valid @PaulusParssinen

@edwardneal
Contributor Author

Thanks @PaulusParssinen, I'll correct the nits shortly.

I dug into the codegen for stackalloc. The results were fairly interesting.

Variable string length stackalloc

Sample code

Span<byte> byteSpan = stackalloc byte[myString.Length];
return byteSpan.Length;

Assembly

7FFACF4D6C90 push      rbp
7FFACF4D6C91 sub       rsp,30
7FFACF4D6C95 lea       rbp,[rsp+20]
7FFACF4D6C9A mov       rax,613C5BC56FC9
7FFACF4D6CA4 mov       [rbp+8],rax
7FFACF4D6CA8 mov       rax,[rcx+8]
7FFACF4D6CAC mov       eax,[rax+8]
7FFACF4D6CAF mov       ecx,eax
7FFACF4D6CB1 test      rcx,rcx
7FFACF4D6CB4 je        short 00007FFACF4D6CD4
7FFACF4D6CB6 add       rcx,0F
7FFACF4D6CBA shr       rcx,4
7FFACF4D6CBE add       rsp,20
7FFACF4D6CC2 push      0
7FFACF4D6CC4 push      0
7FFACF4D6CC6 dec       rcx
7FFACF4D6CC9 jne       short 00007FFACF4D6CC2
7FFACF4D6CCB sub       rsp,20
7FFACF4D6CCF lea       rcx,[rsp+20]
7FFACF4D6CD4 mov       rcx,613C5BC56FC9
7FFACF4D6CDE cmp       [rbp+8],rcx
7FFACF4D6CE2 je        short 00007FFACF4D6CE9
7FFACF4D6CE4 call      CORINFO_HELP_FAIL_FAST
7FFACF4D6CE9 nop
7FFACF4D6CEA lea       rsp,[rbp+10]
7FFACF4D6CEE pop       rbp
7FFACF4D6CEF ret
Constant length stackalloc

Sample code

Span<byte> byteSpan = stackalloc byte[0x100];
return byteSpan.Length;

Assembly

7FFB47536CD0 push      rbp
7FFB47536CD1 sub       rsp,30
7FFB47536CD5 lea       rbp,[rsp+20]
7FFB47536CDA mov       rax,2DEAD4CD135B
7FFB47536CE4 mov       [rbp+8],rax
7FFB47536CE8 mov       eax,100
7FFB47536CED test      rax,rax
7FFB47536CF0 je        short 00007FFB47536D10
7FFB47536CF2 add       rax,0F
7FFB47536CF6 shr       rax,4
7FFB47536CFA add       rsp,20
7FFB47536CFE push      0
7FFB47536D00 push      0
7FFB47536D02 dec       rax
7FFB47536D05 jne       short 00007FFB47536CFE
7FFB47536D07 sub       rsp,20
7FFB47536D0B lea       rax,[rsp+20]
7FFB47536D10 mov       eax,100
7FFB47536D15 mov       rcx,2DEAD4CD135B
7FFB47536D1F cmp       [rbp+8],rcx
7FFB47536D23 je        short 00007FFB47536D2A
7FFB47536D25 call      CORINFO_HELP_FAIL_FAST
7FFB47536D2A nop
7FFB47536D2B lea       rsp,[rbp+10]
7FFB47536D2F pop       rbp
7FFB47536D30 ret
Constant length stackalloc, with SkipLocalsInit

Sample code

Span<byte> byteSpan = stackalloc byte[0x100];
return byteSpan.Length;

Assembly

7FFACF4D6C50 push      rbp
7FFACF4D6C51 sub       rsp,30
7FFACF4D6C55 lea       rbp,[rsp+20]
7FFACF4D6C5A mov       rax,0D5EB80FED67D
7FFACF4D6C64 mov       [rbp+8],rax
7FFACF4D6C68 test      [rsp],esp
7FFACF4D6C6B sub       rsp,100
7FFACF4D6C72 lea       rax,[rsp+20]
7FFACF4D6C77 mov       eax,100
7FFACF4D6C7C mov       rcx,0D5EB80FED67D
7FFACF4D6C86 cmp       [rbp+8],rcx
7FFACF4D6C8A je        short 00007FFACF4D6C91
7FFACF4D6C8C call      CORINFO_HELP_FAIL_FAST
7FFACF4D6C91 nop
7FFACF4D6C92 lea       rsp,[rbp+10]
7FFACF4D6C96 pop       rbp
7FFACF4D6C97 ret

The general pattern of the assembly is thus:

; Boilerplate
7FFB47536CD0 push      rbp
7FFB47536CD1 sub       rsp,30
7FFB47536CD5 lea       rbp,[rsp+20]
7FFB47536CDA mov       rax,2DEAD4CD135B
7FFB47536CE4 mov       [rbp+8],rax

; If the localsinit flag is set:
    ; eax / rax holds the value to expand the stack by.
    ; If the value is constant:
        7FFB47536CE8 mov       eax,100
        7FFB47536CED test      rax,rax
    ; If the value is dynamic:
        7FFB47536CA8 mov       rax,[rcx+8]
        7FFB47536CAC mov       eax,[rax+8]
        7FFB47536CAF mov       ecx,eax
        7FFB47536CB1 test      rcx,rcx

    ; Skip the zeroing loop entirely if the requested size is zero
    7FFB47536CF0 je        short 00007FFB47536D10

    ; Relevant section.
    7FFB47536CF2 add       rax,0F
    7FFB47536CF6 shr       rax,4
    7FFB47536CFA add       rsp,20
    7FFB47536CFE push      0
    7FFB47536D00 push      0
    7FFB47536D02 dec       rax
    7FFB47536D05 jne       short 00007FFB47536CFE

    7FFB47536D07 sub       rsp,20

; If the localsinit flag isn't set, we just subtract from rsp directly:
    7FFACF4D6C68 test      [rsp],esp
    7FFACF4D6C6B sub       rsp,100

; Boilerplate (including the return value)
7FFB47536D0B lea       rax,[rsp+20]
7FFB47536D10 mov       eax,100
7FFB47536D15 mov       rcx,2DEAD4CD135B
7FFB47536D1F cmp       [rbp+8],rcx
7FFB47536D23 je        short 00007FFB47536D2A
7FFB47536D25 call      CORINFO_HELP_FAIL_FAST
7FFB47536D2A nop
7FFB47536D2B lea       rsp,[rbp+10]
7FFB47536D2F pop       rbp
7FFB47536D30 ret

The relevant section is surprising. localsinit affects a stackalloc'd Span, even though the documentation is fairly specific in saying that this might not happen:

The content of the newly allocated memory is undefined. You should initialize it before the use. For example, you can use the Span.Clear method that sets all the items to the default value of type T.

I'll mark the DirectoryControls method to disable localsinit, then make the Span length constant. If localsinit were to be left enabled, without any intrinsics helping the zeroing along, switching to using a constant 256-byte Span is slower than a dynamic (but shorter) one. The benchmark for this is:

| Method | Mean | Error | StdDev | Ratio | RatioSD | Code Size |
|---|---|---|---|---|---|---|
| DynamicStackAlloc | 8.106 ns | 0.1953 ns | 0.2171 ns | 1.00 | 0.00 | 96 B |
| DynamicStackAllocSkipLocalsInit | 1.345 ns | 0.0419 ns | 0.0392 ns | 0.17 | 0.01 | 121 B |
| ConstantStackAlloc | 9.204 ns | 0.1208 ns | 0.1071 ns | 1.13 | 0.04 | 97 B |
| ConstantStackAllocSkipLocalsInit | 1.138 ns | 0.0274 ns | 0.0214 ns | 0.14 | 0.00 | 72 B |

It doesn't change the control-by-control benchmarks earlier in the PR though - it's a rounding error in comparison.

@PaulusParssinen
Contributor

PaulusParssinen commented May 11, 2024

IIRC, BCL is marked with [assembly: SkipLocalsInit] already. Not exactly sure if it is limited in any way.

update: If I'm reading it right, this should be enabling it for all libraries.

<PropertyGroup>
<SkipLocalsInit Condition="'$(SkipLocalsInit)' == '' and '$(MSBuildProjectExtension)' == '.csproj' and '$(IsNETCoreAppSrc)' == 'true' and '$(TargetFrameworkIdentifier)' == '.NETCoreApp'">true</SkipLocalsInit>
</PropertyGroup>
<!--Instructs compiler not to emit .locals init, using SkipLocalsInitAttribute.-->
<Choose>
<When Condition="'$(SkipLocalsInit)' == 'true'">
<PropertyGroup >
<!-- This is needed to use the SkipLocalsInitAttribute. -->
<AllowUnsafeBlocks>true</AllowUnsafeBlocks>
</PropertyGroup>
<ItemGroup>
<Compile Include="$(CommonPath)SkipLocalsInit.cs" Link="Common\SkipLocalsInit.cs" />
</ItemGroup>
</When>
</Choose>

Corrected off-by-one error in stack allocation comparison.
Made some stackallocs constant (unless a meaningful constant size would be >256 bytes) and reviewed these constant values based on real-world attribute lengths/server names.
@edwardneal
Contributor Author

Excellent, thanks. I've fixed the off-by-one error and made all but one of the stackallocs constant. In the case of attribute names, I've then cut the threshold from 256 -> 64, since the largest AD attribute name is 53 bytes.

The stackalloc which I've kept dynamic is the server name in VerifyNameControl. Looking at RFC1035, a three-label FQDN could be up to 384 bytes when UTF-16 encoded - potentially 512 bytes in the case of an enterprise AD environment. I think that at that point, the elimination of two MOVs is outweighed by the cost of always allocating 512 bytes of stack space.

Comment on lines 220 to 228
// Attribute name is optional: AD for example never returns attribute name
asnReadSuccessful = AsnDecoder.TryReadEncodedValue(asnSpan, AsnEncodingRules.BER, out Asn1Tag octetStringTag, out _, out int octetStringLength, out _);
if (asnReadSuccessful)
{
// decoding might fail as attribute is optional
o = BerConverter.Decode("{e}", value);
Debug.Assert(o != null && o.Length == 1);
Span<byte> attributeNameBuffer = octetStringLength <= AttributeNameStackAllocationThreshold ? attributeNameScratchSpace.Slice(0, octetStringLength) : new byte[octetStringLength];

result = (int)o[0];
_ = AsnDecoder.TryReadOctetString(asnSpan, attributeNameBuffer, AsnEncodingRules.BER, out _, out _);
attribute = s_utf8Encoding.GetString(attributeNameBuffer);
}
Member

TryReadEncodedValue can fail for a lot of sub-reasons; but I think you're just using it for "there was still data". In which case you should rewrite this to

if (!asnSpan.IsEmpty)
{
    scoped ReadOnlySpan<byte> attributeNameBuffer;

    if (asnSpan.Length <= AttributeNameStackAllocationThreshold)
    {
        asnReadSuccessful = AsnDecoder.TryReadOctetString(
            asnSpan,
            attributeNameScratchSpace,
            AsnEncodingRules.BER,
            out int consumed,
            out int written);

        // TryReadOctetString can't fail when the output buffer is at least as big as the input buffer.
        Debug.Assert(asnReadSuccessful);
        attributeNameBuffer = attributeNameScratchSpace.Slice(0, written);
        asnSpan = asnSpan.Slice(consumed);
    }
    else
    {
        attributeNameBuffer = AsnDecoder.ReadOctetString(
            asnSpan,
            AsnEncodingRules.BER,
            out int consumed);

        asnSpan = asnSpan.Slice(consumed);
    }

    attribute = s_utf8Encoding.GetString(attributeNameBuffer);
}

if (!asnSpan.IsEmpty)
{
    throw new BerConversionException();
}

SortResponseControl sort = ...;

Member

If this library wants to change to using array pool for "too big for the stack", then you'd go with something like your current code. But once you're using new byte[len] you should just call the array-allocating variety.

Contributor Author

I'm using the pattern of TryReadEncodedValue + TryReadOctetString to do a few things. The octet strings in question are optional, so I'm interrogating the buffer to figure out whether it's about to yield an octet string and if so, how big it is. I'm also adding a post-hoc check to make sure that it's the last thing in the sequence, in line with the prior review comment.

Member

Let's assume you have a definite-length payload: 30 04 0A 01 01 04.

  • ReadSequence will report success. The definite length encoding means it isn't going to bother checking that the children make sense.
  • ReadEnumerated will consume 3 bytes (0A 01 01). The sequence contents now have one byte left (which is never valid).
  • TryReadEncodedValue will return false. It can read a tag, but not a length, you're out of data. Nothing was corrupt, so no exception, just a false return.
  • What state are you in now?

The code I wrote is "there's more stuff. Just read it as an OCTET STRING, because that's the only optional thing remaining". That will cause an exception, because the data is insufficient and therefore corrupt. If there was NO data remaining, it'd not read anything else, and satisfy the OPTIONAL.

I don't know what BerConverter would do in this invalid-data case. If it previously rejected it (which I hope it would) then it needs to continue to be rejected. If it's super loosey goosey, then what you have is more compatible (though I think the more strict read is still better)
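
The walk-through above can be reproduced with a short sketch. `Inspect` is a hypothetical helper written for this illustration; it runs the same three reads against the example payload and reports how many bytes remain after the enumerated value, whether the lenient TryReadEncodedValue probe succeeds, and whether the strict "read it as an OCTET STRING" approach throws.

```csharp
using System;
using System.Formats.Asn1;

// The definite-length payload from the comment: SEQUENCE { ENUMERATED 1, <one stray byte> }.
byte[] payload = { 0x30, 0x04, 0x0A, 0x01, 0x01, 0x04 };

var result = Inspect(payload);
Console.WriteLine(result); // (1, False, True)

// Hypothetical helper (not from the PR).
static (int Remaining, bool TryReadSucceeded, bool StrictReadThrew) Inspect(byte[] data)
{
    // The outer SEQUENCE has a valid definite length, so this succeeds.
    AsnDecoder.ReadSequence(data, AsnEncodingRules.BER, out int offset, out int length, out _);
    ReadOnlySpan<byte> contents = data.AsSpan(offset, length);

    // Consumes 0A 01 01 and leaves a single byte behind.
    AsnDecoder.ReadEnumeratedBytes(contents, AsnEncodingRules.BER, out int consumed);
    contents = contents.Slice(consumed);

    // Lenient probe: there is a tag but no length, so this returns false without throwing.
    bool tryRead = AsnDecoder.TryReadEncodedValue(
        contents, AsnEncodingRules.BER, out _, out _, out _, out _);

    // Strict approach: any remaining data must be the optional OCTET STRING.
    bool threw = false;
    if (!contents.IsEmpty)
    {
        try
        {
            AsnDecoder.ReadOctetString(contents, AsnEncodingRules.BER, out _);
        }
        catch (AsnContentException)
        {
            threw = true;
        }
    }

    return (contents.Length, tryRead, threw);
}
```

A caller that treats a false return from the probe as "the optional field is absent" silently accepts the stray byte, whereas the strict read surfaces it as corrupt data.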

Contributor Author

Your example SortControl payload and explanation makes sense. Incidentally, BerConverter will reject that if I use the {eO} format specifier.

That payload should naturally fail for either of two reasons:

  • An octet string is expected to follow the enumerated value, and the data which actually follows that enumerated value isn't a valid octet string.
  • The data following the enumerated value isn't an octet string, so we're expecting the enumerated value to be the last thing in the sequence. It isn't.

It doesn't right now because TryReadEncodedValue will return false and the updated code will ignore the extra data inside the payload's sequence. I need to fix that when handling PageControl, SortControl and VlvResponseControl.

_keys = new SortKey[value.Length];
for (int i = 0; i < value.Length; i++)
{
_keys[i] = new SortKey(value[i].AttributeName, value[i].MatchingRule, value[i].ReverseOrder);
_keysAsnLength += 13 + (value[i].AttributeName?.Length ?? 0) + (value[i].MatchingRule?.Length ?? 0);
Member

I strongly advise ditching all of the size estimation and just using a shared writer with the default 1k first page.

@edwardneal
Contributor Author

Thanks @bartonjs - your review makes sense, and I'll apply the feedback later today.

What's your reasoning behind ditching the size estimation? It seems to be a fairly natural progression to shrink the first allocated block to a more appropriate size (since most controls will never even come close to 1KB in size.) From there, the sizes are either constant (because the control's structure is a set of fixed-size fields) or trivially calculated; in both cases, the length of each control is implied by the specification which defines it, so I don't see it as violating separation of concerns. AsnWriter will expand if the estimate is wrong, so I don't think it's creating any exceptions in the edge case. Is it a problem solely because the estimation makes it difficult to pool the AsnWriter instances, or does it have a problem in its own right?

@bartonjs
Member

What's your reasoning behind ditching the size estimation?

Assuming the objects are pooled centrally, it's that a low estimate from one area just forces a resize to happen somewhere else. And otherwise that it's a lot of fiddly calculations to explain, especially in the ones that aren't just a sequence of an integer.

If they're pooled by class that needs them, then I suppose the "this is the size I should need here" applies... at least for rigidly sized ones.

* Added limited caching of AsnWriters, as small/medium/large buckets.
* Replaced several Debug.Asserts with BerExceptions.
* Changed the type of raised exception to retain backwards-compatibility.
* Added extra checks on BER content lengths to match BerConverter.
@edwardneal
Contributor Author

edwardneal commented May 24, 2024

That makes sense, thanks. I've changed this to use a set of up to three AsnWriters based on expected payload length (0-32, 33-128, 129+) and this seems to work well. As a result, everything except for AsqRequestControl, VerifyNameControl and SortRequestControl can use the smallest pooled buffer. I expect that most of the time we'll see the three remaining controls use the middle pooled buffer, but I've added the unbounded one there as a fallback with a more appropriate initial size of 256 bytes.
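
One way such bucketing could look is sketched below. This is an assumption-laden illustration rather than the PR's implementation: the `AsnWriterCache` class, the per-thread storage and the `Rent` method are all hypothetical, and only the bucket boundaries (0-32, 33-128, 129+) and the 256-byte initial size of the largest bucket come from the comment above.

```csharp
#nullable enable
using System;
using System.Formats.Asn1;

// Usage: a small control value fits the smallest bucket.
AsnWriter writer = AsnWriterCache.Rent(expectedLength: 8);
using (writer.PushSequence())
{
    writer.WriteInteger(1L);
}
byte[] value = writer.Encode();
Console.WriteLine(Convert.ToHexString(value)); // 3003020101

// Hypothetical cache (not from the PR): one reusable writer per size bucket, per thread.
static class AsnWriterCache
{
    [ThreadStatic] private static AsnWriter? t_small;   // expected length 0-32
    [ThreadStatic] private static AsnWriter? t_medium;  // expected length 33-128
    [ThreadStatic] private static AsnWriter? t_large;   // expected length 129+

    public static AsnWriter Rent(int expectedLength)
    {
        AsnWriter writer;
        if (expectedLength <= 32)
        {
            writer = t_small ??= new AsnWriter(AsnEncodingRules.BER, 32);
        }
        else if (expectedLength <= 128)
        {
            writer = t_medium ??= new AsnWriter(AsnEncodingRules.BER, 128);
        }
        else
        {
            writer = t_large ??= new AsnWriter(AsnEncodingRules.BER, 256);
        }

        // Discard anything left over from the previous use while keeping the buffer.
        writer.Reset();
        return writer;
    }
}
```

Keeping the buckets separate means an unusually large control only grows the largest writer, so the small one stays small for the common case.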

The changes have slightly improved performance, but nothing to write home about - about 10-20ns shaved off the previous changes.

The encode/decode tests pass locally, but I've not currently got access to an AD server to perform real-world testing. I'll re-test once I do.

Labels
area-System.DirectoryServices community-contribution Indicates that the PR has been added by a community member
Development

Successfully merging this pull request may close these issues.

[mono] Test failed on windows: System.DirectoryServices.Protocols.Tests.SortRequestControlTests
3 participants