-
Notifications
You must be signed in to change notification settings - Fork 5.4k
Description
See https://github.com/dotnet/corefx/issues/27625#issuecomment-369791334 for full discussion.
dotnet/coreclr#16926 fixes the major problem which is unbounded growth. However in #25272 it's also observed that Insert has a potentially suboptimal chunk use strategy:
Insert does not try hard to reuse space (it won't shuffle along more than 32 characters) so it's fairly easy to end up prepending a new block.
Those prepended blocks are made only as big as needed but no less than the default of 16. So Inserts of less than 16 leave slack space that will remain unused unless there are more inserts into that chunk no more than 16 from the end of it so that its contents get shuffled along. If there's an insert at the end of that block it will create a new block rather than writing to the dead area. eg given
|0123456789012345| (where | is block end) inserting 0123456789 leaves us with |0123456789......|0123456789012345| . This leaves 6 array entries permanently dead unless there is another insert after the 10th character.
Still one could argue it is optimizing for inserting directly after the last insert rather than at the front again. However it doesn't do that either as then inserting abc at offset 10 gives us |0123456789......|abc.............|0123456789012345| rather than |0123456789abc..|0123456789012345| as you might expect.
Two potential changes to make here
-
When insert pre-pends a new first chunk, consider writing the characters to the end of the chunk rather than the start of it since it's probably more likely that a subsequent Insert would be at index 0 rather than index n+1. (As for insert adding interior chunks, I do not know whether the behavior is better one way or the other)
-
Insert should certainly write to slack space before making a new block.
StringBuilder is intentionally heavily optimized for Append not Insert and it's already complex. Any changes to improve Insert must not impair Append nor make the implementation even more complex.