-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
String Deduplication Design Doc #31971
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for writing this down!
Related: Kestrel does string dedupe in byte->string conversion across requests on a single connection (checking if the new string for the request is the same as the one from the previous request; before trying to create it) dotnet/aspnetcore#8374. This helps with http, as the values of some of the more egregious sized strings While runtime string dedupe wouldn't help necessarily directly with these strings (as they may be unique); since it also makes the strings become longer lived it would automagically widen this deduping to be across connection for the values of other common strings This wider string deduplication for longer lived strings would be a beautiful addition; and also an advantage a non-GC'd environment would find very hard to replicate. |
Would not deduping pointers on stack and providing a attribute for fields (say: |
@benaadams as with any perf feature it's a matter of how much adoption you can get vs how much it requires to implement. stack vars tend to live for a short period of time so the benefit of deduping there isn't as obvious as old gen objects. for this release I'm not thinking about deduping refs on the stack. we might look into deduping long lived stack refs in the future. I can add a note of this in the doc. |
|
||
- We know when new LOH strings are allocated (all LOH allocations are done via C++ helpers in gchelpers.cpp), we can take the accumulative size of new LOH strings till it reaches a certain percentage of LOH before starting the next dedup cycle. | ||
|
||
- We know how much is promoted into gen2 and we can also wait till a certain percentage of gen2 consists of newly promoted memory before starting the next dedup cycle. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might want to also trigger if there is (a smaller value, numeric rather than percentage?) promoted memory and enough (idle?) time has elapsed, so it can use more idle periods to also dedpue? Also in that phase maybe check if the string count in that memory goes over a threshold prior to starting the scan?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah. when we actually start doing work we'll experiment with various tuning options to see what works best. I didn't want to put too much implementation detail in the doc.
ok...I believe I've addressed all feedback. thanks all, especially @jkotas for your excellent help! |
@benaadams meant this as a suggestions to deal with the compat problems in existing libraries that depend on string object identity. In his suggestion, not deduping pointers on the stack is a good thing (tm), not something to fix. |
|
||
- Backup nonconcurrent dedup mode | ||
|
||
If you are running into these problematic scenarios and cannot make changes, and you can tolerate longer STW (Stop-The-World) pauses, we will provide a mode where deduping is completely done in either a full blocking GC or during the STW pause of a BGC. Obviously this would make BGC’s STW pauses much, much longer. This would basically just call the function that the dedup thread calls, but in these STW pauses. If you use a CPU profiler you should be able to see preciously how long the dedup part takes during the STW pause. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The blocking mode cannot be used to workaround the compatibility problems reliably. It would work only if you can tell whether none of the threads are running the problematic code while the world is stopped which is close to impossible. All of the compat problems listed above will still be a problem even in the full blocking mode.
I doubt we would be able to recommend this as a workaround to anybody important. We cannot recommend to use something that causes crash or data corruption some of the time. Maybe as the one-shot deduplication after startup when you can know that other threads in the process are not doing anything problematic (we would need the API for that), but not as the in-flight dedup while the service is running.
|
||
1. Using the fixed keyword to modify string content. | ||
|
||
I’m seeing that there are almost 600 places in libraries that do `fixed (char*` but hopefully most of them do not actually modify the string content. We should definitely be encouraging folks to switch to using `string.Create` like what [PR#31700](https://github.com/dotnet/runtime/pull/31700) did. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use this pattern internally for implement string constructors. We will need an internal API that will be called at the end of the string constructor to mark the string as "baked" and candidate for deduping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh right, I should add this...sorry I forgot about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we won't need this method if we are not deduping pointers on stack as @benaadams pointed out. Now I remember we talked about it already, but I forgot. Either way, it would be useful to mention it in the doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh, because it can not be assigned to any field before it gets full constructed. makes sense.
Above, an opt-out attribute
The runtime would know not to touch the That way, opt-out can be granular per variable. Also, I think it is an important result of the discussion above that |
On Linux, it should be possible to use a I don't know how expensive this would be, but my gut instinct says it's going to be fine given that modifying strings shouldn't happen in most cases, and that string deduping is an opt-in feature that's not 100% free -- all the while providing an alternative to make things just work without requiring people to change their code. I also don't know if other platforms support handling page faults in user code. |
hmm, seems confusing to me 'cause I never mentioned we were deduping strings specifically pointed to by the stack. in the ref equality example, the string field was deduped because it's pointed to by an object (thus is called "_field") and the stack vars just happened to point to it. are you suggesting that while we are deduping we also check if a ref is also pointed by stack vars? that would be very difficult to do - I dunno of any performant way to do it. |
Never mentioned it wasn't deduping string pointers on the stack either, only the generation of the string? 😄 I had assumed it also would be deduping stack pointers due to the worry around mutating strings immediately after construction; as presumably they will only be pointed at by the stack (at that time). What I was saying is: if it didn't dedupe pointers on the stack (which it doesn't) and you could opt out specific fields with it (via Further to that; assuming the analyzers were robust and the deduping showed good results, there is potential for it to move to opt-out (if something like |
the design outline states "As the dedup thread goes through the old generations linearly, it looks for references to a string object (denoted by the method table) ...". so it does not do anything with the stack vars (at least in my mind that's what it meant 😆) but I can see how this detail can be missed - it certainly can be made more obvious by mentioning it specifically in the design goals section. I get what you mean now. I'll need to see how much work it'd be to implement something to specify [NoDedup] efficiently. thanks for your explanation! |
@lpereira, yes, Windows has this too. probably wouldn't want to start out on the path of handling AVs but this is something we can definitely keep in mind. I'll add a note to the doc. |
|
||
#### **Design outline** | ||
|
||
This is an opt in feature. When the runtime detects it’s turned on, it creates a dedup thread to do the work. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate more? Is this a compile flag or .config flag?
How would dll behave? Would it share the same pool? AppDomain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for chiming in :)
I think that would be turned on through an environment variable. As per our tradition, it would probably be called COMPLUS_DeduplicateString
or something similar. The environment variable is read on process startup, and it will impact the current process.
CC @jkotas @cshung