New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent nepotism with generational GCs #5016
Conversation
Could you update the code comment and the commit comment? Please add the URL to the related article about nepotism with generational GC algorithms. http://netty.io/wiki/writing-a-commit-message.html |
Hi guys. I'll update the commit message and squash the commits. How much of an explanation do you expect in a comment? Nepotism has been around as long as generational garbage collection has existed. There is a simple explanation here: http://www.memorymanagement.org/glossary/n.html . http://dl.acm.org/citation.cfm?doid=1133956.1133960 also have a good explanation of the problem. Searching on Google for generational garbage collection and nepotism leads straight to the definition. |
@varming just add the link to the motivation section and add something like "null out references" in the modifications section. So nothing fancy :) |
BTW, do you know why a LinkedHashMap was not used. The code looks like a re-implementation of LinkedHashMap that is available in the JDK. On a similar note the same nepotism issue was fixed in LinkedHashMap in JDK8: http://hg.openjdk.java.net/jdk8/jdk8/jdk/rev/d62c911aebbb#l2.165 |
Motivation: If a single Encoder object is promoted to the old generation then every object reachable from the promoted object will eventually be promoted as well. A queue illustrates the problem very well. Say a sequence of inserts and deletions generate an object graph: A -> B -> C -> D -> E -> F -> G -> H, the head of the queue is E, the tail of the queue is H, and A, B, C, D are dead. If all queue nodes are in the young generation, then a young gc will clean up the object graph and leave us with: E -> F -> G -> H on the other hand, if B and C were previously promoted to the old generation, then a young collection assumes the refernece from C to D is from a live object (this is a key result of generational gc, no need to mark the old generation). Hence the young collection assumes the refence to D is a gc root and leave us with the object graph: B-> C -> D -> E -> F -> G -> H. Eventually D, E, F, G, H, and all queue nodes ever seen from this point on will be promoted, regardless of their global live or dead status. It is generally trivial to fix nepotism issues by simply breaking the reference chain after dequeuing a node. Currently Encoder objects do not null their references when removed from the hash map. We have observed a 20X increase in promoted Encoder objects due to nepotism. Modifications: Null before, after, and next fields when removing Encoder objects from maps. Result: Fewer promoted Encoder objects, fewer Encoder objects in the old generation, shorter young collection times, old collections spaced further apart (nepotism is just really bad). Enjoy.
@varming that is an "awesome" commit message :) Thanks for this! I will cherry-pick today. |
NP, and thank you for taking the patch. Is there an interesting in removing most of the code in Encoder and use a reference to a LinkedHashMap ? I don't mind doing it. |
@Scottmitch @nmittler @trustin WDYT ? Sounds fine for me. |
@varming I believe this was done because we needed specific control over the "size" and "capacity" of the table computed in terms of the length of the strings in the entries. |
@varming I'm sure if you wanted you could create a sub-class of LinkedHashMap will all the right bits overridden to give you proper eviction behavior and then use that inside of Encoder. Not sure it's worth the trouble in this case. |
@jpinner thank you for your reply. I was looking at the code and found out that the linear list structure of the internal buckets is important for getIndex(byte[]), so I think I am going to leave this for now. |
Thanks a lot @varming :-) LGTM |
@varming - Thanks for submitting the PR! Sorry for the delay in reviewing but LGTM! |
No description provided.