Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize TableId cache performance #1781

Closed
milleruntime opened this issue Nov 16, 2020 · 5 comments
Closed

Optimize TableId cache performance #1781

milleruntime opened this issue Nov 16, 2020 · 5 comments
Labels
enhancement This issue describes a new feature, improvement, or optimization.

Comments

@milleruntime
Copy link
Contributor

The TableId class uses a guava cache for all accesses to table ID and it currently uses weakValues(). According to the Guava javadoc, weakValues are poor for caching and softValues should be considered instead. It would be good to compare the performance of the two to see which is better. Since a cluster with high table turnover may see different performance than one that maintains the same set of tables, we may want to make this configurable/pluggable.

@milleruntime milleruntime added the enhancement This issue describes a new feature, improvement, or optimization. label Nov 16, 2020
@ctubbsii
Copy link
Member

ctubbsii commented Nov 16, 2020

SoftReferences are cleaned up at the discretion of the Garbage Collector. They are kept around longer, which is better for caching, because it is more likely you'll get a cache hit. However, that's not how we're using this cache. We're using WeakReferences, specifically because we're only using it for deduplication/canonicalization, and not cache hit performance. The javadoc for WeakReference says "Weak references are most often used to implement canonicalizing mappings.", which is how we are using it. The javadoc for SoftReference says "Soft references are most often used to implement memory-sensitive caches.", which is not how we are using it.

I think what we have is correct. Also, this is a slight internal optimization so we don't have a ton of internal objects referring to the same table id (canonicalization for storage optimization), essentially the same as String interning. It is not something that is cached for performance/speed, as it is not speed-critical code, and I certainly don't think it would be worth making pluggable.

@milleruntime
Copy link
Contributor Author

OK sounds good.

@milleruntime
Copy link
Contributor Author

It is probably worth mentioning this in the code since there are no comments about the cacheing in the class.

@ctubbsii
Copy link
Member

I will add some comments.

@ctubbsii
Copy link
Member

Added comments in #1782

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement This issue describes a new feature, improvement, or optimization.
Projects
None yet
Development

No branches or pull requests

2 participants