
perf(redis): Don't re-encode string when computing hash #3838

Merged
merged 1 commit into from
Jul 2, 2019
Conversation

@ezimanyi (Contributor) commented Jul 2, 2019

When caching values in redis, we compare the hash of the value we're about to write against the hash of the value already in the cache, so that unchanged values aren't written again.

The hash function we're currently using is expensive because it encodes the string to bytes before hashing it. Per the documentation of putString, this encoding is only useful for cross-language compatibility; when that isn't needed, putUnencodedChars gives much better performance.

Link to header block:
https://github.com/google/guava/blob/af57f19e99110695d2b1329ae313aa5158580417/guava/src/com/google/common/hash/Hasher.java#L110
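To illustrate the difference, here is a minimal, dependency-free sketch. The class and method names (`HashCost`, `hashEncoded`, `hashUnencoded`) are hypothetical, and FNV-1a stands in for Guava's actual hash functions; structurally, though, it mirrors the two strategies: `putString` must first encode the String to a byte[], while `putUnencodedChars` reads the UTF-16 code units in place. Note that the two strategies generally produce different hash values for the same input, which is why the discussion below mentions a possible one-time cache re-write.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the two Guava Hasher strategies, using FNV-1a
// so the sketch stays dependency-free.
public class HashCost {

    // Analogous to putString(value, UTF_8): the String is first encoded
    // to a byte[], an extra allocation and copy on every hash.
    static long hashEncoded(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8); // the costly step
        long h = 0xcbf29ce484222325L;                      // FNV offset basis
        for (byte b : bytes) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;                           // FNV prime
        }
        return h;
    }

    // Analogous to putUnencodedChars(value): the UTF-16 code units are
    // hashed directly, skipping the encoding step and its allocation.
    static long hashUnencoded(String s) {
        long h = 0xcbf29ce484222325L;
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= 0x100000001b3L;
        }
        return h;
    }

    public static void main(String[] args) {
        // For pure-ASCII input these two toy hashes happen to agree, but
        // any non-ASCII character changes the encoded byte sequence, so
        // the stored hash can change when the strategy switches.
        System.out.println(hashEncoded("cache") == hashUnencoded("cache")); // true
        System.out.println(hashEncoded("café"));
        System.out.println(hashUnencoded("café"));
    }
}
```

The design point is not the hash function itself but the allocation: skipping `getBytes` avoids creating a throwaway byte[] for every cached value on every caching cycle.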

@maggieneterval (Contributor) left a comment

⬆️ 💻 #️⃣

@ezimanyi (Contributor, Author) commented Jul 2, 2019

@robzienert: Wanted to run this by you. This changes the hash implementation for storing keys in redis to be more efficient. The only downside I can see is that, in some cases where there are non-ASCII characters, it could cause a one-time unnecessary write to redis, as the hash value could change when this merges. That seems a small price to pay for the performance improvement, but I wanted a second opinion.

(The string encoding accounts for ~25% of allocated memory on Kubernetes caching cycles.)

@robzienert (Member) left a comment
Nice find! This seems good to me. Things may get laggy switching from the old hash strategy to this one, but that's a one-time price.
