Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for #34117, Improve Hash Code Distribution for RelationalCommandCache Keys #34118

Merged
merged 2 commits into from
Jul 9, 2024

Conversation

ADNewsom09
Copy link
Contributor

@ADNewsom09 ADNewsom09 commented Jun 28, 2024

This PR addresses a performance issue in the RelationalCommandCache class where the CommandCacheKey struct uses a hard-coded GetHashCode implementation that returns 0. This implementation leads to all instances being placed in the same bucket within the MemoryCache's internal ConcurrentDictionary, causing linear search times and negating the performance benefits of the hash-based collection.

See Issue #34117

Changes

  • Modified the GetHashCode implementation in the CommandCacheKey struct to use RuntimeHelpers.GetHashCode(_queryExpression).

  • Ensured that the new implementation maintains the contract with the overridden Equals method, which first checks for reference equality.

…Code

Switched to RuntimeHelpers.GetHashCode over .GetHashCode to avoid any overridden GetHashCode methods that might need additional processing to propagate hash calculations down the expression tree.
@ADNewsom09 ADNewsom09 marked this pull request as ready for review July 1, 2024 13:11
@AndriySvyryd AndriySvyryd requested a review from roji July 1, 2024 23:54
Copy link
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks... Good catch on the perf issue, and the fix looks good. This will bucketize RelationalCommandCache entries based on their query expressions, which is indeed an improvement.

Note that we can create more high-quality keys by also including parameter nullability information in the hashcode, exactly as the Equals method does. As the PR currently is, all entries for a given query are bucketized together (since there's no parameter nullability differentiation in the hashcode), and so there will be linear searching between the different cache entries for the query. If we include parameter nullability, that would be improved.

Let me know if this makes sense to you and is something you want to implement in this PR; if not, I'm still fine with merging it as it's a clear improvement over the existing situation.

@ADNewsom09
Copy link
Contributor Author

I'm good with this, I understand the additions you are referring to but will leave those as an improvement for when time allows. Thanks.

@roji roji merged commit 966df0e into dotnet:main Jul 9, 2024
7 checks passed
@ADNewsom09 ADNewsom09 deleted the patch-1 branch July 10, 2024 14:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve performance for RelationalCommandCache by adding a hashcode to the Command
2 participants