Skip to content

Conversation

andrewjsaid
Copy link
Contributor

In #94667 we fixed a bug where FrozenDictionary<string, T> was incorrectly using case sensitive comparison. The solution chosen minimized the diff as the PR would be ported to .NET 8. Upon reflection, alongside fixing the bug, it did de-optimize some cases where the optimization was in fact valid. The examples below give a quick summary.

// PR94667 correctly targeted this scenario. Case sensitivity matters here even if only hashing last character.
keys1 = ["abc1", "abc2", "abc3", "abc4", "abc5", "abc6"];

// PR94667 left this scenario unchanged. Case sensitivity doesn't matter but the hash is against the entire string.
keys2 = ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"];

// PR94667 unnecessarily de-optimized this scenario. Case sensitivity doesn't matter even if only hashing last character.
keys3 = ["001", "002", "003", "004", "005", "006"];

This PR I am submitting re-introduces the optimization for scenarios like keys3 above i.e. where:

  1. The KeyAnalyzer has found a substring
  2. The entirety of all keys are ASCII
  3. There are no letters in any of the keys - not only checking the substring.

I am not submitting any benchmarks up-front as it is a re-introduction of an optimization in some cases and there are no benchmarks for case insensitive Frozen Dictionaries. Please let me know if there's any relevant benchmarks I should run.

@ghost ghost added area-System.Collections community-contribution Indicates that the PR has been added by a community member labels Nov 25, 2023
@ghost
Copy link

ghost commented Nov 25, 2023

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

Issue Details

In #94667 we fixed a bug where FrozenDictionary<string, T> was incorrectly using case sensitive comparison. The solution chosen minimized the diff as the PR would be ported to .NET 8. Upon reflection, alongside fixing the bug, it did de-optimize some cases where the optimization was in fact valid. The examples below give a quick summary.

// PR94667 correctly targeted this scenario. Case sensitivity matters here even if only hashing last character.
keys1 = ["abc1", "abc2", "abc3", "abc4", "abc5", "abc6"];

// PR94667 left this scenario unchanged. Case sensitivity doesn't matter but the hash is against the entire string.
keys2 = ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"];

// PR94667 unnecessarily de-optimized this scenario. Case sensitivity doesn't matter even if only hashing last character.
keys3 = ["001", "002", "003", "004", "005", "006"];

This PR I am submitting re-introduces the optimization for scenarios like keys3 above i.e. where:

  1. The KeyAnalyzer has found a substring
  2. The entirety of all keys are ASCII
  3. There are no letters in any of the keys - not only checking the substring.

I am not submitting any benchmarks up-front as it is a re-introduction of an optimization in some cases and there are no benchmarks for case insensitive Frozen Dictionaries. Please let me know if there's any relevant benchmarks I should run.

Author: andrewjsaid
Assignees: -
Labels:

area-System.Collections

Milestone: -

@andrewjsaid
Copy link
Contributor Author

@stephentoub I have addressed your comments. Thanks

@danmoseley
Copy link
Member

Is it worth adding benchmarks, you mention they don't exist?

@andrewjsaid
Copy link
Contributor Author

@danmoseley I would recommend that at least a few benchmarks are added to dotnet/performance with case insensitive comparison, yes.

I can try to find the time to do so but I can not reliably make that commitment at this moment.

Copy link
Member

@stephentoub stephentoub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@andrewjsaid
Copy link
Contributor Author

Sorry to pester but as it's approved, could it also be merged, please?

@stephentoub stephentoub merged commit bbb97a7 into dotnet:main Feb 15, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Mar 17, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Collections community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants