Skip to content

Conversation

@h3xds1nz
Copy link
Member

@h3xds1nz h3xds1nz commented Jul 15, 2024

Description

Replaces CaseInsensitiveComparer with IComparer<string> interface (StringComparer). Also replaces ArrayList with generic List<string> for minor perf benefit and increased code quality.

  • The IComparer<string> is now stored as a readonly variable in Speller, removing constant comparer allocs on segment checks.
  • Also removes unnecessary char[] allocation/copy and its immediate conversion to string, skipping one whole alloc round.
  • Keeps the initial capacity of 1 for the list, the subsequent sorted insertions are not a bottleneck for this use-case.

I've created a very rough typical use-case benchmark that's not gonna be 100% time-accurate but the time itself isn't important, allocs are (should mimic the previous/new code).

Method Mean [ns] Error [ns] StdDev [ns] Gen0 Code Size [B] Allocated [B]
Original 163,420.2 ns 1,550.7 ns 1,450.5 ns 6.1035 10,976 B 104576 B
PR__EDIT 154,813.4 ns 1,272.3 ns 1,190.1 ns 2.1973 5,332 B 40000 B
Benchmark code
private ArrayList _alIgnoredWords;
private List<string> _ignoredList;

private readonly CultureInfo _defaultCulture; //CultureInfo.CurrentCulture;
private readonly IComparer<string> _defaultComparer; //StringComparer.Create(_defaultCulture, true);

private readonly static char[] oldHannah =  { 'h', 'a', 'n', 'n', 'a', 'h' };

[Benchmark]
public void Original()
{
    //Add some items
    IgnoreAll("hell");
    IgnoreAll("aleela");
    IgnoreAll("anna");
    IgnoreAll("hannah");
    IgnoreAll("vaka");
    IgnoreAll("cute");
    IgnoreAll("he2ll");
    IgnoreAll("al2eela");
    IgnoreAll("an2na");
    IgnoreAll("han2nah");
    IgnoreAll("vak2a");
    IgnoreAll("cu2te");
    IgnoreAll("hell4");
    IgnoreAll("aleela4");
    IgnoreAll("anna4");
    IgnoreAll("hannah4");
    IgnoreAll("vaka4");
    IgnoreAll("cute4");
    IgnoreAll("he2ll4");
    IgnoreAll("al2eela4");
    IgnoreAll("an2na4");
    IgnoreAll("han2nah4");
    IgnoreAll("vak2a4");
    IgnoreAll("cu2te4");

    //Let's run some scans
    for (int i = 0; i < 1_000; i++)
    {
        char[] word = { 'h', 'a', 'n', 'n', 'a', 'h'};
        IsIgnoredWord(word);
    }
}
[Benchmark]
public void PR__EDIT()
{
    //Add some items
    IgnoreAllV2("hell");
    IgnoreAllV2("aleela");
    IgnoreAllV2("anna");
    IgnoreAllV2("hannah");
    IgnoreAllV2("vaka");
    IgnoreAllV2("cute");
    IgnoreAllV2("he2ll");
    IgnoreAllV2("al2eela");
    IgnoreAllV2("an2na");
    IgnoreAllV2("han2nah");
    IgnoreAllV2("vak2a");
    IgnoreAllV2("cu2te");
    IgnoreAllV2("hell4");
    IgnoreAllV2("aleela4");
    IgnoreAllV2("anna4");
    IgnoreAllV2("hannah4");
    IgnoreAllV2("vaka4");
    IgnoreAllV2("cute4");
    IgnoreAllV2("he2ll4");
    IgnoreAllV2("al2eela4");
    IgnoreAllV2("an2na4");
    IgnoreAllV2("han2nah4");
    IgnoreAllV2("vak2a4");
    IgnoreAllV2("cu2te4");

    //Let's run some scans
    for (int i = 0; i < 1_000; i++)
    {
        string word = new(oldHannah);
        IsIgnoredWordV2(word);
    }
}

private void IgnoreAll(string word)
{
    if (_alIgnoredWords is null)
        _alIgnoredWords = new ArrayList(1);

    int index = _alIgnoredWords.BinarySearch(word, new CaseInsensitiveComparer(_defaultCulture));

    if (index < 0)
    {
        _alIgnoredWords.Insert(~index, word);
    }
}

private bool IsIgnoredWord(char[] word) => _alIgnoredWords?.BinarySearch(new string(word), new CaseInsensitiveComparer(_defaultCulture)) >= 0;

private void IgnoreAllV2(string word)
{
    if (_ignoredList is null)
        _ignoredList = new List<string>(1);

    int index = _ignoredList.BinarySearch(word, _defaultComparer);

    if (index < 0)
    {
        _ignoredList.Insert(~index, word);
    }
}

private bool IsIgnoredWordV2(string word) => _ignoredList?.BinarySearch(word, _defaultComparer) >= 0;

Customer Impact

Increased performance, decreased allocations.

Regression

No.

Testing

Local build, base functionality.

Risk

Low.

Microsoft Reviewers: Open in CodeFlow

@h3xds1nz h3xds1nz requested review from a team as code owners July 15, 2024 21:03
@dotnet-policy-service dotnet-policy-service bot added PR metadata: Label to tag PRs, to facilitate with triage Community Contribution A label for all community Contributions labels Jul 15, 2024
@singhashish-wpf singhashish-wpf merged commit 174c5d4 into dotnet:main Jul 19, 2024
@singhashish-wpf
Copy link
Contributor

Thanks @h3xds1nz for this contribution.

@h3xds1nz
Copy link
Member Author

Thanks @singhashish-wpf for including it!

@github-actions github-actions bot locked and limited conversation to collaborators Aug 20, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Community Contribution A label for all community Contributions PR metadata: Label to tag PRs, to facilitate with triage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants