
Reduce allocations in Interner #2481

Merged
merged 2 commits into from Jan 9, 2017

Conversation

ReubenBond (Member)

Cleans up the Interner class so that allocations do not occur on the hot path.

  • Avoid allocating a closure for the createFunc parameter if possible
  • Avoid closure allocations within Interner.FindOrCreate
  • Converted to WeakReference<T> instead of WeakReference in order to simplify the code.
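The pattern those bullets describe can be sketched as follows. This is a minimal illustration assembled from the snippets quoted later in the thread, not the exact PR code; the names `Interner`, `FindOrCreate`, `internCache`, and `creatorFunc` follow the thread's usage.

```csharp
using System;
using System.Collections.Concurrent;

public class Interner<K, T> where T : class
{
    private readonly ConcurrentDictionary<K, WeakReference<T>> internCache =
        new ConcurrentDictionary<K, WeakReference<T>>();

    public T FindOrCreate(K key, Func<K, T> creatorFunc)
    {
        T result;
        WeakReference<T> cacheEntry;
        if (internCache.TryGetValue(key, out cacheEntry))
        {
            // Cache hit with a live target: no allocation on this path.
            if (cacheEntry.TryGetTarget(out result)) return result;

            // Entry exists but its target was collected: create a new value
            // and re-insert it into the cache.
            result = creatorFunc(key);
            cacheEntry.SetTarget(result);
            internCache[key] = cacheEntry;
            return result;
        }

        // No entry at all: create and insert a new weak reference.
        // The key is passed to creatorFunc so callers can use a static
        // lambda that captures no locals (no closure allocation).
        result = creatorFunc(key);
        internCache[key] = new WeakReference<T>(result);
        return result;
    }
}
```

The key design point is that `creatorFunc` receives the key as a parameter instead of capturing it, so callers can supply a non-capturing lambda that the compiler caches as a static delegate.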

@gabikliot gabikliot self-assigned this Dec 5, 2016
@sergeybykov sergeybykov added this to the 1.4.0 milestone Dec 6, 2016
@ReubenBond (Member Author)

Functionals passed, except for ActivationCollectorShouldNotCollectBusyStatelessWorkers, which failed in CI but passed locally.

@jason-bragg (Contributor)

@dotnet-bot test this please

WeakReference<T> cacheEntry;
return internCache.TryGetValue(key, out cacheEntry) && cacheEntry != null && cacheEntry.TryGetTarget(out obj);
Contributor

&& obj != null?

Member Author

That's guaranteed by TryGetTarget
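The guarantee being referenced can be shown with a small sketch (illustrative only, not PR code):

```csharp
using System;

var obj = new object();
var weakRef = new WeakReference<object>(obj);

object target;
if (weakRef.TryGetTarget(out target))
{
    // TryGetTarget only returns true when it can hand back a live target,
    // so target is guaranteed non-null here; no extra null check is needed.
}

// Keep a strong reference alive until this point so the target cannot be
// collected during the demonstration.
GC.KeepAlive(obj);
```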

// Create new object and ensure the entry is still valid by re-inserting it into the cache.
result = creatorFunc(key);
cacheEntry.SetTarget(result);
internCache[key] = cacheEntry;
Contributor

Are you not worried about the different atomicity semantics now (AddOrUpdate vs. what you do now)?
I'm not saying it's wrong, just drawing your attention to the fact that you changed it.

Member Author

We were using AddOrUpdate just like a setter (our update lambda simply replaces the original value), so there is no added safety in AddOrUpdate over the plain index setter.
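The equivalence can be illustrated with hypothetical values: when the update lambda ignores the old value, AddOrUpdate degenerates into an overwrite, so the thread-safe index setter produces the same end state without allocating the delegate arguments.

```csharp
using System.Collections.Concurrent;

var cache = new ConcurrentDictionary<string, int>();

// Before: the update lambda discards the existing value entirely.
cache.AddOrUpdate("key", 42, (k, oldValue) => 42);

// After: a plain overwrite via the indexer, which ConcurrentDictionary
// performs atomically. Same end state, no delegate allocations.
cache["key"] = 42;
```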

internCache.TryGetValue(key, out cacheEntry);

// If no cache entry exists, create and insert a new one using the creator function.
if (cacheEntry == null)
Contributor

The atomicity is different now: before, GetOrAdd was atomic, but the new code is not. Is that a problem?

Member Author (ReubenBond, Dec 9, 2016)

The Func passed to GetOrAdd is invoked outside of any lock. This change effectively inlines that logic so that we do not need to allocate the closure.

This is the GetOrAdd code:

TValue resultingValue;
if (this.TryGetValue(key, out resultingValue)) return resultingValue;
this.TryAddInternal(
  key,
  valueFactory(key),
  updateIfExists: false,
  acquireLock: true,
  out resultingValue);
return resultingValue;

There's no point allocating the valueFactory closure (which captures creatorFunc); it makes more sense, in my opinion, to call TryGetValue and, if the value doesn't exist or is no longer valid, create and set the value.

Additionally, the purpose of this class is to cache immutable values (or am I wrong about this?). If the values stored in this class do mutate, that is an issue, i.e., the user is abusing this class.

Thanks for calling this out, by the way. I should have preemptively added a comment here to discuss my reasoning.

Contributor

What if, under if (cacheEntry == null), we simply called cacheEntry = internCache.GetOrAdd() like before, to maintain the behavior in the add case? While the code looks safe to me for how it's used, it's not equivalent to the internal implementation of GetOrAdd and can create multiple copies of the cached item. Not a big deal, though, IMO.



Member Author

@sergeybykov

it's not equivalent to the internal implementation of GetOrAdd and can create multiple copies of the cached item

GetOrAdd can, too, since its valueFactory is executed outside of the lock (see valueFactory(key) above). The reason to go this route over GetOrAdd is that it avoids the allocations while offering the same semantics and guarantees.

Contributor

Maybe I'm missing something. Doesn't
updateIfExists: false,
acquireLock: true,
prevent it from creating multiple copies?

Member Author

@sergeybykov the copy is made before the call. Note that the signature of the function is

private bool TryAddInternal(TKey key, TValue value, bool updateIfExists, bool acquireLock, out TValue resultingValue)

The call to valueFactory is made to obtain value prior to invoking TryAddInternal.

Member Author

The GetOrAdd docs say this:

If you call GetOrAdd simultaneously on different threads, addValueFactory may be called multiple times, but its key/value pair might not be added to the dictionary for every call.
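That documented behavior can be sketched like this (illustrative only; the key and thread count are hypothetical):

```csharp
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

var dict = new ConcurrentDictionary<int, object>();
int factoryCalls = 0;

// Several threads race to GetOrAdd the same key. Each may invoke the
// factory, but only one produced value ends up published in the dictionary.
Parallel.For(0, 8, _ =>
    dict.GetOrAdd(1, key =>
    {
        Interlocked.Increment(ref factoryCalls);
        return new object();
    }));

// factoryCalls can exceed 1, yet the dictionary holds exactly one entry for
// key 1; the "extra" objects are created and then discarded, which is why
// the factory should be side-effect free.
```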

@ReubenBond (Member Author)

A benchmark verifies that effectively no allocations are performed in the new implementation:

Host Process Environment Information:
BenchmarkDotNet.Core=v0.9.9.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Xeon(R) CPU E3-1230 v5 3.40GHz, ProcessorCount=8
Frequency=3328123 ticks, Resolution=300.4697 ns, Timer=TSC
CLR=MS.NET 4.0.30319.42000, Arch=32-bit RELEASE
GC=Non-concurrent Server
JitModules=clrjit-v4.6.2006.0

Type=InternerBenchmarks  Mode=Throughput

      Method |      Median |    StdDev |  Gen 0 | Gen 1 | Gen 2 | Bytes Allocated/Op |
------------ |------------ |---------- |------- |------ |------ |------------------- |
    Interner | 109.9614 ns | 2.6255 ns |      - |     - |     - |               0.01 |
 OldInterner | 190.1600 ns | 4.0172 ns | 225.00 |     - |     - |              78.76 |

The benchmark code was simple:

public class InternerBenchmarks
{
    private const int NumKeys = 1;
    private const int NumIterationsPerKey = 2;
    private readonly Interner<string, Tuple<string, int>> interner = new Interner<string, Tuple<string, int>>();
    private readonly OldInterner<string, Tuple<string, int>> oldInterner = new OldInterner<string, Tuple<string, int>>();
    private readonly List<string> keys;

    public InternerBenchmarks()
    {
        this.keys = new List<string>(NumKeys);
        for (var i = 0; i < NumKeys; i++)
        {
            this.keys.Add(i.ToString());
        }
    }

    [Benchmark]
    public int Interner()
    {
        var result = 0;
        for (int i = 0; i < NumIterationsPerKey; i++)
        {
            foreach (var key in keys)
            {
                result += this.interner.FindOrCreate(key, k => Tuple.Create(k, k.Length)).Item2;
            }
        }

        return result;
    }

    [Benchmark]
    public int OldInterner()
    {
        var result = 0;
        for (int i = 0; i < NumIterationsPerKey; i++)
        {
            foreach (var key in keys)
            {
                result += this.oldInterner.FindOrCreate(key, () => Tuple.Create(key, key.Length)).Item2;
            }
        }

        return result;
    }
}

@ReubenBond (Member Author)

Apologies for the prod, but is someone able to look at this?

@ReubenBond ReubenBond closed this Jan 3, 2017
@ReubenBond ReubenBond reopened this Jan 3, 2017
@sergeybykov sergeybykov merged commit cce5669 into dotnet:master Jan 9, 2017
@ReubenBond ReubenBond deleted the reduce-allocs-interner branch January 9, 2017 21:51
@ReubenBond ReubenBond added area-performance All performance related issues and removed performance labels Mar 24, 2022
@github-actions github-actions bot locked and limited conversation to collaborators Dec 11, 2023
Labels: area-performance (All performance related issues)

5 participants