Reduce allocations in Interner #2481
Conversation
Functionals passed except for
@dotnet-bot test this please
}
return false;
WeakReference<T> cacheEntry;
return internCache.TryGetValue(key, out cacheEntry) && cacheEntry != null && cacheEntry.TryGetTarget(out obj);
&& obj != null?
That's guaranteed by TryGetTarget
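For context, WeakReference&lt;T&gt;.TryGetTarget returns true only when it can produce a live, non-null target, which is why the extra null check on obj is unnecessary. A minimal sketch of that contract:

```csharp
using System;

class TryGetTargetDemo
{
    static void Main()
    {
        var reference = new WeakReference<string>("hello");

        // TryGetTarget returns true only while the target is still alive;
        // when it returns true, 'value' is guaranteed to be non-null.
        if (reference.TryGetTarget(out string value))
        {
            Console.WriteLine(value);
        }
    }
}
```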
// Create new object and ensure the entry is still valid by re-inserting it into the cache.
result = creatorFunc(key);
cacheEntry.SetTarget(result);
internCache[key] = cacheEntry;
Are you not worried about the different atomicity semantics now (AddOrUpdate vs. what you do now)?
I am not saying it's wrong, just drawing your attention to the fact that you changed it.
We were using AddOrUpdate just like a setter (our update lambda just replaces the original value). There is no added safety in performing AddOrUpdate over just using the index setter.
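To illustrate that point: when the update lambda discards the existing value, AddOrUpdate degenerates into an unconditional overwrite, which is exactly what the indexer does. A hedged sketch (the cache and key names here are illustrative, not from the PR):

```csharp
using System.Collections.Concurrent;

class SetterEquivalence
{
    static void Main()
    {
        var cache = new ConcurrentDictionary<string, int>();

        // The update lambda ignores the old value, so this call...
        cache.AddOrUpdate("key", 1, (k, oldValue) => 1);

        // ...is observably equivalent to the index setter,
        // except that AddOrUpdate allocates delegates.
        cache["key"] = 1;
    }
}
```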
internCache.TryGetValue(key, out cacheEntry);

// If no cache entry exists, create and insert a new one using the creator function.
if (cacheEntry == null)
It has different atomicity now: before, GetOrAdd was atomic; now it is not. Is that a problem?
The Func passed to GetOrAdd is invoked outside of any lock. This effectively inlines the logic so that we do not need to allocate that closure. This is the GetOrAdd code:
TValue resultingValue;
if (this.TryGetValue(key, out resultingValue)) return resultingValue;
this.TryAddInternal(
key,
valueFactory(key),
updateIfExists: false,
acquireLock: true,
out resultingValue);
return resultingValue;
There's no point allocating the valueFactory closure (it captures creatorFunc); it makes more sense (in my opinion) to just call TryGetValue and, if the value doesn't exist or is no longer valid, to create and set the value.
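For readers following along, the lookup-then-set shape being described can be sketched roughly as follows. This is a reconstruction from the diff fragments quoted in this thread, not the PR's actual code:

```csharp
using System;
using System.Collections.Concurrent;

class Interner<TKey, T> where T : class
{
    private readonly ConcurrentDictionary<TKey, WeakReference<T>> internCache =
        new ConcurrentDictionary<TKey, WeakReference<T>>();

    public T FindOrCreate(TKey key, Func<TKey, T> creatorFunc)
    {
        T result;
        WeakReference<T> cacheEntry;
        internCache.TryGetValue(key, out cacheEntry);

        // No entry yet: create one. creatorFunc is invoked directly,
        // so no closure capturing it is ever allocated.
        if (cacheEntry == null)
        {
            result = creatorFunc(key);
            internCache[key] = new WeakReference<T>(result);
            return result;
        }

        // Entry exists and the target is still alive.
        if (cacheEntry.TryGetTarget(out result))
        {
            return result;
        }

        // Entry exists but its target was collected: recreate the value
        // and re-insert the entry to ensure it is still valid.
        result = creatorFunc(key);
        cacheEntry.SetTarget(result);
        internCache[key] = cacheEntry;
        return result;
    }
}
```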
Additionally, the purpose of this class is to cache immutable values (or am I wrong about this?). If the values stored in this class do mutate, then that's an issue, i.e., the user is abusing this class.
Thanks for calling this out, by the way. I should have preemptively added a comment here to discuss my reasoning.
What if, under if (cacheEntry == null), we simply called cacheEntry = internCache.GetOrAdd() like before, to maintain the behavior in the add case? While the code looks safe to me for how it's used, it's not equivalent to the internal implementation of GetOrAdd and can create multiple copies of the cached item. Not a big deal though, IMO.
"it's not equivalent to the internal implementation of GetOrAdd and can create multiple copies of the cached item"

GetOrAdd can, too, since its factory is executed outside of the lock (look at valueFactory(key) above). The reason to go with this route over GetOrAdd is that it avoids allocations and offers the same semantics/guarantees.
Maybe I'm missing something. Doesn't
updateIfExists: false,
acquireLock: true,
prevent it from creating multiple copies?
@sergeybykov the copy is made before the call. Note that the signature of the function is

private bool TryAddInternal(TKey key, TValue value, bool updateIfExists, bool acquireLock, out TValue resultingValue)

The call is made to obtain value prior to invoking TryAddInternal.
The GetOrAdd docs say this:
If you call GetOrAdd simultaneously on different threads, addValueFactory may be called multiple times, but its key/value pair might not be added to the dictionary for every call.
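That documented behavior is easy to observe. The following sketch (illustrative names, not from the PR) races several threads on the same key; the factory may run more than once, but only one result is stored:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

class FactoryRaceDemo
{
    static void Main()
    {
        var cache = new ConcurrentDictionary<string, int>();
        int factoryCalls = 0;

        // Eight threads call GetOrAdd on the same key concurrently.
        Parallel.For(0, 8, _ =>
        {
            cache.GetOrAdd("key", k =>
            {
                Interlocked.Increment(ref factoryCalls);
                Thread.Sleep(10); // widen the race window
                return k.Length;
            });
        });

        // The factory may have run more than once, but every caller
        // observes the single value that won the race to be added.
        Console.WriteLine($"factory calls: {factoryCalls}");
        Console.WriteLine($"cached value: {cache["key"]}");
    }
}
```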
A benchmark verifies that no allocations are performed in the new implementation:

Host Process Environment Information:
BenchmarkDotNet.Core=v0.9.9.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Xeon(R) CPU E3-1230 v5 3.40GHz, ProcessorCount=8
Frequency=3328123 ticks, Resolution=300.4697 ns, Timer=TSC
CLR=MS.NET 4.0.30319.42000, Arch=32-bit RELEASE
GC=Non-concurrent Server
JitModules=clrjit-v4.6.2006.0
Type=InternerBenchmarks Mode=Throughput
Method | Median | StdDev | Gen 0 | Gen 1 | Gen 2 | Bytes Allocated/Op |
------------ |------------ |---------- |------- |------ |------ |------------------- |
Interner | 109.9614 ns | 2.6255 ns | - | - | - | 0.01 |
OldInterner | 190.1600 ns | 4.0172 ns | 225.00 | - | - | 78.76 |

The benchmark code was simple:

public class InternerBenchmarks
{
private const int NumKeys = 1;
private const int NumIterationsPerKey = 2;
private readonly Interner<string, Tuple<string, int>> interner = new Interner<string, Tuple<string, int>>();
private readonly OldInterner<string, Tuple<string, int>> oldInterner = new OldInterner<string, Tuple<string, int>>();
private readonly List<string> keys;
public InternerBenchmarks()
{
this.keys = new List<string>(NumKeys);
for (var i = 0; i < NumKeys; i++)
{
this.keys.Add(i.ToString());
}
}
[Benchmark]
public int Interner()
{
var result = 0;
for (int i = 0; i < NumIterationsPerKey; i++)
{
foreach (var key in keys)
{
result += this.interner.FindOrCreate(key, k => Tuple.Create(k, k.Length)).Item2;
}
}
return result;
}
[Benchmark]
public int OldInterner()
{
var result = 0;
for (int i = 0; i < NumIterationsPerKey; i++)
{
foreach (var key in keys)
{
result += this.oldInterner.FindOrCreate(key, () => Tuple.Create(key, key.Length)).Item2;
}
}
return result;
}
}
Apologies for the prod, but is someone able to look at this?
Cleans up the Interner class so that allocations do not occur on the hot path:

- Avoids allocating a closure for the createFunc parameter, if possible, in Interner.FindOrCreate
- Uses WeakReference<T> instead of WeakReference in order to simplify the code.