Optimize dictionary lookups in C# #32

yuriyostapenko · 2021-03-15T19:07:05Z

Please note that a better API for avoiding double key lookups in a dictionary will hopefully arrive some time later.

Until then, this PR removes the second lookup for existing keys by wrapping the counter in a reference cell, shaving off 15-20% on my machine (on .NET 5).
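A minimal sketch of the idea, for readers skimming the thread. It is not the repository's actual code: the `WordCount` class, its method names, and the members of `Ref<T>` are assumptions made for illustration (the diff in this PR only shows the `Ref<T>` declaration line).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reference cell; only the class declaration is visible in the PR diff.
public sealed class Ref<T>
{
    public T Value;
    public Ref(T value) => Value = value;
}

public static class WordCount
{
    // "Simple" style: an existing key is hashed twice (read, then write back).
    public static Dictionary<string, int> CountSimple(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, int>();
        foreach (var word in words)
        {
            counts.TryGetValue(word, out var n); // lookup 1 (n is 0 if the key is missing)
            counts[word] = n + 1;                // lookup 2
        }
        return counts;
    }

    // Reference-cell style: an existing key is hashed once and the cell is mutated in place.
    public static Dictionary<string, Ref<int>> CountWithRef(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, Ref<int>>();
        foreach (var word in words)
        {
            if (counts.TryGetValue(word, out var cell))
                cell.Value++;                      // no second lookup
            else
                counts.Add(word, new Ref<int>(1)); // new keys still pay for the insert
        }
        return counts;
    }

    public static void Main()
    {
        var words = new[] { "the", "cat", "the" };
        Console.WriteLine(CountSimple(words)["the"]);        // 2
        Console.WriteLine(CountWithRef(words)["the"].Value); // 2
    }
}
```

The trade-off is one extra heap object per distinct word in exchange for skipping the second hash lookup on every repeated word.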

akraines · 2021-03-15T19:16:23Z

You should probably call this optimize.cs, since it isn't the simple case anymore (see the comment on the closed pull request #24).

benhoyt · 2021-03-15T19:33:38Z

Yep, this is 15% faster for me too, thanks. However, I don't want to optimize the simple version (simple.cs). I'd be happy to add this as a first cut at the optimized C# version, though. If you want to do that, please add a new optimized.cs instead, and add it to test.sh, benchmark.py, and the credits in README.md.

yuriyostapenko · 2021-03-15T20:34:10Z

Sure, to be honest I totally expected this, so I'll do almost that :)
But given it's dotnet, I think we need to introduce structure similar to Rust's, since it's very unidiomatic with just a single .cs file.

yuriyostapenko · 2021-03-15T21:07:03Z

I've made the changes mentioned above.
It will only work with the .NET 5 SDK, which is how it should be IMHO. Measuring dotnet/C# performance using mono is, respectfully, like using an alternative distribution of python2.

benhoyt · 2021-03-15T21:11:03Z

Thanks! I'll see if I can get .net going on Linux.

yuriyostapenko · 2021-03-15T21:11:46Z

Hope this helps: https://docs.microsoft.com/en-us/dotnet/core/install/linux

erikbozic · 2021-03-15T21:27:15Z

Hi,
Didn't notice you were doing the same, so I also sent a PR setting up a structure similar to Rust's for C#: #35 😅

I guess we need to consolidate. I was thinking we could do something like the TechEmpower benchmarks, but simpler.

yuriyostapenko · 2021-03-15T21:32:55Z

@erikbozic, I'm glad to see whatever structure will emerge.
I would not bother with mono performance, though, as it's going away soon, anyway.

erikbozic · 2021-03-15T21:50:48Z

By going away you're referring to .NET 6? I guess you're right.
If we don't want to have both mono and dotnet benchmarks we can just use the structure you set up in this PR. I'll close mine :)

yuriyostapenko · 2021-03-15T21:53:13Z

Yes, I did mean that .NET 6 will consume it. And mono is a weak "showcase" of modern .net potential, anyway.

benhoyt · 2021-03-16T00:04:10Z

Thank you -- just downloaded .NET for Linux (it was a "snap" :-).

Joe4evr · 2021-03-16T10:49:10Z

csharp/optimized/Program.cs

+
+class Program
+{
+    public sealed class Ref<T> {


Small note: The BCL supplies StrongBox<T> (under System.Runtime.CompilerServices) that fulfils the exact same purpose as this type.
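As a rough illustration of this suggestion, the counter can be held in a `StrongBox<int>` instead of a hand-written wrapper. This is only a sketch: the `StrongBoxCount` class and its `Count` method are hypothetical names, not code from the PR.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

public static class StrongBoxCount
{
    // Same single-lookup pattern, using the BCL's StrongBox<T> as the reference cell.
    public static Dictionary<string, StrongBox<int>> Count(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, StrongBox<int>>();
        foreach (var word in words)
        {
            if (counts.TryGetValue(word, out var box))
                box.Value++;                             // Value is a public mutable field
            else
                counts.Add(word, new StrongBox<int>(1));
        }
        return counts;
    }

    public static void Main()
    {
        var counts = Count(new[] { "a", "b", "a" });
        Console.WriteLine(counts["a"].Value); // 2
    }
}
```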

yuriyostapenko · 2021-03-16

TIL, thanks!

calledude · 2021-03-17

What is the purpose of having wrapped the integer anyway? My initial thought was that this would actually make the program less performant, because a reference type lives on the heap, whereas an int lives on the stack. Clearly there must be something I'm missing here?

erikbozic · 2021-03-17

@calledude it's a reference type, so we don't get a fresh copy of the int every time we read it from the dictionary; instead we increment the existing one. If you try it without the wrapper, you will see that all counts are just 1.
In a more general sense: we're storing a pointer to an integer instead of the integer itself.

Otherwise we would need to get it out of the dictionary, increment it, and then set the new incremented value back.
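The "all counts are just 1" effect can be reproduced with a small sketch like the following. The `CopyPitfall` class and its method are hypothetical and only illustrate value-type copy semantics, assuming the wrapper is dropped while the `TryGetValue` branch is kept.

```csharp
using System;
using System.Collections.Generic;

public static class CopyPitfall
{
    // Deliberately broken: TryGetValue hands back a copy of the int,
    // so incrementing it never touches the value stored in the dictionary.
    public static Dictionary<string, int> CountBroken(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, int>();
        foreach (var word in words)
        {
            if (counts.TryGetValue(word, out var n))
                n++;                 // increments the local copy only
            else
                counts.Add(word, 1);
        }
        return counts;
    }

    public static void Main()
    {
        var counts = CountBroken(new[] { "the", "cat", "the", "the" });
        Console.WriteLine(counts["the"]); // 1, not 3
    }
}
```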

calledude · 2021-03-17

What about a ref struct then? This would force it to still live on the stack and we would have a pointer to it, perhaps that would make it even more performant? I assume there's a tradeoff in the current solution where having it on the stack does not outweigh the cost of copying the value all the time?

Edit: Nvm, ref structs can't be used as generic type arguments.

erikbozic · 2021-03-17

To be honest, I haven't looked into that part too much.
A trace I did earlier shows that the actual lookup takes the most time, followed by string creation. (I'm not sure it was taken on the latest version of the code, but it should be close.)

So I'm focusing on that first.

calledude · 2021-03-17

FWIW; According to BenchmarkDotNet int is still faster.

yuriyostapenko · 2021-03-17

> FWIW; According to BenchmarkDotNet int is still faster.

@calledude, I'm not sure what exactly you are measuring there, but if you simply removed the wrapping object and still use the rest of the optimized code as-is (that is, using TryGetValue), your faster code does not work correctly. As @erikbozic already pointed out, your counters will never increment and will always stay at 1.

Yes, the extra object indirection adds overhead, but this code is still faster than the code in "simple", because without the wrapper the code has to do the hash lookup twice: first to read the value and then to store it back again. That is because the current Dictionary API does not support atomic increment or in-place value modification.

As for your question on ref struct: the whole Dictionary lives on the heap, so anything you store in it has to live on the heap as well, which a ref struct cannot.
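For context on the missing in-place modification: later .NET versions (6 and newer) ship `CollectionsMarshal.GetValueRefOrAddDefault`, which returns a reference to the stored value. It was not available on the .NET 5 SDK used in this PR, and whether it is the API hinted at in the PR description is an assumption. A minimal sketch, with the hypothetical class name `InPlaceCount`:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public static class InPlaceCount
{
    // Requires .NET 6 or later. One hash lookup per word, no wrapper object.
    public static Dictionary<string, int> Count(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, int>();
        foreach (var word in words)
        {
            // Returns a ref to the existing value, or to a newly added default (0).
            ref int slot = ref CollectionsMarshal.GetValueRefOrAddDefault(counts, word, out _);
            slot++; // the dictionary must not be modified while the ref is in use
        }
        return counts;
    }

    public static void Main()
    {
        var counts = Count(new[] { "the", "cat", "the" });
        Console.WriteLine(counts["the"]); // 2
    }
}
```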

calledude · 2021-03-17

Just double checked and I could've sworn I had made sure I was actually incrementing the value correctly, but I guess not then.

Not my proudest moment... :) Sorry about that!

Joe4evr · 2021-03-16T10:52:00Z

csharp/optimized/Program.cs

+        string line;
+        while ((line = Console.ReadLine()) != null)
+        {
+            line = line.ToLower();


Could save this potential string allocation by instantiating the dictionary with one of the *IgnoreCase StringComparers. Additionally, that would make explicit whether the algorithm is culture-aware or not.
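A sketch of what this suggestion might look like, using `StringComparer.OrdinalIgnoreCase` as one possible (culture-insensitive) choice. The `IgnoreCaseCount` class is a hypothetical name, and the remark about lowercasing on output assumes the benchmark expects lowercase words.

```csharp
using System;
using System.Collections.Generic;

public static class IgnoreCaseCount
{
    // The comparer folds case during hashing and equality checks,
    // so no lowercased copy of each word is allocated.
    public static Dictionary<string, int> Count(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        foreach (var word in words)
        {
            counts.TryGetValue(word, out var n);
            counts[word] = n + 1; // the stored key keeps its first-seen casing
        }
        return counts;
    }

    public static void Main()
    {
        var counts = Count(new[] { "The", "the", "THE", "cat" });
        Console.WriteLine(counts["the"]); // 3
        Console.WriteLine(counts.Count);  // 2
    }
}
```

Because keys keep the casing of their first occurrence ("The" here), each distinct key would still need to be lowercased once when printing if lowercase output is required.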

yuriyostapenko · 2021-03-16

To be honest, I intentionally did not include any of those optimizations that were plentiful in the many parallel PRs :)

benhoyt mentioned this pull request Mar 15, 2021

Optimized C# version #27

Closed

Add optimized C# version and reorganize to use proper modern .net sdk…

0d594a0

… build instead of mono

Add optimized C# to benchmark

4117b23

erikbozic mentioned this pull request Mar 15, 2021

Seperate mono and netcore runtimes for csharp #35

Closed

benhoyt added 2 commits March 16, 2021 13:01

Merge branch 'master' into cs-dictionary

74534fe

Add credits

5f1553b

benhoyt merged commit a823d73 into benhoyt:master Mar 16, 2021

Joe4evr reviewed Mar 16, 2021
