# `System.Linq.Enumerable.ToDictionary` and duplicates

Calling `.ToDictionary` [📖 [docs](https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.todictionary?view=net-8.0)] is risky whenever the source may contain duplicate keys, because it then throws a `System.ArgumentException` with this message:

```console
An item with the same key has already been added.
```

This can happen with the `data` below:

In [1]:
IEnumerable<KeyValuePair<string, string>> data = new []
{
    new KeyValuePair<string, string>("key-01", "value-01"),  //original
    new KeyValuePair<string, string>("key-02", "value-02"),
    new KeyValuePair<string, string>("key-01", "value-01b"), //duplicate with different value
    new KeyValuePair<string, string>("key-03", "value-03"),
    new KeyValuePair<string, string>("key-04", "value-04"),
    new KeyValuePair<string, string>("key-01", "value-01"),  //duplicate
};

data

| index | Key | Value |
| --- | --- | --- |
| 0 | key-01 | value-01 |
| 1 | key-02 | value-02 |
| 2 | key-01 | value-01b |
| 3 | key-03 | value-03 |
| 4 | key-04 | value-04 |
| 5 | key-01 | value-01 |


## the danger of calling `.ToDictionary` without `.DistinctBy`

Because the `data` above contains a duplicate `KeyValuePair<string, string>.Key`, calling `.ToDictionary` on it reproduces the error message:

In [2]:
data.ToDictionary(pair => pair.Key, pair => pair.Value)

Error: System.ArgumentException: An item with the same key has already been added. Key: key-01
   at System.Collections.Generic.Dictionary`2.TryInsert(TKey key, TValue value, InsertionBehavior behavior)
   at System.Collections.Generic.Dictionary`2.Add(TKey key, TValue value)
   at System.Linq.Enumerable.ToDictionary[TSource,TKey,TElement](TSource[] source, Func`2 keySelector, Func`2 elementSelector, IEqualityComparer`1 comparer)
   at System.Linq.Enumerable.ToDictionary[TSource,TKey,TElement](IEnumerable`1 source, Func`2 keySelector, Func`2 elementSelector, IEqualityComparer`1 comparer)
   at System.Linq.Enumerable.ToDictionary[TSource,TKey,TElement](IEnumerable`1 source, Func`2 keySelector, Func`2 elementSelector)
   at Submission#3.<<Initialize>>d__0.MoveNext()
--- End of stack trace from previous location ---
   at Microsoft.CodeAnalysis.Scripting.ScriptExecutionState.RunSubmissionsAsync[TResult](ImmutableArray`1 precedingExecutors, Func`2 currentExecutor, StrongBox`1 exceptionHolderOpt, Func`2 catchExceptionOpt, CancellationToken cancellationToken)
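
The failure above can be reproduced outside a notebook and handled explicitly. The following is a minimal sketch, assuming a trimmed-down copy of `data` with a single duplicate key; the `threw` and `message` variables are illustrative names, not part of the original notebook.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down copy of `data`: one duplicate key is enough to trigger the error.
var data = new[]
{
    new KeyValuePair<string, string>("key-01", "value-01"),
    new KeyValuePair<string, string>("key-02", "value-02"),
    new KeyValuePair<string, string>("key-01", "value-01b"), // duplicate key
};

bool threw = false;
string message = "no duplicate keys found";

try
{
    // Throws when "key-01" is added for the second time.
    var dictionary = data.ToDictionary(pair => pair.Key, pair => pair.Value);
}
catch (ArgumentException ex)
{
    threw = true;
    message = ex.Message;
}

Console.WriteLine(message);
```

Catching the exception only reports the problem; it does not decide which of the competing values should be kept.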

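Naïvely calling `.DistinctBy` [📖 [docs](https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.distinctby?view=net-8.0)] first avoids the exception but silently loses data: only the first pair for each key survives, so `value-01b` is dropped: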

In [3]:
data
    .DistinctBy(pair => pair.Key)
    .ToDictionary(pair => pair.Key, pair => pair.Value)

| key | value |
| --- | --- |
| key-01 | value-01 |
| key-02 | value-02 |
| key-03 | value-03 |
| key-04 | value-04 |
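
Before deciding whether dropping rows is acceptable, it can help to see exactly which keys collide. The following is a minimal sketch that rebuilds the same `data` and lists every key that occurs more than once along with all of its values; the `duplicates` variable is an illustrative name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var data = new[]
{
    new KeyValuePair<string, string>("key-01", "value-01"),
    new KeyValuePair<string, string>("key-02", "value-02"),
    new KeyValuePair<string, string>("key-01", "value-01b"),
    new KeyValuePair<string, string>("key-03", "value-03"),
    new KeyValuePair<string, string>("key-04", "value-04"),
    new KeyValuePair<string, string>("key-01", "value-01"),
};

// Keep only the keys that appear more than once, with every value seen for them.
var duplicates = data
    .GroupBy(pair => pair.Key)
    .Where(group => group.Count() > 1)
    .ToDictionary(
        group => group.Key,
        group => group.Select(pair => pair.Value).ToArray());

foreach (var entry in duplicates)
{
    // prints "key-01: value-01, value-01b, value-01"
    Console.WriteLine($"{entry.Key}: {string.Join(", ", entry.Value)}");
}
```

An empty `duplicates` dictionary would indicate that `.ToDictionary` is safe to call on the source as it stands.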


## `.GroupBy` can be used to generate a dictionary (with caveats)

We can avoid throwing the error above by calling `.GroupBy` [📖 [docs](https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.groupby?view=net-8.0)] and `.ToDictionary`:

In [4]:
Dictionary<string, string[]> dictionary = data
    .GroupBy(pair => pair.Key)
    .ToDictionary(group => group.Key, group => group.Select(pair => pair.Value).ToArray());

dictionary

| key | value |
| --- | --- |
| key-01 | [ value-01, value-01b, value-01 ] |
| key-02 | [ value-02 ] |
| key-03 | [ value-03 ] |
| key-04 | [ value-04 ] |


The first caveat here is that the output type changes from `Dictionary<string, string>` to `Dictionary<string, string[]>` (with a little help from the `.ToArray` [📖 [docs](https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.toarray?view=net-8.0)] call in the second lambda expression).
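
If a `Dictionary<string, string>` is still required, the grouping can be collapsed with an explicit rule for which value wins. The following is a minimal sketch, assuming a trimmed-down copy of `data` and two hypothetical policies named `firstWins` and `lastWins`; neither appears in the original notebook.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down copy of `data` so the two policies give different answers.
var data = new[]
{
    new KeyValuePair<string, string>("key-01", "value-01"),
    new KeyValuePair<string, string>("key-02", "value-02"),
    new KeyValuePair<string, string>("key-01", "value-01b"), // duplicate key
};

// Policy 1: keep the first value seen for each key.
Dictionary<string, string> firstWins = data
    .GroupBy(pair => pair.Key)
    .ToDictionary(group => group.Key, group => group.First().Value);

// Policy 2: keep the last value seen for each key.
Dictionary<string, string> lastWins = data
    .GroupBy(pair => pair.Key)
    .ToDictionary(group => group.Key, group => group.Last().Value);

Console.WriteLine(firstWins["key-01"]); // prints "value-01"
Console.WriteLine(lastWins["key-01"]);  // prints "value-01b"
```

Either policy still discards values; the difference from `.DistinctBy` is that the choice is spelled out in the code.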

## `.ToLookup` is the one-liner alternative to `.ToDictionary`

Unless one is several thousand percent certain that `.ToDictionary` will be called against unique keys, the safest alternative is `.ToLookup` [📖 [docs](https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.tolookup?view=net-8.0)]. This call returns `ILookup<string, string>` in the example below:

In [5]:
ILookup<string, string> lookup = data.ToLookup(i => i.Key, i => i.Value);

lookup

| index | Key | (values) |
| --- | --- | --- |
| 0 | key-01 | [ value-01, value-01b, value-01 ] |
| 1 | key-02 | [ value-02 ] |
| 2 | key-03 | [ value-03 ] |
| 3 | key-04 | [ value-04 ] |


While `Dictionary<TKey,TValue>.Keys` is defined, `ILookup<TKey,TElement>` does not have a `.Keys` property. Instead, we can project the keys with:

In [6]:
lookup.Select(i => i.Key)

However, like a dictionary, `ILookup<TKey,TElement>` has an _indexer_ [📖 [docs](https://learn.microsoft.com/en-Us/dotnet/csharp/programming-guide/indexers/)] property:

In [7]:
lookup["key-03"]

| Key | (values) |
| --- | --- |
| key-03 | [ value-03 ] |


The presence of an indexer often suggests that there is a `.Count` property:

In [8]:
lookup.Count

While the dictionary has `Dictionary<TKey,TValue>.ContainsKey(TKey)` and `Dictionary<TKey,TValue>.ContainsValue(TValue)`, this `ILookup<TKey,TElement>` instance only has the equivalent of `.ContainsKey`, which is `ILookup<TKey,TElement>.Contains(TKey)` [📖 [docs](https://learn.microsoft.com/en-us/dotnet/api/system.linq.ilookup-2.contains?view=net-8.0)]:

In [9]:
lookup.Contains("key-05")
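
The notebook cells for the key projection, `.Count`, and `.Contains` show no output, so the following minimal sketch rebuilds the same `data` and `lookup` and prints what those members return. It also shows that the lookup indexer yields an empty sequence for a missing key rather than throwing; the variable names `missingValues` and `keys` are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var data = new[]
{
    new KeyValuePair<string, string>("key-01", "value-01"),
    new KeyValuePair<string, string>("key-02", "value-02"),
    new KeyValuePair<string, string>("key-01", "value-01b"),
    new KeyValuePair<string, string>("key-03", "value-03"),
    new KeyValuePair<string, string>("key-04", "value-04"),
    new KeyValuePair<string, string>("key-01", "value-01"),
};

ILookup<string, string> lookup = data.ToLookup(pair => pair.Key, pair => pair.Value);

// Project the keys, since ILookup has no .Keys property.
string[] keys = lookup.Select(group => group.Key).ToArray();

// Count is the number of distinct keys, not the number of source rows.
Console.WriteLine(lookup.Count);              // prints "4"

// "key-05" was never added.
Console.WriteLine(lookup.Contains("key-05")); // prints "False"

// The indexer returns an empty sequence for a missing key instead of throwing.
IEnumerable<string> missingValues = lookup["key-05"];
Console.WriteLine(missingValues.Any());       // prints "False"
```

By contrast, a `Dictionary<TKey,TValue>` indexer throws `KeyNotFoundException` for a missing key, which is why `TryGetValue` is commonly used there.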

Since `ILookup<TKey,TElement>` extends `IEnumerable<IGrouping<TKey,TElement>>`, we can, of course, call `.ToDictionary` on it:

In [10]:
lookup.ToDictionary(group => group.Key, group => group.Aggregate((a, i) => $"{a},{i}"))

key,value
key-01,"value-01,value-01b,value-01"
key-02,value-02
key-03,value-03
key-04,value-04
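
As an alternative to `.Aggregate`, the values in each group can be concatenated with `string.Join`, which keeps them in source order. The following is a minimal sketch that rebuilds the same `data` and `lookup`; the `joined` variable is an illustrative name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var data = new[]
{
    new KeyValuePair<string, string>("key-01", "value-01"),
    new KeyValuePair<string, string>("key-02", "value-02"),
    new KeyValuePair<string, string>("key-01", "value-01b"),
    new KeyValuePair<string, string>("key-03", "value-03"),
    new KeyValuePair<string, string>("key-04", "value-04"),
    new KeyValuePair<string, string>("key-01", "value-01"),
};

ILookup<string, string> lookup = data.ToLookup(pair => pair.Key, pair => pair.Value);

// Each group is an IEnumerable<string>, so it can be passed to string.Join directly.
Dictionary<string, string> joined = lookup.ToDictionary(
    group => group.Key,
    group => string.Join(",", group));

Console.WriteLine(joined["key-01"]); // prints "value-01,value-01b,value-01"
Console.WriteLine(joined["key-02"]); // prints "value-02"
```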


## <!-- -->

[Bryan Wilhite is on LinkedIn](https://www.linkedin.com/in/wilhite)🇺🇸💼