## 7 ways to count occurrences in a list

Let's start with a list of animals. How do we count how many times each animal shows up?

In [1]:
animals = ["cat", "dog", "cat", "fish", "cat"]

### Method 1: try/except

In [2]:
d = {}
for key in animals:
    try:
        d[key] += 1
    except KeyError:
        d[key] = 1
d

{'cat': 3, 'dog': 1, 'fish': 1}

### Method 2: if statement with `in` expression, variant 1

In [3]:
d = {}
for key in animals:
    if key not in d:
        d[key] = 1
    else:
        d[key] += 1
d

{'cat': 3, 'dog': 1, 'fish': 1}

### Method 3: if statement with `in` expression, variant 2

In [25]:
d = {}
for key in animals:
    if key not in d:
        d[key] = 0
    d[key] += 1
d

{'cat': 3, 'dog': 1, 'fish': 1}

### Method 4: setdefault

In [22]:
d = {}
for key in animals:
    d.setdefault(key, 0)
    d[key] += 1
d

{'cat': 3, 'dog': 1, 'fish': 1}
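
To make it clearer why the `setdefault` call works, here is a small sketch of what it returns in each case. The names `first`, `second`, and `counts` are only for illustration. Because `setdefault` returns the stored value, the two-line loop body can also be folded into one line, which is a matter of taste rather than a recommendation.

```python
animals = ["cat", "dog", "cat", "fish", "cat"]

d = {}
first = d.setdefault("cat", 0)   # key missing: stores 0 and returns it
d["cat"] += 1
second = d.setdefault("cat", 0)  # key present: returns the stored value, default ignored

# One-line variant of the loop body, using the return value directly.
counts = {}
for key in animals:
    counts[key] = counts.setdefault(key, 0) + 1
```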

### Method 5: get

In [6]:
d = {}
for key in animals:
    d[key] = d.get(key, 0) + 1
d

{'cat': 3, 'dog': 1, 'fish': 1}

### Method 6: defaultdict

In [24]:
from collections import defaultdict

d = defaultdict(int)
for key in animals:
    d[key] += 1
d

defaultdict(int, {'cat': 3, 'dog': 1, 'fish': 1})
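
One thing to watch with `defaultdict`: reading a missing key with square brackets inserts that key with the default value. The sketch below illustrates this; the keys `"bird"` and `"horse"` and the variable names are made up for the example.

```python
from collections import defaultdict

animals = ["cat", "dog", "cat", "fish", "cat"]

counts = defaultdict(int)  # int() returns 0, so missing keys start at 0
for animal in animals:
    counts[animal] += 1

# Reading a missing key with [] inserts it with the default value.
before = len(counts)
_ = counts["bird"]
after = len(counts)

# A membership test with `in` does not insert anything.
has_horse = "horse" in counts

# Convert to a regular dict once counting is done.
plain = dict(counts)
```

If the side effect is unwanted, converting to a plain `dict` after counting (as in the last line) restores the usual `KeyError` on missing keys.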

### Method 7: Counter (recommended)

In [8]:
from collections import Counter

Counter(animals)

Counter({'cat': 3, 'dog': 1, 'fish': 1})
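
Part of the reason `Counter` is a convenient default is that it comes with a few helpers beyond plain counting. A minimal sketch of three of them follows; the variable names and the extra `"dog"` entries are illustrative only.

```python
from collections import Counter

animals = ["cat", "dog", "cat", "fish", "cat"]
counts = Counter(animals)

# The n most common items as (item, count) pairs.
top = counts.most_common(1)

# Missing keys read as 0 and are not added to the counter.
missing = counts["bird"]

# Add more observations in place.
counts.update(["dog", "dog"])
```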

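### Other methods

The third-party `toolz` library provides a `frequencies` function. `cytoolz` is its Cython implementation, and `tlz` uses `cytoolz` when it is installed and falls back to `toolz` otherwise.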

In [9]:
from toolz import frequencies

frequencies(animals)

{'cat': 3, 'dog': 1, 'fish': 1}

In [10]:
from cytoolz import frequencies

frequencies(animals)

{'cat': 3, 'dog': 1, 'fish': 1}

In [11]:
from tlz import frequencies

frequencies(animals)

{'cat': 3, 'dog': 1, 'fish': 1}

Timing `Counter` against `frequencies` on ten million distinct integers:

In [12]:
L = list(range(10_000_000))

In [13]:
%timeit dict(Counter(L))

334 ms ± 2.84 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [14]:
%timeit frequencies(L)

295 ms ± 5.43 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


And on ten million copies of the same value:

In [15]:
L2 = [0 for _ in range(10_000_000)]

In [16]:
%timeit dict(Counter(L2))

232 ms ± 1.96 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [17]:
%timeit frequencies(L2)

226 ms ± 1.76 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
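
The `%timeit` cells above only work inside IPython or Jupyter. As a rough sketch of how a similar comparison could be run from a plain script, the standard-library `timeit` module can be used. Everything here is an assumption made for illustration: the helper `count_with_get`, the smaller list size of 100,000, and the choice to compare `Counter` with the `get` method instead of `frequencies` (so that no third-party package is needed). Timings will differ from machine to machine, so no expected numbers are shown.

```python
import timeit
from collections import Counter


def count_with_get(items):
    """Count occurrences with dict.get, as in Method 5."""
    d = {}
    for key in items:
        d[key] = d.get(key, 0) + 1
    return d


# Smaller inputs than in the notebook, to keep the script quick.
distinct = list(range(100_000))   # every value appears once
repeated = [0] * 100_000          # one value appears many times

runs = 5
for name, data in [("distinct", distinct), ("repeated", repeated)]:
    # Take the best of 3 repeats, each averaging over `runs` calls.
    t_counter = min(timeit.repeat(lambda: dict(Counter(data)), number=runs, repeat=3)) / runs
    t_get = min(timeit.repeat(lambda: count_with_get(data), number=runs, repeat=3)) / runs
    print(f"{name}: Counter {t_counter * 1e3:.2f} ms, dict.get {t_get * 1e3:.2f} ms")
```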
