# 10.1 Counter

A `Counter` is a subclass of `dict`. It counts the elements of an iterable: each key is an element of the iterable and each value is the number of times that element occurs. The keys are not kept in any sorted order.

```text
class collections.Counter([iterable-or-mapping])
```

Below is a simple implementation of a counter. Note that it only works for hashable items such as ints, strings, or tuples, because unhashable objects such as lists cannot be keys in a dict.

```python
def counter(items):
    counts = {}
    for item in items:
        # start at 0 the first time an item is seen
        if item not in counts:
            counts[item] = 0
        # increment on every occurrence, not only the first one
        counts[item] += 1
    return counts
```
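As a rough sketch of which items a hand-written counter like this can handle: dictionary keys only need to be hashable, so tuples can be counted while lists raise a `TypeError`. The helper name `count_items` and the sample data are illustrative assumptions, not part of the original notes.

```python
def count_items(items):
    """Count occurrences of each item; items must be hashable."""
    counts = {}
    for item in items:
        # dict.get returns 0 for an unseen key, so no separate check is needed
        counts[item] = counts.get(item, 0) + 1
    return counts


# Tuples are hashable, so they can be counted
points = [(0, 0), (1, 2), (0, 0)]
point_counts = count_items(points)
print(point_counts)

# Lists are not hashable, so they cannot be used as dict keys
try:
    count_items([[1, 2], [1, 2]])
    unhashable_error = None
except TypeError as exc:
    unhashable_error = exc
print(f"Counting lists failed: {unhashable_error}")
```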

## 10.1.1 Creating a counter

The `Counter` constructor can be called in any of the following ways:

- With a sequence of items
- With a dictionary containing keys and counts
- With keyword arguments mapping string names to counts

In [1]:
from collections import Counter


# We create a list (a sequence of items) of characters
l=['B','B','A','B','C','A','B','B','A','C']
# then we build a counter from the list
c1=Counter(l)
print(f"The counter has value: {c1}")

# Note: the Counter constructor counts the occurrences of each element of the list
# and builds a dict where the key is the character and the value is its count.
# The keys are not sorted; only the printed form lists them from most to least common.

The counter has value: Counter({'B': 5, 'A': 3, 'C': 2})


In [2]:
# We can also count the characters of a string; Counter("hello world") gives the same result
l1=list("hello world")
c2=Counter(l1)
print(f"The counter has value: {c2}")

The counter has value: Counter({'l': 3, 'o': 2, 'h': 1, 'e': 1, ' ': 1, 'w': 1, 'r': 1, 'd': 1})


In [3]:
# with a dictionary, we give the key/count pairs directly
print(Counter({'A':3, 'B':5, 'C':2}))

# with keyword arguments, each argument name becomes a key and its value the count
print(Counter(A=3, B=5, C=2))

Counter({'B': 5, 'A': 3, 'C': 2})
Counter({'B': 5, 'A': 3, 'C': 2})


In [4]:
# we can also create an empty counter

c2=Counter()
print(f"This is an empty counter {c2}")

This is an empty counter Counter()
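
An empty counter is handy when items arrive one at a time. A minimal sketch, assuming a made-up list of words: because a missing key reads as 0, `+= 1` works without first checking that the key exists.

```python
from collections import Counter

word_counts = Counter()
for word in ["spam", "egg", "spam"]:
    # a missing key reads as 0, so this never raises KeyError
    word_counts[word] += 1

print(word_counts)  # Counter({'spam': 2, 'egg': 1})
```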


### Counter for complex keys

A counter can count any hashable object, not only ints or single characters. For example, it can count whole strings.

In [5]:
# Create a list
z = ['blue', 'red', 'blue', 'yellow', 'blue', 'red']
  
# Count distinct elements and print the Counter object
print(Counter(z))

Counter({'blue': 3, 'red': 2, 'yellow': 1})


## 10.1.2 Update a counter

- update(data): adds the counts from the input data to a counter.

Important notes:
- The data can be provided in any of the three ways used for initialization. The existing counts are increased or decreased by the new counts, not replaced.
- The data can also be another counter.
- **The counts in the data can be zero or negative**.
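
The notes above can be sketched as follows. The variable names `stock` and `plain` and the fruit data are illustrative assumptions; the comparison with `dict.update()` is included only to show the "added, not replaced" behaviour.

```python
from collections import Counter

stock = Counter(apple=2, pear=1)

# update() with another Counter adds its counts to the existing ones
stock.update(Counter(apple=3, kiwi=1))

# update() with keyword arguments works the same way
stock.update(pear=4)
print(stock)

# By contrast, dict.update() replaces the stored value
plain = {"apple": 2}
plain.update({"apple": 3})
print(plain)  # {'apple': 3}
```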

In [6]:
c2=Counter()
print(f"Before update, counter has value: {c2}")
l1=[1, 2, 3, 1, 2, 1, 1, 2]
c2.update(l1)
print(f"After update {l1}, counter has value: {c2}")

# update counter with another list
l2=[1, 2, 4]  
c2.update(l2)
print(f"After update {l2}, counter has value: {c2}")

Before update, counter has value: Counter()
After update [1, 2, 3, 1, 2, 1, 1, 2], counter has value: Counter({1: 4, 2: 3, 3: 1})
After update [1, 2, 4], counter has value: Counter({1: 5, 2: 4, 3: 1, 4: 1})


### Update counter with zero or negative count

In [7]:
c2=Counter({1: 5, 2: 4, 3: 1, 4: 1})
print(f"Before update, counter has value: {c2}")
d1={1:-3, 2:0, 4:-2}
c2.update(d1)
print(f"After update {d1}, counter has value: {c2}")

# Note: after the update, the printed order may change, because the printed form
# lists the keys from most to least common. We do not control that ordering directly.

Before update, counter has value: Counter({1: 5, 2: 4, 3: 1, 4: 1})
After update {1: -3, 2: 0, 4: -2}, counter has value: Counter({2: 4, 1: 2, 3: 1, 4: -1})


### Update with complex keys

In [8]:
c=Counter({'blue': 3, 'red': 2, 'yellow': 1})
print(f"Before update, counter has value: {c}")
d={'blue':3,'red':-2,'yellow':0}
c.update(d)
print(f"After update {d}, counter has value: {c}")

Before update, counter has value: Counter({'blue': 3, 'red': 2, 'yellow': 1})
After update {'blue': 3, 'red': -2, 'yellow': 0}, counter has value: Counter({'blue': 6, 'yellow': 1, 'red': 0})


## 10.1.3 Accessing counters

- Counter[key]: Counters can be accessed just like dictionaries. Unlike a dict, a missing key does not raise a `KeyError`; instead its count is reported as 0.
- elements(): It returns an iterator that repeats each element as many times as its count.
- most_common(n): It returns a list of the n most common elements and their counts, from most to least common.
- items(): It returns a view of all (element, count) pairs.

### 10.1.3.1 Using Counter[key]

In [9]:
z = ['blue', 'red', 'blue', 'yellow', 'blue', 'red']
col_count = Counter(z)
print(col_count)
  
col = ['blue','red','yellow','green']
  
# Here green is not in col_count so count of green will be zero
for color in col:
    print (color, col_count[color])

Counter({'blue': 3, 'red': 2, 'yellow': 1})
blue 3
red 2
yellow 1
green 0
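
One detail worth a small sketch: reading a missing key returns 0 but does not add the key, and setting a count to 0 does not remove the key. The colour data below is an illustrative assumption.

```python
from collections import Counter

col_count = Counter(['blue', 'red', 'blue'])

# Reading a missing key returns 0 ...
green_count = col_count['green']

# ... but does not add the key to the counter
has_green = 'green' in col_count

# Setting a count to 0 keeps the key; use del to remove it entirely
col_count['red'] = 0
has_red_after_zero = 'red' in col_count
del col_count['red']
has_red_after_del = 'red' in col_count

print(green_count, has_green, has_red_after_zero, has_red_after_del)
# 0 False True False
```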


### 10.1.3.2 Using elements()
- elements(): It returns an iterator that repeats each element as many times as its count.

Note: Elements with count <= 0 are not included.

In [10]:
c=Counter({'blue': 3, 'red': 2, 'yellow': 1})

# we convert the iterator that returned by c.elements() to a list
print(list(c.elements()))

['blue', 'blue', 'blue', 'red', 'red', 'yellow']
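
To illustrate the note about non-positive counts, here is a minimal sketch with made-up counts: elements whose count is zero or negative are skipped by `elements()`.

```python
from collections import Counter

c = Counter(blue=2, red=0, yellow=-1)

# 'red' (count 0) and 'yellow' (count -1) are skipped
expanded = list(c.elements())
print(expanded)  # ['blue', 'blue']
```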


### 10.1.3.3 Using most_common()
- most_common(n): It returns a list of the n most common elements and their counts, from most to least common.

In [11]:
c = Counter(a=1, b=2, c=30, d=120, e=123, f=219)
  
# Print the 3 most common elements in the counter
for letter, count in c.most_common(3):
    print('%s: %d' % (letter, count))

f: 219
e: 123
d: 120
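
As a small sketch of two related idioms: calling `most_common()` without an argument returns every element, and a reversed slice of that list gives the least common elements. The variable names `all_sorted` and `least_two` are illustrative.

```python
from collections import Counter

c = Counter(a=1, b=2, c=30, d=120, e=123, f=219)

# Without an argument, every element is returned, most common first
all_sorted = c.most_common()

# Reversed slice: the 2 least common elements, least common first
least_two = c.most_common()[:-3:-1]
print(least_two)  # [('a', 1), ('b', 2)]
```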


### 10.1.3.4 Using items()
- items(): It returns a view of all (element, count) pairs.

In [12]:
c = Counter(a=1, b=2, c=30, d=120, e=123, f=219)

for letter, count in c.items():
    print('%s: %d' % (letter, count))

a: 1
b: 2
c: 30
d: 120
e: 123
f: 219


## 10.1.4 Other built-in methods
- subtract(counter): subtracts the counts of the input counter from this counter, in place. The resulting counts can be zero or negative.

In [13]:
c1 = Counter(A=4,  B=3, C=10)
c2 = Counter(A=10, B=3, C=4)
print(f"Counter c1 has value: {c1}")
print(f"Counter c2 has value: {c2}")

# subtract() is a method of c1: it modifies c1 in place and returns None
c1.subtract(c2)
print(f"After subtraction, counter c1 has value: {c1}")

Counter c1 has value: Counter({'C': 10, 'A': 4, 'B': 3})
Counter c2 has value: Counter({'A': 10, 'C': 4, 'B': 3})
After subtraction, counter c1 has value: Counter({'C': 6, 'B': 0, 'A': -6})
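
For comparison, here is a short sketch of the `-` operator, which behaves differently from `subtract()`: it returns a new counter and keeps only positive counts. The last lines also show that `subtract()` accepts a plain iterable of elements.

```python
from collections import Counter

c1 = Counter(A=4, B=3, C=10)
c2 = Counter(A=10, B=3, C=4)

# The - operator returns a new Counter and keeps only positive counts
diff = c1 - c2
print(diff)  # Counter({'C': 6})

# c1 itself is unchanged by the operator
print(c1['A'])  # 4

# subtract() also accepts a plain iterable of elements
c1.subtract(['A', 'A'])
print(c1['A'])  # 2
```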


## 10.1.5 Using Counter in real scenarios

We will use a few examples to illustrate how Counter can solve real-world problems.

### 10.1.5.1 Finding the mode

In statistics, the mode is the most frequent value (or values) in a sample of data. For example, 
```text 
[2, 1, 2, 2, 3, 5, 3]: the mode is 2 because it appears most frequently.
[2, 1, 2, 2, 3, 5, 3, 3]: there are two modes, 2 and 3, because both appear the same number of times. The mode isn’t always a unique value.
```

**You’ll often use the mode to describe categorical data. For example, the mode is useful when you need to know which category is the most common in your data.**

To find the mode with Python, you need to count the number of occurrences of each value in your sample. Then you have to find the most frequent value (or values). In other words, the value with the highest number of occurrences. That sounds like something you can do using Counter and .most_common().

> Note
>
> Python’s statistics module in the standard library provides functions for calculating several statistics, including **the mode of unimodal and multimodal samples**. The example below is just intended to show how useful Counter can be.
>

In [26]:
data = ["apple", "orange", "apple", "apple", "orange", "banana", "banana", "banana", "apple", "banana"]

# create a counter
counter = Counter(data)

# most_common(1) returns a one-element list holding a (name, count) tuple.
# We only need the count, so we unpack the tuple and discard the name with _.
_, top_count = counter.most_common(1)[0]

# The mode can consist of several values that share the highest count,
# so we collect every item whose count equals top_count.
mode_list=[point for point, count in counter.items() if count == top_count]

print(f"The mode of {data} is : {mode_list}")

The mode of ['apple', 'orange', 'apple', 'apple', 'orange', 'banana', 'banana', 'banana', 'apple', 'banana'] is : ['apple', 'banana']
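
For comparison, the `statistics` module mentioned in the note above can produce the same result. This is a minimal sketch, assuming Python 3.8 or later, where `statistics.multimode()` is available.

```python
from statistics import multimode

data = ["apple", "orange", "apple", "apple", "orange",
        "banana", "banana", "banana", "apple", "banana"]

# multimode() returns every value tied for the highest count,
# in the order they were first seen in the data
modes = multimode(data)
print(modes)  # ['apple', 'banana']
```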


### 10.1.5.2 Counting files by type

In this example, we count the files in a given directory, grouping them by file extension or file type.


In [15]:
import pathlib

path="../Lesson10_Python_collections"

# create an iterator over the entries in a given directory
entries = pathlib.Path(path).iterdir()

# build a list containing the extensions (.suffix) of all the files in the target directory.
extensions = [entry.suffix for entry in entries if entry.is_file()]
print(f"extensions under the path: {extensions}")

# print the count
print(f"file count by type: {Counter(extensions)}")

extensions under the path: ['.ipynb', '.ipynb', '.ipynb', '.ipynb']
file count by type: Counter({'.ipynb': 4})
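
The same idea can be extended to sub-directories. The sketch below is self-contained: it builds a temporary directory with a few made-up empty files (the file names are assumptions for illustration) and walks it recursively with `Path.rglob()`. Files without a suffix are grouped under a hypothetical `"(no extension)"` label.

```python
import pathlib
import tempfile
from collections import Counter

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    # create a few empty sample files, one of them in a sub-directory
    (root / "sub").mkdir()
    for name in ["a.py", "b.py", "notes.txt", "sub/c.py", "README"]:
        (root / name).touch()

    # rglob("*") walks the directory tree recursively
    by_type = Counter(
        entry.suffix or "(no extension)"
        for entry in root.rglob("*")
        if entry.is_file()
    )

print(by_type)
```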


### 10.1.5.3 Using Counter Instances as Multisets

In math, a **multiset represents a variation of a set that allows multiple instances of its elements**. The number of instances of a given element is known as its multiplicity. So, you can have a multiset like {1, 1, 2, 3, 3, 3, 4, 4}, but the set version will be limited to {1, 2, 3, 4}.

We can use a counter to simulate a multiset.

In [16]:
# In Python, a set can't contain duplicates either
l1=[1, 1, 2, 3, 3, 3, 4, 4]
s1={1, 1, 2, 3, 3, 3, 4, 4}
s2=set(l1)
print(f"list has element: {l1}")
print(f"set 1 has element: {s1}")
print(f"set 2 has element: {s2}")

list has element: [1, 1, 2, 3, 3, 3, 4, 4]
set 1 has element: {1, 2, 3, 4}
set 2 has element: {1, 2, 3, 4}


In [36]:
# create a counter to represent a multiset
multiset = Counter([1, 1, 2, 3, 3, 3, 4, 4])

# the keys of the multiset represent the set of items
item_set=multiset.keys()
print(f"item set: {item_set}")

item set: dict_keys([1, 2, 3, 4])


In [37]:
compare= (item_set == {1,2,3,4})
print(f"compare result: {compare}")

compare result: True
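
Counters also support multiset-style arithmetic through operators. A minimal sketch with two made-up multisets `a` and `b`: `+` adds multiplicities, `&` keeps the minimum, and `|` keeps the maximum of each multiplicity.

```python
from collections import Counter

a = Counter([1, 1, 2, 3])   # multiset {1, 1, 2, 3}
b = Counter([1, 2, 2, 4])   # multiset {1, 2, 2, 4}

total = a + b    # sum: add the multiplicities
common = a & b   # intersection: minimum of each multiplicity
either = a | b   # union: maximum of each multiplicity

print(total)
print(common)
print(either)
```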


### Multiset use case: Shopping cart

A common use case for a multiset in programming is a shopping cart because it can contain more than one instance of each product, depending on the client’s needs:



In [44]:
prices = {"course": 97.99, "book": 54.99, "wallpaper": 4.99}

shopping_cart = Counter(course=1, book=3, wallpaper=2)

total=0
for item, num in shopping_cart.items():
    price=prices[item]
    sub_total=price*num
    total+=sub_total
    print(f"{item:9} : {price:7.2f} * {num} = {sub_total:7.2f}")
print(f"total amount: {total:.2f}")

course    :   97.99 * 1 =   97.99
book      :   54.99 * 3 =  164.97
wallpaper :    4.99 * 2 =    9.98
total amount: 272.94
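
A cart usually changes over time. The following sketch, which reuses the cart contents from the example above, shows one possible way to add and remove items: after `subtract()` an item can remain with a count of 0, and unary plus returns a new counter without zero or negative counts.

```python
from collections import Counter

shopping_cart = Counter(course=1, book=3, wallpaper=2)

# add one more wallpaper, then remove all three books
shopping_cart["wallpaper"] += 1
shopping_cart.subtract(book=3)

# the book entry is still present, with a count of 0
book_still_listed = "book" in shopping_cart

# unary plus returns a new Counter without zero or negative counts
shopping_cart = +shopping_cart
print(shopping_cart)  # Counter({'wallpaper': 3, 'course': 1})

# total number of items in the cart
item_count = sum(shopping_cart.values())
print(item_count)  # 4
```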


## 10.1.6 Conclusion
When you need to count several repeated objects in Python, you can use Counter from collections. This class provides an efficient and Pythonic way to count things without the need for using traditional techniques involving loops and nested data structures. This can make your code cleaner and faster.
