# Python Counter Exercises

Working with text data in Python involves a lot of counting, for example to find word frequencies.
The examples below start with two common dict-based idioms for counting and then advance to the best practice, the Counter class from collections.

### Using the dict.get() method

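The get() method returns the value stored for a key, or the supplied default (here zero) if the key is not found, so no KeyError is raised at run time. Note that get() does not add the key to the dict; the key is created by the assignment on the left-hand side.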

In [5]:
sentence = "mississippi is running so fast"
counter = {}
for letter in sentence:
    counter[letter] = counter.get(letter, 0) + 1
counter

{'m': 1,
 'i': 6,
 's': 7,
 'p': 2,
 ' ': 4,
 'r': 1,
 'u': 1,
 'n': 3,
 'g': 1,
 'o': 1,
 'f': 1,
 'a': 1,
 't': 1}
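
As a quick check of how get() treats a missing key, here is a minimal sketch. The names counts and missing are illustrative and do not come from the notebook above. It suggests that get() only hands back the default, and that the key shows up in the dict because of the assignment.

```python
counts = {}

# get() returns the default for a missing key but does not store it
missing = counts.get('x', 0)
print(missing)        # 0
print('x' in counts)  # False

# The key is created by the assignment, not by get()
counts['x'] = counts.get('x', 0) + 1
print(counts)         # {'x': 1}
```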

### Using defaultdict from collections

The defaultdict does what the get() method did earlier, but at creation time we tell it how the value of a new key should be initialized. In this case we use the factory function 'int', which returns zero, so every new key starts at 0.

In [6]:
from collections import defaultdict

counter = defaultdict(int)
for letter in sentence:
    counter[letter] += 1
counter

defaultdict(int,
            {'m': 1,
             'i': 6,
             's': 7,
             'p': 2,
             ' ': 4,
             'r': 1,
             'u': 1,
             'n': 3,
             'g': 1,
             'o': 1,
             'f': 1,
             'a': 1,
             't': 1})
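
One behaviour worth knowing: a defaultdict stores the default as soon as a missing key is read with square brackets. The sketch below uses a hypothetical dict named hits to illustrate this; it is only a small demonstration, not part of the original notebook.

```python
from collections import defaultdict

hits = defaultdict(int)

# Reading a missing key calls int(), stores the result 0, and returns it
value = hits['missing']
print(value)               # 0
print(dict(hits))          # {'missing': 0}

# get() and the `in` operator do not trigger the factory
print(hits.get('other'))   # None
print('other' in hits)     # False
```

So if merely looking up keys should not grow the dict, get() or a membership test is the safer choice.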

## Using Python's Counter

In [7]:
from collections import Counter
Counter(sentence)

Counter({'m': 1,
         'i': 6,
         's': 7,
         'p': 2,
         ' ': 4,
         'r': 1,
         'u': 1,
         'n': 3,
         'g': 1,
         'o': 1,
         'f': 1,
         'a': 1,
         't': 1})
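
A Counter also handles missing keys gracefully. The short sketch below (the variable letters is illustrative) shows that looking up an item that was never counted gives 0 instead of raising a KeyError, and that the lookup does not add the key.

```python
from collections import Counter

letters = Counter("mississippi")

print(letters['s'])      # 4
print(letters['z'])      # 0, no KeyError
print('z' in letters)    # False, the lookup did not add the key
print(len(letters))      # 4 distinct letters: m, i, s, p
```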

#### Constructing Counter
1. Use a string as an argument
2. Use a list as an argument
3. Provide a dict of key/count pairs
4. Provide the keys and counts as keyword arguments
5. Pass a set, which gives every unique element a count of 1

In [16]:
from collections import Counter
# Use a string as an argument
cnt1 = Counter('hollywood')
print('1 --->',cnt1)

# Use a list as argument
cnt2 = Counter(list('hollywood'))
print('2 --->',cnt2)

# Provide a dict or key/value pair
cnt3 = Counter({'apple':2,"orange":5,"papaya":1})
print('3 --->',cnt3)

# Provide keys and counts as keyword arguments
cnt4 = Counter(apple=2,orange=5,papaya=1)
print('4 --->',cnt4)

# Pass a set, so every unique element gets a count of 1
cnt5 = Counter(set('hollywood'))
print('5 --->',cnt5)

1 ---> Counter({'o': 3, 'l': 2, 'h': 1, 'y': 1, 'w': 1, 'd': 1})
2 ---> Counter({'o': 3, 'l': 2, 'h': 1, 'y': 1, 'w': 1, 'd': 1})
3 ---> Counter({'orange': 5, 'apple': 2, 'papaya': 1})
4 ---> Counter({'orange': 5, 'apple': 2, 'papaya': 1})
5 ---> Counter({'d': 1, 'h': 1, 'o': 1, 'l': 1, 'w': 1, 'y': 1})
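
The constructor is not limited to strings, lists, and sets. As a rough sketch, any iterable of hashable items should work; the names words, length_counts, and pairs below are made up for illustration.

```python
from collections import Counter

words = ['holly', 'wood', 'bolly', 'wood']

# Any iterable of hashable items works, including a generator expression
length_counts = Counter(len(word) for word in words)
print(length_counts)        # Counter({5: 2, 4: 2})

# Items can be tuples, for example adjacent letter pairs
pairs = Counter(zip('hollywood', 'ollywood'))
print(pairs[('o', 'o')])    # 1
print(pairs[('l', 'l')])    # 1
```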


#### Updating the Object Counts
1. Providing iterables to the update function.
2. Providing another Counter/dict instance to the update function

In [40]:
# Iterables to the update function
count_of_items = Counter('hollywood')
count_of_items.update('mollywood')
count_of_items.update('bollywood')
print('Iterables to update:', count_of_items)

# Counter to the update function

fruit_count_today = Counter({'apple':10,'orange':25,'papaya':3})
fruit_count_yesterday = Counter({'apple':5,'orange':20,'papaya':4})
fruit_count_today.update(fruit_count_yesterday)
print('counter update',fruit_count_today)

# Dict to update function
fruit_count_added_today = {'apple':5,'orange':4,'papaya':2}
fruit_count_today.update(fruit_count_added_today)
print('dict update',fruit_count_today)

Iterables to update: Counter({'o': 9, 'l': 6, 'y': 3, 'w': 3, 'd': 3, 'h': 1, 'm': 1, 'b': 1})
counter update Counter({'orange': 45, 'apple': 15, 'papaya': 7})
dict update Counter({'orange': 49, 'apple': 20, 'papaya': 9})
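
Note that update() on a Counter adds to the existing counts, while update() on a plain dict replaces the values. The sketch below contrasts the two; plain and tally are illustrative names only.

```python
from collections import Counter

# dict.update() overwrites the stored value
plain = {'apple': 10}
plain.update({'apple': 5})
print(plain)      # {'apple': 5}

# Counter.update() adds the new count to the old one
tally = Counter({'apple': 10})
tally.update({'apple': 5})
print(tally)      # Counter({'apple': 15})

# update() also accepts keyword arguments, like the constructor
tally.update(apple=1, kiwi=2)
print(tally)      # Counter({'apple': 16, 'kiwi': 2})
```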


#### Accessing the Counter

In [13]:
# look up a single count by key (not displayed, since it is not the last expression in the cell)
fruit_count_today['apple']
# to access the keys
print('keys------------------------')
for key in fruit_count_today.keys():
    print(key)
    
# to access the values
print('values---------------------')
for val in fruit_count_today.values():
    print(val)
    
# to access key and values
print('key and value--------------')
for key,val in fruit_count_today.items():
    print(key,'-->',val)

keys------------------------
apple
orange
papaya
values---------------------
20
49
9
key and value--------------
apple --> 20
orange --> 49
papaya --> 9
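
Two related operations come up often when reading a Counter: getting the total of all counts and removing an item. The following is a small sketch under the assumption that a separate Counter named basket (not from the notebook) holds the same fruit counts.

```python
from collections import Counter

basket = Counter({'apple': 20, 'orange': 49, 'papaya': 9})

# Total of all counts
total = sum(basket.values())
print(total)                    # 78

# Setting a count to zero keeps the key in the Counter
basket['papaya'] = 0
print('papaya' in basket)       # True

# del removes the key entirely
del basket['papaya']
print('papaya' in basket)       # False
```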


#### Finding the most common
The method most_common(x) on the counter returns a list of the 'x' most common items as (item, count) pairs. If 'x' is None or omitted, it returns all items, sorted from the most common to the least common.

In [14]:
fruit_count_today.most_common(1)

[('orange', 49)]

In [15]:
fruit_count_today.most_common(2)

[('orange', 49), ('apple', 20)]

In [16]:
fruit_count_today.most_common(None)

[('orange', 49), ('apple', 20), ('papaya', 9)]

In [17]:
# to get the list in the reverse order
fruit_count_today.most_common()[::-1]

[('papaya', 9), ('apple', 20), ('orange', 49)]

In [18]:
fruit_count_today.most_common()[:-3:-1]

[('papaya', 9), ('apple', 20)]
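
The slicing trick above can be wrapped in a small function. Counter has no built-in method for this, so least_common below is a hypothetical helper written for illustration, and stock is an assumed stand-in for the fruit counts.

```python
from collections import Counter

def least_common(counter, n):
    """Return the n least common (item, count) pairs, rarest first."""
    # Hypothetical helper, not part of collections.Counter:
    # reverse the full ranking and keep the first n entries
    return counter.most_common()[:-n - 1:-1]

stock = Counter({'apple': 20, 'orange': 49, 'papaya': 9})
print(least_common(stock, 2))   # [('papaya', 9), ('apple', 20)]
```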

## Working with Counter
1. Find the most common words in the text
2. Find the mode of a sample
3. Using arithmetic operations on counters: subtraction, addition, min and max.
4. Using multiset features.


Find the most common words in the text

In [30]:
import string 
from collections import Counter

text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."

#get the clean words without punctuation
trans_table = str.maketrans('','',string.punctuation) 
word_list = text.lower().split() 
clean_word_list = [word.translate(trans_table) for word in word_list]

#apply counter to the clean word list
word_counter = Counter(clean_word_list)

#Find the 2 most common words in the text
word_counter.most_common(2)

[('ut', 3), ('in', 3)]

Find the mode of a sample

In [31]:
# Find the mode of a sample

fruit_count_today.most_common(1)[0][0]

'orange'
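
When several values share the highest count, most_common(1) reports only one of them. A minimal sketch for collecting all modes could look like this; sample, counts, top, and modes are illustrative names.

```python
from collections import Counter

sample = [1, 2, 2, 3, 3, 4]
counts = Counter(sample)

# most_common(1) reports only one of the tied values
print(counts.most_common(1)[0][0])   # 2

# Collect every value that reaches the highest count
top = max(counts.values())
modes = [value for value, count in counts.items() if count == top]
print(modes)                         # [2, 3]
```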

Using arithmetic operations with Counter: subtraction, addition, min (&) and max (|). The subtract() method changes the Counter in place and returns None, so the - operator is used here to get the difference as a new Counter.

In [41]:
subt = fruit_count_today - fruit_count_yesterday
add = fruit_count_today + fruit_count_yesterday
minc = fruit_count_today & fruit_count_yesterday
maxc = fruit_count_today | fruit_count_yesterday
print(subt,add,minc,maxc)


Counter({'orange': 29, 'apple': 15, 'papaya': 5}) Counter({'orange': 69, 'apple': 25, 'papaya': 13}) Counter({'orange': 20, 'apple': 5, 'papaya': 4}) Counter({'orange': 49, 'apple': 20, 'papaya': 9})
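
The - operator and the subtract() method behave differently, which is easy to miss. The sketch below uses two small made-up Counters, today and yesterday, to show the difference.

```python
from collections import Counter

today = Counter(apple=3, orange=1)
yesterday = Counter(apple=1, orange=4)

# The - operator returns a new Counter and drops counts that are not positive
diff = today - yesterday
print(diff)      # Counter({'apple': 2})

# subtract() changes the Counter in place, returns None, and keeps negatives
result = today.subtract(yesterday)
print(result)    # None
print(today)     # Counter({'apple': 2, 'orange': -3})
```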


Counter multiset feature 'elements'. The elements() method returns an iterator that repeats each item as many times as its count.

In [42]:
lst = [1,2,3,2,2,3,3,1,1,1,3,5,3]
list_count = Counter(lst)

for ele in list_count.elements():
    print(ele)


1
1
1
1
2
2
2
3
3
3
3
3
5
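
One detail of elements() is that items whose count is less than one are skipped. A short sketch, using a made-up Counter named inventory:

```python
from collections import Counter

inventory = Counter(a=2, b=0, c=-1, d=1)

# elements() repeats each key as many times as its count
# and skips keys whose count is below one
expanded = list(inventory.elements())
print(expanded)                 # ['a', 'a', 'd']

# Rebuilding a Counter from the elements keeps only the positive counts
print(Counter(expanded))        # Counter({'a': 2, 'd': 1})
```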
