## Problem
- Suppose you have a list of comments made on a blog post, and you want to know the top 3 users with the most comments.
- How can you accomplish that in the most Pythonic way?

In [1]:
from collections import namedtuple
Comment = namedtuple('Comment', ['author', 'content'])

comments = [
    Comment(author='Junior', content='Python3 is awesome'),
    Comment(author='Zak', content='Yeah I agree'),
    Comment(author='Amy', content='Indeed'),
    Comment(author='Junior', content='Python3 beats Python2'),
    Comment(author='Paul', content='Sure'),
    Comment(author='Ralf', content='Yeah I agree'),
    Comment(author='Becca', content='Yeah I agree'),
    Comment(author='Zak', content='Yeah I agree'),
    Comment(author='Simon', content='Yeah I agree'),
    Comment(author='Zak', content='Yeah I agree'),
    Comment(author='Becca', content='Yeah I agree'),
    Comment(author='Junior', content='Yeah I agree'),
    Comment(author='Matt', content='Yeah I agree'),
    Comment(author='Ali', content='Yeah I agree'),
    Comment(author='Becca', content='Yeah I agree'),
    Comment(author='Junior', content='Yeah I agree'),
    Comment(author='Amy', content='Yeah I agree'),
    Comment(author='Becca', content='We all agree then :)')
]

comments_authors = [comment.author for comment in comments]
print(f'comments_authors = {comments_authors}')

comments_authors = ['Junior', 'Zak', 'Amy', 'Junior', 'Paul', 'Ralf', 'Becca', 'Zak', 'Simon', 'Zak', 'Becca', 'Junior', 'Matt', 'Ali', 'Becca', 'Junior', 'Amy', 'Becca']


## Answer
- The most Pythonic way is to use collections.Counter

In [7]:
# BAD WAY: too much boilerplate

from collections import defaultdict

authors_count = defaultdict(int) #<0>

for author in comments_authors:
    authors_count[author] += 1 #<0>
print(f'authors_count = {authors_count}')

sorted_authors = sorted(authors_count.items(), key=lambda el: el[1], reverse=True) #<1>
print(f'sorted_authors = {sorted_authors}')

top3 = sorted_authors[:3]
print(f'top3 = {top3}')

authors_count = defaultdict(<class 'int'>, {'Junior': 4, 'Zak': 3, 'Amy': 2, 'Paul': 1, 'Ralf': 1, 'Becca': 4, 'Simon': 1, 'Matt': 1, 'Ali': 1})
sorted_authors = [('Junior', 4), ('Becca', 4), ('Zak', 3), ('Amy', 2), ('Paul', 1), ('Ralf', 1), ('Simon', 1), ('Matt', 1), ('Ali', 1)]
top3 = [('Junior', 4), ('Becca', 4), ('Zak', 3)]


In [8]:
# GOOD WAY: using collections.Counter

from collections import Counter

authors_count = Counter(comments_authors) #<2>
print(f'authors_count = {authors_count}')

top3 = authors_count.most_common(3)  #<3>
print(f'top3 = {top3}')

authors_count = Counter({'Junior': 4, 'Becca': 4, 'Zak': 3, 'Amy': 2, 'Paul': 1, 'Ralf': 1, 'Simon': 1, 'Matt': 1, 'Ali': 1})
top3 = [('Junior', 4), ('Becca', 4), ('Zak', 3)]


## Discussion
- <0> Using defaultdict(int) auto-initializes a missing key to the integer 0, so we can increment a count without first checking whether the key exists. (ref: 3.2)
- <1> items() returns a view of (author, count) pairs, which we sort in descending order (via reverse=True) by count (via key=lambda el: el[1]).
- <2, 3> By building the authors_count lookup table with Counter instead of a dict, we can find the top 3 comment authors in just 2 steps: count the authors, then call most_common(3).
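
As a minimal sketch of the same idea, Counter also accepts any iterable, so the intermediate comments_authors list is optional. The sample data below is a shortened, hypothetical list and not the one used above.

```python
from collections import Counter, namedtuple

Comment = namedtuple('Comment', ['author', 'content'])

# Hypothetical sample data, shorter than the list used in the problem.
comments = [
    Comment(author='Junior', content='Python3 is awesome'),
    Comment(author='Zak', content='Yeah I agree'),
    Comment(author='Junior', content='Python3 beats Python2'),
    Comment(author='Amy', content='Indeed'),
    Comment(author='Zak', content='Sure'),
    Comment(author='Junior', content='We all agree then :)'),
]

# Count authors straight from a generator expression.
authors_count = Counter(comment.author for comment in comments)

top2 = authors_count.most_common(2)
print(f'top2 = {top2}')  # top2 = [('Junior', 3), ('Zak', 2)]

# A missing key reports a count of 0 instead of raising KeyError.
print(authors_count['Nobody'])  # 0
```

Calling most_common() with no argument returns every (author, count) pair, ordered from the most common to the least.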

## Problem
- You keep track of the average daily reviews for each movie in a dict of dicts, with one inner dict per day.
- How can you tell, in the most Pythonic way, whether a movie has ever been reviewed, regardless of the review day?

In [24]:
daily_avg_reviews = {
    '01-Jan-2019': {'Python3 beats Python2': 3.5, 'Python2 end game': 4.7},
    '02-Jan-2019': {'Python2 end game': 3.9},
    '03-Jan-2019': {'Python3 beats Python2': 4.5},
    '04-Jan-2019': {'Python3 is the future': 5.0, 'Python2 end game': 4.1},
}

all_movies = [
    'Python3 beats Python2',
    'Python2 end game',
    'Python3 is the future Season 2',
    'Python3 is the future',
    'dummy movie',
]

## Answer
- The most Pythonic way is to use collections.ChainMap

In [25]:
# BAD WAY: not the most pythonic

for movie_name in all_movies:
    for day, day_reviews_dict in daily_avg_reviews.items():
        if movie_name in day_reviews_dict:
            print(f'[HIT] - [{movie_name}] has been reviewed !')
            break
    else:
        print(f'[MISS] - [{movie_name}] has NEVER been reviewed !')

[HIT] - [Python3 beats Python2] has been reviewed !
[HIT] - [Python2 end game] has been reviewed !
[MISS] - [Python3 is the future Season 2] has NEVER been reviewed !
[HIT] - [Python3 is the future] has been reviewed !
[MISS] - [dummy movie] has NEVER been reviewed !


In [26]:
# GOOD WAY: Using ChainMap
from collections import ChainMap

all_days_reviews_dicts = daily_avg_reviews.values() #<0>
chained_lookup = ChainMap(*all_days_reviews_dicts)  #<1>

print(f'chained_lookup = {chained_lookup}')
print()

for movie_name in all_movies:
    if movie_name in chained_lookup: #<2>
        print(f'[HIT] - [{movie_name}] has been reviewed !')
    else:
        print(f'[MISS] - [{movie_name}] has NEVER been reviewed !')

chained_lookup = ChainMap({'Python3 beats Python2': 3.5, 'Python2 end game': 4.7}, {'Python2 end game': 3.9}, {'Python3 beats Python2': 4.5}, {'Python3 is the future': 5.0, 'Python2 end game': 4.1})

[HIT] - [Python3 beats Python2] has been reviewed !
[HIT] - [Python2 end game] has been reviewed !
[MISS] - [Python3 is the future Season 2] has NEVER been reviewed !
[HIT] - [Python3 is the future] has been reviewed !
[MISS] - [dummy movie] has NEVER been reviewed !


## Discussion
- <0> Getting a view over all the individual daily average review dicts.
- <1> Passing all the individual daily average review dicts to ChainMap to "emulate" a single "meta" dict built from them.
- <2> Using the in operator on chained_lookup searches each of the individual dicts in order and returns True as soon as one of them contains the key.
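
The sequential search also applies to item access. Here is a small sketch, using only two of the daily dicts (the names day1 and day2 are made up for illustration), of what a lookup returns when several days contain the same movie:

```python
from collections import ChainMap

# Two of the daily dicts; day1 and day2 are illustrative names.
day1 = {'Python3 beats Python2': 3.5, 'Python2 end game': 4.7}
day2 = {'Python2 end game': 3.9}

chained_lookup = ChainMap(day1, day2)

# Item access returns the value from the FIRST dict that has the key.
first_match = chained_lookup['Python2 end game']
print(first_match)  # 4.7 (from day1, not 3.9 from day2)

# ChainMap keeps references to the dicts, so later changes are visible.
day2['Python3 is the future'] = 5.0
is_reviewed = 'Python3 is the future' in chained_lookup
print(is_reviewed)  # True

# The underlying dicts are available, in search order, via .maps
print(len(chained_lookup.maps))  # 2
```

So ChainMap fits a "has it ever been reviewed?" check well. If you need every score for a movie, you still have to iterate over the individual dicts.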

## Problem
- We have the same dict as above, mapping days to the daily average reviews per movie, with the keys inserted in date order. We would like that order to be preserved every time we insert a new day or remove an existing one.


## Answer
- Using collections.OrderedDict, it is possible to have a lookup table that remembers the order in which its keys were inserted.

In [34]:
# <0>
from collections import OrderedDict

daily_avg_reviews = OrderedDict()
daily_avg_reviews['01-Jan-2019'] = {'Python3 beats Python2': 3.5, 'Python2 end game': 4.7}
daily_avg_reviews['02-Jan-2019'] = {'Python2 end game': 3.9}
daily_avg_reviews['03-Jan-2019'] = {'Python3 beats Python2': 4.5}
daily_avg_reviews['04-Jan-2019'] = {'Python3 is the future': 5.0, 'Python2 end game': 4.1}

print(f'keys order preserved after inserts = {list(daily_avg_reviews.keys())}')

del daily_avg_reviews['04-Jan-2019']
print(f'keys order preserved after delete = {list(daily_avg_reviews.keys())}')

keys order preserved after inserts = ['01-Jan-2019', '02-Jan-2019', '03-Jan-2019', '04-Jan-2019']
keys order preserved after delete = ['01-Jan-2019', '02-Jan-2019', '03-Jan-2019']


## Discussion
- <0> OrderedDict preserves key insertion order. However, it can consume more than twice the memory required by a normal dict, because it maintains a linked list under the hood to track the order, so it should be used carefully. Note that since Python 3.7 a plain dict also preserves insertion order.
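
Keep in mind that the order is insertion order, not sorted order. The sketch below uses a trimmed-down, hypothetical version of the data, with one day deliberately inserted late. It shows two OrderedDict methods that help manage the order explicitly: move_to_end() and popitem(last=False).

```python
from collections import OrderedDict

daily_avg_reviews = OrderedDict()
daily_avg_reviews['01-Jan-2019'] = {'Python2 end game': 4.7}
daily_avg_reviews['03-Jan-2019'] = {'Python3 beats Python2': 4.5}
daily_avg_reviews['02-Jan-2019'] = {'Python2 end game': 3.9}  # inserted late on purpose

# Keys come back in insertion order, which is NOT date order here.
order_after_inserts = list(daily_avg_reviews)
print(order_after_inserts)  # ['01-Jan-2019', '03-Jan-2019', '02-Jan-2019']

# move_to_end() repositions an existing key at the end.
daily_avg_reviews.move_to_end('03-Jan-2019')
order_after_move = list(daily_avg_reviews)
print(order_after_move)  # ['01-Jan-2019', '02-Jan-2019', '03-Jan-2019']

# popitem(last=False) removes and returns the oldest entry (FIFO).
oldest_day, oldest_reviews = daily_avg_reviews.popitem(last=False)
print(oldest_day)  # 01-Jan-2019
```

If the keys must always be in date order, one option is to rebuild the mapping from sorted items after each out-of-order insert. OrderedDict itself does not sort its keys.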