### Exercises

#### Exercise #1

Let's revisit an exercise we did right after the section on dictionaries.

You have text data spread across multiple servers. Each server can analyze its own data and return a dictionary that maps words to their frequencies.

Your job is to combine this data to create a single dictionary that contains all the words and their combined frequencies from all these data sources. Bonus points if you can make your dictionary sorted by frequency (highest to lowest).

For example, you may have three servers that each return these dictionaries:

In [13]:
d1 = {'python': 10, 'java': 3, 'c#': 8, 'javascript': 15}
d2 = {'java': 10, 'c++': 10, 'c#': 4, 'go': 9, 'python': 6}
d3 = {'erlang': 5, 'haskell': 2, 'python': 1, 'pascal': 1}

Your resulting dictionary should look like this:

In [14]:
d = {'python': 17,
     'javascript': 15,
     'java': 13,
     'c#': 12,
     'c++': 10,
     'go': 9,
     'erlang': 5,
     'haskell': 2,
     'pascal': 1}

If only servers 1 and 2 return data (so d1 and d2), your results would look like:

In [15]:
d = {'python': 16,
     'javascript': 15,
     'java': 13,
     'c#': 12,
     'c++': 10, 
     'go': 9}

This was one solution to the problem:

In [16]:
def merge(*dicts):
    unsorted = {}
    for d in dicts:
        for k, v in d.items():
            unsorted[k] = unsorted.get(k, 0) + v
            
    # create a dictionary sorted by value
    return dict(sorted(unsorted.items(), key=lambda e: e[1], reverse=True))

Implement two different solutions to this problem:

**a**: Using `defaultdict` objects

**b**: Using `Counter` objects

In [17]:

from collections import defaultdict, Counter

ddd = {'python': 17,
    'javascript': 15,
    'java': 13,
    'c#': 12,
    'c++': 10,
    'go': 9,
    'erlang': 5,
    'haskell': 2,
    'pascal': 1}

dd = {'python': 16,
    'javascript': 15,
    'java': 13,
    'c#': 12,
    'c++': 10, 
    'go': 9}

def tot_freqs(*dcts)->dict:
    total = defaultdict(int)
    for dct in dcts:
        for key, val in dct.items():
            total[key] += val
    return dict(
        sorted(total.items(),
            key = lambda x: x[1],
            reverse=True
        )
    )

def sorted_counter(*dcts)->dict:
    tot_counter = sum(
        map(Counter,dcts),
        Counter()
    )
    return dict(tot_counter.most_common())

print(tot_freqs(d1,d2,d3))
print(sorted_counter(d1,d2,d3))

print('ddd == tot_freqs(d1,d2,d3)?',
    ddd == tot_freqs(d1,d2,d3))
print('dd == tot_freqs(d1,d2)?',
    dd == tot_freqs(d1,d2))
print('ddd == sorted_counter(d1,d2,d3)?',
    ddd == sorted_counter(d1,d2,d3))
print('dd == sorted_counter(d1,d2)?',
    dd == sorted_counter(d1,d2))




{'python': 17, 'javascript': 15, 'java': 13, 'c#': 12, 'c++': 10, 'go': 9, 'erlang': 5, 'haskell': 2, 'pascal': 1}
{'python': 17, 'javascript': 15, 'java': 13, 'c#': 12, 'c++': 10, 'go': 9, 'erlang': 5, 'haskell': 2, 'pascal': 1}
ddd == tot_freqs(d1,d2,d3)? True
dd == tot_freqs(d1,d2)? True
ddd == sorted_counter(d1,d2,d3)? True
dd == sorted_counter(d1,d2)? True
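
Note that `==` on dictionaries compares key/value pairs only and ignores insertion order, so the `True` results above confirm the totals but not the sorting. Below is a minimal sketch of an order-sensitive check. The names `merge_counts`, `expected`, and `unsorted_result` are illustrative and not part of the original solutions.

```python
from collections import Counter

d1 = {'python': 10, 'java': 3, 'c#': 8, 'javascript': 15}
d2 = {'java': 10, 'c++': 10, 'c#': 4, 'go': 9, 'python': 6}
d3 = {'erlang': 5, 'haskell': 2, 'python': 1, 'pascal': 1}

expected = {'python': 17, 'javascript': 15, 'java': 13, 'c#': 12,
            'c++': 10, 'go': 9, 'erlang': 5, 'haskell': 2, 'pascal': 1}

def merge_counts(*dicts):
    # Hypothetical helper: accumulate counts with Counter.update,
    # which adds to existing counts instead of replacing them
    total = Counter()
    for d in dicts:
        total.update(d)
    # most_common() yields (key, count) pairs from highest to lowest count
    return dict(total.most_common())

result = merge_counts(d1, d2, d3)
# Same pairs, but deliberately re-ordered alphabetically by key
unsorted_result = dict(sorted(result.items()))

# Dictionary equality ignores order: both comparisons are True
print(result == expected, unsorted_result == expected)

# Comparing the item lists also checks the order
same_order = list(result.items()) == list(expected.items())
same_order_unsorted = list(unsorted_result.items()) == list(expected.items())
print(same_order, same_order_unsorted)
```

Comparing `list(d.items())` works here because the sample data has no tied frequencies. With ties, the relative order of the tied keys would depend on insertion order.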


---

#### Exercise #2

Suppose you have a collection of all possible eye colors:

In [18]:
eye_colors = ("amber", "blue", "brown", "gray", "green", "hazel", "red", "violet")

Some other collection (say recovered from a database, or an external API) contains a list of `Person` objects that have an eye color property.

Your goal is to create a dictionary that contains the number of people that have the eye color as specified in `eye_colors`. The wrinkle here is that even if no one matches some eye color, say `amber`, your dictionary should still contain an entry `"amber": 0`.

Here is some sample data:

In [19]:
class Person:
    def __init__(self, eye_color):
        self.eye_color = eye_color

In [20]:
from random import seed, choices
seed(0)
persons = [Person(color) for color in choices(eye_colors[2:], k = 50)]

In [36]:

def count_eye_colors(people)->dict:
    # Start every known color at 0 so colors nobody has still appear in the result
    count = Counter(dict.fromkeys(eye_colors, 0))
    # update() adds the observed counts in place and keeps the zero entries
    count.update(person.eye_color for person in people)
    return count

eye_colors_freqs = count_eye_colors(persons)

print(eye_colors_freqs)
print(
    'eye_colors keys are all contained in count_eye_colors(persons)?',
    all( color in eye_colors_freqs for color in eye_colors )
)

Counter({'violet': 12, 'gray': 10, 'red': 10, 'green': 8, 'hazel': 7, 'brown': 3, 'amber': 0, 'blue': 0})
eye_colors keys are all contained in count_eye_colors(persons)? True


As you can see, we built up a list of `Person` objects, none of which should have `amber` or `blue` eye colors.

Write a function that returns a dictionary with the correct counts for each eye color listed in `eye_colors`.
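
One detail worth keeping in mind for the zero-count wrinkle: adding two `Counter` objects with `+` discards entries whose count is zero or negative, whereas `Counter.update` keeps them. Here is a small sketch with a shortened, hypothetical color list (the names `zeros`, `observed`, `added`, and `updated` are illustrative only):

```python
from collections import Counter

# Shortened, hypothetical data for illustration
eye_colors = ("amber", "blue", "brown")
observed = ["brown", "brown"]

# Every known color starts at zero
zeros = Counter(dict.fromkeys(eye_colors, 0))

# Counter addition keeps only strictly positive counts
added = zeros + Counter(observed)

# update() adds counts in place and leaves the zero entries alone
updated = Counter(zeros)
updated.update(observed)

print(dict(added))
print(dict(updated))
```

The results are converted with `dict(...)` before printing so that the difference in keys is visible. In recent Python versions, comparing two `Counter` objects directly treats missing keys as zero counts, which would hide it.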

#### Exercise #3

You are given three JSON files: one with a default set of settings and two with environment-specific settings.
The files are included in the downloads, and are named:
* `common.json`
* `dev.json`
* `prod.json`

Your goal is to write a function that has a single argument (the environment name) and returns the "combined" dictionary that merges the common settings with that environment's settings, with the environment-specific settings overriding any common settings already defined.

For simplicity, assume that the argument values are going to be the same as the file names, without the `.json` extension. So for example, `dev` or `prod`.

The wrinkle: We don't want to duplicate data for the "merged" dictionary - use `ChainMap` to implement this instead.
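
To make the "no duplication" point concrete, here is a minimal sketch using two small in-memory dictionaries instead of the JSON files. The setting names and values are hypothetical. A `ChainMap` searches its maps from left to right, so the environment dictionary goes first, and it holds references to the underlying dictionaries rather than copies.

```python
from collections import ChainMap

# Hypothetical settings standing in for common.json and dev.json
common = {"debug": False, "timeout": 30}
dev = {"debug": True}

# Lookups try `dev` first, then fall back to `common`
settings = ChainMap(dev, common)

print(settings["debug"])    # found in dev, which overrides common
before = settings["timeout"]
print(before)               # only defined in common

# The ChainMap is a view: a change to an underlying dict is visible through it
common["timeout"] = 60
after = settings["timeout"]
print(after)
```

A flat `ChainMap` like this only layers top-level keys. If both files contain a nested dictionary under the same key, the environment's nested dictionary hides the common one entirely unless the nested levels are chained as well.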

In [41]:
import json
from collections import ChainMap

envs_keys = ['common','dev','prod']

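def load_json(name):
    # Read and parse `<name>.json`; the with-block closes the file afterwards
    with open(name + '.json') as f:
        return json.load(f)

envs = {k: load_json(k) for k in envs_keys}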

def get_env(env_key):
    return merge(envs['common'], envs[env_key] )

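def merge(base:dict, to_merge:dict)->ChainMap:
    # The empty first map receives the nested merges, so neither input is mutated
    cmap = ChainMap({}, to_merge, base)
    for key in cmap.keys():
        # Recurse when either side holds a nested dictionary under this key
        if any(
                isinstance(dct.get(key, None), dict)
                for dct in (base, to_merge,)
            ):
            cmap[key] = merge(
                base.get(key, {}),
                to_merge.get(key,{})
            )
    return cmap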
# common_env = get_env('common')
dev_env = get_env('dev')
prod_env = get_env('prod')

def cast_to_dict(cmap:ChainMap)->dict:
    return {
        # A conditional expression avoids the and/or pitfall when the nested result is an empty dict
        key: cast_to_dict(val) if isinstance(val, ChainMap) else val
        for key, val in cmap.items()
    }

# print(json.dumps(cast_to_dict(dev_env), indent=2))

print(prod_env['logs'] == envs['common']['logs'])
print(dev_env['logs'] == envs['dev']['logs'])


True
True
