# Dictionaries

#### Introduction to Dictionaries
A dictionary is a data structure that associates keys with values (each key maps to a specific value).

NOTE: Each key MUST be unique, although values do not need to be unique

In [14]:
country_codes = {'Finland': 'fi', 'South Africa': 'za', 'Nepal': 'np'}

country_codes

# we can additionally use len(country_codes) to determine length
# or use 'country_codes' as a condition to determine if empty (e.g. if country_codes: )

{'Finland': 'fi', 'South Africa': 'za', 'Nepal': 'np'}
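
To illustrate the uniqueness rule, here is a minimal sketch (the `codes` dictionary is made up for this example and is not part of the original notebook). Repeating a key in a literal does not raise an error; the later value replaces the earlier one.

```python
# Hypothetical example: 'Finland' appears twice in the literal
codes = {'Finland': 'xx', 'Nepal': 'np', 'Finland': 'fi'}

print(codes)        # the later value for 'Finland' replaces the earlier one
print(len(codes))   # number of key-value pairs actually stored

# an empty dictionary is falsy, a non-empty one is truthy
if codes:
    print('codes is not empty')
```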

#### Iterating through a Dictionary
The items() method returns a view object containing the key-value pairs of a dictionary.

In [24]:
days_per_month = {'January': 31, 'February': 28, 'March': 31}

for month, days in days_per_month.items():
    print(f'{month} has {days} days')

January has 31 days
February has 28 days
March has 31 days


#### Basic Dictionary Operations
1. Get a specific value from a dictionary by specifying the key in square brackets, similar to indexing a list.
2. Add a key value pair to the dictionary through assignment
3. Remove a key-value pair by specifying the key (using 'del' keyword)
4. Pop a key-value pair by specifying the key

In [53]:
roman_numerals = {'I': 1, 'II': 2, 'III': 3, 'V': 5, 'X': 10}

# Getting a value by specifying a key
print(roman_numerals['V'])

# Adding a key-value pair through assignment
roman_numerals['L'] = 50
print(roman_numerals)

# Removing a key-value pair by specifying the key
del roman_numerals['III']
print(roman_numerals)

# Popping a key-value pair (removing it)
print(roman_numerals.pop('X'))
print(roman_numerals)

5
{'I': 1, 'II': 2, 'III': 3, 'V': 5, 'X': 10, 'L': 50}
{'I': 1, 'II': 2, 'V': 5, 'X': 10, 'L': 50}
10
{'I': 1, 'II': 2, 'V': 5, 'L': 50}


Additional Notes on Dictionary Keys:
- attempting to access a nonexistent key with square brackets raises 'KeyError'
- prevent this error using the 'get' method (e.g. roman_numerals.get('III') ), which returns None if the key is missing
- can use 'in' & 'not in' to test if a key exists in the dictionary

In [79]:
roman_numerals = {'I': 1, 'II': 2, 'III': 3, 'V': 5, 'X': 10}

# using get() to find value of a key - 2nd argument is the default returned if the key is missing
print(roman_numerals.get('III', 'not in dictionary'))

# using 'in' to see if key exists
print('V' in roman_numerals)

3
True
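
The cell above only exercises keys that exist. As a hedged sketch of the missing-key cases described in the notes (the smaller `roman_numerals` dictionary here is redefined just for this example):

```python
roman_numerals = {'I': 1, 'II': 2, 'V': 5}

# indexing with a missing key raises KeyError
try:
    roman_numerals['III']
except KeyError as error:
    print('KeyError:', error)

# get() returns None when the key is missing and no default is given
missing = roman_numerals.get('III')
print(missing)

# a second argument supplies the value to return instead of None
fallback = roman_numerals.get('III', 0)
print(fallback)

# 'not in' is the negated membership test
print('III' not in roman_numerals)
```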


#### Dictionary Views
Methods items(), keys(), and values() each return a view of a dictionary's data. A view is iterable but is not a list (convert with list() if a list is needed)
- keys() gets a view of the keys of a target dictionary
- values() gets a view of the values of a target dictionary
- items() gets a view of the key-value pairs of a dictionary (as tuples)

In [107]:
months = {'January': 1, 'February': 2, 'March': 3, 'December': 12}
# what a dictionary view looks like:
months.keys()

dict_keys(['January', 'February', 'March', 'December'])

In [109]:
# using keys()
months_view = months.keys()

# iterating months_view
for key in months_view:
    print(key, end='\t')

January	February	March	December	

In [111]:
# converting a dictionary's keys to a list
list(months.keys())

['January', 'February', 'March', 'December']

In [117]:
# converting a dictionary's key-value pairs into a list (of tuples)
list(months.items())

[('January', 1), ('February', 2), ('March', 3), ('December', 12)]
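
One practical difference between a view and a list copy is that a view keeps tracking the dictionary. A small sketch, with names (`keys_view`, `snapshot`) chosen only for this illustration:

```python
months = {'January': 1, 'February': 2}

keys_view = months.keys()    # a live view of the dictionary's keys
snapshot = list(keys_view)   # an independent list copy

months['March'] = 3          # modify the dictionary afterwards

print(keys_view)   # the view reflects the new key
print(snapshot)    # the list copy does not
```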

#### Dictionary Comparisons
Dictionaries can be compared using '==' and '!='.

NOTE: order of key-value pairs is of no significance

In [130]:
country_capitals1 = {'Belgium': 'Brussels', 'Haiti': 'Port-au-Prince'}
country_capitals2 = {'Nepal': 'Kathmandu', 'Uruguay': 'Montevideo'}
country_capitals3 = {'Haiti': 'Port-au-Prince', 'Belgium': 'Brussels'}

country_capitals1 == country_capitals3

True

#### Ex. Using Dictionaries for Grades

In [141]:
"""Using a dictionary to represent instructor's grade book"""
grade_book = {
    'Susan': [92,85,100],
    'Eduardo': [83,95,79],
    'Azizi': [91,89,82],
    'Pantipa': [97,91,92]
}
all_grades_total = 0
all_grades_count = 0

# iterate through each student to sum that student's grades, report the average
#     for that individual student, & add the student's total to the class total
for name, grades in grade_book.items():
    total = sum(grades)
    print(f'Average for {name} is {total/len(grades):.2f}')
    all_grades_total += total
    all_grades_count += len(grades)

print(f"Class's average is: {all_grades_total/all_grades_count:.2f}")

Average for Susan is 92.33
Average for Eduardo is 85.67
Average for Azizi is 87.33
Average for Pantipa is 93.33
Class's average is: 89.67


#### Ex. Using Dictionaries for Word Counts

In [139]:
"""Tokenizing a string and counting unique words."""

text = ('this is sample text with several words '
        'this is more sample text with some different words')

# create a dictionary to hold key-value pairs
#    in form of 'word': frequency
word_counts = {}

# count occurrences of each unique word
for word in text.split(): # split() splits a string into a list
    if word in word_counts:
        word_counts[word] += 1 # update existing key-value pair
    else:
        word_counts[word] = 1 # insert new key-value pair

print(f'{"WORD":<12}COUNT')

for word, count in sorted(word_counts.items()):
    print(f'{word:<12}{count}')

print('\nNumber of unique words:',len(word_counts))

WORD        COUNT
different   1
is          2
more        1
sample      2
several     1
some        1
text        2
this        2
with        2
words       2

Number of unique words: 10


#### Ex. Word Counts using Collections Module

In [144]:
"""Performing the same task from above using collections"""
from collections import Counter

text = ('this is sample text with several words '
        'this is more sample text with some different words')

counter = Counter(text.split())

for word, count in sorted(counter.items()):
    print(f'{word:>12}\t{count}')

   different	1
          is	2
        more	1
      sample	2
     several	1
        some	1
        text	2
        this	2
        with	2
       words	2


In [149]:
# out of curiosity, this is what the counter object looks like:
# (Counter is a subclass of dict, so it supports the usual dictionary operations)
counter

Counter({'this': 2,
         'is': 2,
         'sample': 2,
         'text': 2,
         'with': 2,
         'words': 2,
         'several': 1,
         'more': 1,
         'some': 1,
         'different': 1})

#### Inserting Key-Value Pairs - 'update()'
Add or overwrite key-value pairs using update(), which modifies the dictionary in place. Pass either a dictionary (using '{ }' notation to specify key & value) or keyword arguments (key=value format)

In [157]:
country_codes = {}

# using curly braces format
country_codes.update({'South Africa': 'za'})
print(country_codes)

# using keyword-argument ('=') format
country_codes.update(Australia='au')
print(country_codes)

{'South Africa': 'za'}
{'South Africa': 'za', 'Australia': 'au'}
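
update() does more than insert new pairs. The sketch below (with a deliberately wrong starting value 'xx', invented for this example) shows it replacing the value of an existing key, and accepting an iterable of (key, value) tuples instead of a dictionary.

```python
# 'xx' is a placeholder value to be corrected
country_codes = {'South Africa': 'za', 'Nepal': 'xx'}

# an existing key gets its value replaced
country_codes.update({'Nepal': 'np'})

# update() also accepts an iterable of (key, value) pairs
country_codes.update([('Finland', 'fi')])

print(country_codes)
```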


#### Dictionary Comprehensions
Dictionary comprehensions provide a convenient notation for quickly generating dictionaries, often by mapping one dictionary to another.

In [171]:
# reverse a dictionary
months = {'January': 1, 'February': 2, 'March': 3}
months2 = {number: name for name, number in months.items()}
months2

# NOTE: if we swap keys & values and a value appears more than once, the later pair
#    overwrites the earlier one, so the result has fewer entries (keys must be unique)

{1: 'January', 2: 'February', 3: 'March'}
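
To see what happens when values are not unique, here is a small sketch using a hypothetical `days` dictionary in which two months share the value 31:

```python
# Hypothetical data: 'January' and 'March' both map to 31
days = {'January': 31, 'February': 28, 'March': 31}

# swapping keys and values: 31 can only be a key once
inverted = {count: name for name, count in days.items()}

print(inverted)        # 'March' replaced 'January' as the value for 31
print(len(inverted))   # one entry fewer than the original
```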

In [168]:
# mapping a list of grades into an average
grades = {'Sue': [98,87,94], 'Bob': [84,95,91]}
grades2 = {k: sum(v) / len(v) for k, v in grades.items()}
grades2

{'Sue': 93.0, 'Bob': 90.0}

# Sets

#### Introduction to Sets
A set is an unordered collection of unique values. The elements must be immutable (hashable), although the set itself is mutable. Created using curly braces { }
- iterable but not sequences
- do not support indexing and slicing with square brackets, []
- NO DUPLICATES!

In [174]:
colors = {'red','orange','yellow','green','red','blue'}
colors

{'blue', 'green', 'orange', 'red', 'yellow'}

In [178]:
# can determine a set's length
print(len(colors))

# check whether a value exists in a set
print('red' in colors)

5
True
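
The restrictions listed above can be checked directly. This is a minimal sketch; the `points` set is an assumption introduced only for the example.

```python
colors = {'red', 'green', 'blue'}

# sets are not sequences, so indexing raises TypeError
try:
    colors[0]
except TypeError as error:
    print('TypeError:', error)

# elements must be hashable: a tuple is accepted, a list is not
points = {(0, 0), (1, 2)}
try:
    invalid = {[0, 0]}
except TypeError as error:
    print('TypeError:', error)

# the set itself can still be modified
colors.add('yellow')
print(len(colors))
```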


#### Iterating through a set

In [183]:
for color in colors:
    print(color.upper(),end='\t')

GREEN	YELLOW	ORANGE	RED	BLUE	

#### Creating a Set
We can create a set from any iterable using the set() function. An empty set must be created with set(), because empty curly braces { } create an empty dictionary.

In [197]:
numbers = list(range(10)) + list(range(5))
print(numbers)

# create a set of the unique numbers from above
print(set(numbers))

# create empty set
a_set = set()
print(a_set)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
set()
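
A quick check of why set() is needed for an empty set, plus set() applied to a string (variable names are illustrative only):

```python
empty_braces = {}     # this is a dictionary, not a set
empty_set = set()

print(type(empty_braces))
print(type(empty_set))

# set() accepts any iterable; a string contributes its unique characters
letters = set('hello')
print(len(letters))
```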


#### Set Operations
We can find the union ( | ), intersection ( & ) & difference ( - ) of two sets
- union: set consisting of all the unique elements from both sets
- intersection: set consisting of all unique elements that the two sets have in common
- difference: set consisting of all the elements in the left operand that are not in the right operand

In [11]:
set_a = {1,3,5}
set_b = {2,3,4}

# getting the union of two sets
print(set_a | set_b)

# getting the intersection of two sets
print(set_a & set_b)

# getting the difference of two sets
print(set_a - set_b)

{1, 2, 3, 4, 5}
{3}
{1, 5}


Additionally, we can find the symmetric difference ( ^ ) & determine whether two sets are disjoint.
- symmetric difference: set consisting of the elements of both sets that are not in common with one another (as opposed to regular difference, which only keeps elements from the left operand)
- disjoint: use 'isdisjoint()' method to determine if two sets have no common elements

In [22]:
set_a = {1,3,5}
set_b = {2,3,4}

# getting symmetric difference
print(set_a ^ set_b)

# checking for disjoint
print(set_a.isdisjoint(set_b)) # set_a & set_b share a common value, so False

{1, 2, 4, 5}
False
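
Each operator also has a method form: union(), intersection(), difference() and symmetric_difference(). A sketch of how they differ from the operators: the methods accept any iterable, while the operators require a set on both sides. The `other` list is an assumption for this example.

```python
set_a = {1, 3, 5}
other = [2, 3, 4]   # a list, not a set

union_result = set_a.union(other)
intersection_result = set_a.intersection(other)
difference_result = set_a.difference(other)
symmetric_result = set_a.symmetric_difference(other)

print(union_result, intersection_result, difference_result, symmetric_result)

# the operator form requires a set on both sides
try:
    set_a | other
except TypeError as error:
    print('TypeError:', error)
```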


We can also perform an in-place union ( |= ) by using the update() method, which accepts any iterable

In [25]:
numbers = {1,3,5}

# using update() to modify the targeted set
numbers.update(range(10))
print(numbers)

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}


#### Augmented Assignments
We can perform union ( |= ), intersection ( &= ), difference ( -= ) & symmetric difference ( ^= ) in place on the same set using augmented assignments
- We can additionally use the methods 'update', 'intersection_update', 'difference_update' and 'symmetric_difference_update', passing an iterable argument

In [33]:
numbers = {1,3,5}

# augmented union
numbers |= {8,5,6}
print(numbers)

numbers.difference_update({1,2,3})
print(numbers)

{1, 3, 5, 6, 8}
{5, 6, 8}
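
The remaining in-place operators and methods follow the same pattern. A short sketch, with the expected set after each step noted in the comments:

```python
numbers = {1, 3, 5, 6, 8}

numbers &= {1, 5, 6, 9}    # augmented intersection: {1, 5, 6}
numbers -= {6}             # augmented difference: {1, 5}
numbers ^= {5, 7}          # augmented symmetric difference: {1, 7}

# method forms accept any iterable, not just sets
numbers.intersection_update([1, 2, 7, 7])          # still {1, 7}
numbers.symmetric_difference_update(range(7, 9))   # {1, 7} ^ {7, 8} = {1, 8}

print(numbers)
```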


#### Adding/Removing Elements
We can add elements with the add() method & remove them with the remove() method

NOTE: remove() raises 'KeyError' if the value is not in the set (discard() removes a value without raising if it is absent)

In [42]:
numbers = {2,4,6,8}

# adding elements
numbers.add(10)
print(numbers)

# removing an element
numbers.remove(2)
print(numbers)

{2, 4, 6, 8, 10}
{4, 6, 8, 10}
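
A sketch of the missing-value case, contrasting remove() with discard(), which does nothing when the value is absent:

```python
numbers = {4, 6, 8}

# remove() raises KeyError for a value that is not present
try:
    numbers.remove(2)
except KeyError as error:
    print('KeyError:', error)

# discard() silently ignores a missing value
numbers.discard(2)

# adding a value that is already present leaves the set unchanged
numbers.add(4)
print(len(numbers))
```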


We can also use pop() to remove AND return an element. Unlike dictionaries, we cannot specify which element to pop: sets are unordered, so pop() removes an arbitrary element (not necessarily the last one added)

NOTE: 'KeyError' is raised if the set is empty when pop() is called

In [48]:
numbers = {2,4,6,8}
numbers.pop()

8
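
Since the popped element is arbitrary, code should not depend on which one comes back. A hedged sketch that only relies on what pop() guarantees (the names `removed` and `empty` are illustrative):

```python
numbers = {2, 4, 6, 8}

removed = numbers.pop()        # some element of the set
print(removed not in numbers)  # it is no longer in the set
print(len(numbers))            # one element fewer

# pop() on an empty set raises KeyError
empty = set()
try:
    empty.pop()
except KeyError as error:
    print('KeyError:', error)
```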

Empty a set with clear()

In [58]:
numbers = {2,4,6}
numbers.clear()
print(numbers)

set()


# Data Science - Dynamic Visualizations

To produce a dynamic visualization, the code needs to be enhanced with the Matplotlib animation module's FuncAnimation function to update the bar plot dynamically.

#### Law of Large Numbers - 6 sided die
The example below illustrates the law of large numbers: as the number of rolls of a fair six-sided die grows, the proportion of rolls showing each face approaches one-sixth

In [87]:
# RollDieDynamic.py
"""Dynamically graphing frequencies of die rolls."""
from matplotlib import animation
import matplotlib.pyplot as plt
import random
import seaborn as sns
import sys

def update(frame_number, rolls, faces, frequencies):
    """Configures bar plot contents for each animation frame."""

    # roll die and update frequencies
    for i in range(rolls):
        frequencies[random.randrange(1,7) - 1] += 1
    # reconfigure plot for updated die frequencies
    plt.cla() # clear old contents of current Figure
    axes = sns.barplot(x=faces, y=frequencies, palette='bright') # new bars
    axes.set_title(f'Die Frequencies for {sum(frequencies):,} Rolls')
    axes.set(xlabel='Die Value', ylabel='Frequency')
    axes.set_ylim(top=max(frequencies) * 1.10) # scale y-axis by 10%

    # display frequency and percentage above each patch (bar)
    for bar, frequency in zip(axes.patches, frequencies):
        text_x = bar.get_x() + bar.get_width() / 2.0
        text_y = bar.get_height()
        text = f'{frequency:,}\n{frequency / sum(frequencies):.3%}'
        axes.text(text_x, text_y, text, ha='center', va='bottom')

# number of frames and rolls per frame (hard-coded for the notebook;
# the script version reads them from the command line, as in the trailing comments)
number_of_frames = 10000 # int(sys.argv[1])
rolls_per_frame = 600 # int(sys.argv[2])

sns.set_style('whitegrid') # white background with gray grid lines
figure = plt.figure('Rolling a Six-Sided Die') # Figure for animation
values = list(range(1,7)) # die faces for display on x-axis
frequencies = [0] * 6 # six-element list of die frequencies

# configure and start animation that calls function update
die_animation = animation.FuncAnimation(
    figure, update, repeat=False, frames=number_of_frames, interval=33,
    fargs=(rolls_per_frame, values, frequencies)
)

plt.show() # display window

<Figure size 640x480 with 0 Axes>
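
The animation does not show in a static export, so the same convergence can be sketched without a plot. This is a simplified, text-only version and not the animated script; the seed and the checkpoints (60, 6,000 and 600,000 rolls) are arbitrary choices made for this example.

```python
import random

random.seed(2024)  # arbitrary seed so the run is repeatable

frequencies = [0] * 6   # six-element list of die frequencies
total_rolls = 0
proportions = []

# report the proportion of each face at a few checkpoints
for target in (60, 6_000, 600_000):
    while total_rolls < target:
        frequencies[random.randrange(1, 7) - 1] += 1
        total_rolls += 1
    proportions = [count / total_rolls for count in frequencies]
    formatted = ' '.join(f'{p:.3f}' for p in proportions)
    print(f'{total_rolls:>7,} rolls: {formatted}')
```

With more rolls, each printed proportion should sit closer to 1/6 (about 0.167).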

#### FuncAnimation
FuncAnimation has two required arguments (items 1 and 2 below); the other arguments used in this example (items 3 to 6) are optional:
1. figure ~ the Figure object in which to display the animation
2. update ~ the function to call once per animation frame
3. repeat ~ 'False' terminates the animation after the specified number of frames. If 'True' (default), when the animation completes it restarts from the beginning
4. frames ~ the total number of animation frames, which controls how many times FuncAnimation calls update
5. interval ~ the number of milliseconds (33, in this case) between animation frames. After each call to update, FuncAnimation waits 33 milliseconds before making the next call
6. fargs ~ a tuple of other arguments to pass to the function specified in FuncAnimation's second argument. The arguments you specify in the fargs tuple correspond to update's parameters rolls, faces and frequencies