<div class="pagebreak"></div>

# Dictionaries
Dictionaries are collections of key-value pairs. Conceptually they are unordered, although as of Python 3.7 they preserve insertion order. Generally, we use strings for keys, but keys can be any immutable (more precisely, hashable) type. The primary immutable types we have seen so far are strings, ints, and floats. Python also has an immutable counterpart to the list, called a tuple. Values can be of any type, including other dictionaries and lists.
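
As a small sketch of the key rule above, a tuple can serve as a key while a list cannot. The name `grid` and its coordinates are made up for illustration.

```python
# Tuples are immutable (and hashable when their elements are), so they work as keys.
grid = {(0, 0): "origin", (1, 2): "point A"}
print(grid[(1, 2)])

# Lists are mutable, so Python rejects them as keys with a TypeError.
list_key_rejected = False
try:
    bad = {[0, 0]: "origin"}
except TypeError as err:
    print("TypeError:", err)
    list_key_rejected = True
```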

Keys must be unique. If we attempt to store another key-value pair where the key already exists, the new value replaces the previous value in the dictionary.

Programmers use dictionaries (i.e., [associative arrays](https://en.wikipedia.org/wiki/Associative_array)) quite frequently. For example, a dictionary can translate one value to another, e.g., a stock symbol to a stock name or a word in English to its counterpart in another language. We could also have a dictionary where the key is a university course name, and each value is a list of the students in that course. Another dictionary could map [HTTP response status codes](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status) to their descriptions. Retailers often use [SKUs](https://en.wikipedia.org/wiki/Stock_keeping_unit) to track products and inventory. The SKU could be the key, while the value may be another dictionary keeping track of a series of properties the retailer needs for each product: name, purchase_price, selling_price, inventory_count, …
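
To make two of these use cases concrete, here is a rough sketch. The SKUs, product names, and prices are hypothetical, and the syntax used to create the dictionaries is covered in the sections that follow.

```python
# Hypothetical SKUs and product properties, for illustration only.
inventory = {
    "SKU-1001": {"name": "Coffee mug", "purchase_price": 2.50,
                 "selling_price": 7.99, "inventory_count": 120},
    "SKU-1002": {"name": "Tea kettle", "purchase_price": 11.00,
                 "selling_price": 24.99, "inventory_count": 35},
}

# Look up one product by its SKU, then one property of that product.
product = inventory["SKU-1002"]
print(product["name"], product["selling_price"])

# Map a few HTTP status codes to their descriptions.
http_status = {200: "OK", 404: "Not Found", 500: "Internal Server Error"}
print(http_status[404])
```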

As with lists, dictionaries are another example of [data structures](https://en.wikipedia.org/wiki/Data_structure). A dictionary organizes data as a collection of key-value pairs. Accessing the dictionary with the key allows us to quickly retrieve a particular value. The primary relationship in the dictionary is the key-to-value mapping. Keys have no meaningful ordering relationship to other keys. However, each key must be unique within the dictionary. Throughout this notebook, you will see the various operations that exist for dictionaries.

As with other collections, we have different ways of creating dictionaries - literals with `{}` and the `dict()` function.

## Creating: Literals
With literals, we can either create an empty dictionary or use a sequence of key-value pairs to  populate a dictionary.

In [None]:
empty_dict = {}
empty_dict

In [None]:
stocks = { "AXP"  : "American Express",
           "AMGN" : "Amgen",
           "AAPL" : "Apple",
           "BA"   : "Boeing"}
print(stocks)

As a side note, we intentionally used extra whitespace while creating the dictionary to make the code easier to read. We can format this code multiple ways:
```
stocks = {'AXP': 'American Express', 'AMGN': 'Amgen', 'AAPL': 'Apple', 'BA': 'Boeing'}
```
```
stocks = { "AXP" : "American Express", "AMGN" : "Amgen",
           "AAPL" : "Apple",
           "BA" : "Boeing"}
```

## Creating: dict()
We can pass named arguments and values to the built-in `dict()` function.

In [None]:
more_stocks = dict(CAT='Caterpillar', CVX='Chevron', CSCO='Cisco', KO='Coca-Cola')
more_stocks

With this named argument method, argument names must follow the naming rules for variables: no spaces, no reserved words, and the name must start with a letter or underscore.
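
As a brief sketch of that limitation: keys that are not valid identifiers cannot be passed as named arguments, but a literal accepts them. The ticker `BRK.B` is used here only as an example of a symbol containing a period.

```python
# "BRK.B" contains a period, so it cannot be written as a keyword argument:
#   dict(BRK.B='Berkshire Hathaway')   # SyntaxError
# A literal accepts any hashable key.
odd_symbols = {"BRK.B": "Berkshire Hathaway"}
print(odd_symbols)

# Keyword arguments always produce string keys, so numeric keys need a literal as well.
status = {200: "OK", 404: "Not Found"}
print(status)
```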

We can also create a dictionary from a sequence of two-value sequences, where each inner sequence holds a key and its value:

In [None]:
from_list_of_list = dict([ ['DIS', 'Walt Disney Co'], ['DOW', 'Dow'], ['GS', 'Goldman Sachs'], ['HD', 'Home Depot'] ])
from_list_of_list

## Combining Dictionaries
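Python has several ways to combine dictionaries; we look at two of them here.

First, we can unpack both dictionaries into a new dictionary literal with `{**a, **b}`. If both dictionaries contain the same key, the value from the later one (`b`) wins.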

In [None]:
combined = { **more_stocks, **from_list_of_list}
combined

Or, we can use the `.update()` method, which modifies the dictionary in place rather than creating a new one.  If both dictionaries contain the same key, the value from the dictionary in the argument takes precedence.

In [None]:
more_stocks.update(from_list_of_list)
more_stocks

In [None]:
stocks.update(combined)
stocks.update(more_stocks)
stocks
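
To see the precedence rule in isolation, the following sketch uses two small dictionaries that share the key `'KO'`. The variable names and the alternate company name are chosen only for illustration.

```python
# Both dictionaries contain the key 'KO'; the names are deliberately different.
first = {"KO": "Coca-Cola", "CAT": "Caterpillar"}
second = {"KO": "The Coca-Cola Company", "CVX": "Chevron"}

# Unpacking builds a new dictionary; later entries win, and the inputs are unchanged.
merged = {**first, **second}
print(merged)   # 'KO' maps to 'The Coca-Cola Company'
print(first)    # 'KO' still maps to 'Coca-Cola'

# update() modifies the dictionary it is called on, in place.
first.update(second)
print(first == merged)   # True
```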

## Number of Entries / Length
Use `len()` to get a dictionary's size (number of entries).

In [None]:
len(stocks)

## Adding, Changing, and Deleting Keys
We can add new entries and change the values of existing keys by using the `[]` operator.


In [None]:
stocks['UNH'] = 'UnitedHealth Group Inc'
stocks['WBA'] = 'Walgreens Boots Alliance, Inc'
stocks['WMT'] = 'Wally World'
len(stocks)

We now have half of the thirty stocks of the Dow Jones Industrial Average, but we should correct Walmart's name.

In [None]:
print(stocks['WMT'])
stocks['WMT'] = 'Walmart Inc'
print(stocks['WMT'])

In [None]:
stocks['TSLA'] = 'Tesla, Inc'
stocks['TWTR'] = 'Twitter Inc'

Whoa, how did those last two appear?  So what is Elon Musk doing now?

We can delete an entry in a dictionary with `del`.

In [None]:
del stocks['TWTR']

We can also get an item's value and remove that entry simultaneously with `pop()`.

In [None]:
name = stocks.pop('TSLA')
print(name)
print(len(stocks))
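
Both `del` and `pop()` raise a `KeyError` when the key is missing. As a minimal sketch (the `demo` dictionary is made up for this example), `pop()` also accepts an optional second argument that is returned instead:

```python
demo = {"AXP": "American Express", "AAPL": "Apple"}

# del raises KeyError when the key is missing.
try:
    del demo["TSLA"]
except KeyError as err:
    print("KeyError:", err)

# pop() with a second argument returns that default instead of raising.
removed = demo.pop("TSLA", None)
print(removed)      # None
print(len(demo))    # 2
```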

To reiterate, each key in a dictionary must be unique. If a routine tries to put a key-value pair into the dictionary and the key already exists, then the existing key's value is replaced with the new value.

## Getting an item by key
We have already seen how to access an item by `[key]`, but we can also get the value with `get()`.

In [None]:
print(stocks['WBA'])
print(stocks.get('UNH'))

With `get()`, we can specify a default value to return if the key doesn't exist in the dictionary.

In [None]:
stocks.get('BAC','Unknown stock symbol')
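
The difference between `[key]` and `get()` shows up when the key is missing. A short sketch, using a small sample dictionary named `prices` for illustration:

```python
prices = {"AAPL": 137.13, "HD": 289.24}  # sample values for illustration

print(prices.get("BAC"))        # None: key is missing and no default was given
print(prices.get("BAC", 0.0))   # 0.0: the supplied default

# Bracket access raises KeyError for a missing key.
missing_key_raised = False
try:
    prices["BAC"]
except KeyError as err:
    print("KeyError:", err)
    missing_key_raised = True

# get() only reads; it never adds the key to the dictionary.
print("BAC" in prices)          # False
```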

The optional / default value comes in handy if we count items - such as the number of times various words appear in a document.  In this case, the key is the word, and the value is the occurrence count.  As we update the count, if we have not yet seen a particular word, we use a default value of `0`.

In [None]:
word_counts = {}
word = 'test'
word_counts[word] = word_counts.get(word,0) + 1
print(word_counts)
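
Extending the single-word cell above to a whole sentence might look like the following sketch. The sample text is arbitrary, and splitting on whitespace is a simplification that ignores punctuation and capitalization.

```python
text = "the quick brown fox jumps over the lazy dog the end"

counts = {}
for word in text.split():
    # Start from 0 the first time we see a word, then add one.
    counts[word] = counts.get(word, 0) + 1

print(counts)
print(counts["the"])   # 3
```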

## Getting all Keys
Use the `keys()` method to get an iterable view of all the keys in a dictionary. Before Python 3, this returned a list, but now it returns an object of type `dict_keys`.  We can convert that view to a list with the built-in function `list()`.  The advantage of the `dict_keys` view is that Python does not need to build a new list (which takes time and memory), which matters for sizable dictionaries.

In [None]:
symbols = stocks.keys()
print(type(symbols))
print(symbols)
symbol_list = list(symbols)
print(symbol_list)
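
One consequence of `keys()` returning a view rather than a list is that the view reflects later changes to the dictionary, while a list made from it is a snapshot. A small sketch with a made-up `holdings` dictionary:

```python
holdings = {"AXP": "American Express", "AMGN": "Amgen"}

symbols_view = holdings.keys()       # live view of the keys
symbols_snapshot = list(holdings)    # independent list, fixed at this moment

holdings["AAPL"] = "Apple"

print(symbols_view)        # includes 'AAPL'
print(symbols_snapshot)    # ['AXP', 'AMGN']
```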

## Getting all Values
Similarly, use `values()` to get an iterable view of all the values in a dictionary.

In [None]:
names = stocks.values()
print(type(names))
print(names)
name_list = list(names)
print(name_list)

## Getting all Key-Value Pairs
To get both the keys and values together as an iterable view, use `items()`.  As with `keys()` and `values()`, this now returns a type of `dict_items`.  Each item is a tuple consisting of a key and a value.

In [None]:
stocks.items()

## Iterating 
Using the `keys()`, `values()`, or `items()` method, we can iterate over the keys, values, or key-value pairs of a dictionary.

In [None]:
some_stocks = {'AXP': 'American Express', 'AMGN': 'Amgen', 'AAPL': 'Apple'}
for symbol in some_stocks.keys():
    print(symbol,end=':')
print()
# notice that we have a fencepost loop issue - code was to demonstrate iterating, 
# we could have called join() from a separator string
":".join(some_stocks.keys())

In [None]:
for name in some_stocks.values():
    print(name,end=':')
print()

In [None]:
for item in some_stocks.items():
    print("Ticker: {:>4s}, Name: {}".format(item[0], item[1]))

In [None]:
for symbol, name in some_stocks.items():
    print("{}: {}".format(symbol,name))

## Deleting all entries
To delete (remove) all entries from the dictionary, use `clear()`.

In [None]:
some_stocks = {'AXP': 'American Express', 'AMGN': 'Amgen', 'AAPL': 'Apple'}
print(len(some_stocks))
some_stocks.clear()
print(len(some_stocks))
print(some_stocks)

You can also assign an empty dictionary to the variable.  However, if two variables refer to the same dictionary, the other variable still refers to the original (non-empty) dictionary. [Visualize on PythonTutor](https://pythontutor.com/render.html#code=some_stocks%20%3D%20%7B'AXP'%3A%20'American%20Express',%20'AMGN'%3A%20'Amgen',%20'AAPL'%3A%20'Apple'%7D%0Astocks_dict%20%3D%20some_stocks%0Aprint%20%28some_stocks%29%0A%0Asome_stocks%20%3D%20%7B%7D%0Aprint%28some_stocks%29%0Aprint%28stocks_dict%29&cumulative=false&curInstr=0&heapPrimitives=true&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)

In [None]:
some_stocks = {'AXP': 'American Express', 'AMGN': 'Amgen', 'AAPL': 'Apple'}
stocks_dict = some_stocks
print(some_stocks)

some_stocks = {}
print(some_stocks)
print(stocks_dict)

## Checking if a key exists
You can check if a key exists by using the `in` operator. This evaluates to `True` or `False`.

In [None]:
'IBM' in stocks_dict
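
Note that `in` tests the keys of a dictionary, not its values. The sketch below (with a small sample dictionary named `sample`) also shows `not in` and a common pattern of checking before accessing a key:

```python
sample = {"AXP": "American Express", "AMGN": "Amgen", "AAPL": "Apple"}

print("AAPL" in sample)      # True: 'AAPL' is a key
print("Apple" in sample)     # False: 'Apple' is a value, not a key
print("IBM" not in sample)   # True

# Check first to avoid a KeyError on bracket access.
symbol = "IBM"
if symbol in sample:
    print(sample[symbol])
else:
    print(symbol, "is not in the dictionary")
```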

## Assignment
We can use `=` to have multiple variables refer to the same dictionary, as demonstrated a few cells back. Realize that any operation on one of those variables affects them all as they refer to the same underlying dictionary object.

Suppose we want to give `stocks` a more descriptive name by assigning it to a new variable.

In [None]:
dow_stocks = stocks

However, note that `dow_stocks` and `stocks` still refer to the same object (dictionary):

In [None]:
print(id(stocks))
print(id(dow_stocks))

In [None]:
s1 = {'AXP': 'American Express', 'AMGN': 'Amgen', 'AAPL': 'Apple'}
s2 = s1
s2['IBM'] = "International Business Machines"
print(s1)
s1.clear()
print(s2)

## Copying
To copy a dictionary to a new object, we can use the `copy()` method.  This creates a shallow copy: the new dictionary gets its own set of keys, but the values are copied as references, so both dictionaries point to the same value objects.  For the simple dictionaries seen so far, this causes no issues because strings are immutable. However, if you store mutable objects as values, you may have problems with unintended side effects.

In [None]:
s1 = {'AXP': 'American Express', 'AMGN': 'Amgen', 'AAPL': 'Apple'}
s2 = s1.copy()
s2['IBM'] = "International Business Machines"
print(s1)
print(s2)
s1.clear()
print(s1)
print(s2)

[Visualize on PythonTutor](https://pythontutor.com/render.html#code=s1%20%3D%20%7B'AXP'%3A%20'American%20Express',%20'AMGN'%3A%20'Amgen',%20'AAPL'%3A%20'Apple'%7D%0As2%20%3D%20s1.copy%28%29%0As2%5B'IBM'%5D%20%3D%20%22International%20Business%20Machines%22%0Aprint%28s1%29%0Aprint%28s2%29%0As1.clear%28%29%0Aprint%28s1%29%0Aprint%28s2%29&cumulative=false&curInstr=0&heapPrimitives=true&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)
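
To illustrate the side effect that a shallow copy can cause with mutable values, consider this sketch. The `by_letter` dictionary, which groups symbols into lists, is made up for the example.

```python
by_letter = {"A": ["AXP", "AMGN"], "B": ["BA"]}
shallow = by_letter.copy()

# Appending mutates the list object that both dictionaries reference.
shallow["A"].append("AAPL")
print(by_letter["A"])   # ['AXP', 'AMGN', 'AAPL']

# Assigning a new value to a key only affects the dictionary being assigned to.
shallow["B"] = ["BAC"]
print(by_letter["B"])   # ['BA']
```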

You may need to use a deep copy if you store mutable objects (e.g., a list or another dictionary).

In [None]:
import copy
s1 = {'A': {'AX': 'American Express', 'AMGN': 'Amgen', 'AAPL': 'Apple'}}
s2 = copy.deepcopy(s1)
s2['A']['AA']='Alcoa Corp'
print(s1)
print(s2)

[Visualize on PythonTutor](https://pythontutor.com/render.html#code=import%20copy%0As1%20%3D%20%7B'A'%3A%20%7B'AX'%3A%20'American%20Express',%20'AMGN'%3A%20'Amgen',%20'AAPL'%3A%20'Apple'%7D%7D%0As2%20%3D%20copy.deepcopy%28s1%29%0As2%5B'A'%5D%5B'AA'%5D%3D'Alcoa%20Corp'%0Aprint%28s1%29%0Aprint%28s2%29&cumulative=false&curInstr=0&heapPrimitives=true&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)

In the above example, we have a dictionary of dictionaries.

So `s2['A']` refers to a dictionary (stocks that begin with the letter 'A'), then `s2['A']['AA']` refers to an entry in that "sub" dictionary.  As that entry does not exist, it is created and assigned the value of 'Alcoa Corp'.

In the following cell, print the dictionary for the stock symbols beginning with 'A' - i.e., just that dictionary that the key 'A' indexes.  Then print out the value for 'AX'.

In [None]:
# Get the value that contains the dictionary in the dictionary s1.  Print the value

# From that dictionary, print out the value of the key 'AX'


## Comparing
Unlike some of the other built-in data types that hold multiple values (e.g., lists and tuples), dictionaries do not support ordering comparisons such as `<` and `>`. We can only test dictionaries for equality with `==` and `!=`.

In [None]:
s1 = {'AXP': 'American Express', 'AMGN': 'Amgen', 'AAPL': 'Apple'}
s2 = s1.copy()
print(s1 == s2)
s2['AA'] = 'Alcoa Corp'
print(s1 != s2)
print(s1 != s1)
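
Two further details, sketched with made-up dictionaries `a` and `b`: equality ignores the order in which entries were inserted, and an ordering comparison raises a `TypeError`.

```python
a = {"AXP": "American Express", "AAPL": "Apple"}
b = {"AAPL": "Apple", "AXP": "American Express"}

# Same key-value pairs, different insertion order: still equal.
print(a == b)   # True

# Ordering comparisons are not defined for dictionaries.
ordering_failed = False
try:
    a < b
except TypeError as err:
    print("TypeError:", err)
    ordering_failed = True
```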

## Sorting
To sort dictionaries, we first need to ask what are we sorting? We also need to ask what do we need to do with that sort?

For example, if we need to sort by the dictionary's keys and then print out information from the corresponding values, we can use the following pattern:

In [None]:
stock_prices = {'PG': 141.96, 'AXP': 154.42, 'AAPL': 137.13,'CSCO':43.49, 'HD':289.24, 'AMZN': 2538.23}
print(stock_prices)
for key in sorted(stock_prices):
    print("{}: Stock price: ${:,.2f}".format(key,stock_prices[key]))

To get a list of the sorted values, we need to get all of the values and then use the built-in function `sorted()` as we did for lists.

[Python's Guide for Sorting](https://docs.python.org/3/howto/sorting.html)

In [None]:
stock_prices = {'PG': 141.96, 'AXP': 154.42, 'AAPL': 137.13,'CSCO':43.49, 'HD':289.24, 'AMZN': 2538.23}
print(stock_prices)
print(sorted(stock_prices.values()))
print(sorted(stock_prices.values(),reverse=True))

To sort by values but produce a list of the corresponding keys, we need to pass a function as the `key` argument of `sorted()`. (Later notebooks will discuss passing functions in more detail.)

In [None]:
sorted(stock_prices,key=stock_prices.get)

To understand the previous command and what occurred, try running just `sorted(stock_prices)`.

In [None]:
# Sort the dictionary without any arguments


We can also produce a list of tuples, where each tuple is a key-value pair.  As shown above, to get the key-value pairs, call `items()` on the dictionary:

In [None]:
print(stock_prices.items())
items = stock_prices.items()

print(sorted(items))

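By default, `sorted()` compares tuples element by element, starting with the first item and using later items only to break ties. Because dictionary keys are unique, the result is effectively sorted by key, which is similar to sorting the dictionary itself.

To sort by the second entry in the tuple, we need to pass a function to the `key` argument.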

In [None]:
def sortAtSecondPosition(x):
    return x[1]

In [None]:
sorted(stock_prices.items(), key=sortAtSecondPosition)

Python provides a capability to create small, anonymous functions. While these functions can take any number of arguments, they can consist of only a single expression.
```
lambda p1{, pX}: expression
```
The ```{, pX}``` signifies the portion may repeat zero or more times.  p1, p2, ..., pX represent parameters.

In [None]:
sorted(stock_prices.items(), key=lambda x: x[1])
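
As an alternative sketch, the standard library's `operator.itemgetter` can stand in for the lambda, and `reverse=True` sorts from highest to lowest. Passing the sorted pairs to `dict()` produces a new dictionary whose insertion order follows the sorted order. The variable names here are our own.

```python
from operator import itemgetter

stock_prices = {'PG': 141.96, 'AXP': 154.42, 'AAPL': 137.13,
                'CSCO': 43.49, 'HD': 289.24, 'AMZN': 2538.23}

# itemgetter(1) returns a function that picks element 1 (the price) of each pair.
by_price_desc = sorted(stock_prices.items(), key=itemgetter(1), reverse=True)
print(by_price_desc[0])    # ('AMZN', 2538.23)

# Build a new dictionary from the pairs sorted by ascending price.
cheapest_first = dict(sorted(stock_prices.items(), key=lambda pair: pair[1]))
print(list(cheapest_first))
```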

*Note:* As of Python 3.7, the order of iterating over a dictionary's keys is guaranteed to be the same as the insertion order. From a practical standpoint, though, if you ever need a specific order (e.g., sorted), you should state it explicitly in code. Yes, the code may run a bit slower, but you avoid obscure errors introduced by relying upon the insertion order.

In [None]:
stock_prices = { 'AAPL': 137.13, 'AXP': 154.42, 'AMZN': 2538.23,'CSCO':43.49, 'HD':289.24, 'PG': 141.96}
print(stock_prices)
# Prefer this approach
for key in sorted(stock_prices):
    print("{}: Stock price: ${:,.2f}".format(key,stock_prices[key]))
print("-------------------")
# over this (even though the results appear the same)
for key in stock_prices:
    print("{}: Stock price: ${:,.2f}".format(key,stock_prices[key]))

## Nesting different data structures
Most of the dictionaries in this notebook have been straightforward key-value pairs of strings.  However, realize that we can create arbitrarily complex data structures by nesting additional data structures as entries.

Lists can contain other lists, but they can also include a dictionary as an entry.

Dictionaries can have lists as values, and those values can also be other dictionaries.  For example, to manage the organizational hierarchy for a company, we could have an $n$-level nested dictionary, where $n$ is the number of levels in the company's hierarchy. The keys in each nested dictionary are the direct reports of an individual.
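
A rough sketch of such a hierarchy follows. The names are hypothetical, and the helper function `count_people` is our own; it uses recursion (a function calling itself) to walk every level.

```python
# Hypothetical names; each key is a person, and each value is a dictionary
# of that person's direct reports (empty if they have none).
org = {
    "Ada": {
        "Grace": {"Linus": {}, "Margaret": {}},
        "Alan": {},
    }
}

# Direct reports of Grace, reached by indexing one level at a time.
print(list(org["Ada"]["Grace"]))   # ['Linus', 'Margaret']

def count_people(reports):
    """Count everyone in a (possibly nested) dictionary of reports."""
    return sum(1 + count_people(direct) for direct in reports.values())

print(count_people(org))           # 5
```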

## Exercises

1. Create a dictionary that maps [top-level domain names](https://en.wikipedia.org/wiki/List_of_Internet_top-level_domains) to their description.  Include at least three entries.

2. Now add another entry into your dictionary.

3. Now delete one of the first entries you created from your dictionary.

4. Iterate through your dictionary, printing the domain name and the corresponding description.
5. Create a dictionary where the keys are the numbers 1 to 10, and the values are the square of the corresponding key.
6. Find the minimum and maximum values in a dictionary.
7. For a given list of keys, delete those entries from the dictionary.
8. How could we check if a given value exists in the dictionary? How expensive is this operation? How could you measure it?

9. Physical money comes in many different denominations. Write a function that, for a given amount, returns a dictionary containing the smallest number of bills and coins that add up to that number. Do not place any entries in the dictionary that are zero. The possible denominations are 100.00, 50.00, 20.00, 10.00, 5.00, 2.00, 1.00, 0.25, 0.10, 0.05, and 0.01. 

10. Write a function to create a text-based horizontal histogram.  You should have the following function signature:
```
def create_histogram(data, title, sort=False, max_table_width=70, max_label_width=10):
```

So the following - 
```
data = { "apples": 58, "pears": 10, "grapes":35, "pineapple":70}
create_histogram(data,"Number of Fruits in Basket", max_table_width=80, max_label_width=7)
```
produces
```
                           Number of Fruits in Basket                           

 apples │*********************************************************** 58
  pears │********** 10
 grapes │************************************ 35
pineapp │************************************************************************ 70
        └───────────────────────────────────┰───────────────────────────────────┐
        0                                  35                                  70
```
Then
```
create_histogram(data,"Number of Fruits in Basket", sort=True, max_table_width=85, max_label_width=7)
```
produces
```
                              Number of Fruits in Basket                             

  pears │*********** 10
 grapes │************************************** 35
 apples │*************************************************************** 58
pineapp │***************************************************************************** 70
        └─────────────────────────────────────┰──────────────────────────────────────┐
        0                                    35                                     70
```

[Unicode Characters: Box Drawing](https://web.archive.org/web/20220403113744/https://jrgraphix.net/r/Unicode/2500-257F)

Assume all values are positive. As you start this problem, break it down into sub-problems and solve those individually. Create some wins for yourself. For example, how would you produce the row labels along the left-hand side? What other sub-problems exist? What should you tackle next?

This exercise brings together many different topics: functions, default parameters, math operations, variables, string methods, string formatting, iteration, and dictionaries.