# Dictionaries
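Dictionaries are collections of key-value pairs.  (Since Python 3.7, dictionaries preserve insertion order, but entries are looked up by key rather than by position.)  Generally, we use strings for keys, but keys can be any hashable type, which in practice means an immutable type.  (The primary immutable types we have seen so far are strings, ints, floats, and tuples.)  Values can be of any type, including other dictionaries and lists.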

Keys must be unique.  If we attempt to put in another key-value pair where the key already exists, the value is overwritten with the new value.

Programmers use dictionaries (i.e., [associative arrays](https://en.wikipedia.org/wiki/Associative_array)) quite frequently. Sometimes, it's just to translate from one value to another, such as the stock symbols to stock names shown in this notebook, or maybe a word in English to its counterpart in another language.  We could also have a dictionary where the key is the class name and the value is the list of students within that class.  Another possibility would be to map HTTP response status codes to their descriptions, like the information at https://developer.mozilla.org/en-US/docs/Web/HTTP/Status. Retailers often use [SKUs](https://en.wikipedia.org/wiki/Stock_keeping_unit) to track products and inventory.  The SKU could be the key while the value may actually be another dictionary keeping track of a series of properties the retailer needs for each product: name, purchase_price, selling_price, inventory_count, ...

Just as with other objects, we have different ways of creating dictionaries - literals with `{}` and the `dict()` function.

## Literals
With literals, we can either create an empty dictionary or use a sequence of key-value pairs to initially populate a dictionary.

In [None]:
empty_dict = {}
empty_dict

In [None]:
stocks = { "AXP"  : "American Express",
           "AMGN" : "Amgen",
           "AAPL" : "Apple",
           "BA"   : "Boeing"}
print(stocks)

As a side note, we intentionally used extra whitespace while creating the dictionary to make the code easier to read. We could have typed it in several different ways:
```
stocks = {'AXP': 'American Express', 'AMGN': 'Amgen', 'AAPL': 'Apple', 'BA': 'Boeing'}
```
```
stocks = { "AXP" : "American Express", "AMGN" : "Amgen",
           "AAPL" : "Apple",
           "BA" : "Boeing"}
```

## dict() function
With the built-in `dict()` function, we can create a dictionary from key-value pairs specified as keyword arguments or from a sequence of two-value sequences.

In [None]:
more_stocks = dict(CAT='Caterpillar', CVX='Chevron', CSCO='Cisco', KO='Coca-Cola')
more_stocks

With this keyword (named) argument method, the keys must follow the naming rules for variables: no spaces, no reserved words, and the first character must be a letter or an underscore.
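
As a small illustration of that restriction, the sketch below uses keys that could not be written as keyword arguments. The symbol `BRK.B` and the variable name `special_keys` are assumptions chosen for this example; they are not used elsewhere in this notebook.

```python
# dict(BRK.B='Berkshire Hathaway') would be a SyntaxError because of the dot,
# and dict(class='...') fails because `class` is a reserved word.
# A literal (or a sequence of pairs) accepts any hashable key.
special_keys = {'BRK.B': 'Berkshire Hathaway',
                'class': 'a reserved word used as a key'}
print(special_keys['BRK.B'])
print(special_keys['class'])
```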

And then for converting from two-value sequences:

In [None]:
from_list_of_list = dict([ ['DIS', 'Walt Disney Co'], ['DOW', 'Dow'], ['GS', 'Goldman Sachs'], ['HD', 'Home Depot'] ])
from_list_of_list

In [None]:
from_list_of_tuples = dict([ ('HON', 'Honeywell'), ('IBM', 'International Business Machines Corp'), ('INTC', 'Intel'),
                             ('JNJ', 'Johnson & Johnson'), ('JPM', 'JP Morgan Chase'), ('MCD', 'McDonald\'s')])
from_list_of_tuples

In [None]:
from_tuple_of_tuples = dict( ( ('MMM','3M Corp'), ('MRK', 'Merck'), ('MSFT', 'Microsoft'),
                               ('NKE', 'Nike'), ('PG', 'Procter & Gamble') ) )

In [None]:
from_tuple_of_mixed = dict(( ['CRM', 'Salesforce'], ['TRV', 'Travelers'], ('VZ', 'Verizon'), ('V','Visa') )  )
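
A related pattern, sketched here as an assumption rather than something this notebook relies on, pairs up two parallel lists with the built-in `zip()` and passes the resulting pairs to `dict()`. The variable names `ticker_list`, `company_list`, and `from_zip` are illustrative.

```python
ticker_list = ['DIS', 'DOW', 'HD']
company_list = ['Walt Disney Co', 'Dow', 'Home Depot']

# zip() yields ('DIS', 'Walt Disney Co'), ('DOW', 'Dow'), ... which dict() accepts
from_zip = dict(zip(ticker_list, company_list))
print(from_zip)
```

If the two lists differ in length, `zip()` stops at the shorter one, so extra items are silently dropped.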

## Combining Dictionaries
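Python has several ways to combine dictionaries; two common ones are shown here.

First, we can use the `{**a, **b}` unpacking syntax, which builds a new dictionary from the entries of `a` followed by the entries of `b`.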

In [None]:
combined = { **from_tuple_of_tuples, **from_tuple_of_mixed}
combined

Or, we can call the `.update()` method.  If both dictionaries contain the same key, the value from the dictionary passed as the argument is kept.

In [None]:
more_stocks.update(from_list_of_list)
more_stocks.update(from_list_of_tuples)
more_stocks

In [None]:
stocks.update(combined)
stocks.update(more_stocks)
stocks
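
Python 3.9 and later also accept the `|` operator for merging, with `|=` as the in-place form. The short sketch below uses two small dictionaries (`left` and `right`, names invented for this example) so that it does not disturb the notebook's variables; it assumes a Python 3.9+ interpreter.

```python
left = {'AXP': 'American Express', 'BA': 'Boeing'}
right = {'BA': 'The Boeing Company', 'KO': 'Coca-Cola'}

merged = left | right     # new dictionary; right-hand values win on duplicate keys
print(merged)

left |= right             # in-place merge, similar to left.update(right)
print(left == merged)
```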

## Number of Entries / Length
Use `len()` to get the size of (number of entries in) a dictionary.

In [None]:
len(stocks)

hmmm...  looks like we are missing a few items from this dictionary.

## Adding, Changing, and Deleting Keys
We can add and change keys by using the `[]` operator.


In [None]:
stocks['UNH'] = 'UnitedHealth Group Inc'
stocks['WBA'] = 'Walgreens Boots Alliance, Inc'
stocks['WMT'] = 'Wally World'
len(stocks)

We've got the thirty stocks of the Dow Jones Industrial Average now, but we should correct Walmart's name.

In [None]:
print(stocks['WMT'])
stocks['WMT'] = 'Walmart Inc'
print(stocks['WMT'])

In [None]:
stocks['TSLA'] = 'Tesla, Inc'
stocks['TWTR'] = 'Twitter Inc'

Whoa, how did those last two sneak in?  What's Elon Musk doing now?

We can delete an entry in a dictionary with `del`.

In [None]:
del stocks['TWTR']

We can also get an item's value and remove that entry at the same time with `pop()`.

In [None]:
name = stocks.pop('TSLA')
print(name)
print(len(stocks))
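
Both `del` and `pop()` raise a `KeyError` when the key is missing. As a hedged sketch, `pop()` also accepts a second argument that is returned instead of raising; the dictionary `sample` below is made up for this illustration.

```python
sample = {'TSLA': 'Tesla, Inc'}

# key exists: its value is returned and the entry is removed
print(sample.pop('TSLA', 'not found'))

# key is gone now: the default is returned and no error is raised
print(sample.pop('TSLA', 'not found'))
print(sample)
```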

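Keys remain unique whenever dictionaries are updated or combined.  With both `{**a, **b}` and `.update()`, if the same key appears in both dictionaries, the value from the later dictionary (the right-hand one, or the argument to `.update()`) replaces the earlier value.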
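
To make the uniqueness rule concrete, here is a minimal sketch with two tiny dictionaries that share a key. The names `first`, `second`, and `via_unpacking` and their contents are assumptions for this example only.

```python
first = {'WMT': 'Wally World', 'KO': 'Coca-Cola'}
second = {'WMT': 'Walmart Inc'}

via_unpacking = {**first, **second}   # the later dictionary wins for the shared key
first.update(second)                  # update() overwrites the value in place

print(via_unpacking)
print(first)
print(len(first))                     # 'WMT' is stored only once
```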

## Getting an item by key
We've already seen how to access an item by `[key]`, but we can also get the value with `get()`.

In [None]:
print(stocks['WBA'])
print(stocks.get('UNH'))

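With `get()`, we can specify a default value to return if the key doesn't exist in the dictionary.  Without a default, `get()` returns `None` for a missing key, whereas `[key]` raises a `KeyError`.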

In [None]:
stocks.get('BAC','Unknown stock symbol')

The optional default value comes in handy if we are counting items, such as the number of times various words appear in a document.  In this case, the key is the word and the value is the occurrence count.  As we update the count, if the word hasn't been seen yet, we can use a starting value of `0`.

In [None]:
word_counts = {}
word = 'test'
word_counts[word] = word_counts.get(word,0) + 1
print(word_counts)
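
Extending the same idea to a whole sentence, the following sketch counts every word in a short string. The sample text and the names `text` and `counts` are assumptions for illustration.

```python
text = "the quick brown fox jumps over the lazy dog the end"
counts = {}

for w in text.split():
    # get() supplies 0 the first time a word is seen
    counts[w] = counts.get(w, 0) + 1

print(counts)
print(counts['the'])
```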

## Getting all Keys
Use the `keys()` method to get an iterable view of all of the keys in a dictionary.  In Python 2, this returned a list, but it now returns an object of type `dict_keys`.  We can convert that to a list with `list()`.  The advantage of the `dict_keys` view is that it does not create a new list (which takes time and memory for large dictionaries).

In [None]:
symbols = stocks.keys()
print(type(symbols))
print(symbols)
symbol_list = list(symbols)
print(symbol_list)
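
One consequence of getting a view rather than a list is that the view tracks later changes to the dictionary. The sketch below illustrates this with a small dictionary; `portfolio`, `key_view`, and `key_snapshot` are names invented for this example.

```python
portfolio = {'AXP': 'American Express', 'BA': 'Boeing'}
key_view = portfolio.keys()        # a view onto the dictionary, not a copy
key_snapshot = list(key_view)      # a list is a one-time snapshot

portfolio['KO'] = 'Coca-Cola'

print(len(key_view))       # the view reflects the new key
print(len(key_snapshot))   # the snapshot does not
```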

## Getting all Values
Similarly, use `values()` to get an iterable view of all of the values in a dictionary.

In [None]:
names = stocks.values()
print(type(names))
print(names)
name_list = list(names)
print(name_list)

## Getting all Key-Value Pairs
To get both the keys and values together as an iterable view, use `items()`.  Just as with `keys()` and `values()`, this returns a view object, here of type `dict_items`.  Each item is a tuple consisting of the key and then the value.

In [None]:
stocks.items()

## Iterating 
Using the `keys()`, `values()`, or `items()` method, we can iterate over the keys, values, or key-value pairs (as tuples) within a dictionary.

In [None]:
some_stocks = {'AXP': 'American Express', 'AMGN': 'Amgen', 'AAPL': 'Apple'}
for symbol in some_stocks.keys():
    print(symbol,end=':')
print()
# notice that we have a fencepost loop issue - code was to demonstrate iterating, 
# we could have called join() from a separator string
":".join(some_stocks.keys())

In [None]:
for symbol in some_stocks.values():
    print(symbol,end=':')
print()

In [None]:
for item in some_stocks.items():
    print(item[0], item[1])

In [None]:
for symbol, name in some_stocks.items():
    print("%s:%s" % (symbol,name))

## Deleting all entries
To delete (remove) all entries from the dictionary, use `clear()`.

In [None]:
some_stocks = {'AXP': 'American Express', 'AMGN': 'Amgen', 'AAPL': 'Apple'}
print(len(some_stocks))
some_stocks.clear()
print(len(some_stocks))
print(some_stocks)

You can also assign an empty dictionary to the variable.  However, if you have two variables that refer to the same dictionary, the other variable still has a reference to the original dictionary and its contents.

In [None]:
some_stocks = {'AXP': 'American Express', 'AMGN': 'Amgen', 'AAPL': 'Apple'}
stocks_dict = some_stocks
print (some_stocks)

some_stocks = {}
print(some_stocks)
print(stocks_dict)

## Checking if a key exists
You can check if a key exists by using the `in` operator.

In [None]:
'IBM' in stocks_dict
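
Note that `in` looks only at the keys, not the values. A common use, sketched below, is guarding a `[]` lookup so that a missing key does not raise a `KeyError`. The dictionary `prices` and the variable `lookup_symbol` are assumptions for this example.

```python
prices = {'AAPL': 137.13, 'HD': 289.24}

print('AAPL' in prices)      # keys are searched
print(137.13 in prices)      # values are not searched
print('IBM' not in prices)   # `not in` is the negated form

lookup_symbol = 'IBM'
if lookup_symbol in prices:
    print(prices[lookup_symbol])
else:
    print(lookup_symbol, 'has no price on record')
```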

## Assignment
We can use `=` to have multiple variables refer to the same dictionary, as we demonstrated a few cells back.  Just realize that any operation on one of those variables affects them all, as they refer to the same underlying dictionary object.

Let's give `stocks` a more descriptive name.

In [None]:
dow_stocks = stocks

However, note that `dow_stocks` and `stocks` still refer to the same object (dictionary):

In [None]:
print(id(stocks))
print(id(dow_stocks))

In [None]:
s1 = {'AXP': 'American Express', 'AMGN': 'Amgen', 'AAPL': 'Apple'}
s2 = s1
s2['IBM'] = "International Business Machines"
print(s1)
s1.clear()
print(s2)

## Copying
To copy a dictionary to a new object, we can use the `copy()` method.  This creates a shallow copy in which the references themselves are copied, but not necessarily the objects that those references point to.  With the simple dictionaries this notebook has used, we will not see a problem because strings are immutable, but if you store mutable objects then this can be an issue.


In [None]:
s1 = {'AXP': 'American Express', 'AMGN': 'Amgen', 'AAPL': 'Apple'}
s2 = s1.copy()
s2['IBM'] = "International Business Machines"
print(s1)
print(s2)
s1.clear()
print(s1)
print(s2)
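
To show where a shallow copy can surprise us, the sketch below stores lists as values. The class roster and the names `classes` and `shallow` are hypothetical and chosen only for this illustration.

```python
classes = {'math': ['Ann', 'Bob']}   # hypothetical roster; the values are lists
shallow = classes.copy()

shallow['math'].append('Cy')         # mutates the one list shared by both dictionaries
shallow['art'] = ['Dee']             # adds a key to the copy only

print(classes)
print(shallow)
print(classes['math'] is shallow['math'])
```

Adding a key affects only the copy, but changing a shared list through either dictionary is visible in both.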

If you are storing mutable objects (e.g., a list or another dictionary), then you may need to use a deep copy.

In [None]:
import copy
s1 = {'A': {'AX': 'American Express', 'AMGN': 'Amgen', 'AAPL': 'Apple'}}
s2 = copy.deepcopy(s1)
s2['A']['AA']='Alcoa Corp'
print(s1)
print(s2)

In the above example, we have a dictionary of dictionaries.

So `s2['A']` refers to a dictionary (stocks that begin with the letter 'A'), and `s2['A']['AA']` refers to an entry in that "sub" dictionary.  As that entry does not exist yet, it is created and assigned the value of 'Alcoa Corp'.

In the following cell, print out the entire dictionary for the stock symbols beginning with 'A' - i.e., just the dictionary that the key 'A' points to.  Then print out the value for 'AX'.

## Comparing
Unlike some of the other built-in data types that hold multiple values (e.g., lists and tuples), we can only test dictionaries for equality with `==` and `!=`. 

In [None]:
s1 = {'AXP': 'American Express', 'AMGN': 'Amgen', 'AAPL': 'Apple'}
s2 = s1.copy()
print (s1 == s2)
s2['AA'] = 'Alcoa Corp'
print (s1 != s2)
print (s1 != s1)
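
Two details are worth seeing in code: equality does not depend on insertion order, and ordering comparisons such as `<` raise a `TypeError`. The sketch below uses the made-up names `d1` and `d2`.

```python
d1 = {'AXP': 'American Express', 'AMGN': 'Amgen'}
d2 = {'AMGN': 'Amgen', 'AXP': 'American Express'}   # same pairs, different insertion order

print(d1 == d2)        # equality compares the key-value pairs, not their order

try:
    d1 < d2            # ordering comparisons are not defined for dictionaries
except TypeError as err:
    print('TypeError:', err)
```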

## Sorting
To sort dictionaries, we first need to ask: what needs to be sorted?

To just get a list of the sorted values, we need to get all of the values and then use the built-in function `sorted()`, just as we did for lists.

[Python's Guide for Sorting](https://docs.python.org/3/howto/sorting.html)

In [None]:
stock_prices = {'PG': 141.96, 'AXP': 154.42, 'AAPL': 137.13,'CSCO':43.49, 'HD':289.24}
print(stock_prices)
print(sorted(stock_prices.values()))
print(sorted(stock_prices.values(),reverse=True))

To sort by values, but produce a list of the corresponding keys, we will need to 
pass a function as the "key" argument.  (Later notebooks will discuss passing functions in more detail)

In [None]:
sorted(stock_prices,key=stock_prices.get)

To understand the previous command and what occurred, try running just `sorted(stock_prices)`.

We can also produce a list of tuples, where each tuple is a key-value pair.  As shown above, to get the key-value pairs, call `items()` on the dictionary:

In [None]:
print (stock_prices.items())
items = stock_prices.items()
sorted(items)

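By default, `sorted()` compares tuples element by element, starting with the first item, so the pairs are ordered by key.  This matches the behavior of sorting the dictionary directly, which sorts only its keys.

To sort by the second entry in each tuple, we need to pass a function as the `key` argument.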

In [None]:
def sortAtSecondPosition(x):
    return x[1]

In [None]:
sorted(stock_prices.items(), key=sortAtSecondPosition)

Python provides a special capability to create small, anonymous functions.  While these functions can take any number of arguments, they can consist of only a single expression:
```
lambda p1{,pX}: expression
```
The `{,pX}` signifies that this portion may repeat zero or more times.

In [None]:
sorted(stock_prices.items(), key=lambda x: x[1])
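
As an alternative sketch, the standard library's `operator.itemgetter` builds the same kind of key function without a `lambda`, and `reverse=True` flips the order. The prices repeat the `stock_prices` values used earlier; the names `prices_by_symbol`, `high_to_low`, and `ranked` are new and purely illustrative.

```python
from operator import itemgetter

prices_by_symbol = {'PG': 141.96, 'AXP': 154.42, 'AAPL': 137.13, 'CSCO': 43.49, 'HD': 289.24}

# itemgetter(1) behaves like lambda x: x[1]
high_to_low = sorted(prices_by_symbol.items(), key=itemgetter(1), reverse=True)
print(high_to_low)

# dict() turns the sorted pairs back into a dictionary that keeps this order (Python 3.7+)
ranked = dict(high_to_low)
print(list(ranked))
```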

## Nesting different data structures
Most of the dictionaries shown in this notebook have been straightforward key-value pairs of strings.  However, realize that we can create arbitrarily complex data structures by nesting additional data structures as entries.

Lists can contain other lists, but they can also contain dictionaries as entries.

Dictionaries can have lists as values and, also, other dictionaries as values.  For example, to manage the organizational hierarchy for a company, we could have an $n$-level nested dictionary, where $n$ is the number of levels in the company's hierarchy. The keys in each nested dictionary are the direct reports of an individual.
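
A minimal sketch of such a hierarchy follows. The people, titles, and the helper functions `print_reports` and `count_people` are all hypothetical; this is one possible layout under those assumptions, not a prescribed design.

```python
# Hypothetical organization: each key is a person,
# each value is a dictionary of that person's direct reports.
org = {
    'Avery (CEO)': {
        'Blake (CTO)': {
            'Casey': {},
            'Devon': {},
        },
        'Emery (CFO)': {
            'Finley': {},
        },
    }
}

def print_reports(tree, depth=0):
    """Print each person indented by their level in the hierarchy."""
    for person, reports in tree.items():
        print('    ' * depth + person)
        print_reports(reports, depth + 1)

def count_people(tree):
    """Count everyone in the hierarchy, including nested reports."""
    return sum(1 + count_people(reports) for reports in tree.values())

print_reports(org)
print(count_people(org))
```

An empty dictionary marks someone with no direct reports, which lets the recursive helpers stop without a special case.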

## Case Study: ????
TODO

## Exercises
1. What's wrong with this code?


In [None]:
from_tuple_of_tuples = dict( ('MCD', 'McDonald\'s'), ('MRK', 'Merck'), ('MSFT', 'Microsoft'), 
                               ('NKE', 'Nike'), ('PG', 'Proctor and Gamble')  )

In [None]:
from_tuple_of_tuples

2. Does the following work?
update with three arguments ..

3. Create a dictionary that maps [top level domain names](https://en.wikipedia.org/wiki/List_of_Internet_top-level_domains) to their descriptions.  Include at least 3 entries.

4. Now add another entry into your dictionary.

5. Now delete one of the first entries you created from your dictionary.

6. Iterate through your dictionary printing the domain name and the corresponding description.
7. Create a dictionary where the keys are the numbers 1 to 10 and the values are the square of the corresponding key.
8. Find the minimum and maximum values in a dictionary.
9. For a given list of keys, delete those keys from the dictionary.
10. How could we check if a given value exists in the dictionary? How expensive is this operation? How could you measure it?

11. Physical money comes in many different denominations.  Write a function that, for a given amount, returns a dictionary containing the smallest number of bills and coins that add up to that amount.  You should not have any entries in the dictionary with a count of 0.  The possible denominations are 100.00, 50.00, 20.00, 10.00, 5.00, 2.00, 1.00, 0.25, 0.10, 0.05, and 0.01.

12. Write a function to create a text-based horizontal histogram.  You should have the following function signature:
```
def create_histogram(data, title, sort=False, max_table_width=70, max_label_width=10):
```

So the following - 
```
data = { "apples": 58, "pears": 10, "grapes":35, "pineapple":70}
create_histogram(data,"Number of Fruits in Basket", max_table_width=80, max_label_width=7)
```
produces
```
                           Number of Fruits in Basket                           

 apples │*********************************************************** 58
  pears │********** 10
 grapes │************************************ 35
pineapp │************************************************************************ 70
        └───────────────────────────────────┰───────────────────────────────────┐
        0                                  35                                  70
```
Then
```
create_histogram(data,"Number of Fruits in Basket", sort=True, max_table_width=85, max_label_width=7)
```
produces
```
                              Number of Fruits in Basket                             

  pears │*********** 10
 grapes │************************************** 35
 apples │*************************************************************** 58
pineapp │***************************************************************************** 70
        └─────────────────────────────────────┰──────────────────────────────────────┐
        0                                    35                                     70
```

[Unicode Characters: Box Drawing](https://web.archive.org/web/20220403113744/https://jrgraphix.net/r/Unicode/2500-257F)

Assume all values are positive.  As you start this problem, break it down into sub-problems and solve those individually.  Create a little success for yourself.  For example, how would you produce the row labels along the left-hand side?  What other sub-problems exist?  What should you tackle next?

This exercise brings together many different topics: functions, default parameters, math operations, variables, string methods, string formatting, iteration, and dictionaries.

In [None]:
def create_histogram(data, title, sort=False, max_table_width=70, max_label_width=10):
    """Print a text-based horizontal histogram of the values in data."""
    max_data_value = int(max(data.values()))
    mid_data_value = max_data_value//2
    max_bar_length = max_table_width - max_label_width -1
    offset = max_bar_length % 2      #if bar length is odd, we need a bit more space on the last tick to avoid off by 1

    print(title.center(max_table_width)+"\n")

    # sorting with key=data.get orders the keys by their values (smallest first)
    items = sorted(data, key=data.get) if sort else data.keys()

    for item in items:
        # right-align the label and truncate it to max_label_width characters
        print("{:>{w}.{w}} \u2502".format(item,w=max_label_width),end='')
        value = data[item]
        # scale the bar so that the largest value fills max_bar_length
        bar_length = int(value / max_data_value * max_bar_length)
        print("*"*bar_length, value)

    # draw the axis line, then the 0 / midpoint / maximum labels beneath it
    print(" "*(max_label_width+1) + "\u2514" + "\u2500"*(max_bar_length//2-1) + "\u2530" +
         "\u2500"*(max_bar_length//2 + offset -1)   + 
          "\u2510" )
    print(" "*(max_label_width+1) + "0"+"{:>{w}d}".format(mid_data_value,w=max_bar_length//2) + 
          "{:>{w}d}".format(max_data_value, w= max_bar_length//2+offset))

In [None]:
data = { "apples": 58, "pears": 10, "grapes":35, "pineapple":70}
create_histogram(data,"Number of Fruits in Basket", sort=True, max_table_width=80, max_label_width=7)

In [None]:
print("{:>27d}".format(25))

In [None]:
data['apples']

In [None]:
max_value = max(data.values())
max_value

In [None]:
max_value
 