### Working with Dictionaries

A very common operation when working with dictionaries is to test if a specific key is present in the dictionary.

We can use the `in` and `not in` operators for this:

In [1]:
data = {
    'open': 100,
    'high': 110,
    'low': 95,
    'close': 110
}

In [2]:
'open' in data

True

In [3]:
'volume' in data

False

In [4]:
'volume' not in data

True

Membership testing for dictionaries is extremely fast, and on average it takes about the same amount of time no matter how large the dictionary is.

If we have a list of items, checking for the presence of some value in the list means iterating through the list, element by element, until either the element is found, or the list has been searched completely.

By contrast, dictionary key lookups use a specialized structure (a hash map) that is very efficient at locating the presence of a key.
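As a rough illustration of that difference, the sketch below times the same membership test against a list and against a dictionary. The size `n`, the names `as_list` and `as_dict`, and the repeat count are arbitrary choices made for this example, and the actual timings will vary from machine to machine.

```python
from timeit import timeit

n = 100_000
as_list = list(range(n))

# build a dictionary with the same values as keys
as_dict = {}
for i in range(n):
    as_dict[i] = 0

target = n - 1  # worst case for the list: the last element

# both tests give the same answer...
print(target in as_list, target in as_dict)  # True True

# ...but the list is scanned element by element, while the dictionary hashes the key
list_time = timeit(lambda: target in as_list, number=100)
dict_time = timeit(lambda: target in as_dict, number=100)
print(f'list: {list_time:.6f}s, dict: {dict_time:.6f}s')
```

On a typical machine the dictionary lookup should come out far faster, and the gap should widen as `n` grows.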

We've seen how to use the `del` keyword to remove specific key/value pairs from a dictionary.

To remove all the elements of a dictionary, we can use the `clear()` method:

In [5]:
data

{'open': 100, 'high': 110, 'low': 95, 'close': 110}

In [6]:
data.clear()

In [7]:
data

{}
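One detail worth seeing in action is that `clear()` empties the dictionary object in place, so every name that refers to that object sees the change, whereas assigning `{}` only rebinds one name. The names `prices` and `alias` below are made up for this sketch.

```python
prices = {'open': 100, 'close': 110}
alias = prices  # both names refer to the same dictionary object

prices.clear()  # empties the object in place
print(alias)  # {} - the alias sees the change

prices = {'open': 100, 'close': 110}
alias = prices
prices = {}  # rebinds the name `prices` to a new, empty dictionary
print(alias)  # {'open': 100, 'close': 110} - the original object is untouched
```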

We can get the number of entries in a dictionary by using the `len()` function:

In [8]:
len(data)

0

In [9]:
data = {
    'open': 100,
    'high': 110,
    'low': 95,
    'close': 110
}

In [10]:
len(data)

4

We can create copies of dictionaries - either shallow or deep.

One way to create a shallow copy is to use the `copy()` method:

In [11]:
data_copy = data.copy()

In [12]:
data_copy

{'open': 100, 'high': 110, 'low': 95, 'close': 110}

A deep copy can be made using the `deepcopy()` function in the `copy` module, just like we saw with lists. The same differences between shallow and deep copies exist here.
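The difference only shows up when the dictionary contains mutable values. Here is a minimal sketch, using a made-up dictionary named `quote` that holds a list:

```python
from copy import deepcopy

quote = {'symbol': 'AAPL', 'prices': [100, 110]}

shallow = quote.copy()
deep = deepcopy(quote)

# mutate the nested list through the original dictionary
quote['prices'].append(95)

print(shallow['prices'])  # [100, 110, 95] - the shallow copy shares the list
print(deep['prices'])     # [100, 110] - the deep copy has its own list
```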

We've seen two methods so far to create dictionaries:

1. use a literal
2. make a copy of an existing dictionary

Here are a few more ways we can create dictionaries.

We can use the `dict()` function with named arguments, where each argument name becomes a key and each argument value becomes the corresponding value. The restriction here is that the argument names have to follow Python's rules for variable names, and the keys in the dictionary will be those names as strings.

This does mean we cannot use this method to create keys that are numbers, tuples, strings with spaces in them, etc.

But, it can be handy in some situations.

In [13]:
d = dict(high=100, low=95)

In [14]:
d

{'high': 100, 'low': 95}
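To make the restriction concrete, here is a short sketch. The names `named`, `by_literal` and `by_pairs` are just for illustration, and passing an iterable of `(key, value)` pairs to `dict()` is a standard feature that has not been covered above.

```python
# keyword arguments always become string keys
named = dict(high=100, low=95)
print(list(named.keys()))  # ['high', 'low']

# keys that are not valid variable names need a literal instead...
by_literal = {1: 'one', 'day high': 110, (0, 0): 'origin'}

# ...or dict() called with an iterable of (key, value) pairs
by_pairs = dict([(1, 'one'), ('day high', 110), ((0, 0), 'origin')])

print(by_literal == by_pairs)  # True
```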

Sometimes we want to create a dictionary that has some keys all initialized to the same value.

We could do something like this:

In [15]:
d = {
    'open': 0,
    'high': 0,
    'low': 0,
    'close': 0
}

But we have a slightly easier alternative using the `fromkeys()` method:

In [16]:
d = dict.fromkeys(['open', 'high', 'low', 'close'], 0)

In [17]:
d

{'open': 0, 'high': 0, 'low': 0, 'close': 0}

Note how the `fromkeys()` method is called from the `dict` type - not an existing dictionary instance.
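One thing to be careful about with the second argument: every key is assigned the very same object. With an immutable value such as `0` this makes no difference, but with a mutable value such as a list it can be surprising. A small sketch, using the made-up names `shared` and `separate`:

```python
# every key refers to the SAME list object
shared = dict.fromkeys(['open', 'close'], [])
shared['open'].append(100)
print(shared)  # {'open': [100], 'close': [100]}

# giving each key its own list means creating one list per key
separate = {}
for key in ['open', 'close']:
    separate[key] = []
separate['open'].append(100)
print(separate)  # {'open': [100], 'close': []}
```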

The first argument of `fromkeys` can be any iterable. In the previous example we used a list, but it could have been a tuple as well:

In [18]:
keys = 'open', 'high', 'low', 'close'

In [19]:
d = dict.fromkeys(keys, 0)

In [20]:
d

{'open': 0, 'high': 0, 'low': 0, 'close': 0}

Of course a string is an iterable too, so we could do this as well:

In [21]:
d = dict.fromkeys('abc', 0)

In [22]:
d

{'a': 0, 'b': 0, 'c': 0}

Something like this could be useful to determine all the unique items in an iterable (since keys have to be unique):

In [23]:
symbols = ['AAPL', 'MSFT', 'AAPL', 'MSFT']

In [24]:
d = dict.fromkeys(symbols, 0)

In [25]:
d

{'AAPL': 0, 'MSFT': 0}

We could even use this to determine all the unique characters in a string:

In [26]:
d = dict.fromkeys('Python is an awesome language!', 0)

In [27]:
d

{'P': 0,
 'y': 0,
 't': 0,
 'h': 0,
 'o': 0,
 'n': 0,
 ' ': 0,
 'i': 0,
 's': 0,
 'a': 0,
 'w': 0,
 'e': 0,
 'm': 0,
 'l': 0,
 'g': 0,
 'u': 0,
 '!': 0}
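If all we want is the unique items themselves, we can convert the dictionary's keys back to a list. Since dictionaries keep their keys in insertion order (guaranteed from Python 3.7 on), the items come back in the order they were first seen. The `tickers` list below is made up for this sketch.

```python
tickers = ['MSFT', 'AAPL', 'MSFT', 'AAPL', 'GOOG']

# the dictionary keys are the unique tickers, in first-seen order
unique_tickers = list(dict.fromkeys(tickers))
print(unique_tickers)  # ['MSFT', 'AAPL', 'GOOG']
```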

We'll come back to this when we discuss sets, which are very closely related to dictionaries.

We can create empty dictionaries, using either a literal or the `dict()` function:

In [28]:
d1 = {}
d2 = dict()

In [29]:
d1

{}

In [30]:
d2

{}

This is often used in situations where we may start with an empty dictionary and mutate the dictionary as our code is running.

In [31]:
transactions = [
    {'item': 'widget', 'trans_type': 'sale', 'quantity': 10},
    {'item': 'widget', 'trans_type': 'sale', 'quantity': 5},
    {'item': 'widget', 'trans_type': 'refund', 'quantity': 2},
    {'item': 'license', 'trans_type': 'sale', 'quantity': 1},
    {'item': 'license', 'trans_type': 'sale', 'quantity': 1},
    {'item': 'license', 'trans_type': 'refund', 'quantity': 1},
]

Suppose we want to get the following information:

- total sold quantity per item
- net sold quantity per item

We could approach it this way:

In [32]:
total_sold = {}  # empty dictionary

for transaction in transactions:
    item = transaction['item']
    is_sale = transaction['trans_type'] == 'sale'
    # You could also write:
    # is_sale = True if transaction['trans_type'] == 'sale' else False
    # You might consider that more "readable", but most Python devs will 
    # use the first (preferred) approach.
    quantity = transaction['quantity']
    
    if is_sale:
        if item in total_sold:
            # item already present, update sold count by quantity
            total_sold[item] = total_sold[item] + quantity
        else:
            # item not present - create it and set sold count to quantity
            total_sold[item] = quantity
            
print(total_sold)

{'widget': 15, 'license': 2}


For net quantities, we could do this:

In [33]:
net_sales = {}

for transaction in transactions:
    item = transaction['item']
    is_sale = transaction['trans_type'] == 'sale'
    quantity = transaction['quantity']
    
    if not is_sale:
        # this was a refund - make quantity negative
        quantity = -quantity
        
    if item in net_sales:
        # item already present, update net_sales value by quantity
        net_sales[item] = net_sales[item] + quantity
    else:
        # item not present - create it and set net count to quantity
        net_sales[item] = quantity
            
print(net_sales)

{'widget': 13, 'license': 1}


You'll notice that we had to use an `if` statement to do something different based on whether the key already existed in the dictionary or not.

The reason for this is that when a new item is encountered (not already in `net_sales` for example), we want to add that item with an initial count set to the quantity, whereas if the item has already been encountered, it already has a value that we just need to update with the `quantity`.

But if we think of this a bit differently, what we really would like to do is to always update the current value in the dictionary with the quantity, using `0` as an initial value if this is the first time we encounter the item.

So we could rewrite this code as follows:

In [34]:
total_sold = {}  # empty dictionary

for transaction in transactions:
    item = transaction['item']
    is_sale = transaction['trans_type'] == 'sale'
    quantity = transaction['quantity']
    
    if is_sale:
        if item not in total_sold:
            total_sold[item] = 0  # create new item, initialized to 0
        total_sold[item] = total_sold[item] + quantity
            
print(total_sold)

{'widget': 15, 'license': 2}


We still need the `if` check because we cannot just use:
```
total_sold[item] = total_sold[item] + quantity
```
if this is the first time we encounter the item, since `total_sold[item]` would raise a `KeyError`.

Ideally we would like to be able to change the code to something like this:

```python
total_sold = {}  # empty dictionary

for transaction in transactions:
    item = transaction['item']
    is_sale = transaction['trans_type'] == 'sale'
    quantity = transaction['quantity']
    
    if is_sale:
        total_sold[item] = "<get current item value, or zero if not present>" + quantity
```

So if we could do `<get current item value, or zero if not present>`, we would be able to use the simplified code we have above.

Fortunately, dictionaries have a method that does precisely this: the `get()` method.

The `get()` method takes two arguments: a **key** and a **default value**.

- if the key is **found** in the dictionary, `get()` will return the value of that key
- if the key is **not** found in the dictionary, `get()` returns the specified default value

Let's see a simple example:

In [35]:
d = dict.fromkeys('abc', 0)

In [36]:
d

{'a': 0, 'b': 0, 'c': 0}

If we try to get an existing key:

In [37]:
d.get('a', 100)

0

we get the value for `a`.

But if the key does not exist:

In [38]:
d.get('x', 100)

100

we get the default value back instead.

The `get()` method, unlike using `[]`, does not raise a `KeyError` if the key does not exist.

This is a very handy method, and we can now apply it to our earlier example:

In [39]:
total_sold = {}  # empty dictionary

for transaction in transactions:
    item = transaction['item']
    is_sale = transaction['trans_type'] == 'sale'
    quantity = transaction['quantity']
    
    if is_sale:
        total_sold[item] = total_sold.get(item, 0) + quantity
        
print(total_sold)

{'widget': 15, 'license': 2}
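The same idea applies to the net quantities we computed earlier. The sketch below repeats the `transactions` list so that it can run on its own, and replaces the `if`/`else` with a single `get()` call:

```python
transactions = [
    {'item': 'widget', 'trans_type': 'sale', 'quantity': 10},
    {'item': 'widget', 'trans_type': 'sale', 'quantity': 5},
    {'item': 'widget', 'trans_type': 'refund', 'quantity': 2},
    {'item': 'license', 'trans_type': 'sale', 'quantity': 1},
    {'item': 'license', 'trans_type': 'sale', 'quantity': 1},
    {'item': 'license', 'trans_type': 'refund', 'quantity': 1},
]

net_sales = {}

for transaction in transactions:
    item = transaction['item']
    quantity = transaction['quantity']

    if transaction['trans_type'] != 'sale':
        # this was a refund - make quantity negative
        quantity = -quantity

    # current value (or 0 the first time we see the item), plus quantity
    net_sales[item] = net_sales.get(item, 0) + quantity

print(net_sales)  # {'widget': 13, 'license': 1}
```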


If we don't specify a default value, then Python will use `None` as the default:

In [40]:
d = dict(a=1, b=2)

In [41]:
d.get('a')

1

In [42]:
print(d.get('x'))

None
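One consequence is that a `None` result from `get()` does not, by itself, tell us whether the key is missing or whether it is present with a value of `None`. A small sketch, using a made-up dictionary named `settings`, shows how the `in` operator tells the two cases apart:

```python
settings = {'a': 1, 'b': None}

print(settings.get('b'))  # None - key exists, and its value is None
print(settings.get('x'))  # None - key does not exist

# the `in` operator distinguishes the two cases
print('b' in settings)  # True
print('x' in settings)  # False
```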


Another common dictionary operation is merging two dictionaries together.

Something like this:

In [43]:
d1 = {'a': 1, 'b': 2}
d2 = {'c': 3, 'd': 4}

And we want to merge all these pairs into a single dictionary, to get this result:

In [44]:
combined = {'a': 1, 'b': 2, 'c': 3, 'd': 4}

We can use the `update()` method on a dictionary to update it with another dictionary:

In [45]:
d1 = {'a': 1, 'b': 2}
d2 = {'c': 3, 'd': 4}

In [46]:
d1.update(d2)

In [47]:
d2

{'c': 3, 'd': 4}

In [48]:
d1

{'a': 1, 'b': 2, 'c': 3, 'd': 4}

Notice that `d2` was unaffected, but `d1` **was mutated**!

So we can use `update()` to merge one dictionary into another - but what happens if the same key is present in both dictionaries?

The value from the dictionary passed to `update()` replaces the existing value:

In [49]:
d = {'a': 1, 'b': 2}
d.update({'b': 200, 'c': 3})
print(d)

{'a': 1, 'b': 200, 'c': 3}
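If we want a merged dictionary without mutating either original, one option is to combine the two methods we have already seen: make a shallow copy first, then update the copy. This is just a sketch, and `combined` is a name chosen for the example.

```python
d1 = {'a': 1, 'b': 2}
d2 = {'b': 200, 'c': 3}

combined = d1.copy()  # shallow copy, so d1 itself is left alone
combined.update(d2)   # values from d2 win for keys present in both

print(combined)  # {'a': 1, 'b': 200, 'c': 3}
print(d1)        # {'a': 1, 'b': 2}
print(d2)        # {'b': 200, 'c': 3}
```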


Dictionaries are used extensively in Python, so we'll encounter these and other methods and functions to work with them throughout this course. 