### Updating, Merging and Copying

Updating an existing key's value in a dictionary is straightforward:

In [None]:
d = {'a': 1, 'b': 2, 'c': 3}

In [None]:
d['b'] = 200

In [None]:
d

#### The `update` method

Sometimes, however, we want to update the items in one dictionary based on the items in another dictionary.

For that we can use the `update` method.

The `update` method has three forms:
1. it can take another dictionary
2. it can take an iterable of iterables of length 2 (key, value)
3. it can take keyword arguments

You'll notice that the arguments we can use with `update` are very similar to the arguments we can use with the `dict()` function when we create dictionaries.
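As a quick illustration of that similarity, here is a minimal sketch that passes the same three argument shapes to `dict()` and to `update`. The variable names are just placeholders for this example.

```python
# dict() accepts a mapping, an iterable of (key, value) pairs, or keyword arguments
from_mapping = dict({'a': 1, 'b': 2})
from_pairs = dict([('a', 1), ('b', 2)])
from_kwargs = dict(a=1, b=2)
print(from_mapping == from_pairs == from_kwargs)  # True

# update() accepts the same three shapes
d = {}
d.update({'a': 1})
d.update([('b', 2)])
d.update(c=3)
print(d)  # {'a': 1, 'b': 2, 'c': 3}
```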

Let's look briefly at each of those forms:

In [None]:
d1 = {'a': 1, 'b': 2}
d2 = {'c': 3, 'd': 4}

In [None]:
d1.update(d2)
print(d1)

Note how the key order is maintained, and is based on the order in which the dictionaries were created/updated.

In [None]:
d1 = {'a': 1, 'b': 2}

In [None]:
d1.update(b=20, c=30)
print(d1)

Again notice how the key order reflects the order in which the parameters were specified when calling the `update` method.

In [None]:
d1 = {'a': 1, 'b': 2}

In [None]:
d1.update([('c', 2), ('d', 3)])

In [None]:
d1

Of course we can use more complex iterables. For example we could use a generator expression:

In [None]:
d = {'a': 1, 'b': 2}
d.update((k, ord(k)) for k in 'python')
print(d)

Most of the examples so far used dictionaries or iterables whose keys were not already present in the dictionary being updated. When a key is already present (as with `b` in the keyword-argument example above), its associated value is replaced by the new value:

In [None]:
d1 = {'a': 1, 'b': 2, 'c': 3}
d2 = {'b': 200, 'd': 4}
d1.update(d2)
print(d1)
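Two details of the example above are worth spelling out: `update` mutates the dictionary in place and returns `None`, and a key that already exists keeps its original position even though its value is replaced. A small sketch to check both:

```python
d1 = {'a': 1, 'b': 2, 'c': 3}

# update() works in place, so there is nothing useful to assign from it
result = d1.update({'b': 200, 'd': 4})
print(result)    # None

# 'b' stays in its original slot; only the new key 'd' is appended
print(list(d1))  # ['a', 'b', 'c', 'd']
```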

#### Unpacking dictionaries

We can also use unpacking to unpack the contents of one dictionary into the elements of another dictionary. This is very similar to how we can unpack iterables. Let's recall that first:

In [None]:
l1 = [1, 2, 3]
l2 = 'abc'
l = (*l1, *l2)
print(l)

We can do something similar with dictionaries:

In [None]:
d1 = {'a': 1, 'b': 2}
d2 = {'c': 3, 'd': 4}
d = {**d1, **d2}
print(d)

Again note how order is preserved.
What happens when there are conflicting keys in the unpacking?

In [None]:
d1 = {'a': 1, 'b': 2}
d2 = {'b': 200, 'c': 3}
d = {**d1, **d2}
print(d)

As you can see, the 'last' key/value pair wins.

Now the nice thing about unpacking is that we are not limited to just two dictionaries.
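For instance, here is a small sketch that unpacks three dictionaries and also mixes in a literal key/value pair. The names `d3` and `'x'` are made up for the illustration.

```python
d1 = {'a': 1, 'b': 2}
d2 = {'b': 20, 'c': 3}
d3 = {'c': 30, 'd': 4}

# Items are processed left to right, so later values override earlier ones,
# while each key keeps the position where it first appeared
d = {**d1, **d2, 'x': 0, **d3}
print(d)  # {'a': 1, 'b': 20, 'c': 30, 'x': 0, 'd': 4}
```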

##### Example

In this example we have some dictionaries we use to configure our application.
One dictionary specifies some configuration defaults for every configuration parameter our application will need.
Another dictionary is used to define some global configuration, and another set of dictionaries is used to define environment-specific configurations, maybe dev and prod.

In [None]:
conf_defaults = dict.fromkeys(('host', 'port', 'user', 'pwd', 'database'), None)
print(conf_defaults)

In [None]:
conf_global = {
    'port': 5432,
    'database': 'deepdive'}

In [None]:
conf_dev = {
    'host': 'localhost',
    'user': 'test',
    'pwd': 'test'
}

conf_prod = {
    'host': 'prodpg.deepdive.com',
    'user': '$prod_user',
    'pwd': '$prod_pwd',
    'database': 'deepdive_prod'
}

Now we can generate a full configuration for our dev environment this way:

In [None]:
config_dev = {**conf_defaults, **conf_global, **conf_dev}

In [None]:
print(config_dev)

and a config for our prod environment:

In [None]:
config_prod = {**conf_defaults, **conf_global, **conf_prod}

In [None]:
print(config_prod)
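If the number of configuration layers is not known in advance, the same layering can be done in a loop with `update`. The following is only a sketch: `merge_configs` is a hypothetical helper name, and the sample dictionaries are simplified stand-ins for the ones above.

```python
def merge_configs(*layers):
    """Merge dictionaries left to right; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Simplified, made-up layers for illustration
defaults = dict.fromkeys(('host', 'port'), None)
overrides_global = {'port': 5432}
overrides_dev = {'host': 'localhost'}

config = merge_configs(defaults, overrides_global, overrides_dev)
print(config)  # {'host': 'localhost', 'port': 5432}
```

Because the helper builds a new dictionary, none of the input layers are modified.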

##### Example

Another way dictionary unpacking can be really useful is for passing keyword arguments to a function:

In [None]:
def my_func(*, kw1, kw2, kw3):
    print(kw1, kw2, kw3)

In [None]:
d = {'kw2': 20, 'kw3': 30, 'kw1': 10}

In this case, we don't really care about the order of the elements, since we'll be unpacking keyword arguments:

In [None]:
my_func(**d)

Of course we can even use it this way, but here the dictionary order does matter, as it will be reflected in the order in which those arguments are passed to the function:

In [None]:
def my_func(**kwargs):
    for k, v in kwargs.items():
        print(k, v)

In [None]:
my_func(**d)

As you can see the function's `kwargs` dictionary received the elements in the same order as the original dictionary we unpacked.
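One thing to watch for when unpacking into a function with a fixed set of keyword-only parameters: every key in the dictionary must match a parameter name, otherwise Python raises a `TypeError`. A minimal sketch, where `strict_func` and the extra key `kw4` are made up for the example:

```python
def strict_func(*, kw1, kw2, kw3):
    return kw1, kw2, kw3


d_extra = {'kw1': 10, 'kw2': 20, 'kw3': 30, 'kw4': 40}

error = None
try:
    strict_func(**d_extra)
except TypeError as ex:
    # 'kw4' does not correspond to any parameter of strict_func
    error = ex
    print('TypeError:', ex)
```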

#### Copying Dictionaries

We can make copies of dictionaries. But as with iterables, we have to differentiate between **shallow** and **deep** copies.

The `copy` method that dictionaries implement is a shallow copy mechanism.
This means that a new container is created, but the item references within the collection are maintained.

Let's see a simple example:

In [None]:
d = {'a': [1, 2], 'b': [3, 4]}

In [None]:
d1 = d.copy()

In [None]:
print(d)
print(d1)

In [None]:
id(d), id(d1), d is d1

`d` and `d1` are not the same object, so we can add and remove keys from one dict without affecting the other. We can also completely replace an associated value in one without affecting the other.

In [None]:
del d['a']

In [None]:
print(d)
print(d1)

In [None]:
d['b'] = 100

In [None]:
print(d)
print(d1)

But let's see what happens if we mutate one of the values in one of the dictionaries:

In [None]:
d = {'a': [1, 2], 'b': [3, 4]}
d1 = d.copy()
print(d)
print(d1)

In [None]:
d['a'].append(100)

In [None]:
print(d)

In [None]:
print(d1)

As you can see, the mutation was also "seen" by `d1`. This is because `d['a']` and `d1['a']` are in fact the **same** object.

In [None]:
d['a'] is d1['a']

So if we have nested dictionaries for example, as is often the case with JSON documents, we have to be careful when creating shallow copies.

In [None]:
d = {
    'id': 123445,
    'person': {
        'name': 'John',
        'age': 78
    },
    'posts': [100, 105, 200]
}

In [None]:
d1 = d.copy()

In [None]:
d1['person']['name'] = 'John Cleese'
d1['posts'].append(300)

In [None]:
d1

In [None]:
d

If we want to avoid this issue, we have to create a **deep** copy.
We can easily do this ourselves using recursion, but the `copy` module implements such a function for us:

In [None]:
from copy import deepcopy

In [None]:
d = {
    'id': 123445,
    'person': {
        'name': 'John',
        'age': 78
    },
    'posts': [100, 105, 200]
}

In [None]:
d1 = deepcopy(d)

In [None]:
d1['person']['name'] = 'John Cleese'
d1['posts'].append(300)

In [None]:
d1

In [None]:
d
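To give an idea of the recursive approach mentioned above, here is a deliberately simplified sketch. `simple_deepcopy` is a hypothetical name, and it only handles nested dicts and lists: unlike `copy.deepcopy`, it does not copy other mutable types (sets, custom objects) and it does not cope with circular references.

```python
def simple_deepcopy(obj):
    """Recursively copy nested dicts and lists; anything else is returned as-is."""
    if isinstance(obj, dict):
        return {k: simple_deepcopy(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [simple_deepcopy(item) for item in obj]
    return obj


d = {'id': 123445, 'person': {'name': 'John', 'age': 78}, 'posts': [100, 105, 200]}
d1 = simple_deepcopy(d)

d1['person']['name'] = 'John Cleese'
d1['posts'].append(300)

# The original is unaffected by mutations made through the copy
print(d['person']['name'], d['posts'])  # John [100, 105, 200]
```

In practice `deepcopy` is the safer choice; the sketch is only meant to show the idea.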

We saw earlier that we can also copy a dictionary by unpacking the items of one or more dictionaries into a new one.
This also creates a **shallow** copy:

In [None]:
d1 = {'a': [1, 2], 'b': [3, 4]}
d = {**d1}

In [None]:
d

In [None]:
d1['a'].append(100)

In [None]:
d1

In [None]:
d

At this point you're probably asking yourself whether to use `**` or `.copy()` to create a shallow copy. We can even create a shallow copy of a dict by passing it to the `dict()` constructor.

Firstly, the `**` unpacking is more flexible because you can unpack multiple dictionaries into a single new one - `copy` is restricted to copying a single dictionary.
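A short sketch comparing the three approaches side by side; the `extra` dictionary is made up for the example. All three produce a new dictionary that shares its value objects with the original, and only unpacking lets us merge another dictionary in the same expression:

```python
d = {'a': [1, 2], 'b': [3, 4]}
extra = {'c': [5, 6]}

via_copy = d.copy()
via_dict = dict(d)
via_unpacking = {**d, **extra}  # copies d and merges extra in one expression

# Each result is a new dict object...
print(via_copy is d, via_dict is d, via_unpacking is d)  # False False False

# ...but the values are shared with the original (shallow copies)
print(via_copy['a'] is d['a'], via_dict['a'] is d['a'], via_unpacking['a'] is d['a'])  # True True True
```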

But what about timings? Is one faster than the other?

What about using a dictionary comprehension to copy a dictionary? Is that faster/slower?

Let's try it out and see:

In [None]:
from random import randint

big_d = {k: randint(1, 100) for k in range(1_000_000)}

In [None]:
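# Each function only builds the copy; the result is discarded on purpose,
# since we only care about how long the copy takes.
def copy_unpacking(d):
    d1 = {**d}

def copy_copy(d):
    d1 = d.copy()

def copy_create(d):
    d1 = dict(d)

def copy_comprehension(d):
    d1 = {k: v for k, v in d.items()}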

In [None]:
from timeit import timeit

In [None]:
timeit('copy_unpacking(big_d)', globals=globals(), number=100)

In [None]:
timeit('copy_copy(big_d)', globals=globals(), number=100)

In [None]:
timeit('copy_create(big_d)', globals=globals(), number=100)

In [None]:
timeit('copy_comprehension(big_d)', globals=globals(), number=100)

So `dict()`, unpacking and `.copy()` take about the same time; any difference between them is too small to worry about. A comprehension, on the other hand, is substantially slower, so don't use comprehension syntax to do a simple shallow copy!
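Single `timeit` runs can be noisy, so one option is to repeat each measurement a few times and keep the best result. The following is just a sketch of that idea using `timeit.repeat` on a much smaller dictionary (`small_d` and the other names are made up here); the actual numbers will vary by machine and Python version.

```python
from timeit import repeat

small_d = {k: k for k in range(10_000)}

# The four copy strategies compared above, wrapped as zero-argument callables
candidates = {
    'unpacking': lambda: {**small_d},
    'copy': lambda: small_d.copy(),
    'dict': lambda: dict(small_d),
    'comprehension': lambda: {k: v for k, v in small_d.items()},
}

# For each strategy: 3 measurements of 20 calls each, keeping the fastest
best_times = {
    name: min(repeat(func, number=20, repeat=3))
    for name, func in candidates.items()
}

# Print the strategies from fastest to slowest
for name, seconds in sorted(best_times.items(), key=lambda item: item[1]):
    print(f'{name:15} {seconds:.5f}s')
```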