### Updating, Merging, and Copying Dictionaries

#### The *.update()* method

Updates a dictionary in place with items taken from another dictionary, an iterable of pairs, or keyword arguments (the method returns None)

There are three forms:
- d1.update(d2)
- d1.update(iterable)
  - the iterable must contain iterables of exactly two elements each: (key, value)
- d1.update(keyword-args)
  - each argument name becomes a key
  - each argument value becomes the corresponding value (similar to dict(a=10, b=20))

Let's take a look at the first form

In [None]:
d1.update(d2)

Here d1 and d2 are two dictionaries
- for every (k, v) in d2
  - if k is not in d1, (k, v) is inserted into d1
  - if k is in d1, the value for k in d1 is updated

In [1]:
d1 = {'a': 1, 'b': 2}
d2 = {'b':20, 'c': 30}

In [2]:
d1.update(d2)
print(d1)

{'a': 1, 'b': 20, 'c': 30}


Now let's look at updating using keyword arguments

This is similar to how keyword arguments are used to create a dictionary
- argument names must be valid identifiers

Notice that order is preserved: existing keys keep their position and new keys are added at the end

In [3]:
d1 = {'a': 1, 'b': 2}
d1.update(b=20, c=30)

In [4]:
print(d1)

{'a': 1, 'b': 20, 'c': 30}


Again, order is preserved!
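
A minimal sketch of the two points above, assuming Python 3.7+ (where dictionaries keep insertion order). The key `'first-name'` is a made-up example of a key that cannot be written as a keyword argument:

```python
d1 = {'a': 1, 'b': 2}

# 'b' already exists, so it keeps its position; 'c' is new and goes to the end,
# even though it was listed first in the call.
d1.update(c=30, b=20)
print(list(d1.items()))  # [('a', 1), ('b', 20), ('c', 30)]

# 'first-name' is not a valid identifier, so it cannot be a keyword argument.
# Passing a dictionary instead avoids the restriction.
key = 'first-name'
print(key.isidentifier())  # False
d1.update({key: 'Ada'})
print(d1)  # {'a': 1, 'b': 20, 'c': 30, 'first-name': 'Ada'}
```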

And finally, let's look at the d1.update(iterable) form

We must have an iterable of iterables, each containing exactly two elements
- key, value

In [5]:
iterable = (('b', 20), ['c', 3])
d1 = {'a':1, 'b': 2}
d1.update(iterable)
print(d1)

{'a': 1, 'b': 20, 'c': 3}


We can use more complex iterables too, such as generator expressions and comprehensions

In [11]:
d1.update(((k, ord(k)) for k in 'bcd'))
print(d1)

{'a': 1, 'b': 98, 'c': 99, 'd': 100}
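
Two more cases that may be worth trying, sketched with made-up data: `zip` is a convenient source of (key, value) pairs, and an element that does not have exactly two items is rejected with a `ValueError`:

```python
d1 = {'a': 1}

# zip pairs each key with a value, producing 2-element tuples lazily.
keys = ['b', 'c']
values = [2, 3]
d1.update(zip(keys, values))
print(d1)  # {'a': 1, 'b': 2, 'c': 3}

# An element with three items cannot be split into (key, value).
rejected = False
try:
    d1.update([('d', 4, 'extra')])
except ValueError as exc:
    rejected = True
    print('ValueError:', exc)
```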


#### Unpacking Dictionaries

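Unpacking a dictionary into a dictionary literal with ** works in a similar way to unpacking a dictionary into keyword arguments in function calls

For function arguments, keys must be strings, and they must be valid identifiers to match a named parameter

But we don't have this restriction for unpacking dictionaries in general: any hashable key works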
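
A short sketch of the difference, using a hypothetical `show` function and made-up dictionaries. Non-string keys are used for the failing case because they can never be keyword names:

```python
def show(**kwargs):
    # Hypothetical helper: just returns whatever keyword arguments it received.
    return kwargs

d_numbers = {1: 'one', 2: 'two'}
d_words = {'first name': 'Ada'}

# Unpacking inside a dictionary literal accepts any hashable keys.
merged = {**d_numbers, **d_words}
print(merged)  # {1: 'one', 2: 'two', 'first name': 'Ada'}

# Unpacking into a function call requires string keys.
rejected = False
try:
    show(**d_numbers)
except TypeError as exc:
    rejected = True
    print('TypeError:', exc)
```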

See the Code Examples section below for examples of how this all works

#### Copying Dictionaries

##### Shallow Copies

The container object is a new object

The keys and values in the copy are shared references with the original object

You would use code like this to make a shallow copy:

In [None]:
d_copy = d.copy()

Another way is to use unpacking:

In [None]:
d_copy = {**d}

You can also do this:

In [None]:
d_copy = dict(d)

And also like this:

In [None]:
d_copy = {k: v for k, v in d.items()}

However, the comprehension approach directly above is noticeably slower than the others (roughly two to four times slower in the timings below), so avoid it

Regardless, all of these methods result in shallow copies!
- the dictionaries are independent of each other
  - (inserts and deletes are independent)
- but the keys and values are shared references
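
A small sketch of what "shared references" means in practice, using made-up data: rebinding a key in the copy leaves the original alone, while mutating a shared value in place shows up in both dictionaries:

```python
d = {'a': [1, 2], 'b': [3, 4]}
d_copy = d.copy()

# Rebinding a key in the copy only changes the copy.
d_copy['a'] = [10, 20]
print(d['a'])  # [1, 2]

# Mutating a shared value in place is visible through both dictionaries.
d_copy['b'].append(5)
print(d['b'])  # [3, 4, 5]
```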

##### Deep Copies

If a shallow copy is not sufficient, we can create deep copies of dictionaries
- no shared references
  - even with nested dictionaries

We can write this ourselves, but it usually requires recursion, and we have to be careful with circular references

Writing our own might be needed if we don't want a true deep copy, but only a partial deep copy
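
As a rough illustration only, here is one way a partial deep copy might look. The function name `copy_containers`, the `Connection` class, and the choice to copy only dicts and lists are all assumptions made for this sketch; the memo dictionary is what keeps circular references from recursing forever:

```python
def copy_containers(obj, _memo=None):
    """Recursively copy dicts and lists; share every other object."""
    if _memo is None:
        _memo = {}
    obj_id = id(obj)
    if obj_id in _memo:
        # Already copied (or currently being copied): reuse it to break cycles.
        return _memo[obj_id]
    if isinstance(obj, dict):
        result = {}
        _memo[obj_id] = result
        for k, v in obj.items():
            result[k] = copy_containers(v, _memo)
        return result
    if isinstance(obj, list):
        result = []
        _memo[obj_id] = result
        for item in obj:
            result.append(copy_containers(item, _memo))
        return result
    # Anything else (numbers, strings, custom objects) stays shared.
    return obj


class Connection:
    """Stand-in for an object we do not want duplicated."""


conn = Connection()
d = {'settings': {'retries': [1, 2, 3]}, 'conn': conn}
d['self'] = d  # circular reference

d_partial = copy_containers(d)
print(d_partial['settings'] is d['settings'])  # False
print(d_partial['conn'] is d['conn'])  # True
print(d_partial['self'] is d_partial)  # True
```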

It is much simpler to use *copy.deepcopy*

In [12]:
from copy import deepcopy

The deepcopy function works for custom objects, iterables, dictionaries, etc.
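
For instance, a sketch with a made-up `Person` class whose instances refer to each other. `deepcopy` copies the objects and keeps the circular structure intact among the copies:

```python
from copy import deepcopy


class Person:
    # Hypothetical class used only for this illustration.
    def __init__(self, name):
        self.name = name
        self.friends = []


alice = Person('Alice')
bob = Person('Bob')
alice.friends.append(bob)
bob.friends.append(alice)  # circular reference

d = {'people': [alice, bob]}
d_deep = deepcopy(d)

alice_copy, bob_copy = d_deep['people']
print(alice_copy is alice)  # False
print(alice_copy.name)  # Alice
print(alice_copy.friends[0] is bob_copy)  # True
print(bob_copy.friends[0] is alice_copy)  # True
```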

#### Code Examples

In [13]:
d = {'a': 1, 'b': 2, 'c': 3}

In [15]:
d['b'] = 200

In [16]:
d

{'a': 1, 'b': 200, 'c': 3}

In [18]:
d1 = {'a': 1, 'b': 2}
d2 = {'c': 3, 'd': 4}

In [19]:
d1.update(d2)

In [20]:
d1

{'a': 1, 'b': 2, 'c': 3, 'd': 4}

In [21]:
d1 = {'a': 1, 'b': 2}

In [22]:
d1.update(b=20, c=30)

In [24]:
print(d1)

{'a': 1, 'b': 20, 'c': 30}


In [25]:
d1 = {'a': 1, 'b': 2}
d1.update([('c', 2), ['d', 3]])
print(d1)

{'a': 1, 'b': 2, 'c': 2, 'd': 3}


In [26]:
d1 = {'a': 1, 'b': 2}
d1.update((k, ord(k)) for k in "python")
print(d1)

{'a': 1, 'b': 2, 'p': 112, 'y': 121, 't': 116, 'h': 104, 'o': 111, 'n': 110}


In [28]:
l1 = [1, 2, 3]
l2 = 'abc'
l = (*l1, *l2)
print(l)

(1, 2, 3, 'a', 'b', 'c')


In [29]:
d1 = {'a': 1, 'b': 2}
d2 = {'c': 3, 'd': 4}
d = {**d1, **d2}

In [30]:
print(d)

{'a': 1, 'b': 2, 'c': 3, 'd': 4}


In [31]:
d1 = {'a': 1, 'b': 2}
d2 = {'b': 20, 'd': 4}

In [32]:
d = {**d1, **d2}

In [33]:
print(d)

{'a': 1, 'b': 20, 'd': 4}


In [34]:
d1 = {'a': 1, 'b': 2}
d2 = {'b': 20, 'd': 4}
d3 = {'b': 200, 'd': 40, 'e': 5}
d = {**d1, **d2, **d3}
print(d)

{'a': 1, 'b': 200, 'd': 40, 'e': 5}


In [43]:
conf_defaults = dict.fromkeys(('host', 'port', 'user', 'pwd', 'database'), None)

In [44]:
conf_defaults

{'host': None, 'port': None, 'user': None, 'pwd': None, 'database': None}

In [45]:
conf_global = {'port': 5432, 'database': 'deepdive'}

In [46]:
conf_dev = {
    'host': 'localhost',
    'user': 'test',
    'pwd': 'test'
}

conf_prod = {
    'host': 'prodpg.deepdive.com',
    'user': '$prod_user',
    'pwd': '$prod_pwd',
    'database': 'deepdive_prod'
}

So we want to start with the default configuration, then overlay our global configuration, then overlay either the dev or prod environment

In [49]:
conf = {**conf_defaults, **conf_global, **conf_dev} # This gives us the above with the dev environment!

In [50]:
print(conf)

{'host': 'localhost', 'port': 5432, 'user': 'test', 'pwd': 'test', 'database': 'deepdive'}


In [51]:
conf = {**conf_defaults, **conf_global, **conf_prod} # This gives us the above with the prod environment!

In [52]:
print(conf)

{'host': 'prodpg.deepdive.com', 'port': 5432, 'user': '$prod_user', 'pwd': '$prod_pwd', 'database': 'deepdive_prod'}
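
The same layering could also be sketched with `update` in a loop. The helper name `overlay` and the trimmed-down configuration below are made up for the illustration:

```python
def overlay(*layers):
    """Merge dictionaries left to right; later layers win on key clashes."""
    result = {}
    for layer in layers:
        result.update(layer)
    return result


defaults = dict.fromkeys(('host', 'port'), None)
global_conf = {'port': 5432}
dev_conf = {'host': 'localhost'}

conf = overlay(defaults, global_conf, dev_conf)
print(conf)  # {'host': 'localhost', 'port': 5432}
print(conf == {**defaults, **global_conf, **dev_conf})  # True
```

A helper like this may be handy when the number of layers is not known in advance, since the unpacking form needs every layer written out in the literal.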


In [53]:
def my_func(*, kw1, kw2, kw3):
    print(kw1, kw2, kw3)

In [54]:
d = {'kw2': 20, 'kw3': 30, 'kw1': 10}

In [55]:
my_func(**d)

10 20 30


In [57]:
def my_func(**kwargs):
    for k, v in kwargs.items():
        print(k, v)

In [58]:
my_func(a=1, b=2)

a 1
b 2


In [59]:
my_func(**d)

kw2 20
kw3 30
kw1 10
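
When the parameters are named explicitly, the dictionary keys have to match them exactly. A sketch with a hypothetical `needs_three` function: a missing key or an extra key both raise a `TypeError`:

```python
def needs_three(*, kw1, kw2, kw3):
    # Hypothetical function with three required keyword-only parameters.
    return kw1 + kw2 + kw3


print(needs_three(**{'kw1': 1, 'kw2': 2, 'kw3': 3}))  # 6

bad_inputs = (
    {'kw1': 1, 'kw2': 2},                      # kw3 is missing
    {'kw1': 1, 'kw2': 2, 'kw3': 3, 'kw4': 4},  # kw4 is unexpected
)
errors = []
for bad in bad_inputs:
    try:
        needs_three(**bad)
    except TypeError as exc:
        errors.append(str(exc))

print(len(errors))  # 2
```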


In [67]:
d = {'a': [1, 2], 'b': [3, 4]}

In [68]:
d1 = d.copy()

In [69]:
d, d1

({'a': [1, 2], 'b': [3, 4]}, {'a': [1, 2], 'b': [3, 4]})

In [70]:
d is d1

False

In [71]:
id(d), id(d1)

(2608695581736, 2608695581176)

In [72]:
d['a'] is d1['a']

True

In [73]:
d['a'].append(100)

In [74]:
d, d1

({'a': [1, 2, 100], 'b': [3, 4]}, {'a': [1, 2, 100], 'b': [3, 4]})

In [75]:
d['x'] = 100

In [76]:
d

{'a': [1, 2, 100], 'b': [3, 4], 'x': 100}

In [77]:
d1

{'a': [1, 2, 100], 'b': [3, 4]}

In [78]:
del d['a']

In [79]:
d

{'b': [3, 4], 'x': 100}

In [80]:
d1

{'a': [1, 2, 100], 'b': [3, 4]}

In [81]:
from copy import deepcopy

In [83]:
d = {
    'id': 123445,
    'person': {
        'name': 'John',
        'age': 78
    },
    'posts': [100, 105, 200]
}

In [88]:
d_deep = deepcopy(d)
d_shallow = d.copy()

In [89]:
id(d), id(d_deep), id(d_shallow)

(2608695676584, 2608695689448, 2608695691528)

In [91]:
id(d['person']), id(d_shallow['person']), id(d_deep['person'])

(2608695669128, 2608695669128, 2608695556760)

In [92]:
id(d['posts']), id(d_shallow['posts']), id(d_deep['posts'])

(2608695546696, 2608695546696, 2608695686216)
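
The ids show which objects are shared. A follow-up sketch (rebuilding the same dictionary so it runs on its own) shows the consequence: changes to nested objects reach the shallow copy but not the deep copy:

```python
from copy import deepcopy

d = {
    'id': 123445,
    'person': {'name': 'John', 'age': 78},
    'posts': [100, 105, 200],
}
d_shallow = d.copy()
d_deep = deepcopy(d)

# Mutate the nested objects through the original dictionary.
d['person']['age'] = 79
d['posts'].append(300)

print(d_shallow['person']['age'], d_shallow['posts'])  # 79 [100, 105, 200, 300]
print(d_deep['person']['age'], d_deep['posts'])  # 78 [100, 105, 200]
```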

In [93]:
d1 = {'a': [1, 2], 'b': [3, 4]}

In [94]:
d = {**d1, 'c': 100}

In [95]:
d

{'a': [1, 2], 'b': [3, 4], 'c': 100}

In [96]:
id(d), id(d1)

(2608695674984, 2608695633144)

In [97]:
id(d['a']), id(d1['a'])

(2608695704904, 2608695704904)

In [98]:
d1 = {'a': [1, 2], 'b': [3, 4]}
d2 = dict(d1)

In [99]:
id(d1), id(d2)

(2608695503352, 2608695676264)

In [100]:
d1['a'] is d2['a']

True

In [101]:
d1 = {'a': [1, 2], 'b': [3, 4]}
d2 = {k: v for k, v in d1.items()}

In [102]:
d1, d2

({'a': [1, 2], 'b': [3, 4]}, {'a': [1, 2], 'b': [3, 4]})

In [103]:
d1 is d2

False

In [106]:
d1['a'] is d2['a']

True

In [107]:
from random import randint

big_d = {k: randint(1, 100) for k in range(1_000_000)}

In [108]:
len(big_d)

1000000

In [110]:
def copy_unpacking(d):
    d1 = {**d}
    
def copy_copy(d):
    d1 = d.copy()
    
def copy_create(d):
    d1 = dict(d)
    
def copy_comprehension(d):
    d1 = {k: v for k, v in d.items()}
    
def copy_deepcopy(d):
    d1 = deepcopy(d)

In [111]:
from timeit import timeit

In [113]:
timeit('copy_unpacking(big_d)', globals=globals(), number = 100)

3.3145032000002175

In [114]:
timeit('copy_copy(big_d)', globals=globals(), number = 100)

1.9672720999997182

In [115]:
timeit('copy_create(big_d)', globals=globals(), number = 100)

3.2844294000005902

In [116]:
timeit('copy_comprehension(big_d)', globals=globals(), number = 100)

7.842730600000323

In [117]:
timeit('copy_deepcopy(big_d)', globals=globals(), number = 100)

85.51005119999991

The moral of the story: don't use a comprehension for shallow copies (about 7.8 s here, versus about 2 to 3.3 s for the other approaches), and deep copies are far slower still (about 85.5 s)
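
Timings like these depend on the machine and the Python version, so it may be worth re-running the comparison locally. A compact sketch, assuming a smaller dictionary and fewer repetitions than above to keep it quick; `timeit` accepts a callable, which avoids the `globals=` argument:

```python
from copy import deepcopy
from timeit import timeit

# Smaller than big_d above so the whole comparison finishes quickly.
sample = {k: k for k in range(100_000)}

strategies = {
    'copy': lambda: sample.copy(),
    'unpacking': lambda: {**sample},
    'dict()': lambda: dict(sample),
    'comprehension': lambda: {k: v for k, v in sample.items()},
    'deepcopy': lambda: deepcopy(sample),
}

# Time each strategy, then print them from fastest to slowest.
timings = {name: timeit(fn, number=5) for name, fn in strategies.items()}
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f'{name:<14}{seconds:.4f}s')
```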