# 01 - Creating Python Dictionaries

#### Literals

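A dictionary literal lists `key: value` pairs inside curly braces, and any hashable object can serve as a key. For example: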

In [1]:
d = {
    'john': ['John Cleese'],
    (0, 0): 'origin',
}

#### Constructor

It has the form `dict(key1=value1, key2=value2)`.

This approach is less flexible than using literals because each key must be a valid identifier (the same rules that apply to variable, function and class names). Each keyword name becomes a string key, so we cannot create a dictionary with, say, a tuple as a key using this approach.

In [8]:
d = dict(john=['John Cleese'], my_func='this is a function')

We can also use another form with the `dict()` constructor: `dict([(key1, value1), [key2, value2]])`.

As this form shows, each key-value pair can be any iterable of exactly two elements (a tuple, a list, etc.), and the pairs themselves can be supplied by any iterable. In the form above, they are contained in a list.

In [1]:
d = dict([('a', 100), ['b', 200]])
d

{'a': 100, 'b': 200}
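
To illustrate the point that the pairs may come from any iterable, here is a small sketch. The variable names are made up for illustration; it also shows that this form, unlike the keyword form, accepts a tuple as a key, and that pairs and keyword arguments can be combined in one call.

```python
keys = ['a', 'b', 'c']
values = [100, 200, 300]

# zip yields (key, value) tuples, which dict() consumes one by one.
from_zip = dict(zip(keys, values))

# A generator expression producing pairs works just as well as a list.
from_generator = dict((k, k.upper()) for k in keys)

# Any hashable object can be a key in this form, e.g. a tuple.
with_tuple_key = dict([((0, 0), 'origin'), ('john', ['John Cleese'])])

# Pairs first, then keyword arguments for the identifier-like keys.
combined = dict([((0, 0), 'origin')], john=['John Cleese'])

print(from_zip)
print(from_generator)
print(with_tuple_key)
print(combined)
```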

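We can also pass a dictionary to `dict()`. This produces a **shallow copy**: the new dictionary is a separate object, but its values are the very same objects as in the original, so mutating a shared value is visible through both dictionaries: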

In [2]:
d = {'a': 1, 'b': 2, 'c': [3, 4, 5]}

copy = dict(d)
d

{'a': 1, 'b': 2, 'c': [3, 4, 5]}

In [3]:
d['c'].append(100)

print(d)
print(copy)

{'a': 1, 'b': 2, 'c': [3, 4, 5, 100]}
{'a': 1, 'b': 2, 'c': [3, 4, 5, 100]}
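
If an independent copy of the nested values is needed, one option is `deepcopy` from the standard library's `copy` module. The sketch below contrasts it with the shallow copy; the variable names are illustrative only.

```python
from copy import deepcopy

original = {'a': 1, 'b': 2, 'c': [3, 4, 5]}

shallow = dict(original)      # new dict, but 'c' refers to the same list
deep = deepcopy(original)     # new dict with its own copy of the list

original['c'].append(100)

print(shallow['c'])                    # the append shows up here
print(deep['c'])                       # the deep copy is unaffected
print(shallow['c'] is original['c'])   # same list object
```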


#### Dictionary Comprehensions

For example:

In [11]:
d = {str(i): i ** 2 for i in range(5)}
d

{'0': 0, '1': 1, '2': 4, '3': 9, '4': 16}

Here's another example:

In [4]:
keys = ['a', 'b', 'c']
values = (1, 2, 3)

d = {k: v for k, v in zip(keys, values)}
d

{'a': 1, 'b': 2, 'c': 3}

#### `dict.fromkeys()`

This creates a dictionary with the specified keys, all having the **same value**. It has the form `dict.fromkeys(iterable, value=None)`, where the iterable must have **hashable elements**. These elements become the keys.

In [13]:
d = dict.fromkeys(['a', (0,0), 250], 'N/A')
d

{'a': 'N/A', (0, 0): 'N/A', 250: 'N/A'}

Any iterable will do, so we can pass a generator expression if we like:

In [15]:
d = dict.fromkeys((i**2 for i in range(5)), False)
d

{0: False, 1: False, 4: False, 9: False, 16: False}
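
"The same value" is meant literally: every key refers to one and the same object. That is harmless for immutable values such as `'N/A'` or `False`, but it can surprise with a mutable value. A minimal sketch, with a dictionary comprehension as one possible alternative:

```python
# Every key shares the single list passed to fromkeys().
shared = dict.fromkeys(['a', 'b'], [])
shared['a'].append(1)

# A comprehension evaluates [] once per key, so each key gets its own list.
independent = {key: [] for key in ['a', 'b']}
independent['a'].append(1)

print(shared)
print(independent)
```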

# 02 - Common Operations

Most common operations relate to the keys, not the values. For example, `len(d)` returns the number of keys in `d`.

#### Membership Tests

A membership test checks whether a key is present in a dictionary, and it is very efficient: Python only needs to hash the key and follow the probe sequence in the hash table, rather than scan every key.

We can use the `in` and `not in` operators to test the presence of a **key** in a dictionary:

In [27]:
d = dict(a=1, b=2, c=3)

In [28]:
'a' in d

True
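
The following sketch makes explicit that `in` and `not in` look at keys only. Testing for a value is possible through `.values()`, but that is a linear scan rather than a hash lookup.

```python
d = dict(a=1, b=2, c=3)

key_found = 'a' in d              # 'a' is a key
value_as_key = 1 in d             # 1 is a value, not a key
key_missing = 'z' not in d        # 'z' is not a key
value_found = 1 in d.values()     # scans the values one by one

print(key_found, value_as_key, key_missing, value_found)
```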

#### Removing elements from a dictionary

We can use the `del` statement, the `.pop(key)` method or the `.popitem()` method to remove a key (and its value) from a dictionary:

In [29]:
d = dict.fromkeys('abcd', 0)

In [30]:
d

{'a': 0, 'b': 0, 'c': 0, 'd': 0}

The `del` statement removes a key and its value. If the key doesn't exist, we get a `KeyError` exception.

In [31]:
del d['a']
d

{'b': 0, 'c': 0, 'd': 0}
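
A short sketch of the failure case, assuming we would rather handle the missing key than let the exception propagate. Either catching `KeyError` or guarding with a membership test works.

```python
d = dict.fromkeys('bcd', 0)

try:
    del d['a']                    # 'a' is not a key in d
except KeyError as exc:
    missing = exc.args[0]         # the key that could not be found
    print(f'KeyError: {missing!r}')

# Guarding with a membership test avoids the exception altogether.
if 'b' in d:
    del d['b']

print(d)
```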

The `.pop(key)` method removes the key and returns its **value**:

In [32]:
print(d.pop('b'))
d

0


{'c': 0, 'd': 0}

We can pass a default value to `.pop()` so that a `KeyError` exception isn't raised when the key can't be found; the default is returned instead:

In [34]:
print(d.pop('idontexist', None))

None


The `.popitem()` method removes the **last** item that was inserted into the dictionary and returns it as a `(key, value)` tuple. In other words, **last inserted, first popped: LIFO**. This order is guaranteed from Python 3.7 onwards; earlier versions removed an arbitrary item.

In [35]:
print(d.popitem())

('d', 0)
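
To see the LIFO order in action, the sketch below (Python 3.7+ assumed) pops items until the dictionary is empty, then shows that calling `.popitem()` on an empty dictionary raises `KeyError`.

```python
d = {}
d['first'] = 1
d['second'] = 2
d['third'] = 3

popped = []
while d:                           # an empty dictionary is falsy
    popped.append(d.popitem())     # removes the most recently inserted pair

print(popped)

try:
    d.popitem()                    # nothing left to pop
except KeyError:
    print('popitem() on an empty dictionary raises KeyError')
```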


#### Inserting keys with a default

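Sometimes we want to insert a key into a dictionary with a default value, but only if the key is not already present. The `.setdefault(key, value)` method does this: if the key exists, it returns the existing value and leaves the dictionary unchanged; otherwise it inserts the key with the given value and returns that value.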

In [37]:
d = {'a':1, 'b':2, 'c':3}

print(d.setdefault('a', 100))
print(d.setdefault('d', 100))
print(d)

1
100
{'a': 1, 'b': 2, 'c': 3, 'd': 100}
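
Because `.setdefault()` returns the value stored under the key, it is handy for grouping. The sketch below groups a made-up list of words by first letter; the data is purely illustrative.

```python
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

by_letter = {}
for word in words:
    # Returns the list already stored for this letter, or inserts a new
    # empty list and returns that. Either way we can append to it.
    by_letter.setdefault(word[0], []).append(word)

print(by_letter)
```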


#### Examples

##### Example 1

Here we have a string, and we want to count how many times each character appears in it.
Since we know the alphabet is a-z, we could create a dictionary with these initial keys. But the string may contain characters outside that range, such as punctuation marks or emojis, so that approach isn't really feasible. Instead, we add each key the first time we see it, using `.get()` with a default of 0:

In [39]:
text = 'Sed ut perspiciatis, unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam eaque ipsa, quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt, explicabo. Nemo enim ipsam voluptatem, quia voluptas sit, aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos, qui ratione voluptatem sequi nesciunt, neque porro quisquam est, qui dolorem ipsum, quia dolor sit amet consectetur adipisci[ng] velit, sed quia non-numquam [do] eius modi tempora inci[di]dunt, ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim ad minima veniam, quis nostrum exercitationem ullam corporis suscipit laboriosam, nisi ut aliquid ex ea commodi consequatur? Quis autem vel eum iure reprehenderit, qui in ea voluptate velit esse, quam nihil molestiae consequatur, vel illum, qui dolorem eum fugiat, quo voluptas nulla pariatur?'

counts = dict()
for char in text:
    counts[char] = counts.get(char, 0) + 1

print(counts)

{'S': 1, 'e': 77, 'd': 22, ' ': 128, 'u': 69, 't': 65, 'p': 22, 'r': 38, 's': 43, 'i': 76, 'c': 19, 'a': 70, ',': 20, 'n': 37, 'o': 51, 'm': 43, 'v': 15, 'l': 33, 'q': 26, 'b': 5, 'h': 3, 'x': 3, '.': 2, 'N': 1, 'f': 2, 'g': 5, '[': 3, ']': 3, '-': 1, 'U': 1, '?': 2, 'Q': 1}
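
As an aside, the standard library's `collections.Counter` performs the same kind of counting. This is a different tool from the `.get()` loop above, shown here only for comparison on a shortened sample string.

```python
from collections import Counter

sample = 'Sed ut perspiciatis, unde omnis'

# The manual approach from the example above.
manual = {}
for char in sample:
    manual[char] = manual.get(char, 0) + 1

# Counter is a dict subclass that counts hashable items in an iterable.
counted = Counter(sample)

print(dict(counted) == manual)
print(counted.most_common(3))      # the three most frequent characters
```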


##### Example 2

This is a continuation of the first example. We want to create a dictionary with three keys: upper, lower and other. The value for each key should be a collection of all the uppercase, lowercase and other characters in the string, respectively.

Since we don't want repeated characters, the values are going to be sets.

The string module will come in handy:

In [41]:
import string

print(string.ascii_lowercase)
print(string.ascii_uppercase)

abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ


In [43]:
import string

text = 'Sed ut perspiciatis, unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam eaque ipsa, quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt, explicabo. Nemo enim ipsam voluptatem, quia voluptas sit, aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos, qui ratione voluptatem sequi nesciunt, neque porro quisquam est, qui dolorem ipsum, quia dolor sit amet consectetur adipisci[ng] velit, sed quia non-numquam [do] eius modi tempora inci[di]dunt, ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim ad minima veniam, quis nostrum exercitationem ullam corporis suscipit laboriosam, nisi ut aliquid ex ea commodi consequatur? Quis autem vel eum iure reprehenderit, qui in ea voluptate velit esse, quam nihil molestiae consequatur, vel illum, qui dolorem eum fugiat, quo voluptas nulla pariatur?'

categories = {}

for char in text:
    if char in string.ascii_lowercase:
        key = 'lower'
    elif char in string.ascii_uppercase:
        key = 'upper'
    else:
        key = 'other'

    if key not in categories:
        categories[key] = set()

    categories[key].add(char)

print(categories)

{'upper': {'N', 'Q', 'S', 'U'}, 'lower': {'f', 's', 'e', 'h', 'c', 'p', 'a', 'd', 'n', 'g', 'i', 't', 'v', 'r', 'u', 'o', 'b', 'x', 'q', 'l', 'm'}, 'other': {' ', ']', '.', ',', '?', '[', '-'}}


To make the output more readable:

In [47]:
for key, value in categories.items():
    print(f"{key}: {''.join(value)}")

upper: NQSU
lower: fsehcpadngitvruobxqlm
other:  ].,?[-


We can improve this by using `setdefault()` so that, if the key doesn't exist, we create it with a default value of `set()` and return that set. If the key does exist, we just get the set back.

In [50]:
import string

text = 'Sed ut perspiciatis, unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam eaque ipsa, quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt, explicabo. Nemo enim ipsam voluptatem, quia voluptas sit, aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos, qui ratione voluptatem sequi nesciunt, neque porro quisquam est, qui dolorem ipsum, quia dolor sit amet consectetur adipisci[ng] velit, sed quia non-numquam [do] eius modi tempora inci[di]dunt, ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim ad minima veniam, quis nostrum exercitationem ullam corporis suscipit laboriosam, nisi ut aliquid ex ea commodi consequatur? Quis autem vel eum iure reprehenderit, qui in ea voluptate velit esse, quam nihil molestiae consequatur, vel illum, qui dolorem eum fugiat, quo voluptas nulla pariatur?'

categories = {}

for char in text:
    if char in string.ascii_lowercase:
        key = 'lower'
    elif char in string.ascii_uppercase:
        key = 'upper'
    else:
        key = 'other'

    value = categories.setdefault(key, set())
    value.add(char)

for key, value in categories.items():
    print(f"{key}: {''.join(value)}")

upper: NQSU
lower: fsehcpadngitvruobxqlm
other:  ].,?[-


To tidy the loop, we can move the if-elif-else into a function, and we can also make the lookup itself more efficient.

`if char in string.ascii_lowercase` has to scan the characters of `string.ascii_lowercase` one by one, which is O(n) in the length of that string.

It is faster to build a dictionary in which each upper- or lowercase character is a key whose value is 'upper' or 'lower'. A lookup is then O(1) on average, and any character missing from the dictionary is 'other'. The dictionary is built once, outside the function, so that it is not rebuilt for every character.

In [52]:
import string

# Build the lookup table once; rebuilding it on every call would cost more
# than the membership tests it replaces.
CHAR_TO_CATEGORY = {
    **dict.fromkeys(string.ascii_lowercase, 'lower'),
    **dict.fromkeys(string.ascii_uppercase, 'upper'),
}

def key_category_from_char(char):
    # Any character that is not an ASCII letter falls back to 'other'.
    return CHAR_TO_CATEGORY.get(char, 'other')

text = 'Sed ut perspiciatis, unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam eaque ipsa, quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt, explicabo. Nemo enim ipsam voluptatem, quia voluptas sit, aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos, qui ratione voluptatem sequi nesciunt, neque porro quisquam est, qui dolorem ipsum, quia dolor sit amet consectetur adipisci[ng] velit, sed quia non-numquam [do] eius modi tempora inci[di]dunt, ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim ad minima veniam, quis nostrum exercitationem ullam corporis suscipit laboriosam, nisi ut aliquid ex ea commodi consequatur? Quis autem vel eum iure reprehenderit, qui in ea voluptate velit esse, quam nihil molestiae consequatur, vel illum, qui dolorem eum fugiat, quo voluptas nulla pariatur?'

categories = {}

for char in text:
    key = key_category_from_char(char)

    char_set = categories.setdefault(key, set())
    char_set.add(char)

for key, value in categories.items():
    print(f"{key}: {''.join(value)}")

upper: NQSU
lower: fsehcpadngitvruobxqlm
other:  ].,?[-
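
Another option, not used in the examples above, is `collections.defaultdict`, which creates the empty set automatically the first time a key is accessed. A minimal sketch on a shortened sample string, with the output sorted so that it is stable from run to run:

```python
import string
from collections import defaultdict

# Lookup table built once and reused for every character.
char_to_category = {
    **dict.fromkeys(string.ascii_lowercase, 'lower'),
    **dict.fromkeys(string.ascii_uppercase, 'upper'),
}

sample = 'Sed ut perspiciatis, unde omnis?'

categories = defaultdict(set)      # missing keys start out as an empty set
for char in sample:
    categories[char_to_category.get(char, 'other')].add(char)

for key, value in categories.items():
    # Sets are unordered, so sort the characters before joining them.
    print(f"{key}: {''.join(sorted(value))}")
```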


# 03 - Dictionary Views

#### Basics

We already know `d.keys()`, `d.values()` and `d.items()`, each of which produces an iterable view. Since insertion order is maintained, zipping `.keys()` with `.values()` yields the same pairs as `.items()`.

All of these views are **read-only**. We cannot modify the dictionary by modifying the views.

**Dictionary Views are Dynamic**

**Views are more than just iterables**, which can be unintuitive. If we store *any* of these views in a variable and then modify the dictionary, the variable reflects the modification. That is because a view holds no copy of the data: it reads from the underlying dictionary every time it is used.

In [54]:
d = {'a': 1, 'b': 2}

my_items = d.items()
print(my_items)

d['a'] = 100
d['b'] = 200
d['c'] = 300

print(my_items)

dict_items([('a', 1), ('b', 2)])
dict_items([('a', 100), ('b', 200), ('c', 300)])
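
The same holds for the other two views. The sketch below also takes a `list()` snapshot to show the difference between a view and a copy; the variable names are illustrative.

```python
d = {'a': 1, 'b': 2}

keys_view = d.keys()
values_view = d.values()
snapshot = list(d.keys())      # a list is a copy, not a view

d['c'] = 3
del d['a']

print(keys_view)               # reflects both the insertion and the deletion
print(values_view)
print(snapshot)                # still the keys at the time of the copy
print(len(keys_view))          # views support len() as well
```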


**The `keys()` view behaves like a set**.

This makes sense since sets are essentially dictionaries with no values: the keys of a dictionary, like the elements of a set, are guaranteed to be unique and hashable.

Therefore, the `.keys()` view has set-like functionality. We can perform unions, intersections and differences.

**The `values()` view does *not* behave like a set**. 

This makes sense since the values of a dictionary are guaranteed to be neither unique nor hashable.

**The `items()` view *may* behave like a set**.

We know that each key-value tuple will be unique from one another because each key is guaranteed to be unique. 

The only remaining condition is that **all** the values are hashable. If they are, the `items()` view also has set-like behaviour. If they are not, a set operation that has to hash the items, such as a union, raises a `TypeError`.
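
A small sketch of all three cases, using made-up dictionaries: an intersection of `items()` views with hashable values, a union that fails because a value is a list, and a check that `values()` does not provide the set operators at all.

```python
hashable = {'a': 1, 'b': 2}
other = {'a': 1, 'b': 20}

# Every value is hashable, so the items views support set operations.
common = hashable.items() & other.items()
print(common)

unhashable = {'a': [1, 2], 'b': [3]}

try:
    # Building the union has to hash each (key, value) tuple,
    # which fails for tuples that contain a list.
    unhashable.items() | other.items()
except TypeError as exc:
    error = str(exc)
    print(f'TypeError: {error}')

# The values() view does not define the set operators.
values_support_and = hasattr(hashable.values(), '__and__')
print(values_support_and)
```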

#### Set Operations

We already know the basics of set operations:

In [1]:
s1 = {1, 2, 3}
s2 = {2, 3, 4}

Unions:

In [2]:
s1 | s2

{1, 2, 3, 4}

Intersections:

In [3]:
s1 & s2

{2, 3}

Differences: 

What is in `s1` that isn't in `s2`:

In [4]:
s1 - s2

{1}

What is in `s2` that isn't in `s1`:

In [5]:
s2 - s1

{4}

To demonstrate the set-like behaviour of dictionary keys:

In [56]:
d1 = {1: None, 2: None, 3: None}
d2 = {2: None, 3: None, 4: None}

In [57]:
d1.keys() | d2.keys()

{1, 2, 3, 4}

In [58]:
d1.keys() & d2.keys()

{2, 3}

In [59]:
d1.keys() - d2.keys()

{1}

In [60]:
d2.keys() - d1.keys()

{4}
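
One place this is useful is comparing two dictionaries by their keys. The sketch below uses hypothetical `defaults` and `overrides` dictionaries; note that the result of each operation is a plain `set`, and that a view can be combined with any iterable, not only with another view.

```python
defaults = {'host': 'localhost', 'port': 8000, 'debug': False}
overrides = {'port': 9000, 'debug': True, 'verbose': True}

shared_keys = defaults.keys() & overrides.keys()       # in both
unknown_keys = overrides.keys() - defaults.keys()      # only in overrides
either_not_both = defaults.keys() ^ overrides.keys()   # symmetric difference

print(sorted(shared_keys))
print(sorted(unknown_keys))
print(sorted(either_not_both))

# The right-hand operand can be any iterable, such as a list.
listed = defaults.keys() & ['host', 'timeout']
print(sorted(listed))
```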

# 04 - Updating, Merging and Copying

# 05 - Custom Classes and Hashing