## Introduction
The Python programming language implements a number of data structures natively. Lists, tuples, sets, and dictionaries are some of them. We will be looking at the dictionary data type in the sections that follow.
### What are Python dictionaries?
![dictionary image](dictionaries.jpg)

A dictionary in Python is a mapping object that maps keys to values, where the keys are unique within a collection and the values can be any arbitrary object. In addition to being unique, keys are also required to be hashable. An object is said to be hashable if it has a hash value (implemented by a `__hash__()` method) that does not change during the object’s lifetime. Most commonly, we use immutable data types, such as strings, integers, and tuples (only if they contain similarly hashable types) as dictionary keys. A dictionary literal is enclosed by a pair of curly braces `{ }`.
Typically dictionaries look like this:

In [None]:
my_dict = {"first_name":"John","last_name":"Snow","age":16,"gender":"Male"}

We have created a dictionary named `my_dict`. Each key is separated from its value by a colon, and the key-value pairs are separated by commas. The keys are:
- `first_name`
- `last_name`
- `age`
- `gender`

The values from `my_dict` are:
- `John`
- `Snow`
- `16`
- `Male`
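
The hashability requirement described above can be checked directly. The following is a minimal sketch; the `coordinates` dictionary and its contents are made up purely for illustration.

```python
# Tuples of immutable values are hashable, so they can be used as keys
coordinates = {(0, 0): "origin", (1, 2): "point A"}
print(coordinates[(1, 2)])  # prints: point A

# A list is mutable, so it has no stable hash and cannot be a key
try:
    invalid = {[1, 2]: "point A"}
    error_name = None
except TypeError as error:
    error_name = type(error).__name__
    print(error_name, "-", error)
```

Attempting to use the list as a key raises a `TypeError` before the dictionary is ever created.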


### How different are dictionaries from other common data structures in Python?
Unlike sequence types such as lists and tuples, where indexing is done using positional indices, dictionaries are indexed using their keys. Individual values are therefore accessed using these keys.
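
As a small illustration of the difference, the sketch below stores the same two values in a list and in a dictionary. The variable names `job_as_list` and `job_as_dict` are chosen only for this example.

```python
# The same data stored positionally and by key
job_as_list = ["Production Manager", "Rest of Kenya"]
job_as_dict = {"title": "Production Manager", "location": "Rest of Kenya"}

# A list needs the position of the value we want
print(job_as_list[1])           # prints: Rest of Kenya

# A dictionary needs the key of the value we want
print(job_as_dict["location"])  # prints: Rest of Kenya
```

With the list we have to remember that the location sits at position 1, while the dictionary lets the key describe the value.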
### How similar are dictionaries to other data structures in other languages?
Dictionaries are an implementation of [Associative Arrays](https://en.wikipedia.org/wiki/Associative_array). All associative arrays have a structure of (key, value) pairs, where each key is unique within a collection. Other languages have similar implementations, such as:
- Maps in Go
- std::map in C++
- Maps in Java
- JavaScript objects

### Use cases of dictionaries
Typically dictionaries are used to store associative data, i.e. data that is related. Examples include:
 - The attributes of an object.
 - A row of `SQL` data.
 
In this particular scenario, we will be using dictionaries to store job listing details from [Kaggle](https://www.kaggle.com/ardenn/brighter-monday-job-listings/data).

## Dictionary Operations
### Creating a Dictionary
- To create an empty dictionary, we use a pair of curly braces with nothing between them


In [None]:
empty_dict = {}

In the line above, we have created an empty dictionary named `empty_dict`.

- To create a dictionary with items, we use a pair of curly braces with the key-value pairs between them. We can now create a dictionary to represent the second row of data in the `jobs.csv` file.

In [None]:
job1 = {"title":"Production Manager","location":"Rest of Kenya","job_type":"Full Time",
             "employer":"The African Talent Company (TATC)","category":"Farming"}

We just created a dictionary with the keys `title`, `location`, `job_type`, `employer`, `category` and assigned it to the variable `job1`.

- Dictionaries can also be created using the `dict()` constructor. To do this we pass the constructor a sequence of key-value pairs. We could also pass in named arguments. Let's create a dictionary to represent the third row of data in the `jobs.csv` file, using both of these methods.

In [None]:
# create an empty dictionary
empty_property = dict()

# create dictionary using a list of key-value tuples
job2 = dict([
    ("title", "Marketing & Business Development Manager"),
    ("location", "Mombasa"),
    ("job_type", "Full Time"),
    ("employer", "KUSCCO Limited (Kenya Union of Savings & Credit Co-operatives Limited)"),
    ("category", "Marketing & Communications"),
])

We passed a sequence, in this case a list of key-value tuples, to the `dict()` constructor to create our dictionary, and assigned it to the variable `job2`.

In [None]:
# Using keyword arguments
dict(title="Marketing & Business Development Manager",location="Mombasa",job_type="Full Time",
     employer="KUSCCO Limited (Kenya Union of Savings & Credit Co-operatives Limited)",
     category="Marketing & Communications")

Here, we created a dictionary using named arguments. The keys are the argument names, while the values are the argument values. It is however important to note that this method only works when our keys are strings that are valid Python identifiers (for example, they cannot contain spaces or start with a digit).
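
When the keys are not valid identifiers, the other creation styles still work. The sketch below uses made-up keys that contain spaces and builds the dictionary from a sequence of pairs produced by the built-in `zip()`, then compares it with the equivalent literal.

```python
# Keys containing spaces cannot be written as keyword arguments:
# dict(job title="Production Manager") would be a SyntaxError.
keys = ["job title", "job type"]
values = ["Production Manager", "Full Time"]

# zip() pairs each key with its value; dict() accepts that sequence of pairs
job_from_pairs = dict(zip(keys, values))

# The same dictionary written as a literal
job_from_literal = {"job title": "Production Manager", "job type": "Full Time"}

print(job_from_pairs == job_from_literal)  # prints: True
```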

### Accessing Items
- As we mentioned earlier on, dictionaries are indexed using their keys. To access a particular value in a dictionary we use the indexing operator (key inside square brackets).
- Similarly, we can use the `get()` method of dictionaries to get the value associated with a particular key. Let's use these methods to access the `title` from `job2`. 

In [None]:
# Using key indexing
job2["title"] #return 'Marketing & Business Development Manager'

# Using get() method
job2.get("title") #return 'Marketing & Business Development Manager'
job2.get("salary") #return None

# Passing a second argument to get()
job2.get("salary", 5000) #return 5000

# Using the in operator
"salary" in job2 #returns False
"title" in job2 #returns True

- In the example above we use indexing to access the `title` from `job2`. Alternatively, we can use `get()` to access the `title` and `salary` values.
- If we attempt to access a key that doesn't exist by indexing, we get a `KeyError`. To avoid this, we use the `get()` method and pass it the key name. `get()` also takes an optional second argument to be returned if the key is not found.
- Since `job2` doesn't have a `salary` key, `get("salary")` returns `None`. When we pass a second argument to `get()`, the return value is `5000` instead of `None`.
- If we do not want to use `get()`, we can check whether a key is present using the `in` operator.
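
The `KeyError` mentioned above can also be handled explicitly. This is a minimal sketch using a small stand-in dictionary named `job` (not one of the dictionaries created earlier) so that it runs on its own.

```python
job = {"title": "Loans Manager", "location": "Mombasa"}

# Indexing a missing key raises KeyError, which we can catch
try:
    salary = job["salary"]
except KeyError as error:
    print("Missing key:", error)
    salary = None

# Checking with the in operator avoids the exception altogether
if "salary" in job:
    print(job["salary"])
else:
    print("No salary listed")
```

Which style to use is a matter of taste: `get()` is the shortest, while `try`/`except` is handy when a missing key is genuinely exceptional.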

### Adding and Modifying Items
Dictionaries can be modified directly using the keys or using the `update()` method. `update()` takes in a dictionary with the key-value pairs to be modified or added. For our demonstration, let's:
- add a new item (salary) to `job2` with a value of 10000, 
- modify the `job_type` to be "Part time", 
- update the `salary` to 20000 and
- update the dictionary to include a new item (available) with a value of `True`.

In [None]:
# Adding a new entry for salary using the index
job2["salary"] = 10000

# Modifying the entry for job_type using the index
job2["job_type"] = "Part time"

# Modifying the salary entry using update
job2.update({"salary":20000})

# Adding the available entry using update
job2.update({"available":True})

To add a new entry, we use syntax similar to indexing. If the key exists, its value is modified; if the key doesn't exist, a new entry is created with the specified key and value.
- In the first example we assign a value of 10000 to the `salary` key. Since `salary` doesn't exist yet, a new entry is created with that value.
- In the second example, the `job_type` key already exists, so its value is modified to "Part time".
- Next, we use the `update()` method to change the `salary` value to 20000, since `salary` is already a key in the dictionary.
- Finally, we call `update()` with a key that is not yet in the dictionary, so a new entry is created with a key of `available` and a value of `True`.
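
`update()` is not limited to one entry at a time. The sketch below, using a small stand-in dictionary named `job`, shows three call styles that `dict.update()` accepts: another dictionary, keyword arguments, and a sequence of key-value pairs.

```python
job = {"title": "Loans Manager", "job_type": "Full Time"}

# Modify one entry and add another in a single call
job.update({"job_type": "Part time", "salary": 20000})

# Keyword arguments work when the keys are valid identifiers
job.update(available=True)

# A sequence of (key, value) pairs is accepted as well
job.update([("location", "Mombasa")])

print(job)
```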

### Removing Items
We can now remove the `salary` and `available` entries we just created from `job2`, and remove everything from `job1`.

In [None]:
del job2["salary"]
del job2["available"]
print(job2) #prints a dictionary without 'salary' and 'available' entries

job1.clear()
print(job1) #prints an empty dictionary

del job1

- To remove the entries associated with the `salary` and `available` keys from `job2`, we use the `del` keyword. Now if we print `job2`, the `salary` and `available` entries are gone.
- To remove all items from `job1` we use the `clear()` method. Printing `job1` afterwards gets us an empty dictionary.
- If we don't need a dictionary anymore, say `job1`, we use the `del` keyword to delete it. Now if we print `job1` we get a `NameError` since `job1` is no longer defined.
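
One detail worth knowing is that `del` on a key that is not present raises a `KeyError`, just like indexing does. A minimal sketch, using a stand-in dictionary named `job`:

```python
job = {"title": "Loans Manager", "salary": 20000}

# The first del succeeds because the key exists
del job["salary"]

# A second del on the same key fails because it is already gone
try:
    del job["salary"]
    outcome = "deleted"
except KeyError:
    outcome = "KeyError"
print(outcome)  # prints: KeyError

# clear() empties the dictionary but keeps the variable defined
job.clear()
print(job)      # prints: {}
```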

### Iterating Through Dictionaries
Dictionaries are iterable, and we can iterate through them in 3 different ways:
 - `dict.values()` - this returns an iterable of the dictionary's values.
 - `dict.keys()` - this returns an iterable of the dictionary's keys.
 - `dict.items()` - this returns an iterable of the dictionary's (key,value) pairs.
 
Let's iterate over `job2` with a `for` loop, using each of the three methods.

In [None]:
# Using values()
for val in job2.values():
    print(val) #prints the values of job2
    
# Using keys()
for key in job2.keys():
    print(key) #prints the keys of job2
    
# Using items()
for key,val in job2.items():
    print(key, val) #prints the keys and values of job2
    
# Iterating over the dictionary directly
for key in job2:
    print(key) #prints the keys of job2

- First, we loop through the iterable `job2.values()` and print out the value during each iteration.
- Secondly, we iterate through `job2.keys()` while printing out the key. This is the default behaviour when we loop through the entire dictionary, without specifying whether we want the keys, values or items.
- Thirdly, we loop through the keys and values simultaneously. We include both `key` and `val` in our for-loop header since `job2.items()` yields a tuple of key and value during each iteration. Our loop therefore prints out the pair at each step.
- The last example produces an output similar to the second.
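
The equivalence between looping over a dictionary and looping over its `keys()` can be verified by collecting the results into lists. This sketch uses a small stand-in dictionary named `job` and relies on dictionaries preserving insertion order (Python 3.7 and later).

```python
job = {"title": "Loans Manager", "location": "Mombasa"}

# Iterating the dictionary itself yields its keys
keys_direct = [key for key in job]
keys_method = list(job.keys())
print(keys_direct == keys_method)  # prints: True

# items() yields (key, value) tuples
pairs = list(job.items())
print(pairs)  # prints: [('title', 'Loans Manager'), ('location', 'Mombasa')]
```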

### Sorting Dictionaries
Dictionaries preserve the order in which items were inserted (guaranteed since Python 3.7), but they do not come with sorting functionality of their own. Calling the `sorted()` function and passing it a dictionary only returns a list of the keys in sorted order.

If we use the `items()` iterable we can sort the items of our dictionary as we please. However, this doesn't give us a dictionary, but a list of key-value tuples in sorted order.

In [None]:
# Using sorted() to sort a dictionary's items on the keys
sorted(job2.items(),key=lambda item:item[0])

In this example we use Python's built-in `sorted()` function, which takes in an iterable (our dictionary's items). The `key` argument instructs `sorted()` to use the element at index 0 of each tuple, the dictionary key, for sorting. Similarly, to sort by the values, we use index 1 instead of index 0.
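
Sorting by value, and turning the sorted pairs back into a dictionary, could look like the sketch below. It uses a stand-in dictionary named `job` with string values only, since values of mixed types (for example strings and integers) cannot be compared with each other. Rebuilding with `dict()` keeps the sorted order only on Python 3.7 and later.

```python
job = {
    "title": "Loans Manager",
    "location": "Mombasa",
    "category": "Accounting & Auditing",
}

# Sort the (key, value) pairs by value (index 1 of each tuple)
by_value = sorted(job.items(), key=lambda item: item[1])
print(by_value[0])  # prints: ('category', 'Accounting & Auditing')

# Sort by key and rebuild a dictionary from the sorted pairs
sorted_job = dict(sorted(job.items()))
print(list(sorted_job))  # prints: ['category', 'location', 'title']
```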

### Other Dictionary Methods
Dictionaries have other methods that could be used on demand. To read up further on these, please consult the [Python documentation](https://docs.python.org/3/library/stdtypes.html#typesmapping). Here are some other useful methods:
- `pop(key,default)` - removes the key `key` and returns its value, or returns an optional `default` when the key doesn't exist. If the key is missing and no `default` is given, a `KeyError` is raised.
- `len(d)` - returns the number of items in a dictionary `d`.
- `copy()` - returns a shallow copy of the original. The copy is a new dictionary, but its values are references to the same objects as the original's values, not copies of them.
- `setdefault(key,default)` - returns the value of `key` if in the dictionary, or sets the new key with an optional `default` as its value then returns the value.
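
The methods listed above can be tried out together. This is a minimal sketch with a made-up dictionary named `job`; the `tags` entry holds a list so that the effect of a shallow copy is visible.

```python
job = {"title": "Loans Manager", "tags": ["finance"]}

print(len(job))                    # prints: 2

# pop() removes the key and returns its value
title = job.pop("title")           # 'Loans Manager'
missing = job.pop("salary", None)  # key absent, so the default None is returned

# copy() is shallow: both dictionaries share the same list object
job_copy = job.copy()
job_copy["tags"].append("loans")
print(job["tags"])                 # prints: ['finance', 'loans']

# setdefault() inserts the key only if it is not already present
location = job.setdefault("location", "Mombasa")
print(location)                    # prints: Mombasa
```

Note that `setdefault()` changed `job` but not `job_copy`, since the two are separate dictionaries even though they share the `tags` list.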

## Speeding Up your Code with Dictionaries
### Dictionary Unpacking
Dictionary unpacking involves destructuring a dictionary into individual keyword arguments with values. This is especially useful for cases that involve supplying multiple keyword arguments, for example in function calls. To implement this functionality we use the dictionary unpacking operator (`**`).

Let's use this approach in the constructor function of a class. 

In [None]:
from jobs import Job

# Creating a job object without unpacking
Job("Marketing & Business Development Manager", "Mombasa", "Full Time",
    "KUSCCO Limited (Kenya Union of Savings & Credit Co-operatives Limited)",
    "Marketing & Communications")

# Creating a job object with unpacking
Job(**job2)

To instantiate a new `Job` object, traditionally, we would need to pass in all the required arguments. However, with unpacking, we just pass in a dictionary with the `**` operator before it. The operator unpacks the dictionary into named arguments, one per key, so the keys must match the constructor's parameter names. This approach is much cleaner and involves less code.
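
The `jobs` module itself is not shown here, so the sketch below assumes a hypothetical `Job` class whose constructor parameters happen to match the dictionary's keys. It is only meant to show how `**` maps keys to parameters, and what happens when a key has no matching parameter.

```python
class Job:
    """Hypothetical stand-in for the Job class in the jobs module."""

    def __init__(self, title, location, job_type, employer, category):
        self.title = title
        self.location = location
        self.job_type = job_type
        self.employer = employer
        self.category = category


job_data = {
    "title": "Loans Manager",
    "location": "Mombasa",
    "job_type": "Full Time",
    "employer": "KUSCCO Limited",
    "category": "Accounting & Auditing",
}

# Each key becomes a keyword argument: Job(title=..., location=..., ...)
job = Job(**job_data)
print(job.title)  # prints: Loans Manager

# A key with no matching parameter raises TypeError
try:
    Job(**job_data, salary=20000)
    unpack_error = None
except TypeError as error:
    unpack_error = type(error).__name__
print(unpack_error)  # prints: TypeError
```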

## Downside Of Using Dictionaries
Compared to lists and tuples, dictionaries take up more space in memory, since they need to store both the keys and the values, as opposed to just values. Therefore, dictionaries should only be used in cases where we have associative data that would lose meaning if stored in lists.
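
A rough way to see this difference is to compare the sizes reported by `sys.getsizeof()` for a list and a dictionary holding the same values. This sketch assumes CPython; the exact numbers depend on the Python version and platform, so none are quoted here.

```python
from sys import getsizeof

values = ["Loans Manager", "Mombasa", "Full Time"]
keys = ["title", "location", "job_type"]

as_list = list(values)
as_dict = dict(zip(keys, values))

# getsizeof() reports the container's own size in bytes,
# not the size of the strings it refers to
list_size = getsizeof(as_list)
dict_size = getsizeof(as_dict)
print(list_size, dict_size)
```

On CPython the dictionary is expected to report the larger size, because it stores a hash table and keys alongside the references to the values.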

### When not to use dictionaries
- Dictionaries are looked up by key rather than by position, so they are a poor fit for data that needs to be accessed by index or kept in sorted order; a list serves better there.
- Dictionaries are also mutable, and not suitable for storing data that shouldn't be modified in place.

### How not to use dictionaries
Dictionaries are designed to let us find a value by its key in roughly constant time, without having to search through the entire dictionary, so we should not use loops for such an operation.

In [None]:
# How not to search for a value and return it
key_i_need = "location"
target = ""
for key in job2:
    if key == key_i_need:
        target = job2[key]
        
# How to search efficiently
target = job2.get("location")

We have a variable `key_i_need` containing the key we want to search for. We then use a for loop to traverse the collection, comparing the key at each step with our variable. If we get a match, we assign that key's value to the variable `target`. This is the wrong approach. We should instead use `get()`, and pass it the desired key.

## Performance Tradeoffs
In this section, we will be comparing the space-time trade-offs between dictionaries and objects. We will use the `timeit` module and `getsizeof` function to determine the time and space aspects. The operations we will be testing are:
- Accessing an entry in a dictionary versus accessing a field in an object.
- Adding a new entry to a dictionary versus adding a new field to an object.

In terms of space, we will be testing the following operations:
- Size of object classes.
- Size of the default object.
- Size of an empty dictionary.
- Size of dictionary and object after adding one entry and one field respectively.

### Speed Tests
#### Accessing a single entry from a dictionary vs accessing a field in an object

In [None]:
import timeit
# Accessing entry using indexing operator
timeit.timeit('string="Random String"+job3["employer"]',setup='job3 = {"title":"Loans Manager","location":"Mombasa",\
"job_type":"Full Time","employer":"KUSCCO Limited (Kenya Union of Savings & Credit Co-operatives Limited)",\
"category":"Accounting & Auditing"}',number=10000)

In [None]:
# Accessing entry using get() method
timeit.timeit('string="Random String"+job3.get("employer")',setup='job3 = {"title":"Loans Manager","location":"Mombasa",\
"job_type":"Full Time","employer":"KUSCCO Limited (Kenya Union of Savings & Credit Co-operatives Limited)",\
"category":"Accounting & Auditing"}',number=10000)

In [None]:
# Accessing object field using namespace operator
timeit.timeit('string="Random String"+job3.employer',setup='from jobs import Job; job3 = Job("Loans Manager","Mombasa","Full Time",\
        "KUSCCO Limited (Kenya Union of Savings & Credit Co-operatives Limited)",\
        "Accounting & Auditing")',number=10000)

For the three tests, we look at the time it takes for each operation to run 10000 times. Exact timings vary between machines and Python versions, but based on our output we can rank them as follows:
- Fastest - accessing an entry from a dictionary using indexing operator
- Second fastest - accessing a field from an object using the namespace operator
- Slowest - accessing an entry from a dictionary using `get()` method

#### Adding a new entry to a dictionary vs adding a new field to an object

In [None]:
# Adding a new entry using indexing and assignment operators
timeit.timeit('job3["salary"]=50000',setup='job3 = {"title":"Loans Manager","location":"Mombasa",\
"job_type":"Full Time","employer":"KUSCCO Limited (Kenya Union of Savings & Credit Co-operatives Limited)",\
"category":"Accounting & Auditing"}',number=10000)

In [None]:
# Adding a new entry using update() method
timeit.timeit('job3.update({"salary":50000})',setup='job3 = {"title":"Loans Manager","location":"Mombasa",\
"job_type":"Full Time","employer":"KUSCCO Limited (Kenya Union of Savings & Credit Co-operatives Limited)",\
"category":"Accounting & Auditing"}',number=10000)

In [None]:
# Adding a new field to the object
timeit.timeit('job3.salary = 50000',setup='from jobs import Job; job3 = Job("Loans Manager","Mombasa",\
"Full Time","KUSCCO Limited (Kenya Union of Savings & Credit Co-operatives Limited)",\
"Accounting & Auditing")',number=10000)

For this round of tests, we measure the time it takes to add a new entry to an existing dictionary and the time it takes to add a new field to an object. From the results we can rank the operations as:
- Fastest - adding a new entry to a dictionary using the indexing and assignment operators
- Second fastest - adding a new field to an object
- Slowest - updating a dictionary with a new entry

Overall, as far as time is concerned, indexing into a dictionary and assigning to a key were faster than the equivalent object attribute operations, while the `get()` and `update()` methods were the slowest.
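
Single timing runs can be noisy, so one option is to repeat each measurement and keep the smallest result. The sketch below uses `timeit.repeat()` on a shortened, made-up `job3` dictionary; it is one possible way to re-run the comparison, not the setup used for the rankings above, and no timings are quoted because they differ from machine to machine.

```python
import timeit

# Shortened stand-in for the job3 dictionary used in the tests
setup = 'job3 = {"title": "Loans Manager", "employer": "KUSCCO Limited"}'

# Run each statement 10000 times, and repeat that measurement 5 times
index_times = timeit.repeat('job3["employer"]', setup=setup,
                            repeat=5, number=10000)
get_times = timeit.repeat('job3.get("employer")', setup=setup,
                          repeat=5, number=10000)

# The minimum is usually the least disturbed by other processes
print("indexing:", min(index_times))
print("get():   ", min(get_times))
```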

### Space Tests
Let's set up the test environment by initialising a `Job` object and creating a new dictionary with the same data.

In [None]:
# create a job object
from jobs import Job
job_object = Job("Loans Manager","Mombasa","Full Time",\
        "KUSCCO Limited (Kenya Union of Savings & Credit Co-operatives Limited)",\
        "Accounting & Auditing")

# Create a job dictionary
job_dict = {"title":"Loans Manager","location":"Mombasa",\
"job_type":"Full Time","employer":"KUSCCO Limited (Kenya Union of Savings & Credit Co-operatives Limited)",\
"category":"Accounting & Auditing"}

# show that the internal values are the same
job_dict == job_object.__dict__

We will use the `getsizeof()` function from the `sys` module to measure the size of our objects in bytes, and compare them. `getsizeof()` only measures the shallow size of an object, excluding the objects it refers to, such as its fields. However, for dictionaries, it includes the size of the hash table.

In [None]:
from sys import getsizeof

In [None]:
# Size of Job class
getsizeof(Job)

In [None]:
# Size of original Job() object
getsizeof(Job())

In [None]:
# Size of job_object
getsizeof(job_object)

In [None]:
# Size of the dict class
getsizeof(dict)

In [None]:
# Size of empty dictionary {}
getsizeof({})

In [None]:
# Size of job_dict
getsizeof(job_dict)

From our results, we can tell that:
- The `dict` class has a smaller memory footprint (400 bytes) compared to the `Job` class (1184 bytes).
- The `job_object` has a smaller reported size (56 bytes) compared to `job_dict` (240 bytes).

### Adding one entry to the original dictionary

In [None]:
# Size of job_dict with one additional entry using update()
job_dict2 = job_dict.copy()
job_dict2.update({'salary':50000})
getsizeof(job_dict2)

In [None]:
# Size of job_dict with one additional entry using index and assignment operator
job_dict['salary']=50000
getsizeof(job_dict)

### Adding one extra field to the original Job class
`Job2` is an implementation of `Job` with one extra field

In [None]:
from jobs import Job2

In [None]:
# Size of Job2 class - with one extra field
getsizeof(Job2)

In [None]:
# Size of Job2 default object - with one extra field
getsizeof(Job2())

### Adding two extra fields to the original Job class
`Job3` is another implementation of `Job` with two additional fields

In [None]:
from jobs import Job3

In [None]:
# Size of Job3 class - with two extra fields
getsizeof(Job3)

In [None]:
# Size of Job3 default object - with two extra fields
getsizeof(Job3())

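From our numbers above, objects appear to take up less memory than dictionaries, despite the class definitions being rather large. This comparison should be read with care: `getsizeof()` reports only the shallow size of an instance and does not include the instance's `__dict__`, where its fields are normally stored, whereas for a dictionary it includes the hash table. The hash table is also not rebuilt on every add operation; it is resized only when it fills up beyond a threshold.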
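
Since `getsizeof()` excludes an object's fields, it can help to look at the size of the instance's `__dict__` as well. The sketch below uses a hypothetical `Point` class, defined only for this illustration, and assumes CPython; exact byte counts vary by version, so none are quoted.

```python
from sys import getsizeof


class Point:
    """Hypothetical class used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


point = Point(1, 2)

# Shallow size of the instance: its attributes are not counted
shallow_size = getsizeof(point)

# The attributes are stored in a per-instance dictionary
attribute_dict_size = getsizeof(point.__dict__)

print(shallow_size, attribute_dict_size, shallow_size + attribute_dict_size)
```

Adding the two figures gives a fairer, though still approximate, basis for comparing an object with a plain dictionary holding the same data.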

## Conclusion
Dictionaries come in very handy for regular Python usage. They are suitable for associative data that is looked up by key. Caution should however be exercised to ensure we do not use dictionaries in the wrong way and end up slowing down execution of our code. For further reading please refer to the official Python documentation on [mapping types](https://docs.python.org/3/library/stdtypes.html#typesmapping).