# SETS AND DICTIONARIES

## TABLE OF CONTENTS

|S/N| CONTENT                                              |
|---|----------------------------------------------------- |
|1| [SET](#set)                                            |
|2| [SET METHODS](#set-operations)                         |
|3| [DICTIONARIES](#dict)                                  |
|4| [DICTIONARY METHODS](#dict-operations)                 |
|5| [SWITCHING AND COMBINATION OF DICT](#loop-in-dict)     | 
|6| [REFERENCE](#reference)                                |

## Ground Rules to Follow

- Sign up on [Anaconda Cloud](https://anaconda.cloud/)
- Navigate to the notebook side of the website.
- ![capture.PNG](attachment:5b98258e-e038-4278-a3ed-183b02a5a7ee.PNG)
- Create a folder and import the files from the Team's folder.
- Run your notebook along.

### SET <a class="anchor" id="set"></a>

#### What is a Set?

- A Python set is similar to a list, except that it is unordered. It can store heterogeneous data and it is mutable, but what does it mean to be unordered? 
The simplest explanation is to look at an example. We can create a set by enclosing our data in curly brackets {}.

In [6]:
example_set = {'Dylan', 26, 167.6}
print(example_set)

{'Dylan', 26, 167.6}


In [1]:
basket = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}
print(basket)    

{'apple', 'orange', 'pear', 'banana'}
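
To make "unordered" concrete, here is a minimal sketch (the variable names are illustrative and not part of the notebook). It shows that duplicates are dropped when a set is built and that a set has no positions to index.

```python
# Illustrative names; not part of the original notebook.
fruits = {'apple', 'orange', 'apple', 'pear'}

# Duplicates are discarded when the set is built.
print(len(fruits))  # 3

# A set has no positions, so indexing raises TypeError.
try:
    fruits[0]
except TypeError as err:
    index_error = str(err)
    print('TypeError:', index_error)
```

Because of this, the order in which a set prints may differ from the order in which you typed its elements.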


### SET METHODS <a class="anchor" id="set-operations"></a>

#### Querying a set

In [None]:
'orange' in basket 

#### Adding to a set

- The add method of a set works similarly to the append method of a list: it adds one element. The update method of a set works similarly to the extend method of a list: it adds every element of an iterable.

In [7]:
example_set.add('True')
print(example_set)

{'True', 'Dylan', 26, 167.6}


#### Update to a set

In [None]:
example_set.update([58.1, 'brown'])
print(example_set)
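
The comparison between add/update on a set and append/extend on a list can be sketched side by side. The values below are illustrative only; the one difference worth noticing is that the set silently ignores elements it already holds.

```python
# Illustrative values; not part of the original notebook.
colors_list = ['red']
colors_set = {'red'}

colors_list.append('blue')            # one element added to the list
colors_set.add('blue')                # one element added to the set

colors_list.extend(['green', 'red'])  # every element appended, duplicates kept
colors_set.update(['green', 'red'])   # every element added, duplicates ignored

print(colors_list)         # ['red', 'blue', 'green', 'red']
print(sorted(colors_set))  # ['blue', 'green', 'red']
```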

#### Deleting an element from a set

In [3]:
print(example_set.pop())

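- `pop()` removes and returns an arbitrary element, so the value printed can differ from run to run. A set cannot be deleted from by position: `del example_set[0]` raises `TypeError: 'set' object doesn't support item deletion`.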
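
As a minimal sketch with illustrative values, the standard removal methods of a set (`remove`, `discard`, and `pop`) differ mainly in how they treat a missing element:

```python
# Illustrative values; not part of the original notebook.
letters = {'a', 'b', 'c'}

letters.remove('a')    # removes 'a'; raises KeyError if the element is missing
letters.discard('z')   # does nothing, because 'z' is not in the set

try:
    letters.remove('z')
except KeyError:
    print('remove() raised KeyError for a missing element')

popped = letters.pop()       # removes and returns an arbitrary element
print(popped in {'b', 'c'})  # True
print(len(letters))          # 1
```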

#### Operations in set

In [None]:
student_a_courses = {'history', 'english', 'biology', 'theatre'}
student_b_courses = {'biology', 'english', 'mathematics', 'computer science'}

print(student_a_courses.intersection(student_b_courses))
print(student_a_courses.union(student_b_courses))
print(student_a_courses.difference(student_b_courses))
print(student_b_courses.difference(student_a_courses))
print(student_a_courses.symmetric_difference(student_b_courses))

In [None]:
a = set('abracadabra')
b = set('alacazam')
a                                  # unique letters in a
# {'a', 'r', 'b', 'c', 'd'}
a - b                              # letters in a but not in b
# {'r', 'd', 'b'}
a | b                              # letters in a or b or both
# {'a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'}
a & b                              # letters in both a and b
# {'a', 'c'}
a ^ b                              # letters in a or b but not both
# {'r', 'd', 'b', 'm', 'z', 'l'}
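
The operators and the named methods shown in the two cells above do the same job. The short check below reuses the course sets under new, illustrative variable names, and confirms the equivalence.

```python
# Same data as the course example, under illustrative names.
a_courses = {'history', 'english', 'biology', 'theatre'}
b_courses = {'biology', 'english', 'mathematics', 'computer science'}

print(a_courses & b_courses == a_courses.intersection(b_courses))          # True
print(a_courses | b_courses == a_courses.union(b_courses))                 # True
print(a_courses - b_courses == a_courses.difference(b_courses))            # True
print(a_courses ^ b_courses == a_courses.symmetric_difference(b_courses))  # True

# The methods accept any iterable; the operators need a set on both sides.
shared = a_courses.intersection(['english', 'history'])
print(sorted(shared))  # ['english', 'history']
```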

#### Why is set useful?

It seems strange that we might want an unordered data structure. We can't access or modify the data through indexing. How does giving up order benefit us? 
The answer is that it gives us flexibility about how the data is stored in memory, and that flexibility can make data retrieval much faster.

Imagine we have ten boxes and ten piles of money. We put the ten piles of money in the ten boxes. Now say we want to find the box that has $5.37 in it.
We don't know which box this is, so we start with the first box and check. If it isn't in the first box, we move on to the second box. 
We keep checking boxes until we find it. This might take a while.

Now imagine we have the same ten piles of money, but we have 31 boxes. Instead of putting the piles into the boxes in order, 
we put each pile into a box chosen from the amount of money in the pile. First we multiply the amount of money by 100, and then take the remainder after dividing by 31 (the modulus).
This gives the number of the box we should put the pile of money in.

In [None]:
piles = [2.83, 8.23, 9.38, 10.23, 25.58, 0.42, 5.37, 28.10, 32.14, 7.31]

In [None]:
def hash_function(x):
    return int(x * 100 % 31)

[hash_function(pile) for pile in piles]

In [None]:
sum([hash_function(pile) for pile in piles])

In [None]:
# Now say we want to find the box with $5.37 in it. We don't have to search through box after box. We can compute:
print(int(5.37 * 100 % 31))

![set-box.PNG](attachment:dac833f5-af74-43fb-99fd-7438fa925cc7.PNG)

- Box number 10 contains the \$5.37 pile.

This technique of assigning boxes (i.e. memory) based on the object they contain is called **hashing**. It makes searching for data very fast (as we've illustrated), but at the cost of increased memory allocation (we needed more boxes). It also means that we cannot assign an order to the objects as they are stored in memory.
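
The box analogy can be turned into a toy lookup table. This is only a sketch of the idea, not how Python's `set` is actually implemented. The names `NUM_BOXES`, `box_number`, and `contains` are made up for illustration, and `round()` is an assumption added here to convert each amount to whole cents before taking the remainder, which avoids floating-point surprises.

```python
piles = [2.83, 8.23, 9.38, 10.23, 25.58, 0.42, 5.37, 28.10, 32.14, 7.31]
NUM_BOXES = 31  # hypothetical name for the number of boxes

def box_number(amount):
    # Assumption: convert to whole cents first, then take the remainder.
    return round(amount * 100) % NUM_BOXES

# Each box is a list, so two piles that land in the same box can coexist.
boxes = [[] for _ in range(NUM_BOXES)]
for pile in piles:
    boxes[box_number(pile)].append(pile)

def contains(amount):
    # Look only inside the single box this amount would have been placed in.
    return amount in boxes[box_number(amount)]

print(box_number(5.37))  # 10
print(contains(5.37))    # True
print(contains(1.00))    # False
```

The lookup inspects one box regardless of how many piles there are, which is where the speed comes from.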

Hashing also puts two major restrictions on the `set`. First of all, objects in a `set` must be immutable (more precisely, hashable). If an object were to change, its position in memory would no longer correspond with its **hash**. Secondly, the objects in a `set` must be unique. Identical objects end up with the same hash. Since we can't store multiple identical objects in the same chunk of memory, we simply discard any duplicates.

This second restriction means we can use a `set` to easily determine the unique objects in a `list` or `tuple`.
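
Both restrictions are easy to observe. In the sketch below (illustrative data, not from the notebook), converting a list to a set leaves only the unique values, a tuple is accepted as an element, and a list is rejected.

```python
# Illustrative data; not part of the original notebook.
visits = ['lagos', 'abuja', 'lagos', 'kano', 'abuja']
unique_visits = set(visits)
print(len(visits), len(unique_visits))  # 5 3

# Tuples are immutable and hashable, so they can be set elements.
points = {(0, 0), (1, 2)}

# Lists are mutable and unhashable, so adding one raises TypeError.
try:
    points.add([3, 4])
except TypeError as err:
    print('TypeError:', err)

print(len(points))  # 2
```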
- Suppose we have 50 boxes in which to put the 10 piles of money. We can use the code below.

In [None]:
def hash_function(x):
    return int(x * 100 % 50)

[hash_function(pile) for pile in piles]

In [None]:
print(int(7.3 * 100 % 50))

##### Set Comprehension

In [None]:
a = {x for x in 'abracadabra' if x not in 'abc'}
a

### DICTIONARIES <a class="anchor" id="dict"></a>

#### What are Dictionaries?

- Dictionaries are sometimes found in other languages as “associative memories” or “associative arrays”.
- Unlike sequences, which are indexed by a range of numbers, dictionaries are indexed by keys, which can be any immutable type; strings and numbers can always be keys. 
- To understand the Python dict, let's start again with the Python list.

In [None]:
me = ['chisom', 25, 175.6, 60.1, 'black', 'grey', True]

- This list describes me: my name, my age, my height (in centimeters), my weight (in kilograms), my hair color, my eye color, and whether or not I have a dog. 
- We know we can access this information individually by index.

In [None]:
print('My name is %s' % me[0])
print('I have %s hair' % me[4])

It would be easy to get mixed up about which data is which (for example, which of 'black' and 'grey' is hair color and which is eye color?), or where I should find it (will age always be at index 1?).

A better solution would be a data structure where we could index using meaningful values. For example, instead of using me[0] to recover 'chisom', I could use me['name']. 
Instead of hair color being me[4], it could be me['hair']. This feature is the central characteristic of the Python dict.

In [None]:
me_dict = {'name': 'Chisom', 'age': 25, 'height': 170.1, 'weight': 60.1, 
           'hair': 'black', 'eyes': 'black', 'has dog': True}

print('My name is %s' % me_dict['name'])
print('I have %s hair' % me_dict['hair'])
print('I will be %d years old next month' % me_dict['age'])

In [None]:
me_dict.values()

In [None]:
me_dict.keys()

### DICT METHODS <a class="anchor" id="dict-operations"></a>

##### Zip

- The zip function can be very handy for creating a dict. 
Let's go back to the list we made before that contained all the values describing me. We'll make a second list containing all the keys we would want for putting these values in a dictionary.

In [None]:
value_list = me
key_list = ['name', 'age', 'height', 'weight', 'hair', 'eyes', 'has dog']

print(value_list)
print(key_list)

- Currently we have two lists: one of values and one of keys. They have no relationship to each other within Python, but we can see that they belong together logically. How do we combine them in Python? By using the zip function.

In [None]:
key_value_pairs = list(zip(key_list, value_list))
print(key_value_pairs)

- We now have a list of tuples. We interpret the first element of each tuple as a key, and the second element as a value. We can turn this list of tuples directly into a dict.

In [None]:
me_dict = dict(key_value_pairs)
print(me_dict)

- In versions of Python before 3.7, a dict did not keep its keys in any guaranteed order, so me_dict could print in a different order from our list of tuples. Since Python 3.7, a dict preserves insertion order, so me_dict starts with 'name', just like the list. 
Either way, the keys are hashed to assign key-value pairs to memory. Therefore, keys must be immutable and unique, similar to the elements of a set. 
However, values don't have these restrictions.
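
The two steps above (zip into a list of tuples, then convert to a dict) can be collapsed into one call. The sketch below uses a shortened, illustrative version of the lists. It also shows the difference between keys, which must be unique, and values, which may repeat and may be mutable.

```python
# Shortened, illustrative versions of the key and value lists.
keys = ['name', 'age', 'has dog']
values = ['chisom', 25, True]

person = dict(zip(keys, values))  # zip and dict in a single step
print(person['name'])             # chisom

# Keys are unique: assigning to an existing key replaces its value.
person['age'] = 26
print(len(person))                # 3

# Values have no such restriction: they may repeat and may be mutable.
person['pets'] = ['dog', 'dog']
print(len(person))                # 4
```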

- Another Example

In [None]:
my_house = ['TV', "AC", "ElectricCooker", "LibraryStand", 'Bed']

In [None]:
value_list_1 = my_house
# Note: there are 4 keys but 5 values. zip stops at the shorter list, so 'Bed' is dropped.
key_list_1 = ['Philips', 'Samsung', 'Mechanical', 'Mechanicses']

In [None]:
key_value_pairs_1 = list(zip(key_list_1, value_list_1))
print(key_value_pairs_1)

In [None]:
me_dict_1 = dict(key_value_pairs_1)
print(me_dict_1)
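
One thing to watch in this example: `zip` stops as soon as the shorter input runs out, so a value with no matching key is dropped without any error. The sketch below uses placeholder keys (`k1` to `k4` are made up for illustration) and adds a simple length check.

```python
items = ['TV', 'AC', 'ElectricCooker', 'LibraryStand', 'Bed']  # 5 values
labels = ['k1', 'k2', 'k3', 'k4']                              # 4 placeholder keys

paired = dict(zip(labels, items))
print(len(paired))               # 4
print('Bed' in paired.values())  # False

# A length check before zipping makes the mismatch visible.
if len(labels) != len(items):
    print('warning: lists differ in length; extra entries were dropped')
```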

- Error: Wrong Way of Building Dictionaries

In [None]:
%%expect_exception TypeError

# this doesn't work: a list is mutable (unhashable), so it cannot be a dict key
invalid_dict = {[1, 5]: 'a', 5: 23}
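
The `%%expect_exception` line appears to be a cell magic supplied by the course environment (an assumption; it is not part of standard IPython). In plain Python the same failure can be observed with `try`/`except`, as in this sketch, which also shows that a tuple works where the list does not.

```python
# A list as a key raises TypeError because lists are unhashable.
try:
    invalid_dict = {[1, 5]: 'a', 5: 23}
except TypeError as err:
    message = str(err)
    print('TypeError:', message)

# A tuple is immutable, so it is a valid key.
valid_dict = {(1, 5): 'a', 5: 23}
print(valid_dict[(1, 5)])  # a
```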

- Update

The dict is also mutable. We can add new key-value pairs by simple assignment.

In [None]:
print(me_dict)
me_dict['favorite book'] =  'The Little Prince'
print(me_dict)

In [None]:
print(me_dict)
me_dict.update({'favorite color': 'orange', 'siblings': 3, 'nieces/nephews': 0})
print(me_dict)
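
Assignment and `update` also overwrite keys that already exist. A minimal sketch with an illustrative dict (the `city` entry is made up for the example):

```python
# Illustrative dict; not part of the original notebook.
profile = {'name': 'Chisom', 'age': 25}

profile['age'] = 26                           # existing key: value is replaced
profile.update({'age': 27, 'city': 'Lagos'})  # replaces 'age', adds 'city'
profile.update(hair='black')                  # keyword form of update

print(profile)
# {'name': 'Chisom', 'age': 27, 'city': 'Lagos', 'hair': 'black'}
```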

- Delete in Dict

In [None]:
del me_dict['favorite book']
print(me_dict)
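
Besides `del`, a dict offers `pop`, which removes a key and hands back its value. The sketch below (illustrative dict) also shows what happens when the key is no longer there.

```python
# Illustrative dict; not part of the original notebook.
record = {'name': 'Chisom', 'favorite book': 'The Little Prince'}

removed = record.pop('favorite book')  # removes the key and returns its value
print(removed)                         # The Little Prince

# pop() with a default does not fail when the key is absent.
print(record.pop('favorite book', None))  # None

# del on a missing key raises KeyError.
try:
    del record['favorite book']
except KeyError:
    print('KeyError: key already removed')
```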

In [None]:
print(me_dict.keys())
print(me_dict.values())
print(me_dict.items())

### SWITCHING AND COMBINATION OF DICT <a class="anchor" id="loop-in-dict"></a>

- Switching data structures

Each of the containers we've introduced has different properties and characteristics. Sometimes we will want to change one data structure into another to take advantage of these differences. 
We've already seen some methods for transforming a `dict` into a `list` of `tuple`s or vice versa. We can easily transform between `list`, `tuple`, and `set`.

In [None]:
example_list = ['a', 'b', 23, 10, True, 'a', 10]
example_tuple = tuple(example_list)
example_set = set(example_tuple)
example_list = list(example_set)

print(example_tuple)
print(example_set)
print(example_list) # note we lost the duplicates because of set
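
The same kind of round trip works between a `dict` and a `list` of `tuple`s. A short sketch with illustrative data:

```python
# Illustrative data; not part of the original notebook.
stock = {'apple': 3, 'pear': 5}

pairs = list(stock.items())  # dict -> list of (key, value) tuples
print(pairs)                 # [('apple', 3), ('pear', 5)]

rebuilt = dict(pairs)        # list of tuples -> dict
print(rebuilt == stock)      # True

key_set = set(stock)         # iterating a dict yields its keys
print(sorted(key_set))       # ['apple', 'pear']
```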

- Looping Techniques

As we've seen in some examples already, it will often be useful to iterate through a data structure, whether to execute some task based on the information contained or to transform or analyze a data set. We will most often use for loops to iterate over data structures. With a list, tuple, or set the elements of the container are returned one after another. With a dict things are a little more complicated: do we want to iterate over keys, values, or key-value pairs?

In [None]:
# by default we iterate over keys of a dict
for f in me_dict:
    print(f)

In [None]:
# to iterate over values...
for y in me_dict.values():
    print(y)

In [None]:
# or to iterate over key-value pairs...
for k, v in me_dict.items():
    print('%s:%s' % (k, v))

In [None]:
for i in reversed(range(1, 10, 2)):
    print(i)

In [None]:
basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
for i in sorted(basket):
    print(i)

- It is sometimes tempting to change a list while you are looping over it; however, it is often simpler and safer to create a new list instead.

In [None]:
import math
raw_data = [56.2, float('NaN'), 51.7, 55.3, 52.5, float('NaN'), 47.8]
filtered_data = []
for value in raw_data:
    if not math.isnan(value):
        filtered_data.append(value)
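
The same "build a new list" approach can be written as a list comprehension. This sketch is equivalent to the loop above and leaves `raw_data` untouched.

```python
import math

raw_data = [56.2, float('NaN'), 51.7, 55.3, 52.5, float('NaN'), 47.8]

# Keep only the values that are not NaN; raw_data itself is not modified.
filtered_data = [value for value in raw_data if not math.isnan(value)]

print(filtered_data)  # [56.2, 51.7, 55.3, 52.5, 47.8]
print(len(raw_data))  # 7
```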

#### Mapping Data Structures

### REFERENCES <a class="anchor" id="reference"></a>

|S/N| links                                                                            |
|---|--------------------------------------------------------------------------------- |
|-  |https://docs.python.org/3/tutorial/introduction.html                              |
|-  |https://apps.cognitiveclass.ai/learning/course                                    |
|-  |https://worldquantuniversity.org                                                  |