# Exercise Set - Python Fundamentals and Essentials II

In this notebook we will work through:

1. Additional Collections in Python
   1. Associative (Dictionaries)
   2. Sets
2. Additional Exercises (comparing `dict` and `pd.Series`)

Next Exercise Notebook: `Conditional Logic and Control Flow` and `Functions`

---

## Associative Collections

### Dictionaries

A dictionary (or `dict`) associates keys with values.

It will feel similar to a dictionary for words, where the keys are words and
the values are the associated definitions.

The most common way to create a `dict` is to use curly braces — `{`
and `}` — like this:

```python
{"key1": value1, "key2": value2, ..., "keyN": valueN}
```

where the `...` indicates that we can have any number of additional
terms.

The crucial part of the syntax is that each key-value pair is written
`key: value` and that these pairs are separated by commas — `,`.

Let’s see an example using our aggregate data on China in 2015.

In [1]:
china_data = {"country": "China", "year": 2015, "GDP" : 11.06, "population": 1.371}
print(china_data)

{'country': 'China', 'year': 2015, 'GDP': 11.06, 'population': 1.371}


Unlike our example (from Exercise Set 2) using a `tuple`, a `dict` allows us to
associate a name with each field, rather than having to remember the
order within the tuple.

Often, code that makes a dict is easier to read if we put each
`key: value` pair on its own line.

The code below is equivalent to what we saw above.

In [2]:
china_data = {
    "country": "China",
    "year": 2015,
    "GDP" : 11.06,
    "population": 1.371
}

Most often, the keys (e.g. “country”, “year”, “GDP”, and “population”)
will be strings, but we could also use numbers (`int`, or
`float`) or even tuples (or, rarely, a combination of types).

The values can be **any** type and different from each other.
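As a small sketch of non-string keys and mixed value types, the dict below uses an `int`, a `float`, and a tuple as keys. The name `lookup` and its contents are hypothetical, chosen only for illustration.

```python
# Keys can be ints, floats, or tuples; values can be any type.
# `lookup` and its contents are made up for illustration.
lookup = {
    1: "an int key",
    2.5: "a float key",
    ("China", 2015): {"GDP": 11.06},  # tuple key with a dict as its value
}

print(lookup[1])
print(lookup[("China", 2015)]["GDP"])
```

A `list` cannot be used as a key because lists are mutable and therefore not hashable; a tuple with the same contents can.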

You can access a `value` using the `key` such as

In [3]:
china_data['country']

'China'

**Exercise #1:** How can you access the `population` value from `china_data`?

**Exercise #2:** What do you think would happen if you looked at the length of the `china_data` object using `len(china_data)`?

**Exercise #3:** What do you think the number above represents?

**Exercise #4:** 

Create a new dict that associates stock tickers with their stock prices.

Here are some tickers and their prices.

- AAPL: 175.96  
- GOOGL: 1047.43  
- TVIX: 8.38  

#### Getting, Setting, and Updating dict Items

As you saw above we can ask Python to tell us the value for a particular key by using
the syntax `d[k]`,  where `d` is our `dict` and `k` is the key for which we want to
find the value.

For example,

In [4]:
china_data["year"]

2015

**Exercise #5:** What happens if you specify a key that is not in the dictionary such as `inflation`?

#### Adding Items

We can also add new items to a dict using the syntax `d[new_key] = new_value`.

Let’s see some examples.

In [5]:
print(china_data)
china_data["unemployment"] = "4.05%"  # add new key:value pair to the dictionary
print(china_data)

{'country': 'China', 'year': 2015, 'GDP': 11.06, 'population': 1.371}
{'country': 'China', 'year': 2015, 'GDP': 11.06, 'population': 1.371, 'unemployment': '4.05%'}


To update the value of an existing key, we use the same assignment syntax
(assignment creates the key if it is missing and overwrites the value if it exists).

In [6]:
print(china_data)
china_data["unemployment"] = "3.60%"
print(china_data)

{'country': 'China', 'year': 2015, 'GDP': 11.06, 'population': 1.371, 'unemployment': '4.05%'}
{'country': 'China', 'year': 2015, 'GDP': 11.06, 'population': 1.371, 'unemployment': '3.60%'}


Or we could change the type.

In [7]:
china_data["unemployment"] = 4.051
print(china_data)

{'country': 'China', 'year': 2015, 'GDP': 11.06, 'population': 1.371, 'unemployment': 4.051}


Python dictionaries are flexible data containers and are very helpful objects. 

**Exercise #6:** How might you choose to represent the following time series data in a dictionary?

| year | value |
| -----|-------|
| 2010 | 6     |
| 2011 | 9 | 
| 2012 | 12 |

**Exercise #7:** How would you represent this same time series data using `list` objects? 

**Exercise #8:** How do the two data structures compare (discussion)?

#### Common `dict` Methods

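Dicts support several common operations: counting items, listing keys and values,
merging in another dict, and looking up a key with a fallback value.

We will demonstrate them with examples below.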

In [8]:
# number of key-value pairs in a dict
len(china_data)

5

In [9]:
china_data.keys()

dict_keys(['country', 'year', 'GDP', 'population', 'unemployment'])

In [10]:
# get a list of all the keys
list(china_data.keys())

['country', 'year', 'GDP', 'population', 'unemployment']

In [11]:
# get a list of all the values
list(china_data.values())

['China', 2015, 11.06, 1.371, 4.051]

In [12]:
more_china_data = {"irrigated_land": 690_070, "top_religions": {"buddhist": 18.2, "christian" : 5.1, "muslim": 1.8}}

# Add all key-value pairs in more_china_data to china_data.
# If a key already appears in china_data, overwrite its
# value with the value in more_china_data
china_data.update(more_china_data)
china_data

{'country': 'China',
 'year': 2015,
 'GDP': 11.06,
 'population': 1.371,
 'unemployment': 4.051,
 'irrigated_land': 690070,
 'top_religions': {'buddhist': 18.2, 'christian': 5.1, 'muslim': 1.8}}
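To make the overwrite behaviour of `update` concrete, here is a minimal sketch with two hypothetical dicts, `base` and `extra`, that share the key `"b"`.

```python
# Hypothetical dicts, separate from china_data
base = {"a": 1, "b": 2}
extra = {"b": 20, "c": 30}

# update modifies base in place and returns None
result = base.update(extra)

print(base)    # {'a': 1, 'b': 20, 'c': 30}
print(result)  # None
```

The shared key `"b"` takes its value from `extra`, the new key `"c"` is added, and `extra` itself is left unchanged.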

In [14]:
# Get the value associated with a key or return a default value.
# Use this to avoid the KeyError raised by a missing key when you
# have a reasonable default value
china_data.get("irrigated_land", "Data Not Available")

690070

In [16]:
china_data.get("inflation", "Data Not Available")

'Data Not Available'
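Two related behaviours are worth a quick sketch: calling `get` without a default returns `None` for a missing key, and the `in` operator tests membership among the keys rather than the values. The dict `record` below is a hypothetical stand-in for `china_data`.

```python
# Hypothetical dict that mirrors part of china_data
record = {"country": "China", "year": 2015}

missing = record.get("inflation")  # no default supplied, so None is returned
print(missing)                     # None

has_year = "year" in record        # True: "year" is a key
has_2015 = 2015 in record          # False: 2015 is a value, not a key
print(has_year, has_2015)          # True False
```

Note that `get` only reads from the dict; it does not add the missing key.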

**Exercise #9:** Use Jupyter’s help facilities to learn how to use the `pop` method to
remove the key `"irrigated_land"` (and its value) from the dict. (i.e. `china_data.pop?`)

**Exercise #10:** Explain what happens to the value you popped. Experiment with calling `pop` twice.

### Sets

Python has an additional way to represent collections of items: sets.

Sets are used relatively infrequently, but you should be aware of them as they can be useful.

If you are familiar with the mathematical concept of sets, then you will
understand the majority of Python sets already.

If you don’t know the math behind sets, don’t worry: we’ll cover the
basics of Python’s sets here.

A set is an *unordered* collection of *unique* elements.

The syntax for creating a set uses curly braces `{` and `}`.

```python
{item1, item2, ..., itemN}
```

Here is an example.

In [17]:
s = {1, "hello", 3.0}          # create a set and save in variable s

print("s has type", type(s))
s

s has type <class 'set'>


{1, 3.0, 'hello'}

**Exercise #11:**

Try creating a set with repeated elements (e.g. `{1, 2, 1, 2, 1, 2}`) and save it in the variable `a`.

What happens?

Why?

As with lists and tuples, we can check if something is `in` the set
and check the set’s length:

In [18]:
s

{1, 3.0, 'hello'}

In [19]:
print("len(s) =", len(s))

len(s) = 3


In [20]:
"hello" in s

True

Unlike lists and tuples, we can’t extract elements of a set `s` using
`s[N]` where `N` is a number.

**Exercise #12:** What happens if you run `s[1]`?

This is because sets are not ordered, so the notion of getting the
second element (`s[1]`) is not well defined.

We add elements to a set `s` using `s.add`.

In [21]:
s.add(100)
s

{1, 100, 3.0, 'hello'}

**Exercise #13:** What happens if you run `s.add('hello')`?

We can also do set operations.

Consider the set `s` from above and the set
`s2 = {"hello", "world"}`.

- `s.union(s2)`: returns a set with all elements in either `s` or
  `s2`  
- `s.intersection(s2)`: returns a set with all elements in both `s`
  and `s2`  
- `s.difference(s2)`: returns a set with all elements in `s` that
  aren’t in `s2`  
- `s.symmetric_difference(s2)`: returns a set with all elements in
  only one of `s` and `s2`
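Each of the four methods above also has an operator form (`|`, `&`, `-`, `^`). The sketch below checks that equivalence on two hypothetical sets, `evens` and `small`, and uses `sorted` only to give the printed results a stable order.

```python
# Hypothetical sets, separate from s and s2
evens = {2, 4, 6, 8}
small = {1, 2, 3, 4}

# The operator forms give the same results as the methods
assert evens | small == evens.union(small)
assert evens & small == evens.intersection(small)
assert evens - small == evens.difference(small)
assert evens ^ small == evens.symmetric_difference(small)

print(sorted(evens | small))  # [1, 2, 3, 4, 6, 8]
print(sorted(evens & small))  # [2, 4]
print(sorted(evens - small))  # [6, 8]
print(sorted(evens ^ small))  # [1, 3, 6, 8]
```

One difference: the methods accept any iterable as their argument (for example a list), while the operators require both operands to be sets.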

**Exercise #14:**

Test out the operations described above using the original set we
created, `s`, and the set created below `s2`.

In [22]:
s2 = {"hello", "world"}

As with tuples and lists, a `set` function can convert other
collections to sets.

In [23]:
x = [1, 2, 3, 1]
set(x)

{1, 2, 3}

In [24]:
t = (1, 2, 3, 1)
set(t)

{1, 2, 3}

Likewise, we can convert sets to lists and tuples.

In [25]:
s

{1, 100, 3.0, 'hello'}

In [26]:
list(s)

[1, 3.0, 100, 'hello']

In [27]:
tuple(s)

(1, 3.0, 100, 'hello')
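Because sets are unordered, the order of the elements produced by `list(s)` or `tuple(s)` should not be relied upon. When the elements can be compared with one another, one option is `sorted`, which always returns a list in ascending order. A minimal sketch, using a hypothetical list `readings`:

```python
# Hypothetical data containing repeated values
readings = [3, 1, 2, 3, 1]

# set(...) removes the duplicates; sorted(...) returns an ordered list
unique_sorted = sorted(set(readings))
print(unique_sorted)  # [1, 2, 3]
```

This would not work on the set `s` above, since Python cannot compare a string such as `'hello'` with a number.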

**Exercise #15:** What happens if you run `dict(s)`?

Why do you get the above error?

---

## Additional Exercises


Understanding how dictionaries work helps you understand other objects you may use, such as the more advanced `pandas` DataFrame and Series objects.

They behave in similar ways.

In fact dataframes and series can be constructed directly from `dict` data structures.
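As a sketch of the DataFrame case, a dict whose values are equal-length lists can be passed to `pd.DataFrame`; each key becomes a column name. The dict `columns` and its numbers are hypothetical.

```python
import pandas as pd

# Hypothetical data: each key becomes a column, each list holds that column's values
columns = {
    "x": [1, 2, 3],
    "y": [10.0, 20.0, 30.0],
}
df = pd.DataFrame(columns)

print(df)
print(df["y"])  # selecting a column by key mirrors dict access and returns a Series
```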

In [28]:
import pandas as pd
data = {
    'a' : 2,
    'b' : 1,
    'c' : 3
}
series = pd.Series(data)

In [29]:
series

a    2
b    1
c    3
dtype: int64

**Exercise #16:** Use the `series` object to sort the values 
(*Hint:* use the `series.<<tab>>` to look at methods available to series objects, or use google)

**Exercise #17:** Use the original `data` object (of type `dict`) to print a list of sorted values such as

```
1
2
3
```

You can print the values of a `dict` object using the following code:

In [None]:
for k,v in data.items():
    print(v)

**Hint:** think about using a new object to collect your results. 

**Exercise #18:** Use the `series` object to make a plot of the data (*Hint:* use the `series.<<tab>>` to look at methods available to series objects, or use google) 

**Exercise #19:** Use the `data` object to make a plot

**Exercise #20:** What do you notice about the differences between using the `data` (as a `dict`) and the `series` (as a `pd.Series`)? 