# Data Structures 

This notebook goes over material covered in [Chapter 3 of Python for Data Analysis](https://wesmckinney.com/book/python-builtin). Use **Code** cells to write and run any code you need to answer the questions and **Markdown** cells to write out answers in words. After you finish the assignment, remember to download it as an **HTML file** and submit it in **ELMS**.

In [20]:
import numpy as np
import pandas as pd

Python has many ways to store **data**. Each has different properties and is therefore useful in its own way. For example, we might want to look at the heights of everyone in a group of 100 people. In that case, we might store the height data in a *list* (or some other type of sequence) with 100 values. However, as we'll see, if we want to convert everyone's heights into different units, a list might not actually be the best choice.
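
To make the unit-conversion point concrete, here is a minimal sketch. The three height values are made up for illustration and are not part of the assignment data. It compares scaling a list with scaling a numpy array.

```python
import numpy as np

# Hypothetical heights in inches (illustrative values only)
heights_in = [60, 65, 70]

# Multiplying a list by an integer repeats the list; it does not scale the values
repeated = heights_in * 2
print(repeated)

# With a list, converting units needs an explicit loop
heights_cm_list = []
for h in heights_in:
    heights_cm_list.append(h * 2.54)

# A numpy array applies the arithmetic to every element at once
heights_cm_array = np.array(heights_in) * 2.54
print(heights_cm_array)
```

Both approaches give the same numbers, but the array version states the conversion in a single expression.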

In this class, we will use many different ways to store collections of data:
- tuples
- dictionaries
- lists
- arrays (numpy)
- Series (pandas)

**Note**: They are sometimes interchangeable, so it can be easy to get mixed up. If you are getting an error, check what type of object you have!

## Tuples

Tuples are fixed-length, immutable sequences. 
- Fixed-length: cannot change length in place
- Immutable: cannot change values in place

We'll use tuples a lot with functions to output and store multiple values. Note that we would not generally use tuples to store data within loops because they are fixed-length and immutable. 

In [4]:
tup = (1,2,3)
tup

(1, 2, 3)

You can access individual values of a tuple using the **index** of that value. Remember, Python starts indexing at 0!

In [None]:
tup[2]

Note that you can use negative numbers to count backwards from the end. So, the last value would be the "-1" index, the second to last would be the "-2" index, and so on.

In [None]:
tup[-1]

Python generally interprets comma-separated values as a tuple, even without the surrounding parentheses.

In [None]:
tup2 = 1, 2, 3
tup2

Tuples are immutable, meaning they cannot be changed in place.

In [None]:
# This does not work
tup2[2] = 5
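
The failing assignment above raises a `TypeError`. The sketch below catches that error so the cell keeps running, then shows one possible workaround: building a new tuple instead of modifying the old one.

```python
tup2 = 1, 2, 3

try:
    tup2[2] = 5  # item assignment is not supported for tuples
except TypeError as err:
    message = str(err)
    print(message)

# To "change" a value, build a new tuple and rebind the name
tup2 = tup2[:2] + (5,)
print(tup2)
```

Note the trailing comma in `(5,)`: without it, Python would treat `(5)` as the integer 5 rather than a one-element tuple.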

You can also unpack tuples. This gives us an easy way of accessing individual elements of a tuple without necessarily needing to assign everything individually. 

In [1]:
a, b, c = (1, 2, 3)
print(a)
print(b)
print(c)

1
2
3


We'll see this when a function returns multiple values that we want to unpack.
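
As a small illustration of unpacking with functions, the sketch below defines a hypothetical helper, `min_and_max`, which is not part of the assignment. It returns two values at once.

```python
def min_and_max(values):
    """Hypothetical helper: return the smallest and largest value as a tuple."""
    return min(values), max(values)

# Without unpacking, the result is a single tuple
result = min_and_max([4, 1, 7])
print(result)

# With unpacking, each returned value gets its own name
lowest, highest = min_and_max([4, 1, 7])
print(lowest, highest)
```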

<font color = 'red'>**Question 1: Consider the tuple below. How many elements are in the tuple? How would you access the value of `'example'`?**</font>

Think about the answers to these questions. Try using the `len` function to get the answer to the first question. Is this what you expected? 

In [5]:
some_tuple = (1,2,3), ('this','is','an','example')

2

## Lists

Lists are variable length and mutable. This means you can add or change values within lists. This makes lists very useful for storing values when using a loop. Lists are one of the most common ways of storing sequences that we'll use, so make sure you are familiar with the properties of lists!

In [None]:
example = [1, 2, 3]

Just like tuples, you can access individual values of the list using the **index** of the value with the bracket notation. 

In [None]:
example[0]

However, unlike tuples, lists are mutable.

In [None]:
example[0] = 3
example

You can also create lists by using the `list` function. The same is true using `tuple` to turn something into a tuple.

> This might be useful if you have a tuple of some sort and need it to have properties of a list. For example, if you want to change some values within a tuple and need it to be mutable, then converting it to a list might make sense.

In [None]:
list((1,2,3))

In [None]:
tuple([1,2,3])
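
Putting the two conversions together, one way to "edit" a tuple is to go through a list and back. This is a sketch with a made-up variable name, `point`:

```python
point = (1, 2, 3)

as_list = list(point)   # convert to a mutable list
as_list[0] = 10         # change a value in place
point = tuple(as_list)  # convert back to a tuple

print(point)
```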

<font color = 'red'>**Question 2: Consider the list below. How many elements are in the list? What types of objects are in the list? How would you access the value of `'example'`? Can you change the value of `'an'` to `'another'`?**</font>

In [2]:
some_list = [(1,2,3), ('this','is','an','example'),[1, 2, 3]]

### Notes about lists 
- Indexed positionally, and therefore has a notion of position / order
- This allows you to do things like sort, find by position (e.g., "first" or "last")
- Mutable, so you can change the values inside. 
- The `+` operator combines lists.

In [None]:
[1, 2, 3] + [4, 5, 6]

### Some List methods
|Method | Description|
|---|---|
|.append() | Append a value to the list|
|.count() | Returns the number of elements with the specified value|
|.pop() | Remove an element at the specified position|
|.sort()|Sort the list|
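
The four methods in the table can be tried on a small made-up list. All of them except `count` change the list in place.

```python
numbers = [3, 1, 2, 3]

numbers.append(0)           # numbers is now [3, 1, 2, 3, 0]
threes = numbers.count(3)   # how many times 3 appears
removed = numbers.pop(1)    # removes and returns the element at index 1
numbers.sort()              # sorts in place and returns None

print(threes, removed, numbers)
```

Because `sort` works in place, writing `numbers = numbers.sort()` would replace the list with `None`.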

The `append` method is commonly used with loops. We first initialize an empty list using `[]`, then append values as we go through the loop. 

In [None]:
output = []
for i in range(10):
    output.append(i * 2)
output

The `pop` method does the opposite of `append`: it removes the element at the specified position and returns that element. If no position is given, it removes the last element.

In [None]:
example = [1, 2, 3]
example.pop(1)

In [None]:
example

<font color = 'red'>**Question 3: Create a list that contains all the powers of 3 starting with 3^0, all the way up to 3^10. Call this list `powers_of_three`.**</font>

In [18]:
powers_of_three = []
for i in range(11):  # range(11) covers the exponents 0 through 10
    powers_of_three.append(3**i)
powers_of_three

[1, 3, 9, 27, 81, 243, 729, 2187, 6561, 19683, 59049]

<font color = 'red'>**Question 4: Calculate the mean of `powers_of_three` from the previous question.**</font>

**Hint:** You can use the `np.mean` function from numpy for this.



## Dictionaries

A **dictionary** stores a collection of *key-value* pairs. Each key is associated with a value, and you can access the values stored in a dictionary by using their keys. Values can be any type of object. Keys must be hashable, which in practice means immutable objects such as strings, numbers, or tuples of such values.

Intuitively, a Python dictionary works much like a real-life dictionary of words. You look up a word (a key) to find the definition (the value). In the same way, we can access the data inside a dictionary by looking up the key associated with that piece of data.

Dictionaries are useful because they allow us to store data using informative keys. Rather than trying to remember which position we happened to have decided to use to store a particular attribute (if we used lists), we can use semantically meaningful indices for values, i.e., keys!

In [34]:
example_dict = {'a': (1, 2, 3), 'b': (2, 3, 4)}

The key parts of a dictionary are:

- The `{ }` curly braces which indicate that it's a dictionary (similar to `""` for strings, or `[]` for lists)
- Each entry maps a key on the left of a `:` to a value on the right. For example, our first entry maps the key `'a'` to the value `(1, 2, 3)`.
- We include commas, similar to lists, to separate entries in the dictionary.
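
Once a dictionary exists, values are read, added, and replaced using the same bracket notation as lists, but with keys instead of positions. Here is a short sketch using `example_dict` from above. The key `'c'` and the new values are made up for illustration.

```python
example_dict = {'a': (1, 2, 3), 'b': (2, 3, 4)}

value_a = example_dict['a']     # look up the value stored under 'a'
example_dict['c'] = (3, 4, 5)   # assigning to a new key adds an entry
example_dict['a'] = (0, 0, 0)   # assigning to an existing key replaces its value
has_b = 'b' in example_dict     # `in` checks the keys, not the values

print(value_a, has_b, example_dict)
```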

Dictionaries are very flexible, and the objects stored inside a dictionary do not all need to be the same type.

In [32]:
example_dict2 = {'a': (1, 2, 3), 'b': 'some text', 3: 'c' }

<font color = 'red'>**Question 5: What type of object is `example_dict['a']`? What about `example_dict2['b']`?**</font>

tuple

Note: Unlike lists, dictionaries are **not indexed by position**. Since Python 3.7 they preserve the order in which keys were inserted, but items are still looked up by their *keys* rather than by *indices*, so there's no way to access dictionary items by their index.
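
The sketch below shows what this means in practice. Writing `example_dict[0]` is treated as a lookup of the key `0`, which does not exist here, so it raises a `KeyError`. To go through everything in a dictionary, loop over its keys or its key-value pairs instead.

```python
example_dict = {'a': (1, 2, 3), 'b': (2, 3, 4)}

try:
    example_dict[0]   # 0 is looked up as a key, not as a position
except KeyError:
    lookup_failed = True

keys = list(example_dict)   # the keys of the dictionary

pairs = []
for key, value in example_dict.items():   # loop over key-value pairs
    pairs.append((key, value))

print(keys)
print(pairs)
```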

<font color = 'red'>**Question 6: Consider the list `list_of_numbers` below. Create a dictionary called `even_odd_dict` with two lists: a list of even numbers from `list_of_numbers` and a list of odd numbers from `list_of_numbers`. Use the keys `even` and `odd` for these two lists. Write your code to be generalizable so that you could perform the same calculation again for a different list of numbers.**</font>

*Hint:* One useful operator might be `%` which is the modulo (returns the remainder after division) operator. 
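
As a quick reminder of how `%` behaves, here are a few standalone examples with arbitrary numbers:

```python
remainder_even = 10 % 2   # 10 divides evenly by 2, so the remainder is 0
remainder_odd = 7 % 2     # 7 divided by 2 leaves a remainder of 1
is_multiple_of_three = (9 % 3 == 0)

print(remainder_even, remainder_odd, is_multiple_of_three)
```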

In [2]:
list_of_numbers = [1, 2, 2, 3, 4, 5, 17, 4, 12, 0, 32, 23, 15, 12, 121, 44, 21, 53] 

<font color = 'red'>**Question 7: Check your code above by creating a list with integers from 1 to 100 and finding the even and odd numbers from that list**</font>

### List Comprehension

List comprehension is a quick, concise way of constructing a list based on a specific structure. It looks similar to a for loop, but is constructed completely inside a list.

In [None]:
[2*x for x in range(5)]

Recall: Loop structure looks like:

    for i in <range>:
        <some expression>
        
List comprehension would look something like this:

    [<some expression> for i in <range>]
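
To see the two structures side by side, here is a small sketch that builds the same list of squares both ways. The variable names are illustrative.

```python
# Loop version: start with an empty list and append inside the loop
squares_loop = []
for i in range(5):
    squares_loop.append(i ** 2)

# List comprehension version: the expression comes first, then the loop
squares_comp = [i ** 2 for i in range(5)]

print(squares_loop == squares_comp)
```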

<font color='red'>**Question 8: Consider the following code. Use list comprehension to get the same result in one line of code.**</font>

In [56]:
values = []
for i in range(10):
    new_value = 2*i + 5
    values.append(new_value)
values

[5, 7, 9, 11, 13, 15, 17, 19, 21, 23]

In [55]:
[2*i + 5 for i in range(10)]

[5, 7, 9, 11, 13, 15, 17, 19, 21, 23]

<font color = 'red'>**Question 9: Using list comprehension, create a list that contains all of the powers of 2 from 0 to 10. In other words, the list should contain the values 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, and 1024.**</font>

### List Comprehension with Conditionals

We might also have a loop structure that looks like this:

    for i in <range>:
        if <some conditional>:
            <some expression>
        
List comprehension can be used in this case:

    [<some expression> for i in <range> if <some conditional>]

In [4]:
[2*x for x in range(10) if x > 5] 

[12, 14, 16, 18]
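
For comparison, the same result can be written as an explicit loop with an `if` statement. This sketch uses a made-up list name, `doubled`:

```python
doubled = []
for x in range(10):
    if x > 5:                  # keep only the values that pass the condition
        doubled.append(2 * x)

print(doubled)
```

The condition filters which values of `x` are used. The expression `2 * x` is applied only to the values that pass.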

<font color = 'red'>**Question 10: Repeat Question 6 using list comprehension to create the dictionary with two lists.**</font>