In [None]:
# Note: you don't need to know what this cell does! We're simply providing you with some functions you can use to look at tables.
from otd_util import *
import json

with open('data.json') as json_data:
    data = json.load(json_data)

## Problem 1: Tables

Sometimes, it's helpful to organize data differently! One way to do this is a **table**. With tables, you can organize things in **rows** and **columns**. Each row represents a single entry (here, one student), while each column holds one characteristic of every entry. For example, let's say we collected some data from a class of students! It's stored in `data`. Run the following cell to see what that data might look like:

In [None]:
visualize_table(data)

But what, exactly, is `data`? How are we creating this table?

One way to create a table is using lists and dictionaries!

For example, we can think of each column as a list of data. The position of an item in the list denotes what row it belongs to: if an item is at the 0th index, it is in row 0!

The list of favorite colors from the table above, for instance, would look like `['blue', 'teal', 'purple', 'green', 'pink', 'red', 'black']`.

Then we can say that each table is a *collection of columns*. We can put these columns together by putting them in a dictionary where each key is a column name and each value is the list that represents that column.
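
As a small illustration of this idea, here is a sketch of a table built by hand. The name `mini_table` and every value in it are made up for this example; they are not taken from `data`.

```python
# A made-up table with two columns. Each column is a list,
# and every list has one entry per row.
mini_table = {
    'Name': ['Ada', 'Grace', 'Lin'],
    'Favorite Color': ['blue', 'teal', 'purple'],
}

# The keys of the dictionary are the column names.
column_names = list(mini_table.keys())

# Every column has the same length: the number of rows.
num_rows = len(mini_table['Name'])

print(column_names, num_rows)
```

Notice that row 0 is spread across the columns: the item at index 0 of `'Name'` and the item at index 0 of `'Favorite Color'` belong to the same person.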

Let's take a look at what the table we visualized above looks like in its raw form, as a Python dictionary:

In [None]:
data

To get a column out of this table, we can do what we do with any dictionary: key by column name!

For example, to get the list of favorite colors, we can do `data['Favorite Color']` (note: `Favorite Color` is a string, so we need to put quotes around it when we use it as a key!).
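
Here is a sketch of the same idea on a different, made-up table, so it doesn't give away the exercises. The table `pets` and its contents are hypothetical.

```python
# A made-up table of pets (not part of `data`).
pets = {
    'Pet': ['cat', 'dog', 'fish'],
    'Legs': [4, 4, 0],
}

# Keying by the column name gives back the whole column as a list.
pet_column = pets['Pet']

# Indexing into that list picks out a single row of the column.
second_pet = pets['Pet'][1]

print(pet_column)
print(second_pet)
```

The first pair of brackets chooses the column, and the second pair chooses the row.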

How can you get the list of favorite fast foods out of `data`? Set `fast_foods` equal to the list corresponding to the favorite fast foods.

In [None]:
fast_foods = ...

What if you wanted to get the restaurant `Subway` out of `fast_foods`?

In [None]:
subway = ...

Now, try writing one line of code to get the string `History` out of this table!

In [None]:
history = ...

## Problem 2: Data Manipulation

What if we wanted to find the average age in this table? How would we do that?

First, let's isolate the ages:

In [None]:
ages = data['Age']
ages

Now that we have the ages, how can we take the average? Remember, the average is just the sum of a set of numbers divided by how many numbers there are! Python has two handy built-in functions that give us these values: `sum` and `len`. Let's try using them:

In [None]:
average_age = ... # Your code here
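
If you'd like to check your approach, here is a sketch of the same calculation on a made-up list. The name `scores` and its numbers are hypothetical and have nothing to do with the ages in `data`.

```python
# Made-up numbers, used only to show how sum and len fit together.
scores = [80, 90, 100]

total = sum(scores)   # adds up every number in the list
count = len(scores)   # counts how many numbers are in the list

# The average is the total divided by the count.
average_score = total / count

print(average_score)
```

One thing to keep in mind: if the list is empty, `len` returns 0 and the division raises a `ZeroDivisionError`.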

Now, what if we wanted to find the value of a specific cell in a table? For example, let's take the following table of names and ages:

In [None]:
table = {'Name': ['Sally', 'Amanda', 'Peony'], 'Age': [13, 14, 15]}
visualize_table(table)

What if we wanted to find Amanda's age? We can use the list method `index`, which returns the index of the first element that matches a value. In other words, if we call `index` on a list of names and give it the string `'Amanda'`, it tells us the position Amanda is in! That position is also Amanda's row, so we can use it to look up her entry in the `Age` column.

To see this in action, look at this piece of code:

In [None]:
names = table['Name']
amanda = names.index('Amanda')
table['Age'][amanda]
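
Two details of `index` are worth knowing. The sketch below uses a made-up list, `repeated_names`, which is not part of `table`, to show what happens when a value appears twice and when it doesn't appear at all.

```python
# A made-up list in which 'Amanda' appears twice.
repeated_names = ['Sally', 'Amanda', 'Peony', 'Amanda']

# index only reports the position of the FIRST match.
first_amanda = repeated_names.index('Amanda')

# If the value is not in the list, index raises a ValueError,
# so we catch it here instead of letting the cell crash.
try:
    repeated_names.index('Zoe')
    found_zoe = True
except ValueError:
    found_zoe = False

print(first_amanda, found_zoe)
```

So if two people in a table share the same name, `index` will only ever find the first one.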

Now, try it yourself! Set `food` to the fast food restaurant that the person whose favorite color is teal likes to go to.

In [None]:
# Your code here!
...
food = ... 

## Problem 3: Visualization

Let's try visualizing some of the data we've collected using some of the graphs we learned about last module!

You have access to four functions:

* `scatter` takes in three arguments; the first should always be a table, and the other two should be the names of columns you want to compare as `x` and `y`. It then produces a scatter plot of them.
* `line` takes in three arguments; the first should always be a table, and the other two should be the names of columns you want to compare as `x` and `y`. It then produces a line plot of them.
* `bar` takes in three arguments; the first should always be a table, and the other two should be the names of columns you want to compare as `x` and `y`. It then produces a bar chart of them.
* `histogram` takes in up to three arguments; the first should always be a table, the second should be the name of the column you would like to produce a histogram of, and the third (optional) is the number of bins you would like to use. It then produces a histogram of that column's values.


Run the following cell to see what a histogram of fast foods might look like!

In [None]:
histogram(data, 'Favorite Fast Food')
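
For a column of words like this one, a histogram boils down to counting how many times each value appears. Here is a sketch of that counting step using Python's `collections.Counter`. The list `foods` is made up, and this is only an assumption about the general idea; the provided `histogram` function may work differently under the hood.

```python
from collections import Counter

# A made-up column of favorite fast foods (not the real data).
foods = ['Subway', 'Taco Bell', 'Subway', 'Wendys']

# Counter tallies how many times each value appears in the list.
counts = Counter(foods)

# Each value would be one bar, and its count would be the bar's height.
for food_name, count in counts.items():
    print(food_name, count)
```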

Let's take a look at the correlation between age and height:

In [None]:
scatter(data, 'Age', 'Height')
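
A scatter plot lets you judge the correlation by eye. If you're curious how it can be turned into a single number, here is a sketch that computes the Pearson correlation coefficient by hand. The lists `ages_demo` and `heights_demo` and the helper functions are made up for this example; they are not part of `otd_util` or `data`.

```python
import math

# Made-up ages (years) and heights (cm), one pair per person.
ages_demo = [13, 14, 15, 16]
heights_demo = [150, 155, 161, 166]

def mean(values):
    """Average of a list: the sum divided by the number of items."""
    return sum(values) / len(values)

def pearson_r(xs, ys):
    """Pearson correlation between two lists of equal length."""
    mean_x, mean_y = mean(xs), mean(ys)
    # Numerator: how much x and y move away from their averages together.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: how spread out each list is on its own.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return numerator / (spread_x * spread_y)

r = pearson_r(ages_demo, heights_demo)
print(r)
```

A result close to 1 means the two columns rise together, a result close to -1 means one falls as the other rises, and a result near 0 means there is no clear straight-line pattern.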

## Problem 4: Analyze It!

Now, try making some of your own graphs in the cells below!
