A few things you should keep in mind when working on assignments:

1. Make sure you fill in every place that says `YOUR CODE HERE`. Do **not** write your answer anywhere other than where it says `YOUR CODE HERE`. Anything you write elsewhere will be removed or overwritten by the autograder.

2. Before you submit your assignment, make sure everything runs as expected. In the menubar, select _Kernel_, then _Restart & Run All_ to restart the kernel and run all cells.

3. Do not change the title (i.e. file name) of this notebook.

4. Make sure that you save your work (in the menubar, select _File_ → _Save and Checkpoint_).

5. You are allowed to submit an assignment multiple times, but only the most recent submission will be graded.

# Problem 2. Scatter Plots

In this problem, we will create a simple two-dimensional scatter plot using Python.

In [None]:
%matplotlib inline

# plotting tools
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# testing tools
from nose.tools import assert_equal, assert_is_instance, assert_is_not
from numpy.testing import assert_array_equal

Suppose we are given arrival delays and departure delays of 15 different flights, and we want to visualize the relationship between the arrival and departure delays using a scatter plot.

In [None]:
arr_delay = [-3, 4, 23, 10, 20, -3, -10, -12, -9, -1, -6, 0, -12, -7, -10]
dep_delay = [-4, -5, 11, -3, 0, -3, -8, -6, 2, 2, 2, -6, -8, -3, -5]

Looking at the first data points of `arr_delay` and `dep_delay` (i.e., `arr_delay[0]` and `dep_delay[0]`), the first flight departed 4 minutes earlier than scheduled and arrived 3 minutes earlier than scheduled; the second data points (i.e., `arr_delay[1]` and `dep_delay[1]`) indicate that the second flight departed 5 minutes earlier than scheduled but arrived 4 minutes late; and so on.
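To double-check this reading of the data, the two lists can be paired up element by element. The snippet below is only an illustrative sketch: the `describe` helper and the `summaries` list are hypothetical names that are not part of the assignment, and the delay values are copied from the cell above so that the snippet runs on its own.

```python
# Delay values copied from the cell above (minutes; negative means early).
arr_delay = [-3, 4, 23, 10, 20, -3, -10, -12, -9, -1, -6, 0, -12, -7, -10]
dep_delay = [-4, -5, 11, -3, 0, -3, -8, -6, 2, 2, 2, -6, -8, -3, -5]


def describe(minutes):
    """Hypothetical helper: turn a signed delay into a short phrase."""
    if minutes < 0:
        return f"{-minutes} min early"
    if minutes > 0:
        return f"{minutes} min late"
    return "on time"


# zip() pairs the i-th arrival delay with the i-th departure delay,
# so each tuple describes a single flight.
summaries = [
    f"flight {i}: departed {describe(dep)}, arrived {describe(arr)}"
    for i, (arr, dep) in enumerate(zip(arr_delay, dep_delay), start=1)
]

for line in summaries[:2]:
    print(line)
# flight 1: departed 4 min early, arrived 3 min early
# flight 2: departed 5 min early, arrived 4 min late
```

Each index refers to one flight in both lists, which is why the two lists must have the same length.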

Your task in this problem is to create a scatter plot with `arr_delay` on the $x$-axis and `dep_delay` on the $y$-axis.

Here's a sample plot.

![](scatter.png)

You don't have to make your plot look exactly like this example. If your plot looks visually OK, and if the test code cell doesn't produce any errors, your solution is correct.

## Write a function named `make_scatter_plot` that creates a two-dimensional scatter plot given two lists, `x` and `y`.

- Note that code for creating a `Figure` object and an `Axes` object is already provided:
```python
fig, ax = plt.subplots()
```
Also note that the `make_scatter_plot` function returns an instance of `Axes` (named `ax`). You should use `ax` to create your plot (check out the [lesson notebook on Python Plotting](https://datascience.business.illinois.edu/user/jkim575/notebooks/accy570-fa16_RO/Week3/notebooks/intro2plotting.ipynb)). We write our function to return an `Axes` instance because we want to use it for testing our function.

- Your plot should have a title and axis labels.
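As a reminder of what working through an `Axes` instance looks like, here is a minimal sketch that draws a line plot rather than a scatter plot. The function name `make_line_plot`, the data, and the label text are all made up for illustration; only the Matplotlib calls themselves (`plt.subplots`, `Axes.plot`, `Axes.set_title`, `Axes.set_xlabel`, `Axes.set_ylabel`) are real.

```python
import matplotlib.pyplot as plt


def make_line_plot(x, y):
    """Hypothetical example: draw a line plot and return its Axes."""
    fig, ax = plt.subplots()

    # Every drawing and labeling call goes through the Axes object.
    ax.plot(x, y)
    ax.set_title("Squares of small integers")
    ax.set_xlabel("n")
    ax.set_ylabel("n squared")

    return ax


demo_ax = make_line_plot([0, 1, 2, 3], [0, 1, 4, 9])
```

Because the function returns `ax`, the caller can inspect the result afterwards, for example with `demo_ax.get_title()`. The test cell for this problem inspects the returned `Axes` in a similar way.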

In [None]:
def make_scatter_plot(x, y):
    """
    Creates a two-dimensional scatter plot.
    
    Parameters
    ----------
    x: A list of integers. Data points for the x-axis.
    y: A list of integers. Data points for the y-axis.
    
    Returns
    -------
    A matplotlib.axes.Axes object.
    """
    
    fig, ax = plt.subplots()
    
    # YOUR CODE HERE
    
    return ax

In [None]:
ax = make_scatter_plot(arr_delay, dep_delay)

In [None]:
assert_is_instance(
    ax, mpl.axes.Axes,
    msg="Your function should return a matplotlib.axes.Axes object."
)

assert_equal(
    len(ax.collections), 1,
    msg="Your plot does not have any data points."
)

# Compare by value: an identity check against 0 or '' only works
# because of how CPython happens to cache those objects.
assert ax.title.get_text() != '', "Your plot doesn't have a title."
assert ax.xaxis.get_label_text() != '', \
    "Change the x-axis label to something more descriptive."
assert ax.yaxis.get_label_text() != '', \
    "Change the y-axis label to something more descriptive."
    
xdata, ydata = ax.collections[0].get_offsets().T
assert_array_equal(xdata, arr_delay)
assert_array_equal(ydata, dep_delay)

# If your function can only plot the delays and
# cannot handle other data sets, the following test will fail.
x1 = np.random.randint(100, size=100)
y1 = np.random.randint(100, size=100)

ax1 = make_scatter_plot(x1, y1)

x1data, y1data = ax1.collections[0].get_offsets().T
assert_array_equal(x1, x1data)
assert_array_equal(y1, y1data)

plt.close()