# Worksheet 3 - Reading and Plotting CSV files

We are going to use the file **"`bh_m_sigma.csv`"** from "`python-week-1/day-3/`".

## 1) Download the Data file **"`bh_m_sigma.csv`"**

Download/clone the GitHub repository: https://github.com/banneker-aztlan/python-week-1

Or use this link: https://raw.githubusercontent.com/banneker-aztlan/python-week-1/master/day-3/bh_m_sigma.csv

In [None]:
import os
print("This jupyter notebook is in: '{}'".format(os.getcwd()))

## 2) Make sure this notebook can locate the file

Either change the "`data_fname`" variable below to the path where you downloaded the file, or move the file to the current directory.

In [None]:
data_fname = "./bh_m_sigma.csv"
print("Was the notebook able to find the file? ==> {} <==".format(os.path.exists(data_fname)))

## 3) Load the data from the file

We're going to use the **"`csv`"** module (part of Python's standard library) to load the data.

### 3a) Read through and run the following code block to understand what it does.

### 3b) Modify the code to store the values from each line into two lists: "`masses`" and "`vels`".  
Make sure to convert each value into a `float` (from a `str`).

### 3c) Make sure each of the lists contains `93` elements

The following values should be there:  
```
masses[0] = 9.59
masses[-1] = masses[92] = 7.033

vels[0] = 288.0
vels[-1] = vels[92] = 107.0
```
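
If you get stuck on 3b, here is a rough sketch of the general pattern, not the one true solution. It reads from a small stand-in string instead of the real file, and it *assumes* the mass is in the first column and the velocity in the second; the header names are made up too, so check the printed header of the real file before copying the column order.

```python
import csv
import io

# Stand-in for the real file. The header names and the column order
# (mass first, velocity second) are assumptions for illustration only.
sample_text = "mass,sigma\n9.59,288.0\n7.033,107.0\n"

masses = []
vels = []
with io.StringIO(sample_text) as file_in:
    # Read (and skip past) the header line, as in the worksheet code
    header = file_in.readline().strip()
    reader = csv.reader(file_in)
    for row in reader:
        # Each entry of `row` is a `str`, so convert it before storing
        masses.append(float(row[0]))
        vels.append(float(row[1]))

print("header: '{}'".format(header))
print("number of rows:", len(masses), len(vels))
```

With the real file you would keep the `with open(data_fname, 'r') as file_in:` line from the code block below and only change what happens inside the `for` loop.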

In [None]:
import csv

with open(data_fname, 'r') as file_in:
    # The first line of the file is a "header", it describes the data
    header = file_in.readline()
    # Remove the trailing newline character
    header = header.strip()
    print("header: '{}'".format(header))

    # Now we'll create a "reader" object, to read in the actual data
    reader = csv.reader(file_in)
    # This iterates over each line in the file, and will print it.
    for row in reader:
        print(row)
    

## 4) Plot the data using a scatter plot

The standard package for plotting is **"`matplotlib`"**. It is (unfortunately) not very intuitive, but it works well once you get used to it.

The basic objects `matplotlib` uses are **"`figures`"** and **"`axes`"**.  

**`figure`** objects are the overall container for everything we want to plot.  An individual `figure` can have one plot on it, or many different ones.

Data is actually plotted onto an **`axes`** object, which is added to the `figure`.  The `axes` object is also responsible for the axis labels, the scales (linear or logarithmic), the plot title, etc.
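
To make the figure/axes distinction concrete, here is a small sketch of one `figure` holding two `axes`. The titles and the choice of a logarithmic scale on the second panel are arbitrary, just for illustration.

```python
import matplotlib.pyplot as plt

# One figure containing a row of two axes (1 row, 2 columns)
fig, axs = plt.subplots(1, 2)

# Each axes carries its own title, labels, and scale
axs[0].set(title="left panel")
axs[1].set(title="right panel")
axs[1].set_yscale("log")

print("axes in this figure:", len(fig.axes))
```

In the notebook, adding `plt.show()` at the end of the cell displays the (still empty) panels.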

### 4a) Read and run the code below to produce a figure with an empty axes

**NOTE**: *if an image doesn't appear, and there is no error message, try adding the line* "`%matplotlib inline`"  *to the top of your notebook (and run that cell).  If that still doesn't work, holler.*

### 4b) Use the "`ax.scatter()`" function to plot `masses` on the y-axis, versus `vels` on the x-axis.

The documentation for the `scatter()` function can be found here: http://matplotlib.org/api/_as_gen/matplotlib.axes.Axes.scatter.html

It takes a while to get used to reading "documentation", but it's an important skill.

The basic message is that `scatter()` requires two arguments: the x-points, and the y-points.  Try calling `ax.scatter(...)`, filling in the appropriate arguments.
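
As a rough sketch of the calling pattern, here is `scatter()` with two short made-up lists (`x_vals` and `y_vals` are placeholder names, not the worksheet data). The first argument goes on the x-axis, the second on the y-axis.

```python
import matplotlib.pyplot as plt

# Made-up numbers, only to show the order of the arguments
x_vals = [100.0, 150.0, 200.0, 250.0]
y_vals = [7.0, 7.8, 8.5, 9.2]

fig, ax = plt.subplots()
# x-points first, y-points second
points = ax.scatter(x_vals, y_vals)
```

For the worksheet, swap in your own lists and finish the cell with `plt.show()`.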

### 4c) Try changing the size, color, and "marker" (symbol) of the scatter points
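
A possible starting point, using made-up data again: `scatter()` accepts the keyword arguments `s` (marker size), `color`, and `marker`. The particular values below are arbitrary, so experiment with your own.

```python
import matplotlib.pyplot as plt

# Made-up numbers, only for illustration
x_vals = [100.0, 150.0, 200.0, 250.0]
y_vals = [7.0, 7.8, 8.5, 9.2]

fig, ax = plt.subplots()
# s: marker size, color: marker color, marker: symbol ("^" is a triangle)
points = ax.scatter(x_vals, y_vals, s=80, color="red", marker="^")
```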

### 4d) Search Google for how to add labels to the x and y axes.

There are a few different ways to do this, but I'd recommend using the "`axes.set(...)`" function with the appropriate arguments.
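
For example, a minimal sketch with placeholder label text (pick labels and units that match your actual data):

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])

# `set()` takes keyword arguments such as xlabel, ylabel, and title.
# The strings here are placeholders, not the right labels for the worksheet data.
ax.set(xlabel="x quantity [units]", ylabel="y quantity [units]", title="Placeholder title")
```

The same thing can be done one piece at a time with `ax.set_xlabel(...)` and `ax.set_ylabel(...)`.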

**BONUS:** *Make the size of each scatter point proportional to the black hole mass!*

### 4e) Search Google for how to make the figure bigger, and then how to save the figure to a file.

Use the "`fig.savefig()`" command to save.
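
One way this could look, as a sketch: the `figsize` argument of `subplots()` sets the width and height in inches, and `savefig()` writes the figure to a file. The output filename and the temporary directory are made up for this example; save wherever makes sense for you.

```python
import os
import tempfile

import matplotlib.pyplot as plt

# figsize is (width, height) in inches
fig, ax = plt.subplots(figsize=(10, 6))
ax.scatter([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])

# Hypothetical output location: a file in the system's temporary directory
out_fname = os.path.join(tempfile.gettempdir(), "example_scatter.png")
# dpi controls the resolution of the saved image
fig.savefig(out_fname, dpi=150)
print("Saved figure to: '{}'".format(out_fname))
```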

**BONUS**: *Figure out how to add a legend to the figure (even though we're only plotting one thing...)*

In [None]:
# `pyplot` is the standard interface to `matplotlib`; renaming it to `plt` is common
import matplotlib.pyplot as plt

# Create figure and axes objects using the `subplots()` function
fig, ax = plt.subplots()

# This command will display the image in the Jupyter notebook
plt.show()

## 5) Plot a histogram of the black hole masses

Create a new figure and axes, and use the **"`axes.hist()`"** function to construct a histogram.

### 5a) Start out making the plot using the argument "`bins=10`" in the `hist()` function.

### 5b) Try creating your own bins (i.e. a list of masses marking the edges between bins) and use those in the `hist()` function.
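
As a hedged sketch of how the `bins` argument behaves, here is `hist()` on a short made-up list (`values` and `bin_edges` are placeholder names and numbers, not the real masses). Passing an integer such as `bins=10` asks for that many equal-width bins; passing a list gives the bin edges explicitly.

```python
import matplotlib.pyplot as plt

# Made-up numbers, only for illustration
values = [6.2, 6.8, 7.1, 7.4, 7.5, 8.0, 8.3, 8.9, 9.4, 9.6]
# Five edges define four bins: 6 to 7, 7 to 8, 8 to 9, and 9 to 10
bin_edges = [6.0, 7.0, 8.0, 9.0, 10.0]

fig, ax = plt.subplots()
# hist() returns the count in each bin, the bin edges, and the drawn bars
counts, edges, bars = ax.hist(values, bins=bin_edges)
print("counts per bin:", counts)
```

Note that a list of N edges always produces N - 1 bins.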