# a4 - Data Grapher!
In this assignment you will be using Python's [**turtle**](https://docs.python.org/3/library/turtle.html) module to create a program that can visualize a _time series_ of data drawn from a CSV file.

Fill in the below code cells as specified to implement this program. Note that cells may utilize variables and functions defined in previous cells.

## Setup
Run the below cell in order to load the `turtle` module (and some additional helper functions).

In [1]:
import turtle as turtle_module # import module
from ipywidgets import widgets # for buttons

def make_turtle(width, height, title):
    """ Create a turtle on a window with given width, height, title
    """
    t = turtle_module.Turtle()
    t.getscreen().title(title)
    t.getscreen().setup(width, height)
    t.getscreen().setworldcoordinates(0, 0, width, height)
    return t

def stop_turtle():
    """ Stops the currently running turtle window, to be closed manually
    """
    turtle_module.done() # stop the mainloop
    turtle_module.TurtleScreen._RUNNING = True # manually indicate it's stopped to enable repeat

def save_drawing():
    """ For fun: saves a copy of the drawing done on the canvas, 
        in both .eps and .png format
    """
    cv = turtle.getscreen().getcanvas()
    cv.postscript(file="turtle_chart.eps", colormode='color')
    from PIL import Image
    img = Image.open("turtle_chart.eps") 
    img.save("turtle_chart.png", "png")    
    
    
stop_button = widgets.Button(description = "Stop Turtle")
stop_button.on_click(lambda e: stop_turtle())
save_button = widgets.Button(description = "Save Drawing")
save_button.on_click(lambda e: save_drawing())
display(widgets.HBox([stop_button, save_button]))

HBox(children=(Button(description='Stop Turtle', style=ButtonStyle()), Button(description='Save Drawing', styl…

You can use the **`make_turtle()`** function to create the window for the turtle to draw in. This function takes as arguments the size and title for the window. You can run the below cell as an example; feel free to adjust the window size if you wish.
- Note that for your convenience, the turtle "starts" at the lower button corner, which is treated as coordinates (0,0).

Note that this function returns a reference to the turtle object that you will need to call methods on to move.

In [3]:
# arguments: window width, window height, window title
turtle = make_turtle(800, 600, "Turtle Chart")
turtle.speed(10) # use this method to change how fast the turtle moves. See documentation.

You can call methods on the turtle object in order to use it to draw (for example `turtle.forward(10)`). If you want to "clear" your drawing, use the `turtle.reset()` method. Note that many cells include calls to reset the drawing so that you can test each one "fresh".

In order to close the window, you will **first need to "stop"** the turtle program. You can do this by calling the provided `stop_turtle()` function, or by clicking the button above. You will then need to **close the window** (by clicking on the red x in the corner). You can re-create the window by re-running the `make_turtle()` function.

In [None]:
# You can also click the above button instead of running this cell!
#stop_turtle()

If you have any problems, you can use the `Kernel > Restart & Clear Output` menu option to restart the entire program.

## Part 1. Chart Axes
In the first set of steps, you'll write functions to draw the x- and y- axes for the chart. This will let you practice using the turtle, as well as doing some basic looping.

Define a function **`draw_y_axis_line()`**. This function should move the turtle to the "bottom left corner" of the chart, and then use it to draw a vertical line to represent the y-axis.
- I have provided some variabls that you can use to measure both the size of the chart area, as well as the coordinates of the bottom corner. These make sure there is some space on the sides of the chart for labels, etc. You are welcome to change the values of these variables if you wish.
- _Tip:_ The turtle's [goto()](https://docs.python.org/3/library/turtle.html#turtle.goto) method can be really useful for this assignment.

In [4]:
# Variables you can use to size your chart
CHART_HEIGHT = 500
CHART_WIDTH = 600
X_ORIGIN = 80
Y_ORIGIN = 50

# your code here!
def draw_y_axis_line():
    turtle.penup()
    turtle.goto(X_ORIGIN, Y_ORIGIN)
    turtle.pendown()
    turtle.goto(X_ORIGIN, Y_ORIGIN + CHART_HEIGHT)




In [5]:
# You can run this cell to test your function!
#turtle.reset() # clear previous test
draw_y_axis_line()
# call your function here to test it!


Define a function **`draw_y_axis_ticks()`**. This function should move the turtle to bottom corner of the chart, and then use it to draw 11 equally spaced "tick marks" (height indicators) on the y axis line. There should be tick marks at the very bottom and very top of the line. You must use a **loop** to do this; do not write the same code 10 times!
- Even with the loop, controlling the turtle can take a lot of commands so this function may get long
- When testing, I recommend you call your `draw_y_axis_line()` function first so you can make sure the tick marks are where you want them to be!

In [6]:
def draw_y_axis_ticks():
    turtle.penup()
    turtle.goto(X_ORIGIN, Y_ORIGIN)
    turtle.left(180)
    turtle.pendown()
    tick_space = CHART_HEIGHT/10
    Y_tick = Y_ORIGIN
    for tick in range(1,12):
        turtle.forward(5)
        turtle.penup()
        Y_tick = Y_tick + tick_space
        turtle.goto(X_ORIGIN, Y_tick)
        turtle.pendown()



In [7]:
# You can run this cell to test your function!
#turtle.reset() # clear previous test
draw_y_axis_ticks()
# call your function here to test it!


Define a function **`draw_y_axis_labels()`**. The function should move the turtle to the bottom corner of the chart and then draw a percent label between 0% and 100% to the left of each of the y axis tick marks. So it should have a 0% at the bottom, then a 10%, then a 20%, etc., with a 100% at the top. You must use a **loop** to do this; do not write the same code 10 times!
- This method will look similar in structure to your `draw_y_axis_ticks()` function.
- You use the turtle's [write()](https://docs.python.org/3/library/turtle.html#turtle.write) method to have it draw text!
- The default font that the turtle draws text with is rather small. You can make it more readable by passing in a _triple_ to the `write()` method's `font` argument. For example, `turtle.write("message", font=("Arial", 14, "normal"))` would have it write "message" in 14pt Arial.
- Again, when testing, I recommend you call your `draw_y_axis_line()` and `draw_y_axis_ticks()` functions first so you can make sure the labels are where you want them to be!

In [8]:
def draw_y_axis_labels():
    turtle.penup()
    turtle.goto(X_ORIGIN - 40, Y_ORIGIN)
    turtle.pendown()
    tick_space = CHART_HEIGHT/10
    Y_tick = Y_ORIGIN
    percentage = 0
    for tick in range(1,12):
        turtle.write(str(percentage) + '%')
        turtle.penup()
        percentage = percentage + 10
        Y_tick = Y_tick + tick_space
        turtle.goto(X_ORIGIN - 40, Y_tick)
        turtle.pendown()



In [None]:
# You can run this cell to test your function!
#turtle.reset() # clear previous test
draw_y_axis_labels()  
# call your function here to test it!


Define a function **`draw_x_axis()`** (that's `x`, not `y`!). This function should take in as an _argument_ a **list** of values that will go on the x-axis. The function should move the turtle to the bottom corner of the chart and then draw the complete x-axis (horizontal), with tick marks and labels.
- This function will work very similarly to the above steps. You are welcome to create additional "helper functions" (e.g., `draw_x_axis_ticks()`) if you wish. But you will still need to have a single `draw_x_axis()` function that calls those helpers.
- You should test this function by passing in a list of values (e.g., `[1995, 2000, 2005, 2010, 2015]`). Note that you can use a **`range()`** to very quickly produce a long list of numbers (such as every year from 1990 to 2018).
- As an _optional_ extra challenge, make sure that for large lists of values that the labels do not overlap. You should do this by only labeling _some_ of the tick marks (e.g., every other one), or by rotating the text (the turtle's facing).

In [9]:
def draw_x_axis_line():
    turtle.penup()
    turtle.goto(X_ORIGIN, Y_ORIGIN)
    turtle.pendown()
    turtle.goto(X_ORIGIN + CHART_WIDTH, Y_ORIGIN)
    
def draw_x_axis_ticks(list_year):
    turtle.penup()
    turtle.goto(X_ORIGIN, Y_ORIGIN)
    turtle.left(90)
    turtle.pendown()
    year_num = len(list_year)
    tick_space = CHART_WIDTH/(year_num-1)
    X_tick = X_ORIGIN
    for tick in range(1,year_num+1):
        turtle.forward(5)
        turtle.penup()
        X_tick = X_tick + tick_space
        turtle.goto(X_tick, Y_ORIGIN)
        turtle.pendown()
        
def draw_x_axis_labels(list_year):
    turtle.penup()
    turtle.goto(X_ORIGIN -5 , Y_ORIGIN - 20)
    turtle.pendown()
    year_num = len(list_year)
    tick_space = CHART_WIDTH/(year_num-1)
    X_tick = X_ORIGIN
    for tick in range(1,year_num+1):
        turtle.write(str(list_year[tick-1]))
        turtle.penup()
        X_tick = X_tick + tick_space
        turtle.goto(X_tick - 10, Y_ORIGIN -20)
        turtle.pendown()
        
def draw_x_axis(list_year):
    draw_x_axis_line()
    draw_x_axis_ticks(list_year)
    draw_x_axis_labels(list_year)


In [10]:
# You can run this cell to test your function!
    
list_year = [1995, 2000, 2005, 2010, 2015]
draw_x_axis(list_year)

# call your function here to test it!


Be sure to call all of your "draw axis" methods to test that your entire chart looks correct!

## Part 2. Reading Data
In this next step you will read in some data to draw in your chart&mdash;initially to specify the values on the axes.

You will be reading data from a [**.csv**](https://en.wikipedia.org/wiki/Comma-separated_values) file: a plain-text data format where each line represents a record (row) of data and where feature (column) is separated by a comma. In particular, your program will be able to handle `.csv` files that include a header row, where the first feature (column) is a "label" for a record, and the rest of the columns are numeric values (such as a time series).
- This assignment comes with a `test.csv` file for you to test with, found inside the `data/` folder, that contains arbitrary testing data. I recommend you use this to test your work as you write your program. The `data/` folder also has data files `us-wealth-share-partial.csv` and `us-wealth-share-full.csv` which contain historical data about the distribution of wealth in the United States ([source](http://gabriel-zucman.eu/usdina/)).

The first step is to read the file and get the list of features from the "header" row. For a time series, this will be a list of the "years" included.

Define a function **`get_years_from_file()`** that takes in a _**relative** file path_ to a csv file (a string such as "data/test.csv"). This function should open that file, read the first line of content (try using the [readline()](https://docs.python.org/3/library/io.html#io.IOBase.readline) function), [split](https://docs.python.org/3/library/stdtypes.html#str.split) the values into a list. Because the first value (e.g., "group") represents the row category and not a year, you'll need to "remove" that element from the list. Finally, ***return*** the year values (as a **list**). 

- Be sure to use a `with` block when opening the file!
- You'll need to remove the "new line" (`\n`) character from the end of the line; you can use the [strip()](https://docs.python.org/3/library/stdtypes.html#str.strip) method to do this.
- Note that you should *not* use list comprehensions anywhere in this assignment (anything that has a `for` inside of `[]`). Write out the for loop independently.

You can test your function by opening the `data/test.csv` file. You should get a list of years: `['1990', '1991', '1992', '1993', '1994', '1995', '1996']`.

In [11]:
def get_years_from_file(path):
    with open(path) as test:
        test_data = test.readline()
        line_clear = test_data.strip('\n')
        data = line_clear.split(',')
        del data[0]
    return (data)


Once you have these headings, you can use them to label the x-axis of your chart! Try calling your `draw_x_axis()` 
function, passing in the list of years from the file.

In [None]:
#turtle.reset() # clear previous test!
list_year = get_years_from_file('us-wealth-share-full.csv')
print(list_year)


    
# call your function here to test it!


Next you will need to read the _data values_ from the file so that you can draw them on your chart. However, all content from the file will be read as _strings_, so you'll need to do some processing to turn it into numeric data.

Define a function **`get_numeric_data_from_file()`** that takes in a _**relative** file path_ to a csv file (a string such as `"data/test.csv"`). This function will need to open that file and read its contents; you can use `readline()` or just loop through the lines of the file.

For each line _after_ the first (so don't include the heading line!), you'll need to do the following:
- _Strip_ off the newline character at the end
- _Split_ each line into a list of values
- For each element in the line (from index 1 on), convert that element from a string into a number (using `float()`&mdash;they may not be whole numbers). Remember to "reassign" the changed value back into the line's list
    - Yes, this means you will have a nested loop!
    - Be careful that you don't change the first element of the line; that's the "label" for the row (which you will need to include).
- Append that line list to a list of "all the lines" (thereby creating a "list of lists").

The function will need to **return** this "list of line values", which will be a _list of lists_ in which each value is numeric.

- Again, remember to open the file using a `with` block
- Remember that you can use a `range()` to loop through the _indices_ of a list, rather than its items!
- Do *not* use list comprehensions anywhere in this assignment (anything that has a `for` inside of `[]`). Write out the for loop independently.

You can test your function again with the `data/test.csv` file. It should return a list of 4 lists&mdash;one for each row. Notice that the values in each of those lists (except the first "label") are numbers, not strings:
```
[['aa', 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0],
 ['bb', 80.0, 60.0, 70.0, 50.0, 30.0, 0.0, 90.0],
 ['ac', 20.0, 0.0, 80.0, 20.0, 0.0, 80.0, 20.0],
 ['bd', 50.0, 40.0, 30.0, 20.0, 10.0, 10.0, 0.0]]
```

In [12]:
def get_numeric_data_from_file(path):
    with open(path) as test:
        head = test.readline()
        line = []
        for l in range(1,5):
            test_data = test.readline()
            line_clear = test_data.strip('\n')
            data = line_clear.split(',')
            for i in range(1, len(data)):
                data[i] =  float(data[i])
            line.append(data)
    return(line)


get_numeric_data_from_file('test.csv')

[['aa', 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0],
 ['bb', 80.0, 60.0, 70.0, 50.0, 30.0, 0.0, 90.0],
 ['ac', 20.0, 0.0, 80.0, 20.0, 0.0, 80.0, 20.0],
 ['bd', 50.0, 40.0, 30.0, 20.0, 10.0, 10.0, 0.0]]

## Part 3. Charting Data
In this final step you'll use the data values you've processed to draw a [**line chart**](https://en.wikipedia.org/wiki/Line_chart) of your time series. By the end, you'll have a function that will be able to use the turtle to visualize any (appropriately formatted) csv file!

Define a function **`draw_single_data_series()`** that will be used to draw a _single_ data series (that is: a single line on your chart). This function will take two arguments: a **list** of values (the first element of which is the series name), and a string representing what [_color_](https://docs.python.org/3/library/turtle.html#turtle.color) to draw the line in. The function should use the turtle to draw a single line representing this series.
- You should make the drawn line a little [thicker](https://docs.python.org/3/library/turtle.html#turtle.width) than default; 3 pixels wide worked for me.
- Your function should use a **loop** to move the turtle from each "data point" to the next, connected via a line (leave the pen down!). The [goto()](https://docs.python.org/3/library/turtle.html#turtle.goto) is a practical necessity here.
- You will need to calculate where each "point" on the line is based on the width and height of the chart in order to tell the turle where to go to.
    - _Tip_: For the y-value, think about calculating the _percentage_ of the chart's height, and then moving up that far from the bottom corner. For the x-value, you'll use a similar process to how you draw the x-axis tick marks, so try to copy your math from that!
    - This kind of geometric math can be tricky! Please check in if you need help!
- Finally: at the right side of each line, your function should have the turtle `write()` the label of the data series (the first element of the list).

You can test this function by passing it the first list from the `data/test.csv` file's list of lists (e.g., `test_data[0]`. I recommend also drawing the axes with appropriate label values to make sure everything is where you expect!

In [13]:
def draw_single_data_series(list_data, color):
    turtle.penup()
    turtle.pensize(3)
    turtle.pencolor(color)
    CHART_HEIGHT = 500
    CHART_WIDTH = 600
    X_ORIGIN = 80
    Y_ORIGIN = 50
    X_i = X_ORIGIN
    X_len = CHART_WIDTH / (len(list_data)-2)
    Y_len = CHART_HEIGHT / 10
    turtle.goto(X_ORIGIN, Y_len*list_data[1]/10 + Y_ORIGIN)
    turtle.pendown()
    for i in range(2,len(list_data)):
        y_i = Y_ORIGIN + Y_len*list_data[i]/10
        X_i = X_i + X_len
        turtle.goto(X_i, y_i)
    turtle.write(list_data[0])
    turtle.penup()
    

In [15]:
turtle.reset() # clear previous test!

# call your function here to test it!


Create a variable **`color_palette`** that is a _list_ of 7 or more strings representing [_colors_](https://docs.python.org/3/library/turtle.html#turtle.pencolor) to draw lines in. These can be named colors (e.g., `"red"` or `"blue"`), or a hex code (such as `"#33cc8c"`). [Colorbrewer](http://colorbrewer2.org/) has a nice set of palettes if you need inspiration.

In [16]:
color_palette = ['#edf8fb','#bfd3e6','#9ebcda','#8c96c6','#8c6bb1','#88419d','#6e016b']


Finally, define a function **`draw_data_set()`** that takes in two arguments: a _**relative** file path_ to a csv file (a string such as `"data/test.csv"`), and a list of colors to draw in. This function should do the following:

1. Get the features and data from the specified file
2. Draw the chart axes and labels with appropriate values

This function should call on your previous functions (indeed, most of its body will be made up of the testing code you've used before!).

In [21]:
def draw_data_set(file_path, color):
    list_year = get_years_from_file(file_path)
    test_data = get_numeric_data_from_file(file_path)
    draw_y_axis_line()
    draw_y_axis_labels()  
    draw_y_axis_ticks() 
    draw_x_axis(list_year)
    for i in range(0,len(test_data)):
        draw_single_data_series(test_data[i], color[i+2])



Call your `draw_data_set()` function, passing in the path to one of the provided CSV files (or one of your own!) and your `color_palette` list. Be sure and click the "Save Drawing" button at the top to save a copy for us to see!

In [22]:
#turtle.reset() # clear previous test
turtle.reset()
draw_data_set('us-wealth-share.csv',color_palette)
# call your function here to test it!

IndexError: list index out of range