<img style="float: left;" src="earth-lab-logo-rgb.png" width="150" height="150" />

# Earth Analytics Education - Bootcamp Course Fall 2020

### Important !! 
Before you turn in your assignment, make sure to run the entire notebook with a fresh kernel. To do this:

* **Restart the kernel and run all cells** (in the menubar, select Kernel$\rightarrow$Restart & Run All).

For each code cell below, replace the `raise NotImplementedError()` code with your code that addresses the activity challenge. The placeholder looks like this:
```
# YOUR CODE HERE
raise NotImplementedError()
```


Any open-ended questions will have a "YOUR ANSWER HERE" placeholder within a Markdown cell. Replace that text with your answer, also formatted using Markdown.

---

# Week 4 Homework Template

To complete assignment 4, be sure you have reviewed Chapter 1 from the <a href="https://www.earthdatascience.org/courses/scientists-guide-to-plotting-data-in-python/" target="_blank">Scientist's Guide to Plotting Data in Python</a> online textbook, which introduces plotting in Python using matplotlib.   

**Read the instructions for each question carefully to successfully complete the required tasks.**


## PEP 8 Syntax and Clean Code

Be sure to follow PEP 8 syntax guidelines as you write your code. These guidelines include the following:
* Use clear and expressive variable names
* Organize your code to support readability
* Follow PEP 8 standards for code line length and spacing 
* Use comments sparingly to document important steps in your code
* Finally, use the `autopep8` tool as a check to apply PEP 8 syntax throughout your notebook

IMPORTANT: the `autopep8` tool will not fix all PEP 8 syntax issues but it 
will fix many of them. Be sure to double check your code prior to submitting
it!

If you need a reminder about what PEP 8 is, read our <a href="https://www.earthdatascience.org/courses/intro-to-earth-data-science/write-clean-expressive-code/intro-to-clean-code/python-pep-8-style-guide/" target="_blank">online textbook page on PEP 8</a>.


## About the Assignment Data

For this assignment, you will use summarized data on fire occurrence in California from 2000 to 2015 provided by <a href="https://www.fs.usda.gov/rds/archive/Product/RDS-2013-0009.4/" target="_blank">the United States Forest Service</a>. These data show the total number of fires and the mean fire size for each year.


In [None]:
# DO NOT MODIFY THIS CELL
# Core imports needed for grading
import matplotcheck.notebook as nb
import matplotcheck.timeseries as ts
from matplotcheck.base import PlotTester

<div class="alert-info" markdown="1">

## Import Python Packages (4 points)

In the cell below, replace `raise NotImplementedError()` with your code to:

1. Import a package and module needed to create plots.
2. Import the os package: `import os`
3. Import pandas: `import pandas as pd`
4. Import earthpy: `import earthpy as et`

The test below will check that you imported the plotting module as `plt`, pandas as `pd`, os, and earthpy as `et`.
</div>

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
# DO NOT MODIFY THIS CELL
# This cell will ensure that you imported the plotting package and pandas correctly

# Creating total points
import_answer_points = 0
# test that both modules imported - use duck typing
try:
    pd.NA
    print("Score! Pandas has been imported as pd!")
    import_answer_points += 1
except NameError:
    print("Pandas has not been imported as pd. Please make sure to import it properly.")

try:
    plt.show()
    print("Nice! matplotlib.pyplot has been imported as plt!")
    import_answer_points += 1
except NameError:
    print("matplotlib.pyplot has not been imported as plt. Please make sure to import it properly.")
    
try:
    os.getcwd()
    print("Great work! The os module has imported correctly!")
    import_answer_points += 1
except NameError:
    print("Oops, make sure that the os package is imported.")
    
try:
    data = et.io
    print("Great work! The earthpy package has imported correctly!")
    import_answer_points += 1
except NameError:
    print("Oops, make sure that the earthpy package is imported using the alias et.")

print("You received {} out of 4 points.".format(import_answer_points))
import_answer_points


<div class="alert-info" markdown="1">

## Open the Fire Data (5 points)

You can download the file needed to complete this task using the earthpy package as follows:

`et.data.get_data(url="url-here")`

By default, earthpy will:

* Create an `~/earth-analytics/data` directory in your home directory that you will use to process your data all semester.
* Download the data and unzip it (if it's compressed) into an `~/earth-analytics/data/earthpy-downloads/` directory.

In the cell below, complete the following task:

1.  Download the `usda-fire-data.csv` using the following url: `https://ndownloader.figshare.com/files/24649844`
2.  Set your working directory to the `earth-analytics/data` directory using the following syntax:

`os.chdir(os.path.join(et.io.HOME, 'earth-analytics', 'data'))`

3. Open the `usda-fire-data.csv` file using pandas `read_csv()` and assign the output data to a variable (be sure the variable name is expressive!).

At the end of the cell, call the variable to ensure that the output prints
in your notebook. The last line of your cell should look something like this:

`data_frame_variable_name`

The code will look something like the example below (except your variable names will be more expressive!):

```python
# Download the data - be sure to fix the url!
et.data.get_data(url="url-here")
# Set your working directory
os.chdir(os.path.join(et.io.HOME, 'earth-analytics', 'data'))

# Open up your csv file using pandas
# First set the file path to the csv you downloaded above
csv_path = os.path.join("earthpy-downloads", "usda-fire-data.csv")
# Then open up the file using pandas
data_frame_variable_name = pd.read_csv(csv_path)
data_frame_variable_name
```

### Tips:
1. You will learn more about pandas, directories, and file paths in a later class.
2. `os.path.join` is a function that lets you create file paths
that work on any operating system (Mac, Windows, or Linux). It is good practice
to use it when creating file paths in `Python`.
****
</div>
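
As a minimal sketch of the second tip, the example below builds the relative path to the downloaded file with `os.path.join` and prints the separator used on the current machine. It only constructs the path string; it does not download or open anything.

```python
import os

# Build a relative path from its parts; os.path.join inserts the
# separator that is correct for the current operating system
csv_path = os.path.join("earthpy-downloads", "usda-fire-data.csv")
print(csv_path)

# os.sep holds that separator ("/" on Mac and Linux, "\" on Windows)
print(os.sep)
```

Because the separator is never typed by hand, the same line of code should work unchanged on another operating system.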

**IMPORTANT: At the end of your cell below be sure to call the name of your dataframe
so that the dataframe prints in your notebook. If you don't do this, the test below
will fail.**


In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
# DO NOT MODIFY THIS CELL

<img style="float: left;" src="colored-bar.png"/>


<div class="alert-info" markdown="1">

## Create a Bar Plot of Total Fires (15 points)

Create a plot with total fires on the y-axis and year on the x-axis. 

There are several ways to create a bar plot using matplotlib.
Below, create a bar plot using the following approach:

```python
ax.bar(x=data_frame_name.year,
       height=data_frame_name.total_fires_column_name)
```
Modify the plot as follows:

* Add a title and x- and y-axis labels using the same approach that you used in last week's homework assignment.
* Change the **color** of the bars on the plot
* Change the **edgecolor** of the bars on the plot
</div>
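
The `ax.bar()` call described above needs a figure and an axes object to draw on. The sketch below shows one way to set that up, assuming `matplotlib.pyplot` is imported as `plt`. The values, colors, and labels are made up for illustration and are not the USDA fire data.

```python
import matplotlib.pyplot as plt

# Made-up values for illustration only (NOT the USDA fire data)
years = [2000, 2001, 2002, 2003]
total_fires = [120, 95, 140, 110]

# Create the figure and axes objects, then draw the bars on the axes
fig, ax = plt.subplots(figsize=(8, 5))
ax.bar(x=years,
       height=total_fires,
       color="darkorange",
       edgecolor="black")

# Label the plot so a reader knows what is shown
ax.set(title="Example: Total Fires per Year (Made-up Data)",
       xlabel="Year",
       ylabel="Number of Fires")
```

In your own cell, the lists would be replaced by columns from your dataframe, and the title and labels should describe the real data.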

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

# Add your code here

### DO NOT MODIFY LINE BELOW###
student_plot1_ax = nb.convert_axes(plt)

In [None]:
# DO NOT MODIFY THIS CELL


<img style="float: left;" src="colored-bar.png"/>

<div class="alert-info" markdown="1">

## Create a Bar Plot Using Pandas of Total Fires (12 points)

Above you created a plot using a core matplotlib approach. You can also 
plot using the pandas method `data_frame.plot()`. This approach will have 
some limitations as you begin to plot more complex data sets but 
it is useful for you to see how it works. 

```python
data_frame.plot(x="x-axis-column-header",
                y="y-axis-column-header",
                kind="type-of-plot-here",
                title="title-here")
```
When using `.plot()` you can also set the edgecolor and color attributes
just like you did using core matplotlib. 

Below, create a plot using `data_frame.plot()` using the same title, 
axis labels and colors that you did above.
****
</div>
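
As a rough illustration of the pandas approach, the sketch below plots a small made-up dataframe. The dataframe name `example_fires` and the column name `total_fires` are hypothetical and chosen only for this example; the column names in the CSV file may differ. The `.plot()` method returns a matplotlib axes object, which is used here to set the axis labels.

```python
import pandas as pd

# Made-up dataframe for illustration only (NOT the USDA fire data)
example_fires = pd.DataFrame({"year": [2000, 2001, 2002, 2003],
                              "total_fires": [120, 95, 140, 110]})

# kind="bar" creates a bar plot; color and edgecolor are passed
# through to matplotlib
ax = example_fires.plot(x="year",
                        y="total_fires",
                        kind="bar",
                        title="Example: Total Fires per Year (Made-up Data)",
                        color="purple",
                        edgecolor="black")

# .plot() returns the axes object, so labels can be set on it
ax.set(xlabel="Year",
       ylabel="Number of Fires")
```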

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

### DO NOT MODIFY LINE BELOW###
student_plot2_ax = nb.convert_axes(plt)

In [None]:
# DO NOT MODIFY THIS CELL


<img style="float: left;" src="colored-bar.png"/>

<div class="alert-info" markdown="1">

## Create Column Containing Mean Fire Size in Hectares (10 Points)

Above, you imported data that contain a column for mean
fire size in acres. For your analysis, you want the data to be in hectares. Do the following:

1. Create a variable that contains the value to convert acres to hectares

HINT: `1 acre = 0.404686 hectares`

2. You can create a new column in a pandas data frame using the following syntax:

`df["new_col_name"] = df["existing_col_name"] * conversion_value_here`

Create a new column called `mean_size_hectares` in your data frame that contains the mean fire size values converted to hectares.

****
</div>
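
The two steps above can be sketched on a small made-up dataframe. The names `example_fires`, `mean_size_acres`, and `acres_to_hectares` are assumptions made for this example; check your own dataframe for the actual name of the column that stores mean fire size in acres.

```python
import pandas as pd

# Made-up dataframe for illustration only (NOT the USDA fire data)
example_fires = pd.DataFrame({"year": [2000, 2001],
                              "mean_size_acres": [100.0, 250.0]})

# Step 1: store the conversion factor in an expressive variable
acres_to_hectares = 0.404686

# Step 2: multiply the existing column by the conversion factor and
# assign the result to a new column
example_fires["mean_size_hectares"] = (
    example_fires["mean_size_acres"] * acres_to_hectares)

# Call the dataframe last so that it prints in the notebook
example_fires
```

The multiplication is applied to every row at once, so no loop is needed.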
    
**IMPORTANT: At the end of your cell below be sure to call the name of your dataframe
so that the dataframe prints in your notebook. If you don't do this, the test below
will fail.**

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
# DO NOT MODIFY THIS CELL

# Testing that the proper column has been added to the dataframe

student_data_2 = _
dataframe_2_answer_points = 0

if "mean_size_hectares" in student_data_2.columns.tolist():
    print("Great - you have added the mean_size_hectares column. Great job! You get 5 / 5 points for this task")
    dataframe_2_answer_points += 5
else:
    print("Oops - you should have a column called mean_size_hectares in your data frame. Make sure you added and named the column correctly.")
    
dataframe_2_answer_points

In [None]:
# DO NOT MODIFY THIS CELL


<div class="alert-info" markdown="1">

## Create A Figure with Subplots (Multi-plot Figure - 20 points)

Above you experimented with plotting using both Pandas `.plot()` and 
matplotlib `.bar()`. In the cell below create a Figure that contains
two subplots:

* Plot 1: Create a bar plot that shows total number of fires by year
* Plot 2: Create a scatter plot that shows the mean fire size in hectares by year

For each plot do the following:

* Modify the default plot colors
* Add a title, and x and y axis labels

For the figure:
* Add an overall title for the entire figure

When adding your titles and labels, think about the following pieces of information that could help someone easily interpret the plot:
* geographic coverage or extent of data.
* duration or temporal extent of the data.
* what was actually measured and/or represented by the data.
* units of measurement.
****
</div>
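
One possible layout for a figure with two subplots is sketched below, assuming `matplotlib.pyplot` is imported as `plt`. All values, colors, titles, and units shown are made up for illustration and are not the USDA fire data; the stacked two-row layout is just one choice.

```python
import matplotlib.pyplot as plt

# Made-up values for illustration only (NOT the USDA fire data)
years = [2000, 2001, 2002, 2003]
total_fires = [120, 95, 140, 110]
mean_size_hectares = [40.5, 62.1, 55.0, 71.3]

# Create one figure holding two axes stacked in two rows
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(8, 10))

# Overall title for the entire figure
fig.suptitle("Example Figure with Two Subplots (Made-up Data)")

# Plot 1: bar plot on the first axes
ax1.bar(years, total_fires,
        color="darkorange", edgecolor="black")
ax1.set(title="Total Fires per Year",
        xlabel="Year",
        ylabel="Number of Fires")

# Plot 2: scatter plot on the second axes
ax2.scatter(years, mean_size_hectares,
            color="firebrick")
ax2.set(title="Mean Fire Size per Year",
        xlabel="Year",
        ylabel="Mean Fire Size (Hectares)")
```

Each axes object is styled independently, while `fig.suptitle()` applies to the figure as a whole.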

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

### DO NOT REMOVE LINE BELOW ###
ts_plot_axes = nb.convert_axes(plt, which_axes="all")

In [None]:
# DO NOT MODIFY THIS CELL!

## Explain Your Plot

In the Markdown cell below, answer the following questions about your plot using a **bullet list**.

1.  Why did you choose the styles and colors for the data being represented in each subplot? 

2. Does either the yearly total number of fires or the average size of fires appear to be increasing over time in California? Explain your answer using the patterns that you see in the plot.

3. What additional data might help you to better answer question 2 about whether the number of fires or the average fire size appears to be increasing? (It could help to take a look at the information about the original dataset from <a href="https://www.fs.usda.gov/rds/archive/Product/RDS-2013-0009.4/" target="_blank">the United States Forest Service</a>.)

Remove any existing text in the cell below before adding your answer.

YOUR ANSWER HERE

## Discuss Your Workflow

In the Markdown cell below, answer the following questions using a **numbered list**:

Consider the variable name that you used for your pandas dataframe. Explain why it is expressive (or not).

Remove any existing text in the cell below before adding your answer.

YOUR ANSWER HERE

### Points for Overall Notebook

DO NOT MODIFY THIS CELL. 
Your instructor will use this cell to give you points for the following - 
5 points for each item below (20 points total):

- The notebook runs from start to finish and starts at [1]
- PEP 8 format compliance
- Expressive clean code
- Spelling & careful use of comments