Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All).

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your name and collaborators below:

In [1]:
NAME = "Drew Phillips"
COLLABORATORS = ""

---

<img style="float: left;" src="earth-lab-logo-rgb.png" width="150" height="150" />

# Earth Analytics Education

# Week 11 Homework Template

To complete assignment 11, be sure you have reviewed Chapters 17-19 in Section 7 of the <a href="https://www.earthdatascience.org/courses/intro-to-earth-data-science/" target="_blank">Intro to Earth Data Science online textbook</a>, which cover conditional statements, loops, and functions in **Python**.

Read the instructions for each question carefully to successfully complete the required tasks.

**Request for the autograding tool that we are building:** please do not rename the notebook file. We are working on the tool to allow renamed notebooks, but for now, please leave the notebook name as `ea-bootcamp-11-dry-code.ipynb`.


## Adherence to PEP 8 

Be sure to use clear and succinct names for variables, etc, and to organize your code to support readability.

You will also be graded on adherence to PEP 8 standards including length of code lines and the appropriate use of comments and white space.

Thus, be sure to add comments throughout your code (note that there are no pre-populated comments in this notebook), and use the `autopep8` tool to help you implement some of the PEP 8 standards. 

You can review the <a href="https://www.earthdatascience.org/courses/intro-to-earth-data-science/write-clean-expressive-code/intro-to-clean-code/python-pep-8-style-guide/" target="_blank">online textbook page on PEP 8 </a> as needed.


## Assignment Data

For this assignment, you will write **Python** code to download and work with data on total monthly precipitation (inches) and average monthly minimum temperature (Fahrenheit) for the Los Angeles International Airport (LAX) in California between 1992 and 2018 provided by <a href="https://w2.weather.gov/climate/xmacis.php?wfo=lox" target="_blank">the National Weather Service</a>.

In [None]:
# Core imports needed for grading
import matplotcheck.notebook as nb
import matplotcheck.timeseries as ts

## Import Python Packages

In the cell below, add code **after the line for `Your Code Here`**, replacing `raise NotImplementedError()` with your code to import the packages/modules needed to:
* create plots 
* set your working directory
* download data using earthpy functions
* work with numpy arrays and pandas dataframes

Be sure to list the package imports following the appropriate PEP 8 order. 

In [None]:
# Import required packages
import os

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import earthpy as et

## Define Function to Convert Precipitation Units

In the cell below, add code **after the line for `Your Code Here`**, replacing `raise NotImplementedError()` with your code, to complete the following task:

* Write a function to convert values from inches to millimeters (1 inch = 25.4 millimeters). 

Be sure to include a docstring with a brief description of the function (i.e. how it works, purpose) as well as identify the input parameters (i.e. type, description) and the returned output (i.e. type, description).

In [None]:
# Create function to convert inches to mm
def in_to_mm(inches):
    """Convert inches to millimeters.

    Parameters
    ----------
    inches : int or float
        Numeric value in inches.

    Returns
    -------
    mm : int or float
        Numeric value in millimeters.
    """

    mm = inches * 25.4

    return mm
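
As a quick check, the conversion can be applied to a single value or to a whole array at once, because multiplying a **numpy** array by a number works element by element. The sketch below redefines the function so that it runs on its own; the sample values are made up for illustration.

```python
import numpy as np


def in_to_mm(inches):
    """Convert inches to millimeters (1 inch = 25.4 mm)."""
    return inches * 25.4


# Convert a single value
one_inch_mm = in_to_mm(1)

# Convert an array of made-up values element by element
sample_mm = in_to_mm(np.array([0.0, 1.0, 2.0]))

print(one_inch_mm, sample_mm)
```

Because the function only relies on multiplication, the same call should also accept a **pandas** column.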

## Define Function to Convert Temperature Units

In the cell below, add code **after the line for `Your Code Here`**, replacing `raise NotImplementedError()` with your code, to complete the following task:

* Write a function to convert values from Fahrenheit to Celsius units using the following equation:
    * Celsius = (Fahrenheit - 32) / 1.8
    * Note that including `Fahrenheit - 32` within parentheses `()` tells **Python** to execute that calculation first.

Be sure to include a docstring with a brief description of the function (i.e. how it works, purpose) as well as identify the input parameters (i.e. type, description) and the returned output (i.e. type, description).

In [None]:
# Create function to convert
# Degrees Fahrenheit to Celsius
def f_to_c(f):
    """Convert degrees Fahrenheit to Celsius.

    Parameters
    ----------
    f : int or float
        Numeric value in degrees Fahrenheit.

    Returns
    -------
    c : int or float
        Numeric value in degrees Celsius.
    """

    c = (f - 32) / 1.8

    return c
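
One way to sanity-check a conversion function is to test it against well-known reference points, such as the freezing and boiling points of water. The sketch below redefines the function so that it runs on its own; the variable names are illustrative.

```python
def f_to_c(f):
    """Convert degrees Fahrenheit to Celsius."""
    return (f - 32) / 1.8


# Water freezes at 32 degrees F and boils at 212 degrees F
freezing_c = f_to_c(32)
boiling_c = f_to_c(212)

# Round to avoid displaying tiny floating point differences
print(round(freezing_c, 2), round(boiling_c, 2))
```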

## Download Data Using EarthPy

In the cell below, add code **after the line for `Your Code Here`**, replacing `raise NotImplementedError()` with your code, to complete the following task:

* **Use a loop to download the following datasets using earthpy**: 
    * lax-monthly-mean-min-temp-fahr-1992-2018.csv from https://ndownloader.figshare.com/files/18601016
        * The dataset contains the monthly mean of minimum temperature (Fahrenheit) that occurred in each month and year at LAX airport between 1992 and 2018. The data are organized with a row for each year (in order from 1992 to 2018) and a column for each month (in order from January through December). Note that this .csv file has no column headers. 
    * lax-monthly-total-precip-inches-1992-2018.csv from https://ndownloader.figshare.com/files/18603971
        * The dataset contains the total monthly precipitation (inches) for each month and year at LAX airport between 1992 and 2018. The data are organized with a row for each month (in order from January to December) and a column for each year (in order from 1992 to 2018). Note that this .csv file has column headers for the year and one column for monthly mean (inches).

In [None]:
# Create list of urls
urls = ["https://ndownloader.figshare.com/files/18601016",
        "https://ndownloader.figshare.com/files/18603971"]

# Loop thru urls and download each one
for url in urls:
    et.data.get_data(url=url)

## Set Working Directory

In the cell below, add code **after the line for `Your Code Here`**, replacing `raise NotImplementedError()` with your code, to set your working directory to **your `earth-analytics` directory in your home directory**.

Be sure to use the appropriate functions that will allow your code to run successfully on any operating system.

In [None]:
# Set working directory
work_dir = os.path.join(et.io.HOME, "earth-analytics")

# Change to working directory
os.chdir(work_dir)

# Check to see if dir changed correctly
os.getcwd()
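
A slightly more defensive variant checks that the directory exists before changing into it. This is only a sketch: it uses `os.path.expanduser("~")` from the standard library as a stand-in for `et.io.HOME` so that it runs without **earthpy**, and it assumes the directory is named `earth-analytics` as in this assignment.

```python
import os

# Build the path from the home directory with os.path.join,
# so the separators match the operating system
home_dir = os.path.expanduser("~")
work_dir = os.path.join(home_dir, "earth-analytics")

# Only change directories if the target exists
if os.path.exists(work_dir):
    os.chdir(work_dir)
    print("Working directory set to:", os.getcwd())
else:
    print("Directory not found:", work_dir)
```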

## Check Paths and Import Data

In the cell below, add your code **after the line for `Your Code Here`**, replacing `raise NotImplementedError()` with your code, to complete the following task:

**For each file, write a conditional statement** to:
* Import the file into the appropriate data structure if the relative path to that file exists.
* Print a helpful message if the relative path to that file does not exist.

Aim to reduce repetition in your code with reusable variables for the paths. Be sure to use the appropriate functions that will allow your code to run successfully on any operating system.

It can help to open each .csv file and review the structure, headers, etc to identify which data structure is most appropriate for which file. 

In [None]:
# Set relative path to data dir for both datasets
dir_path = os.path.join("data", "earthpy-downloads")

# Set path to both datasets
temps_fpath = os.path.join(dir_path,
                           "lax-monthly-mean-min-temp-fahr-1992-2018.csv")

precip_fpath = os.path.join(dir_path,
                            "lax-monthly-total-precip-inches-1992-2018.csv")

# Import LAX temp data if the path exists
if os.path.exists(temps_fpath):
    temps = np.loadtxt(temps_fpath, delimiter=",")
else:
    print("Temperature file not found")
    
# Import LAX precip data if the path exists:
if os.path.exists(precip_fpath):
    precip = pd.read_csv(precip_fpath)
else:
    print("Precipitation file not found")
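
The same check-then-import pattern can be tried without the assignment data. The sketch below is illustrative only: it writes a small, made-up, headerless .csv file to a temporary directory, then loops over one path that exists and one that does not. The file names and values are hypothetical.

```python
import os
import tempfile

import numpy as np

# Write a small, made-up headerless .csv to a temporary directory
tmp_dir = tempfile.mkdtemp()
demo_fpath = os.path.join(tmp_dir, "demo-temps.csv")
with open(demo_fpath, "w") as demo_file:
    demo_file.write("1.0,2.0,3.0\n4.0,5.0,6.0\n")

# Path to a file that was never created
missing_fpath = os.path.join(tmp_dir, "does-not-exist.csv")

# Import each file only if its path exists
for fpath in [demo_fpath, missing_fpath]:
    if os.path.exists(fpath):
        demo_arr = np.loadtxt(fpath, delimiter=",")
        print("Imported array with shape", demo_arr.shape)
    else:
        print("File not found:", fpath)
```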

## Convert Monthly Mean of Total Precipitation to Millimeters

In the cell below, add code **after the line for `Your Code Here`**, replacing `raise NotImplementedError()` with your code, to complete the following task:

1. Convert the monthly means for total precipitation to millimeters using the function that you defined at the top of this notebook.
2. Save the final result to a new column in the **pandas** dataframe, and display the final dataframe. 
    
It can help to review how to create a new column from a calculation in a **pandas** dataframe.

In [None]:
# Add new column with precipitation in mm and display dataframe
precip['precip_mm'] = in_to_mm(inches=precip['monthly_mean'])
precip
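
Creating a new column from a calculation can be sketched with a small stand-in dataframe. The column names mirror the ones used in this notebook, but the values are made up for illustration.

```python
import pandas as pd

# Made-up stand-in for the precipitation dataframe
demo_df = pd.DataFrame({"month": ["Jan", "Feb", "Mar"],
                        "monthly_mean": [1.0, 2.0, 0.5]})

# Assigning to a new column name adds the column to the dataframe;
# the calculation is applied to every row of the selected column
demo_df["precip_mm"] = demo_df["monthly_mean"] * 25.4

print(demo_df)
```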

## Calculate Monthly Mean of Minimum Temperature in Celsius

In the cell below, add code **after the line for `Your Code Here`**, replacing `raise NotImplementedError()` with your code, to complete the following tasks:

1. Calculate the mean of the temperature values for each month (i.e. across all years of data).
2. Convert the temperature units to Celsius using the function that you defined at the top of this notebook.
3. Save the final result to a new **numpy** array and display the new array of monthly mean temp in Celsius.

Aim to use the most concise code that accomplishes this goal. Recall that you can pass the output of one function as the input to another function. 

In [None]:
# Convert array to Celsius 
# Argument is result of mean of array
temps_mean_c = f_to_c(f=np.mean(temps, axis=0))

# Display new array
temps_mean_c
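
To see what `axis=0` does here, consider a small made-up array with a row for each year and a column for each month. The sketch below is illustrative only and applies the same Fahrenheit to Celsius equation used earlier in this notebook.

```python
import numpy as np

# Made-up values: 2 rows (years) x 3 columns (months), in Fahrenheit
demo_temps = np.array([[32.0, 50.0, 68.0],
                       [32.0, 59.0, 86.0]])

# axis=0 collapses the rows, leaving one mean per column (month)
demo_mean_f = np.mean(demo_temps, axis=0)

# Convert the monthly means to Celsius
demo_mean_c = (demo_mean_f - 32) / 1.8

print(demo_mean_f.shape, demo_mean_c)
```

The result has one value per month, which is the shape needed to plot against the month names.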

## Get Month Names and Check Shape For Plotting 

In the cell below, add code **after the line for `Your Code Here`**, replacing `raise NotImplementedError()` with your code, to complete the following tasks:

1. Select the month names and save as a new **pandas** series. 
2. Use a conditional statement to:
    * Check whether the shape of the **numpy** array for temperature is the same as the shape of the **pandas** series. (Note that you can actually use the same code to check the shape of the **numpy** array and the **pandas** series.) 
    * Print a message indicating whether or not the **numpy** array for temperature can be plotted with the **pandas** series. 
    
It can help to review how to select data from **pandas** dataframes as a series. 

In [None]:
# Select month names and save as series
months = pd.Series(precip['month'])

# Check shapes of np array and pd series
if months.shape == temps_mean_c.shape:
    print("Array and series can be plotted together!")
else:
    print("Array and series are not the same length.")
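
The shape comparison works because `.shape` returns a tuple for both a **pandas** series and a **numpy** array. A minimal sketch with made-up values:

```python
import numpy as np
import pandas as pd

# Made-up month names and values of the same length
demo_months = pd.Series(["Jan", "Feb", "Mar"])
demo_values = np.array([10.0, 12.5, 15.0])

# Both objects report their shape as a tuple
print(demo_months.shape, demo_values.shape)

# Tuples can be compared directly with ==
shapes_match = demo_months.shape == demo_values.shape
print(shapes_match)
```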

## Figure of Monthly Means of Total Precipitation and Minimum Temperature at LAX

From the previous assignment, you have an idea of the fire season (i.e. the range of time within a year in which fire is most likely to occur) in California. 

Create a complementary figure to show the monthly means of total precipitation and minimum temperature for the Los Angeles area (as represented by the LAX airport), which is one of the most fire-prone areas in the state. 

In the cell below, add code **after the line for `Your Code Here`**, replacing `raise NotImplementedError()` with your code, to complete the following task:

* Create one multi-plot figure that contains two subplots that are **side by side**:
    * **left** plot: mean of **total precipitation** for each month
        * **Use the pandas series for the month names.**
    * **right** plot: mean of **minimum temperature** for each month
        * **Use the pandas series for the month names.**
* **Use a different color for each plot but you can use the same style if you like.** 
    * For each plot, be sure to include appropriate titles and axes labels including units of measurement where appropriate. 
* Add an overall title for the entire figure. 

For your title and labels, be sure to think about the following pieces of information that could help someone easily interpret the plot:
* geographic coverage or extent of data.
* duration or temporal extent of the data.
* what was actually measured and/or represented by the data.
* units of measurement.

**Request for the autograding tool that we are building:** please comment out the code line `plt.show()` in your plot code like this: `# plt.show()`. If you do not comment out that code line, you may see an extra empty plot underneath your desired figure. 

In [None]:
# Create figure with 2 plots
fig, (precip_plot, temp_plot) = plt.subplots(1, 2, figsize=(12, 6))

# Provide title for figure
fig.suptitle("Monthly Means of Total Precipitation and Minimum Temperature"
             "\nat LAX, California, 1992-2018")

# For total precip plot on left
# Create bar graph
precip_plot.bar(months, precip['precip_mm'],
                color="lightblue")

# Set appearance and labels
precip_plot.set(xlabel="Month",
                ylabel="Mean Total Precipitation (mm)",
                title="Mean Total Precipitation per Month")

# For mean min temp plot on right
# Create bar graph
temp_plot.bar(months, temps_mean_c,
              color="orange")

# Set appearance and labels
temp_plot.set(xlabel="Month",
              ylabel="Mean Minimum Temperature (°C)",
              title="Mean Minimum Temperature per Month")

### DO NOT REMOVE LINE BELOW ###
ts_1_plot = nb.convert_axes(plt, which_axes="all")
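
The side-by-side layout can be tried on its own with placeholder data. In the sketch below, all values, titles, and variable names are made up for illustration; only the structure (one figure, two axes in one row, a different color for each plot) reflects the task.

```python
import matplotlib.pyplot as plt

# Made-up placeholder data
demo_months = ["Jan", "Feb", "Mar"]
demo_precip_mm = [60.0, 75.0, 45.0]
demo_temp_c = [9.0, 10.0, 11.0]

# 1 row and 2 columns gives two plots side by side
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(12, 6))

# Overall title for the entire figure
fig.suptitle("Overall Figure Title (placeholder)")

# Left plot
ax_left.bar(demo_months, demo_precip_mm, color="lightblue")
ax_left.set(xlabel="Month", ylabel="Precipitation (mm)", title="Left Plot")

# Right plot
ax_right.bar(demo_months, demo_temp_c, color="orange")
ax_right.set(xlabel="Month", ylabel="Temperature (°C)", title="Right Plot")
```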

## Discuss Your Plot

In the Markdown cell below, answer the following questions using any kind of Markdown list.

Review your figure on the Fire Season in California from the previous assignment:
1. How does the pattern of the monthly means of total precipitation at LAX compare to the fire season in California?
2. How does the pattern of the monthly means of minimum temperature at LAX compare to the fire season in California?

Remove any existing text in the cell before adding your answer.

1. There seems to be a negative correlation between mean total monthly precipitation and fire size and count. In the summer months (May to Sep), the precipitation drops dramatically. Fire count (>100 acres) is at its highest values in June, July, and August, and mean fire size is likewise elevated (though that graph peaks in October). This makes intuitive sense, as one would expect precipitation to limit fire ignition and propagation.

2. Monthly minimum temperature seems to be correlated with both fire measurements (mean size and count). Mean minimum temperatures are highest in July, August, and September, and fire size and count are likewise elevated (though we observe peaks slightly earlier and slightly later, respectively).

## Discuss Your Workflow

In the Markdown cell below, answer the following questions using any kind of Markdown list.

**Pandas** dataframes actually support summary statistics (e.g. mean, max) using `axis=`, just like **numpy** arrays. See <a href="https://stackoverflow.com/questions/22149584/what-does-axis-in-pandas-mean" target="_blank">this post on stackoverflow</a> and <a href="https://www.geeksforgeeks.org/python-pandas-dataframe-mean/" target="_blank">this short tutorial</a> for examples. 
* How could you use this code to calculate the monthly means for the **pandas** dataframe? 
    * Which axis would you need to use for the **pandas** dataframe?
    * How can you capture the results of the calculation within the **pandas** dataframe? 

Note: you do not need to write new code to address these questions, but rather explain the process to achieve the outcome. 

Remove any existing text in the cell before adding your answer.

- You could use code like `df.mean(axis=1, numeric_only=True)` to calculate the mean across the year columns for each row, where each row is a month.
- You would need `axis=1`, which computes the mean across the columns (years) and returns one value per row (month).
- You would assign the result to a new column, such as `monthly_mean` above. And if you don't like that column, `df.drop` it!
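
A row-wise mean can be sketched with a small made-up dataframe that has a row for each month and a column for each year. The column names and values below are hypothetical, and selecting the year columns explicitly is just one way to keep the text column out of the calculation.

```python
import pandas as pd

# Made-up values: one row per month, one column per year
demo_df = pd.DataFrame({"month": ["Jan", "Feb"],
                        "1992": [1.0, 2.0],
                        "1993": [3.0, 4.0]})

# Select only the year columns so the text column is excluded
year_cols = ["1992", "1993"]

# axis=1 computes across the columns, giving one mean per row (month)
demo_df["monthly_mean"] = demo_df[year_cols].mean(axis=1)

print(demo_df)
```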

## Practice Pseudo-coding

In the Markdown cell below, answer the following questions using any kind of Markdown list.

Practice pseudo-coding: you have code to download data and other code to import data into a data structure: 
1. Based on either a **numpy** array or **pandas** dataframe, list the steps that you would need to include in a function definition if you wanted the function to both download and import data into that data structure. 
    * It can help to begin with listing out all of the steps for each task and then reorganizing them as needed. 
2. Which step(s) make writing this function difficult and why? Is there anything that you would need to know that you have not learned yet?
    * Think about which placeholder variables you would need to accomplish these steps. 
3. Would it be possible to write one function that could import data into either a **numpy** array or **pandas** dataframe, depending on the input parameter? How might that work?

Note: you do not need to write new code to address these questions, but rather explain the process to achieve the outcome. 

Remove any existing text in the cell before adding your answer.

- The steps are as follows:
    1. Specify the url.
    2. Download the data from the url.
    3. Use `os.path.join` to create the path to the dataset.
    4. Check whether the path exists with `os.path.exists`.
    5. Print a message if the path does not exist.
    6. Otherwise, import the data using pandas or numpy.

- Step 6 would be the most challenging because the function would need to determine the "right" data structure based on the content. I would need to know how to check for data that cannot be read into a numeric numpy array (text). I would then use a conditional as indicated below. I would need a structure to temporarily hold the data while the function iterates through it. A `try`/`except` statement might also work (try to import into numpy, except import into pandas).
    ```
    if any data in dataset has dtype == text:
        import into pandas
    else:
        import into numpy
    ```

- You could write code that checks the data for text. If the dataset contains text, then the text is probably header info and the data should be imported as a pandas dataframe. If the dataset contains no text, then it can be imported into a numpy array, which works best with purely numeric data. Or you could cheat and use pandas for everything. 
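
One possible way to let the caller choose the data structure is an input parameter that the function checks with a conditional. The sketch below is only an illustration: the function name `import_data`, the `structure` parameter, and the demo files are all hypothetical, and the download step is left out so that the example runs without a network connection.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def import_data(fpath, structure="array"):
    """Import a .csv file into a numpy array or a pandas dataframe.

    Parameters
    ----------
    fpath : str
        Path to the .csv file.
    structure : str
        Either "array" (numpy) or "dataframe" (pandas).

    Returns
    -------
    numpy array, pandas dataframe, or None if the path does not exist.
    """
    if not os.path.exists(fpath):
        print("File not found:", fpath)
        return None

    if structure == "array":
        # Assumes a headerless file that contains only numbers
        return np.loadtxt(fpath, delimiter=",")
    elif structure == "dataframe":
        # Assumes the first row contains column headers
        return pd.read_csv(fpath)
    else:
        raise ValueError("structure must be 'array' or 'dataframe'")


# Made-up demo files written to a temporary directory
tmp_dir = tempfile.mkdtemp()
num_fpath = os.path.join(tmp_dir, "numbers.csv")
hdr_fpath = os.path.join(tmp_dir, "headers.csv")

with open(num_fpath, "w") as num_file:
    num_file.write("1.0,2.0\n3.0,4.0\n")

with open(hdr_fpath, "w") as hdr_file:
    hdr_file.write("month,precip\nJan,1.0\nFeb,2.0\n")

demo_arr = import_data(num_fpath, structure="array")
demo_df = import_data(hdr_fpath, structure="dataframe")

print(type(demo_arr), type(demo_df))
```

Asking the caller to name the structure avoids having to inspect the file contents, at the cost of requiring the caller to know what the file looks like.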