<div><img style="float: left; padding-right: 3em;" src="https://avatars.githubusercontent.com/u/19476722" width="150" /></div>

# Earth Data Science Coding Challenge!
Before we get started, make sure to read or review the guidelines below. These will help make sure that your code is **readable** and **reproducible**. 

## Don't get **caught** by these Jupyter notebook gotchas

<img src="https://miro.medium.com/v2/resize:fit:4800/format:webp/1*o0HleR7BSe8W-pTnmucqHA.jpeg" width=300 style="padding: 1em; border-style: solid; border-color: grey;" />

  > *Image source: https://alaskausfws.medium.com/whats-big-and-brown-and-loves-salmon-e1803579ee36*

These are the most common issues that will keep you from getting started and delay your code review:

1. When you try to run some code on GitHub Codespaces, you may be prompted to select a **kernel**.
   * The **kernel** refers to the version of Python you are using
   * You should use the **base** kernel, which should be the default option. 
   * You can also use the `Select Kernel` menu in the upper right to select the **base** kernel
2. Before you commit your work, make sure it runs **reproducibly** by clicking:
   1. `Restart` (this button won't appear until you've run some code), then
   2. `Run All`

## Check your code to make sure it's clean and easy to read

<img src="https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcSO1w9WrbwbuMLN14IezH-iq2HEGwO3JDvmo5Y_hQIy7k-Xo2gZH-mP2GUIG6RFWL04X1k&usqp=CAU" height=200 />

* Format all cells prior to submitting (right click on your code).
* Use expressive names for variables so you or the reader knows what they are. 
* Use comments to explain your code -- e.g. 
  ```python
  # This is a comment, it starts with a hash sign
  ```

## Label and describe your plots

![Source: https://xkcd.com/833](https://imgs.xkcd.com/comics/convincing.png)

Make sure each plot has:
  * A title that explains where and when the data are from
  * x- and y- axis labels with **units** where appropriate
  * A legend where appropriate


## Icons: how to use this notebook
We use the following icons to let you know when you need to change something to complete the challenge:
  * &#128187; means you need to write or edit some code.
  
  * &#128214;  indicates recommended reading
  
  * &#9998; marks written responses to questions
  
  * &#127798; is an optional extra challenge
  

---

# In 2019 there were floods across the Midwestern US
![Inland floods in the Midwest - houses partially submerged in brown water](https://nationalfloodservices.com/wp-content/uploads/2020/09/midwest-flood-blog-1.png)

> Image source: <a href="https://nationalfloodservices.com/blog/the-2019-midwestern-floods-the-insidious-damage-of-inland-flooding/">National Flood Services - The 2019 Midwestern Floods</a>

From March to December 2019, large parts of the midwestern U.S. were flooded. What happened to cause this flooding? What impacts did the flooding have? Before we look at data about the flooding, we need to check out what other sources are saying about it.

&#128214; Here are some resources from different sources to get you started:
  * [The New York Times](https://www.nytimes.com/interactive/2019/09/11/us/midwest-flooding.html)
  * [National Flood Services](https://nationalfloodservices.com/blog/the-2019-midwestern-floods-the-insidious-damage-of-inland-flooding/)

&#128172; If you or someone you know has experience with this site, or 
was there during the floods, we also invite you to write about that.

## STEP 1: Set up Python

Use the cell below to add necessary **package imports** to this notebook. It's best to import everything in your very first code cell because it helps folks who are reading your code to figure out where everything comes from (mostly right now this is **you** in the future). It's *very* frustrating to try to figure out what packages need to be installed to get some code to run.

&#128214; Our friend [the PEP-8 style guide has some things to say about imports](https://peps.python.org/pep-0008/#imports). In particular - **standard library packages** should be listed at the top. These are packages that you don't need to install because they come with Python. You can check if a package is part of the standard library by searching the [Python Standard Library documentation page](https://docs.python.org/3/library/). 

&#128187; Your task:
  * **Uncomment** all the import lines below. HINT: Use the `CMD`-`/` shortcut to uncomment many lines at once.
  * Add the **library for working with DataFrames in Python** to the imports, as well as the **hvplot extension**
  * Separate the **standard library package(s)** at the top
  * Run and test your import cell to make sure everything will work

In [None]:
# import folium
# from io import BytesIO
# import requests
# import subprocess

# YOUR CODE HERE
raise NotImplementedError()

In [None]:
# RUN THIS CELL TO TEST YOUR CODE - DO NOT MODIFY!
import_pts = 0

# Check that pandas has been imported properly
try:
    pd.DataFrame()
    import_pts += 1
    print('\u2705 Great work! '
          'You correctly imported the pandas library.')
except:
    print('\u274C Oops - pandas was not imported correctly.')
    
# Check that hvplot has been imported
try:
    pd.DataFrame().hvplot
    import_pts += 1
    print('\u2705 Great work! '
          'You correctly imported the hvplot.pandas library.')
except:
    print('\u274C Oops - hvplot.pandas was not imported correctly.')

# Subtract one point for any PEP-8 errors
tmp_path = "tmp.py"
with open(tmp_path, "w") as tmp_file:
    tmp_file.write(In[-2])
ignore_flake8 = 'W292,F401,E302'
flake8_out = subprocess.run(
    ['flake8', 
     '--ignore', ignore_flake8, 
     '--import-order-style', 'edited',
     '--count', 
     tmp_path],
    stdout=subprocess.PIPE,
).stdout.decode("ascii")
print(flake8_out)
import_pts -= int(flake8_out.splitlines()[-1])

print(
    "\n \u27A1 You received {} out of 2 points.".format(import_pts)
)

import_pts

## STEP 2: SITE MAP AND DESCRIPTION


### Near Omaha, NE, roads were flooded and levees overflowed

![The Platte and Missouri rivers overflow their banks. Before and after images show a large extent of land outside the usual riverbanks flooded.](https://nationalfloodservices.com/wp-content/uploads/2020/09/Historic_floods_have_inundated_Nebraska_40463013783.jpg)

> Image Source: [National Flood Services](https://nationalfloodservices.com/blog/the-2019-midwestern-floods-the-insidious-damage-of-inland-flooding/)

To start, you'll be focusing on the Missouri River at Omaha, a few miles north of the junction of the Platte and Missouri Rivers shown in the image above. Then, you'll pick your own site that was affected by a flood.

### Site Description

&#9998; In the cell below, describe the Omaha area and/or the Missouri River in a few sentences. 
You can include:
  * Information about the **climatology** of the area, or typical 
  precipitation and temperature at different months of the year
  * You don't need to include a **runoff ratio** (average annual runoff divided by average annual precipitation) for a river this large, since it would be hard to find and not as meaningful. If you are working with small watersheds you should definitely report this number. Check out the [GAGES II dataset](https://water.usgs.gov/GIS/dsdl/basinchar_and_report_sept_2011.zip)
  * Which **wildlife and ecosystems** exist in the area
  * What **communities and infrastructure** are in the area

WRITE YOUR SITE DESCRIPTION HERE

### Site Map: The Missouri River near Omaha

The code below will create an interactive map of the area using the **folium**
library. But something is wrong - no one defined the names latitude and longitude.

&#128187; Your task:
  * Find the location of the Missouri River near Omaha **USGS stream gauge** using the [National Water Information System](https://waterdata.usgs.gov/nwis?). This is not the easiest thing to find if you aren't used to NWIS, so you can use the following instructions to get started:
      * Go to the [National Water Information System Mapper](https://dashboard.waterdata.usgs.gov/app/nwd/en/)
      * Type in `Omaha` in the `Find a Place` box and pick the one in Nebraska (NE)
      * Click on the Missouri River near Omaha site. It should open a new window.
      * Click on `Site page` at the top.
      * Scroll to the bottom and open the `Location metadata` section. There you will find the latitude and longitude as decimal values.
  * Define latitude and longitude variables to **match the variable names 
    used in the code**.
  * Change the current label, "Thingy", to be descriptive of the site.
  * Run and test your cell to make sure everything works.

&#127798; EXTRA CHALLENGE: Customize your folium plot [using the folium documentation](https://python-visualization.github.io/folium/quickstart.html#Getting-Started). For example, you could:
  * Change the base map images
  * Change the initial zoom

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

# Initialize map and tweak settings
m = folium.Map(
    # Location to display
    location=(sg_lat, sg_lon),
    # Turns off annoying zooming while trying to scroll to the next cell
    scrollWheelZoom=False)

# Put a marker at the stream gauge location
folium.Marker([sg_lat, sg_lon], popup="Thingy").add_to(m)

# Display the map
m

## STEP 3: DOWNLOAD TIME SERIES DATA

### One way to express the size of a flood is to estimate how often larger floods occur.

For example, you might have heard news media talking about a "100-year flood". 

Next, you will write Python code to download and work with a **time series** of streamflow data during the flooding on the Missouri River.

> A **time series** of data is taken at the same location but collected regularly or semi-regularly over time. 

You will then consider how the values during the flood event compare to previous years by computing the flood's **return period**.

> A **return period** is an estimate of how often you might expect to see a flood of at least a particular size. This does *NOT* mean an extreme flood "has" to occur within the return period, or that it couldn't occur more than once.
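
To see why a return period is not a schedule, it can help to work out the chance of seeing at least one such flood over a number of years. The sketch below assumes that each year is independent and that the annual chance of exceedance is 1 divided by the return period. The function name `chance_of_flood` is made up for this illustration.

```python
def chance_of_flood(return_period_years, n_years):
    """Chance of at least one exceedance in n_years (independent years)."""
    annual_probability = 1 / return_period_years
    return 1 - (1 - annual_probability) ** n_years


# Chance of at least one "100-year flood" in a 100-year window
print(round(chance_of_flood(100, 100), 2))  # 0.63
# Chance of at least one "100-year flood" in a 30-year window
print(round(chance_of_flood(100, 30), 2))  # 0.26
```

Under these assumptions, a "100-year flood" is far from guaranteed within 100 years, and it is not rare to see one within a few decades.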

&#128214; Here are some resources from your textbook you can review to learn more:
  * [Introduction to time-series data](https://www.earthdatascience.org/courses/use-data-open-source-python/use-time-series-data-in-python/)
  * [Flood return period and probability](https://www.earthdatascience.org/courses/use-data-open-source-python/use-time-series-data-in-python/floods-return-period-and-probability/)

&#9998; In the cell below, explain what data you will need to complete this analysis, including:
  1. What type or types of data do you need?
  2. How many years of data do you think you need to compute the return period of an extreme event like the 2019 Omaha floods?

YOUR ANSWER HERE

### US streamflow data are available from the National Water Information System (NWIS) 

&#128187; Practice downloading the data you need using the NWIS website. **You will not use your downloaded data in the analysis, but you must follow these steps to get the correct urls.** In the cell below, use the following instructions to get urls for downloading the USGS data:

1. Go back to the Missouri River near Omaha station page.
2. This time, click `Data` instead of `Site Page`
3. Select `Daily Data` from the list of datasets.
4. Select the date range from October 1, 1991 to September 30, 2022, set your results to be as `Tab-separated`, and press `Go`.
    > NOTE: For hydrologic data, we often use the **Water Year**, which starts on October 1 of the previous calendar year in order to capture the full snow season. In this case, we are downloading WY1992-WY2022.
5. Copy the url that populates in your browser window and paste it below. You don't need to save the data - we will do that using Python.
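
The water year rule in the note above can be sketched in a few lines of Python. This assumes the convention described there (a water year runs from October 1 through September 30 and is named for the year in which it ends). The function name `water_year` is made up for this illustration.

```python
from datetime import date


def water_year(day):
    """Return the water year (October 1 to September 30) for a date."""
    # October, November, and December belong to the next water year
    if day.month >= 10:
        return day.year + 1
    return day.year


print(water_year(date(1991, 10, 1)))  # 1992
print(water_year(date(2022, 9, 30)))  # 2022
```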
    

&#9998; USGS streamflow URL: *url here*

#### Exploring the NWIS API

One way to access data is through an **Application Programming Interface**, or **API**. The URL you've just found is an example of a simple, public API. All the parameters of your data search are visible in the URL. For example, to get data starting in 2015, we could change `begin_date=1991-10-01` to `begin_date=2015-01-01`.
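
As a rough sketch of how search parameters live inside a URL, the code below takes a URL apart and changes one parameter using the standard library. The URL is made up for illustration (it is not the NWIS URL, and `example.com` is not a data service), so only the idea carries over to your own URL.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# Made-up URL for illustration only; not a real data service
example_url = (
    'https://example.com/data'
    '?format=tab&begin_date=1991-10-01&end_date=2022-09-30')

# Split the URL into its parts, then read the query parameters
url_parts = urlsplit(example_url)
query_params = parse_qs(url_parts.query)
print(query_params['begin_date'])  # ['1991-10-01']

# Change one parameter and put the URL back together
query_params['begin_date'] = ['2015-01-01']
new_query = urlencode(query_params, doseq=True)
new_url = urlunsplit(url_parts._replace(query=new_query))
print(new_url)
```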

 &#9998; In the cell below - what parameter would you change in the USGS url if you wanted to switch locations?


WRITE YOUR ANSWER HERE

### Data description and citation

&#9998; In the cell below, describe your data. Include the following information:
  1. A 1-2 sentence description of the data
  2. Data citation
  3. What are the units?
  4. What is the time interval for each data point?
  5. Is there a "no data" value, or a value used to indicate when the sensor was broken or didn't detect anything? (These are also known as NA, N/A, NaN, nan, or nodata values)

&#128214; The [NWIS data format page](https://waterdata.usgs.gov/nwis/?tab_delimited_format_info) might be helpful.

WRITE YOUR DATA DESCRIPTION AND CITATION HERE

### Download the data

In the cell below, complete the following tasks:

1. Use [the code suggested by ChatGPT for how to request a URL over HyperText Transfer Protocol (HTTP) in Python](https://chat.openai.com/share/ca03fcfd-1264-41a1-a19e-05dd57a842ce) as a starting point for your code.
2. Replace the url in the code below with the USGS NWIS URL you found, and change the variable names to something descriptive.
   > HINT: URLs are a type of Python object called a `string`. Make sure you put quotes around your URL so that Python knows how to interpret it!
3. Call the response object at the end of the cell.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
ans_req = _
req_pts = 0

if ans_req.ok:
    print('\u2705 Great work! Your download succeeded')
    req_pts += 4
else:
    print('\u274C Hmm, looks like your url is not correct')
    
# Subtract one point for any PEP-8 errors
tmp_path = "tmp.py"
with open(tmp_path, "w") as tmp_file:
    tmp_file.write(In[-2])
ignore_flake8 = 'W292,F401,E302,F821'
flake8_out = subprocess.run(
    ['flake8', 
     '--ignore', ignore_flake8, 
     '--import-order-style', 'edited',
     '--count', 
     tmp_path],
    stdout=subprocess.PIPE,
).stdout.decode("ascii")
print(flake8_out)
req_pts -= int(flake8_out.splitlines()[-1])

print('\u27A1 You earned {} of 4 points for downloading data'.format(req_pts))

req_pts

## STEP 4: CLEAN UP

### You will need to take a look at the raw downloaded data to figure out what import parameters to use with the pandas read_csv() function

&#128187; In the cell below, replace `nwis_response` with the name of the response variable that you defined above.

The code below prints the first 10 lines of your download and numbers them. Does this look like streamflow data to you?

In [None]:
# Print the top of the data
for i, line in enumerate(nwis_response.content.splitlines()[:10]):
    print(i, line)

In the [NWIS documentation](https://waterdata.usgs.gov/nwis/?tab_delimited_format_info), they say that you can ignore lines that start with a hash sign (#) because they are **commented**. When we use pandas to import the data, we'll be able to tell it what character indicates a comment, but we're not there yet. The code below again prints the first 35 lines of the response content, this time skipping all commented lines. 

&#128187; In the cell below, replace `nwis_response` with the name of the response variable that you defined above. Then run the code.

In [None]:
# Take a look at the data. What got downloaded?
for i, line in enumerate(nwis_response.content.splitlines()[:35]):
    # Skip commented lines
    if not line.startswith(b'#'):
        print(i, line)

&#9998; What do you notice about the data now? In the following cell, write down your thoughts on:
  * What separator or **delimiter** does the data use to separate columns?
  * What should the data types of each column be?
  * Which column contains the streamflow data?
  * Do you need to skip any rows that don't contain data?
  * Which column do you think makes sense as the **index** (unique identifier) for each row?
  * Is there anything else strange?

The answers to the questions above will help you figure out what parameters to use with the `pd.read_csv()` function.

WRITE YOUR OBSERVATIONS ABOUT THE UNCLEANED DATA HERE

### Now we're ready to import the data with pandas. 

Notice that when you print your downloaded data, each line has a `b` in front of it. The `b` stands for "bytes". `pandas` expects a file name or a file-like object rather than raw bytes, so we need to wrap the downloaded content before reading it. In the cell below, we do this using `io.BytesIO`, which makes the bytes behave like an open binary file that `pandas` can read and decode for us.
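
Here is a small sketch of the difference between bytes and regular strings, using only the standard library. The bytes in `toy_bytes` are made up for illustration and are much shorter than a real download.

```python
from io import BytesIO

# Made-up bytes for illustration; note the b in front of the quotes
toy_bytes = b'# a comment\nday\tdepth_cm\n'

# bytes and str are different types, so compare bytes with bytes
first_line = toy_bytes.splitlines()[0]
print(first_line.startswith(b'#'))  # True

# Decoding turns bytes into a regular string
print(first_line.decode('utf-8'))  # # a comment

# BytesIO wraps the bytes so they can be read like an open file
toy_file = BytesIO(toy_bytes)
print(toy_file.readline())  # b'# a comment\n'
```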

&#128187; Your task:

Paste the following code in the cell below:
```python
pd.read_csv(
    BytesIO(nwis_response.content),
    # comment='',
    # delimiter='', 
    # skiprows=[],
    # names=[],
    # index_col='',
    # parse_dates=True,
)
```

Then:
  1. Replace `nwis_response` with the name of your HTTP Response variable
  2. Uncomment the code below, **one line at a time**, running the code and making corrections in-between.
  3. Using the observations you made above, add the necessary values to get `pandas` to correctly import the data.
  4. Make sure to include units in your column names where applicable! What units are these streamflow measurements?
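
If it is unclear what each parameter does, the sketch below tries them on a tiny made-up table. The text, separator, and column names are invented for illustration and differ from the streamflow download on purpose, so the values here will not work as-is for your data. It assumes `pandas` is installed.

```python
from io import StringIO

import pandas as pd

# Made-up data for illustration; your download will look different
toy_text = (
    '# This line is a comment\n'
    'station;day;depth_cm\n'
    '4s;10d;6n\n'
    'A1;2019-03-01;12.5\n'
    'A1;2019-03-02;14.0\n')

toy_df = pd.read_csv(
    StringIO(toy_text),
    comment='#',
    delimiter=';',
    # Row numbers count from 0 and include commented lines
    skiprows=[2],
    index_col='day',
    parse_dates=True,
)
toy_df
```

Adding one parameter at a time and re-running makes it easier to see which change fixed which problem.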

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
ans_q = _
q_points = 0

if isinstance(ans_q, pd.DataFrame):
    print("\u2705 Great, you created a pandas dataframe above")
    q_points += 1
else:
    print("\u274C Oops - the cell above should have a DataFrame output.")

if type(ans_q.index) == pd.DatetimeIndex:
    print("\u2705 Your DataFrame has the date as the index, "
          "good job!")
    q_points += 2
else:
    print("\u274C Your DataFrame does not have the date "
          "as the index.")

if len(ans_q) == 11323:
    print("\u2705 Your DataFrame is the right length!")
    q_points += 2
else:
    print("\u274C Check your date range.")
    
if round(ans_q.iloc[:,2].mean(), 0)==38402.0:
    print("\u2705 Your streamflow DataFrame has the expected values "
          "in it, good job!")
    q_points += 2
else:
    print("\u274C Your streamflow DataFrame does not have the "
          "expected values in it.")

# Subtract one point for any PEP-8 errors
tmp_path = "tmp.py"
with open(tmp_path, "w") as tmp_file:
    tmp_file.write(In[-2])
ignore_flake8 = 'W292,F401,E302,F821'
flake8_out = subprocess.run(
    ['flake8', 
     '--ignore', ignore_flake8, 
     '--import-order-style', 'edited',
     '--count', 
     tmp_path],
    stdout=subprocess.PIPE,
).stdout.decode("ascii")
print(flake8_out)
q_points -= int(flake8_out.splitlines()[-1])

print("\u27A1 You received {} out of 7 points for opening the "
      "streamflow data.".format(
    q_points))
q_points

Let's check your data. A useful method for looking at the **datatypes** in your `pd.DataFrame` is the `pd.DataFrame.info()` method.

> In Python, you will see both **methods** and **functions**. This is an *important and tricky* distinction we'll be talking about a lot. For right now -- functions have all of their arguments/parameters **inside** the parentheses, as in `pd.read_csv(args)`. For **methods**, the first argument is always some kind of Python **object** like a `pd.DataFrame`. Take a look at the next cell for an example of using the `pd.DataFrame.info()` **method**.
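
As a quick illustration of the difference, here is a sketch with a plain Python string instead of a `DataFrame`. The variable name is made up for the example.

```python
river_name = 'missouri river'

# A function: everything it needs goes inside the parentheses
print(len(river_name))  # 14

# A method: it comes after an object and a dot
print(river_name.title())  # Missouri River
```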


&#128187;  Replace `dataframe` with the name of your DataFrame variable

In [None]:
dataframe.info()

## STEP 5: PLOT THE DATA

Oops, we have one more problem! Take a look at the data types of your DataFrame columns...

✎ In the cell below, write down what data type you would expect the streamflow column to be. The main options are: Integer, Float, Datetime, or Object.

📖 Check out [this example showing the most common data types for pandas columns](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.dtypes.html)

> A **float** is a number stored with a fractional part. You can identify floats in Python because they are written with a decimal point, unlike integers. We do not call them decimals for a reason: a `decimal.Decimal` is a different type that represents decimal values exactly, while a float is a binary approximation. If you ever need exact decimal arithmetic you may need to use decimals, but for most applications floats are fine.
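
The difference between floats and decimals is easiest to see with a small sum. This sketch uses only the standard library.

```python
from decimal import Decimal

# Floats are stored in binary, so some sums are slightly off
print(0.1 + 0.2)  # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False

# Decimals built from strings keep the exact decimal value
print(Decimal('0.1') + Decimal('0.2') == Decimal('0.3'))  # True
```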

### Can we see the flood in the streamflow data?

In the cell below, subset the stream discharge data to the same timeframe that you are interested in: February - April, 2019. Save the result to a variable and call it at the end of the cell for testing.

You can find some [examples of subsetting time series data in the textbook](https://www.earthdatascience.org/courses/use-data-open-source-python/use-time-series-data-in-python/date-time-types-in-pandas-python/subset-time-series-data-python/).
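
As a minimal sketch of one way to subset by date, the code below slices a made-up daily `DataFrame` with `.loc` and partial date strings. The data, dates, and variable names are invented for illustration, and it assumes `pandas` is installed and that the index is a `DatetimeIndex`.

```python
import pandas as pd

# Made-up daily data for illustration
toy_dates = pd.date_range('2020-01-01', '2020-12-31', freq='D')
toy_df = pd.DataFrame(
    {'depth_cm': range(len(toy_dates))}, index=toy_dates)

# Select everything from the start of June to the end of July
toy_summer_df = toy_df.loc['2020-06':'2020-07']
print(len(toy_summer_df))  # 61
```

Unlike most Python slices, a date slice with `.loc` includes the end of the range.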

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
ans_subset = _
subset_points = 0

# Answer should be a DataFrame
if isinstance(ans_subset, pd.DataFrame):
    print("\u2705 Great, you created a pandas dataframe above")
    subset_points += 1
else:
    print("\u274C Oops - the cell above should have a DataFrame output.")

# Answer should include 731 days of data
if len(ans_subset)==731:
    print("\u2705 Your DataFrame has the right number of days")
    subset_points += 2
elif len(ans_subset) >731:
    print("\u274C Your subset has too many days.")
else:
    print("\u274C Your subset has too few days.")

# The mean of the streamflow column should be 67960
if round(ans_subset.iloc[:,2].mean(), 0)==67960.0:
    print("\u2705 Your streamflow DataFrame has the expected values "
          "in it, good job!")
    subset_points += 2
else:
    print("\u274C Your streamflow DataFrame does not have the "
          "expected values in it.")

# Subtract one point for any PEP-8 errors
tmp_path = "tmp.py"
with open(tmp_path, "w") as tmp_file:
    tmp_file.write(In[-2])
ignore_flake8 = 'W292,F401,E302,F821'
flake8_out = subprocess.run(
    ['flake8', 
     '--ignore', ignore_flake8, 
     '--import-order-style', 'edited',
     '--count', 
     tmp_path],
    stdout=subprocess.PIPE,
).stdout.decode("ascii")
print(flake8_out)
subset_points -= int(flake8_out.splitlines()[-1])
    
print("\u27A1 You received {} out of 5 points for subsetting the "
      "streamflow data.".format(
    subset_points))

subset_points

&#128187; Now, in the cell below, plot your subsetted data. Don't forget to label your plot!


In [None]:
# YOUR CODE HERE
raise NotImplementedError()

You should be able to see the flood in your data going up above 1.88 million cfs at its peak in March 2019, as compared to the following year where the peak flow was 1.56 million cfs. But how unusual is that really?

Let's start by plotting ALL the data. Then you can (optionally) use a return period **statistic** to quantify how unusual it was.

&#128187; In the cell below, plot the entire time series of streamflow data, without any parameters.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

This plot looks a little fuzzy because it is trying to fit too many data points in a small area. One way to improve this is by **resampling** the data to **annual maxima**. That way we still get the same peak streamflows, but the computer will be able to plot all the values without overlapping.

> **Resampling** means changing the time interval between time series observations - in this case from daily to annual.

&#128214; Read about [different ways to resample time series data in your textbook](https://www.earthdatascience.org/courses/use-data-open-source-python/use-time-series-data-in-python/date-time-types-in-pandas-python/resample-time-series-data-pandas-python/)

&#128214; You can use a [list of **offset aliases**](https://pandas.pydata.org/docs/dev/user_guide/timeseries.html#timeseries-offset-aliases) to look up how to specify the final dates. This list is pretty hard to find - you might want to bookmark it.

&#128187; In the cell below, select the streamflow column, and then resample it to get an annual maximum.

> GOTCHA ALERT - the test below is looking for a pandas `DataFrame`, but when we select a single column we get a pandas `Series` (a `DataFrame` is a collection of `Series` where each column is one `Series`.) To get a `DataFrame` with a single column, use the syntax below with **two** square brackets:

```python
dataframe[['column_name']]
```
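
The sketch below puts the two ideas together on made-up data: it compares one and two pairs of square brackets, then resamples to a monthly maximum. The data and names are invented for illustration, it assumes `pandas` is installed, and it uses a monthly offset alias on purpose, so you will still need to look up the alias for annual data.

```python
import pandas as pd

# Made-up daily data spanning two calendar years
toy_dates = pd.date_range('2020-01-01', '2021-12-31', freq='D')
toy_df = pd.DataFrame(
    {'depth_cm': range(len(toy_dates))}, index=toy_dates)

# One pair of brackets gives a Series; two pairs give a DataFrame
print(type(toy_df['depth_cm']).__name__)  # Series
print(type(toy_df[['depth_cm']]).__name__)  # DataFrame

# 'MS' is the offset alias for month start
toy_monthly_max_df = toy_df[['depth_cm']].resample('MS').max()
print(len(toy_monthly_max_df))  # 24
```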

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
ans_resample = _
resample_points = 0

# Answer should be a DataFrame
if isinstance(ans_resample, pd.DataFrame):
    print("\u2705 Great, you created a pandas DataFrame above")
    resample_points += 1
else:
    print("\u274C Oops - the cell above should have a DataFrame output.")

# Answer should include 32 years of data
if len(ans_resample)==32:
    print("\u2705 Your DataFrame has the right number of years")
    resample_points += 2
else:
    print("\u274C Oops - did you resample your DataFrame to annual?")

# The mean of the streamflow Series should be 75378
if round(int(ans_resample.mean().iloc[0]), 0)==75378:
    print("\u2705 Your annual max streamflow DataFrame has the expected "
          "values in it, good job!")
    resample_points += 2
else:
    print("\u274C Your annual max streamflow DataFrame does not have the "
          "expected values in it.")

print("\u27A1 You received {} out of 5 points for resampling the "
      "streamflow data.".format(
    resample_points))
resample_points

&#128187; Plot your resampled data.

> HINT: use the `rot=45` parameter to rotate the x-axis labels if you can't see them. You could also consider extracting the year only in your `DataFrame` before plotting.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In the cell below, write a headline and 2-3 sentence description of your plot. What do you estimate the return period was for the flood in 2019?

### WRITE YOUR HEADLINE HERE
WRITE YOUR DESCRIPTION HERE.

> NOTE: You may be thinking...this really isn't enough years to get a good return period value, and you would be right! It turns out that the USGS also has **peak streamflow** values for years going back to the 1800s in Omaha. For your own analysis, one option would be to analyse the peak streamflow values instead, and/or compare them to the ones you computed in the years you have data. **GOTCHA ALERT** - the peak streamflow values from the USGS will be higher, because the daily data is a daily **average** rather than an instantaneous **maximum** like the peak flow.

## STEP 6 (OPTIONAL): CALCULATE THE FLOOD RETURN PERIOD 

&#127798; EXTRA CHALLENGE: In the cell below, calculate the exceedance probability and return period for each year of the **annual** data, and add them as columns to your DataFrame.

> HINT: pandas columns have a `rank` method, which you can use. BUT you will need to use the `ascending=False` parameter, so that the largest flow gets rank 1 and therefore the lowest exceedance probability.
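
The arithmetic can be sketched in plain Python before you try it with `pandas`. This example assumes the common Weibull plotting position, where the exceedance probability is the rank divided by the number of years plus one, and the return period is 1 divided by that probability. Check your textbook for the formula your course uses. The values and names below are made up for illustration.

```python
# Made-up annual maxima for illustration
toy_annual_max = {2001: 40, 2002: 90, 2003: 60, 2004: 50}

# Sort from largest to smallest so the largest value gets rank 1
sorted_values = sorted(toy_annual_max.values(), reverse=True)
n_years = len(toy_annual_max)

toy_return_periods = {}
for year, value in toy_annual_max.items():
    rank = sorted_values.index(value) + 1
    # Weibull plotting position (an assumption, see above)
    exceedance_probability = rank / (n_years + 1)
    toy_return_periods[year] = 1 / exceedance_probability

# The largest value has the longest return period
print(toy_return_periods[2002])  # 5.0
print(toy_return_periods[2001])  # 1.25
```

With only four years of data the longest return period you can estimate this way is five years, which is one reason a long record matters.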

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
ans_return = _
return_points = 0

# Answer should be a DataFrame
if isinstance(ans_return, pd.DataFrame):
    print("\u2705 Great, you created a pandas dataframe above")
    return_points += 1
else:
    print("\u274C Oops - the cell above should have a DataFrame output.")

# Answer should include 32 years of data
if len(ans_return)==32:
    print("\u2705 Your DataFrame has the right number of years")
    return_points += 2
elif len(ans_return) > 32:
    print("\u274C Your DataFrame has too many years.")
else:
    print("\u274C Your DataFrame has too few years.")

# The value "hash" should be 157525.0
if round(ans_return.mean().product(), 0)==157525.0:
    print("\u2705 Your streamflow DataFrame has the expected values "
          "in it, good job!")
    return_points += 2
else:
    print("\u274C Your streamflow DataFrame does not have the "
          "expected values in it.")
    
# Subtract one point for any PEP-8 errors
tmp_path = "tmp.py"
with open(tmp_path, "w") as tmp_file:
    tmp_file.write(In[-2])
ignore_flake8 = 'W292,F401,E302,F821,W503'
flake8_out = subprocess.run(
    ['flake8', 
     '--ignore', ignore_flake8, 
     '--import-order-style', 'edited',
     '--count', 
     tmp_path],
    stdout=subprocess.PIPE,
).stdout.decode("ascii")
print(flake8_out)
return_points -= int(flake8_out.splitlines()[-1])

print("\u27A1 You received {} out of 5 extra credit points for calculating "
      "return periods.".format(return_points))
return_points