In [113]:
# session info
sessionInfo()

# Know your current working directory
getwd()

R version 4.0.2 (2020-06-22)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggplot2_3.3.2

loaded via a namespace (and not attached):
 [1] magrittr_1.5     tidyselect_1.1.0 munsell_0.5.0    uuid_0.1-4      
 [5] colorspace_1.4-1 R6_2.4.1         rlang_0.4.7      dplyr_1.0.2     
 [9] tools_4.0.2      grid_4.0.2       gtable_0.3.0     withr_2.2.0     
[13] htmltools_0.5.0  ellipsis_0.3.1   digest_0.6.25    tibble_3.0.3    
[17] lifecycle_0.2.0  crayon_1.3.4     IRdisplay_0.7.0  purrr_0.3.4     
[21] repr_1.1.0       base64enc_0.1-3  vctrs_0.3.2      IRkernel_1.1.1  
[25] glue_1

# Week 5, Precipitation in Boulder, CO in R    


### LEARNING GOALS

After doing this exercise, you should be able to:

- Give examples of different kinds of precipitation and recall the two that most often occur in the Boulder, CO area
- Identify the months of the year that get the highest precipitation amounts on average, in Boulder, CO
- Modify an R dataframe
- Plot, in color! (oh my!)


## BACKGROUND

> In the local weather news, every day we learn something new about what water vapor is up to in the atmosphere: we learn about rainfall, snowstorms, hail, drizzle, sleet, or if you live along the Front Range, all five in the same day. In meteorology and hydrology, these descriptions of water's condensation behavior from the atmosphere to the surface fit under the all-encompassing umbrella of precipitation (pun absolutely intended).

> To know how much precipitation to expect on any given day, scientists use various types of gauges to measure precipitation amounts. Some of the simplest devices were invented hundreds of years ago: they are literally small cylinders with vertical ticks that denote some unit of length. Below is an image of a modern rain gauge, with some visiting rain!

> Simple precipitation gauges are manually checked while fancier ones are automated.

![Rain_Gauge.jpg](Rain_Gauge.jpg)

### Precipitation in the Boulder, CO Area

- According to the National Oceanic and Atmospheric Administration (NOAA), the average annual precipitation (including rain, snow, etc.) in Boulder, CO is ~20 inches.

- The majority of precipitation falls in the __Winter__ (snow) or in the __Summer__ (rain).

__Which months have the highest precipitation, on average?__ Let's import some data to see!

In [114]:
# Read a tab-delimited text file named "avg-monthly-precip-R.txt" from the data/ folder
avg_in_ppt <- read.delim("data/avg-monthly-precip-R.txt")

# Print the Table
print(avg_in_ppt)


   Avg_Precip_IN
1           0.70
2           0.75
3           1.85
4           2.93
5           3.05
6           2.02
7           1.93
8           1.62
9           1.84
10          1.31
11          1.39
12          0.84
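
As a minimal sketch of what `read.delim()` does here, the snippet below writes the twelve values from the table above to a temporary file and reads them back. The temporary file and the name `demo_ppt` are assumptions for illustration only; the layout of the lesson's real file is assumed to be a single column with a header row.

```r
# Hypothetical stand-in for the lesson's data file: one header line, then 12 values
tmp <- tempfile(fileext = ".txt")
writeLines(c("Avg_Precip_IN", "0.70", "0.75", "1.85", "2.93", "3.05", "2.02",
             "1.93", "1.62", "1.84", "1.31", "1.39", "0.84"), tmp)

# read.delim() treats the first line as the header and tabs as separators
demo_ppt <- read.delim(tmp)

str(demo_ppt)   # structure: one numeric column with 12 observations
dim(demo_ppt)   # number of rows, then number of columns
```

Checking `str()` right after import is a quick way to confirm the column was read as numbers rather than text.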


### We need to add a column of month names for these values. We can do this using `cbind()`.

In [115]:
avg_mon_ppt = cbind(avg_in_ppt,Month=c(
    "Jan", "Feb", "Mar", "Apr", "May", "Jun", 
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" ))

print(avg_mon_ppt)

   Avg_Precip_IN Month
1           0.70   Jan
2           0.75   Feb
3           1.85   Mar
4           2.93   Apr
5           3.05   May
6           2.02   Jun
7           1.93   Jul
8           1.62   Aug
9           1.84   Sep
10          1.31   Oct
11          1.39   Nov
12          0.84   Dec
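
Typing out the twelve month abbreviations works, but base R also ships a built-in constant, `month.abb`, that holds the same strings. The sketch below is an alternative, not the lesson's method; `demo_ppt` and `demo_mon` are illustrative names, and the values are retyped from the table above so the block runs on its own.

```r
# Same 12 values as the table above, typed in so the block is self-contained
demo_ppt <- data.frame(Avg_Precip_IN = c(0.70, 0.75, 1.85, 2.93, 3.05, 2.02,
                                         1.93, 1.62, 1.84, 1.31, 1.39, 0.84))

# month.abb is a built-in character vector: "Jan", "Feb", ..., "Dec"
demo_mon <- cbind(demo_ppt, Month = month.abb)

# The new column should have one entry per row (12 here)
print(demo_mon)
```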


__It looks like April and May get the most precipitation.__

In science, the standard convention is to work in the metric system, so let's convert these values from inches to centimeters. There are 2.54 centimeters in 1 inch.
    
### Let's Walk Through An Example 

#### Follow the steps below:

> 1) There are 2.54 cm in 1 inch. Create a new dataframe by multiplying the values in `avg_in_ppt` (read from the text file) by 2.54.

In [116]:
cm = (avg_in_ppt*2.54)
print(cm)

   Avg_Precip_IN
1         1.7780
2         1.9050
3         4.6990
4         7.4422
5         7.7470
6         5.1308
7         4.9022
8         4.1148
9         4.6736
10        3.3274
11        3.5306
12        2.1336


> OOPS! The column name needs to be changed from "IN" to "CM". We'll use `colnames()`.

> `colnames()` defines the column (header) names of a matrix or dataframe.

> Since we are just interested in renaming one column, this can be done using the [] brackets.

In [117]:
# General pattern: colnames(df2)[1] <- "name" renames the first column of df2.

colnames(cm)[1] <- "Avg_Precip_CM"
print(cm)

   Avg_Precip_CM
1         1.7780
2         1.9050
3         4.6990
4         7.4422
5         7.7470
6         5.1308
7         4.9022
8         4.1148
9         4.6736
10        3.3274
11        3.5306
12        2.1336
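
Renaming by position works when you know where the column sits. As a hedged alternative, the sketch below looks the column up by its current name instead, which keeps working if columns are later reordered. `demo_cm` is an illustrative name and holds only three of the values.

```r
# Small stand-in for the converted table (first three months only)
demo_cm <- data.frame(Avg_Precip_IN = c(0.70, 0.75, 1.85) * 2.54)

# Look the column up by its current name rather than by position
colnames(demo_cm)[colnames(demo_cm) == "Avg_Precip_IN"] <- "Avg_Precip_CM"

print(colnames(demo_cm))
```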


> 2) Next, create a new data frame that combines `avg_mon_ppt` with the `cm` column. Use the `print()` function to ensure the values converted correctly.

In [118]:
avg_monthly = cbind(avg_mon_ppt,cm)
print(avg_monthly)

   Avg_Precip_IN Month Avg_Precip_CM
1           0.70   Jan        1.7780
2           0.75   Feb        1.9050
3           1.85   Mar        4.6990
4           2.93   Apr        7.4422
5           3.05   May        7.7470
6           2.02   Jun        5.1308
7           1.93   Jul        4.9022
8           1.62   Aug        4.1148
9           1.84   Sep        4.6736
10          1.31   Oct        3.3274
11          1.39   Nov        3.5306
12          0.84   Dec        2.1336
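
The convert, rename, and combine steps can also be collapsed into one assignment. The sketch below is one possible shortcut under the same 2.54 cm per inch factor; the name `demo` is illustrative, and the values are retyped from the table above.

```r
demo <- data.frame(
  Avg_Precip_IN = c(0.70, 0.75, 1.85, 2.93, 3.05, 2.02,
                    1.93, 1.62, 1.84, 1.31, 1.39, 0.84),
  Month = month.abb
)

# Add the centimeter column directly, already carrying the right name
demo$Avg_Precip_CM <- demo$Avg_Precip_IN * 2.54

# Sanity check: converting back should recover the original inches
all.equal(demo$Avg_Precip_CM / 2.54, demo$Avg_Precip_IN)

# Spot check for May: 3.05 in * 2.54 cm/in = 7.747 cm
demo$Avg_Precip_CM[demo$Month == "May"]
```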


> 3) __Reuse the "FOR PLOTTING" block of code__ from above and update the x and y axes, labels and titles with the correct terms and units.

In [127]:
# Graph monthly precipitation (cm) using blue points overlaid by a line
plot(avg_monthly$Avg_Precip_CM, type = "o", col = "blue", xaxt = "n",
     xlab = "Month", ylab = "Average precipitation (cm)",
     main = "Average Monthly Precipitation in Boulder, CO")

# Label the x axis with month names instead of the row numbers 1-12
axis(1, at = 1:12, labels = avg_monthly$Month)
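
A bar chart is another common way to show monthly precipitation in color. The sketch below uses base R's `barplot()` with the centimeter values from the tables above; the vector name `precip_cm`, the fill color, and the title text are illustrative choices, not part of the original exercise.

```r
# Centimeter values copied from the converted table above
precip_cm <- c(1.7780, 1.9050, 4.6990, 7.4422, 7.7470, 5.1308,
               4.9022, 4.1148, 4.6736, 3.3274, 3.5306, 2.1336)

# barplot() draws one bar per value; names.arg labels the bars, col sets the fill
bar_mid <- barplot(precip_cm, names.arg = month.abb, col = "steelblue",
                   xlab = "Month", ylab = "Average precipitation (cm)",
                   main = "Average Monthly Precipitation, Boulder, CO")
```

`barplot()` returns the x positions of the bar midpoints, which is handy if you later want to add text or points on top of the bars.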


__Great!__

It turns out, though, that the data we have been plotting only cover the climatology from __1971 - 2000__. A lot has happened since the year 2000, so let's plot the climatology ranging from __1971 - 2019__. Then we can see whether the averages have changed over the last 19 years. Before we dive into that, let's review:

# Brief Review, ANSWER KEY 

#### Use what you learned to answer the following questions:

__Q1)__ List five types of precipitation. Identify the two types of precipitation that are most common in the Boulder area.

__ANSWER__
> Snow, sleet, rain, hail, drizzle.
> Snow and rain are the most common in the Boulder, CO area

__Q2)__ According to the 1971 - 2000 climatology, which __three__ months receive the most precipitation?

__ANSWER__
> According to the climatology, April, May and June. June only just makes the list, at 2.02 in compared with 1.93 in for July.
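
The Q2 answer can be double-checked by sorting the table instead of reading it by eye. This is a minimal sketch using `order()` and `head()`; the names `demo`, `ranked`, and `top3` are illustrative, and the values are retyped from the 1971 - 2000 table above.

```r
demo <- data.frame(
  Avg_Precip_IN = c(0.70, 0.75, 1.85, 2.93, 3.05, 2.02,
                    1.93, 1.62, 1.84, 1.31, 1.39, 0.84),
  Month = month.abb
)

# order(..., decreasing = TRUE) gives row indices from wettest to driest
ranked <- demo[order(demo$Avg_Precip_IN, decreasing = TRUE), ]

# Keep the three wettest months
top3 <- head(ranked, 3)
print(top3)
```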


In [None]:
## Practicing your Skills

#### For these next questions, follow the instructions below:

- The csv file called `mo-precip-1971-2019-w-avg.csv` contains a list of monthly averages (12) per year (1971, 1972, etc), from 1971 - 2019. 
- The last line in the file contains the average monthly precipitation values for the _entire series,_ in our case, 1971 - 2019.

__Q3)__ Using the `dim()` function, what are the dimensions of `mo-precip-1971-2019-w-avg.csv`? Keeping in mind those dimensions, how many years are actually represented?

- _HINT: Should the last line count as an individual year?_


In [None]:
__Q3)__ __ANSWER__ 

> Dimensions are (50,12)

> 49 years are represented (1971 - 2019)
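
In R, `dim()` reports the number of rows and then the number of columns. The sketch below builds a hypothetical stand-in with the same shape as the answer (49 yearly rows plus one row of averages, 12 month columns). The numbers are random placeholders, not real precipitation data, and `yearly` and `demo_series` are illustrative names.

```r
# Hypothetical stand-in: random placeholder values, NOT real measurements
set.seed(1)
yearly <- matrix(runif(49 * 12, min = 0, max = 5), nrow = 49,
                 dimnames = list(NULL, month.abb))

# Append one extra row holding the column (monthly) averages
demo_series <- as.data.frame(rbind(yearly, colMeans(yearly)))

dim(demo_series)        # number of rows, then number of columns
nrow(demo_series) - 1   # the last row is an average, not a year
```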

If you were to `print()` the contents of the csv file `mo_precip_1971_2019`, you'd find there are a lot of rows! For this lesson, we __only__ want to work with the average of all the rows, i.e. _only the last line_. Since we are only interested in the last line, let's isolate it.

We can isolate the last line from the rest by subsetting it into a new data frame.
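
One way to pull out only the last line is to index the data frame by its row count. The sketch below uses a hypothetical stand-in for the csv contents (random placeholder values, not real data); `yearly`, `demo_series`, and `avg_row` are illustrative names.

```r
# Hypothetical stand-in for the csv contents (placeholder numbers, not real data)
set.seed(1)
yearly <- matrix(runif(49 * 12, min = 0, max = 5), nrow = 49,
                 dimnames = list(NULL, month.abb))
demo_series <- as.data.frame(rbind(yearly, colMeans(yearly)))

# nrow() gives the index of the last row; the empty column slot keeps all 12 months
avg_row <- demo_series[nrow(demo_series), ]
print(avg_row)

# tail() with n = 1 also returns the final row
tail(demo_series, 1)
```

Indexing with `nrow()` avoids hard-coding the row number, so the same line keeps working if more years are added to the file.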

### References (in order of appearance)
__Rain_Gauge_Image__ 
Four Season Tools. (2020). Link: https://www.smallfarmtools.com/-367 Accessed: 09/07/20.

__Avg_Monthly_Precip_Data__ 
National Oceanic and Atmospheric Administration. Earth Systems Research Laboratories. (2020). Link: https://psl.noaa.gov/boulder/Boulder.mm.precip.html Accessed: 09/07/20.

__Adding Columns to R DataFrame__ Link: https://www.datamentor.io/r-programming/data-frame/ Accessed: 09/26/20.

__High_Five_Husky__ 
Best Life. (2020). Link: https://bestlifeonline.com/adorable-puppy-pictures Accessed: 09/08/20.