# Quiz 01

### Working with CIMIS Weather Station Data

The weather data were collected by CIMIS automatic weather stations and are available in CSV format. Station data include measured parameters such as solar radiation, air temperature, soil temperature, relative humidity, precipitation, wind speed and wind direction, as well as derived parameters such as vapor pressure, dew point temperature, and grass reference evapotranspiration (ETo).

You are given 5 years of data for a few stations in Southern California. 

Create a new blank Jupyter notebook. Follow the sequence of instructions below and answer the questions accordingly. Upload the Jupyter notebook directly to Brightspace.

1. Load the ```daily.csv``` dataset. You should know how to find where the file is located and specify the path to open it. 
2. How many unique stations are there in this dataset? 
3. Create a dataframe for each of the stations **individually** for the following fields: date, precipitation (```Precip (in)```), average air temperature (```Avg Air Temp (F)```) and average relative humidity (```Avg Rel Hum (%)```). You should have one dataframe per station with these four columns.
4. Calculate **average monthly** statistics for each of the weather variables at each station, i.e. you should have a dataframe in which each variable is summarized by its mean for each calendar month over the past 5 years: mean precipitation for all Januarys, for all Februarys, etc.
5. Write a few sentences describing differences you see between the stations and their 5-year climatologies. 
6. Create a function to convert degrees Fahrenheit to degrees Celsius. 
7. In each climatology dataframe convert ```Avg Air Temp (F)``` to ```Avg Air Temp (C)``` and display results **rounded to one decimal place**.
8. For the **daily** data at the **Santa Monica** station, count the number of days in the five-year dataset where ```Avg Air Temp (F)``` was below 50 degrees or above 100 degrees.





In [13]:
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv("daily.csv",parse_dates=['Date'])
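
Step 1 asks you to locate the file before opening it. The sketch below shows one way to check a path before handing it to `pd.read_csv`. The helper name `resolve_csv` and the `data` folder in the commented example are hypothetical and not part of the quiz; adjust them to wherever your copy of `daily.csv` actually lives.

```python
from pathlib import Path

def resolve_csv(filename, folder="."):
    """Return the absolute path to filename inside folder, or raise if it is missing."""
    path = (Path(folder) / filename).resolve()
    if not path.is_file():
        # Report the working directory to make a wrong relative path easier to spot
        raise FileNotFoundError(f"{path} not found; current directory is {Path.cwd()}")
    return path

# Example with a hypothetical location:
# data = pd.read_csv(resolve_csv("daily.csv", folder="data"), parse_dates=["Date"])
```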

In [14]:
stn_name = data['Stn Name']
unique_stns = stn_name.unique()

no_unique_stns = len(unique_stns)
print('count of unique stations: ',no_unique_stns)

count of unique stations:  3
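
The same count can also be obtained in a single call with `Series.nunique()`. The sketch below uses a small made-up frame, not rows from `daily.csv`, so the variable names `toy`, `n_stations` and `rows_per_station` are purely illustrative.

```python
import pandas as pd

# Toy stand-in for the real dataset; only the 'Stn Name' column matters here
toy = pd.DataFrame(
    {"Stn Name": ["U.C. Riverside", "Santa Monica", "Santa Monica", "Santa Barbara"]}
)

# Number of distinct station names
n_stations = toy["Stn Name"].nunique()

# How many rows each station contributes, useful for spotting gaps in a record
rows_per_station = toy["Stn Name"].value_counts()

print("count of unique stations: ", n_stations)
```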


In [15]:
unique_stns

array(['U.C. Riverside', 'Santa Monica', 'Santa Barbara'], dtype=object)

In [16]:
# Columns required for each station dataframe
cols = ['Date','Precip (in)','Avg Air Temp (F)','Avg Rel Hum (%)']

# Note: reset_index() keeps the original row labels in a new 'index' column; reset_index(drop=True) would discard them
riverside = data.loc[data['Stn Name']==unique_stns[0], cols].reset_index()
santa_monica = data.loc[data['Stn Name']==unique_stns[1], cols].reset_index()
santa_barbara = data.loc[data['Stn Name']==unique_stns[2], cols].reset_index()
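
With more than a handful of stations, writing one line per station gets repetitive. One possible alternative, sketched here on made-up rows, is to build a dictionary of dataframes with `groupby`. The station names `A` and `B` and all values are illustrative assumptions, and `reset_index(drop=True)` is used so that no extra `index` column is created.

```python
import pandas as pd

# Toy rows standing in for daily.csv (values are made up)
toy = pd.DataFrame({
    "Stn Name": ["A", "A", "B"],
    "Date": pd.to_datetime(["2017-01-01", "2017-01-02", "2017-01-01"]),
    "Precip (in)": [0.0, 0.1, 0.2],
    "Avg Air Temp (F)": [50.0, 52.0, 60.0],
    "Avg Rel Hum (%)": [70.0, 72.0, 80.0],
})
wanted = ["Date", "Precip (in)", "Avg Air Temp (F)", "Avg Rel Hum (%)"]

# One dataframe per station, keyed by station name
stations = {
    name: group[wanted].reset_index(drop=True)
    for name, group in toy.groupby("Stn Name")
}

print(stations["A"])
```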


In [17]:
riverside.head()

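| | index | Date | Precip (in) | Avg Air Temp (F) | Avg Rel Hum (%) |
| --- | --- | --- | --- | --- | --- |
| 0 | 0 | 2017-01-01 | 0.0 | 47.8 | 74.0 |
| 1 | 1 | 2017-01-02 | 0.0 | 49.6 | 75.0 |
| 2 | 2 | 2017-01-03 | 0.0 | 47.8 | 77.0 |
| 3 | 3 | 2017-01-04 | 0.0 | 50.8 | 73.0 |
| 4 | 4 | 2017-01-05 | 0.13 | 55.2 | 82.0 |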


In [18]:
# There are many ways to split the data and compute the monthly means.

# Shown here is Kendall's method, as it is closest to what I demonstrated in class.
# For this to work, the dates must be parsed when opening the file, i.e. data = pd.read_csv("daily.csv",parse_dates=['Date'])

# Add a column to each dataframe with the month
riverside['Month'] = riverside['Date'].astype(str).str.slice(start=5,stop=7)
santa_monica['Month'] = santa_monica['Date'].astype(str).str.slice(start=5,stop=7)
santa_barbara['Month'] = santa_barbara['Date'].astype(str).str.slice(start=5,stop=7)

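# Group each dataframe by the Month column and calculate monthly means over the five years into new dataframes
# numeric_only=True keeps the non-numeric Date column out of the averages on newer pandas versions
riverside_data_grouped = riverside.groupby('Month').mean(numeric_only=True)
santa_monica_data_grouped = santa_monica.groupby('Month').mean(numeric_only=True)
santa_barbara_data_grouped = santa_barbara.groupby('Month').mean(numeric_only=True)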

#Print all the grouped dataframes
print('UC Riverside Monthly Mean Data')
display(riverside_data_grouped)
print('\nSanta Monica Monthly Mean Data')
display(santa_monica_data_grouped)
print('\nSanta Barbara Monthly Mean Data')
display(santa_barbara_data_grouped)



UC Riverside Monthly Mean Data


| Month | index | Precip (in) | Avg Air Temp (F) | Avg Rel Hum (%) |
| --- | --- | --- | --- | --- |
| 1 | 927.833333 | 0.056989 | 55.309677 | 51.98913 |
| 2 | 958.497041 | 0.04284 | 55.423669 | 49.16568 |
| 3 | 987.0 | 0.048495 | 58.332258 | 54.516304 |
| 4 | 1017.5 | 0.019167 | 63.243333 | 53.922222 |
| 5 | 1048.0 | 0.007204 | 64.575806 | 62.225806 |
| 6 | 1078.5 | 0.001 | 71.857778 | 57.482955 |
| 7 | 1109.0 | 0.000968 | 77.337634 | 51.715054 |
| 8 | 1140.0 | 0.002097 | 77.943011 | 51.213115 |
| 9 | 1170.5 | 0.004944 | 74.877778 | 53.134078 |
| 10 | 1201.0 | 0.009032 | 68.051075 | 48.886486 |



Santa Monica Monthly Mean Data


| Month | index | Precip (in) | Avg Air Temp (F) | Avg Rel Hum (%) |
| --- | --- | --- | --- | --- |
| 1 | 3118.833333 | 0.071522 | 58.124457 | 57.098901 |
| 2 | 3149.497041 | 0.057101 | 56.878107 | 57.160714 |
| 3 | 3178.0 | 0.069351 | 57.880541 | 63.755814 |
| 4 | 3208.5 | 0.015444 | 60.39382 | 69.682353 |
| 5 | 3239.0 | 0.011075 | 60.998387 | 76.559783 |
| 6 | 3269.5 | 0.000452 | 64.530636 | 80.398844 |
| 7 | 3300.0 | 0.00086 | 67.666484 | 81.733333 |
| 8 | 3331.0 | 0.000323 | 69.323118 | 78.772973 |
| 9 | 3361.5 | 0.002722 | 68.965169 | 76.057471 |
| 10 | 3392.0 | 0.004892 | 66.443716 | 64.177778 |



Santa Barbara Monthly Mean Data


| Month | index | Precip (in) | Avg Air Temp (F) | Avg Rel Hum (%) |
| --- | --- | --- | --- | --- |
| 1 | 5309.833333 | 0.069839 | 55.726486 | 69.011364 |
| 2 | 5340.497041 | 0.043254 | 55.665625 | 64.202703 |
| 3 | 5369.0 | 0.091398 | 58.75 | 68.846154 |
| 4 | 5399.5 | 0.024944 | 62.35 | 72.248485 |
| 5 | 5430.0 | 0.024892 | 62.240323 | 76.832402 |
| 6 | 5460.5 | 0.016278 | 63.613143 | 80.775758 |
| 7 | 5491.0 | 0.009301 | 65.614607 | 84.127168 |
| 8 | 5522.0 | 0.006183 | 67.695161 | 82.791411 |
| 9 | 5552.5 | 0.00815 | 67.88314 | 79.716216 |
| 10 | 5583.0 | 0.022258 | 65.712022 | 65.07362 |
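
As the comment in the cell notes, there are many ways to compute the monthly means. One alternative, sketched here on made-up values rather than the quiz data, reads the month number straight from the parsed datetime column with `.dt.month` instead of slicing a string. The frame `toy` and the name `climatology` are illustrative assumptions.

```python
import pandas as pd

# Toy daily records spanning two Januarys and one February (values are made up)
toy = pd.DataFrame({
    "Date": pd.to_datetime(["2017-01-01", "2018-01-01", "2017-02-01"]),
    "Precip (in)": [0.0, 0.2, 0.1],
    "Avg Air Temp (F)": [50.0, 54.0, 56.0],
})

# Extract the month number (1-12) directly from the datetime column
toy["Month"] = toy["Date"].dt.month

# numeric_only=True leaves the Date column out of the averages
climatology = toy.groupby("Month").mean(numeric_only=True)

print(climatology)
```

Because the month is stored as an integer here, the groups sort numerically and the result can be indexed with `climatology.loc[1]` for January.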


In [19]:
def fahrenheit_to_celsius(temp_fahrenheit):
    """Converts temperature in degrees Fahrenheit to degrees Celsius"""
    temp_celsius = (temp_fahrenheit - 32)*(5/9)
    return temp_celsius
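
A quick way to gain confidence in a conversion function is to check it against known reference points. The sketch below repeats the function so that it runs on its own, then tests the freezing and boiling points of water and the value of -40, where the two scales coincide.

```python
def fahrenheit_to_celsius(temp_fahrenheit):
    """Converts temperature in degrees Fahrenheit to degrees Celsius"""
    return (temp_fahrenheit - 32) * (5 / 9)

# Compare with a small tolerance because the result is a floating-point number
assert abs(fahrenheit_to_celsius(32) - 0) < 1e-9       # water freezes
assert abs(fahrenheit_to_celsius(212) - 100) < 1e-9    # water boils
assert abs(fahrenheit_to_celsius(-40) - (-40)) < 1e-9  # scales coincide
```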

In [23]:
#Add a column for avg air temp C to each dataframe
riverside_data_grouped['Avg Air Temp (C)'] = riverside_data_grouped['Avg Air Temp (F)'].apply(fahrenheit_to_celsius).round(1)
santa_monica_data_grouped['Avg Air Temp (C)'] = santa_monica_data_grouped['Avg Air Temp (F)'].apply(fahrenheit_to_celsius).round(1)
santa_barbara_data_grouped['Avg Air Temp (C)'] = santa_barbara_data_grouped['Avg Air Temp (F)'].apply(fahrenheit_to_celsius).round(1)

# Display the grouped dataframes, starting with Riverside
riverside_data_grouped

| Month | index | Precip (in) | Avg Air Temp (F) | Avg Rel Hum (%) | Avg Air Temp (C) |
| --- | --- | --- | --- | --- | --- |
| 1 | 927.833333 | 0.056989 | 55.309677 | 51.98913 | 12.9 |
| 2 | 958.497041 | 0.04284 | 55.423669 | 49.16568 | 13.0 |
| 3 | 987.0 | 0.048495 | 58.332258 | 54.516304 | 14.6 |
| 4 | 1017.5 | 0.019167 | 63.243333 | 53.922222 | 17.4 |
| 5 | 1048.0 | 0.007204 | 64.575806 | 62.225806 | 18.1 |
| 6 | 1078.5 | 0.001 | 71.857778 | 57.482955 | 22.1 |
| 7 | 1109.0 | 0.000968 | 77.337634 | 51.715054 | 25.2 |
| 8 | 1140.0 | 0.002097 | 77.943011 | 51.213115 | 25.5 |
| 9 | 1170.5 | 0.004944 | 74.877778 | 53.134078 | 23.8 |
| 10 | 1201.0 | 0.009032 | 68.051075 | 48.886486 | 20.0 |


In [24]:
santa_monica_data_grouped

| Month | index | Precip (in) | Avg Air Temp (F) | Avg Rel Hum (%) | Avg Air Temp (C) |
| --- | --- | --- | --- | --- | --- |
| 1 | 3118.833333 | 0.071522 | 58.124457 | 57.098901 | 14.5 |
| 2 | 3149.497041 | 0.057101 | 56.878107 | 57.160714 | 13.8 |
| 3 | 3178.0 | 0.069351 | 57.880541 | 63.755814 | 14.4 |
| 4 | 3208.5 | 0.015444 | 60.39382 | 69.682353 | 15.8 |
| 5 | 3239.0 | 0.011075 | 60.998387 | 76.559783 | 16.1 |
| 6 | 3269.5 | 0.000452 | 64.530636 | 80.398844 | 18.1 |
| 7 | 3300.0 | 0.00086 | 67.666484 | 81.733333 | 19.8 |
| 8 | 3331.0 | 0.000323 | 69.323118 | 78.772973 | 20.7 |
| 9 | 3361.5 | 0.002722 | 68.965169 | 76.057471 | 20.5 |
| 10 | 3392.0 | 0.004892 | 66.443716 | 64.177778 | 19.1 |


In [25]:
santa_barbara_data_grouped

| Month | index | Precip (in) | Avg Air Temp (F) | Avg Rel Hum (%) | Avg Air Temp (C) |
| --- | --- | --- | --- | --- | --- |
| 1 | 5309.833333 | 0.069839 | 55.726486 | 69.011364 | 13.2 |
| 2 | 5340.497041 | 0.043254 | 55.665625 | 64.202703 | 13.1 |
| 3 | 5369.0 | 0.091398 | 58.75 | 68.846154 | 14.9 |
| 4 | 5399.5 | 0.024944 | 62.35 | 72.248485 | 16.9 |
| 5 | 5430.0 | 0.024892 | 62.240323 | 76.832402 | 16.8 |
| 6 | 5460.5 | 0.016278 | 63.613143 | 80.775758 | 17.6 |
| 7 | 5491.0 | 0.009301 | 65.614607 | 84.127168 | 18.7 |
| 8 | 5522.0 | 0.006183 | 67.695161 | 82.791411 | 19.8 |
| 9 | 5552.5 | 0.00815 | 67.88314 | 79.716216 | 19.9 |
| 10 | 5583.0 | 0.022258 | 65.712022 | 65.07362 | 18.7 |
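
Step 5 asks for a written comparison of the stations, and placing the climatologies side by side can make the differences easier to see. The sketch below is one possible approach: it copies only the January and July `Avg Air Temp (C)` values from the three tables above into a small frame and computes the seasonal range for each station. The names `summary` and `seasonal_range` are illustrative.

```python
import pandas as pd

# January and July mean air temperatures (C), copied from the tables above
summary = pd.DataFrame(
    {
        "U.C. Riverside": [12.9, 25.2],
        "Santa Monica": [14.5, 19.8],
        "Santa Barbara": [13.2, 18.7],
    },
    index=pd.Index([1, 7], name="Month"),
)

# Seasonal range: July mean minus January mean for each station
seasonal_range = (summary.loc[7] - summary.loc[1]).round(1)

print(seasonal_range)
```

On these values the July minus January range is roughly 12 C at U.C. Riverside and roughly 5 C at Santa Monica and Santa Barbara, which is the kind of contrast a written answer to step 5 could describe.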


In [28]:

santa_monica_count = len(santa_monica[(santa_monica['Avg Air Temp (F)'] < 50) | (santa_monica['Avg Air Temp (F)'] > 100)])
print('The number of days when the temperature was below 50 or above 100 degrees fahrenheit in Santa Monica is',santa_monica_count)


The number of days when the temperature was below 50 or above 100 degrees fahrenheit in Santa Monica is 29


In [26]:
# OR
no_of_days = 0

for i in range(len(santa_monica)):
    if santa_monica.loc[i,'Avg Air Temp (F)'] < 50 or santa_monica.loc[i,'Avg Air Temp (F)'] > 100:
        no_of_days = no_of_days+1
        
print('The number of days when the temperature was below 50 or above 100 degrees fahrenheit in Santa Monica is',no_of_days)
    

The number of days when the temperature was below 50 or above 100 degrees fahrenheit in Santa Monica is 29
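
A third option is to sum the boolean mask directly, since each `True` counts as 1. The sketch below uses made-up temperatures rather than the Santa Monica record, so `temps`, `outside` and `n_days` are illustrative names. Note that the comparisons are strict, so a day at exactly 50 or 100 degrees is not counted, and missing values compare as `False`.

```python
import pandas as pd

# Toy daily mean temperatures in degrees Fahrenheit (values are made up)
temps = pd.Series([48.0, 55.0, 101.5, 72.0, 49.9, 100.0])

# True where the day is colder than 50 F or hotter than 100 F (strict inequalities)
outside = (temps < 50) | (temps > 100)

# Summing a boolean mask counts the True entries
n_days = int(outside.sum())

print(n_days)
```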
