## 3.2 Additional practice
The code below imports the CSV file `Module3_2_dayData.csv` as a pandas DataFrame. It contains daily Ethiopian Meteorology Institute (EMI) data for several stations and elements. With the tools you have learned, you can select parts of this DataFrame.

In [None]:
import pandas as pd
dayData = pd.read_csv('Module3_2_dayData.csv')
dayData

With Boolean slicing, we can filter the DataFrame. For example, to get all TMPMAX values for the station Gondar A.P., you can use the code below.

In [None]:
dayData.loc[(dayData.station == 'Gondar A.P.') & (dayData.element == 'TMPMAX')]
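
To see what happens inside the square brackets, here is a minimal sketch on a small made-up table. The name `toy` and its values are assumptions for illustration only; just the column names follow `dayData`. Each comparison produces a column of True/False values, and `.loc` keeps the rows where the combined result is True.

```python
import pandas as pd

# Small made-up table with the same column names as dayData (values are illustrative only)
toy = pd.DataFrame({
    'station': ['Gondar A.P.', 'Gondar A.P.', 'Shembekit', 'Shembekit'],
    'element': ['TMPMAX', 'PRECIP', 'TMPMAX', 'PRECIP'],
    'value': [27.5, 0.0, 25.1, 3.2],
})

# Each comparison gives a Series of True/False values, one per row
is_gondar = toy.station == 'Gondar A.P.'
is_tmpmax = toy.element == 'TMPMAX'

# & keeps a row only when both conditions are True for that row
mask = is_gondar & is_tmpmax
print(mask.tolist())

# .loc with a Boolean Series returns only the rows where the mask is True
selected = toy.loc[mask]
print(selected)
```

Storing the conditions under their own names, as done here, can make long selections easier to read.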

In [None]:
# Select all rows for station Shembekit
dayData.loc[(dayData.station == 'Shembekit')]

In [None]:
# Select all rows for station Ambagiorgis Sch, for the year 1972 only
dayData.loc[(dayData.station == 'Ambagiorgis Sch') & (dayData.year == 1972)]

NumPy has several mathematical functions. You can pass any part of the DataFrame to those functions. For example, to calculate the sum of PRECIP for station Shembekit in 2015, we first select that data, and then take the sum with `np.sum`.

In [None]:
import numpy as np
# Select only Shembekit, 2015, PRECIP, and save it under the name precip_sh_2015
precip_sh_2015 = dayData.loc[(dayData.station == 'Shembekit') & (dayData.year == 2015) & (dayData.element == 'PRECIP')] 

# Select only column 'value', and take the sum of this column
precip_sum = np.sum(precip_sh_2015.value) 

print(precip_sum)
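
As a side note, a pandas column also has its own methods such as `.sum()`, `.mean()` and `.max()`. The sketch below uses made-up values (the name `values` is hypothetical) to show that, for this simple case, the NumPy function and the pandas method give the same total.

```python
import numpy as np
import pandas as pd

# Made-up daily precipitation values, for illustration only
values = pd.Series([0.0, 3.2, 12.4, 1.1])

total_np = np.sum(values)   # NumPy function applied to the column
total_pd = values.sum()     # equivalent pandas method

print(total_np, total_pd)
```

Which form to use is mostly a matter of taste; this notebook uses the NumPy functions.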

In [None]:
# What is the average TMPMAX for Gondar A.P.?
tempmax_gondar = dayData.loc[(dayData.station == 'Gondar A.P.') & (dayData.element == 'TMPMAX')]

tempmax_avg = np.mean(tempmax_gondar.value)

print(tempmax_avg)

In [None]:
# What is the maximum SUNHRS for Gondar A.P. in the month August, 2012?
sun_gondar_aug2012 = dayData.loc[(dayData.station == 'Gondar A.P.') & (dayData.element == 'SUNHRS') & 
                                 (dayData.year == 2012) & (dayData.month == 8)]

sunmax_gondar_aug2012 = np.max(sun_gondar_aug2012.value)

print(sunmax_gondar_aug2012)

In [None]:
# What are the minimum and maximum years for station Aymba?
aymba_data = dayData.loc[(dayData.station == 'Aymba')]

aymba_minyear = np.min(aymba_data.year)
aymba_maxyear = np.max(aymba_data.year)

print('Minimum year of station Aymba:', aymba_minyear)
print('Maximum year of station Aymba:', aymba_maxyear)

We know how to combine two conditions using the **&** symbol. For example, when we want only the rows from station Shembekit of the year 2015, we can use the following code:
```python
dayData.loc[(dayData.station == 'Shembekit') & (dayData.year == 2015)]
```

But if we now want to select all rows where the year is either 2015 OR 2016, we cannot use the **&** symbol, because no row has both years at the same time. Instead, we use the **|** symbol, which means OR, like this:

```python
dayData.loc[(dayData.year == 2015) | (dayData.year == 2016)]
```

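If we now want to combine this with selecting a station, we need to put the selection of years within its own pair of parentheses. Python evaluates **&** before **|**, so without these parentheses the station condition would only be combined with the first year. The correct selection looks like this: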

```python
dayData.loc[(dayData.station == 'Shembekit') & ((dayData.year == 2015) | (dayData.year == 2016))]
```
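
The effect of the extra parentheses can be checked on a small made-up table. This is a sketch only: `toy` and its values are assumptions for illustration. It also shows `isin`, a pandas method that tests whether each value is in a list, as one possible alternative to chaining several **|** conditions.

```python
import pandas as pd

# Made-up table: two stations, two years each (illustrative only)
toy = pd.DataFrame({
    'station': ['Shembekit', 'Shembekit', 'Aymba', 'Aymba'],
    'year': [2015, 2016, 2015, 2016],
})

is_sh = toy.station == 'Shembekit'
is_2015 = toy.year == 2015
is_2016 = toy.year == 2016

# Years grouped in parentheses: Shembekit AND (2015 OR 2016)
with_parens = toy.loc[is_sh & (is_2015 | is_2016)]

# Without the grouping, Python reads this as (Shembekit AND 2015) OR 2016,
# so the 2016 row of Aymba is selected as well
without_parens = toy.loc[is_sh & is_2015 | is_2016]

# isin checks each year against a list of allowed values
using_isin = toy.loc[is_sh & toy.year.isin([2015, 2016])]

print(len(with_parens), len(without_parens), len(using_isin))
```

With more than two or three years, the `isin` form is usually shorter than a long chain of **|** conditions.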


In [None]:
# Select all rows with years 2011 or 2012
dayData.loc[(dayData.year == 2011) | (dayData.year == 2012)]

In [None]:
# Select all rows of station Aymba for the years 2009 or 2010
dayData.loc[(dayData.station == 'Aymba') & ((dayData.year == 2009) | (dayData.year == 2010))]

In [None]:
# Advanced: Of station Maksegnit, select the values where element is PRECIP for year 2000 or 2001
dayData.loc[(dayData.station == 'Maksegnit') & (dayData.element == 'PRECIP') & ((dayData.year == 2000) | (dayData.year == 2001))]

In [None]:
# Advanced: change the for-loop below to calculate the sum of PRECIP per station, and store it in the dictionary precip_dict
precip_dict = {}

for station in ['Ambagiorgis Sch','Aymba','Chewahit','Gondar A.P.','Maksegnit','Shembekit']:
    # Add code that calculates the sum of PRECIP for this station only
    precip_station = dayData.loc[(dayData.station == station) & (dayData.element == 'PRECIP')]
    precip_sum = np.sum(precip_station.value)
    
    # Add code that adds the calculated sum into precip_dict, like precip_dict[station] = sum
    precip_dict[station] = round(precip_sum, 1)

precip_dict
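
The for-loop above is good practice for combining selections with a loop. For reference, pandas can also compute a sum per group in one step with `groupby`. The sketch below runs on a small made-up table (`toy` and `precip_dict_toy` are hypothetical names, and the values are illustrative only), so it is not a replacement for the exercise but a possible alternative to explore.

```python
import pandas as pd

# Made-up table with the same column names as dayData (values are illustrative only)
toy = pd.DataFrame({
    'station': ['Aymba', 'Aymba', 'Shembekit', 'Shembekit', 'Shembekit'],
    'element': ['PRECIP', 'PRECIP', 'PRECIP', 'TMPMAX', 'PRECIP'],
    'value': [1.5, 2.0, 4.0, 25.0, 0.5],
})

# Keep only the PRECIP rows
precip = toy.loc[toy.element == 'PRECIP']

# Group the rows by station and sum the 'value' column within each group
precip_per_station = precip.groupby('station')['value'].sum()

# Round to one decimal and convert to a dictionary, like precip_dict in the loop
precip_dict_toy = precip_per_station.round(1).to_dict()
print(precip_dict_toy)
```

Note that `groupby` finds the station names itself, so no list of stations has to be typed in.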

In [None]:
# Advanced: change the for-loop below to calculate the sum of PRECIP per year for Aymba, and store it in the dictionary aymba_precip_dict
startyear = np.min(dayData.loc[(dayData.station == 'Aymba'), 'year']) 
endyear = np.max(dayData.loc[(dayData.station == 'Aymba'), 'year']) 
aymba_precip_dict = {}

for year in range(startyear, endyear+1):
    # Add code that calculates the sum of precip for only Aymba, only one year
    aymba_precip = dayData.loc[(dayData.station == 'Aymba') & (dayData.year == year) & (dayData.element == 'PRECIP')]
    aymba_precip_sum = np.sum(aymba_precip.value)
    
    # Add code that adds the calculated sum into aymba_precip_dict, like aymba_precip_dict[year]=sum
    aymba_precip_dict[year] = round(aymba_precip_sum, 1)

aymba_precip_dict
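
One detail of the loop over `range(startyear, endyear+1)` is that it visits every year in between, including years that have no rows at all; for those years the sum of an empty selection is 0. The sketch below uses a made-up table (`toy`, `by_year` and `loop_dict` are hypothetical names, values are illustrative only) to compare this with a `groupby` on `year`, which only lists the years that actually occur in the data.

```python
import numpy as np
import pandas as pd

# Made-up Aymba data with a gap: there are no rows for 2010 (illustrative only)
toy = pd.DataFrame({
    'station': ['Aymba'] * 4,
    'element': ['PRECIP'] * 4,
    'year': [2009, 2009, 2011, 2011],
    'value': [1.0, 2.5, 0.5, 4.0],
})

aymba_precip_toy = toy.loc[(toy.station == 'Aymba') & (toy.element == 'PRECIP')]

# groupby only returns the years that are present in the table
by_year = aymba_precip_toy.groupby('year')['value'].sum().round(1).to_dict()

# The loop visits every year in the range, so the missing year gets a sum of 0
loop_dict = {}
for year in range(int(toy.year.min()), int(toy.year.max()) + 1):
    year_values = aymba_precip_toy.loc[aymba_precip_toy.year == year, 'value']
    loop_dict[year] = round(float(np.sum(year_values)), 1)

print(by_year)
print(loop_dict)
```

A sum of 0 for a year can therefore mean either a dry year or a year without measurements, which is worth keeping in mind when interpreting the results.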