> Try to finish the Pokemon and NFL statsheet problems in class. If time permits, go over the Austin weather problem as well; if not, assign it as optional homework.
> Remember that you are *guiding* the students through completing these projects! Use your best judgement to control the flow of the class while letting the class give their best attempts at solving the problems.

# Beginning a Pokemon Journey

You’re beginning your Pokemon journey and your goal is to become the most powerful gym leader. To do that, you have to pick a Pokemon type to train. Using Python and Pandas, decide which team you should begin training!

To start, 
* **Import Pandas** and **read in pokemon_data.csv**. 
* Familiarize yourself with the data by **printing the columns** as well as the **first 9 rows**.

In [None]:
import pandas as pd 

df = pd.read_csv('../input/week3class1review/pokemon_data.csv')
print(df.columns, '\n')
df.head(9)
#or 
#print(df.head(9))

As a beginner, you can’t catch any legendaries yet! 
* **Remove all legendary Pokemon** (Legendary == True) from the dataframe.

In [None]:
#Need .index to specify rows
#This statement is dropping the rows indicated by the indices
#   where Legendary == True
df = df.drop(df.loc[df['Legendary'] == True].index)
df
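
If the `.index` approach feels indirect, the same result can be sketched with a boolean mask that keeps the non-legendary rows. The toy dataframe below is a made-up stand-in for `pokemon_data.csv`, and the sketch assumes `Legendary` was read in as a boolean column.

```python
import pandas as pd

# Toy stand-in for pokemon_data.csv (assumption: Legendary is a bool column)
toy = pd.DataFrame({
    'Name': ['Bulbasaur', 'Mewtwo', 'Pikachu'],
    'Legendary': [False, True, False],
})

# Approach from the cell above: drop the rows whose index labels match
dropped = toy.drop(toy.loc[toy['Legendary'] == True].index)

# Alternative: keep only the rows where Legendary is False
kept = toy[~toy['Legendary']]

print(kept)
```

Both versions leave the original index labels in place, so the row numbers will have gaps where the legendaries used to be.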

Next, find out the stat total for each Pokemon. 
* **Add a column called ‘Total’** to your dataframe and make it the **sum** of HP, Attack, Defense, Special Attack, Special Defense, and Speed for each Pokemon. 

In [None]:
#.iloc[:, 4:10] -> all rows, columns at positions 4 through 9 (the stop position, 10, is excluded)
#.sum(axis=1) -> add across the columns, giving one total per row
df['Total'] = df.iloc[:, 4:10].sum(axis=1)
df

#or
#df['Total'] = df['HP'] + df['Attack'] + df['Defense'] + df['SpAtk'] + df['SpDef'] + df['Speed']
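
Selecting the stat columns by name instead of by position is a little more robust if the column order ever changes. The sketch below uses a small made-up dataframe; the column names are taken from the commented alternative above.

```python
import pandas as pd

# Toy data with illustrative values only
toy = pd.DataFrame({
    'Name': ['A', 'B'],
    'HP': [45, 35], 'Attack': [49, 55], 'Defense': [49, 40],
    'SpAtk': [65, 50], 'SpDef': [65, 50], 'Speed': [45, 90],
})

stat_cols = ['HP', 'Attack', 'Defense', 'SpAtk', 'SpDef', 'Speed']

# axis=1 adds across the selected columns, producing one value per row
toy['Total'] = toy[stat_cols].sum(axis=1)

print(toy[['Name', 'Total']])
```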

Now, to select a type to train,
* **Create a pivot table** with **Type1 as the row index and Total as the column values**. The pivot table should display the average stat total for each primary type. 
* **Sort your pivot table in descending order** of Total, and note the type with the highest average stat total. This will be the type of your team!

In [None]:
df_pivot = pd.pivot_table(df, index='Type1', values='Total')
df_pivot = df_pivot.sort_values('Total', ascending=False)
df_pivot
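
Students sometimes ask where the average comes from: `pd.pivot_table` aggregates with the mean unless told otherwise. A minimal sketch on made-up data, compared with the equivalent `groupby`:

```python
import pandas as pd

# Toy data with illustrative values only
toy = pd.DataFrame({
    'Type1': ['Fire', 'Fire', 'Water', 'Water'],
    'Total': [300, 500, 310, 330],
})

# aggfunc defaults to the mean, so each row is the average Total per type
pivot = pd.pivot_table(toy, index='Type1', values='Total')
pivot = pivot.sort_values('Total', ascending=False)

# The same numbers computed with groupby
grouped = toy.groupby('Type1')['Total'].mean().sort_values(ascending=False)

# After sorting, the first index label is the strongest type on average
best_type = pivot.index[0]
print(best_type)
```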

* **Filter** the dataframe from earlier so it only displays Pokemon where **Type1 or Type2** is equal to the type you chose in the last part.
* **Sort** this filtered dataframe in descending order of Total. The top six Pokemon will be your team!

In [None]:
df.loc[(df['Type1'] == 'Dragon') | (df['Type2'] == 'Dragon')].sort_values('Total', ascending=False)
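
To show only the team itself, the filtered and sorted result can be cut down with `.head(6)`. The sketch below runs on made-up data; the variable `chosen` is a hypothetical name for whichever type came out on top of the pivot table.

```python
import pandas as pd

# Toy data with illustrative values only
toy = pd.DataFrame({
    'Name': ['a', 'b', 'c', 'd'],
    'Type1': ['Dragon', 'Water', 'Fire', 'Dragon'],
    'Type2': [None, 'Dragon', 'Flying', 'Flying'],
    'Total': [300, 540, 534, 600],
})

chosen = 'Dragon'  # hypothetical: the type picked from the pivot table

# Each comparison gets its own parentheses before combining with |
mask = (toy['Type1'] == chosen) | (toy['Type2'] == chosen)

# Sort strongest first and keep at most six rows
team = toy.loc[mask].sort_values('Total', ascending=False).head(6)
print(team)
```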


# Preparing a Statsheet for NFL Commentators

The Dallas Cowboys and Tampa Bay Buccaneers will play the first game of the 2021 NFL season. Your job is to prepare stat sheets for the announcers to reference during the game. Given multiple files of data from last year’s season, merge and clean the data to match the format the announcers asked for:

*Columns from left to right:*
* Week
* Tampa Bay Opponent
* Tampa Bay Total Offensive Yards Gained
* Tampa Bay Total Defensive Yards Allowed
* Dallas Opponent
* Dallas Total Offensive Yards Gained
* Dallas Total Defensive Yards Allowed

In [None]:
import pandas as pd

#Open both files
df_tam = pd.read_csv('../input/week3class1review/tampa_stats.csv')
df_dal = pd.read_csv('../input/week3class1review/dallas_stats.csv')

df_tam.head()

In [None]:
#Delete the unneeded columns for the Tampa Bay statsheet
#   (positions 1 through 8) by passing their labels to drop
df_tam2 = df_tam.drop(columns=df_tam.columns[1:9])
#df_tam2.head()

In [None]:
#A single slice covers only one contiguous range of columns,
#   so drop the remaining ranges one step at a time and
#   check the result with .head() after each step
df_tam2 = df_tam2.drop(columns=df_tam2.columns[2:5])
#df_tam2.head()

In [None]:
#Drop the last unwanted range, then everything after the
#   last desired column
df_tam2 = df_tam2.drop(columns=df_tam2.columns[3:7])
df_tam2 = df_tam2.drop(columns=df_tam2.columns[4:])
#df_tam2.head()

In [None]:
#Now, change the column titles to the ones in row 0 and delete row 0
#This is a change we have to make since the original file has two headers
df_tam2 = df_tam2.rename(columns={'Unnamed: 0': 'Week',
                        'Unnamed: 9': 'TampaOpp',
                        'Unnamed: 13': 'TampaOffensiveTotYd',
                        'Unnamed: 18': 'TampaDefensiveTotYd'})
df_tam2 = df_tam2.drop([0], axis=0)
df_tam2
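
If students find the repeated drops hard to follow, one possible shortcut is to keep the four wanted columns by position in a single step. This is only a sketch on a made-up 25-column frame that mimics the two-header layout; the positions 0, 9, 13, and 18 are inferred from the `Unnamed:` labels in the rename above.

```python
import pandas as pd

# Hypothetical stand-in for the raw file: 25 columns, second header in row 0
n_cols = 25
toy = pd.DataFrame(
    [[f'h{i}' for i in range(n_cols)],
     list(range(n_cols)),
     list(range(100, 100 + n_cols))],
    columns=[f'Unnamed: {i}' for i in range(n_cols)],
)

# .iloc accepts a list of positions, so the wanted columns can be kept directly
wanted = [0, 9, 13, 18]
slim = toy.iloc[:, wanted].copy()

# Rename the columns, then remove the leftover header row
slim.columns = ['Week', 'TampaOpp', 'TampaOffensiveTotYd', 'TampaDefensiveTotYd']
slim = slim.drop(index=0)

print(slim)
```

The step-by-step drops are still worth walking through in class, since they give practice with slicing and checking intermediate results.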

In [None]:
#Repeat all of these steps for the Dallas file
#Note that you can easily copy and paste all of this from 
#   earlier and change the variable names
df_dal2 = df_dal.drop(columns=df_dal.columns[1:9])
df_dal2 = df_dal2.drop(columns=df_dal2.columns[2:5])
df_dal2 = df_dal2.drop(columns=df_dal2.columns[3:7])
df_dal2 = df_dal2.drop(columns=df_dal2.columns[4:])
df_dal2 = df_dal2.rename(columns={'Unnamed: 0': 'Week',
                        'Unnamed: 9': 'DallasOpp',
                        'Unnamed: 13': 'DallasOffensiveTotYd',
                        'Unnamed: 18': 'DallasDefensiveTotYd'})
df_dal2 = df_dal2.drop([0], axis=0)
df_dal2

In [None]:
#Merge the two dataframes
df = pd.merge(df_tam2, df_dal2, on='Week')
df
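
`pd.merge` performs an inner join by default, so a week that appears in only one file (a bye week, for example) is left out of the result. The sketch below uses made-up frames with placeholder opponent names to show the difference between the default and `how='outer'`.

```python
import pandas as pd

# Toy frames; the opponent names are placeholders, not real schedule data
tam = pd.DataFrame({'Week': ['1', '2', '3'],
                    'TampaOpp': ['OppA', 'OppB', 'OppC']})
dal = pd.DataFrame({'Week': ['1', '3'],
                    'DallasOpp': ['OppD', 'OppE']})

# Default (inner) merge: only weeks present in both frames survive
inner = pd.merge(tam, dal, on='Week')

# Outer merge: every week is kept, with NaN where one team has no row
outer = pd.merge(tam, dal, on='Week', how='outer')

print(inner)
print(outer)
```

Whether to keep those weeks depends on what the announcers want, so it may be worth asking the class which behavior fits the assignment.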

# Looking Into Austin Weather

A weather station in Austin, Texas has given you a very large collection of data that has been gathered over the years 2013-2017. They need you to clean and summarize this data for a project they are working on. Below are the instructions they have given.
* **Import pandas** and **read in austin_weather.csv**.

In [None]:
import pandas as pd

df = pd.read_csv('../input/week3class1review/austin_weather.csv')
df

For this project, the meteorologists are not concerned with anything to do with humidity or wind. 
* **Drop the columns** that relate to these measurements.

In [None]:
df2 = df.drop(columns=['HumidityHighPercent', 'HumidityAvgPercent', 'HumidityLowPercent'])
df2 = df2.drop(columns=['WindHighMPH', 'WindAvgMPH', 'WindGustMPH'])
df2
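
Typing out every column name works, but the list can also be built from the names themselves. A minimal sketch on a made-up one-row frame, assuming the unwanted columns all contain `Humidity` or `Wind` in their names:

```python
import pandas as pd

# Toy frame with illustrative values only
toy = pd.DataFrame({
    'TempHighF': [74],
    'HumidityHighPercent': [93],
    'WindHighMPH': [20],
    'PrecipitationSumInches': [0.46],
})

# Collect every column whose name mentions humidity or wind
to_drop = [c for c in toy.columns if 'Humidity' in c or 'Wind' in c]

slim = toy.drop(columns=to_drop)
print(slim.columns.tolist())
```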

* **Print all the weather data** from the **first day of every year** recorded.
* **Print the days from 2016** when the **high temperature was greater than or equal to 90 degrees Fahrenheit**.

In [None]:
df2.loc[(df2['Month'] == 1) & (df2['Day'] == 1)]

In [None]:
df2.loc[(df2['Year'] == 2016) & (df2['TempHighF'] >= 90)]

The weather station's data has been corrupted, so all of the recorded days with no rain have null values in the Precipitation column.
* First, **print all the rows with null values**.
* **Fill all null values in the Precipitation column with 0**. 
* Then, **print all days in April when it rained**.

In [None]:
df2.loc[df2.isnull().any(axis=1)]

In [None]:
df2['PrecipitationSumInches'] = df2['PrecipitationSumInches'].fillna(0)
df2
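
Two details are easy to miss here: `isnull()` on its own returns a table of True/False values rather than the rows themselves, and `fillna` returns a new object, so its result has to be assigned back. A minimal sketch on made-up data:

```python
import numpy as np
import pandas as pd

# Toy data with illustrative values only
toy = pd.DataFrame({
    'Month': [4, 4, 5],
    'TempHighF': [80, 85, 90],
    'PrecipitationSumInches': [np.nan, 0.5, np.nan],
})

# Rows that contain at least one null value in any column
null_rows = toy[toy.isnull().any(axis=1)]

# Fill only the precipitation column and store the result
toy['PrecipitationSumInches'] = toy['PrecipitationSumInches'].fillna(0)

print(null_rows)
print(toy)
```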

In [None]:
df2.loc[(df2['Month'] == 4) & (df2['PrecipitationSumInches'] > 0)]
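
A point worth raising with the class: in Python, `&` binds more tightly than comparison operators such as `>` and `==`, so every comparison in a combined filter needs its own parentheses. A minimal sketch on made-up data:

```python
import pandas as pd

# Toy data with illustrative values only
toy = pd.DataFrame({
    'Month': [4, 4, 5],
    'PrecipitationSumInches': [0.0, 0.5, 1.2],
})

# Without the second pair of parentheses, Python would evaluate
# (Month == 4) & PrecipitationSumInches first and only then compare with 0
rainy_april = toy.loc[(toy['Month'] == 4) & (toy['PrecipitationSumInches'] > 0)]

print(rainy_april)
```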