## Learning Pandas with Pokemon - Part 1

### What is pandas?
In computer programming, pandas is a software library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series. It is free software released under the three-clause BSD license. The name is derived from the term "<b>pan</b>el <b>da</b>ta", an econometrics term for multidimensional, structured data sets. (Source: Wikipedia)

### Today, we'll be exploring the Pokemon Dataset! 
![alt text](images/pokemon.jpg)

#### About the Dataset: 
---
This dataset covers 721 Pokemon, with their number, name, first and second type, and basic stats: HP, Attack, Defense, Special Attack, Special Defense, and Speed. These are the raw attributes used to calculate how much damage an attack will do in the games. The dataset is about the Pokemon video games (NOT the Pokemon cards or Pokemon Go).

- <b>#</b>: ID for each Pokemon
- <b>Name</b>: Name of each Pokemon
- <b>Type 1</b>: Primary type; every Pokemon has one, and it determines weakness/resistance to attacks
- <b>Type 2</b>: Secondary type; some Pokemon are dual type and have two
- <b>Total</b>: Sum of all the stats that come after this, a general guide to how strong a Pokemon is
- <b>HP</b>: Hit points, or health; defines how much damage a Pokemon can withstand before fainting
- <b>Attack</b>: The base modifier for normal attacks (e.g. Scratch, Punch)
- <b>Defense</b>: The base damage resistance against normal attacks
- <b>SP Atk</b>: Special attack, the base modifier for special attacks (e.g. Fire Blast, Bubble Beam)
- <b>SP Def</b>: Special defense, the base damage resistance against special attacks
- <b>Speed</b>: Determines which Pokemon attacks first each round
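
As a quick check of how <b>Total</b> is defined, the sketch below sums Bulbasaur's six base stats. The values are typed in by hand for illustration rather than read from the CSV, and the keys follow the labels in the list above, which may not match the CSV headers exactly.

```python
import pandas as pd

# Bulbasaur's base stats, entered by hand for illustration (not read from the CSV)
bulbasaur = pd.Series(
    {"HP": 45, "Attack": 49, "Defense": 49, "SP Atk": 65, "SP Def": 65, "Speed": 45}
)

# Total is the sum of the six stats listed after it
total = int(bulbasaur.sum())
print(total)  # 318
```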

The data for this table was gathered from several different sites, including pokemon.com, pokemondb, and Bulbapedia.
    
<b>Source</b>: https://www.kaggle.com/abcsds/pokemon

#### Exercise 0: Getting Started
---
1. Download: [Pokemon dataset](https://github.com/wwcodemanila/WWCodeManila-ML.AI/blob/master/datasets/pokemon.csv)
2. Import the necessary libraries (`pandas`, `numpy`, `matplotlib.pyplot`)
3. Load the dataset 

In [None]:
# Write your code here
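
One possible way to approach the loading step is sketched below. So that it runs without the download, it reads a hypothetical three-row excerpt (with a trimmed set of columns) from an in-memory string; with the real file you would pass its path to `pd.read_csv` instead. The path `pokemon.csv` in the comment is an assumption about where you saved the file.

```python
import io

import pandas as pd
# The exercise also asks for numpy and matplotlib.pyplot; they are conventionally
# imported as `import numpy as np` and `import matplotlib.pyplot as plt`.
# They are left out here because this sketch does not use them.

# Hypothetical excerpt standing in for the downloaded file (trimmed columns)
sample_csv = io.StringIO(
    "#,Name,Type 1,Type 2,HP,Attack,Defense,Speed\n"
    "1,Bulbasaur,Grass,Poison,45,49,49,45\n"
    "4,Charmander,Fire,,39,52,43,65\n"
    "7,Squirtle,Water,,44,48,65,43\n"
)

# With the real file: df = pd.read_csv("pokemon.csv")  # assumed local path
df = pd.read_csv(sample_csv)
print(df)
```

Note that the empty `Type 2` fields are read in as missing values (NaN), which matters for the cleaning exercise later on.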

#### Exercise 1: Warmup
---
<b>Hint</b>: Most commands for this section can be found in our [first machine learning project](https://github.com/wwcodemanila/WWCodeManila-ML.AI/blob/master/tutorials/Intro-to-Machine-Learning.ipynb).

1. How many pokemon are there in the dataset?
2. What are the columns in the dataset? 
3. What are the datatypes of each column?
4. What are the first 5 pokemon in the dataset?
5. What are the last 5 pokemon in the dataset?
6. Print a summary of the data (count, mean, min, max, etc.)

In [None]:
# Write your code here
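
A minimal sketch of the warmup steps follows. It uses a hypothetical three-row DataFrame built by hand so it can run on its own; on the real dataset the same calls apply to the `df` you loaded in Exercise 0, and the numbers will differ.

```python
import pandas as pd

# Hypothetical three-row stand-in for the full dataset (trimmed columns)
df = pd.DataFrame({
    "#": [1, 4, 7],
    "Name": ["Bulbasaur", "Charmander", "Squirtle"],
    "Type 1": ["Grass", "Fire", "Water"],
    "Type 2": ["Poison", None, None],
    "HP": [45, 39, 44],
    "Attack": [49, 52, 48],
    "Defense": [49, 43, 65],
    "Speed": [45, 65, 43],
})

n_rows = len(df)            # 1. number of rows (one per pokemon)
columns = list(df.columns)  # 2. column names
dtypes = df.dtypes          # 3. datatype of each column
first_rows = df.head()      # 4. first 5 rows (only 3 exist in this sample)
last_rows = df.tail()       # 5. last 5 rows
summary = df.describe()     # 6. count, mean, std, min, quartiles, max (numeric columns)

print(n_rows)
print(columns)
print(dtypes)
print(summary)
```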

#### Exercise 2: Cleaning the Data
---
1. Right now, the dataset is indexed by the numbers 0 to 799, one per row. You can confirm this by printing `df.index`. Suppose we want to index the pokemon by 'Name' instead of by number. Set the index to become the 'Name' column. [[Hint](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.set_index.html)]
2. Drop the '#' column and inspect your dataset for changes. [[Hint1](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.drop.html)] [[Hint2](https://stackoverflow.com/questions/22149584/what-does-axis-in-pandas-mean)]
3. Tidy up the column names using one of these options:
   - Option 1: Remove all spaces in the column names (e.g. 'Type 1' becomes 'Type1').
   - Option 2: Replace all spaces in the column names with underscores (e.g. 'Type 1' becomes 'Type_1'). [[Hint](https://stackoverflow.com/questions/30763351/removing-space-in-dataframe-python)]
4. Time to clean up null entries. You can check if a column contains null values using `df.isnull().any()`. Your job is to find columns with null values and replace NaN with the string 'None'. [[Hint](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.fillna.html)]

In [None]:
# Write your code here
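
The four cleaning steps could be sketched as below. The DataFrame is a hypothetical three-row sample built by hand so the code is self-contained, and the sketch picks Option 2 (underscores) for the column names; Option 1 would use an empty string as the replacement instead.

```python
import pandas as pd

# Hypothetical three-row stand-in for the full dataset (trimmed columns)
df = pd.DataFrame({
    "#": [1, 4, 7],
    "Name": ["Bulbasaur", "Charmander", "Squirtle"],
    "Type 1": ["Grass", "Fire", "Water"],
    "Type 2": ["Poison", None, None],
    "HP": [45, 39, 44],
})

# 1. Use the 'Name' column as the index instead of 0, 1, 2, ...
df = df.set_index("Name")

# 2. Drop the '#' column; axis=1 means "columns" (axis=0 would mean rows)
df = df.drop("#", axis=1)

# 3. Option 2: replace spaces in column names with underscores
df.columns = df.columns.str.replace(" ", "_", regex=False)

# 4. Find the columns that contain nulls, then fill them with the string 'None'
has_nulls = df.isnull().any()
null_columns = has_nulls[has_nulls].index.tolist()
df[null_columns] = df[null_columns].fillna("None")

print(df)
```

Each step reassigns `df` rather than using `inplace=True`; either style works, but reassignment makes it easier to rerun a single notebook cell without surprises.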

#### Exercise 3: Basic Selection - loc and iloc
---
1. Retrieve the row data of the first pokemon in the dataset by <b>integer-location</b> based indexing. [[Hint](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.iloc.html)]
2. Retrieve the row data of <b>Bulbasaur</b> in the dataset by <b>label-location</b> based indexing. [[Hint](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.loc.html)] 
3. Retrieve the row data of the <b>20th</b> pokemon in the dataset by <b>integer-location</b> based indexing. [[Hint](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.iloc.html)]
4. Retrieve the row data of <b>Ninetales</b> in the dataset by <b>label-location</b> based indexing. [[Hint](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.loc.html)] 
5. Retrieve the row data of your favorite Pokemon.  

In [None]:
# Write your code here
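
To illustrate the difference between the two indexers, here is a small sketch on a hypothetical three-row DataFrame already indexed by name, as in Exercise 2. `iloc` selects by position (starting at 0), while `loc` selects by index label. On the full dataset the 20th pokemon would be `df.iloc[19]`; the sample is too short for that, so the sketch fetches the second row instead.

```python
import pandas as pd

# Hypothetical three-row stand-in, indexed by 'Name' (trimmed columns)
df = pd.DataFrame({
    "Name": ["Bulbasaur", "Charmander", "Ninetales"],
    "Type 1": ["Grass", "Fire", "Fire"],
    "HP": [45, 39, 73],
    "Attack": [49, 52, 76],
    "Defense": [49, 43, 75],
    "Speed": [45, 65, 100],
}).set_index("Name")

first = df.iloc[0]               # integer-location: first row
bulbasaur = df.loc["Bulbasaur"]  # label-location: row labelled 'Bulbasaur'
second = df.iloc[1]              # the Nth row is df.iloc[N - 1]
ninetales = df.loc["Ninetales"]  # label-location: row labelled 'Ninetales'

print(first)
print(ninetales)
```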

### [Continue to Part 2.](https://github.com/wwcodemanila/WWCodeManila-ML.AI/blob/master/exercises/pokemon_pandas_part2.ipynb)