# Task 1 - Seaborn

In this lab you will apply your new Python skills to working with data - finally!

**Note: This lab is deliberately a bit shorter to give you some time to recover from Project Milestone 3. If you finish early, I recommend getting started on Lab 8!**

## Objectives

In this lab you will:

1. Practice working with data using the pandas package.
2. Practice data wrangling and basic mathematical operations.
3. Practice working with the seaborn package.
4. Practice creating data visualizations.

In [2]:
# Usually all the import statements are at the top of the file

import pandas as pd
import seaborn as sns
import numpy as np
import os
import matplotlib.pyplot as plt

## Task 1: Working with data using pandas

In this part of the lab, we will practice loading in a sample data set using pandas, and doing some basic operations.

### 1.1: Load in the data

There is a CSV file (`pokemon.csv`) inside a directory called `data` within the `lab3A` directory.

Your task is to use the pandas [`read_csv()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html) function to read this dataset, assign it to a dataframe called `df`, and then print its [`head`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.head.html), that is, the first 5 rows of the dataframe.

*Hint: don't forget to first `import pandas as pd` to use `read_csv` and other pandas functions.*

In [3]:
### Your solution here
df = pd.read_csv('data/pokemon.csv')
df.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False


### 1.2: How many total pokemon are there in the dataset?

Make sure to use the `dataframe.count()` function to print the number of non-null entries in each column of the dataframe before you answer!

In [9]:
### Your solution here
df.count()

#             800
Name          800
Type 1        800
Type 2        414
Total         800
HP            800
Attack        800
Defense       800
Sp. Atk       800
Sp. Def       800
Speed         800
Generation    800
Legendary     800
dtype: int64
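
Note that `count()` reports non-null entries per column, which is why `Type 2` shows 414 while every other column shows 800. The sketch below contrasts `count()` with the row count from `len()` and `shape`. It uses a small made-up dataframe (`toy` is hypothetical and not taken from `pokemon.csv`):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not from pokemon.csv: one missing value in "Type 2"
toy = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Type 2": ["Poison", np.nan, "Flying"],
})

non_null_per_column = toy.count()   # non-null entries in each column
n_rows = len(toy)                   # number of rows, missing values included
n_rows_from_shape = toy.shape[0]    # same number via the (rows, columns) tuple

print(non_null_per_column)
print(n_rows, n_rows_from_shape)
```

If the question is "how many rows are there?", `len(df)` or `df.shape[0]` answers it directly, while `count()` also reveals which columns have gaps.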

### 1.3: Create a new dataframe `df2` that only includes the Pokemon from the first generation. 

*Hint: Remember that you can subset dataframes using the `[]` syntax. [More on this here](https://pandas.pydata.org/docs/getting_started/intro_tutorials/03_subset_data.html)*

In [10]:
### Your solution here
df2 = df[df["Generation"] == 1]
df2

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...
161,149,Dragonite,Dragon,Flying,600,91,134,95,100,100,80,1,False
162,150,Mewtwo,Psychic,,680,106,110,90,154,90,130,1,True
163,150,MewtwoMega Mewtwo X,Psychic,Fighting,780,106,190,100,154,100,130,1,True
164,150,MewtwoMega Mewtwo Y,Psychic,,780,106,150,70,194,120,140,1,True


### 1.4: Print ONLY the mean HP, Attack, Defense, and Speed of all pokemon in the first generation using pandas functions 

In [11]:
### Your solution here
# Select only the four stat columns, then take the column-wise mean
gen1_stats = df2[['HP', 'Attack', 'Defense', 'Speed']]
gen1_stats.mean()

HP         65.819277
Attack     76.638554
Defense    70.861446
Speed      72.584337
dtype: float64
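
Filtering rows and selecting columns can also be done in a single `.loc` call. This is one possible alternative rather than the required solution, and it is shown on a hypothetical `toy` dataframe with made-up values:

```python
import pandas as pd

# Hypothetical toy data standing in for the Pokemon dataframe
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "HP": [40, 60, 80, 100],
    "Attack": [50, 70, 90, 110],
    "Generation": [1, 1, 2, 1],
})

# Rows: boolean mask on Generation; columns: list of the stats we want
gen1_stats = toy.loc[toy["Generation"] == 1, ["HP", "Attack"]]
gen1_mean = gen1_stats.mean()   # column-wise mean, returned as a Series

print(gen1_mean)
```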

In [4]:
df[0:10]

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False
5,5,Charmeleon,Fire,,405,58,64,58,80,65,80,1,False
6,6,Charizard,Fire,Flying,534,78,84,78,109,85,100,1,False
7,6,CharizardMega Charizard X,Fire,Dragon,634,78,130,111,130,85,100,1,False
8,6,CharizardMega Charizard Y,Fire,Flying,634,78,104,78,159,115,100,1,False
9,7,Squirtle,Water,,314,44,48,65,50,64,43,1,False


In [6]:
df[0:10]['HP'].mean(skipna=False)

64.0

In [7]:
df[0:10]['HP'].dropna()

0    45
1    60
2    80
3    80
4    39
5    58
6    78
7    78
8    78
9    44
Name: HP, dtype: int64
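
Because the `HP` column has no missing values, `skipna=False` and `dropna()` have no visible effect in the cells above. The sketch below shows what they do when a value is actually missing. The `hp` series is hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical series with one missing value (the real HP column has none)
hp = pd.Series([45.0, np.nan, 80.0], name="HP")

mean_default = hp.mean()              # skipna=True by default: NaN is ignored
mean_strict = hp.mean(skipna=False)   # NaN propagates into the result
hp_clean = hp.dropna()                # drops the missing entry, keeps labels 0 and 2

print(mean_default, mean_strict, len(hp_clean))
```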

In [22]:
df2 = df[(df["HP"] > 20) & (df["Type 1"] == 'Dragon') & (df["Type 2"].isna())]
df2

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
159,147,Dratini,Dragon,,300,41,64,45,50,50,50,1,False
160,148,Dragonair,Dragon,,420,61,84,65,70,70,70,1,False
406,371,Bagon,Dragon,,300,45,75,60,40,30,50,3,False
407,372,Shelgon,Dragon,,420,65,95,100,60,50,50,3,False
671,610,Axew,Dragon,,320,46,87,60,30,40,57,5,False
672,611,Fraxure,Dragon,,410,66,117,70,40,50,67,5,False
673,612,Haxorus,Dragon,,540,76,147,90,60,70,97,5,False
682,621,Druddigon,Dragon,,485,77,120,90,60,90,48,5,False
774,704,Goomy,Dragon,,300,45,50,35,55,75,40,6,False
775,705,Sliggoo,Dragon,,452,68,75,53,83,113,60,6,False
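
Each condition in a filter like this is a boolean Series, and `&` combines them element by element. The parentheses matter because `&` binds more tightly than `==` and `>`. Naming the masks first can make the filter easier to read. This is sketched below on a hypothetical `toy` dataframe:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not from pokemon.csv
toy = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "HP": [41, 15, 77],
    "Type 1": ["Dragon", "Dragon", "Fire"],
    "Type 2": [np.nan, np.nan, "Flying"],
})

is_dragon = toy["Type 1"] == "Dragon"   # boolean Series, one entry per row
has_hp = toy["HP"] > 20
single_type = toy["Type 2"].isna()      # True where Type 2 is missing

# Element-wise AND of the three masks keeps only rows where all are True
subset = toy[is_dragon & has_hp & single_type]
print(subset)
```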


In [50]:
list1 = [4, 8 , 15, 16, 23, 42]
list2 = [2, 10, 23, 90]
list3 = [1]

def removeMiddle(lst, start, end):
    assert len(lst) > 0 , "Empty List!"
    assert len(lst) > 1 , "Only one value in list!"
    del lst[start:end+1]
    print(lst)
    

In [71]:
removeMiddle(list1, start=1, end=3)

[4, 23, 42]


In [81]:
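def noEvenStart(lst):
    # Assumed intent: drop even numbers from the front of the list
    # until the first odd number (or the end of the list) is reached.
    while lst and lst[0] % 2 == 0:
        del lst[0]

    print(lst)

In [82]:
list1 = [4, 8, 10, 11, 12, 15]
noEvenStart(list1)

[11, 12, 15]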


In [83]:
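test1 = 2 # first test rating
test2 = 6 # second test rating
test3 = 11 # above the valid range, so the second assert fails
test4 = 0 # below the valid range, so the first assert fails
# Define your function below:
def movieReview(rating):
    # Validate the rating before printing anything
    assert rating > 0 , "Not valid, it can't be that bad!"
    assert rating < 10 , "Broke the charts??"
    if rating <= 5:
        print("Avoid at all costs!")
    elif 5 < rating < 9:  # chained comparison; `&` is bitwise and binds tighter than `<`
        print("This one was fun")
    elif rating >= 9:
        print("Outstanding!")


# Test your function below; the asserts inside reject ratings outside the valid range
movieReview(test1)
movieReview(test2)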
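
A range check on a plain number is usually written with `and` or a chained comparison, because `&` is the bitwise operator and is evaluated before `<` and `>`. The sketch below shows the difference. The `in_range` helper is hypothetical:

```python
def in_range(rating):
    """Return True when rating is strictly between 5 and 9 (hypothetical helper)."""
    return 5 < rating < 9

# `&` is evaluated before the comparisons,
# so `2 > 5 & 2 < 9` means `2 > (5 & 2) < 9`.
surprising = 2 > 5 & 2 < 9   # 5 & 2 == 0, so this is 2 > 0 < 9, which is True
expected = in_range(2)       # False: 2 is not between 5 and 9

print(surprising, expected)
```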