# Welcome to our Pandas Workshop!

---

We will start with the [Pokemon with stats](https://www.kaggle.com/abcsds/pokemon) Kaggle dataset.

![Pikachu with Mask](https://compote.slate.com/images/18ba92e4-e39b-44a3-af3b-88f735703fa7.png?width=780&height=520&rect=1560x1040&offset=0x0)

## The Fundamentals

---

### Understanding our DataFrames and locating values

In [1]:
import pandas as pd
import numpy as np

pokemon = pd.read_csv('/work/Pokemon.csv')

pokemon.head(5) # Display the first 5 rows of the dataset

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False


In [2]:
# We can also visualize the data in print format (less fancy looking!)
print(pokemon.head(5))

   #                   Name Type 1  Type 2  Total  HP  Attack  Defense  \
0  1              Bulbasaur  Grass  Poison    318  45      49       49   
1  2                Ivysaur  Grass  Poison    405  60      62       63   
2  3               Venusaur  Grass  Poison    525  80      82       83   
3  3  VenusaurMega Venusaur  Grass  Poison    625  80     100      123   
4  4             Charmander   Fire     NaN    309  39      52       43   

   Sp. Atk  Sp. Def  Speed  Generation  Legendary  
0       65       65     45           1      False  
1       80       80     60           1      False  
2      100      100     80           1      False  
3      122      120     80           1      False  
4       60       50     65           1      False  


In [3]:
pokemon.tail(5) # Display the last 5 rows of the dataset

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
795,719,Diancie,Rock,Fairy,600,50,100,150,100,150,50,6,True
796,719,DiancieMega Diancie,Rock,Fairy,700,50,160,110,160,110,110,6,True
797,720,HoopaHoopa Confined,Psychic,Ghost,600,80,110,60,150,130,70,6,True
798,720,HoopaHoopa Unbound,Psychic,Dark,680,80,160,60,170,130,80,6,True
799,721,Volcanion,Fire,Water,600,80,110,120,130,90,70,6,True


In [4]:
print(pokemon.columns) # List all columns in our dataframe
print(f"\n{pokemon['Name'][0:10]}")


Index(['#', 'Name', 'Type 1', 'Type 2', 'Total', 'HP', 'Attack', 'Defense',
       'Sp. Atk', 'Sp. Def', 'Speed', 'Generation', 'Legendary'],
      dtype='object')

0                    Bulbasaur
1                      Ivysaur
2                     Venusaur
3        VenusaurMega Venusaur
4                   Charmander
5                   Charmeleon
6                    Charizard
7    CharizardMega Charizard X
8    CharizardMega Charizard Y
9                     Squirtle
Name: Name, dtype: object


In [5]:
print(pokemon[['Name', 'Type 1', 'Type 2', 'Generation']]) # We can even display several column values from the DataFrame!

                      Name   Type 1  Type 2  Generation
0                Bulbasaur    Grass  Poison           1
1                  Ivysaur    Grass  Poison           1
2                 Venusaur    Grass  Poison           1
3    VenusaurMega Venusaur    Grass  Poison           1
4               Charmander     Fire     NaN           1
..                     ...      ...     ...         ...
795                Diancie     Rock   Fairy           6
796    DiancieMega Diancie     Rock   Fairy           6
797    HoopaHoopa Confined  Psychic   Ghost           6
798     HoopaHoopa Unbound  Psychic    Dark           6
799              Volcanion     Fire   Water           6

[800 rows x 4 columns]


In [6]:
# Read each row using iloc
pokemon.iloc[0:5]

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False


In [7]:
# Find a specific value (Row, Column)
print(pokemon.iloc[4,1]) # Returns "Charmander" name

Charmander


In [8]:
# Go through each row and print its index, name, and HP

for index, row in pokemon.iterrows():
    print(index, row['Name'], f"HP: {row['HP']}") # Iterates through the rows to get Index, Name, HP, etc.


0 Bulbasaur HP: 45
1 Ivysaur HP: 60
2 Venusaur HP: 80
3 VenusaurMega Venusaur HP: 80
4 Charmander HP: 39
5 Charmeleon HP: 58
6 Charizard HP: 78
7 CharizardMega Charizard X HP: 78
8 CharizardMega Charizard Y HP: 78
9 Squirtle HP: 44
10 Wartortle HP: 59
11 Blastoise HP: 79
12 BlastoiseMega Blastoise HP: 79
13 Caterpie HP: 45
14 Metapod HP: 50
15 Butterfree HP: 60
16 Weedle HP: 40
17 Kakuna HP: 45
18 Beedrill HP: 65
19 BeedrillMega Beedrill HP: 65
20 Pidgey HP: 40
21 Pidgeotto HP: 63
22 Pidgeot HP: 83
23 PidgeotMega Pidgeot HP: 83
24 Rattata HP: 30
25 Raticate HP: 55
26 Spearow HP: 40
27 Fearow HP: 65
28 Ekans HP: 35
29 Arbok HP: 60
30 Pikachu HP: 35
31 Raichu HP: 60
32 Sandshrew HP: 50
33 Sandslash HP: 75
34 Nidoran♀ HP: 55
35 Nidorina HP: 70
36 Nidoqueen HP: 90
37 Nidoran♂ HP: 46
38 Nidorino HP: 61
39 Nidoking HP: 81
40 Clefairy HP: 70
41 Clefable HP: 95
42 Vulpix HP: 38
43 Ninetales HP: 73
44 Jigglypuff HP: 115
45 Wigglytuff HP: 140
46 Zubat HP: 40
47 Golbat HP: 75
48 Oddish HP: 45
49 
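
Looping with `iterrows` works, but it can be slow on larger frames. As a hedged alternative, the sketch below uses a small toy frame (three rows copied from the output above, not the full dataset) to show `itertuples` and a loop-free column expression that build the same kind of text.

```python
import pandas as pd

# Toy frame standing in for the workshop's `pokemon` DataFrame.
toy = pd.DataFrame({"Name": ["Bulbasaur", "Ivysaur", "Venusaur"], "HP": [45, 60, 80]})

# itertuples yields one namedtuple per row; fields are read as attributes.
lines = [f"{row.Index} {row.Name} HP: {row.HP}" for row in toy.itertuples()]
print("\n".join(lines))

# Whole-column operations avoid the Python-level loop entirely.
labels = toy["Name"] + " HP: " + toy["HP"].astype(str)
print(labels.tolist())
```

Attribute access such as `row.Name` only works for column names that are valid Python identifiers, so columns like `Type 1` would still need `iterrows` or a rename.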

#### Note:
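**iloc** stands for integer location: it selects rows and columns by their position.
```python
pokemon.iloc[row_position, column_position]
```

**loc** stands for location: it selects rows and columns by label or by a boolean condition.
```python
pokemon.loc[row_label, column_label]
```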
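
To make the difference between `loc` and `iloc` concrete, here is a minimal sketch on a toy frame. The index labels 10, 20, and 30 are made up for illustration so that labels and positions no longer coincide, which they do in the workshop's default 0-based index.

```python
import pandas as pd

# Toy frame with a non-default index (hypothetical labels 10, 20, 30).
toy = pd.DataFrame(
    {"Name": ["Bulbasaur", "Charmander", "Squirtle"], "HP": [45, 39, 44]},
    index=[10, 20, 30],
)

by_position = toy.iloc[0, 0]    # row at position 0, column at position 0
by_label = toy.loc[10, "Name"]  # row labelled 10, column labelled "Name"

# iloc slices exclude the end position; loc slices include the end label.
first_two_by_position = toy.iloc[0:2]
first_two_by_label = toy.loc[10:20]

print(by_position, by_label)
print(len(first_two_by_position), len(first_two_by_label))
```

Both slices return the same two rows here, even though one stops before position 2 and the other stops at label 20.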

In [47]:
# loc accepts boolean conditions; combine several with & and |, wrapping each one in parentheses
pokemon.loc[(pokemon['Legendary'] == True) & ((pokemon['Type 1'] == 'Flying') | (pokemon['Type 2'] == 'Flying'))]

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
156,144,Articuno,Ice,Flying,580,90,85,100,95,125,85,1,True
157,145,Zapdos,Electric,Flying,580,90,90,85,125,90,100,1,True
158,146,Moltres,Fire,Flying,580,90,100,90,125,85,90,1,True
269,249,Lugia,Psychic,Flying,680,106,90,130,90,154,110,2,True
270,250,Ho-oh,Fire,Flying,680,106,130,90,110,154,90,2,True
425,384,Rayquaza,Dragon,Flying,680,105,150,90,150,90,95,3,True
426,384,RayquazaMega Rayquaza,Dragon,Flying,780,105,180,100,180,100,115,3,True
551,492,ShayminSky Forme,Grass,Flying,600,100,103,75,120,75,127,4,True
702,641,TornadusIncarnate Forme,Flying,,580,79,115,70,125,80,111,5,True
703,641,TornadusTherian Forme,Flying,,580,79,100,80,110,90,121,5,True


## Manipulating Our Dataframe

---

### Adding, dropping, rearranging, and combining columns


Let's mess around with the structure of our dataframes

Pandas documentation: 
- https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop.html
- https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.copy.html
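
Before changing a frame, it helps to know which operations touch the original. The sketch below uses a toy frame (not the workshop data) to show that plain assignment only creates a second name for the same object, `copy()` creates an independent frame, and `drop` returns a new frame unless `inplace=True` is passed.

```python
import pandas as pd

original = pd.DataFrame(
    {"Name": ["Bulbasaur", "Ivysaur"], "HP": [45, 60], "Speed": [45, 60]}
)

alias = original               # same object under a second name
independent = original.copy()  # separate object with its own data

independent["HP"] = 0          # changes the copy only

# drop returns a new frame by default; `original` keeps all its columns.
trimmed = original.drop(columns=["Speed"])

print(alias is original)
print(original["HP"].tolist())
print(list(trimmed.columns), list(original.columns))
```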

In [10]:
# code here

pokemod = pokemon.copy()

In [11]:
pokemod2 = pokemod.drop(columns=['Sp. Atk','Sp. Def','Generation'])
pokemod2.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Speed,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,45,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,60,False
2,3,Venusaur,Grass,Poison,525,80,82,83,80,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,80,False
4,4,Charmander,Fire,,309,39,52,43,65,False


In [12]:
pokemod2.drop(columns=['Legendary'], inplace=True)
pokemod2.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Speed
0,1,Bulbasaur,Grass,Poison,318,45,49,49,45
1,2,Ivysaur,Grass,Poison,405,60,62,63,60
2,3,Venusaur,Grass,Poison,525,80,82,83,80
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,80
4,4,Charmander,Fire,,309,39,52,43,65


In [13]:
pokemod2['Legendary'] = np.nan
pokemod2.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Speed,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,45,
1,2,Ivysaur,Grass,Poison,405,60,62,63,60,
2,3,Venusaur,Grass,Poison,525,80,82,83,80,
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,80,
4,4,Charmander,Fire,,309,39,52,43,65,


In [14]:
pokemod2['isPokemon'] = True
pokemod2.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Speed,Legendary,isPokemon
0,1,Bulbasaur,Grass,Poison,318,45,49,49,45,,True
1,2,Ivysaur,Grass,Poison,405,60,62,63,60,,True
2,3,Venusaur,Grass,Poison,525,80,82,83,80,,True
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,80,,True
4,4,Charmander,Fire,,309,39,52,43,65,,True


In [15]:
pokemod2['Offense'] = pokemod2['Attack'] + pokemod2['Speed']
pokemod2.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Speed,Legendary,isPokemon,Offense
0,1,Bulbasaur,Grass,Poison,318,45,49,49,45,,True,94
1,2,Ivysaur,Grass,Poison,405,60,62,63,60,,True,122
2,3,Venusaur,Grass,Poison,525,80,82,83,80,,True,162
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,80,,True,180
4,4,Charmander,Fire,,309,39,52,43,65,,True,117


In [16]:
# Sum the columns at positions 7 and 8 (Defense and Speed); the end of an iloc slice is excluded
pokemod2['TrueDef'] = pokemod2.iloc[:,7:9].sum(axis=1)
pokemod2.head()

In [17]:
pokecols = pokemod.columns.to_list()
print(pokecols)

['#', 'Name', 'Type 1', 'Type 2', 'Total', 'HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed', 'Generation', 'Legendary']


In [18]:
pokecols = pokecols[:8] + [pokecols[10]] + pokecols[8:10] + pokecols[11:]
print(pokecols)

['#', 'Name', 'Type 1', 'Type 2', 'Total', 'HP', 'Attack', 'Defense', 'Speed', 'Sp. Atk', 'Sp. Def', 'Generation', 'Legendary']


In [19]:
pokemod = pokemod[pokecols]
pokemod.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Speed,Sp. Atk,Sp. Def,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,45,65,65,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,60,80,80,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,80,100,100,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,80,122,120,1,False
4,4,Charmander,Fire,,309,39,52,43,65,60,50,1,False
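
Slicing the column list by hand is easy to get wrong when columns are added or removed. As one possible alternative, the sketch below defines a hypothetical helper, `move_column` (not a pandas function), that moves a column by name. It is shown on a one-row toy frame.

```python
import pandas as pd

def move_column(df, name, position):
    """Return a new frame with column `name` placed at integer `position`."""
    cols = [c for c in df.columns if c != name]
    cols.insert(position, name)
    return df[cols]

toy = pd.DataFrame(
    {"Name": ["Bulbasaur"], "Defense": [49], "Sp. Atk": [65], "Speed": [45]}
)

# Place Speed right after Defense, mirroring the reordering done above.
reordered = move_column(toy, "Speed", 2)
print(list(reordered.columns))
```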


## Filtering Data

---

### Displaying data values based on conditions


Let's look for unusual combinations of Pokemon types and see which Pokemon have the highest value of each attribute.



Pandas documentation: 
- https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html
- https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iloc.html

In [20]:
# code here

pokemod.loc[pokemod['Type 1'] == 'Grass']

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Speed,Sp. Atk,Sp. Def,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,45,65,65,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,60,80,80,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,80,100,100,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,80,122,120,1,False
48,43,Oddish,Grass,Poison,320,45,50,55,30,75,65,1,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...
718,650,Chespin,Grass,,313,56,61,65,38,48,45,6,False
719,651,Quilladin,Grass,,405,61,78,95,57,56,58,6,False
720,652,Chesnaught,Grass,Fighting,530,88,107,122,64,74,75,6,False
740,672,Skiddo,Grass,,350,66,65,48,52,62,57,6,False


In [21]:
pokemod.loc[(pokemod['Type 1'] == 'Grass') & (pokemod['Type 2'] == 'Poison')]

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Speed,Sp. Atk,Sp. Def,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,45,65,65,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,60,80,80,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,80,100,100,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,80,122,120,1,False
48,43,Oddish,Grass,Poison,320,45,50,55,30,75,65,1,False
49,44,Gloom,Grass,Poison,395,60,65,70,40,85,75,1,False
50,45,Vileplume,Grass,Poison,490,75,80,85,50,110,90,1,False
75,69,Bellsprout,Grass,Poison,300,50,75,35,40,70,30,1,False
76,70,Weepinbell,Grass,Poison,390,65,90,50,55,85,45,1,False
77,71,Victreebel,Grass,Poison,490,80,105,65,70,100,70,1,False


In [22]:
pokemod.loc[(pokemod['Type 1'] == 'Grass') | (pokemod['Type 1'] == 'Poison')]

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Speed,Sp. Atk,Sp. Def,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,45,65,65,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,60,80,80,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,80,100,100,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,80,122,120,1,False
28,23,Ekans,Poison,,288,35,60,44,55,40,54,1,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...
720,652,Chesnaught,Grass,Fighting,530,88,107,122,64,74,75,6,False
740,672,Skiddo,Grass,,350,66,65,48,52,62,57,6,False
741,673,Gogoat,Grass,,531,123,100,62,68,97,81,6,False
760,690,Skrelp,Poison,Water,320,50,60,60,30,60,60,6,False


In [23]:
pokemod.loc[pokemod['Attack'] >= 170]

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Speed,Sp. Atk,Sp. Def,Generation,Legendary
163,150,MewtwoMega Mewtwo X,Psychic,Fighting,780,106,190,100,130,154,100,1,True
232,214,HeracrossMega Heracross,Bug,Fighting,600,80,185,115,75,40,105,2,False
424,383,GroudonPrimal Groudon,Ground,Fire,770,100,180,160,90,150,90,3,True
426,384,RayquazaMega Rayquaza,Dragon,Flying,780,105,180,100,115,180,100,3,True
429,386,DeoxysAttack Forme,Psychic,,600,50,180,20,150,180,20,3,True
494,445,GarchompMega Garchomp,Dragon,Ground,700,108,170,115,92,120,95,4,False
711,646,KyuremBlack Kyurem,Dragon,Ice,700,125,170,100,95,120,90,5,True


In [24]:
pokemod.loc[(pokemod['Attack'] >= 170) & (pokemod['Legendary'] == False)]

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Speed,Sp. Atk,Sp. Def,Generation,Legendary
232,214,HeracrossMega Heracross,Bug,Fighting,600,80,185,115,75,40,105,2,False
494,445,GarchompMega Garchomp,Dragon,Ground,700,108,170,115,92,120,95,4,False
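
Threshold filters like the ones above need a cutoff chosen in advance. To ask directly which row holds the maximum of each attribute, `idxmax` and `nlargest` are one option. The sketch below runs on a three-row toy frame built from values shown earlier, so its answers apply to the toy data only.

```python
import pandas as pd

toy = pd.DataFrame({
    "Name": ["Bulbasaur", "Ivysaur", "Charmander"],
    "HP": [45, 60, 39],
    "Attack": [49, 62, 52],
    "Defense": [49, 63, 43],
    "Speed": [45, 60, 65],
})

stats = ["HP", "Attack", "Defense", "Speed"]

# With Name as the index, idxmax returns the name of the top row per column.
best = toy.set_index("Name")[stats].idxmax()
print(best)

# nlargest returns the top-n rows for one column, already sorted.
top_attack = toy.nlargest(2, "Attack")["Name"].tolist()
print(top_attack)
```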


## Conditional Changes 

---

### Replacing values based on conditions/filters


Let's change some values based on pokemon types



Pandas documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html

#### Note/Cheat Sheet on Common Pandas Operators:

```
AND can be represented as &
OR can be represented as |
NOT can be represented as ~
```
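
A minimal sketch of the three operators on a toy frame follows. Storing each condition in a named variable sidesteps a common pitfall: `&` and `|` bind more tightly than `==`, so conditions written inline must each be wrapped in parentheses.

```python
import pandas as pd

toy = pd.DataFrame({
    "Name": ["Bulbasaur", "Charmander", "Articuno"],
    "Type 1": ["Grass", "Fire", "Ice"],
    "Legendary": [False, False, True],
})

is_grass = toy["Type 1"] == "Grass"
is_legendary = toy["Legendary"]

both = toy.loc[is_grass & is_legendary, "Name"].tolist()        # AND
either = toy.loc[is_grass | is_legendary, "Name"].tolist()      # OR
neither = toy.loc[~(is_grass | is_legendary), "Name"].tolist()  # NOT

print(both, either, neither)
```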

In [25]:

# code here

pokemod.loc[pokemod['Total'] >= 500, 'Overpowered'] = True
pokemod.loc[pokemod['Total'] < 500, 'Overpowered'] = False
pokemod

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Speed,Sp. Atk,Sp. Def,Generation,Legendary,Overpowered
0,1,Bulbasaur,Grass,Poison,318,45,49,49,45,65,65,1,False,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,60,80,80,1,False,False
2,3,Venusaur,Grass,Poison,525,80,82,83,80,100,100,1,False,True
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,80,122,120,1,False,True
4,4,Charmander,Fire,,309,39,52,43,65,60,50,1,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
795,719,Diancie,Rock,Fairy,600,50,100,150,50,100,150,6,True,True
796,719,DiancieMega Diancie,Rock,Fairy,700,50,160,110,110,160,110,6,True,True
797,720,HoopaHoopa Confined,Psychic,Ghost,600,80,110,60,70,150,130,6,True,True
798,720,HoopaHoopa Unbound,Psychic,Dark,680,80,160,60,80,170,130,6,True,True
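
The two `loc` assignments can also be written as a single step, because the comparison itself already produces a boolean column. The sketch below shows this on a toy frame, along with `np.where` for the case where the two outcomes are labels rather than `True`/`False`. The `Tier` column and its `"high"`/`"low"` labels are made up for illustration.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame(
    {"Name": ["Bulbasaur", "Ivysaur", "Venusaur"], "Total": [318, 405, 525]}
)

# One assignment covers both the True and the False case.
toy["Overpowered"] = toy["Total"] >= 500

# np.where picks between two values element by element.
toy["Tier"] = np.where(toy["Total"] >= 500, "high", "low")

print(toy)
```

Assigning the comparison directly also keeps the column's dtype as `bool`.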


In [26]:
# Build the mask from pokemod (its 'Type 1' still says 'Bug') and apply it to pokemod2
pokemod2.loc[pokemod['Type 1'] == 'Bug', 'Type 1'] = 'Insect'
pokemod2.loc[pokemod['Type 1'] == 'Bug']

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Speed,Legendary,isPokemon,Offense,TrueDef
13,10,Caterpie,Insect,,195,45,30,35,45,,True,75,80
14,11,Metapod,Insect,,205,50,20,55,30,,True,50,85
15,12,Butterfree,Insect,Flying,395,60,45,50,70,,True,115,120
16,13,Weedle,Insect,Poison,195,40,35,30,50,,True,85,80
17,14,Kakuna,Insect,Poison,205,45,25,50,35,,True,60,85
...,...,...,...,...,...,...,...,...,...,...,...,...,...
698,637,Volcarona,Insect,Fire,550,85,60,65,100,,True,160,165
717,649,Genesect,Insect,Steel,600,71,120,95,99,,True,219,194
732,664,Scatterbug,Insect,,200,38,35,40,35,,True,70,75
733,665,Spewpa,Insect,,213,45,22,60,29,,True,51,89


## Aggregate Statistics 

---

### Using the groupby function with numpy functions


Let's see which type of Pokemon has the highest average and median of each attribute.



Pandas documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html

In [27]:
# code here

# numeric_only=True skips text columns such as Name (required on pandas 2.0+)
pokemod.groupby(['Type 1']).mean(numeric_only=True)

Type 1,#,Total,HP,Attack,Defense,Speed,Sp. Atk,Sp. Def,Generation,Legendary
Bug,334.492754,378.927536,56.884058,70.971014,70.724638,61.681159,53.869565,64.797101,3.217391,0.0
Dark,461.354839,445.741935,66.806452,88.387097,70.225806,76.16129,74.645161,69.516129,4.032258,0.064516
Dragon,474.375,550.53125,83.3125,112.125,86.375,83.03125,96.84375,88.84375,3.875,0.375
Electric,363.5,443.409091,59.795455,69.090909,66.295455,84.5,90.022727,73.704545,3.272727,0.090909
Fairy,449.529412,413.176471,74.117647,61.529412,65.705882,48.588235,78.529412,84.705882,4.117647,0.058824
Fighting,363.851852,416.444444,69.851852,96.777778,65.925926,66.074074,53.111111,64.703704,3.37037,0.0
Fire,327.403846,458.076923,69.903846,84.769231,67.769231,74.442308,88.980769,72.211538,3.211538,0.096154
Flying,677.75,485.0,70.75,78.75,66.25,102.5,94.25,72.5,5.5,0.5
Ghost,486.5,439.5625,64.4375,73.78125,81.1875,64.34375,79.34375,76.46875,4.1875,0.0625
Grass,344.871429,421.142857,67.271429,73.214286,70.8,61.928571,77.5,70.428571,3.357143,0.042857


In [31]:
pokemod.groupby(['Type 1']).median(numeric_only=True)

Type 1,#,Total,HP,Attack,Defense,Speed,Sp. Atk,Sp. Def,Generation,Legendary
Bug,291.0,395.0,60.0,65.0,60.0,60.0,50.0,60.0,3.0,0.0
Dark,509.0,465.0,65.0,88.0,70.0,70.0,65.0,65.0,5.0,0.0
Dragon,443.5,600.0,80.0,113.5,90.0,90.0,105.0,90.0,4.0,0.0
Electric,403.5,477.5,60.0,65.0,65.0,88.0,95.0,79.5,4.0,0.0
Fairy,669.0,405.0,78.0,52.0,66.0,45.0,75.0,79.0,6.0,0.0
Fighting,308.0,455.0,70.0,100.0,70.0,60.0,40.0,63.0,3.0,0.0
Fire,289.5,482.0,70.0,84.5,64.0,78.5,85.0,67.5,3.0,0.0
Flying,677.5,557.5,79.0,85.0,75.0,116.0,103.5,80.0,5.5,0.5
Ghost,487.0,464.5,59.5,66.0,72.5,60.5,65.0,75.0,4.0,0.0
Grass,372.0,430.0,65.5,70.0,66.0,58.5,75.0,66.0,3.5,0.0


In [33]:
pokemod.groupby(['Type 1']).mean(numeric_only=True).sort_values('Attack', ascending=False)

Type 1,#,Total,HP,Attack,Defense,Speed,Sp. Atk,Sp. Def,Generation,Legendary
Dragon,474.375,550.53125,83.3125,112.125,86.375,83.03125,96.84375,88.84375,3.875,0.375
Fighting,363.851852,416.444444,69.851852,96.777778,65.925926,66.074074,53.111111,64.703704,3.37037,0.0
Ground,356.28125,437.5,73.78125,95.75,84.84375,63.90625,56.46875,62.75,3.15625,0.125
Rock,392.727273,453.75,65.363636,92.863636,100.795455,55.909091,63.340909,75.477273,3.454545,0.090909
Steel,442.851852,487.703704,65.222222,92.703704,126.37037,55.259259,67.518519,80.62963,3.851852,0.148148
Dark,461.354839,445.741935,66.806452,88.387097,70.225806,76.16129,74.645161,69.516129,4.032258,0.064516
Fire,327.403846,458.076923,69.903846,84.769231,67.769231,74.442308,88.980769,72.211538,3.211538,0.096154
Flying,677.75,485.0,70.75,78.75,66.25,102.5,94.25,72.5,5.5,0.5
Poison,251.785714,399.142857,67.25,74.678571,68.821429,63.571429,60.428571,64.392857,2.535714,0.0
Water,303.089286,430.455357,72.0625,74.151786,72.946429,65.964286,74.8125,70.517857,2.857143,0.035714
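
Averaging every column also averages identifiers such as `#` and `Generation`, which is rarely meaningful. A hedged alternative is to select the columns of interest first and compute several statistics at once with `agg`. The sketch below uses a four-row toy frame, so the numbers differ from the full-dataset output above.

```python
import pandas as pd

# Toy data for illustration only.
toy = pd.DataFrame({
    "Name": ["Bulbasaur", "Ivysaur", "Charmander", "Charmeleon"],
    "Type 1": ["Grass", "Grass", "Fire", "Fire"],
    "HP": [45, 60, 39, 58],
    "Attack": [49, 62, 52, 64],
})

# Selecting columns before aggregating keeps text columns out of the math.
summary = toy.groupby("Type 1")[["HP", "Attack"]].agg(["mean", "median"])
print(summary)

# A single statistic, ranked from highest to lowest.
ranked = toy.groupby("Type 1")["Attack"].mean().sort_values(ascending=False)
print(ranked)
```

The result of `agg` with a list has two-level column labels, so a single cell is read with a tuple such as `summary.loc["Grass", ("HP", "mean")]`.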


## Exporting Our Dataframe

---

### Saving our dataframe into a csv or other type of file


Let's save our dataframe into a separate csv file

Pandas documentation: 
- https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html
- https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_json.html

In [30]:
# code here
pokemod.to_csv('rearrangedpokemon.csv')
pokemod2.to_json('clownedpokemon.json')
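
By default `to_csv` writes the row index as an extra, unnamed first column, which comes back as `Unnamed: 0` when the file is read again. The sketch below shows the effect and the `index=False` option on a toy frame. It writes to in-memory strings rather than to disk so that it has no side effects.

```python
import io
import pandas as pd

toy = pd.DataFrame({"Name": ["Bulbasaur", "Ivysaur"], "HP": [45, 60]})

# With no path argument, to_csv returns the CSV text as a string.
with_index = toy.to_csv()
without_index = toy.to_csv(index=False)

reloaded = pd.read_csv(io.StringIO(with_index))
clean = pd.read_csv(io.StringIO(without_index))

print(list(reloaded.columns))
print(list(clean.columns))
```

Whether to keep the index depends on the data: drop it when it is just a 0-based counter, keep it when it carries meaning.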

## Using a real-world dataset:

---
https://www.kaggle.com/jolasa/bay-area-bike-sharing-trips

In [None]:
# Importing the new data

bike_df = pd.read_csv('2019 - 01.csv')

### Let's rearrange the dataset according to our task:

"Predict the trip duration based on the start_station_name, user_type, member_birth year, and member_gender"

In [None]:
# Rearrange the data

cols = bike_df.columns.tolist()
print(cols)

In [None]:
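# Move the second column to the end; single columns are wrapped in lists so they can be concatenated
cols = [cols[0]] + cols[2:] + [cols[1]]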
print(cols)

In [None]:
bike_df = bike_df[cols]
bike_df.head()

In [None]:
bike_df.drop(columns=['start_station_id','end_station_id','bike_id'], inplace=True)

### Let's use what we learned to get some general info from this dataset:
- Unique values
- Duplicates
- Aggregate statistics
- Null/NaN values

In [None]:
# Combing through the data

pd.unique(bike_df['user_type'])

In [None]:
pd.unique(bike_df['member_gender'])

In [None]:
pd.unique(bike_df['month'])

In [None]:
bike_df.isna().sum()

In [None]:
bike_df.duplicated(subset=['start_station_name'])

In [None]:
bike_df.groupby(['member_gender']).mean(numeric_only=True)
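
The four checks listed above (unique values, duplicates, aggregate statistics, and null values) can be tried without downloading the bike data. The sketch below uses a toy frame with made-up rows. The column names `start_station_name`, `user_type`, and `member_gender` come from the task description, while `duration_sec` and every value are assumptions for illustration.

```python
import pandas as pd

# Hypothetical rows; only three of the column names come from the workshop text.
toy = pd.DataFrame({
    "start_station_name": ["A St", "B St", "A St", "C St"],
    "user_type": ["Subscriber", "Customer", "Subscriber", "Subscriber"],
    "member_gender": ["Male", None, "Female", "Male"],
    "duration_sec": [300, 900, 450, 600],
})

unique_users = toy["user_type"].unique().tolist()  # distinct values, in order seen
missing = toy.isna().sum()                         # null count per column
repeat_stations = int(toy.duplicated(subset=["start_station_name"]).sum())
mean_duration = toy.groupby("member_gender")["duration_sec"].mean()

print(unique_users)
print(missing)
print(repeat_stations)
print(mean_duration)
```

Note that `groupby` leaves out rows whose key is missing by default, so the trip with no recorded gender does not appear in `mean_duration`.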