# **DEMO 2**

Dataset: [Pokemon with stats (link)](https://www.kaggle.com/datasets/abcsds/pokemon)

Selection Rationale: I chose this dataset because I was recently playing the new Nintendo Switch game.

___

### Analysis Objectives & Plan
1. You want to become a strong pokemon gym leader using the stats file! 
- First, set the index to the pokemon 'Name' column in the file.
- Condense the number of columns to show the pokemon type and total base points ('Total').
- Remove any unnecessary columns from the dataset.
2. Most gym leaders have a team made up of a particular pokemon type. Sort your list by pokemon type. 
- Sort the pokemon by type and by highest HP (health points).
- Sort the pokemon by highest overall (Total) points and create a list of the strongest pokemon in descending order. 
- Now choose a type based on the top few strongest. Organize the list to have your chosen type and highest Total points.
3. The strongest pokemon within certain types may not be possible to capture or available in all games. Generate the list of realistic pokemon you can use. 
- Remove options you likely won't be able to obtain within the game (e.g., legendary pokemon).
- Provide descriptive statistics for your new list. 
- What are the mean values of the new list?
- Based on the highest Speed stats, choose your team of 6 pokemon.

Love the enthusiasm for the first objective, but it's a bit unclear what the analysis objective is - what exactly are we looking to find? (e.g., you could write, "Look at pokemon types and see base points"). Otherwise - the rest looks great here!

___

### Import Libraries 

In [1]:
# import necessary libraries 
import pandas as pd 

___

## Import and Explore Data
- import dataset and assign to a variable named df
- print/output the following (as pythonically as possible):
    - DataFrame's shape
    - Column names 
    - First 5 rows 

In [2]:
# import data from csv file
df = pd.read_csv('Pokemon2.csv')

In [3]:
df 

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...
795,719,Diancie,Rock,Fairy,600,50,100,150,100,150,50,6,True
796,719,DiancieMega Diancie,Rock,Fairy,700,50,160,110,160,110,110,6,True
797,720,HoopaHoopa Confined,Psychic,Ghost,600,80,110,60,150,130,70,6,True
798,720,HoopaHoopa Unbound,Psychic,Dark,680,80,160,60,170,130,80,6,True


Always good practice to include a comment indicating the purpose of the code.

In [4]:
# df shape
print(df.shape)

(800, 13)


In [5]:
# df columns 
print(df.columns)

Index(['#', 'Name', 'Type 1', 'Type 2', 'Total', 'HP', 'Attack', 'Defense',
       'Sp. Atk', 'Sp. Def', 'Speed', 'Generation', 'Legendary'],
      dtype='object')


In [7]:
# output first 5 rows 
print(df.head())

   #                   Name Type 1  Type 2  Total  HP  Attack  Defense  \
0  1              Bulbasaur  Grass  Poison    318  45      49       49   
1  2                Ivysaur  Grass  Poison    405  60      62       63   
2  3               Venusaur  Grass  Poison    525  80      82       83   
3  3  VenusaurMega Venusaur  Grass  Poison    625  80     100      123   
4  4             Charmander   Fire     NaN    309  39      52       43   

   Sp. Atk  Sp. Def  Speed  Generation  Legendary  
0       65       65     45           1      False  
1       80       80     60           1      False  
2      100      100     80           1      False  
3      122      120     80           1      False  
4       60       50     65           1      False  


___

Nice job exploring the data!

### Import and Explain Tasks 

___

##### Objective 1

In [8]:
# Step 1. We don't really care about the pokemon number, so set Name as the index
df = df.set_index('Name')


In [9]:
df

Name,#,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
Bulbasaur,1,Grass,Poison,318,45,49,49,65,65,45,1,False
Ivysaur,2,Grass,Poison,405,60,62,63,80,80,60,1,False
Venusaur,3,Grass,Poison,525,80,82,83,100,100,80,1,False
VenusaurMega Venusaur,3,Grass,Poison,625,80,100,123,122,120,80,1,False
Charmander,4,Fire,,309,39,52,43,60,50,65,1,False
...,...,...,...,...,...,...,...,...,...,...,...,...
Diancie,719,Rock,Fairy,600,50,100,150,100,150,50,6,True
DiancieMega Diancie,719,Rock,Fairy,700,50,160,110,160,110,110,6,True
HoopaHoopa Confined,720,Psychic,Ghost,600,80,110,60,150,130,70,6,True
HoopaHoopa Unbound,720,Psychic,Dark,680,80,160,60,170,130,80,6,True


Should include a comment in the code cell above to indicate its purpose.

In [10]:
#Step 2. We primarily want to see the pokemon type and total base points. Slice the table by position to show these in a simpler view
# Note: the row slice `:-1` leaves out the last row, and columns 0:4 also include '#'
df.iloc[:-1, 0:4] 

Name,#,Type 1,Type 2,Total
Bulbasaur,1,Grass,Poison,318
Ivysaur,2,Grass,Poison,405
Venusaur,3,Grass,Poison,525
VenusaurMega Venusaur,3,Grass,Poison,625
Charmander,4,Fire,,309
...,...,...,...,...
Zygarde50% Forme,718,Dragon,Ground,600
Diancie,719,Rock,Fairy,600
DiancieMega Diancie,719,Rock,Fairy,700
HoopaHoopa Confined,720,Psychic,Ghost,600
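
The positional slice in Step 2 also carries the '#' column and stops one row short of the end of the table. Selecting by column label is one way to show only the type and Total columns for every row. A minimal sketch, assuming a small hand-built `sample` frame (a hypothetical stand-in made from rows shown in this notebook) rather than the full dataset:

```python
import pandas as pd

# Hypothetical stand-in for the notebook's df, built from rows shown above.
sample = pd.DataFrame(
    {
        "Name": ["Bulbasaur", "Ivysaur", "Charmander", "HoopaHoopa Unbound"],
        "#": [1, 2, 4, 720],
        "Type 1": ["Grass", "Grass", "Fire", "Psychic"],
        "Type 2": ["Poison", "Poison", None, "Dark"],
        "Total": [318, 405, 309, 680],
        "HP": [45, 60, 39, 80],
    }
).set_index("Name")

# Select by column label: keeps every row and leaves out the '#' column.
type_and_total = sample[["Type 1", "Type 2", "Total"]]
print(type_and_total)
```

Label-based selection does not depend on column order, so it should keep working if columns are dropped or rearranged earlier in the notebook.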


In [11]:
#Step 3. We don't care which generation the pokemon is from, so we can drop this column
df = df.drop(columns = 'Generation')

Code looks good here!

___

##### Objective 2

In [12]:
# step 1. We want to see the pokemon with the highest HP sorted by Type. Sort the table to show this
df_sorted = df.sort_values(['Type 1', 'HP'], ascending=[True, False])

In [13]:
print(df_sorted)

                           # Type 1    Type 2  Total  HP  Attack  Defense  \
Name                                                                        
Yanmega                  469    Bug    Flying    515  86      76       86   
Volcarona                637    Bug      Fire    550  85      60       65   
Heracross                214    Bug  Fighting    500  80     125       75   
HeracrossMega Heracross  214    Bug  Fighting    600  80     185      115   
Accelgor                 617    Bug       NaN    495  80      70       40   
...                      ...    ...       ...    ...  ..     ...      ...   
Krabby                    98  Water       NaN    325  30     105       90   
Horsea                   116  Water       NaN    295  30      40       70   
Staryu                   120  Water       NaN    340  30      45       55   
Magikarp                 129  Water       NaN    200  20      10       55   
Feebas                   349  Water       NaN    200  20      15       20   

The last two code cells could be combined into one cell.
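
As one possible way to combine the sort and the display in a single cell, here is a minimal sketch. The `sample` frame is a hypothetical stand-in built from a few rows of the output above, not the full dataset:

```python
import pandas as pd

# Hypothetical stand-in using rows that appear in the sorted output above.
sample = pd.DataFrame(
    {
        "Name": ["Magikarp", "Yanmega", "Krabby", "Volcarona"],
        "Type 1": ["Water", "Bug", "Water", "Bug"],
        "HP": [20, 86, 30, 85],
    }
).set_index("Name")

# Sort by type (A to Z), then by HP from highest to lowest within each type,
# and display the result in the same cell.
sample_sorted = sample.sort_values(["Type 1", "HP"], ascending=[True, False])
print(sample_sorted)
```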

In [14]:
#Step 2. We want pokemon with high overall (Total) points. Create a list of the strongest pokemon in descending order. 
condition = df['Total'] > 500
sorted_df = df[condition].sort_values(by='Total', ascending=False)
print(sorted_df)

                         #    Type 1    Type 2  Total   HP  Attack  Defense  \
Name                                                                          
RayquazaMega Rayquaza  384    Dragon    Flying    780  105     180      100   
MewtwoMega Mewtwo X    150   Psychic  Fighting    780  106     190      100   
MewtwoMega Mewtwo Y    150   Psychic       NaN    780  106     150       70   
GroudonPrimal Groudon  383    Ground      Fire    770  100     180      160   
KyogrePrimal Kyogre    382     Water       NaN    770  100     150       90   
...                    ...       ...       ...    ...  ...     ...      ...   
Machamp                 68  Fighting       NaN    505   90     130       80   
Ninetales               38      Fire       NaN    505   73      76       75   
Nidoking                34    Poison    Ground    505   81     102       77   
Nidoqueen               31    Poison    Ground    505   90      92       87   
Conkeldurr             534  Fighting       NaN    50
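
To choose a type from the strongest pokemon, one option is to count how often each 'Type 1' appears in the filtered list. A minimal sketch, assuming a hypothetical `strong` frame that holds only the five top rows shown above:

```python
import pandas as pd

# Hypothetical stand-in: the five highest-Total rows from the output above.
strong = pd.DataFrame(
    {
        "Name": [
            "RayquazaMega Rayquaza",
            "MewtwoMega Mewtwo X",
            "MewtwoMega Mewtwo Y",
            "GroudonPrimal Groudon",
            "KyogrePrimal Kyogre",
        ],
        "Type 1": ["Dragon", "Psychic", "Psychic", "Ground", "Water"],
        "Total": [780, 780, 780, 770, 770],
    }
).set_index("Name")

# Count how many of the strongest pokemon belong to each primary type.
type_counts = strong["Type 1"].value_counts()
print(type_counts)
```

On the full `sorted_df` the counts may differ, so this is only an illustration of the approach.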

___

In [15]:
#Step 3. You decided to choose psychic pokemon as your type to train. Organize the list to have psychic type and highest Total points.
psychic_condition = df['Type 1'] == 'Psychic'
total_condition = df['Total'] > 500  # this filter is on Total, not HP
combined_condition = psychic_condition & total_condition
psychic_pokemon_high_total = df[combined_condition].sort_values(by='Total', ascending=False)
print(psychic_pokemon_high_total)

                           #   Type 1    Type 2  Total   HP  Attack  Defense  \
Name                                                                           
MewtwoMega Mewtwo X      150  Psychic  Fighting    780  106     190      100   
MewtwoMega Mewtwo Y      150  Psychic       NaN    780  106     150       70   
HoopaHoopa Unbound       720  Psychic      Dark    680   80     160       60   
Lugia                    249  Psychic    Flying    680  106      90      130   
Mewtwo                   150  Psychic       NaN    680  106     110       90   
GalladeMega Gallade      475  Psychic  Fighting    618   68     165       95   
GardevoirMega Gardevoir  282  Psychic     Fairy    618   68      85       65   
DeoxysDefense Forme      386  Psychic       NaN    600   50      70      160   
HoopaHoopa Confined      720  Psychic     Ghost    600   80     110       60   
Victini                  494  Psychic      Fire    600  100     100      100   
Cresselia                488  Psychic   

Output looks good in this objective section!


##### Objective 3

In [16]:
# step 1. Realistically, you will not have legendary pokemon on your team. Filter your new list to remove legendary pokemon.
new_table_rows = df[(df['Type 1'] == 'Psychic') & (df['Total'] > 500) & ~df['Legendary']]
print(new_table_rows)

                           #   Type 1    Type 2  Total   HP  Attack  Defense  \
Name                                                                           
AlakazamMega Alakazam     65  Psychic       NaN    590   55      50       65   
Mew                      151  Psychic       NaN    600  100     100      100   
Espeon                   196  Psychic       NaN    525   65      65       60   
Celebi                   251  Psychic     Grass    600  100     100      100   
Gardevoir                282  Psychic     Fairy    518   68      65       65   
GardevoirMega Gardevoir  282  Psychic     Fairy    618   68      85       65   
Gallade                  475  Psychic  Fighting    518   68     125       65   
GalladeMega Gallade      475  Psychic  Fighting    618   68     165       95   
Cresselia                488  Psychic       NaN    600  120      70      120   

                         Sp. Atk  Sp. Def  Speed  Legendary  
Name                                                     

In [17]:
# step 2. What are the descriptive statistics of your new list?
df = new_table_rows
df.describe()

Unnamed: 0,#,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed
count,9.0,9.0,9.0,9.0,9.0,9.0,9.0,9.0
mean,296.111111,576.333333,79.111111,91.666667,81.666667,111.111111,111.111111,101.666667
std,153.194684,42.982555,21.848595,35.88175,22.079402,40.833333,14.743172,21.505813
min,65.0,518.0,55.0,50.0,60.0,65.0,95.0,80.0
25%,196.0,525.0,68.0,65.0,65.0,75.0,100.0,85.0
50%,282.0,600.0,68.0,85.0,65.0,100.0,115.0,100.0
75%,475.0,600.0,100.0,100.0,100.0,130.0,115.0,110.0
max,488.0,618.0,120.0,165.0,120.0,175.0,135.0,150.0


In [19]:
# step 3. What is the mean of your new list?
df.mean()

TypeError: unsupported operand type(s) for +: 'int' and 'str'

This code threw an error. Also, this code cell isn't necessary because you can already view the means in the previous output.
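
The error most likely comes from `DataFrame.mean()` trying to average text columns such as 'Type 1'. If a separate mean is still wanted, passing `numeric_only=True` restricts the calculation to numeric columns. A minimal sketch, assuming a hypothetical `candidates` frame built from three rows of the filtered list:

```python
import pandas as pd

# Hypothetical stand-in using three rows from the filtered list above.
candidates = pd.DataFrame(
    {
        "Name": ["Espeon", "Celebi", "Gardevoir"],
        "Type 1": ["Psychic", "Psychic", "Psychic"],
        "Total": [525, 600, 518],
        "HP": [65, 100, 68],
    }
).set_index("Name")

# numeric_only=True skips text columns such as 'Type 1'.
numeric_means = candidates.mean(numeric_only=True)
print(numeric_means)
```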

In [20]:
#Step 4. Based on highest speed stats choose your team of 6 pokemon.
my_team = ['AlakazamMega Alakazam', 'Espeon', 'Celebi', 'Mew', 'GardevoirMega Gardevoir', 'GalladeMega Gallade']

In [21]:
df.loc[my_team]

Name,#,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Legendary
AlakazamMega Alakazam,65,Psychic,,590,55,50,65,175,95,150,False
Espeon,196,Psychic,,525,65,65,60,130,95,110,False
Celebi,251,Psychic,Grass,600,100,100,100,100,100,100,False
Mew,151,Psychic,,600,100,100,100,100,100,100,False
GardevoirMega Gardevoir,282,Psychic,Fairy,618,68,85,65,165,135,100,False
GalladeMega Gallade,475,Psychic,Fighting,618,68,165,95,65,115,110,False


Output looks good! Just would want a comment indicating what the code is doing.
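
The team could also be picked in code instead of typing the names by hand. A minimal sketch using `nlargest`, assuming a hypothetical `candidates` frame with the nine filtered pokemon. The Speed values for Gardevoir, Gallade, and Cresselia are not visible in the truncated output, so 80, 80, and 85 are assumed here; they are consistent with the `describe()` summary:

```python
import pandas as pd

# Hypothetical stand-in for the nine filtered psychic pokemon.
# Speed for Gardevoir, Gallade, and Cresselia (80, 80, 85) is an assumption.
candidates = pd.DataFrame(
    {
        "Name": [
            "AlakazamMega Alakazam", "Mew", "Espeon", "Celebi", "Gardevoir",
            "GardevoirMega Gardevoir", "Gallade", "GalladeMega Gallade", "Cresselia",
        ],
        "Total": [590, 600, 525, 600, 518, 618, 518, 618, 600],
        "Speed": [150, 100, 110, 100, 80, 100, 80, 110, 85],
    }
).set_index("Name")

# Keep the six rows with the highest Speed.
team = candidates.nlargest(6, "Speed")
print(team)

# Mean Total of the chosen team (as opposed to the full nine-row list).
team_mean_total = team["Total"].mean()
print(round(team_mean_total, 2))
```

With these values, the six selected names match the hand-picked `my_team` list.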

___

##### Conclusions

By filtering on total stat points, removing unnecessary columns from the dataset, and sorting by type and by descending statistics, we were able to generate a list of strong types and ultimately decide on a psychic-type list. An additional condition on legendary status let us generate a team of 6 that we could realistically capture and train within a game to run a psychic-style gym. 

We narrowed down our list in the end and, because the list was condensed, could make our final choice based on another stat (Speed). We also generated basic descriptive statistics for the filtered list, with the mean being the most useful. The mean Total of the 9 filtered pokemon was about 576.33, which is reasonable given that our threshold was Total > 500 and the highest Total in the dataset is 780 (for a legendary). 

You would want to remove the markdown cell above (you wouldn't want extraneous cells). Overall, nice work explaining what sort of information your analysis can tell us!