# Introduction to pandas

This notebook introduces the basic commands of the pandas library for Python.

The two fundamental data structures in pandas are Series and DataFrame.

A Series represents a single column of our dataset, while a DataFrame is a two-dimensional table made up of a collection of Series.
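
The relationship between the two structures can be illustrated with a minimal sketch. The data below is hypothetical toy data, not taken from superheroes.csv, and the variable names are chosen only for this example:

```python
import pandas as pd

# Hypothetical toy data, not taken from superheroes.csv.
name = pd.Series(["Hero A", "Hero B", "Hero C"], name="name")
height = pd.Series([180.0, 165.0, 203.0], name="Height")

# A DataFrame can be assembled from several Series that share an index.
toy = pd.concat([name, height], axis=1)

# Selecting one column gives back a Series.
column = toy["Height"]
print(type(toy).__name__)     # DataFrame
print(type(column).__name__)  # Series
print(toy.shape)              # (3, 2)
```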

To get started, let's create our first DataFrame from a CSV file.

In [1]:
import pandas as pd


df = pd.read_csv('superheroes.csv')

In [2]:
df

,name,Gender,Eye color,Race,Hair color,Height,Publisher,Skin color,Alignment,Weight
0,A-Bomb,Male,yellow,Human,No Hair,203.0,Marvel Comics,-,good,441.0
1,Abe Sapien,Male,blue,Icthyo Sapien,No Hair,191.0,Dark Horse Comics,blue,good,65.0
2,Abin Sur,Male,blue,Ungaran,No Hair,185.0,DC Comics,red,good,90.0
3,Abomination,Male,green,Human / Radiation,No Hair,203.0,Marvel Comics,-,bad,441.0
4,Abraxas,Male,blue,Cosmic Entity,Black,-99.0,Marvel Comics,-,bad,-99.0
...,...,...,...,...,...,...,...,...,...,...
729,Yellowjacket II,Female,blue,Human,Strawberry Blond,165.0,Marvel Comics,-,good,52.0
730,Ymir,Male,white,Frost Giant,No Hair,304.8,Marvel Comics,white,good,-99.0
731,Yoda,Male,brown,Yoda's species,White,66.0,George Lucas,green,good,17.0
732,Zatanna,Female,blue,Human,Black,170.0,DC Comics,-,good,57.0


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 734 entries, 0 to 733
Data columns (total 10 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   name        734 non-null    object 
 1   Gender      734 non-null    object 
 2   Eye color   734 non-null    object 
 3   Race        734 non-null    object 
 4   Hair color  734 non-null    object 
 5   Height      734 non-null    float64
 6   Publisher   719 non-null    object 
 7   Skin color  734 non-null    object 
 8   Alignment   734 non-null    object 
 9   Weight      732 non-null    float64
dtypes: float64(2), object(8)
memory usage: 57.5+ KB


In [4]:
df.isnull().sum()

name           0
Gender         0
Eye color      0
Race           0
Hair color     0
Height         0
Publisher     15
Skin color     0
Alignment      0
Weight         2
dtype: int64

We can extract a single column, that is, a Series, from our DataFrame:

In [5]:
races = df['Race']

In [6]:
races

0                  Human
1          Icthyo Sapien
2                Ungaran
3      Human / Radiation
4          Cosmic Entity
             ...        
729                Human
730          Frost Giant
731       Yoda's species
732                Human
733                    -
Name: Race, Length: 734, dtype: object

In [7]:
type(races)

pandas.core.series.Series

In [8]:
type(df)

pandas.core.frame.DataFrame

### Index

In [10]:
df.index

RangeIndex(start=0, stop=734, step=1)

In [11]:
df = df.set_index('name')
df

name,Gender,Eye color,Race,Hair color,Height,Publisher,Skin color,Alignment,Weight
A-Bomb,Male,yellow,Human,No Hair,203.0,Marvel Comics,-,good,441.0
Abe Sapien,Male,blue,Icthyo Sapien,No Hair,191.0,Dark Horse Comics,blue,good,65.0
Abin Sur,Male,blue,Ungaran,No Hair,185.0,DC Comics,red,good,90.0
Abomination,Male,green,Human / Radiation,No Hair,203.0,Marvel Comics,-,bad,441.0
Abraxas,Male,blue,Cosmic Entity,Black,-99.0,Marvel Comics,-,bad,-99.0
...,...,...,...,...,...,...,...,...,...
Yellowjacket II,Female,blue,Human,Strawberry Blond,165.0,Marvel Comics,-,good,52.0
Ymir,Male,white,Frost Giant,No Hair,304.8,Marvel Comics,white,good,-99.0
Yoda,Male,brown,Yoda's species,White,66.0,George Lucas,green,good,17.0
Zatanna,Female,blue,Human,Black,170.0,DC Comics,-,good,57.0


In [12]:
df.index

Index(['A-Bomb', 'Abe Sapien', 'Abin Sur', 'Abomination', 'Abraxas',
       'Absorbing Man', 'Adam Monroe', 'Adam Strange', 'Agent 13', 'Agent Bob',
       ...
       'Wyatt Wingfoot', 'X-23', 'X-Man', 'Yellow Claw', 'Yellowjacket',
       'Yellowjacket II', 'Ymir', 'Yoda', 'Zatanna', 'Zoom'],
      dtype='object', name='name', length=734)
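
The index column can also be chosen while reading the file, instead of in a separate set_index step. The following is a minimal sketch that assumes an in-memory CSV with hypothetical rows in place of the real file:

```python
import io
import pandas as pd

# Hypothetical in-memory CSV standing in for a file on disk.
csv_text = "name,Height\nHero A,180.0\nHero B,165.0\n"

# Two-step version, as in the notebook: read, then set the index.
two_step = pd.read_csv(io.StringIO(csv_text)).set_index("name")

# One-step version: let read_csv use the column as the index.
one_step = pd.read_csv(io.StringIO(csv_text), index_col="name")

print(one_step.index.tolist())  # ['Hero A', 'Hero B']
```

Both versions should produce the same table; the one-step form just saves a line when the index column is known in advance.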

### Selecting the values we are interested in

In [16]:
hero = df.loc['Aurora']

hero

Gender               Female
Eye color              blue
Race                 Mutant
Hair color            Black
Height                  180
Publisher     Marvel Comics
Skin color                -
Alignment              good
Weight                   63
Name: Aurora, dtype: object

In [22]:
hero = df.iloc[55]

hero

Gender               Female
Eye color              blue
Race                 Mutant
Hair color            Black
Height                  180
Publisher     Marvel Comics
Skin color                -
Alignment              good
Weight                   63
Name: Aurora, dtype: object

In [23]:
heroes = df.loc['Aurora':'Banshee']

heroes

name,Gender,Eye color,Race,Hair color,Height,Publisher,Skin color,Alignment,Weight
Aurora,Female,blue,Mutant,Black,180.0,Marvel Comics,-,good,63.0
Azazel,Male,yellow,Neyaphem,Black,183.0,Marvel Comics,red,bad,67.0
Azrael,Male,brown,Human,Black,-99.0,DC Comics,-,good,-99.0
Aztar,Male,-,-,-,-99.0,DC Comics,-,good,-99.0
Bane,Male,-,Human,-,203.0,DC Comics,-,bad,180.0
Banshee,Male,green,Human,Strawberry Blond,183.0,Marvel Comics,-,good,77.0


In [24]:
heroes = df.iloc[55:60]

heroes

name,Gender,Eye color,Race,Hair color,Height,Publisher,Skin color,Alignment,Weight
Aurora,Female,blue,Mutant,Black,180.0,Marvel Comics,-,good,63.0
Azazel,Male,yellow,Neyaphem,Black,183.0,Marvel Comics,red,bad,67.0
Azrael,Male,brown,Human,Black,-99.0,DC Comics,-,good,-99.0
Aztar,Male,-,-,-,-99.0,DC Comics,-,good,-99.0
Bane,Male,-,Human,-,203.0,DC Comics,-,bad,180.0
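
The loc slice and the iloc slice differ in how they treat the end of the range: label-based slicing with loc includes the end label, while position-based slicing with iloc excludes the end position. A small sketch on a hypothetical toy frame (the names and heights are made up for illustration):

```python
import pandas as pd

# Hypothetical toy frame indexed by name (not the real dataset).
toy = pd.DataFrame(
    {"Height": [180.0, 183.0, 170.0, 203.0]},
    index=pd.Index(["Ana", "Ben", "Cleo", "Dan"], name="name"),
)

by_label = toy.loc["Ana":"Cleo"]  # end label is included
by_position = toy.iloc[0:2]       # end position is excluded

print(len(by_label))     # 3
print(len(by_position))  # 2
```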


### Conditions

In [25]:
condition = (df['Skin color'] == 'blue')

In [26]:
condition.head()

name
A-Bomb         False
Abe Sapien      True
Abin Sur       False
Abomination    False
Abraxas        False
Name: Skin color, dtype: bool

To filter the DataFrame and keep only the rows that satisfy the condition:

In [27]:
df[df['Skin color'] == 'blue']

name,Gender,Eye color,Race,Hair color,Height,Publisher,Skin color,Alignment,Weight
Abe Sapien,Male,blue,Icthyo Sapien,No Hair,191.0,Dark Horse Comics,blue,good,65.0
Archangel,Male,blue,Mutant,Blond,183.0,Marvel Comics,blue,good,68.0
Beast,Male,blue,Mutant,Blue,180.0,Marvel Comics,blue,good,181.0
Copycat,Female,red,Mutant,White,183.0,Marvel Comics,blue,neutral,67.0
Dr Manhattan,Male,white,Human / Cosmic,No Hair,-99.0,DC Comics,blue,good,-99.0
Killer Frost,Female,blue,Human,Blond,-99.0,DC Comics,blue,bad,-99.0
Mystique,Female,yellow (without irises),Mutant,Red / Orange,178.0,Marvel Comics,blue,bad,54.0
Nebula,Female,blue,Luphomoid,No Hair,185.0,Marvel Comics,blue,bad,83.0
Shadow Lass,Female,black,Talokite,Black,173.0,DC Comics,blue,good,54.0
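
Boolean conditions like this one can also be combined. The sketch below is illustrative only: it uses hypothetical toy rows that mimic two columns of the dataset, and combines two comparisons with the element-wise & operator:

```python
import pandas as pd

# Hypothetical toy data that mimics two columns of the dataset.
toy = pd.DataFrame(
    {
        "Skin color": ["blue", "blue", "green", "-"],
        "Alignment": ["good", "bad", "good", "good"],
    },
    index=pd.Index(["Hero A", "Hero B", "Hero C", "Hero D"], name="name"),
)

# Each comparison yields a boolean Series; & combines them element-wise.
# The parentheses are required because & binds tighter than ==.
condition = (toy["Skin color"] == "blue") & (toy["Alignment"] == "good")

filtered = toy[condition]
print(filtered)
```

Using | instead of & would keep the rows where at least one of the two comparisons holds.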


### Apply

With apply, we can run a separately defined function over our data.

In [33]:
def rate_height(height):
    if height >= 200:
        return "Tall"
    else:
        return "Not tall"

In [32]:
altos = df['Height'].apply(rate_height)
altos

name
A-Bomb                 Tall
Abe Sapien         Not tall
Abin Sur           Not tall
Abomination            Tall
Abraxas            Not tall
                     ...   
Yellowjacket II    Not tall
Ymir                   Tall
Yoda               Not tall
Zatanna            Not tall
Zoom               Not tall
Name: Height, Length: 734, dtype: object

In [34]:
df['Tallness'] = df['Height'].apply(rate_height)

In [35]:
df

name,Gender,Eye color,Race,Hair color,Height,Publisher,Skin color,Alignment,Weight,Tallness
A-Bomb,Male,yellow,Human,No Hair,203.0,Marvel Comics,-,good,441.0,Tall
Abe Sapien,Male,blue,Icthyo Sapien,No Hair,191.0,Dark Horse Comics,blue,good,65.0,Not tall
Abin Sur,Male,blue,Ungaran,No Hair,185.0,DC Comics,red,good,90.0,Not tall
Abomination,Male,green,Human / Radiation,No Hair,203.0,Marvel Comics,-,bad,441.0,Tall
Abraxas,Male,blue,Cosmic Entity,Black,-99.0,Marvel Comics,-,bad,-99.0,Not tall
...,...,...,...,...,...,...,...,...,...,...
Yellowjacket II,Female,blue,Human,Strawberry Blond,165.0,Marvel Comics,-,good,52.0,Not tall
Ymir,Male,white,Frost Giant,No Hair,304.8,Marvel Comics,white,good,-99.0,Tall
Yoda,Male,brown,Yoda's species,White,66.0,George Lucas,green,good,17.0,Not tall
Zatanna,Female,blue,Human,Black,170.0,DC Comics,-,good,57.0,Not tall
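
For a simple threshold such as the one in rate_height, the same labels can also be computed without calling a Python function once per row. The following sketch is one possible alternative, not the notebook's own method; it assumes NumPy is installed and uses hypothetical toy heights:

```python
import numpy as np
import pandas as pd

def rate_height(height):
    # Same rule as in the notebook.
    return "Tall" if height >= 200 else "Not tall"

# Hypothetical toy heights; -99.0 mirrors a value that appears in the Height column.
heights = pd.Series([203.0, 191.0, -99.0, 304.8], name="Height")

# Element-by-element version using apply.
with_apply = heights.apply(rate_height)

# Vectorized version: np.where evaluates the condition on the whole column at once.
with_where = pd.Series(
    np.where(heights >= 200, "Tall", "Not tall"),
    index=heights.index,
    name="Height",
)

print(with_apply.tolist() == with_where.tolist())  # True
```

Note that both versions label -99.0 as "Not tall", since the rule only compares against 200.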
