## Practice with the Pandas library

To learn more about this library, visit the [official site](https://pandas.pydata.org/).

In [1]:
import pandas as pd

## Loading the dataset
We will use the Titanic passenger dataset.

In [None]:
!wget --no-check-certificate https://catalabs.mx/datasets/titanic.csv -O titanic.csv

We create the DataFrame from the file.

In [None]:
df_titanic = pd.read_csv('titanic.csv')

In [None]:
df_titanic.head()

We create a new DataFrame with the columns Name, Sex and Age.

In [None]:
df_nombre_sexo_edad = df_titanic[["Name","Sex","Age"]]

In [None]:
df_nombre_sexo_edad.head(100)

In [None]:
print(df_nombre_sexo_edad.Name[1])

In [None]:
df_nombre_sexo_edad.shape

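We drop the rows in which the age field (Age) is null. Passing `subset=["Age"]` restricts the null check to that column; a bare `dropna()` would also drop rows with a null in Name or Sex.

In [None]:
df_nombre_sexo_edad = df_nombre_sexo_edad.dropna(subset=["Age"])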

In [None]:
df_nombre_sexo_edad.shape
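
The difference between a bare `dropna()` and `dropna(subset=...)` can be seen on a small table. This is a minimal sketch on made-up data (the `toy` frame and its values are hypothetical, not taken from `titanic.csv`):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not taken from titanic.csv
toy = pd.DataFrame({
    "Name": ["Ana", "Luis", None],
    "Sex": ["female", "male", "male"],
    "Age": [29.0, np.nan, 41.0],
})

# Without arguments, a row is dropped if ANY column is null
any_null_dropped = toy.dropna()

# With subset, only nulls in the listed columns matter
age_null_dropped = toy.dropna(subset=["Age"])

print(any_null_dropped.shape)
print(age_null_dropped.shape)
```

Comparing `shape` before and after, as the notebook does, is a quick way to check how many rows were removed.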

We take a subset of the data that contains only passengers under 10 years old.

In [None]:
menores_de_10 = df_nombre_sexo_edad[df_nombre_sexo_edad["Age"] < 10]

In [None]:
menores_de_10.head(20)

### Determining survivors by class

In [None]:
# Select the first-class passengers
primera_clase = df_titanic[df_titanic["Pclass"] == 1]

primera_clase.shape
total_primera_clase = primera_clase.shape[0]
print(total_primera_clase)
primera_clase_vivos = primera_clase[primera_clase["Survived"] > 0]
primera_clase_vivos.shape
total_primera_clase_vivos = primera_clase_vivos.shape[0]
print(total_primera_clase_vivos)
print(f"Of {total_primera_clase} first-class passengers, {total_primera_clase_vivos} survived, \
which is {round(total_primera_clase_vivos/total_primera_clase*100,2)}%")

In [None]:
# Select the third-class passengers
tercera_clase = df_titanic[df_titanic["Pclass"] == 3]

tercera_clase.shape
total_tercera_clase = tercera_clase.shape[0]
print(total_tercera_clase)
tercera_clase_vivos = tercera_clase[tercera_clase["Survived"] > 0]
tercera_clase_vivos.shape
total_tercera_clase_vivos = tercera_clase_vivos.shape[0]
print(total_tercera_clase_vivos)
print(f"Of {total_tercera_clase} third-class passengers, {total_tercera_clase_vivos} survived, \
which is {round(total_tercera_clase_vivos/total_tercera_clase*100,2)}%")
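
The same per-class figures can be computed for every class at once with a group-by. The sketch below assumes a table with the `Pclass` and `Survived` columns used above; the `toy` data is invented for illustration, so the numbers are not the real Titanic results:

```python
import pandas as pd

# Hypothetical toy data with the same column names as the Titanic dataset
toy = pd.DataFrame({
    "Pclass":   [1, 1, 1, 3, 3, 3, 3],
    "Survived": [1, 1, 0, 0, 0, 1, 0],
})

# Survived is 0/1, so its sum is the number of survivors
# and its mean is the survival rate of each class
summary = toy.groupby("Pclass")["Survived"].agg(
    total="count", survivors="sum", rate="mean"
)
summary["rate"] = (summary["rate"] * 100).round(2)

print(summary)
```

This avoids repeating the filter-and-count code once per class, at the cost of being slightly less explicit for a first read.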

Now we will plot the age values by class and use the survival flag to color each point. What can we observe?

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
sns.scatterplot(x=df_titanic["Pclass"], y=df_titanic["Age"], hue = df_titanic["Survived"])

# Exercises from the Kaggle Pandas course

This section is based on the course available [here](https://www.kaggle.com/learn/pandas).

## 1. Creating, reading and writing

In [None]:
import pandas as pd  # import the library under the conventional alias pd

We create a DataFrame from a dictionary:

In [3]:
df_datos = pd.DataFrame({'Yes': [50, 21], 'No': [131, 2]})
df_datos.head()

Unnamed: 0,Yes,No
0,50,131
1,21,2


In [9]:
df_datos2 = pd.DataFrame({'Bob': ['I liked it.', 'It was awful.'], 'Sue': ['Pretty good.', 'Bland.'], 'Tim':['It was great.','Boring.']})
df_datos2.head()

Unnamed: 0,Bob,Sue,Tim
0,I liked it.,Pretty good.,It was great.
1,It was awful.,Bland.,Boring.


To use our own row labels instead of the default 0, 1, 2, ... index, we can pass them through the `index` parameter.

In [10]:
pd.DataFrame({'Bob': ['I liked it.', 'It was awful.'],
              'Sue': ['Pretty good.', 'Bland.']},
             index=['Product A', 'Product B'])

Unnamed: 0,Bob,Sue
Product A,I liked it.,Pretty good.
Product B,It was awful.,Bland.


### Series

A Series in Pandas is a sequence of values. A single column of a DataFrame can be regarded as a Series. We can create a Series from a plain list.

In [11]:
pd.Series([1, 2, 3, 4, 5])

Unnamed: 0,0
0,1
1,2
2,3
3,4
4,5


Like a DataFrame, a Series can have labels on its index, and the Series itself can be given a name. In essence, a Series is the equivalent of a single DataFrame column.

In [12]:
pd.Series([30, 35, 40], index=['2015 Sales', '2016 Sales', '2017 Sales'], name='Product A')

Unnamed: 0,Product A
2015 Sales,30
2016 Sales,35
2017 Sales,40
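
To make the relationship between the two structures concrete, the following sketch pulls one column out of a small DataFrame and inspects it. The frame reuses the `Yes`/`No` example from above with illustrative index labels:

```python
import pandas as pd

df = pd.DataFrame(
    {"Yes": [50, 21], "No": [131, 2]},
    index=["Product A", "Product B"],
)

# Selecting one column returns a Series
col = df["Yes"]

print(type(col).__name__)
print(col.name)          # the Series name is the column name
print(list(col.index))   # the Series shares the DataFrame's row labels
```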


## Reading data files

Pandas can load data from several file formats; one of the most common is a comma-separated values (CSV) file.

In [13]:
wine_reviews = pd.read_csv("winemag-data-130k-v2.csv")

In [14]:
wine_reviews.head()

Unnamed: 0.1,Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
0,0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia
1,1,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos
2,2,US,"Tart and snappy, the flavors of lime flesh and...",,87,14.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Rainstorm 2013 Pinot Gris (Willamette Valley),Pinot Gris,Rainstorm
3,3,US,"Pineapple rind, lemon pith and orange blossom ...",Reserve Late Harvest,87,13.0,Michigan,Lake Michigan Shore,,Alexander Peartree,,St. Julian 2013 Reserve Late Harvest Riesling ...,Riesling,St. Julian
4,4,US,"Much like the regular bottling from 2012, this...",Vintner's Reserve Wild Child Block,87,65.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Sweet Cheeks 2012 Vintner's Reserve Wild Child...,Pinot Noir,Sweet Cheeks


In [18]:
wine_reviews.shape

(129971, 13)

We can designate a column to be used as the index labels (it is displayed as the leftmost column of the table).

In [17]:
wine_reviews = pd.read_csv("winemag-data-130k-v2.csv", index_col=0)
wine_reviews.head()

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia
1,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos
2,US,"Tart and snappy, the flavors of lime flesh and...",,87,14.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Rainstorm 2013 Pinot Gris (Willamette Valley),Pinot Gris,Rainstorm
3,US,"Pineapple rind, lemon pith and orange blossom ...",Reserve Late Harvest,87,13.0,Michigan,Lake Michigan Shore,,Alexander Peartree,,St. Julian 2013 Reserve Late Harvest Riesling ...,Riesling,St. Julian
4,US,"Much like the regular bottling from 2012, this...",Vintner's Reserve Wild Child Block,87,65.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Sweet Cheeks 2012 Vintner's Reserve Wild Child...,Pinot Noir,Sweet Cheeks
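
The effect of `index_col` can be reproduced without the wine file. This sketch reads a tiny CSV held in memory (the `csv_text` contents are invented for illustration) whose first header cell is empty, which is what produces the `Unnamed: 0` column seen earlier:

```python
import io
import pandas as pd

# Hypothetical CSV text; the first column has no header name
csv_text = ",country,points\n0,Italy,87\n1,Portugal,87\n"

# Default: the unnamed first column becomes a regular column
plain = pd.read_csv(io.StringIO(csv_text))

# index_col=0: the first column becomes the row index instead
indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(plain.columns))
print(list(indexed.columns))
```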


## 2. Indexing, selecting and assigning

### Loading the data

In [25]:
import pandas as pd
reviews = pd.read_csv("winemag-data-130k-v2.csv", index_col=0)
pd.set_option('display.max_rows', 5)

We can inspect the values of a column by using its name. If the column name is a valid Python identifier (no spaces, for example) and does not clash with an existing DataFrame attribute or method, we can use the form `dataframe.column`; otherwise we must use `dataframe["Column"]`. The bracket form also works for column names without spaces.

In [26]:
reviews

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia
1,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos
...,...,...,...,...,...,...,...,...,...,...,...,...,...
129969,France,"A dry style of Pinot Gris, this is crisp with ...",,90,32.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Marcel Deiss 2012 Pinot Gris (Alsace),Pinot Gris,Domaine Marcel Deiss
129970,France,"Big, rich and off-dry, this is powered by inte...",Lieu-dit Harth Cuvée Caroline,90,21.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Schoffit 2012 Lieu-dit Harth Cuvée Car...,Gewürztraminer,Domaine Schoffit


In [27]:
reviews.country

Unnamed: 0,country
0,Italy
1,Portugal
...,...
129969,France
129970,France


In [28]:
reviews['country']

Unnamed: 0,country
0,Italy
1,Portugal
...,...
129969,France
129970,France


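To access a specific value in a column, we can use a notation similar to list access and give the index of the element we want. Note that here the value inside the brackets is matched against the index labels; in this dataset the labels happen to coincide with the row positions.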

In [31]:
reviews['country'][100]

'US'
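
Labels and positions only coincide while the index is the default 0, 1, 2, ... sequence. The sketch below uses a small hypothetical Series with non-consecutive labels to show the two explicit accessors, `loc` (by label) and `iloc` (by position), which are covered in the next sections:

```python
import pandas as pd

# Hypothetical Series whose labels do not match the positions
s = pd.Series(["Italy", "US", "France"], index=[10, 20, 30], name="country")

by_label = s.loc[20]      # the element whose label is 20
by_position = s.iloc[1]   # the second element, whatever its label

print(by_label, by_position)
print(1 in s.index)       # there is no label 1 in this index
```

Using `loc` or `iloc` makes the intent explicit and avoids surprises after filtering or sorting, when labels are no longer consecutive.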

### Index-based selection

In [32]:
reviews.iloc[0]

Unnamed: 0,0
country,Italy
description,"Aromas include tropical fruit, broom, brimston..."
...,...
variety,White Blend
winery,Nicosia


We can use list-style slicing rules to retrieve part of the rows and columns (both `loc` and `iloc` take the row selector first and the column selector second).

In [34]:
reviews.iloc[:, 0:3]  # Equivalent to: reviews[['country', 'description', 'designation']]

Unnamed: 0,country,description,designation
0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco
1,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos
...,...,...,...
129969,France,"A dry style of Pinot Gris, this is crisp with ...",
129970,France,"Big, rich and off-dry, this is powered by inte...",Lieu-dit Harth Cuvée Caroline


In [35]:
reviews.iloc[:3, 0]

Unnamed: 0,country
0,Italy
1,Portugal
2,US


We can pass a list with the positions we want to retrieve:

In [37]:
reviews.iloc[[0, 10, 20], 0]

Unnamed: 0,country
0,Italy
10,US
20,US


As with lists, negative values can be used to count positions from the end:

In [38]:
reviews.iloc[-5:]  # Returns the last 5 rows

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
129966,Germany,Notes of honeysuckle and cantaloupe sweeten th...,Brauneberger Juffer-Sonnenuhr Spätlese,90,28.0,Mosel,,,Anna Lee C. Iijima,,Dr. H. Thanisch (Erben Müller-Burggraef) 2013 ...,Riesling,Dr. H. Thanisch (Erben Müller-Burggraef)
129967,US,Citation is given as much as a decade of bottl...,,90,75.0,Oregon,Oregon,Oregon Other,Paul Gregutt,@paulgwine,Citation 2004 Pinot Noir (Oregon),Pinot Noir,Citation
129968,France,Well-drained gravel soil gives this wine its c...,Kritt,90,30.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Gresser 2013 Kritt Gewurztraminer (Als...,Gewürztraminer,Domaine Gresser
129969,France,"A dry style of Pinot Gris, this is crisp with ...",,90,32.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Marcel Deiss 2012 Pinot Gris (Alsace),Pinot Gris,Domaine Marcel Deiss
129970,France,"Big, rich and off-dry, this is powered by inte...",Lieu-dit Harth Cuvée Caroline,90,21.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Schoffit 2012 Lieu-dit Harth Cuvée Car...,Gewürztraminer,Domaine Schoffit


### Label-based selection

In [39]:
reviews.loc[0, 'country']

'Italy'

In [40]:
reviews.loc[:, ['taster_name', 'taster_twitter_handle', 'points']]

Unnamed: 0,taster_name,taster_twitter_handle,points
0,Kerin O’Keefe,@kerinokeefe,87
1,Roger Voss,@vossroger,87
...,...,...,...
129969,Roger Voss,@vossroger,90
129970,Roger Voss,@vossroger,90
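
One difference between the two accessors is easy to miss: `iloc` slices exclude the end position, as Python lists do, while `loc` slices include the end label. A minimal sketch on a hypothetical four-row frame:

```python
import pandas as pd

df = pd.DataFrame({"country": ["Italy", "Portugal", "US", "US"]})

first_by_position = df.iloc[0:2]  # positions 0 and 1 (end excluded)
first_by_label = df.loc[0:2]      # labels 0, 1 and 2 (end included)

print(len(first_by_position))
print(len(first_by_label))
```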


### Manipulating the index labels

In [41]:
reviews.set_index("title")

Unnamed: 0_level_0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,variety,winery
title,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1
Nicosia 2013 Vulkà Bianco (Etna),Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,White Blend,Nicosia
Quinta dos Avidagos 2011 Avidagos Red (Douro),Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Portuguese Red,Quinta dos Avidagos
...,...,...,...,...,...,...,...,...,...,...,...,...
Domaine Marcel Deiss 2012 Pinot Gris (Alsace),France,"A dry style of Pinot Gris, this is crisp with ...",,90,32.0,Alsace,Alsace,,Roger Voss,@vossroger,Pinot Gris,Domaine Marcel Deiss
Domaine Schoffit 2012 Lieu-dit Harth Cuvée Caroline Gewurztraminer (Alsace),France,"Big, rich and off-dry, this is powered by inte...",Lieu-dit Harth Cuvée Caroline,90,21.0,Alsace,Alsace,,Roger Voss,@vossroger,Gewürztraminer,Domaine Schoffit
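
Note that `set_index` returns a new DataFrame and leaves the original untouched, so the result has to be assigned to a name to be reused. A small sketch with invented wine titles:

```python
import pandas as pd

# Hypothetical data; the titles are placeholders
df = pd.DataFrame({"title": ["Wine A", "Wine B"], "points": [87, 90]})

by_title = df.set_index("title")  # df itself is not modified

# The title is now a row label, so it can be used with loc
print(by_title.loc["Wine B", "points"])
print(list(df.columns))
```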


### Conditional selection

In [42]:
reviews.country == 'Italy'

Unnamed: 0,country
0,True
1,False
...,...
129969,False
129970,False


In [43]:
reviews.loc[reviews.country == 'Mexico']

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
378,Mexico,"The color is appropriately light, the aromas a...",Private Reserve,88,18.0,Valle de Guadalupe,,,,,L.A. Cetto 1996 Private Reserve Nebbiolo (Vall...,Nebbiolo,L.A. Cetto
601,Mexico,"Sauvignon Blanc is, in general, one of Baja's ...",Viña Kristel,87,15.0,Valle de Guadalupe,,,Michael Schachner,@wineschach,Monte Xanic 2012 Viña Kristel Sauvignon Blanc ...,Sauvignon Blanc,Monte Xanic
...,...,...,...,...,...,...,...,...,...,...,...,...,...
125495,Mexico,This Cabernet blend is one of Baja's best reds...,Gran Ricardo,90,56.0,Valle de Guadalupe,,,Michael Schachner,@wineschach,Monte Xanic 2010 Gran Ricardo Red (Valle de Gu...,Red Blend,Monte Xanic
125824,Mexico,"Slightly sweet on the nose, but wheaty and gra...",Dominó,80,15.0,San Antonio de las Minas Valley,,,Michael Schachner,@wineschach,Vinisterra 2011 Dominó Cinsault (San Antonio d...,Cinsault,Vinisterra


We can also select with several conditions at once, combining them with *and* (`&`) or *or* (`|`). Each condition must be wrapped in parentheses.

In [44]:
reviews.loc[(reviews.country == 'Mexico') & (reviews.points >= 90)]

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
20186,Mexico,This Cabernet blend is one of Baja's best reds...,Gran Ricardo,90,56.0,Valle de Guadalupe,,,Michael Schachner,@wineschach,Monte Xanic 2010 Gran Ricardo Red (Valle de Gu...,Red Blend,Monte Xanic
43799,Mexico,"A world-class wine regardless of origin, this ...",Amado IV,92,54.0,Valle de Guadalupe,,,Michael Schachner,@wineschach,Viñas de Garza 2007 Amado IV Red (Valle de Gua...,Red Blend,Viñas de Garza
68520,Mexico,"Ripe and rich, but not dull or out of whack, t...",,91,108.0,Valle de Guadalupe,,,Michael Schachner,@wineschach,El Sombrero 2009 Red (Valle de Guadalupe),Red Blend,El Sombrero
68539,Mexico,This red blend of 67% Syrah and 33% Mourvedre ...,Pedregal,91,45.0,San Antonio de las Minas Valley,,,Michael Schachner,@wineschach,Vinisterra 2007 Pedregal Syrah-Mourvèdre (San ...,Syrah-Mourvèdre,Vinisterra
125495,Mexico,This Cabernet blend is one of Baja's best reds...,Gran Ricardo,90,56.0,Valle de Guadalupe,,,Michael Schachner,@wineschach,Monte Xanic 2010 Gran Ricardo Red (Valle de Gu...,Red Blend,Monte Xanic


In [45]:
reviews.loc[(reviews.country == 'Mexico') | (reviews.country == 'Uruguay')]

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
378,Mexico,"The color is appropriately light, the aromas a...",Private Reserve,88,18.0,Valle de Guadalupe,,,,,L.A. Cetto 1996 Private Reserve Nebbiolo (Vall...,Nebbiolo,L.A. Cetto
601,Mexico,"Sauvignon Blanc is, in general, one of Baja's ...",Viña Kristel,87,15.0,Valle de Guadalupe,,,Michael Schachner,@wineschach,Monte Xanic 2012 Viña Kristel Sauvignon Blanc ...,Sauvignon Blanc,Monte Xanic
...,...,...,...,...,...,...,...,...,...,...,...,...,...
129445,Uruguay,No kidding that this was aged in oak (Criado e...,Criado en Roble,87,20.0,Canelones,,,Michael Schachner,@wineschach,Montes Toscanini 2015 Criado en Roble Tannat (...,Tannat,Montes Toscanini
129481,Uruguay,"Mint, leather and berry aromas result in a str...",,88,29.0,Canelones,,,Michael Schachner,@wineschach,Artesana 2011 Tannat-Merlot (Canelones),Tannat-Merlot,Artesana
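
The parentheses matter because in Python `&` and `|` bind more tightly than comparison operators such as `==` and `>=`. The following sketch applies both kinds of combined condition to a small hypothetical frame:

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({
    "country": ["Mexico", "Mexico", "Uruguay", "Italy"],
    "points": [92, 85, 88, 95],
})

# Both conditions must hold
top_mexico = df.loc[(df.country == "Mexico") & (df.points >= 90)]

# At least one condition must hold
either = df.loc[(df.country == "Mexico") | (df.country == "Uruguay")]

print(list(top_mexico.index))
print(list(either.index))
```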


Another way to filter the data is the `isin` method, which takes a list of the values to match against.

In [49]:
reviews.loc[reviews.country.isin(['Italy', 'France', 'Mexico'])]

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia
6,Italy,"Here's a bright, informal red that opens with ...",Belsito,87,16.0,Sicily & Sardinia,Vittoria,,Kerin O’Keefe,@kerinokeefe,Terre di Giurfo 2013 Belsito Frappato (Vittoria),Frappato,Terre di Giurfo
...,...,...,...,...,...,...,...,...,...,...,...,...,...
129969,France,"A dry style of Pinot Gris, this is crisp with ...",,90,32.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Marcel Deiss 2012 Pinot Gris (Alsace),Pinot Gris,Domaine Marcel Deiss
129970,France,"Big, rich and off-dry, this is powered by inte...",Lieu-dit Harth Cuvée Caroline,90,21.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Schoffit 2012 Lieu-dit Harth Cuvée Car...,Gewürztraminer,Domaine Schoffit


We can even filter out rows with null values:

In [50]:
reviews.loc[reviews.price.notnull()]

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
1,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos
2,US,"Tart and snappy, the flavors of lime flesh and...",,87,14.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Rainstorm 2013 Pinot Gris (Willamette Valley),Pinot Gris,Rainstorm
...,...,...,...,...,...,...,...,...,...,...,...,...,...
129969,France,"A dry style of Pinot Gris, this is crisp with ...",,90,32.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Marcel Deiss 2012 Pinot Gris (Alsace),Pinot Gris,Domaine Marcel Deiss
129970,France,"Big, rich and off-dry, this is powered by inte...",Lieu-dit Harth Cuvée Caroline,90,21.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Schoffit 2012 Lieu-dit Harth Cuvée Car...,Gewürztraminer,Domaine Schoffit
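
`notnull` has a counterpart, `isnull`, and both return boolean Series that can be used as filters or summed to count matches. A short sketch on invented data:

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({
    "country": ["Italy", "US", "Mexico"],
    "price": [np.nan, 14.0, 18.0],
})

with_price = df.loc[df.price.notnull()]      # rows that have a price
missing_count = int(df.price.isnull().sum()) # True counts as 1
selected = df.loc[df.country.isin(["Italy", "Mexico"])]

print(len(with_price), missing_count, list(selected.country))
```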


### Assigning values

We can assign the same value to every row:

In [52]:
reviews['critic'] = 'everyone'
reviews

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery,critic
0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia,everyone
1,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos,everyone
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
129969,France,"A dry style of Pinot Gris, this is crisp with ...",,90,32.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Marcel Deiss 2012 Pinot Gris (Alsace),Pinot Gris,Domaine Marcel Deiss,everyone
129970,France,"Big, rich and off-dry, this is powered by inte...",Lieu-dit Harth Cuvée Caroline,90,21.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Schoffit 2012 Lieu-dit Harth Cuvée Car...,Gewürztraminer,Domaine Schoffit,everyone


We can also assign the elements of a list or of a generated range:

In [None]:
reviews['index_backwards'] = range(len(reviews), 0, -1)
reviews['index_backwards']

Finally, we can create a calculated column:

In [53]:
reviews['new_price'] = reviews.price * 1.15
reviews.loc[:,['price','new_price']]

Unnamed: 0,price,new_price
0,,
1,15.0,17.25
...,...,...
129969,32.0,36.80
129970,21.0,24.15


## 3. Summary Functions and Maps

In [70]:
pd.set_option('display.max_rows', 20)
reviews

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery,critic,new_price
0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia,everyone,
1,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos,everyone,17.25
2,US,"Tart and snappy, the flavors of lime flesh and...",,87,14.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Rainstorm 2013 Pinot Gris (Willamette Valley),Pinot Gris,Rainstorm,everyone,16.10
3,US,"Pineapple rind, lemon pith and orange blossom ...",Reserve Late Harvest,87,13.0,Michigan,Lake Michigan Shore,,Alexander Peartree,,St. Julian 2013 Reserve Late Harvest Riesling ...,Riesling,St. Julian,everyone,14.95
4,US,"Much like the regular bottling from 2012, this...",Vintner's Reserve Wild Child Block,87,65.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Sweet Cheeks 2012 Vintner's Reserve Wild Child...,Pinot Noir,Sweet Cheeks,everyone,74.75
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
129966,Germany,Notes of honeysuckle and cantaloupe sweeten th...,Brauneberger Juffer-Sonnenuhr Spätlese,90,28.0,Mosel,,,Anna Lee C. Iijima,,Dr. H. Thanisch (Erben Müller-Burggraef) 2013 ...,Riesling,Dr. H. Thanisch (Erben Müller-Burggraef),everyone,32.20
129967,US,Citation is given as much as a decade of bottl...,,90,75.0,Oregon,Oregon,Oregon Other,Paul Gregutt,@paulgwine,Citation 2004 Pinot Noir (Oregon),Pinot Noir,Citation,everyone,86.25
129968,France,Well-drained gravel soil gives this wine its c...,Kritt,90,30.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Gresser 2013 Kritt Gewurztraminer (Als...,Gewürztraminer,Domaine Gresser,everyone,34.50
129969,France,"A dry style of Pinot Gris, this is crisp with ...",,90,32.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Marcel Deiss 2012 Pinot Gris (Alsace),Pinot Gris,Domaine Marcel Deiss,everyone,36.80


#### Summary functions

In [57]:
reviews.points.describe()

Unnamed: 0,points
count,129971.000000
mean,88.447138
...,...
75%,91.000000
max,100.000000


In [58]:
reviews.describe()

Unnamed: 0,points,price,new_price
count,129971.000000,120975.000000,120975.000000
mean,88.447138,35.363389,40.667897
...,...,...,...
75%,91.000000,42.000000,48.300000
max,100.000000,3300.000000,3795.000000


In [59]:
reviews.taster_name.describe()

Unnamed: 0,taster_name
count,103727
unique,19
top,Roger Voss
freq,25514


In [60]:
reviews.taster_name.unique()

array(['Kerin O’Keefe', 'Roger Voss', 'Paul Gregutt',
       'Alexander Peartree', 'Michael Schachner', 'Anna Lee C. Iijima',
       'Virginie Boone', 'Matt Kettmann', nan, 'Sean P. Sullivan',
       'Jim Gordon', 'Joe Czerwinski', 'Anne Krebiehl\xa0MW',
       'Lauren Buzzeo', 'Mike DeSimone', 'Jeff Jenssen',
       'Susan Kostrzewa', 'Carrie Dykes', 'Fiona Adams',
       'Christina Pickard'], dtype=object)

In [61]:
reviews.country.unique()

array(['Italy', 'Portugal', 'US', 'Spain', 'France', 'Germany',
       'Argentina', 'Chile', 'Australia', 'Austria', 'South Africa',
       'New Zealand', 'Israel', 'Hungary', 'Greece', 'Romania', 'Mexico',
       'Canada', nan, 'Turkey', 'Czech Republic', 'Slovenia',
       'Luxembourg', 'Croatia', 'Georgia', 'Uruguay', 'England',
       'Lebanon', 'Serbia', 'Brazil', 'Moldova', 'Morocco', 'Peru',
       'India', 'Bulgaria', 'Cyprus', 'Armenia', 'Switzerland',
       'Bosnia and Herzegovina', 'Ukraine', 'Slovakia', 'Macedonia',
       'China', 'Egypt'], dtype=object)

In [63]:
print(reviews.points.mean())
print(reviews.points.std())
print(reviews.points.median())

88.44713820775404
3.0397302029162336
88.0


In [64]:
reviews.taster_name.value_counts()

Unnamed: 0_level_0,count
taster_name,Unnamed: 1_level_1
Roger Voss,25514
Michael Schachner,15134
...,...
Fiona Adams,27
Christina Pickard,6


In [72]:
df_paises = reviews.country.value_counts()
df_paises.tail(20)

Unnamed: 0_level_0,count
country,Unnamed: 1_level_1
Croatia,73
Mexico,70
Moldova,59
Brazil,52
Lebanon,35
Morocco,28
Peru,16
Ukraine,14
Macedonia,12
Czech Republic,12


#### Maps

In [73]:
review_points_mean = reviews.points.mean()
reviews.points.map(lambda p: p - review_points_mean)

Unnamed: 0,points
0,-1.447138
1,-1.447138
2,-1.447138
3,-1.447138
4,-1.447138
...,...
129966,1.552862
129967,1.552862
129968,1.552862
129969,1.552862


In [74]:
def remean_points(row):
    row.points = row.points - review_points_mean
    return row

reviews.apply(remean_points, axis='columns')

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery,critic,new_price
0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,-1.447138,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia,everyone,
1,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,-1.447138,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos,everyone,17.25
2,US,"Tart and snappy, the flavors of lime flesh and...",,-1.447138,14.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Rainstorm 2013 Pinot Gris (Willamette Valley),Pinot Gris,Rainstorm,everyone,16.10
3,US,"Pineapple rind, lemon pith and orange blossom ...",Reserve Late Harvest,-1.447138,13.0,Michigan,Lake Michigan Shore,,Alexander Peartree,,St. Julian 2013 Reserve Late Harvest Riesling ...,Riesling,St. Julian,everyone,14.95
4,US,"Much like the regular bottling from 2012, this...",Vintner's Reserve Wild Child Block,-1.447138,65.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Sweet Cheeks 2012 Vintner's Reserve Wild Child...,Pinot Noir,Sweet Cheeks,everyone,74.75
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
129966,Germany,Notes of honeysuckle and cantaloupe sweeten th...,Brauneberger Juffer-Sonnenuhr Spätlese,1.552862,28.0,Mosel,,,Anna Lee C. Iijima,,Dr. H. Thanisch (Erben Müller-Burggraef) 2013 ...,Riesling,Dr. H. Thanisch (Erben Müller-Burggraef),everyone,32.20
129967,US,Citation is given as much as a decade of bottl...,,1.552862,75.0,Oregon,Oregon,Oregon Other,Paul Gregutt,@paulgwine,Citation 2004 Pinot Noir (Oregon),Pinot Noir,Citation,everyone,86.25
129968,France,Well-drained gravel soil gives this wine its c...,Kritt,1.552862,30.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Gresser 2013 Kritt Gewurztraminer (Als...,Gewürztraminer,Domaine Gresser,everyone,34.50
129969,France,"A dry style of Pinot Gris, this is crisp with ...",,1.552862,32.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Marcel Deiss 2012 Pinot Gris (Alsace),Pinot Gris,Domaine Marcel Deiss,everyone,36.80


Pandas offers a more direct way to apply an operation to a column:

In [76]:
review_points_mean = reviews.points.mean()
reviews.points - review_points_mean

Unnamed: 0,points
0,-1.447138
1,-1.447138
2,-1.447138
3,-1.447138
4,-1.447138
...,...
129966,1.552862
129967,1.552862
129968,1.552862
129969,1.552862


In [77]:
reviews.country + " - " + reviews.region_1

Unnamed: 0,0
0,Italy - Etna
1,
2,US - Willamette Valley
3,US - Lake Michigan Shore
4,US - Willamette Valley
...,...
129966,
129967,US - Oregon
129968,France - Alsace
129969,France - Alsace
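
The sketch below checks on a small hypothetical frame that `map` and the direct arithmetic form give the same result, and shows why some rows in the concatenation above are empty: a null in either operand makes the whole result null. Filling the gap with `fillna("Unknown")` is an assumption made here for illustration, not something the course prescribes:

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({
    "points": [87, 90, 93],
    "country": ["Italy", "Portugal", "US"],
    "region_1": ["Etna", np.nan, "Oregon"],
})

mean = df.points.mean()
with_map = df.points.map(lambda p: p - mean)  # element by element
vectorized = df.points - mean                 # whole column at once

label = df.country + " - " + df.region_1      # null where region_1 is null
label_filled = df.country + " - " + df.region_1.fillna("Unknown")

print(with_map.equals(vectorized))
print(label_filled.tolist())
```

The direct form is usually faster on large columns because it avoids calling a Python function once per element.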


## 4. Grouping and sorting

#### Grouping operations

In [78]:
reviews.groupby('points').points.count()

Unnamed: 0_level_0,points
points,Unnamed: 1_level_1
80,397
81,692
82,1836
83,3025
84,6480
...,...
96,523
97,229
98,77
99,33


In [79]:
reviews.groupby('country').country.count()

Unnamed: 0_level_0,country
country,Unnamed: 1_level_1
Argentina,3800
Armenia,2
Australia,2329
Austria,3345
Bosnia and Herzegovina,2
...,...
Switzerland,7
Turkey,90
US,54504
Ukraine,14


Once the data is grouped, we can select a specific column and compute a value from it for each group.

In [82]:
reviews.groupby('points').price.mean()

Unnamed: 0_level_0,price
points,Unnamed: 1_level_1
80,16.372152
81,17.182353
82,18.870767
83,18.237353
84,19.310215
...,...
96,159.292531
97,207.173913
98,245.492754
99,284.214286


In [83]:
reviews.groupby('points').price.min()

Unnamed: 0_level_0,price
points,Unnamed: 1_level_1
80,5.0
81,5.0
82,4.0
83,4.0
84,4.0
...,...
96,20.0
97,35.0
98,50.0
99,44.0


For example, we can also get the title of the first wine reviewed for each winery.

In [84]:
reviews.groupby('winery').apply(lambda df: df.title.iloc[0])


Unnamed: 0_level_0,0
winery,Unnamed: 1_level_1
1+1=3,1+1=3 NV Rosé Sparkling (Cava)
10 Knots,10 Knots 2010 Viognier (Paso Robles)
100 Percent Wine,100 Percent Wine 2015 Moscato (California)
1000 Stories,1000 Stories 2013 Bourbon Barrel Aged Zinfande...
1070 Green,1070 Green 2011 Sauvignon Blanc (Rutherford)
...,...
Órale,Órale 2011 Cabronita Red (Santa Ynez Valley)
Öko,Öko 2013 Made With Organically Grown Grapes Ma...
Ökonomierat Rebholz,Ökonomierat Rebholz 2007 Von Rotliegenden Spät...
àMaurice,àMaurice 2013 Fred Estate Syrah (Walla Walla V...


To get the best-rated wine for each country and province:

In [86]:
reviews.groupby(['country', 'province']).apply(lambda df: df.loc[df.points.idxmax()])


Unnamed: 0_level_0,Unnamed: 1_level_0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery,critic,new_price
country,province,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1
Argentina,Mendoza Province,Argentina,"If the color doesn't tell the full story, the ...",Nicasia Vineyard,97,120.0,Mendoza Province,Mendoza,,Michael Schachner,@wineschach,Bodega Catena Zapata 2006 Nicasia Vineyard Mal...,Malbec,Bodega Catena Zapata,everyone,138.00
Argentina,Other,Argentina,"Take note, this could be the best wine Colomé ...",Reserva,95,90.0,Other,Salta,,Michael Schachner,@wineschach,Colomé 2010 Reserva Malbec (Salta),Malbec,Colomé,everyone,103.50
Armenia,Armenia,Armenia,"Deep salmon in color, this wine offers a bouqu...",Estate Bottled,88,15.0,Armenia,,,Mike DeSimone,@worldwineguys,Van Ardi 2015 Estate Bottled Rosé (Armenia),Rosé,Van Ardi,everyone,17.25
Australia,Australia Other,Australia,Writes the book on how to make a wine filled w...,Sarah's Blend,93,15.0,Australia Other,South Eastern Australia,,,,Marquis Philips 2000 Sarah's Blend Red (South ...,Red Blend,Marquis Philips,everyone,17.25
Australia,New South Wales,Australia,De Bortoli's Noble One is as good as ever in 2...,Noble One Bortytis,94,32.0,New South Wales,New South Wales,,Joe Czerwinski,@JoeCz,De Bortoli 2007 Noble One Bortytis Semillon (N...,Sémillon,De Bortoli,everyone,36.80
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Uruguay,Juanico,Uruguay,This mature Bordeaux-style blend is earthy on ...,Preludio Barrel Select Lote N 77,90,45.0,Juanico,,,Michael Schachner,@wineschach,Familia Deicas 2004 Preludio Barrel Select Lot...,Red Blend,Familia Deicas,everyone,51.75
Uruguay,Montevideo,Uruguay,"A rich, heady bouquet offers aromas of blackbe...",Monte Vide Eu Tannat-Merlot-Tempranillo,91,60.0,Montevideo,,,Michael Schachner,@wineschach,Bouza 2015 Monte Vide Eu Tannat-Merlot-Tempran...,Red Blend,Bouza,everyone,69.00
Uruguay,Progreso,Uruguay,"Rusty in color but deep and complex in nature,...",Etxe Oneko Fortified Sweet Red,90,46.0,Progreso,,,Michael Schachner,@wineschach,Pisano 2007 Etxe Oneko Fortified Sweet Red Tan...,Tannat,Pisano,everyone,52.90
Uruguay,San Jose,Uruguay,"Baked, sweet, heavy aromas turn earthy with ti...",El Preciado Gran Reserva,87,50.0,San Jose,,,Michael Schachner,@wineschach,Castillo Viejo 2005 El Preciado Gran Reserva R...,Red Blend,Castillo Viejo,everyone,57.50
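
The same "best row per group" result can also be reached without `apply`. The sketch below is one possible alternative, not the notebook's method: it asks each group for the label of its highest score with `idxmax` and then selects those rows with `.loc`. The `toy_reviews` frame and its values are hypothetical stand-ins for `reviews`.

```python
import pandas as pd

# Hypothetical miniature stand-in for the `reviews` dataframe.
toy_reviews = pd.DataFrame({
    "country": ["Argentina", "Argentina", "Argentina", "Uruguay", "Uruguay"],
    "province": ["Mendoza Province", "Mendoza Province", "Other", "Juanico", "Juanico"],
    "title": ["Wine A", "Wine B", "Wine C", "Wine D", "Wine E"],
    "points": [97, 90, 95, 88, 90],
})

# For every (country, province) group, idxmax() returns the row label
# of the wine with the highest score.
best_labels = toy_reviews.groupby(["country", "province"])["points"].idxmax()

# Selecting those labels with .loc keeps the complete rows.
best_wines = toy_reviews.loc[best_labels].reset_index(drop=True)
print(best_wines)
```

Unlike the `apply` version, the result keeps a plain integer index, and `country` and `province` appear only once, as ordinary columns.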


In [87]:
reviews.groupby(['country']).price.agg([len, min, max])


country,len,min,max
Argentina,3800,4.0,230.0
Armenia,2,14.0,15.0
Australia,2329,5.0,850.0
Austria,3345,7.0,1100.0
Bosnia and Herzegovina,2,12.0,13.0
...,...,...,...
Switzerland,7,21.0,160.0
Turkey,90,14.0,120.0
US,54504,4.0,2013.0
Ukraine,14,6.0,13.0
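
As a hedged variation on the cell above, the aggregations can be requested by name, which also lets each output column get a descriptive label. The sketch uses a hypothetical `toy_reviews` frame, and the column names `rows`, `priced`, `cheapest` and `priciest` are invented for illustration. It also shows that `size` counts every row, as `len` does, while `count` skips null prices.

```python
import pandas as pd

# Hypothetical miniature stand-in for `reviews`; one price is missing.
toy_reviews = pd.DataFrame({
    "country": ["Armenia", "Armenia", "Ukraine", "Ukraine", "Ukraine"],
    "price": [14.0, 15.0, 6.0, None, 13.0],
})

# Named aggregation: keyword = output column, value = aggregation name.
price_summary = toy_reviews.groupby("country")["price"].agg(
    rows="size",      # number of rows, nulls included (like len)
    priced="count",   # number of non-null prices
    cheapest="min",
    priciest="max",
)
print(price_summary)
```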


In [93]:
reviews.groupby('country').points.max()

country,points
Argentina,97
Armenia,88
Australia,100
Austria,98
Bosnia and Herzegovina,88
...,...
Switzerland,90
Turkey,92
US,100
Ukraine,88


In [94]:
countries_reviewed = reviews.groupby(['country', 'province']).description.agg([len])
countries_reviewed

country,province,len
Argentina,Mendoza Province,3264
Argentina,Other,536
Armenia,Armenia,2
Australia,Australia Other,245
Australia,New South Wales,85
...,...,...
Uruguay,Juanico,12
Uruguay,Montevideo,11
Uruguay,Progreso,11
Uruguay,San Jose,3


In [95]:
countries_reviewed.reset_index()

Unnamed: 0,country,province,len
0,Argentina,Mendoza Province,3264
1,Argentina,Other,536
2,Armenia,Armenia,2
3,Australia,Australia Other,245
4,Australia,New South Wales,85
...,...,...,...
420,Uruguay,Juanico,12
421,Uruguay,Montevideo,11
422,Uruguay,Progreso,11
423,Uruguay,San Jose,3


### Sorting

In [96]:
countries_reviewed = countries_reviewed.reset_index()
countries_reviewed.sort_values(by='len')

Unnamed: 0,country,province,len
386,Turkey,Elazığ-Diyarbakir,1
389,Turkey,Urla-Thrace,1
395,US,Hawaii,1
357,South Africa,Piekenierskloof,1
354,South Africa,Paardeberg,1
...,...,...,...
409,US,Oregon,5373
227,Italy,Tuscany,5897
118,France,Bordeaux,5941
415,US,Washington,8639


In [97]:
countries_reviewed.sort_values(by='len', ascending=False)

Unnamed: 0,country,province,len
392,US,California,36247
415,US,Washington,8639
118,France,Bordeaux,5941
227,Italy,Tuscany,5897
409,US,Oregon,5373
...,...,...,...
389,Turkey,Urla-Thrace,1
48,Canada,Canada Other,1
40,Brazil,Serra do Sudeste,1
395,US,Hawaii,1


Sorting by the row index:

In [100]:
countries_reviewed.sort_index()

Unnamed: 0,country,province,len
0,Argentina,Mendoza Province,3264
1,Argentina,Other,536
2,Armenia,Armenia,2
3,Australia,Australia Other,245
4,Australia,New South Wales,85
...,...,...,...
420,Uruguay,Juanico,12
421,Uruguay,Montevideo,11
422,Uruguay,Progreso,11
423,Uruguay,San Jose,3


In [101]:
countries_reviewed.sort_index(ascending=False)

Unnamed: 0,country,province,len
424,Uruguay,Uruguay,24
423,Uruguay,San Jose,3
422,Uruguay,Progreso,11
421,Uruguay,Montevideo,11
420,Uruguay,Juanico,12
...,...,...,...
4,Australia,New South Wales,85
3,Australia,Australia Other,245
2,Armenia,Armenia,2
1,Argentina,Other,536


In [102]:
countries_reviewed.sort_values(by=['country', 'len'])

Unnamed: 0,country,province,len
1,Argentina,Other,536
0,Argentina,Mendoza Province,3264
2,Armenia,Armenia,2
6,Australia,Tasmania,42
4,Australia,New South Wales,85
...,...,...,...
421,Uruguay,Montevideo,11
422,Uruguay,Progreso,11
420,Uruguay,Juanico,12
424,Uruguay,Uruguay,24


In [103]:
countries_reviewed.sort_values(by=['country', 'len'], ascending=[True, False])

Unnamed: 0,country,province,len
0,Argentina,Mendoza Province,3264
1,Argentina,Other,536
2,Armenia,Armenia,2
5,Australia,South Australia,1349
7,Australia,Victoria,322
...,...,...,...
420,Uruguay,Juanico,12
421,Uruguay,Montevideo,11
422,Uruguay,Progreso,11
418,Uruguay,Atlantida,5
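
The sorting calls above can be tried on a frame small enough to check by eye. This is a minimal sketch with a hypothetical `toy_counts` frame: it sorts by two columns in different directions, restores the original order with `sort_index`, and renumbers the rows with `reset_index(drop=True)`.

```python
import pandas as pd

# Hypothetical miniature stand-in for `countries_reviewed`.
toy_counts = pd.DataFrame({
    "country": ["Uruguay", "Argentina", "Uruguay", "Argentina"],
    "province": ["Juanico", "Other", "Montevideo", "Mendoza Province"],
    "len": [12, 536, 11, 3264],
})

# country ascending, then len descending within each country.
ordered = toy_counts.sort_values(by=["country", "len"], ascending=[True, False])

# sort_values keeps the original row labels, so sort_index undoes the sort.
restored = ordered.sort_index()

# reset_index(drop=True) discards the old labels and numbers rows from 0.
renumbered = ordered.reset_index(drop=True)
print(renumbered)
```

None of these calls modifies `toy_counts`; each one returns a new dataframe.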


## 5. Data types and null values

The data type of every column, or of one specific column, can be checked:

In [104]:
reviews.dtypes

Unnamed: 0,0
country,object
description,object
designation,object
points,int64
price,float64
province,object
region_1,object
region_2,object
taster_name,object
taster_twitter_handle,object


In [105]:
reviews.price.dtype

dtype('float64')

In [116]:
reviews.points.astype('float64')

Unnamed: 0,points
0,87.0
1,87.0
2,87.0
3,87.0
4,87.0
...,...
129966,90.0
129967,90.0
129968,90.0
129969,90.0
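
Note that `astype` returns a converted copy and does not change the dataframe. The sketch below, on a hypothetical `toy_reviews` frame, shows the conversion being assigned back to make it permanent. It also shows that casting a column containing nulls to `int64` fails, because `NaN` has no integer representation.

```python
import pandas as pd

# Hypothetical miniature stand-in for `reviews`.
toy_reviews = pd.DataFrame({"points": [87, 90, 100], "price": [15.0, None, 32.0]})

# astype returns a new Series; the original column keeps its dtype.
as_float = toy_reviews["points"].astype("float64")
dtype_before = toy_reviews["points"].dtype

# Assign the result back to make the conversion permanent.
toy_reviews["points"] = as_float
dtype_after = toy_reviews["points"].dtype

# A float column with a null value cannot be cast to a plain integer type.
try:
    toy_reviews["price"].astype("int64")
    cast_failed = False
except ValueError:
    cast_failed = True

print(dtype_before, dtype_after, cast_failed)
```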


### Handling null data

In [117]:
reviews[pd.isnull(reviews.country)]

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery,critic,new_price
913,,"Amber in color, this wine has aromas of peach ...",Asureti Valley,87,30.0,,,,Mike DeSimone,@worldwineguys,Gotsa Family Wines 2014 Asureti Valley Chinuri,Chinuri,Gotsa Family Wines,everyone,34.5
3131,,"Soft, fruity and juicy, this is a pleasant, si...",Partager,83,,,,,Roger Voss,@vossroger,Barton & Guestier NV Partager Red,Red Blend,Barton & Guestier,everyone,
4243,,"Violet-red in color, this semisweet wine has a...",Red Naturally Semi-Sweet,88,18.0,,,,Mike DeSimone,@worldwineguys,Kakhetia Traditional Winemaking 2012 Red Natur...,Ojaleshi,Kakhetia Traditional Winemaking,everyone,20.7
9509,,This mouthwatering blend starts with a nose of...,Theopetra Malagouzia-Assyrtiko,92,28.0,,,,Susan Kostrzewa,@suskostrzewa,Tsililis 2015 Theopetra Malagouzia-Assyrtiko W...,White Blend,Tsililis,everyone,32.2
9750,,This orange-style wine has a cloudy yellow-gol...,Orange Nikolaevo Vineyard,89,28.0,,,,Jeff Jenssen,@worldwineguys,Ross-idi 2015 Orange Nikolaevo Vineyard Chardo...,Chardonnay,Ross-idi,everyone,32.2
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
124176,,This Swiss red blend is composed of four varie...,Les Romaines,90,30.0,,,,Jeff Jenssen,@worldwineguys,Les Frères Dutruy 2014 Les Romaines Red,Red Blend,Les Frères Dutruy,everyone,34.5
129407,,Dry spicy aromas of dusty plum and tomato add ...,Reserve,89,22.0,,,,Michael Schachner,@wineschach,El Capricho 2015 Reserve Cabernet Sauvignon,Cabernet Sauvignon,El Capricho,everyone,25.3
129408,,El Capricho is one of Uruguay's more consisten...,Reserve,89,22.0,,,,Michael Schachner,@wineschach,El Capricho 2015 Reserve Tempranillo,Tempranillo,El Capricho,everyone,25.3
129590,,"A blend of 60% Syrah, 30% Cabernet Sauvignon a...",Shah,90,30.0,,,,Mike DeSimone,@worldwineguys,Büyülübağ 2012 Shah Red,Red Blend,Büyülübağ,everyone,34.5


We can use the `fillna` function to replace null values with a specific value.

In [118]:
reviews.region_2.fillna("Desconocido")

Unnamed: 0,region_2
0,Desconocido
1,Desconocido
2,Willamette Valley
3,Desconocido
4,Willamette Valley
...,...
129966,Desconocido
129967,Oregon Other
129968,Desconocido
129969,Desconocido
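
Before filling, it often helps to know how many nulls each column holds. A minimal sketch on a hypothetical `toy_reviews` frame: `isnull().sum()` counts the nulls per column, and the `fillna` result is assigned back because `fillna` returns a new Series by default.

```python
import pandas as pd

# Hypothetical miniature stand-in for `reviews`.
toy_reviews = pd.DataFrame({
    "region_2": [None, "Willamette Valley", None],
    "price": [15.0, None, 30.0],
})

# Number of null values in each column.
missing_per_column = toy_reviews.isnull().sum()

# fillna returns a new Series; assign it back to keep the change.
filled = toy_reviews["region_2"].fillna("Desconocido")
toy_reviews["region_2"] = filled

print(missing_per_column)
print(toy_reviews)
```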


It is possible to replace one value with another throughout a column. `replace` returns a new Series, so `reviews` itself stays unchanged unless the result is assigned back:

In [119]:
reviews.taster_twitter_handle.replace("@kerinokeefe", "@kerino")

Unnamed: 0,taster_twitter_handle
0,@kerino
1,@vossroger
2,@paulgwine
3,
4,@paulgwine
...,...
129966,
129967,@paulgwine
129968,@vossroger
129969,@vossroger


In [120]:
reviews

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery,critic,new_price
0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia,everyone,
1,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos,everyone,17.25
2,US,"Tart and snappy, the flavors of lime flesh and...",,87,14.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Rainstorm 2013 Pinot Gris (Willamette Valley),Pinot Gris,Rainstorm,everyone,16.10
3,US,"Pineapple rind, lemon pith and orange blossom ...",Reserve Late Harvest,87,13.0,Michigan,Lake Michigan Shore,,Alexander Peartree,,St. Julian 2013 Reserve Late Harvest Riesling ...,Riesling,St. Julian,everyone,14.95
4,US,"Much like the regular bottling from 2012, this...",Vintner's Reserve Wild Child Block,87,65.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Sweet Cheeks 2012 Vintner's Reserve Wild Child...,Pinot Noir,Sweet Cheeks,everyone,74.75
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
129966,Germany,Notes of honeysuckle and cantaloupe sweeten th...,Brauneberger Juffer-Sonnenuhr Spätlese,90,28.0,Mosel,,,Anna Lee C. Iijima,,Dr. H. Thanisch (Erben Müller-Burggraef) 2013 ...,Riesling,Dr. H. Thanisch (Erben Müller-Burggraef),everyone,32.20
129967,US,Citation is given as much as a decade of bottl...,,90,75.0,Oregon,Oregon,Oregon Other,Paul Gregutt,@paulgwine,Citation 2004 Pinot Noir (Oregon),Pinot Noir,Citation,everyone,86.25
129968,France,Well-drained gravel soil gives this wine its c...,Kritt,90,30.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Gresser 2013 Kritt Gewurztraminer (Als...,Gewürztraminer,Domaine Gresser,everyone,34.50
129969,France,"A dry style of Pinot Gris, this is crisp with ...",,90,32.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Marcel Deiss 2012 Pinot Gris (Alsace),Pinot Gris,Domaine Marcel Deiss,everyone,36.80
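
The output of `reviews` above still shows `@kerinokeefe`, since the result of `replace` was never stored. A small sketch with a hypothetical `handles` Series makes the same point on data that can be checked by eye:

```python
import pandas as pd

# Hypothetical miniature stand-in for `reviews.taster_twitter_handle`.
handles = pd.Series(
    ["@kerinokeefe", "@vossroger", None, "@kerinokeefe"],
    name="taster_twitter_handle",
)

# replace returns a new Series; only exact matches are changed.
replaced = handles.replace("@kerinokeefe", "@kerino")

print(handles.tolist())   # the original values are untouched
print(replaced.tolist())
```

To keep the change in a dataframe, the result would be assigned back to the column, for example `reviews["taster_twitter_handle"] = reviews.taster_twitter_handle.replace(...)`.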


## 6. Renaming and combining

To rename a column, use `rename`:

In [121]:
reviews.rename(columns={'points': 'score'})

Unnamed: 0,country,description,designation,score,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery,critic,new_price
0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia,everyone,
1,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos,everyone,17.25
2,US,"Tart and snappy, the flavors of lime flesh and...",,87,14.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Rainstorm 2013 Pinot Gris (Willamette Valley),Pinot Gris,Rainstorm,everyone,16.10
3,US,"Pineapple rind, lemon pith and orange blossom ...",Reserve Late Harvest,87,13.0,Michigan,Lake Michigan Shore,,Alexander Peartree,,St. Julian 2013 Reserve Late Harvest Riesling ...,Riesling,St. Julian,everyone,14.95
4,US,"Much like the regular bottling from 2012, this...",Vintner's Reserve Wild Child Block,87,65.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Sweet Cheeks 2012 Vintner's Reserve Wild Child...,Pinot Noir,Sweet Cheeks,everyone,74.75
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
129966,Germany,Notes of honeysuckle and cantaloupe sweeten th...,Brauneberger Juffer-Sonnenuhr Spätlese,90,28.0,Mosel,,,Anna Lee C. Iijima,,Dr. H. Thanisch (Erben Müller-Burggraef) 2013 ...,Riesling,Dr. H. Thanisch (Erben Müller-Burggraef),everyone,32.20
129967,US,Citation is given as much as a decade of bottl...,,90,75.0,Oregon,Oregon,Oregon Other,Paul Gregutt,@paulgwine,Citation 2004 Pinot Noir (Oregon),Pinot Noir,Citation,everyone,86.25
129968,France,Well-drained gravel soil gives this wine its c...,Kritt,90,30.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Gresser 2013 Kritt Gewurztraminer (Als...,Gewürztraminer,Domaine Gresser,everyone,34.50
129969,France,"A dry style of Pinot Gris, this is crisp with ...",,90,32.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Marcel Deiss 2012 Pinot Gris (Alsace),Pinot Gris,Domaine Marcel Deiss,everyone,36.80


Index labels can be renamed as well:

In [122]:
reviews.rename(index={0: 'firstEntry', 1: 'secondEntry'})

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery,critic,new_price
firstEntry,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia,everyone,
secondEntry,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos,everyone,17.25
2,US,"Tart and snappy, the flavors of lime flesh and...",,87,14.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Rainstorm 2013 Pinot Gris (Willamette Valley),Pinot Gris,Rainstorm,everyone,16.10
3,US,"Pineapple rind, lemon pith and orange blossom ...",Reserve Late Harvest,87,13.0,Michigan,Lake Michigan Shore,,Alexander Peartree,,St. Julian 2013 Reserve Late Harvest Riesling ...,Riesling,St. Julian,everyone,14.95
4,US,"Much like the regular bottling from 2012, this...",Vintner's Reserve Wild Child Block,87,65.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Sweet Cheeks 2012 Vintner's Reserve Wild Child...,Pinot Noir,Sweet Cheeks,everyone,74.75
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
129966,Germany,Notes of honeysuckle and cantaloupe sweeten th...,Brauneberger Juffer-Sonnenuhr Spätlese,90,28.0,Mosel,,,Anna Lee C. Iijima,,Dr. H. Thanisch (Erben Müller-Burggraef) 2013 ...,Riesling,Dr. H. Thanisch (Erben Müller-Burggraef),everyone,32.20
129967,US,Citation is given as much as a decade of bottl...,,90,75.0,Oregon,Oregon,Oregon Other,Paul Gregutt,@paulgwine,Citation 2004 Pinot Noir (Oregon),Pinot Noir,Citation,everyone,86.25
129968,France,Well-drained gravel soil gives this wine its c...,Kritt,90,30.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Gresser 2013 Kritt Gewurztraminer (Als...,Gewürztraminer,Domaine Gresser,everyone,34.50
129969,France,"A dry style of Pinot Gris, this is crisp with ...",,90,32.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Marcel Deiss 2012 Pinot Gris (Alsace),Pinot Gris,Domaine Marcel Deiss,everyone,36.80
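
Both kinds of renaming can be combined in a single call. The following sketch uses a hypothetical `toy_reviews` frame; as with the cells above, `rename` returns a new dataframe and leaves the original labels in place.

```python
import pandas as pd

# Hypothetical miniature stand-in for `reviews`.
toy_reviews = pd.DataFrame({"points": [87, 87, 90], "price": [None, 15.0, 14.0]})

# Rename one column and two index labels at once.
renamed = toy_reviews.rename(
    columns={"points": "score"},
    index={0: "firstEntry", 1: "secondEntry"},
)

print(renamed)
print(list(toy_reviews.columns))  # the original frame keeps 'points'
```

Labels that are not listed in the mapping, such as index `2` here, are kept as they are.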
