# Adding or removing data with `Pandas`

In [3]:
import numpy as np
import pandas as pd

In [4]:
datos = pd.read_csv('../Datos/bestsellers.csv')
datos.head()

Unnamed: 0,Name,Author,User Rating,Reviews,Price,Year,Genre
0,10-Day Green Smoothie Cleanse,JJ Smith,4.7,17350,8,2016,Non Fiction
1,11/22/63: A Novel,Stephen King,4.6,2052,22,2011,Fiction
2,12 Rules for Life: An Antidote to Chaos,Jordan B. Peterson,4.7,18979,15,2018,Non Fiction
3,1984 (Signet Classics),George Orwell,4.7,21424,6,2017,Fiction
4,"5,000 Awesome Facts (About Everything!) (Natio...",National Geographic Kids,4.8,7665,12,2019,Non Fiction


Suppose we want to remove the `Genre` column. To do so, we use the method `variable_df.drop('Column', axis = 1, inplace = True)`, where `axis = 1` indicates that the operation acts on columns and `inplace = True` applies the change to the data itself instead of only returning a modified copy as output.

In [5]:
datos.drop('Genre', inplace = True, axis = 1)
datos.head(5)

Unnamed: 0,Name,Author,User Rating,Reviews,Price,Year
0,10-Day Green Smoothie Cleanse,JJ Smith,4.7,17350,8,2016
1,11/22/63: A Novel,Stephen King,4.6,2052,22,2011
2,12 Rules for Life: An Antidote to Chaos,Jordan B. Peterson,4.7,18979,15,2018
3,1984 (Signet Classics),George Orwell,4.7,21424,6,2017
4,"5,000 Awesome Facts (About Everything!) (Natio...",National Geographic Kids,4.8,7665,12,2019


And the `Genre` column no longer appears!
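
As a minimal sketch of what `inplace` changes, the example below uses a small made-up DataFrame (the `sample` variable and its values are illustrative assumptions, not the bestsellers data). Without `inplace = True`, `drop` returns a new DataFrame and leaves the original untouched; the `columns=` keyword is an equivalent way of saying `axis = 1`.

```python
import pandas as pd

# Small hypothetical frame standing in for the bestsellers data
sample = pd.DataFrame({
    'Name': ['Book A', 'Book B'],
    'Price': [8, 22],
    'Genre': ['Non Fiction', 'Fiction'],
})

# Without inplace=True, drop returns a new DataFrame
without_genre = sample.drop('Genre', axis=1)

# columns= is equivalent to passing axis=1
without_genre_2 = sample.drop(columns=['Genre'])

print(list(sample.columns))         # the original still has 'Genre'
print(list(without_genre.columns))  # the copy does not
```

Assigning the result back, as in `sample = sample.drop(columns=['Genre'])`, has the same visible effect as `inplace = True`.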

Another option is to use `del`, the `Python` reserved word for deleting objects.

In [6]:
del datos['Price']

In [7]:
datos.head()

Unnamed: 0,Name,Author,User Rating,Reviews,Year
0,10-Day Green Smoothie Cleanse,JJ Smith,4.7,17350,2016
1,11/22/63: A Novel,Stephen King,4.6,2052,2011
2,12 Rules for Life: An Antidote to Chaos,Jordan B. Peterson,4.7,18979,2018
3,1984 (Signet Classics),George Orwell,4.7,21424,2017
4,"5,000 Awesome Facts (About Everything!) (Natio...",National Geographic Kids,4.8,7665,2019
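
A related option, sketched here on a made-up frame (the `sample` variable and its values are assumptions for illustration), is `DataFrame.pop`. Like `del`, it removes the column in place, but it also returns the removed column as a Series, which helps when the values are still needed.

```python
import pandas as pd

# Small hypothetical frame standing in for the bestsellers data
sample = pd.DataFrame({
    'Name': ['Book A', 'Book B'],
    'Price': [8, 22],
    'Year': [2016, 2011],
})

# del removes the column in place and returns nothing
del sample['Price']

# pop removes the column in place and returns it as a Series
years = sample.pop('Year')

print(list(sample.columns))
print(years.tolist())
```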


## Removing rows

For rows, it is enough to pass the index label of the row and use the parameter `axis = 0`, which refers to row operations. Keep in mind that this does not change the index labels of the remaining rows.

In [8]:
datos.drop(0, axis = 0, inplace = True)
datos.head(5)

Unnamed: 0,Name,Author,User Rating,Reviews,Year
1,11/22/63: A Novel,Stephen King,4.6,2052,2011
2,12 Rules for Life: An Antidote to Chaos,Jordan B. Peterson,4.7,18979,2018
3,1984 (Signet Classics),George Orwell,4.7,21424,2017
4,"5,000 Awesome Facts (About Everything!) (Natio...",National Geographic Kids,4.8,7665,2019
5,A Dance with Dragons (A Song of Ice and Fire),George R. R. Martin,4.4,12643,2011


The row with index `0` is gone. We can also remove several rows at once by passing a sequence of index labels, such as a `range`.

In [10]:
datos.drop(range(1, 10), axis = 0, inplace = True)
datos.head()

Unnamed: 0,Name,Author,User Rating,Reviews,Year
10,A Man Called Ove: A Novel,Fredrik Backman,4.6,23848,2017
11,A Patriot's History of the United States: From...,Larry Schweikart,4.6,460,2010
12,A Stolen Life: A Memoir,Jaycee Dugard,4.6,4149,2011
13,A Wrinkle in Time (Time Quintet),Madeleine L'Engle,4.5,5153,2018
14,"Act Like a Lady, Think Like a Man: What Men Re...",Steve Harvey,4.6,5013,2009


Note that the printed rows now start at index ten, since we removed the rows with indices one through nine.
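
If consecutive labels are wanted again after removing rows, one possible approach is `reset_index(drop=True)`. The sketch below uses a made-up frame (the `sample` variable and its values are assumptions for illustration) to show that `drop` keeps the old labels and that `reset_index` renumbers them from zero.

```python
import pandas as pd

# Small hypothetical frame with the default index 0..3
sample = pd.DataFrame({'Name': ['A', 'B', 'C', 'D']})

# Removing rows keeps the labels of the remaining rows
trimmed = sample.drop([0, 1], axis=0)
print(trimmed.index.tolist())

# reset_index(drop=True) renumbers from zero and discards the old labels
renumbered = trimmed.reset_index(drop=True)
print(renumbered.index.tolist())
```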

## Adding columns

To add columns, we can use `NumPy` values or arrays.

In [11]:
# Add a column of missing values
datos['Nueva columna'] = np.nan
datos.head()

Unnamed: 0,Name,Author,User Rating,Reviews,Year,Nueva columna
10,A Man Called Ove: A Novel,Fredrik Backman,4.6,23848,2017,
11,A Patriot's History of the United States: From...,Larry Schweikart,4.6,460,2010,
12,A Stolen Life: A Memoir,Jaycee Dugard,4.6,4149,2011,
13,A Wrinkle in Time (Time Quintet),Madeleine L'Engle,4.5,5153,2018,
14,"Act Like a Lady, Think Like a Man: What Men Re...",Steve Harvey,4.6,5013,2009,


Note that the newly created column appears at the end, with the value `NaN` for every observation.
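
A scalar such as `np.nan` is repeated for every row. To give each row its own value, a `NumPy` array with one element per row can be assigned instead. The following is a minimal sketch on a made-up frame; the `sample` variable and the column names `Empty` and `Rank` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame standing in for the bestsellers data
sample = pd.DataFrame({
    'Reviews': [17350, 2052, 18979],
    'User Rating': [4.7, 4.6, 4.7],
})

# A scalar is broadcast to every row
sample['Empty'] = np.nan

# An array must have exactly one element per row
sample['Rank'] = np.arange(1, len(sample) + 1)

print(sample)
```

Assigning an array whose length differs from the number of rows raises a `ValueError`.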

In [13]:
datos.shape[0]

540

## Adding rows

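There are several ways to add rows to a `Pandas` DataFrame. One of them is the `append` method, but passing it a plain list, as in the cell below, does not add a single row: each element of the list becomes a separate row under a new column named `0`, as the output shows. Note also that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0.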

In [16]:
datos.append(['Perder es cuestión de método', 'Santiago Gamboa', 7.8, 2453, 1995, 'X'])


Unnamed: 0,Name,Author,User Rating,Reviews,Year,Nueva columna,0
10,A Man Called Ove: A Novel,Fredrik Backman,4.6,23848.0,2017.0,,
11,A Patriot's History of the United States: From...,Larry Schweikart,4.6,460.0,2010.0,,
12,A Stolen Life: A Memoir,Jaycee Dugard,4.6,4149.0,2011.0,,
13,A Wrinkle in Time (Time Quintet),Madeleine L'Engle,4.5,5153.0,2018.0,,
14,"Act Like a Lady, Think Like a Man: What Men Re...",Steve Harvey,4.6,5013.0,2009.0,,
...,...,...,...,...,...,...,...
1,,,,,,,Santiago Gamboa
2,,,,,,,7.8
3,,,,,,,2453
4,,,,,,,1995
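
To add the book as one proper row, a possible approach is to build a one-row DataFrame with matching column names and join it with `pd.concat`, or to assign to a new label with `.loc`. The sketch below is an illustration under assumptions: `sample` is a simplified two-row frame with only five of the columns, and the variable names are made up.

```python
import pandas as pd

# Simplified hypothetical frame with five of the columns used above
sample = pd.DataFrame({
    'Name': ['A Man Called Ove: A Novel', 'A Stolen Life: A Memoir'],
    'Author': ['Fredrik Backman', 'Jaycee Dugard'],
    'User Rating': [4.6, 4.6],
    'Reviews': [23848, 4149],
    'Year': [2017, 2011],
})

# Option 1: one-row DataFrame with matching column names, joined with concat
new_row = pd.DataFrame([{
    'Name': 'Perder es cuestión de método',
    'Author': 'Santiago Gamboa',
    'User Rating': 7.8,
    'Reviews': 2453,
    'Year': 1995,
}])
extended = pd.concat([sample, new_row], ignore_index=True)

# Option 2: assign to a label that does not exist yet (values in column order)
with_loc = sample.copy()
with_loc.loc[len(with_loc)] = [
    'Perder es cuestión de método', 'Santiago Gamboa', 7.8, 2453, 1995,
]

print(extended.tail(1))
```

`ignore_index=True` renumbers the result from zero. The `.loc` variant relies on `len(with_loc)` being an unused label, which holds here because the frame has the default index. When many rows are added, collecting them in a list and calling `pd.concat` once is usually preferable to growing the frame row by row.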
