## Adding and removing data

In [1]:
import pandas as pd
import numpy as np

In [2]:
data_books = pd.read_csv('data/bestsellers-with-categories.csv')

In [3]:
data_books.loc[:5]

Unnamed: 0,Name,Author,User Rating,Reviews,Price,Year,Genre
0,10-Day Green Smoothie Cleanse,JJ Smith,4.7,17350,8,2016,Non Fiction
1,11/22/63: A Novel,Stephen King,4.6,2052,22,2011,Fiction
2,12 Rules for Life: An Antidote to Chaos,Jordan B. Peterson,4.7,18979,15,2018,Non Fiction
3,1984 (Signet Classics),George Orwell,4.7,21424,6,2017,Fiction
4,"5,000 Awesome Facts (About Everything!) (Natio...",National Geographic Kids,4.8,7665,12,2019,Non Fiction
5,A Dance with Dragons (A Song of Ice and Fire),George R. R. Martin,4.4,12643,11,2011,Fiction


### Removing columns

#### drop()
To remove columns from a data frame, where $data$ is a variable holding a CSV file loaded with $pd.read\_csv()$, we use $drop()$ as follows:
$$data.drop('column',\ axis=i)\hspace{0.3 cm}i=0,\ 1.$$
In the $axis$ parameter, $0$ refers to rows and $1$ refers to columns. In place of $'column'$ we can pass a list with more than one column: $['column1',\ 'column2']$.

In [4]:
data_books.drop(['User Rating', 'Price'], axis=1).loc[:5]  # .loc[:5] limits the output to the rows labeled 0 through 5

Unnamed: 0,Name,Author,Reviews,Year,Genre
0,10-Day Green Smoothie Cleanse,JJ Smith,17350,2016,Non Fiction
1,11/22/63: A Novel,Stephen King,2052,2011,Fiction
2,12 Rules for Life: An Antidote to Chaos,Jordan B. Peterson,18979,2018,Non Fiction
3,1984 (Signet Classics),George Orwell,21424,2017,Fiction
4,"5,000 Awesome Facts (About Everything!) (Natio...",National Geographic Kids,7665,2019,Non Fiction
5,A Dance with Dragons (A Song of Ice and Fire),George R. R. Martin,12643,2011,Fiction
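
As a side note, `drop()` also accepts a `columns=` keyword, which avoids having to remember which axis number means what. The sketch below uses a small made-up data frame (the names `data`, `by_axis` and `by_keyword` are illustrative, not part of the notebook) to check that both spellings give the same result.

```python
import pandas as pd

# Hypothetical miniature of the bestsellers table, used only for illustration.
data = pd.DataFrame({
    'Name': ['Book A', 'Book B', 'Book C'],
    'User Rating': [4.7, 4.6, 4.8],
    'Price': [8, 22, 12],
})

# Two equivalent ways of asking for the same columns to be removed.
by_axis = data.drop(['User Rating', 'Price'], axis=1)
by_keyword = data.drop(columns=['User Rating', 'Price'])

print(by_axis.equals(by_keyword))  # True: both calls remove the same columns
print(list(data.columns))          # the original frame still has all three columns
```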


If we display the data frame again, the dropped columns are still there: $drop()$ returns a new data frame and leaves the original unchanged.

In [5]:
data_books.loc[:5]

Unnamed: 0,Name,Author,User Rating,Reviews,Price,Year,Genre
0,10-Day Green Smoothie Cleanse,JJ Smith,4.7,17350,8,2016,Non Fiction
1,11/22/63: A Novel,Stephen King,4.6,2052,22,2011,Fiction
2,12 Rules for Life: An Antidote to Chaos,Jordan B. Peterson,4.7,18979,15,2018,Non Fiction
3,1984 (Signet Classics),George Orwell,4.7,21424,6,2017,Fiction
4,"5,000 Awesome Facts (About Everything!) (Natio...",National Geographic Kids,4.8,7665,12,2019,Non Fiction
5,A Dance with Dragons (A Song of Ice and Fire),George R. R. Martin,4.4,12643,11,2011,Fiction


To remove the columns permanently, add the parameter $inplace=True$ to the $drop()$ call:
$$data.drop('column',\ axis=i,\ inplace=True)$$

In [6]:
data_books.drop('Year',axis=1,inplace=True)

In [7]:
data_books.loc[:5]

Unnamed: 0,Name,Author,User Rating,Reviews,Price,Genre
0,10-Day Green Smoothie Cleanse,JJ Smith,4.7,17350,8,Non Fiction
1,11/22/63: A Novel,Stephen King,4.6,2052,22,Fiction
2,12 Rules for Life: An Antidote to Chaos,Jordan B. Peterson,4.7,18979,15,Non Fiction
3,1984 (Signet Classics),George Orwell,4.7,21424,6,Fiction
4,"5,000 Awesome Facts (About Everything!) (Natio...",National Geographic Kids,4.8,7665,12,Non Fiction
5,A Dance with Dragons (A Song of Ice and Fire),George R. R. Martin,4.4,12643,11,Fiction
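
One detail worth keeping in mind: with `inplace=True`, `drop()` modifies the data frame and returns `None`, so its result should not be assigned back to the variable. A minimal sketch, assuming a made-up two-column frame:

```python
import pandas as pd

# Hypothetical data, used only to illustrate the return value.
data = pd.DataFrame({'Name': ['Book A', 'Book B'], 'Year': [2016, 2011]})

# The frame is modified directly; nothing useful is returned.
result = data.drop('Year', axis=1, inplace=True)

print(result)              # None
print(list(data.columns))  # ['Name']
```

Writing `data = data.drop('Year', axis=1, inplace=True)` would therefore replace the data frame with `None`.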


Another way to remove columns permanently is to reassign the data frame to the result of $drop()$:
$$data = data.drop('column',\ axis=i)$$

In [8]:
data_books = data_books.drop('User Rating',axis=1)

In [9]:
data_books.loc[:5]

Unnamed: 0,Name,Author,Reviews,Price,Genre
0,10-Day Green Smoothie Cleanse,JJ Smith,17350,8,Non Fiction
1,11/22/63: A Novel,Stephen King,2052,22,Fiction
2,12 Rules for Life: An Antidote to Chaos,Jordan B. Peterson,18979,15,Non Fiction
3,1984 (Signet Classics),George Orwell,21424,6,Fiction
4,"5,000 Awesome Facts (About Everything!) (Natio...",National Geographic Kids,7665,12,Non Fiction
5,A Dance with Dragons (A Song of Ice and Fire),George R. R. Martin,12643,11,Fiction


We can also remove columns with Python's built-in $del$ statement, which is not specific to pandas (not recommended):
$$del\hspace{0.2 cm} data['column'].$$

In [10]:
del data_books['Price']

In [11]:
data_books.loc[:5]

Unnamed: 0,Name,Author,Reviews,Genre
0,10-Day Green Smoothie Cleanse,JJ Smith,17350,Non Fiction
1,11/22/63: A Novel,Stephen King,2052,Fiction
2,12 Rules for Life: An Antidote to Chaos,Jordan B. Peterson,18979,Non Fiction
3,1984 (Signet Classics),George Orwell,21424,Fiction
4,"5,000 Awesome Facts (About Everything!) (Natio...",National Geographic Kids,7665,Non Fiction
5,A Dance with Dragons (A Song of Ice and Fire),George R. R. Martin,12643,Fiction


### Removing rows

$drop()$ can also remove rows, using $axis=0$:
$$data.drop(i,\ axis=0)\hspace{0.3 cm}i\text{ is the index label of the row}.$$
With the default index, the label coincides with the row number. To remove several rows at once, we can pass a list of integers $[i,\ j,\ k,\ ...]$ or a $range()$ in place of $i$.

In [12]:
data_books.drop([0,2],axis=0).loc[:5]

Unnamed: 0,Name,Author,Reviews,Genre
1,11/22/63: A Novel,Stephen King,2052,Fiction
3,1984 (Signet Classics),George Orwell,21424,Fiction
4,"5,000 Awesome Facts (About Everything!) (Natio...",National Geographic Kids,7665,Non Fiction
5,A Dance with Dragons (A Song of Ice and Fire),George R. R. Martin,12643,Fiction


As in the previous case, calling $drop()$ on its own does not remove the rows permanently.
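
To make a row removal stick, the result can be assigned to a variable, just as with columns. The sketch below, on a made-up five-row frame, also shows that the remaining rows keep their original labels; calling `reset_index(drop=True)` afterwards is one optional way to renumber them.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
data = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D', 'E'],
    'Reviews': [10, 20, 30, 40, 50],
})

# Remove the rows labeled 0 and 1 and keep the result.
trimmed = data.drop(range(0, 2), axis=0)
print(list(trimmed.index))  # [2, 3, 4]: the surviving labels are unchanged

# Optionally renumber the rows from 0, discarding the old labels.
trimmed = trimmed.reset_index(drop=True)
print(list(trimmed.index))  # [0, 1, 2]
```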

### Adding columns

To add a column, assign its content to a new column name:
$$data['column\_ name'] = \text{content to assign}$$

In [13]:
data_books['Numeración'] = np.nan
data_books.loc[:5]

Unnamed: 0,Name,Author,Reviews,Genre,Numeración
0,10-Day Green Smoothie Cleanse,JJ Smith,17350,Non Fiction,
1,11/22/63: A Novel,Stephen King,2052,Fiction,
2,12 Rules for Life: An Antidote to Chaos,Jordan B. Peterson,18979,Non Fiction,
3,1984 (Signet Classics),George Orwell,21424,Fiction,
4,"5,000 Awesome Facts (About Everything!) (Natio...",National Geographic Kids,7665,Non Fiction,
5,A Dance with Dragons (A Song of Ice and Fire),George R. R. Martin,12643,Fiction,


We can get the number of rows with $shape[0]$:

In [14]:
data_books.shape[0]

550

In [15]:
data_books['Numeración'] = np.arange(0,data_books.shape[0])
data_books

Unnamed: 0,Name,Author,Reviews,Genre,Numeración
0,10-Day Green Smoothie Cleanse,JJ Smith,17350,Non Fiction,0
1,11/22/63: A Novel,Stephen King,2052,Fiction,1
2,12 Rules for Life: An Antidote to Chaos,Jordan B. Peterson,18979,Non Fiction,2
3,1984 (Signet Classics),George Orwell,21424,Fiction,3
4,"5,000 Awesome Facts (About Everything!) (Natio...",National Geographic Kids,7665,Non Fiction,4
...,...,...,...,...,...
545,Wrecking Ball (Diary of a Wimpy Kid Book 14),Jeff Kinney,9413,Fiction,545
546,You Are a Badass: How to Stop Doubting Your Gr...,Jen Sincero,14331,Non Fiction,546
547,You Are a Badass: How to Stop Doubting Your Gr...,Jen Sincero,14331,Non Fiction,547
548,You Are a Badass: How to Stop Doubting Your Gr...,Jen Sincero,14331,Non Fiction,548
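
The assigned content can take several forms. The sketch below, on a made-up three-row frame with illustrative column names, assigns a scalar (repeated in every row), an array with one value per row, and a value computed from an existing column.

```python
import numpy as np
import pandas as pd

# Hypothetical data, used only for illustration.
data = pd.DataFrame({
    'Name': ['A', 'B', 'C'],
    'Reviews': [17350, 2052, 18979],
})

data['Source'] = 'hypothetical'                       # scalar: same value in every row
data['Numbering'] = np.arange(data.shape[0])          # array: one value per row
data['Reviews (thousands)'] = data['Reviews'] / 1000  # derived from another column

print(data)
```

An array or list assigned this way must have exactly as many elements as the data frame has rows, which is why `shape[0]` is used to size it.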


### Adding rows

We can create a dictionary with the data we want to add:

In [16]:
otros_libros = {'Name':'Libro que no existe','Author':'Ricardo R','Reviews':0,'Genre':'Fiction','Numeración':1200}

To add the row, we concatenate the two data frames:
$$pd.concat([data_1,\ pd.DataFrame([\text{dictionary with the new data}])],\ ignore\_ index=True)$$

In [17]:
data_books = pd.concat([data_books,pd.DataFrame([otros_libros])],ignore_index=True)
data_books

Unnamed: 0,Name,Author,Reviews,Genre,Numeración
0,10-Day Green Smoothie Cleanse,JJ Smith,17350,Non Fiction,0
1,11/22/63: A Novel,Stephen King,2052,Fiction,1
2,12 Rules for Life: An Antidote to Chaos,Jordan B. Peterson,18979,Non Fiction,2
3,1984 (Signet Classics),George Orwell,21424,Fiction,3
4,"5,000 Awesome Facts (About Everything!) (Natio...",National Geographic Kids,7665,Non Fiction,4
...,...,...,...,...,...
546,You Are a Badass: How to Stop Doubting Your Gr...,Jen Sincero,14331,Non Fiction,546
547,You Are a Badass: How to Stop Doubting Your Gr...,Jen Sincero,14331,Non Fiction,547
548,You Are a Badass: How to Stop Doubting Your Gr...,Jen Sincero,14331,Non Fiction,548
549,You Are a Badass: How to Stop Doubting Your Gr...,Jen Sincero,14331,Non Fiction,549
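
The same pattern extends to several rows at once by passing a list of dictionaries. The sketch below uses made-up data and deliberately leaves one key out of the second dictionary to show that the missing value is filled with `NaN`.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
data = pd.DataFrame({'Name': ['A', 'B'], 'Reviews': [10, 20]})

# Two new rows; the second one has no 'Reviews' entry.
new_rows = [{'Name': 'C', 'Reviews': 0}, {'Name': 'D'}]

combined = pd.concat([data, pd.DataFrame(new_rows)], ignore_index=True)

print(list(combined.index))              # [0, 1, 2, 3]
print(combined['Reviews'].isna().sum())  # 1: the missing value became NaN
```

Without `ignore_index=True`, the new rows would keep the labels 0 and 1 from their own data frame, which would duplicate labels already present in `data`.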
