# Spreadsheets 1 (Dataframes)

We will now depart from the course material: I'm introducing NumPy and pandas as a replacement for spreadsheet software. A spreadsheet and a dataframe aren't exactly the same, but they are quite similar. I will, however, include some notes from the Udemy Data Track course so as not to lose track completely. The following note will appear in later notebooks as a reminder.

#### Note: These notebooks follow the curriculum of the Business Analytics Scholarship Program from Udemy and Bertelsmann, which was originally intended for use with spreadsheet software. I decided to keep the curriculum in its original order, so some topics will feel out of place when translated for use with pandas.

## Commas vs Periods

An eternal dilemma, similar to imperial versus metric (Go meters!): depending on where you were born or raised, you probably use one convention over the other. For educational purposes, we will use:

* **Decimal Separator**: Period (.)
* **Thousand Separator**: Comma (,)
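
If a file was written with the opposite convention (comma as decimal separator, period as thousand separator), pandas can be told so when reading it. The following is a minimal sketch that assumes a small made-up sample (`raw`, with hypothetical items and prices) separated by semicolons; it relies on the `sep`, `decimal` and `thousands` parameters of `pd.read_csv`.

```python
import io
import pandas as pd

# Hypothetical sample written in the "comma as decimal" convention.
# Fields are separated by semicolons so the commas inside numbers stay unambiguous.
raw = "item;price\nwidget;1.234,56\ngadget;7,5\n"

# Tell pandas which character plays which role while parsing
prices = pd.read_csv(io.StringIO(raw), sep=";", decimal=",", thousands=".")
print(prices)
print(prices["price"].dtype)  # the column is parsed as numbers, not text
```

Once loaded, the numbers are stored as regular floats, so the rest of the notebook works the same regardless of how the file was written.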

## What are dataframes?
A dataframe is a data structure that resembles a spreadsheet, and it is one of the most popular ways to work with data. For this basic introduction we will be using the pandas library.

In [1]:
# We import pandas and give it the usual alias
import pandas as pd
# With this we create an empty dataframe
df = pd.DataFrame()
print(df)

Empty DataFrame
Columns: []
Index: []


In [2]:
# We can create a single-column dataframe from a list
data = [1,2,3,4,5]
df = pd.DataFrame(data)
print(df)

   0
0  1
1  2
2  3
3  4
4  5


In [3]:
# Or create something more recognizable to us
data = [['Jose',10],['Joseph',12],['Jotaro',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
print(df)

     Name  Age
0    Jose   10
1  Joseph   12
2  Jotaro   13
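
The same table can also be built from a dictionary, where each key becomes a column name and each list holds that column's values. This is only a sketch of one alternative; the variable name `df_from_dict` is made up for illustration.

```python
import pandas as pd

# Each key is a column name, each list is that column's values
data = {"Name": ["Jose", "Joseph", "Jotaro"], "Age": [10, 12, 13]}
df_from_dict = pd.DataFrame(data)
print(df_from_dict)
```

With this form there is no need to pass the `columns` argument, because the names come from the dictionary keys.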


There are many ways to put data into a dataframe, but we will restrict ourselves to CSV files because we can use them in notebooks, Excel, Google Sheets, etcetera.

So, **how do I import a CSV file into pandas?**

You know what that means, time to work with the pokedex!

In [4]:
pokemon = pd.read_csv('datasets/pokemon.csv')

We make sure everything is all right using the **head()** method, which gives us the first five rows with all their columns.

In [5]:
pokemon.head()

Unnamed: 0,#,Name,Type 1,Type 2,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,80,82,83,100,100,80,1,False
3,4,Mega Venusaur,Grass,Poison,80,100,123,122,120,80,1,False
4,5,Charmander,Fire,,39,52,43,60,50,65,1,False
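
If you don't have the dataset file at hand, the same two steps can be tried on a small piece of CSV text held in memory. This sketch assumes a trimmed sample (`csv_text`, three rows and five columns copied from the output above) and uses `io.StringIO` so that `pd.read_csv` can treat the text like a file. It also shows that `head()` accepts the number of rows you want.

```python
import io
import pandas as pd

# Trimmed sample: three rows and five columns taken from the output above
csv_text = """#,Name,Type 1,Type 2,HP
1,Bulbasaur,Grass,Poison,45
2,Ivysaur,Grass,Poison,60
3,Venusaur,Grass,Poison,80
"""

# StringIO lets read_csv treat the text as if it were a file
sample = pd.read_csv(io.StringIO(csv_text))

# head() takes an optional number of rows (the default is five)
print(sample.head(2))
print(list(sample.columns))
```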


## Insert Data
Like on a spreadsheet, we can insert a new column with the .insert() method. While you can specify a different value for each row, it's easier to start with an empty cell or a placeholder.

We will create a new column **Catch?** at index 12 (pandas columns are 0-indexed, so it becomes the 13th column, right after Legendary) with the text value No to indicate that we haven't caught any pokemon (sad, I know).

In [18]:
pokemon.insert(12,"Catch?", 'No')

In [7]:
pokemon.head()

Unnamed: 0,#,Name,Type 1,Type 2,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary,Catch?
0,1,Bulbasaur,Grass,Poison,45,49,49,65,65,45,1,False,No
1,2,Ivysaur,Grass,Poison,60,62,63,80,80,60,1,False,No
2,3,Venusaur,Grass,Poison,80,82,83,100,100,80,1,False,No
3,4,Mega Venusaur,Grass,Poison,80,100,123,122,120,80,1,False,No
4,5,Charmander,Fire,,39,52,43,60,50,65,1,False,No
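
For the case where each row gets its own value, .insert() also accepts a list with one entry per row. The sketch below uses a small made-up dataframe (`team`) and hypothetical Yes/No values, and places the new column at index 1 to show that it doesn't have to go at the end.

```python
import pandas as pd

# Small made-up dataframe standing in for the pokedex
team = pd.DataFrame({"Name": ["Bulbasaur", "Ivysaur", "Venusaur"],
                     "HP": [45, 60, 80]})

# One value per row (hypothetical); index 1 puts the column right after Name
team.insert(1, "Catch?", ["Yes", "No", "No"])
print(team)
```

Keep in mind that running the same .insert() cell twice raises a ValueError, because the column already exists the second time.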


The .insert() method changes the dataframe in memory, but it **doesn't** change the original CSV file.

## Delete Data
We can also delete columns with the .drop() method, which takes as arguments the name of the column we wish to delete and "axis=1" (to drop a column rather than a row).

I also assigned the result back to the same variable instead of using the *inplace* argument; .drop() returns a new dataframe, so this is what changes the dataframe itself.

In [12]:
pokemon = pokemon.drop(labels="Catch?", axis=1)

In [15]:
pokemon.head()

Unnamed: 0,#,Name,Type 1,Type 2,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,80,82,83,100,100,80,1,False
3,4,Mega Venusaur,Grass,Poison,80,100,123,122,120,80,1,False
4,5,Charmander,Fire,,39,52,43,60,50,65,1,False
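
To see why the reassignment matters, here is a small sketch on a made-up dataframe (`df` and `dropped` are illustrative names). It also uses the `columns=` keyword, which is an equivalent way of writing `labels=..., axis=1`.

```python
import pandas as pd

# Made-up dataframe with a column we want to remove
df = pd.DataFrame({"Name": ["Jose", "Joseph"],
                   "Age": [10, 12],
                   "Catch?": ["No", "No"]})

# drop() returns a new dataframe; columns="Catch?" means the same as labels="Catch?", axis=1
dropped = df.drop(columns="Catch?")

print(list(df.columns))       # the original still has the column
print(list(dropped.columns))  # the returned copy does not
```

Without assigning the result to a variable, the dropped version is simply discarded.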


## Saving and exporting a CSV file from pandas
We can save our new CSV in our working directory. We will name it "pokemon_boogaloo.csv" and save it without its row index.

In [19]:
pokemon.to_csv('pokemon_boogaloo.csv', index=False)
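
To illustrate what `index=False` avoids, the sketch below writes a small made-up dataframe to CSV text both ways and reads each version back. It relies on `to_csv()` returning the CSV as a string when no file name is given; all variable names are illustrative.

```python
import io
import pandas as pd

df = pd.DataFrame({"Name": ["Jose", "Joseph"], "Age": [10, 12]})

# With no file name, to_csv() returns the CSV text instead of writing a file
with_index = df.to_csv()
without_index = df.to_csv(index=False)

# Read both versions back to compare the resulting columns
reloaded_with = pd.read_csv(io.StringIO(with_index))
reloaded_without = pd.read_csv(io.StringIO(without_index))

print(list(reloaded_with.columns))     # includes an extra "Unnamed: 0" column
print(list(reloaded_without.columns))  # only the original columns
```

When the index is saved, it comes back as an extra column without a proper name, which is usually not what we want for a plain numbered index.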