# DataFrames

The pd.DataFrame() constructor creates a DataFrame from data, row labels (index) and column labels (columns)

![DF%20v2.png](attachment:DF%20v2.png)

In [1]:
import pandas as pd

In [6]:
row_labels = [0,1,2,3,4]
column_labels = ['A','B','C']
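# Each inner sequence is one row. Row 2 is a set, which is unordered,
# so its values may not appear in the order written (see the output below)
data = [[1,2,5],(4,2,4),{6,7,9},[3,2,4],[3,3,2]]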

In [7]:
# Creating a DataFrame using the variables in the previous cell
df = pd.DataFrame(index=row_labels, data=data, columns=column_labels)

In [8]:
type(df)

pandas.core.frame.DataFrame

In [9]:
df

,A,B,C
0,1,2,5
1,4,2,4
2,9,6,7
3,3,2,4
4,3,3,2
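
The same table can also be built from a dictionary that maps each column label to its values. This may be preferable when the order of values matters, since a row given as a set (as in `data` above) has no guaranteed order. A minimal sketch; the name `df_from_dict` is illustrative and not part of the original notebook.

```python
import pandas as pd

# Hypothetical alternative: one dict entry per column.
# The values mirror the table shown above.
df_from_dict = pd.DataFrame(
    {
        'A': [1, 4, 9, 3, 3],
        'B': [2, 2, 6, 2, 3],
        'C': [5, 4, 7, 4, 2],
    },
    index=[0, 1, 2, 3, 4],
)

print(df_from_dict.shape)          # (5, 3)
print(list(df_from_dict.columns))  # ['A', 'B', 'C']
```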


## set_index()

The set_index() method sets the DataFrame index using one or more existing columns

![set_index%202.png](attachment:set_index%202.png)

In [10]:
df

,A,B,C
0,1,2,5
1,4,2,4
2,9,6,7
3,3,2,4
4,3,3,2


In [11]:
# sets column 'C' as the index
df.set_index('C')

C,A,B
5,1,2
4,4,2
7,9,6
4,3,2
2,3,3


In [12]:
# drop=False will retain 'C' as a column
df.set_index('C', drop=False)

C,A,B,C
5,1,2,5
4,4,2,4
7,9,6,7
4,3,2,4
2,3,3,2


In [13]:
df

,A,B,C
0,1,2,5
1,4,2,4
2,9,6,7
3,3,2,4
4,3,3,2


In [14]:
# set_index() returns a new DataFrame; df itself is modified only when inplace=True
df.set_index('C', drop=False, inplace=True)

In [15]:
df

C,A,B,C
5,1,2,5
4,4,2,4
7,9,6,7
4,3,2,4
2,3,3,2


In [16]:
# append=True adds 'B' as an extra level on the existing index, producing a MultiIndex
df.set_index('B', drop=False, append=True)

C,B,A,B,C
5,2,1,2,5
4,2,4,2,4
7,6,9,6,7
4,2,3,2,4
2,3,3,3,2
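
As an alternative to `inplace=True`, the result of `set_index()` can be assigned to a new name, which leaves the original DataFrame untouched. The sketch below also passes a list of columns to build a two-level index in a single call. The names `example` and `indexed` are illustrative assumptions, not part of the notebook.

```python
import pandas as pd

# Small illustrative frame (first three rows of the table above).
example = pd.DataFrame({'A': [1, 4, 9], 'B': [2, 2, 6], 'C': [5, 4, 7]})

# set_index() returns a new DataFrame; 'example' keeps its default index.
indexed = example.set_index(['C', 'B'])

print(list(indexed.index.names))  # ['C', 'B']
print(list(indexed.columns))      # ['A']
print(list(example.columns))      # ['A', 'B', 'C']
```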


## reset_index()

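The reset_index() method replaces the current index with the default integer index (0, 1, 2, ...). By default the old index is inserted as a column; pass drop=True to discard it instead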

![reset%203.png](attachment:reset%203.png)

In [17]:
df

C,A,B,C
5,1,2,5
4,4,2,4
7,9,6,7
4,3,2,4
2,3,3,2


In [18]:
# 'C' was kept as a column (drop=False in set_index), so reset_index() cannot insert the index
# as a second column named 'C'. In this case drop=True must be passed to discard the index
df.reset_index()

ValueError: cannot insert C, already exists

In [21]:
# drop=True discards the old index instead of inserting it as a column
df.reset_index(drop=True, inplace=True)

In [22]:
df

,A,B,C
0,1,2,5
1,4,2,4
2,9,6,7
3,3,2,4
4,3,3,2
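
When the index should be kept rather than discarded, one option is to rename the index first so that it no longer collides with an existing column. A minimal sketch, assuming a small frame indexed by `C` with `drop=False`; the label `C_index` and the variable names are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Illustrative frame whose index duplicates the 'C' column.
example = pd.DataFrame({'A': [1, 4], 'B': [2, 2], 'C': [5, 4]})
example = example.set_index('C', drop=False)

# Rename the index, then move it back into the columns.
restored = example.rename_axis('C_index').reset_index()

print(list(restored.columns))  # ['C_index', 'A', 'B', 'C']
print(list(restored.index))    # [0, 1]
```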


## read_csv()

pandas provides many input functions that create a DataFrame from tabular data. read_csv() reads a comma-separated values (CSV) file

![read%20csv%202.png](attachment:read%20csv%202.png)

In [26]:
countries_table = pd.read_csv(filepath_or_buffer='top_10_countries.csv')

In [27]:
countries_table

,Rank,Country / Dependency,Region,Population,% of world,Date
0,1,China,Asia,1412600000,17.80%,31-Dec-21
1,2,India,Asia,1386946912,17.50%,18-Jan-22
2,3,United States,Americas,333073186,4.20%,18-Jan-22
3,4,Indonesia[b],Asia,271350000,3.42%,31-Dec-20
4,5,Pakistan,Asia,225200000,2.84%,01-Jul-21
5,6,Brazil,Americas,214231641,2.70%,18-Jan-22
6,7,Nigeria,Africa,211401000,2.67%,01-Jul-21
7,8,Bangladesh,Asia,172062576,2.17%,18-Jan-22
8,9,Russia[b],Europe,146171015,1.84%,01-Jan-21
9,10,Mexico,Americas,126014024,1.59%,02-Mar-20


In [28]:
type(countries_table)

pandas.core.frame.DataFrame

In [29]:
# header=None means the file has no header row, so the columns are labelled by integer position (0, 1, 2, ...)
countries_table = pd.read_csv('top_10_countries_no_header.csv', header=None)

In [30]:
countries_table

,0,1,2,3,4,5
0,1,China,Asia,1412600000,17.80%,31-Dec-21
1,2,India,Asia,1386946912,17.50%,18-Jan-22
2,3,United States,Americas,333073186,4.20%,18-Jan-22
3,4,Indonesia[b],Asia,271350000,3.42%,31-Dec-20
4,5,Pakistan,Asia,225200000,2.84%,01-Jul-21
5,6,Brazil,Americas,214231641,2.70%,18-Jan-22
6,7,Nigeria,Africa,211401000,2.67%,01-Jul-21
7,8,Bangladesh,Asia,172062576,2.17%,18-Jan-22
8,9,Russia[b],Europe,146171015,1.84%,01-Jan-21
9,10,Mexico,Americas,126014024,1.59%,02-Mar-20
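
When a file has no header row, descriptive column names can be supplied through the `names` parameter instead of accepting the integer labels. The sketch below reads from an in-memory string via `io.StringIO` so that it runs without a file on disk. The two sample rows are copied from the table above, and the column names mirror the header of top_10_countries.csv shown earlier; the variable names are illustrative.

```python
import io

import pandas as pd

# Stand-in for a header-less CSV file (two rows from the table above).
csv_text = (
    "1,China,Asia,1412600000,17.80%,31-Dec-21\n"
    "2,India,Asia,1386946912,17.50%,18-Jan-22\n"
)

columns = ['Rank', 'Country / Dependency', 'Region',
           'Population', '% of world', 'Date']

# header=None: no header row in the data; names=...: labels to use instead.
table = pd.read_csv(io.StringIO(csv_text), header=None, names=columns)

print(list(table.columns))
print(table.loc[0, 'Country / Dependency'])  # China
```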


## Links and resources:
* DataFrame: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html
* set_index: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.set_index.html
* reset_index: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.reset_index.html
* input / output functions: https://pandas.pydata.org/docs/reference/io.html