# Pandas Demo
This demo covers the basic Pandas functionality used in the course. Pandas is a Python library for data analysis and manipulation. To learn more, see the [documentation](https://pandas.pydata.org/docs/).

In [281]:
import pandas as pd
import numpy as np

## DataFrames
**DataFrames** are two-dimensional, labelled data structures for working with tabular data: each column has a name and each row has an index label. For more information, see the [DataFrame reference](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html) in the Pandas documentation.
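
A DataFrame can also be built directly in memory, which is handy for experimenting without a data file. The sketch below is illustrative only: the random values, the seed, and the name `demo_df` are assumptions and are not the contents of the course's `mydf.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 4 rows and 3 columns of random values (not the course's mydf.csv)
rng = np.random.default_rng(seed=0)
values = rng.normal(size=(4, 3))

# Wrap the array in a DataFrame with labelled columns
demo_df = pd.DataFrame(values, columns=['col 1', 'col 2', 'col 3'])

print(demo_df.shape)          # (number of rows, number of columns)
print(list(demo_df.columns))  # column labels
print(demo_df.dtypes)         # one dtype per column
```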


In [282]:
# read csv file from specified path
df = pd.read_csv('../Data/mydf.csv')

# display the first rows (head() returns up to 5 by default)
df.head()

|   | col 1 | col 2 | col 3 |
|---|-------|-------|-------|
| 0 | 1.331587 | 0.715279 | -1.5454 |
| 1 | -0.008384 | 0.5 | -0.720086 |
| 2 | 0.265512 | 0.5 | 0.004291 |
| 3 | -0.1746 | 0.433026 | 1.203037 |
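
If the CSV file is not at hand, the same calls can be tried on an in-memory string. This is a minimal sketch under stated assumptions: `csv_text` and `small_df` are hypothetical stand-ins, and the numbers are made up rather than taken from `mydf.csv`.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for a file on disk
csv_text = """col 1,col 2,col 3
1.0,0.5,-1.5
2.0,0.5,-0.7
3.0,0.4,0.0
4.0,0.3,1.2
"""

# read_csv accepts a file path or any file-like object
small_df = pd.read_csv(io.StringIO(csv_text))

# head(n) returns the first n rows; with no argument it returns up to 5
first_two = small_df.head(2)
print(first_two)
```

Because this frame has only four rows, `small_df.head()` returns all of them, just as `df.head()` does above.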


In [271]:
# print last 2 rows
df.tail(2)

|   | col 1 | col 2 | col 3 |
|---|-------|-------|-------|
| 2 | 0.265512 | 0.5 | 0.004291 |
| 3 | -0.1746 | 0.433026 | 1.203037 |


In [272]:
# index DataFrame by column
Y = df['col 2']

Y

0    0.715279
1    0.500000
2    0.500000
3    0.433026
Name: col 2, dtype: float64

In [273]:
# get the column's values as a NumPy array, without the index
df['col 2'].values

array([0.71527897, 0.5       , 0.5       , 0.43302619])
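
The Pandas documentation recommends `to_numpy()` over the `.values` attribute for getting the underlying array. A short sketch, assuming a Series `s` rebuilt from the rounded `col 2` values printed above (the name `s` is hypothetical):

```python
import numpy as np
import pandas as pd

# Values copied from the 'col 2' output above, rounded to six decimals
s = pd.Series([0.715279, 0.5, 0.5, 0.433026], name='col 2')

# to_numpy() returns the data as a NumPy array, dropping the index and name
arr = s.to_numpy()
print(type(arr))
print(arr.shape)
```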

In [274]:
# index DataFrame by columns
df[['col 1', 'col 3']]

|   | col 1 | col 3 |
|---|-------|-------|
| 0 | 1.331587 | -1.5454 |
| 1 | -0.008384 | -0.720086 |
| 2 | 0.265512 | 0.004291 |
| 3 | -0.1746 | 1.203037 |
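
The two indexing forms above return different types: a single label gives a Series, while a list of labels gives a DataFrame, even when the list holds one name. The sketch below uses a small hypothetical frame `demo` with made-up values to show the difference.

```python
import pandas as pd

# Hypothetical two-row frame for illustration
demo = pd.DataFrame({'col 1': [1.0, 2.0], 'col 2': [3.0, 4.0], 'col 3': [5.0, 6.0]})

one_col = demo['col 2']          # single label: one-dimensional Series
one_col_frame = demo[['col 2']]  # list with one label: two-dimensional DataFrame

print(type(one_col).__name__, one_col.shape)
print(type(one_col_frame).__name__, one_col_frame.shape)
```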


In [275]:
# index DataFrame by row number
df.iloc[0]

col 1    1.331587
col 2    0.715279
col 3   -1.545400
Name: 0, dtype: float64

In [276]:
# index DataFrame by row and column number
df.iloc[0, 1]

0.7152789743984055
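
`iloc` selects by integer position; its label-based counterpart is `loc`. The sketch below rebuilds the frame from the rounded values printed above (the name `df_demo` is an assumption) and shows that the two agree here only because the row labels happen to equal the row positions.

```python
import pandas as pd

# Values copied from the outputs above, rounded to six decimals
df_demo = pd.DataFrame({
    'col 1': [1.331587, -0.008384, 0.265512, -0.174600],
    'col 2': [0.715279, 0.500000, 0.500000, 0.433026],
    'col 3': [-1.545400, -0.720086, 0.004291, 1.203037],
})

by_position = df_demo.iloc[0, 1]    # row position 0, column position 1
by_label = df_demo.loc[0, 'col 2']  # row label 0, column label 'col 2'
print(by_position, by_label)

# Slices differ: iloc excludes the end position, loc includes the end label
print(len(df_demo.iloc[0:2]))  # rows at positions 0 and 1
print(len(df_demo.loc[0:2]))   # rows labelled 0, 1 and 2
```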

In [277]:
# remove a column
X = df.drop(labels='col 1', axis=1)

X

|   | col 2 | col 3 |
|---|-------|-------|
| 0 | 0.715279 | -1.5454 |
| 1 | 0.5 | -0.720086 |
| 2 | 0.5 | 0.004291 |
| 3 | 0.433026 | 1.203037 |


In [279]:
# remove multiple columns
df3 = df.drop(labels=['col 1', 'col 3'], axis=1)

df3

|   | col 2 |
|---|-------|
| 0 | 0.715279 |
| 1 | 0.5 |
| 2 | 0.5 |
| 3 | 0.433026 |


In [280]:
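# remove a column in place (the result is not assigned to a new variable)
# assumption: df2 is not defined in the cells shown; here it starts as a copy of X (col 2 and col 3)
df2 = X.copy()
df2.drop(labels='col 2', axis=1, inplace=True)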

df2

|   | col 3 |
|---|-------|
| 0 | -1.5454 |
| 1 | -0.720086 |
| 2 | 0.004291 |
| 3 | 1.203037 |
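
To make the difference between the two styles of `drop` concrete, the sketch below uses a small hypothetical frame `demo` with made-up values. It also uses the `columns=` keyword, which is equivalent to passing `labels=` with `axis=1`.

```python
import pandas as pd

# Hypothetical two-row frame for illustration
demo = pd.DataFrame({'col 1': [1.0, 2.0], 'col 2': [3.0, 4.0], 'col 3': [5.0, 6.0]})

# Default behaviour: drop returns a new DataFrame and leaves the original alone
reduced = demo.drop(columns=['col 1'])
print(list(reduced.columns))
print(list(demo.columns))  # still has all three columns

# With inplace=True, drop modifies the DataFrame itself and returns None
result = demo.drop(columns='col 2', inplace=True)
print(result)
print(list(demo.columns))
```

Since the in-place call returns `None`, writing `demo = demo.drop(..., inplace=True)` would replace the DataFrame with `None`; assigning the returned copy is usually the safer pattern.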
