## Using CSV Files In Pandas

In the previous module we learned how to read, write, and use CSV files in Python. In this module we go deeper into manipulating and analyzing CSV data efficiently with the pandas library. The previous module introduced the fundamental techniques for working with CSV files in Python; pandas makes data handling a little easier.

### Pandas DataFrames

The pandas library provides a versatile, high-performance data structure called a DataFrame. Loading a CSV file into a DataFrame lets us filter, sort, and aggregate data with ease. In short, pandas offers three things:

- It imports tabular data into a structure called a DataFrame.
- It makes it easy to slice, subset, filter, and reformat data.
- It has basic statistics and plotting capabilities built in.

In [None]:
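import pandas as pd

# Load the CSV file into a DataFrame ('filename.csv' is a placeholder path)
df = pd.read_csv('filename.csv')

# Print the first 5 rows of the DataFrame
print(df.head())

# Print the column names of the DataFrame
print(df.columns)

# Print the shape of the DataFrame as (rows, columns)
print(df.shape)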

In [None]:
# Rename columns by passing a dict that maps old names to new names
df = df.rename(columns={"DAYS": "days", df.columns[1]: "d18o"})

# Find the data type of each column
df.dtypes
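
Because `filename.csv` is a placeholder, the steps above can be tried on a small in-memory CSV instead. The following is a minimal sketch: the column names `DAYS` and `d18O_raw` and all of the values are invented for illustration and do not come from a real dataset.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for 'filename.csv'
csv_text = """DAYS,d18O_raw
1,-13.84
2,-14.20
3,-13.10
4,-15.02
"""

# pd.read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

# Rename both columns with a dict that maps old names to new names
df = df.rename(columns={"DAYS": "days", df.columns[1]: "d18o"})

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column names after renaming
print(df.dtypes)         # data type of each column
```

Note that `rename` returns a new DataFrame, which is why the result is assigned back to `df`.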

#### Indexing Columns

The most basic way to select a column is with square brackets: `df[]`

In [None]:
# Get the 'days' column of the DataFrame, then get its first value

day = df['days']
day[0]

In [None]:
# Select multiple columns by passing a list of column names

df[['Column1', 'Column2']]

# Select a single column by name using attribute access

df.d18o

# Save a column as a NumPy array or a Python list

df.d18o.to_numpy()

df['d18o'].to_list()
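
The different ways of selecting columns return different types, which is easy to miss. The sketch below uses a small DataFrame with made-up values (the column names follow the earlier cells) to show what each form gives back.

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"days": [1, 2, 3], "d18o": [-13.84, -14.2, -13.1]})

one_col = df["d18o"]            # single brackets with one name: a Series
sub = df[["d18o"]]              # double brackets (a list of names): a DataFrame
values = df["d18o"].to_numpy()  # NumPy array of the column values
as_list = df["d18o"].to_list()  # plain Python list of the column values

print(type(one_col).__name__)
print(type(sub).__name__)
print(values.shape)
print(as_list)
```

Attribute access such as `df.d18o` only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute, so bracket access is the safer default.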

#### Indexing Rows

In [None]:
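# Select specific rows by position using .iloc

# Rows 1-10 (positions 0 through 9; the end of the slice is excluded)

df.iloc[0:10]

# Rows 1, 4, and 6

df.iloc[[0, 3, 5]]

# Note: to select rows by index label rather than by position, use .loc[] instead of .iloc[]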
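
The difference between `.iloc` and `.loc` is easiest to see on a DataFrame whose index is not the default 0, 1, 2, and so on. This is a minimal sketch with invented values and hypothetical row labels `a` through `d`.

```python
import pandas as pd

# Hypothetical data with string row labels, for illustration only
df = pd.DataFrame(
    {"d18o": [-13.84, -14.2, -13.1, -15.02]},
    index=["a", "b", "c", "d"],
)

# .iloc selects by integer position; the end of the slice is excluded
by_position = df.iloc[0:2]

# .loc selects by index label; the end of the slice is included
by_label = df.loc["a":"c"]

print(by_position)
print(by_label)
```

Here `df.iloc[0:2]` returns two rows, while `df.loc["a":"c"]` returns three, because label-based slices include both endpoints.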

#### Row Logicals 

In [None]:
# Get every row in my data with d18o values greater than -14
above_neg14 = df[df["d18o"] > -14]

# Get every row in my data with d18o values equal to -13.84
equal_1384 = df[df["d18o"] == -13.84]
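
A comparison such as `df["d18o"] > -14` produces a Series of True/False values (a boolean mask), and indexing with that mask keeps only the True rows. The sketch below, using made-up values, shows the mask on its own, how to combine two conditions, and one possible alternative to exact `==` on floating-point data using `numpy.isclose`. The tolerance-based comparison is a suggestion, not part of the original example.

```python
import numpy as np
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame(
    {"days": [1, 2, 3, 4], "d18o": [-13.84, -14.2, -13.1, -15.02]}
)

# The comparison alone gives a boolean Series, one value per row
mask = df["d18o"] > -14
above_neg14 = df[mask]

# Combine conditions with & (and) or | (or); each condition needs parentheses
in_range = df[(df["d18o"] > -14.5) & (df["d18o"] < -13.5)]

# Floating-point values may differ from the typed number by rounding error,
# so a tolerance-based comparison can be more robust than ==
close_1384 = df[np.isclose(df["d18o"], -13.84)]

print(above_neg14)
print(in_range)
print(close_1384)
```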

#### Set Index 

Say you want the index to be more meaningful:

In [None]:
df2 = df.set_index('days')
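
To illustrate what a more meaningful index allows, here is a minimal sketch with invented values. It assumes the `days` values are unique, so that looking up one label returns a single value.

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"days": [10, 20, 30], "d18o": [-13.84, -14.2, -13.1]})

# set_index returns a new DataFrame; 'days' moves from the columns to the index
df2 = df.set_index("days")

# Rows can now be looked up by their 'days' value with .loc
value_day20 = df2.loc[20, "d18o"]

# reset_index moves the index back into an ordinary column
restored = df2.reset_index()

print(df2)
print(value_day20)
print(list(restored.columns))
```

The original `df` is unchanged, so both `df` and `df2` remain available for later cells.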

#### Plotting 

In [None]:
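# Plot every numeric column against the default integer index
df.plot()

# Plot the remaining columns against the 'days' index
df2.plot()

# Plot d18o against days by naming both columns explicitly
df.plot(x='days', y='d18o')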