# Reading and Writing CSV Files as pandas Dataframes

## lesson_4_1_2

### Reading CSV files with pandas

Documentation for [pandas.read_csv()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html)

pandas has a function named `read_csv()` that reads a CSV file directly into a dataframe.
This function takes the following arguments, among others:
- `filepath_or_buffer`: the path of the file to read
- `sep=','`: the delimiter. The default is already `','`, but I encourage passing it explicitly so that you know how to set it when you encounter a file delimited with something else
- `header=0`: tells pandas to treat the first line as the column names; refer to the documentation for other options
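
As a minimal sketch of how `sep` and `header` behave, the example below parses a small semicolon-delimited sample from an in-memory string rather than a file. The sample data and the use of `io.StringIO` are assumptions made for illustration; they are not part of the lesson files.

```python
import io

import pandas as pd

# Hypothetical semicolon-delimited data, not one of the lesson files.
raw = "id;weekday;tip\n1;Saturday;16.23\n2;Friday;5.99\n"

# sep=';' matches the delimiter; header=0 uses the first line as column names.
with_header = pd.read_csv(io.StringIO(raw), sep=';', header=0)

# header=None treats every line as data and numbers the columns 0, 1, 2.
without_header = pd.read_csv(io.StringIO(raw), sep=';', header=None)

print(list(with_header.columns))  # ['id', 'weekday', 'tip']
print(with_header.shape)          # (2, 3)
print(without_header.shape)       # (3, 3)
```

With `header=None` the original header line becomes an ordinary data row, which is why the second dataframe has one more row than the first.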

There are other arguments to skip blank lines (`skip_blank_lines`), set the encoding (`encoding`), and provide column names (`names`). Please refer to the documentation for more information.
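
The sketch below shows one way these extra arguments might be combined. The sample contents are hypothetical and are read from in-memory buffers (`io.StringIO` and `io.BytesIO`) so the example does not depend on any file from the lesson.

```python
import io

import pandas as pd

# Hypothetical contents with no header row and a blank line in the middle.
raw = "1,Saturday,16.23\n\n2,Friday,5.99\n"

df = pd.read_csv(
    io.StringIO(raw),
    sep=',',
    header=None,                     # the data has no header line
    names=['id', 'weekday', 'tip'],  # so the column names are supplied here
    skip_blank_lines=True,           # the default: blank lines are dropped
)
print(df.shape)  # (2, 3)

# encoding matters when pandas has to decode bytes, as it does for a file on disk.
latin_bytes = "name\nJosé\n".encode('latin-1')
latin_df = pd.read_csv(io.BytesIO(latin_bytes), encoding='latin-1')
print(latin_df['name'][0])
```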

#### Read tips.csv into a dataframe

In [1]:
import pandas as pd

tips_df = pd.read_csv('./tips.csv', sep=',', header=0)

In [2]:
tips_df.head()


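| Unnamed: 0 | id | weekday | meal_type | wait_staff | party_size | meal_total | tip |
|---|---|---|---|---|---|---|---|
| 0 | 1 | Saturday | Dinner | Marcia | 2 | 100.64 | 16.23 |
| 1 | 2 | Friday | Dinner | Marcia | 2 | 109.84 | 5.99 |
| 2 | 3 | Friday | Lunch | Jan | 4 | 90.5 | 22.04 |
| 3 | 4 | Monday | Dinner | Marcia | 1 | 60.01 | 8.77 |
| 4 | 5 | Monday | Breakfast | Jan | 1 | 10.88 | 1.68 |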


#### Add the tip_percent column and calculation to the dataframe

In [4]:
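# Tip as a fraction of the meal total (0.16 means 16%), stored as a named Series
percent_tip = pd.Series(tips_df['tip'] / tips_df['meal_total'], name='tip_percent')
# axis=1 appends the Series as a new column, aligned on the row index
tips_df = pd.concat([tips_df, percent_tip], axis=1)
tips_df.head()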

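| Unnamed: 0 | id | weekday | meal_type | wait_staff | party_size | meal_total | tip | tip_percent |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Saturday | Dinner | Marcia | 2 | 100.64 | 16.23 | 0.161268 |
| 1 | 2 | Friday | Dinner | Marcia | 2 | 109.84 | 5.99 | 0.054534 |
| 2 | 3 | Friday | Lunch | Jan | 4 | 90.5 | 22.04 | 0.243536 |
| 3 | 4 | Monday | Dinner | Marcia | 1 | 60.01 | 8.77 | 0.146142 |
| 4 | 5 | Monday | Breakfast | Jan | 1 | 10.88 | 1.68 | 0.154412 |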
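
The same column can also be added without `pd.concat`. The sketch below is an alternative, not the lesson's method: it rebuilds the first three rows shown above by hand (only the numeric columns, which is a simplification) and compares direct column assignment and `DataFrame.assign()` against the concat approach.

```python
import pandas as pd

# First three rows of the output above, numeric columns only (a simplification).
tips_df = pd.DataFrame({
    'id': [1, 2, 3],
    'meal_total': [100.64, 109.84, 90.5],
    'tip': [16.23, 5.99, 22.04],
})

# Direct assignment adds the column to the dataframe it is called on.
direct = tips_df.copy()
direct['tip_percent'] = direct['tip'] / direct['meal_total']

# assign() returns a new dataframe and leaves tips_df untouched.
assigned = tips_df.assign(tip_percent=tips_df['tip'] / tips_df['meal_total'])

# The concat approach from the cell above, for comparison.
percent_tip = pd.Series(tips_df['tip'] / tips_df['meal_total'], name='tip_percent')
concatenated = pd.concat([tips_df, percent_tip], axis=1)

# Both checks raise AssertionError if the results differ.
pd.testing.assert_frame_equal(direct, assigned)
pd.testing.assert_frame_equal(direct, concatenated)
print(direct['tip_percent'].round(6).tolist())
```

All three approaches produce the same dataframe; direct assignment is the shortest, while `assign()` is convenient when the original dataframe should stay unchanged.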


### Writing CSV files with pandas

Documentation for [pandas.DataFrame.to_csv()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html)

pandas dataframes have a method named `to_csv()` that writes the dataframe to a CSV file.
This method takes the following arguments, among others:
- `path_or_buf`: the path of the file to write
- `sep=','`: the delimiter. The default is already `','`, but I encourage passing it explicitly
- `header=True`: writes the column names as the first line of the file
- `index=False`: prevents the index values from being written to the file (the default is `index=True`)

There are other arguments; please refer to the documentation for more information.
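
To make the effect of `index` concrete, the sketch below calls `to_csv()` without a path, in which case it returns the CSV text as a string instead of writing a file. The two-row dataframe is made up for illustration.

```python
import io

import pandas as pd

# Hypothetical two-row dataframe, not the lesson data.
df = pd.DataFrame({'id': [1, 2], 'tip': [16.23, 5.99]})

# With no path argument, to_csv() returns the CSV text.
with_index = df.to_csv(sep=',', header=True)  # index=True is the default
without_index = df.to_csv(sep=',', header=True, index=False)

print(with_index.splitlines())     # [',id,tip', '0,1,16.23', '1,2,5.99']
print(without_index.splitlines())  # ['id,tip', '1,16.23', '2,5.99']

# Reading the indexed text back gives the header-less first column a generated name.
reread = pd.read_csv(io.StringIO(with_index), sep=',', header=0)
print(list(reread.columns))        # ['Unnamed: 0', 'id', 'tip']
```

Writing with `index=False` avoids carrying that extra column into the file when the index holds no meaningful information.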

#### Write tips_df to tips_percent.csv 

In [5]:
tips_df.to_csv('tips_percent.csv', sep=',', header=True, index=False)

#### Check that the file was written correctly by reading it into a new dataframe

In [6]:
tips_df_from_file = pd.read_csv('./tips_percent.csv', sep=',', header=0)
tips_df_from_file.head()

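| Unnamed: 0 | id | weekday | meal_type | wait_staff | party_size | meal_total | tip | tip_percent |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Saturday | Dinner | Marcia | 2 | 100.64 | 16.23 | 0.161268 |
| 1 | 2 | Friday | Dinner | Marcia | 2 | 109.84 | 5.99 | 0.054534 |
| 2 | 3 | Friday | Lunch | Jan | 4 | 90.5 | 22.04 | 0.243536 |
| 3 | 4 | Monday | Dinner | Marcia | 1 | 60.01 | 8.77 | 0.146142 |
| 4 | 5 | Monday | Breakfast | Jan | 1 | 10.88 | 1.68 | 0.154412 |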
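
Comparing the two `head()` outputs by eye works for a small file. As a sketch of a programmatic check, the example below writes a small dataframe to a temporary directory, reads it back, and compares the two with `pd.testing.assert_frame_equal`. The two sample rows are copied from the output above; the temporary path is an assumption so that the example does not overwrite `tips_percent.csv`.

```python
import os
import tempfile

import pandas as pd

# Two rows copied from the output above.
tips_df = pd.DataFrame({
    'id': [1, 2],
    'weekday': ['Saturday', 'Friday'],
    'meal_total': [100.64, 109.84],
    'tip': [16.23, 5.99],
})
tips_df['tip_percent'] = tips_df['tip'] / tips_df['meal_total']

# Write to a temporary directory and read the file back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'tips_percent.csv')
    tips_df.to_csv(path, sep=',', header=True, index=False)
    tips_df_from_file = pd.read_csv(path, sep=',', header=0)

# Raises AssertionError if the columns, dtypes, or values differ
# (floats are compared within a small tolerance by default).
pd.testing.assert_frame_equal(tips_df, tips_df_from_file)
print('round trip OK')
```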
