## File I/O

[Pandas Documentation](https://pandas.pydata.org/pandas-docs/stable/)

#### Formats supported by pandas


Format Type | Data Description | Reader | Writer
--- | --- | --- | ---
text | *CSV* | `read_csv` | `to_csv`
text | *JSON* | `read_json` | `to_json`
text | *HTML* | `read_html` | `to_html`
text | *Local clipboard* | `read_clipboard` | `to_clipboard`
binary | *MS Excel* | `read_excel` | `to_excel`
binary | *HDF5 Format* | `read_hdf` | `to_hdf`
binary | *Feather Format* | `read_feather` | `to_feather`
binary | *Parquet Format* | `read_parquet` | `to_parquet`
binary | *Msgpack* (removed in pandas 1.0) | `read_msgpack` | `to_msgpack`
binary | *Stata* | `read_stata` | `to_stata`
binary | *SAS* | `read_sas` | (none)
binary | *Python Pickle Format* | `read_pickle` | `to_pickle`
SQL | *SQL* | `read_sql` | `to_sql`
SQL | *Google BigQuery* | `read_gbq` | `to_gbq`
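
Every reader in the table is a top-level `pd.read_*` function, and every writer is a `DataFrame.to_*` method. A minimal sketch of that pairing for CSV, using an in-memory buffer instead of a file on disk; the sample frame and its column names are made up for illustration:

```python
import io

import pandas as pd

# Hypothetical sample data, not taken from the Titanic dataset.
sample = pd.DataFrame({"name": ["a", "b"], "score": [1.5, 2.5]})

# Writer: DataFrame.to_csv returns the CSV text when no path is given.
csv_text = sample.to_csv(index=False)

# Reader: pd.read_csv accepts any file-like object, not only a path.
restored = pd.read_csv(io.StringIO(csv_text))

print(csv_text)
print(restored)
```

The other text formats follow the same pattern; the binary and SQL formats usually need an extra dependency installed.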

In [23]:
import numpy as np
import pandas as pd

[Download the dataset from here!](https://www.kaggle.com/c/titanic/data)

In [24]:
df = pd.read_csv('train_detailed.csv')
df.head()

Unnamed: 0,Titanic Dataset,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,Unnamed: 10,Unnamed: 11
0,From Kaggle,,,,,,,,,,,
1,,,,,,,,,,,,
2,Variable,Definition,Key,,,,,,,,,
3,survival,Survival,"0 = No, 1 = Yes",,,,,,,,,
4,pclass,Ticket class,"1 = 1st, 2 = 2nd, 3 = 3rd",,,,,,,,,


In [25]:
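# Skip the 16 lines of description at the top of the file so that
# the real header row (PassengerId, Survived, ...) is used as column names.
df = pd.read_csv('train_detailed.csv', skiprows=16)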
df.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S
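
To see what `skiprows` does without the full dataset, here is a small sketch on an in-memory file. The three preamble lines and the two data rows are invented for the example; only the `skiprows` parameter itself comes from the cell above:

```python
import io

import pandas as pd

# Hypothetical file contents: three preamble lines, then the real header row.
raw = (
    "Titanic Dataset\n"
    "From Kaggle\n"
    "Variable,Definition,Key\n"
    "PassengerId,Survived,Age\n"
    "1,0,22.0\n"
    "2,1,38.0\n"
)

# skiprows=3 discards the first three lines, so the fourth line becomes the header.
demo = pd.read_csv(io.StringIO(raw), skiprows=3)
print(demo.columns.tolist())
print(len(demo))
```

The right number of rows to skip depends on the file, so it is worth opening the file in a text editor or checking `df.head()` first.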


In [27]:
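# Load only the listed columns. 'file.csv' is the copy of the data saved
# in "Writing to disk". A separate variable keeps the full `df` intact
# for the steps that follow.
subset = pd.read_csv('file.csv', usecols=['Survived', 'Age', 'Name'])
subset.head()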

Unnamed: 0,Survived,Name,Age
0,0,"Braund, Mr. Owen Harris",22.0
1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",38.0
2,1,"Heikkinen, Miss. Laina",26.0
3,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",35.0
4,0,"Allen, Mr. William Henry",35.0
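
Note that the output columns come back as `Survived, Name, Age` even though `usecols` listed `Age` before `Name`: `usecols` only selects columns and keeps the order they have in the file. A small sketch on made-up data, showing the selection and how to reorder afterwards if needed:

```python
import io

import pandas as pd

# Hypothetical miniature of the dataset; the values are placeholders.
raw = (
    "PassengerId,Survived,Name,Age\n"
    "1,0,Braund,22.0\n"
    "2,1,Cumings,38.0\n"
)

wanted = ["Survived", "Age", "Name"]

# usecols picks the columns but keeps the file's column order.
picked = pd.read_csv(io.StringIO(raw), usecols=wanted)
print(picked.columns.tolist())

# Index with the list to get the columns in the requested order.
reordered = picked[wanted]
print(reordered.columns.tolist())
```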


### Make `PassengerId` the index
___

In [30]:
df.index = df.PassengerId

In [31]:
df.head()

PassengerId (index),PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
2,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
3,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
4,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
5,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S
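
Assigning to `df.index` leaves `PassengerId` in place as a regular column as well, which is why it shows up twice in the output above. As an alternative sketch, `set_index` moves the column into the index in a single step; the two-row frame here is made up for illustration:

```python
import pandas as pd

# Hypothetical two-row frame standing in for the Titanic data.
people = pd.DataFrame({"PassengerId": [1, 2], "Survived": [0, 1]})

# Assigning to .index keeps PassengerId as a regular column too.
assigned = people.copy()
assigned.index = assigned.PassengerId

# set_index returns a new frame with the column moved into the index.
moved = people.set_index("PassengerId")

print(assigned.columns.tolist())
print(moved.columns.tolist())
print(moved.index.name)
```

Which form to use is a matter of taste; `set_index` simply avoids ending up with the same values in both the index and a column.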


### Dropping a column
___

In [32]:
df.drop(['PassengerId'], axis=1, inplace=True)

In [33]:
df.head()

PassengerId (index),Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S
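
With `inplace=True` the frame is modified directly and `drop` returns `None`. A sketch of the non-mutating form, which returns a new frame and leaves the original untouched; the small frame below is a made-up stand-in:

```python
import pandas as pd

# Hypothetical frame; the values are placeholders.
frame = pd.DataFrame(
    {"PassengerId": [1, 2], "Survived": [0, 1], "Age": [22.0, 38.0]}
)

# columns=[...] is equivalent to passing the labels with axis=1.
trimmed = frame.drop(columns=["PassengerId"])

print(frame.columns.tolist())    # original still has PassengerId
print(trimmed.columns.tolist())  # new frame does not
```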


### Writing to disk
___

In [13]:
df.to_csv('file.csv')
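
By default `to_csv` writes the index as the first column of the file, so a plain `read_csv` would load it back as an ordinary column. A minimal round-trip sketch, assuming a temporary directory and a made-up two-row frame, that restores the index with `index_col`:

```python
import os
import tempfile

import pandas as pd

# Hypothetical frame indexed by PassengerId, like df above.
indexed = pd.DataFrame(
    {"Survived": [0, 1]},
    index=pd.Index([1, 2], name="PassengerId"),
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.csv")  # placeholder file name
    indexed.to_csv(path)  # the index goes out as the first column
    reloaded = pd.read_csv(path, index_col="PassengerId")

print(reloaded)
```

Passing `index=False` to `to_csv` is the other option when the index carries no information worth saving.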