# Loading and Handling Pandas Data

---

## Parsing Data

Import the pandas library. **Note that the code below binds the alias `pd` to the pandas library.**

In [None]:
import pandas as pd

Loading a tsv file through the `read_csv` function and specifying that tabs separate the columns in the file.

In [None]:
df = pd.read_csv('data/tsv_example.tsv', sep='\t')
df

Loading the same tsv file as above, this time with `read_table`, a function that uses tabs as the column delimiter by default.

In [None]:
df = pd.read_table('data/tsv_example.tsv')
df
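The two calls above can be compared on a small in-memory table. This is a minimal sketch: `tsv_text` is made-up data standing in for `data/tsv_example.tsv`, whose contents are not shown here, and the column names are assumptions.

```python
import io
import pandas as pd

# Hypothetical tab-separated data; not the contents of data/tsv_example.tsv.
tsv_text = "name\tamount\nalice\t10\nbob\t20\n"

# read_csv needs sep='\t' for this data; read_table uses tabs by default.
df_csv = pd.read_csv(io.StringIO(tsv_text), sep="\t")
df_table = pd.read_table(io.StringIO(tsv_text))

# Both calls should parse the text into the same DataFrame.
same = df_csv.equals(df_table)
print(same)
```

`io.StringIO` lets pandas read from a string as if it were a file, which keeps the example independent of any file on disk.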

Load only the first 3 rows of the dataframe.

In [None]:
df = pd.read_csv('data/tsv_example.tsv', sep='\t', nrows=3)
df.shape  # Returns the dimensions of the DataFrame as a tuple: (rows, columns)
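To see what `nrows` counts, the same idea can be tried on in-memory data. A minimal sketch, assuming a hypothetical table with one header line and five data rows:

```python
import io
import pandas as pd

# Hypothetical data: one header line followed by five data rows.
tsv_text = "id\tvalue\n1\ta\n2\tb\n3\tc\n4\td\n5\te\n"

# nrows limits the number of data rows read; the header line is not counted.
df_head = pd.read_csv(io.StringIO(tsv_text), sep="\t", nrows=3)

# shape is a (rows, columns) tuple.
n_rows, n_cols = df_head.shape
print(n_rows, n_cols)
```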

---

## Headers and Indexes

Loading a csv with the default settings, which treat the first row as the header row.

In [None]:
df = pd.read_csv("data/noheader_example.csv")
df

Specifying that the csv does not contain a header row.

In [None]:
df = pd.read_csv("data/noheader_example.csv", header=None)
df
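The effect of `header=None` is easiest to see side by side. A minimal sketch on made-up data: `csv_text` and the names passed to `names` are assumptions for illustration, not the contents of `data/noheader_example.csv`.

```python
import io
import pandas as pd

# Hypothetical header-less data: every line is a record.
csv_text = "a1,10,x\na2,20,y\na3,30,z\n"

# Default: the first line is consumed as column names, so one record is lost.
df_default = pd.read_csv(io.StringIO(csv_text))

# header=None: every line is data and the columns are labeled 0, 1, 2.
df_noheader = pd.read_csv(io.StringIO(csv_text), header=None)

# names= supplies column labels for a file that has none (labels are made up).
df_named = pd.read_csv(io.StringIO(csv_text), header=None,
                       names=["id", "amount", "label"])

print(df_default.shape, df_noheader.shape)
```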

Load a csv with a predefined index column using the integer location of the column. The first row is still treated as the header here.

In [None]:
df = pd.read_csv('data/noheader_example.csv', index_col=0)
df

Load a csv without headers but with a predefined index column using the integer location of the column.

In [None]:
df = pd.read_csv('data/noheader_example.csv', header=None, index_col=0)
df

Add a name to the index column of the dataframe above.

In [None]:
df.index.name = "Unique-ID"
df
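The index steps above can be sketched end to end on in-memory data. The values in `csv_text` are assumptions, not the contents of `data/noheader_example.csv`.

```python
import io
import pandas as pd

# Hypothetical header-less data whose first column holds unique identifiers.
csv_text = "a1,10,x\na2,20,y\na3,30,z\n"

# Column 0 becomes the index; the remaining columns keep the labels 1 and 2.
df_indexed = pd.read_csv(io.StringIO(csv_text), header=None, index_col=0)
df_indexed.index.name = "Unique-ID"

# With a meaningful index, rows can be selected by label.
row = df_indexed.loc["a2"]
print(row)
```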

---

## Common Data Loading Problems

### Missing Values

Loading a dataframe containing missing values that are labeled as 'Null' without letting pandas know.

In [None]:
df = pd.read_csv("data/null_values_example.csv")
df

Loading the same dataframe and telling pandas that 'Null' values indicate missing values.

In [None]:
df = pd.read_csv("data/null_values_example.csv", na_values='Null')
df
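The difference between the two loads shows up in how many missing values are detected and in the column type. A minimal sketch with a made-up table (`csv_text` is hypothetical data, not the contents of `data/null_values_example.csv`):

```python
import io
import pandas as pd

# Hypothetical data in which one cost is recorded as the text 'Null'.
csv_text = "item,cost\npen,1.5\nbook,Null\nbag,20\n"

# Without na_values, 'Null' is kept as ordinary text in the column.
df_raw = pd.read_csv(io.StringIO(csv_text))

# With na_values='Null', the entry is parsed as NaN and the column is numeric.
df_clean = pd.read_csv(io.StringIO(csv_text), na_values="Null")

missing_raw = int(df_raw["cost"].isna().sum())
missing_clean = int(df_clean["cost"].isna().sum())
print(missing_raw, missing_clean)
```

`na_values` also accepts a list of strings, which helps when a file marks missing data in more than one way.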

---

## Writing Data in Text Format

Write the dataframe `df` to a file in the data directory called `new_file.csv`.

In [None]:
df.to_csv('data/new_file.csv')
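By default `to_csv` also writes the index as the first column. A minimal sketch of the difference, using a made-up dataframe (`df_out` is hypothetical) and returning the CSV text as a string instead of writing to disk:

```python
import io
import pandas as pd

# Hypothetical dataframe used only for this illustration.
df_out = pd.DataFrame({"item": ["pen", "book"], "cost": [1.5, 12.0]})

# With no path argument, to_csv returns the CSV text as a string.
with_index = df_out.to_csv()
without_index = df_out.to_csv(index=False)

# The first line shows whether an (unnamed) index column was written.
header_with = with_index.splitlines()[0]
header_without = without_index.splitlines()[0]

# Reading the index-free text back recovers the original columns.
df_back = pd.read_csv(io.StringIO(without_index))
print(header_with, header_without)
```

When the index carries no information, `index=False` avoids an extra unnamed column the next time the file is loaded. When it does carry information, the file can be read back with `index_col=0`.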

## End of Notebook
---