# Objective: Loading Data into DataFrames

<hr>

1. Sources from which DataFrames can be created
2. Loading from CSV
3. Loading from JSON - Structured & Unstructured
4. Loading from Excel

<hr>

### 1. Sources from which DataFrames can be created
* pandas can read data from many different sources; the image below lists them.
* The same image also lists the formats pandas can write data to.

<img src="images/IO.png">
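
As a small illustration of the read/write pairing described above, the sketch below writes a DataFrame to CSV text and reads it back. The toy data and the in-memory buffer are assumptions made for the example; they are not part of the course files.

```python
import io
import pandas as pd

# Hypothetical toy data, used only for illustration
df = pd.DataFrame({"city": ["Pune", "Delhi"], "rent": [12000, 18000]})

# Write to an in-memory text buffer instead of a file on disk
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Rewind the buffer and read the CSV text back into a new DataFrame
buffer.seek(0)
round_trip = pd.read_csv(buffer)
print(round_trip)
```

Most `pd.read_*` functions have a matching `DataFrame.to_*` method, so the same pattern applies to other formats such as JSON.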

In [None]:
import pandas as pd

### 2. Loading from CSV

In [None]:
rental_data = pd.read_csv('../Data/house_rental_data.csv.txt')

In [None]:
rental_data.info()

In [None]:
rental_data.head()

In [None]:
# 'Unnamed: 0' is the name pandas assigns to a first column with no header; use it as the index
rental_data = pd.read_csv('../Data/house_rental_data.csv.txt', index_col='Unnamed: 0')

In [None]:
rental_data.head()

In [None]:
# usecols accepts a callable: keep only the columns whose names start with 'B'
rental_data = pd.read_csv('../Data/house_rental_data.csv.txt', usecols=lambda c: c.startswith('B'))

In [None]:
rental_data.head()

In [None]:
# nrows limits the read to the first 10 data rows
rental_data = pd.read_csv('../Data/house_rental_data.csv.txt', nrows=10)

In [None]:
rental_data.head()
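
The `usecols` and `nrows` options shown above can be combined in one call. The sketch below assumes a small, made-up CSV held in memory; the column names are hypothetical and only chosen so that two of them start with `B`.

```python
import io
import pandas as pd

# Hypothetical CSV content, used only for illustration
csv_text = "Bedroom,Bathroom,Price\n2,1,100\n3,2,200\n4,3,300\n"

# Keep columns whose names start with 'B' and stop after 2 data rows
subset = pd.read_csv(
    io.StringIO(csv_text),
    usecols=lambda c: c.startswith("B"),
    nrows=2,
)
print(subset)
```

Filtering at read time avoids loading columns and rows that would be discarded anyway, which matters for large files.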

In [None]:
# With chunksize, read_csv returns an iterator that yields DataFrames of up to 300 rows each
rental_data_itr = pd.read_csv('../Data/house_rental_data.csv.txt', chunksize=300)

In [None]:
# Print the non-null count per column for each chunk (the iterator can be consumed only once)
for data in rental_data_itr:
    print(data.count())
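
Chunked reading is mainly useful for aggregating over a file that does not fit in memory. The following is a minimal sketch, assuming a made-up ten-row CSV held in memory, that accumulates a row count and a sum across chunks.

```python
import io
import pandas as pd

# Hypothetical in-memory CSV with a single column and 10 rows (values 0..9)
csv_text = "value\n" + "\n".join(str(i) for i in range(10))

total = 0
rows_seen = 0
# chunksize=4 yields chunks of 4, 4 and 2 rows
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=4):
    total += chunk["value"].sum()
    rows_seen += len(chunk)

print(rows_seen, total)
```

Only one chunk is held in memory at a time, so the running totals are the only state that grows with the file.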

In [None]:
# A dict for na_values applies per column: 'PC 17599' becomes NaN in the Ticket column only
titanic_data = pd.read_csv('../Data/titanic-train.csv.txt', index_col='PassengerId', na_values={'Ticket': 'PC 17599'})

In [None]:
titanic_data.head()
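
To see that a dict passed to `na_values` affects only the named column, the sketch below uses a made-up two-row CSV in which the same string appears in two columns. The `Cabin` values are hypothetical and exist only to show the contrast.

```python
import io
import pandas as pd

# Hypothetical data: 'PC 17599' appears in both Ticket and Cabin
csv_text = (
    "PassengerId,Ticket,Cabin\n"
    "1,PC 17599,C85\n"
    "2,A/5 21171,PC 17599\n"
)

df = pd.read_csv(
    io.StringIO(csv_text),
    index_col="PassengerId",
    na_values={"Ticket": "PC 17599"},  # applies to the Ticket column only
)
print(df)
```

The `Ticket` value for passenger 1 is read as missing, while the identical string in `Cabin` is kept as text.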

In [None]:
pd.read_csv('../Data/sales-data.csv').head()

In [None]:
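from datetime import datetime

def parser(x):
    # Prefix '200' so that a value such as '1-01' parses as year 2001, month 01
    return datetime.strptime('200' + x, '%Y-%m')

# Parse the first column as dates with the custom parser and use it as the index.
# Note: date_parser is deprecated since pandas 2.0.
data = pd.read_csv('../Data/sales-data.csv', header=0, parse_dates=[0], index_col=0, date_parser=parser)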

In [None]:
data.head()
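
Because `date_parser` is deprecated since pandas 2.0, one alternative is to read the date column as text and convert it afterwards with `pd.to_datetime`. The sketch below assumes the dates look like `1-01` (single-digit year, two-digit month), which is inferred from the parser above; the sample rows are made up.

```python
import io
import pandas as pd

# Hypothetical sample in the assumed 'Y-MM' layout
csv_text = "Month,Sales\n1-01,266.0\n1-02,145.9\n"

# Read the first column as a plain text index
df = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0)

# Prefix '200' to each label and parse it with an explicit format
df.index = pd.to_datetime("200" + df.index, format="%Y-%m")
print(df)
```

The conversion is vectorised over the whole index rather than called once per value.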

### 3. Loading from JSON - Structured & Unstructured

In [None]:
pd.read_json('https://raw.githubusercontent.com/corysimmons/colors.json/master/colors.json', orient='records').T.head()

In [None]:
# Show full cell contents; None replaces the deprecated value -1
pd.set_option('display.max_colwidth', None)
pd.read_json('../Data/raw_nyc_phil.json').head(1)

In [None]:
import json
with open('../Data/raw_nyc_phil.json') as f:
    d = json.load(f)

In [None]:
# pd.json_normalize replaces the deprecated pd.io.json.json_normalize (pandas 1.0+)
nycphil = pd.json_normalize(d['programs'])
nycphil.head(3)

In [None]:
# One row per entry in each program's 'works' list, with program-level fields attached as meta columns
works_data = pd.json_normalize(data=d['programs'], record_path='works',
                               meta=['id', 'orchestra', 'programID', 'season'])
works_data.head(3)

In [None]:
# One row per entry in each program's 'concerts' list
concerts_data = pd.json_normalize(data=d['programs'], record_path='concerts',
                                  meta=['id', 'orchestra', 'programID', 'season'])
concerts_data.head(3)

In [None]:
# A list-valued record_path walks nested lists: programs -> works -> soloists
soloist_data = pd.json_normalize(data=d['programs'], record_path=['works', 'soloists'],
                                 meta=['id'])
soloist_data.head(3)
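
The effect of `record_path` and `meta` is easier to follow on a tiny input. The sketch below uses a made-up structure that only mimics the nesting used above (programs containing works, works containing soloists); the field values are hypothetical, not taken from the NYC Philharmonic file.

```python
import pandas as pd

# Hypothetical nested data with the same shape as d['programs']
programs = [
    {
        "id": "p1",
        "season": "1842-43",
        "works": [
            {"workTitle": "Work A", "soloists": []},
            {"workTitle": "Work B", "soloists": [{"soloistName": "Soloist X"}]},
        ],
    },
]

# One row per work, with program-level fields copied onto each row
works = pd.json_normalize(programs, record_path="works", meta=["id", "season"])

# One row per soloist, reached by walking works -> soloists
soloists = pd.json_normalize(programs, record_path=["works", "soloists"], meta=["id"])

print(works)
print(soloists)
```

Works with an empty `soloists` list contribute no rows to the second frame, so its length can be smaller than the number of works.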

### 4. Loading from Excel

In [None]:
sales_data = pd.read_excel('../Data/sales-funnel.xlsx')

In [None]:
sales_data.head()

In [None]:
# sheet_name selects a specific worksheet; by default the first sheet is read
sales_orders = pd.read_excel('../Data/SampleData.xlsx', sheet_name='SalesOrders')

In [None]:
sales_orders.info()