# Data Integration
## Table of Contents
- [Requirements](#requirements)
- [Structuring Historical Yield Data](#structuring-historical-yield-data)
- [Structuring Historical Price Received Data](#structuring-historical-price-received-data)
- [Structuring Historical Weather Data](#structuring-historical-weather-data)
- [Integrating Data](#integrating-data)

## Requirements

In [108]:
import pandas as pd

In [109]:
states = ['ILLINOIS', 'INDIANA', 'IOWA', 'MINNESOTA', 'MISSOURI', 'NEBRASKA']

yield_raw = pd.read_csv('../../data/raw/yield_raw.csv')
price_received_raw = pd.read_csv('../../data/raw/price_received_raw.csv')
weather_raw = pd.read_csv('../../data/raw/weather_raw.csv')

## Structuring Historical Yield Data

In [110]:
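# Keep annual records for the selected states, then drop repeated
# (year, state, utilization practice) rows so pivot() sees unique keys.
is_selected = yield_raw['state_name'].isin(states) & (yield_raw['reference_period_desc'] == 'YEAR')
yield_raw = yield_raw[is_selected].drop_duplicates(
    subset=['year', 'state_name', 'util_practice_desc']
)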

yield_raw = yield_raw.pivot(
    index=['year', 'state_name'],
    columns='util_practice_desc',
    values='Value'
).reset_index()
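
The long-to-wide reshape in the cell above can be sketched on a tiny frame. The rows and values below are made up for illustration and do not come from `yield_raw.csv`; only the column names are taken from the cell. The sketch shows why duplicates are dropped first: `pivot` raises a `ValueError` when an (index, columns) pair appears more than once.

```python
import pandas as pd

# Hypothetical long-format rows; the values are invented for illustration.
toy = pd.DataFrame({
    'year': [2020, 2020, 2020, 2021],
    'state_name': ['IOWA', 'IOWA', 'IOWA', 'IOWA'],
    'util_practice_desc': ['GRAIN', 'GRAIN', 'SILAGE', 'GRAIN'],
    'Value': [178.0, 178.0, 20.5, 204.0],
})

# Drop the repeated (2020, IOWA, GRAIN) row, then spread each
# utilization practice into its own column.
toy_wide = (
    toy.drop_duplicates(subset=['year', 'state_name', 'util_practice_desc'])
       .pivot(index=['year', 'state_name'], columns='util_practice_desc', values='Value')
       .reset_index()
)
print(toy_wide)
```

Combinations missing from the long table (here, silage for 2021) come out as `NaN` in the wide table.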

## Structuring Historical Price Received Data

In [111]:
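# Reshape long to wide: one column per reference period.
# pivot() raises ValueError if a (year, state_name, reference_period_desc) key repeats.
price_received_raw = price_received_raw.pivot(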
    index=['year', 'state_name'],
    columns='reference_period_desc',
    values='Value'
).reset_index()
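
Unlike the yield cell, the price cell above has no de-duplication step, so it relies on each (year, state, reference period) key being unique in the raw file. A minimal sketch of a pre-pivot check follows, assuming hypothetical rows; the values and the `key` and `dupes` names are illustrative only.

```python
import pandas as pd

# Hypothetical price rows with one repeated (2020, IOWA, JAN) key.
toy_price = pd.DataFrame({
    'year': [2020, 2020, 2020],
    'state_name': ['IOWA', 'IOWA', 'IOWA'],
    'reference_period_desc': ['JAN', 'JAN', 'FEB'],
    'Value': [3.8, 3.8, 3.7],
})

key = ['year', 'state_name', 'reference_period_desc']

# keep=False marks every member of a duplicated group, not just the later ones.
dupes = toy_price[toy_price.duplicated(subset=key, keep=False)]
n_dupes = len(dupes)
print(n_dupes)
```

If a check like this finds rows in the real data, they would need to be dropped or aggregated before `pivot` can succeed.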

## Structuring Historical Weather Data

In [112]:
# Align the key column names with the yield and price tables.
weather_raw = weather_raw.rename(columns={'Date': 'year', 'state': 'state_name'})
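
Renaming the columns aligns the key names, but a merge on `year` and `state_name` also needs the key values to match exactly. The sketch below assumes, purely for illustration, a weather table with title-case state names and string years; the `tavg` column and all values are hypothetical, and the real `weather_raw.csv` may already be in the right form.

```python
import pandas as pd

# Hypothetical weather rows; 'tavg' and the values are invented.
toy_weather = pd.DataFrame({
    'Date': ['2020', '2021'],
    'state': ['Iowa', 'Iowa'],
    'tavg': [48.1, 49.3],
})

toy_weather = toy_weather.rename(columns={'Date': 'year', 'state': 'state_name'})

# Match the integer years and upper-case state names used in the states list.
toy_weather['year'] = toy_weather['year'].astype(int)
toy_weather['state_name'] = toy_weather['state_name'].str.upper()
print(toy_weather.dtypes)
```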

## Integrating Data

In [113]:
temp = yield_raw.merge(price_received_raw, on=['year', 'state_name'], how='outer')
df = temp.merge(
    weather_raw,
    on=['year', 'state_name'],
    how='outer'
)
# Lowercase all column names in one pass.
df = df.rename(columns=str.lower)
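
Because the joins are outer joins, rows present in only one source are kept and padded with `NaN`. One way to audit that, sketched here on made-up frames rather than the notebook's data, is the `indicator` and `validate` parameters of `DataFrame.merge`. The `left`, `right`, and `audit` names and all values are hypothetical.

```python
import pandas as pd

# Hypothetical wide tables sharing the (year, state_name) key.
left = pd.DataFrame({
    'year': [2020, 2021],
    'state_name': ['IOWA', 'IOWA'],
    'GRAIN': [178.0, 204.0],
})
right = pd.DataFrame({
    'year': [2021, 2022],
    'state_name': ['IOWA', 'IOWA'],
    'JAN': [4.2, 5.5],
})

# indicator=True adds a '_merge' column naming each row's origin;
# validate='one_to_one' raises MergeError if either side repeats a key.
audit = left.merge(
    right,
    on=['year', 'state_name'],
    how='outer',
    indicator=True,
    validate='one_to_one',
)
counts = audit['_merge'].value_counts().to_dict()
print(counts)
```

A large `left_only` or `right_only` count on the real data would suggest mismatched keys (for example, differing year ranges or state spellings) rather than genuinely missing observations.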