# Tidying data for analysis

## Principles of tidy data
1. Each column represents a separate variable
2. Each row represents an individual observation
3. Each type of observational unit forms its own table

In [1]:
import pandas as pd

airquality = pd.read_csv('data/airquality.csv')

In [2]:
airquality.head()

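| | Ozone | Solar.R | Wind | Temp | Month | Day |
|---|---|---|---|---|---|---|
| 0 | 41.0 | 190.0 | 7.4 | 67 | 5 | 1 |
| 1 | 36.0 | 118.0 | 8.0 | 72 | 5 | 2 |
| 2 | 12.0 | 149.0 | 12.6 | 74 | 5 | 3 |
| 3 | 18.0 | 313.0 | 11.5 | 62 | 5 | 4 |
| 4 | NaN | NaN | 14.3 | 56 | 5 | 5 |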


## Reshaping your data using melt

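Use `pd.melt()` to melt the measurement columns of `airquality` into rows. Pass `id_vars` the column you do not wish to melt: `Day`. Every column not listed in `id_vars` is melted unless you restrict the set with `value_vars`, so with `id_vars='Day'` alone the `Month` column is melted along with `Ozone`, `Solar.R`, `Wind`, and `Temp`.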

In [6]:
# Melt airquality: airquality_melt
airquality_melt = pd.melt(airquality, id_vars='Day')

# Print the head of airquality_melt
airquality_melt.head()

| | Day | variable | value |
|---|---|---|---|
| 0 | 1 | Ozone | 41.0 |
| 1 | 2 | Ozone | 36.0 |
| 2 | 3 | Ozone | 12.0 |
| 3 | 4 | Ozone | 18.0 |
| 4 | 5 | Ozone | NaN |
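
To keep more than one identifier column, `id_vars` also accepts a list, and `value_vars` names exactly which columns to melt. The following is a minimal sketch under stated assumptions: the frame `sample` is a hypothetical three-row stand-in for `airquality`, with values copied from the head shown earlier, so it runs without the CSV file.

```python
import pandas as pd

# Hypothetical three-row stand-in for airquality
# (values copied from the first rows of the head shown earlier)
sample = pd.DataFrame({
    'Ozone': [41.0, 36.0, 12.0],
    'Solar.R': [190.0, 118.0, 149.0],
    'Wind': [7.4, 8.0, 12.6],
    'Temp': [67, 72, 74],
    'Month': [5, 5, 5],
    'Day': [1, 2, 3],
})

# Keep Month and Day as identifier columns and melt
# only the four measurement columns into rows
sample_melt = pd.melt(
    sample,
    id_vars=['Month', 'Day'],
    value_vars=['Ozone', 'Solar.R', 'Wind', 'Temp'],
)

print(sample_melt.shape)
print(sample_melt.head())
```

Each original row yields one melted row per column in `value_vars`, so the three-row sample should produce twelve rows with the columns `Month`, `Day`, `variable`, and `value`.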


## Customizing melted data

When melting DataFrames, it is better to use column names more meaningful than `variable` and `value`, the defaults used by `pd.melt()`. Set them with the `var_name` and `value_name` parameters.

In [7]:
# Melt airquality: airquality_melt
airquality_melt = pd.melt(airquality, id_vars='Day', var_name='measurement', value_name='reading')

# Print the head of airquality_melt
airquality_melt.head()

| | Day | measurement | reading |
|---|---|---|---|
| 0 | 1 | Ozone | 41.0 |
| 1 | 2 | Ozone | 36.0 |
| 2 | 3 | Ozone | 12.0 |
| 3 | 4 | Ozone | 18.0 |
| 4 | 5 | Ozone | NaN |
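
One way to sanity-check a melted frame is to confirm that its row count equals the number of original rows times the number of melted columns, and that missing readings carry over as `NaN`. The sketch below assumes a hypothetical two-row frame named `wide`, with values taken from days 4 and 5 of the head shown earlier. It is an illustration, not part of the original exercise.

```python
import pandas as pd

# Hypothetical two-row frame; None marks the missing Ozone reading on day 5
wide = pd.DataFrame({
    'Day': [4, 5],
    'Ozone': [18.0, None],
    'Wind': [11.5, 14.3],
})

# Melt with custom names for the variable and value columns
tidy = pd.melt(wide, id_vars='Day', var_name='measurement', value_name='reading')

# Each original row contributes one row per melted column
n_melted = len(wide.columns) - 1  # every column except 'Day'
assert len(tidy) == len(wide) * n_melted

# Count the readings that are still missing after melting
n_missing = int(tidy['reading'].isna().sum())

print(tidy)
print(n_missing)
```

The missing value is not dropped by `pd.melt()`; it remains a row in the result, which matters if later steps count or aggregate readings.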
