# Pivoting & Melting

In [1]:
import pandas as pd
import numpy as np

## Pivoting

The `pivot` function creates a new, derived table from an existing one. It takes three arguments: **index**, **columns**, and **values**. Each one is given a column name from the original table. The row and column indices of the new table are the unique values found in the **index** and **columns** columns respectively, and the cell values are taken from the column given as **values**.

In [12]:
from collections import OrderedDict
table = OrderedDict((
    ("Item", ['Item0', 'Item0', 'Item1', 'Item1']),
    ('CType',['Gold', 'Bronze', 'Gold', 'Silver']),
    ('USD',  ['1$', '2$', '3$', '4$']),
    ('EU',   ['1€', '2€', '3€', '4€'])
))
d = pd.DataFrame(table)
d

,Item,CType,USD,EU
0,Item0,Gold,1$,1€
1,Item0,Bronze,2$,2€
2,Item1,Gold,3$,3€
3,Item1,Silver,4$,4€


In such a table, it is not easy to see how the USD price varies across customer types (CType). We reshape (pivot) the table so that **all USD prices for an item are on the same row**, which makes them easier to compare.

In [13]:
p = d.pivot(index='Item', columns='CType', values='USD')
p

CType,Bronze,Gold,Silver
Item,,,
Item0,2$,1$,
Item1,,3$,4$


![pivoting_simple1.png](pivoting_simple1.png)

This invocation creates a new table/DataFrame whose columns are the unique values in d.CType and whose rows are indexed with the unique values of d.Item. Each cell in the newly created DataFrame will have as a value the entry of the USD column in the original table corresponding to the same Item and CType.

Without pivoting, we have to filter on both columns to look up a value. With the pivoted table, we only select a row and then a column:

In [6]:
print (d[(d.Item=='Item0') & (d.CType=='Gold')].USD.values)

['1$']


In [7]:
print (p[p.index=='Item0'].Gold.values)

['1$']
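The boolean mask in the pivoted lookup can also be replaced by label-based indexing. The following is a minimal sketch, assuming the same table as above (rebuilt here so the block runs on its own). `.loc` is a standard pandas accessor, but using it here is a suggestion and not part of the original notebook.

```python
import pandas as pd

# Same data as the table above, rebuilt so this block is self-contained
d = pd.DataFrame({
    'Item':  ['Item0', 'Item0', 'Item1', 'Item1'],
    'CType': ['Gold', 'Bronze', 'Gold', 'Silver'],
    'USD':   ['1$', '2$', '3$', '4$'],
    'EU':    ['1€', '2€', '3€', '4€'],
})
p = d.pivot(index='Item', columns='CType', values='USD')

# Row label first, column label second: returns the scalar, not an array
gold_item0 = p.loc['Item0', 'Gold']
print(gold_item0)

# Item/CType combinations absent from the original table are NaN
silver_item0_missing = pd.isna(p.loc['Item0', 'Silver'])
print(silver_item0_missing)
```

Label-based lookup reads closer to "row Item0, column Gold" and avoids building an intermediate boolean mask.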


## Pivoting By Multiple Columns

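When **values** is given a list of columns instead of a single column, pandas creates a hierarchical column index (MultiIndex) for the new table: the top level holds the value column names and the second level holds the unique CType values.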

In [10]:
p = d.pivot(index='Item', columns='CType', values=['USD','EU'])
p

,USD,USD,USD,EU,EU,EU
CType,Bronze,Gold,Silver,Bronze,Gold,Silver
Item,,,,,,
Item0,2$,1$,,2€,1€,
Item1,,3$,4$,,3€,4€


![pivoting_simple_multicolumn.png](pivoting_simple_multicolumn.png)

In [11]:
# Original DataFrame: Access the USD cost of Item0 for Gold customers
print(d[(d.Item=='Item0') & (d.CType=='Gold')].USD.values)

# Pivoted DataFrame: p.USD gives a "sub-DataFrame" with the USD values only
print(p.USD[p.USD.index=='Item0'].Gold.values)

['1$']
['1$']
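With a MultiIndex on the columns, a single cell can also be addressed by a `(value column, CType)` tuple. This is a minimal sketch under the same assumptions as the table above. The use of `.loc` and `.xs` is an illustration, not something the original notebook does.

```python
import pandas as pd

# Same data as the table above, rebuilt so this block is self-contained
d = pd.DataFrame({
    'Item':  ['Item0', 'Item0', 'Item1', 'Item1'],
    'CType': ['Gold', 'Bronze', 'Gold', 'Silver'],
    'USD':   ['1$', '2$', '3$', '4$'],
    'EU':    ['1€', '2€', '3€', '4€'],
})
p = d.pivot(index='Item', columns='CType', values=['USD', 'EU'])

# The column key is a tuple: (top level, second level)
usd_gold = p.loc['Item0', ('USD', 'Gold')]
eu_gold = p.loc['Item0', ('EU', 'Gold')]
print(usd_gold, eu_gold)

# Cross-section on the CType level: Gold prices in every currency
gold = p.xs('Gold', axis=1, level='CType')
print(gold)
```

The cross-section drops the `CType` level, so `gold` is an ordinary DataFrame with one column per currency.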


To tidy the data:
- each row holds one observation
- each column holds one feature

For example, when filtering incoming/outgoing SMS for each city by date, each city is a row (observation) and each date is a column (feature).

In [None]:
import pandas as pd

# Column placeholders; fill each list with one entry per SMS record
data = {
    'City': [],
    'Date': [],
    'SMS': [],
    'I/O': []
}
df = pd.DataFrame(data)





In [None]:
pivoted = pd.pivot_table(df, index='City', columns='Date', values='I/O', aggfunc='sum', fill_value=0)
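Unlike `pivot`, `pivot_table` accepts repeated index/column pairs and combines them with `aggfunc`. The sketch below uses made-up rows and assumes that `SMS` holds a numeric message count and that `I/O` marks the direction. Under that assumption the numbers to sum live in `SMS`, so the sketch aggregates that column; adjust `values` to whichever column holds the counts in the real data.

```python
import pandas as pd

# Hypothetical sample rows, invented purely for illustration
df = pd.DataFrame({
    'City': ['A', 'A', 'A', 'B'],
    'Date': ['2020-01-01', '2020-01-01', '2020-01-02', '2020-01-01'],
    'SMS':  [5, 3, 2, 7],          # assumed: message count per record
    'I/O':  ['In', 'Out', 'In', 'In'],
})

# City A has two records on 2020-01-01, so plain pivot would raise a
# ValueError; pivot_table sums them, and fill_value=0 fills the
# city/date pairs that have no records at all.
pivoted = pd.pivot_table(df, index='City', columns='Date',
                         values='SMS', aggfunc='sum', fill_value=0)
print(pivoted)
```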

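- **wide data** - the output of a pivot, with one column per unique value of the pivoted column.
- **long data** - the one-row-per-observation layout that a pivot starts from and that melting produces.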

http://nikgrozev.com/2015/07/01/reshaping-in-pandas-pivot-pivot-table-stack-and-unstack-explained-with-pictures/

<hr />

## Melting

Melting is the reverse of pivoting: it turns a wide table back into a long one, for example with `pd.melt`.
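Since melting undoes a pivot, a round trip on the Item/CType table from the pivoting section is a natural check. This is a minimal sketch: the `dropna` and sorting steps are choices made here to get back to the original rows, and the name `long_df` is illustrative.

```python
import pandas as pd

# Same data as in the pivoting section, rebuilt so this block is self-contained
d = pd.DataFrame({
    'Item':  ['Item0', 'Item0', 'Item1', 'Item1'],
    'CType': ['Gold', 'Bronze', 'Gold', 'Silver'],
    'USD':   ['1$', '2$', '3$', '4$'],
})
p = d.pivot(index='Item', columns='CType', values='USD')  # wide

long_df = (
    p.reset_index()                       # turn the Item index back into a column
     .melt(id_vars='Item', var_name='CType', value_name='USD')
     .dropna(subset=['USD'])              # drop Item/CType pairs that never existed
     .sort_values(['Item', 'CType'])
     .reset_index(drop=True)
)
print(long_df)
```

`melt` emits one row per (Item, CType) cell of the wide table, including the empty ones, which is why the `dropna` step is needed to recover exactly the original four rows.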