# [PIVOTS AND RESHAPING DATA](http://pandas.pydata.org/pandas-docs/stable/reshaping.html#reshaping-and-pivot-tables)

In [15]:
import pandas as pd
import numpy as np
from pandas import melt

In [2]:
m = pd.read_csv('../data/m.csv')
m

,No,Value,Team
0,1,45,HR
1,2,86,HR
2,3,247,HR
3,4,845,HR
4,1,325,Technical
5,2,75,Technical
6,3,36,Technical
7,4,57,Technical
8,1,89,Management
9,2,860,Management


In [3]:
m.pivot(index='No', columns='Team', values='Value')

Team,HR,Management,Technical
No,,,
1,45,89,325
2,86,860,75
3,247,457,36
4,845,42,57


In [4]:
pd.pivot_table(m, index='No', columns=['Team'], values='Value')

Team,HR,Management,Technical
No,,,
1,45,89,325
2,86,860,75
3,247,457,36
4,845,42,57


In [5]:
pd.pivot_table(m, columns='Team', values='Value', aggfunc='sum')

Team
HR            1223
Management    1448
Technical      493
Name: Value, dtype: int64
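
The calls above give the same table for `pivot` and `pivot_table` because every (No, Team) pair occurs once. The sketch below uses a small hypothetical frame (`dup`, not part of the original data) with a repeated pair to illustrate where the two differ: `pivot` raises an error, while `pivot_table` aggregates the duplicates, using the mean unless `aggfunc` says otherwise.

```python
import pandas as pd

# Hypothetical frame: the pair (No=1, Team='HR') appears twice.
dup = pd.DataFrame({
    "No": [1, 1, 2],
    "Team": ["HR", "HR", "HR"],
    "Value": [45, 55, 86],
})

# pivot() cannot reshape when an index/column pair is not unique.
try:
    dup.pivot(index="No", columns="Team", values="Value")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table() aggregates the duplicated entries instead (mean by default).
table = pd.pivot_table(dup, index="No", columns="Team", values="Value")

print(pivot_failed)
print(table)
```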

## [Stacking and unstacking](http://pandas.pydata.org/pandas-docs/stable/reshaping.html#reshaping-by-stacking-and-unstacking)
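In addition to the pivot functions, Series and DataFrames provide the stack() and unstack() methods, which operate on objects with a MultiIndex: unstack() moves a level of the row index into the columns, and stack() moves a level of the columns into the row index.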
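
As a minimal sketch of the idea on a Series, the example below builds a two-level MultiIndex by hand from a few of the values shown earlier. The names `idx`, `s`, `wide` and `tall` are chosen here for illustration only.

```python
import pandas as pd

# A small Series with a two-level MultiIndex (Team, No).
idx = pd.MultiIndex.from_product(
    [["HR", "Technical"], [1, 2]], names=["Team", "No"]
)
s = pd.Series([45, 86, 325, 75], index=idx, name="Value")

# unstack() moves the 'Team' level from the row index to the columns.
wide = s.unstack(level="Team")

# stack() moves the column level back into the row index, as the innermost level.
tall = wide.stack()

print(wide)
print(list(tall.index.names))
```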

### [The stack() function](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.stack.html)

In [6]:
mIndexed = m.set_index(['Team', 'No'])
mIndexed

,,Value
Team,No,
HR,1,45
HR,2,86
HR,3,247
HR,4,845
Technical,1,325
Technical,2,75
Technical,3,36
Technical,4,57
Management,1,89
Management,2,860


In [7]:
mUnstacked = mIndexed.unstack(level='Team')
mUnstacked

,Value,Value,Value
Team,HR,Management,Technical
No,,,
1,45,89,325
2,86,860,75
3,247,457,36
4,845,42,57


In [10]:
mStacked = mUnstacked.stack(level='Team')
mStacked

,,Value
No,Team,
1,HR,45
1,Management,89
1,Technical,325
2,HR,86
2,Management,860
2,Technical,75
3,HR,247
3,Management,457
3,Technical,36
4,HR,845


By default, unstack() unstacks the last (innermost) level of the row index:

In [13]:
mUnstacked1 = mIndexed.unstack()
mUnstacked1

,Value,Value,Value,Value
No,1,2,3,4
Team,,,,
HR,45,86,247,845
Management,89,860,457,42
Technical,325,75,36,57
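
To check the default, the sketch below rebuilds the data inline from the values displayed in the outputs above (the CSV file itself is not reproduced here, so the inline frame is an assumption about its contents) and compares the no-argument call with explicit `level` arguments.

```python
import pandas as pd

# Data reconstructed from the outputs shown above.
m = pd.DataFrame({
    "No": [1, 2, 3, 4] * 3,
    "Team": ["HR"] * 4 + ["Technical"] * 4 + ["Management"] * 4,
    "Value": [45, 86, 247, 845, 325, 75, 36, 57, 89, 860, 457, 42],
})
m_indexed = m.set_index(["Team", "No"])

default = m_indexed.unstack()            # no level given
explicit = m_indexed.unstack(level=-1)   # last level by position
by_name = m_indexed.unstack(level="No")  # last level by name

print(default.equals(explicit), default.equals(by_name))
```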


By default, stack() places the stacked level as the innermost (lowest) level of the resulting row MultiIndex:

In [14]:
mUnstacked1.stack()

,,Value
Team,No,
HR,1,45
HR,2,86
HR,3,247
HR,4,845
Management,1,89
Management,2,860
Management,3,457
Management,4,42
Technical,1,325
Technical,2,75
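
A short sketch of the same round trip, again with the data rebuilt inline from the displayed values (an assumption about the CSV contents). It inspects the level names of the result and looks up one value, so the position of the stacked level can be verified directly.

```python
import pandas as pd

# Data reconstructed from the outputs shown above.
m = pd.DataFrame({
    "No": [1, 2, 3, 4] * 3,
    "Team": ["HR"] * 4 + ["Technical"] * 4 + ["Management"] * 4,
    "Value": [45, 86, 247, 845, 325, 75, 36, 57, 89, 860, 457, 42],
})
m_indexed = m.set_index(["Team", "No"])

wide = m_indexed.unstack()   # 'No' becomes a column level under 'Value'
tall = wide.stack()          # 'No' returns as the innermost row level

print(list(tall.index.names))
print(tall.loc[("Management", 3), "Value"])
```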


## Other methods to reshape DataFrames
### [Using the melt function](http://pandas.pydata.org/pandas-docs/stable/reshaping.html#reshaping-by-melt)
The melt function transforms a DataFrame from wide to long format. Columns designated as ID columns (id_vars) always stay as columns. The remaining non-ID columns are treated as variables and are unpivoted into a two-column name-value scheme: one column holds the variable name and the other holds its value. In the example below, Year is the ID column and identifies each row of the original DataFrame.

In [16]:
countries = pd.read_csv('../data/countries.csv')
countries

,Year,USA,Canada,England,France,Italy
0,2001,64,46,98,85,64
1,2003,43,69,44,94,63


In [20]:
melted = melt(countries, id_vars=['Year'], var_name='Country', value_name='GDP')
melted

,Year,Country,GDP
0,2001,USA,64
1,2003,USA,43
2,2001,Canada,46
3,2003,Canada,69
4,2001,England,98
5,2003,England,44
6,2001,France,85
7,2003,France,94
8,2001,Italy,64
9,2003,Italy,63
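
The sketch below repeats the melt with the data typed inline (copied from the output above, since the CSV file is not reproduced here) and then uses `pivot` to go back to the wide layout, which shows that the two operations are inverses for this data. The name `wide` is introduced only for this illustration.

```python
import pandas as pd

# Values copied from the countries table shown above.
countries = pd.DataFrame({
    "Year": [2001, 2003],
    "USA": [64, 43],
    "Canada": [46, 69],
    "England": [98, 44],
    "France": [85, 94],
    "Italy": [64, 63],
})

# Wide to long: Year stays a column, the country columns become (Country, GDP) pairs.
melted = pd.melt(countries, id_vars=["Year"], var_name="Country", value_name="GDP")

# Long to wide again: one row per Year, one column per Country.
wide = melted.pivot(index="Year", columns="Country", values="GDP")

print(melted.shape)
print(wide.loc[2003, "France"])
```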


### [The pandas.get_dummies() function](http://pandas.pydata.org/pandas-docs/stable/reshaping.html#computing-indicator-dummy-variables)
This function converts a categorical variable into an indicator DataFrame with one column per possible value of the variable. Each row is marked in the column that matches its value and unmarked in all the others.

In [22]:
pd.get_dummies(melted['Country'])

,Canada,England,France,Italy,USA
0,0,0,0,0,1
1,0,0,0,0,1
2,1,0,0,0,0
3,1,0,0,0,0
4,0,1,0,0,0
5,0,1,0,0,0
6,0,0,1,0,0
7,0,0,1,0,0
8,0,0,0,1,0
9,0,0,0,1,0
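
A minimal sketch on a short hypothetical Series (`country`, not taken from the data files). Recent pandas versions return boolean indicator columns by default, so `dtype=int` is passed here to get 0/1 integers like the output above.

```python
import pandas as pd

# Hypothetical categorical column.
country = pd.Series(["USA", "USA", "Canada", "Italy"], name="Country")

# One indicator column per distinct value; dtype=int requests 0/1 integers.
dummies = pd.get_dummies(country, dtype=int)

print(dummies)

# Each row has exactly one indicator set.
print(dummies.sum(axis=1).tolist())
```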
