### Reshaping data
Reshaping transforms data from its initial layout into the shape best suited to cleaning, manipulation, and analysis.
* Usually performed on a Series or DataFrame with a multi-level index (MultiIndex)



In [1]:
import pandas as pd
import numpy as np

#### Stacking
Stacking takes the innermost level of the column index and moves it into the row index, where it becomes the new innermost level.
If the columns have only a single level, the whole column index moves into the rows and the result is a Series with a MultiIndex.
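
The multi-level case can be sketched as follows. The frame `wide` and the level names `Outer` and `Inner` are made up for illustration; the point is only that `stack()` moves the innermost column level (`Inner`) into the rows and leaves the outer level as columns.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two column levels: Outer (A, B) and Inner (x, y).
cols = pd.MultiIndex.from_product([['A', 'B'], ['x', 'y']],
                                  names=['Outer', 'Inner'])
wide = pd.DataFrame(np.arange(8).reshape(2, 4),
                    index=pd.Index(['Row1', 'Row2'], name='Row'),
                    columns=cols)

# stack() moves the innermost column level ('Inner') into the row index.
stacked = wide.stack()

print(stacked.index.names)   # row levels are now Row and Inner
print(stacked.columns)       # only the Outer level remains as columns
```

Because one column level remains, the result here is still a DataFrame rather than a Series.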

In [2]:
df=pd.DataFrame(np.arange(12).reshape(3,4), index=['Row1','Row2','Row3'], columns=['Col1','Col2','Col3','Col4'])
df.index.name = 'Row'
df.columns.name = 'Column'

In [3]:
print("The original dataframe\n",df)

The original dataframe
 Column  Col1  Col2  Col3  Col4
Row                           
Row1       0     1     2     3
Row2       4     5     6     7
Row3       8     9    10    11


In [5]:
print('The stacked dataframe\n')
stacked_df=df.stack()
print(stacked_df)


The stacked dataframe

Row   Column
Row1  Col1       0
      Col2       1
      Col3       2
      Col4       3
Row2  Col1       4
      Col2       5
      Col3       6
      Col4       7
Row3  Col1       8
      Col2       9
      Col3      10
      Col4      11
dtype: int32


#### The stack method has converted the single-indexed DataFrame into a multi-indexed Series.
The column labels have moved to the innermost level of the row index, each paired with its value.

#### Unstacking
* Unstacking a Series or DataFrame takes the innermost level of the row index and moves it into the column index, where it becomes the new innermost level.
* If a DataFrame's row index has only a single level, the whole row index moves and the result is a Series whose MultiIndex combines the column and row labels.
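
A minimal sketch of how the level to unstack can be chosen, and of the single-level case. It rebuilds the same 3x4 frame used in this section; the variable names `by_position`, `by_name`, and `flat` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(3, 4),
                  index=pd.Index(['Row1', 'Row2', 'Row3'], name='Row'),
                  columns=pd.Index(['Col1', 'Col2', 'Col3', 'Col4'], name='Column'))
stacked = df.stack()  # Series indexed by (Row, Column)

# The level to move can be given by position or by name.
by_position = stacked.unstack(level=0)   # level 0 is 'Row'
by_name = stacked.unstack('Row')
print(by_position.equals(by_name))

# Unstacking a DataFrame with a single-level row index yields a Series
# indexed by (Column, Row).
flat = df.unstack()
print(type(flat).__name__, list(flat.index.names))
```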


In [6]:
print("The stacked dataframe\n",stacked_df)

The stacked dataframe
 Row   Column
Row1  Col1       0
      Col2       1
      Col3       2
      Col4       3
Row2  Col1       4
      Col2       5
      Col3       6
      Col4       7
Row3  Col1       8
      Col2       9
      Col3      10
      Col4      11
dtype: int32


In [7]:
print("The unstacked dataframe\n")
print(stacked_df.unstack())

The unstacked dataframe

Column  Col1  Col2  Col3  Col4
Row                           
Row1       0     1     2     3
Row2       4     5     6     7
Row3       8     9    10    11


In [8]:
print("The unstacked dataframe on named index\n")
print(stacked_df.unstack('Row'))

The unstacked dataframe on named index

Row     Row1  Row2  Row3
Column                  
Col1       0     4     8
Col2       1     5     9
Col3       2     6    10
Col4       3     7    11


#### Unstacking a multi-indexed Series produces a DataFrame

### Pivot tables
* A pivot creates a new table from the values of the original table, arranged according to parameters chosen by the user.
* The parameters specify which column supplies the row labels, which supplies the column labels, and which supplies the cell values.
* `df.pivot(index=row_labels, columns=column_labels, values=cell_values)`
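
One caveat is worth a sketch: `pivot` only reshapes, so it fails when the same row/column pair occurs more than once. The `sales` frame below is invented for illustration. Assuming an average is the desired summary, `pivot_table` with `aggfunc='mean'` is one way to handle the duplicates.

```python
import pandas as pd

# Hypothetical data: Google/Editor appears twice, so the pair is not unique.
sales = pd.DataFrame({
    'Company': ['Google', 'Google', 'Microsoft'],
    'product': ['Editor', 'Editor', 'Editor'],
    'price': [200, 220, 250],
})

# pivot() cannot place two values in one cell and raises ValueError.
try:
    sales.pivot(index='Company', columns='product', values='price')
    pivot_failed = False
except ValueError:
    pivot_failed = True
print("pivot failed on duplicates:", pivot_failed)

# pivot_table() aggregates duplicate entries instead (here: their mean).
summary = sales.pivot_table(index='Company', columns='product',
                            values='price', aggfunc='mean')
print(summary)
```

The aggregation needs numeric values, which is why the prices in this sketch are plain numbers rather than strings such as `'$200'`.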

In [10]:
df1=pd.DataFrame({'Company':['Google','Microsoft','Google','Microsoft'],
                 'product':['Editor','Editor','Calendar','Azure'],
                 'price':['$200','$250','$50','$400']})
df1.index.name='Row'
df1.columns.name='Column'
print("Original dataframe\n")
print(df1)

Original dataframe

Column    Company   product price
Row                              
0          Google    Editor  $200
1       Microsoft    Editor  $250
2          Google  Calendar   $50
3       Microsoft     Azure  $400


In [12]:
print("Pivoted data frame\n")
print(df1.pivot(index='Company', columns='product', values='price'))  # keyword arguments: recent pandas versions no longer accept these positionally

Pivoted data frame

product   Azure Calendar Editor
Company                        
Google      NaN      $50   $200
Microsoft  $400      NaN   $250
