# Reshaping and Pivot Tables

## 1 Main data manipulations

In this notebook we cover how to reshape data from LONG to WIDE format and from WIDE to LONG format.

## 2 Two main data layouts  

When working with data, a DataFrame is usually structured in one of two ways:

- a) **Stacked** or **LONG** format: there are **multiple** rows for each subject where applicable.

- b) **Record** or **WIDE** format: there is typically **one row** for each subject.


### 2.1 Long format

In the **stacked** or **long** format, there are multiple rows for each subject where applicable.

In [4]:
import pandas as pd

In [7]:
data = {
    "value": range(12),
    "variable": ["A"] * 3 + ["B"] * 3 + ["C"] * 3 + ["D"] * 3,
    "date": pd.to_datetime(["2020-01-03", "2020-01-04", "2020-01-05"] * 4),
}


We have built a dictionary, which we then turn into a DataFrame.

In [10]:
type(data)

dict

In [8]:
df = pd.DataFrame(data)


In [9]:
df.head()

|   | value | variable | date |
|---|-------|----------|------------|
| 0 | 0 | A | 2020-01-03 |
| 1 | 1 | A | 2020-01-04 |
| 2 | 2 | A | 2020-01-05 |
| 3 | 3 | B | 2020-01-03 |
| 4 | 4 | B | 2020-01-04 |


### 2.2 Wide format

In the **record** or **wide** format, there is one row for each subject.
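
As a minimal sketch, the twelve values from the long-format example can be laid out in wide form by hand, with one row per date and one column per variable. The name `wide` is introduced here only for illustration.

```python
import pandas as pd

# Illustrative wide layout of the same values: one row per date,
# one column per variable (A to D).
wide = pd.DataFrame(
    {
        "A": [0, 1, 2],
        "B": [3, 4, 5],
        "C": [6, 7, 8],
        "D": [9, 10, 11],
    },
    index=pd.to_datetime(["2020-01-03", "2020-01-04", "2020-01-05"]),
)
wide.index.name = "date"

print(wide.shape)  # (3, 4): three dates by four variables
```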

## 3 Data reshape

To be specific about what these two data reshape operations mean: 

- LONG to WIDE: we transform the initial **DataFrame** from **LONG** to **WIDE** format, increasing the number of **columns** and reducing the number of **rows**.

- WIDE to LONG: we transform the initial **DataFrame** from **WIDE** to **LONG** format, increasing the number of **rows** and reducing the number of **columns**.
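
The effect on the shape can be checked directly. The sketch below rebuilds the small long-format table from this notebook, pivots it to wide, and then reshapes it back. The names `long_df`, `wide_df` and `back_to_long` are illustrative, and `melt()` is used here as one possible way to go from wide to long.

```python
import pandas as pd

long_df = pd.DataFrame({
    "value": range(12),
    "variable": ["A"] * 3 + ["B"] * 3 + ["C"] * 3 + ["D"] * 3,
    "date": pd.to_datetime(["2020-01-03", "2020-01-04", "2020-01-05"] * 4),
})

# LONG to WIDE: more columns, fewer rows.
wide_df = long_df.pivot(index="date", columns="variable", values="value")

# WIDE to LONG: more rows, fewer columns.
back_to_long = wide_df.reset_index().melt(
    id_vars="date", var_name="variable", value_name="value"
)

print(long_df.shape)       # (12, 3)
print(wide_df.shape)       # (3, 4)
print(back_to_long.shape)  # (12, 3)
```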

### 3.1 LONG to WIDE

We use the **pivot()** method to reshape data from LONG to WIDE.

In [11]:
df.head()

|   | value | variable | date |
|---|-------|----------|------------|
| 0 | 0 | A | 2020-01-03 |
| 1 | 1 | A | 2020-01-04 |
| 2 | 2 | A | 2020-01-05 |
| 3 | 3 | B | 2020-01-03 |
| 4 | 4 | B | 2020-01-04 |


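We use the **pivot()** method to turn the **value** and **variable** columns from **LONG** to **WIDE** format. Each distinct entry in **variable** becomes a column filled with the matching **value**, and **date** becomes the index: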

In [13]:
pivoted = df.pivot(index="date", columns="variable", values="value")
pivoted.head()

| variable | A | B | C | D |
|------------|---|---|---|----|
| **date** |   |   |   |    |
| 2020-01-03 | 0 | 3 | 6 | 9 |
| 2020-01-04 | 1 | 4 | 7 | 10 |
| 2020-01-05 | 2 | 5 | 8 | 11 |


In [15]:
pivoted.columns

Index(['A', 'B', 'C', 'D'], dtype='object', name='variable')
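
The `name='variable'` shown above is the label of the column index, which is why `variable` appears above the column headers in the pivoted output. If plain columns are preferred, one possible approach is to move `date` back into a regular column and clear that label. This is only a sketch, and `flat` is an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({
    "value": range(12),
    "variable": ["A"] * 3 + ["B"] * 3 + ["C"] * 3 + ["D"] * 3,
    "date": pd.to_datetime(["2020-01-03", "2020-01-04", "2020-01-05"] * 4),
})
pivoted = df.pivot(index="date", columns="variable", values="value")

# Move the "date" index into a regular column.
flat = pivoted.reset_index()
# Clear the column-axis label left over from pivoting.
flat.columns.name = None

print(flat.columns.tolist())  # ['date', 'A', 'B', 'C', 'D']
```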

- Another example of pivoting data from LONG to WIDE

In [16]:
df2 = pd.DataFrame({
    'year': [2010, 2010, 2011, 2011],
    'city': ['New York', 'Los Angeles', 'New York', 'Los Angeles'],
    'population': [8175133, 3792621, 8491079, 3971883]
})

In [17]:
df2.head()

|   | year | city | population |
|---|------|-------------|---------|
| 0 | 2010 | New York | 8175133 |
| 1 | 2010 | Los Angeles | 3792621 |
| 2 | 2011 | New York | 8491079 |
| 3 | 2011 | Los Angeles | 3971883 |


Again we pivot the DataFrame from LONG to WIDE using the **pivot()** method, this time with **year** as the index, one column per **city**, and **population** as the values:

In [24]:
df2_wide = df2.pivot(index="year", columns="city", values="population")



In [25]:
print(df2_wide)

city  Los Angeles  New York
year                       
2010      3792621   8175133
2011      3971883   8491079
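
`pivot()` expects every index/column pair to appear only once and raises a `ValueError` otherwise. When repeated pairs need to be combined, `pivot_table()` with an aggregation function is one option. The sketch below uses made-up figures: `sales`, `units` and the numbers are hypothetical and not taken from this notebook.

```python
import pandas as pd

# Hypothetical toy data: the pair (2010, "New York") appears twice.
sales = pd.DataFrame({
    "year": [2010, 2010, 2010, 2011],
    "city": ["New York", "New York", "Los Angeles", "New York"],
    "units": [10, 20, 5, 7],
})

# pivot() cannot decide which of the two duplicate values to keep.
try:
    sales.pivot(index="year", columns="city", values="units")
except ValueError as err:
    print("pivot failed:", err)

# pivot_table() aggregates duplicates, here by summing them.
# Combinations with no data (2011, "Los Angeles") come out as NaN.
summary = sales.pivot_table(
    index="year", columns="city", values="units", aggfunc="sum"
)
print(summary)
```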


### 3.2 WIDE to LONG

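In this second reshape operation, we transform the initial **DataFrame** from **WIDE** to **LONG** format, increasing the number of **rows** and reducing the number of **columns**.

In **record** or **WIDE** format, there is typically **one row** for each subject. Here each row is identified by **famid** and **birth** and carries two measurements, **ht_one** and **ht_two**.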

In [28]:
df3 = pd.DataFrame({
    'famid': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'birth': [1, 2, 3, 1, 2, 3, 1, 2, 3],
    'ht_one': [2.8, 2.9, 2.2, 2, 1.8, 1.9, 2.2, 2.3, 2.1],
    'ht_two': [3.4, 3.8, 2.9, 3.2, 2.8, 2.4, 3.3, 3.4, 2.9]
})

In [29]:
df3

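|   | famid | birth | ht_one | ht_two |
|---|-------|-------|--------|--------|
| 0 | 1 | 1 | 2.8 | 3.4 |
| 1 | 1 | 2 | 2.9 | 3.8 |
| 2 | 1 | 3 | 2.2 | 2.9 |
| 3 | 2 | 1 | 2.0 | 3.2 |
| 4 | 2 | 2 | 1.8 | 2.8 |
| 5 | 2 | 3 | 1.9 | 2.4 |
| 6 | 3 | 1 | 2.2 | 3.3 |
| 7 | 3 | 2 | 2.3 | 3.4 |
| 8 | 3 | 3 | 2.1 | 2.9 |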
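
One way to carry out this reshape is `pd.wide_to_long()`, which is referenced in the Annex. The following is a minimal sketch, assuming the two height columns share the stub `ht` and that the suffixes `one` and `two` identify the measurement. The names `age` and `df3_long` are illustrative choices.

```python
import pandas as pd

df3 = pd.DataFrame({
    'famid': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'birth': [1, 2, 3, 1, 2, 3, 1, 2, 3],
    'ht_one': [2.8, 2.9, 2.2, 2, 1.8, 1.9, 2.2, 2.3, 2.1],
    'ht_two': [3.4, 3.8, 2.9, 3.2, 2.8, 2.4, 3.3, 3.4, 2.9]
})

# Stack ht_one and ht_two into a single "ht" column.
# i: columns that identify each subject; j: name for the new suffix level.
# suffix is a regex; r"\w+" is needed because the suffixes are words, not digits.
df3_long = pd.wide_to_long(
    df3,
    stubnames="ht",
    i=["famid", "birth"],
    j="age",
    sep="_",
    suffix=r"\w+",
)

print(df3_long.shape)  # (18, 1): twice the rows, a single value column
print(df3_long.head())
```

The result is indexed by `famid`, `birth` and `age`; calling `df3_long.reset_index()` would turn those levels back into ordinary columns.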


## Annex

### Online resources

<https://pandas.pydata.org/docs/user_guide/reshaping.html>

Pandas provides methods for manipulating a Series or DataFrame to alter the representation of the data for further processing or summarization.


*Reshape wide to long using the **pd.wide_to_long()** function*

<https://pandas.pydata.org/docs/reference/api/pandas.wide_to_long.html>