# **Data Transformation**

## **10. Reshaping/transposing**

In [1]:
import numpy as np 
import pandas as pd 

### 1. **What It Does and When to Use It**

#### ✅ **What it does:**

**Reshaping** and **transposing** are pandas operations used to **reorganize the structure** of a DataFrame — changing rows to columns, columns to rows, or altering the dimensional layout for analysis, modeling, or visualization.

#### ✅ **When to use:**

* When you want to **rotate a table** to get a better perspective.
* When converting data **between wide and long** form, or into or out of a **multi-indexed** layout.
* When preparing data for **pivot tables, aggregation**, or **machine learning pipelines**.
* When working with **multi-dimensional or hierarchical data**.


### 2. **Syntax and Core Parameters**

| Method                                | Purpose                                                               |
| ------------------------------------- | --------------------------------------------------------------------- |
| `df.T`                                | Transpose the entire DataFrame.                                       |
| `df.stack()`                          | "Compresses" a level in the columns into the row index (longer form). |
| `df.unstack()`                        | "Expands" a level in the row index into columns (wider form).         |
| `df.transpose()`                      | Method form of `df.T`; produces the same result.                      |
| `df.set_index()` / `df.reset_index()` | Change the structure of index for reshaping.                          |

#### 🔹 Basic Syntax

```python
# Transpose
df.T  # or df.transpose()

# Stack (dropna belongs to the legacy implementation; omit it when passing future_stack=True)
df.stack(level=-1, dropna=True)

# Unstack
df.unstack(level=-1, fill_value=None)
```
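
To make the three calls above concrete, here is a minimal round-trip sketch. The `scores` table and its values are made up for illustration; only `.T`, `stack()`, and `unstack()` come from the syntax shown above.

```python
import pandas as pd

# Hypothetical scores table, used only for illustration.
scores = pd.DataFrame(
    {"Math": [90, 80, 70], "Science": [85, 88, 75]},
    index=pd.Index(["A", "B", "C"], name="Name"),
)

transposed = scores.T              # subjects become rows, names become columns
long_form = scores.stack()         # Series indexed by (Name, subject)
wide_again = long_form.unstack()   # innermost index level moves back to columns

print(transposed.shape)            # rows and columns are swapped
print(long_form.index.nlevels)     # the stacked result carries a two-level index
print(wide_again)
```

Stacking and then unstacking the same level should bring back the original layout, which is a quick sanity check after any reshape.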


### 3. **Different Methods and Techniques**

| Technique                  | Description                                               |
| -------------------------- | --------------------------------------------------------- |
| `df.T` or `df.transpose()` | Flip rows and columns.                                    |
| `df.stack()`               | Convert columns to row index (long format).               |
| `df.unstack()`             | Convert row index to columns (wide format).               |
| `reshape()` (via NumPy)    | Reshape underlying array if needed for very advanced use. |
| `melt()` / `pivot()`       | Used along with reshaping (explained earlier).            |
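
As a rough sketch of the last two rows of the table, the snippet below contrasts `melt()`/`pivot()` with a NumPy `reshape()` on the underlying values. The `wide` table and the `Subject`/`Score` column names are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Hypothetical wide table, used only for illustration.
wide = pd.DataFrame({
    "Name": ["A", "B"],
    "Math": [90, 80],
    "Science": [85, 88],
})

# melt(): wide -> long, keeping "Name" as an identifier column.
long_df = wide.melt(id_vars="Name", var_name="Subject", value_name="Score")

# pivot(): long -> wide again, one column per subject.
back = long_df.pivot(index="Name", columns="Subject", values="Score")

# NumPy reshape works on the raw values only; row and column labels are lost.
flat = wide[["Math", "Science"]].to_numpy().reshape(-1)

print(long_df)
print(back)
print(flat)
```

Because `reshape()` drops the labels, it is usually worth reaching for only when the next step needs a plain array.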


### 4. **Common Pitfalls and Best Practices**

| Pitfall                                                                              | Best Practice                                                                            |
| ------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------- |
| Transposing a mixed-dtype DataFrame copies the data and usually upcasts to `object`. | Transpose only when the analysis requires it, ideally on single-dtype data.              |
| Assuming `stack()`/`unstack()` need an existing MultiIndex; they do not.             | Use `set_index()` before and `reset_index()` after to control which labels are reshaped. |
| Unstacking can introduce `NaN` for missing index/column pairs.                       | Pass `fill_value` to `unstack()` or handle the NaNs afterward.                           |
| Misalignment of axis names or levels during reshape.                                 | Always check `.index`, `.columns`, and `.shape` after reshape.                           |
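
Two of these pitfalls are easy to reproduce. The sketch below uses made-up data with one missing (Name, Subject) pair to show where the `NaN` comes from, and a small mixed-dtype frame to show what transposing does to dtypes.

```python
import pandas as pd

# Hypothetical long-format scores with one missing (Name, Subject) pair.
idx = pd.MultiIndex.from_tuples(
    [("A", "Math"), ("A", "Science"), ("B", "Math")],
    names=["Name", "Subject"],
)
scores = pd.Series([90, 85, 80], index=idx)

with_nan = scores.unstack()            # the missing B/Science cell becomes NaN
filled = scores.unstack(fill_value=0)  # the missing cell is filled with 0 instead

# Transposing a frame that mixes strings and integers yields object columns.
mixed = pd.DataFrame({"Name": ["A", "B"], "Score": [90, 80]})

print(with_nan)
print(filled)
print(mixed.T.dtypes)
```

Checking `.dtypes` and `.isna().sum()` right after a reshape catches both issues early.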


### 5. **Examples on Real/Pseudo Data**

#### 📌 Example 1: Transposing a DataFrame

In [2]:
df = pd.DataFrame({
    'Math': [90, 85],
    'Science': [88, 92]
}, index=['Student1', 'Student2'])

df

|          | Math | Science |
| -------- | ---- | ------- |
| Student1 | 90   | 88      |
| Student2 | 85   | 92      |


In [3]:
df.T

|         | Student1 | Student2 |
| ------- | -------- | -------- |
| Math    | 90       | 85       |
| Science | 88       | 92       |


#### 📌 Example 2: Using `stack()` – Columns → Index

In [4]:
df = pd.DataFrame({
    'Name': ['A', 'B'],
    'Math': [90, 80],
    'Science': [85, 88]
})

df

|     | Name | Math | Science |
| --- | ---- | ---- | ------- |
| 0   | A    | 90   | 85      |
| 1   | B    | 80   | 88      |


In [6]:
df.set_index('Name')

| Name | Math | Science |
| ---- | ---- | ------- |
| A    | 90   | 85      |
| B    | 80   | 88      |


In [8]:
stacked = df.set_index('Name').stack()
stacked

Name         
A     Math       90
      Science    85
B     Math       80
      Science    88
dtype: int64

#### 📌 Example 3: Using `unstack()` – Index → Columns

In [9]:
stacked.unstack()

| Name | Math | Science |
| ---- | ---- | ------- |
| A    | 90   | 85      |
| B    | 80   | 88      |
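
`unstack()` moves the innermost index level by default, but the `level` argument can pick a different one. A small sketch that rebuilds the Example 2 data and compares the two choices (the names `by_subject` and `by_name` are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({"Name": ["A", "B"], "Math": [90, 80], "Science": [85, 88]})
stacked = df.set_index("Name").stack()

by_subject = stacked.unstack(level=-1)      # default: subjects become columns
by_name = stacked.unstack(level="Name")     # names become columns instead

print(by_subject)
print(by_name)
```

Unstacking the outer level gives the same numbers as transposing the default result, so the choice is mostly about which labels should end up as columns.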


#### 📌 Example 4: MultiIndex reshape with `stack` and `unstack`

In [10]:
df = pd.DataFrame({
    ('Grade', 'Math'): [90, 85],
    ('Grade', 'Science'): [88, 92]
}, index=['Student1', 'Student2'])

df

|          | (Grade, Math) | (Grade, Science) |
| -------- | ------------- | ---------------- |
| Student1 | 90            | 88               |
| Student2 | 85            | 92               |


In [11]:
df.stack(level=0)


|          |       | Math | Science |
| -------- | ----- | ---- | ------- |
| Student1 | Grade | 90   | 88      |
| Student2 | Grade | 85   | 92      |


In [15]:
df.stack(future_stack=True)

|          |         | Grade |
| -------- | ------- | ----- |
| Student1 | Math    | 90    |
| Student1 | Science | 88    |
| Student2 | Math    | 85    |
| Student2 | Science | 92    |


In [14]:
df.stack(level=1, future_stack=True)

|          |         | Grade |
| -------- | ------- | ----- |
| Student1 | Math    | 90    |
| Student1 | Science | 88    |
| Student2 | Math    | 85    |
| Student2 | Science | 92    |


### 6. **Real World Use Cases**

| Use Case                       | Description                                                                                                   |
| ------------------------------ | ------------------------------------------------------------------------------------------------------------- |
| **Pivoting reports**           | Converting row-wise records into column-oriented reports or vice versa (e.g., monthly sales per product).     |
| **Multi-variable time series** | Reshape wide-format time series data into long format for plotting or forecasting.                            |
| **Machine Learning**           | Most ML libraries expect one row per observation and one column per feature; reshape accordingly.             |
| **Survey Data**                | Survey responses often come in wide format; reshape to long to analyze question-wise.                         |
| **Sensor Data**                | Sensor readings stored in rows (per device, time, metric) often need to be unstacked or pivoted for plotting. |
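
As an illustration of the sensor-data case, the sketch below takes a long log with one row per (device, time, metric) and spreads the metrics into columns. The `readings` frame, its column names, and its values are all hypothetical.

```python
import pandas as pd

# Hypothetical sensor log in long format: one row per (device, time, metric).
readings = pd.DataFrame({
    "device": ["d1", "d1", "d2", "d2"],
    "time": pd.to_datetime(["2024-01-01"] * 4),
    "metric": ["temp", "humidity", "temp", "humidity"],
    "value": [21.5, 40.0, 19.0, 55.0],
})

wide = (
    readings
    .set_index(["device", "time", "metric"])["value"]
    .unstack("metric")   # one column per metric
    .reset_index()       # device and time become ordinary columns again
)

print(wide)
```

This relies on each (device, time, metric) combination appearing only once; duplicate combinations would need to be aggregated before unstacking.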


### ✅ Summary Table

| Method        | Type            | Description                                  |
| ------------- | --------------- | -------------------------------------------- |
| `df.T`        | Transpose       | Switch rows ↔ columns                        |
| `stack()`     | Reshape to long | Columns into rows (result has a MultiIndex)  |
| `unstack()`   | Reshape to wide | Rows into columns                            |
| `transpose()` | Transpose       | Method form of `.T`; same result             |

