In [3]:
import pandas as pd

# Reindexing & Aligning Data

In [4]:
# Sample dataframe
df = pd.DataFrame({
    "Name": ["Onkar", "Amit", "Sara"],
    "Salary": [50000, 65000, 55000]
}, index=[101, 102, 104])

df

Unnamed: 0,Name,Salary
101,Onkar,50000
102,Amit,65000
104,Sara,55000


## 1. What is Reindexing?

Reindexing conforms a pandas object to a new set of labels. It covers:

- Changing the order of rows/columns
- Adding new index labels
- Removing existing ones  

Pandas matches data by index label, not by position.
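
As a small illustration of the three cases, the sketch below rebuilds the sample frame and passes a label list that reorders rows, omits 102, and adds 105. The label 105 and the variable name `result` are assumptions made for this example only.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Onkar", "Amit", "Sara"], "Salary": [50000, 65000, 55000]},
    index=[101, 102, 104],
)

# 104 and 101 are reordered, 102 is left out, 105 is a new (hypothetical) label.
result = df.reindex([104, 101, 105])

print(result.index.tolist())          # [104, 101, 105]
print(result.loc[105].isna().all())   # True: the new row holds no data
print(df.index.tolist())              # [101, 102, 104]: the original is unchanged
```

Note that `.reindex()` returns a new object, so `df` itself keeps its original rows.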

## 2. Reindex Rows — .reindex()

Use `.reindex()` to conform the rows to a given list of index labels.  
If the list contains a label that is not in the original index, the new row is filled with `NaN`, or with the value passed as `fill_value`.

In [5]:
df.reindex([101, 102, 103, 104])

Unnamed: 0,Name,Salary
101,Onkar,50000.0
102,Amit,65000.0
103,,
104,Sara,55000.0


In [8]:
df.reindex([101, 102, 103, 104], fill_value=0)

Unnamed: 0,Name,Salary
101,Onkar,50000
102,Amit,65000
103,0,0
104,Sara,55000


In [9]:
df.reindex([104, 103, 102, 101], fill_value='@')

Unnamed: 0,Name,Salary
104,Sara,55000
103,@,@
102,Amit,65000
101,Onkar,50000
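
The outputs above show `Salary` as `50000.0` without `fill_value` and as `50000` with it. The sketch below checks the reason on the `Salary` column alone: the choice of fill value decides the resulting dtype. The variable names are assumptions for illustration.

```python
import pandas as pd

salary = pd.Series([50000, 65000, 55000], index=[101, 102, 104], name="Salary")
labels = [101, 102, 103, 104]

with_nan = salary.reindex(labels)                   # 103 -> NaN
with_zero = salary.reindex(labels, fill_value=0)    # 103 -> 0
with_text = salary.reindex(labels, fill_value="@")  # 103 -> "@"

print(with_nan.dtype)    # float64, because NaN is a floating-point value
print(with_zero.dtype)   # stays an integer dtype, no NaN was introduced
print(with_text.dtype)   # object, because numbers and text are mixed
```

A numeric fill value such as `0` keeps the column usable for arithmetic, while a text placeholder turns it into a generic object column.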


## 3. Reindex Columns

To reindex columns, use `df.reindex(columns=[...], fill_value="val")`.  
This can reorder columns and add new ones. A new column is filled with `NaN` or with `fill_value`, and the value is the same for every row.

In [11]:
df.reindex(columns=["Salary", "Name", "Bonus"])

Unnamed: 0,Salary,Name,Bonus
101,50000,Onkar,
102,65000,Amit,
104,55000,Sara,


In [12]:
df.reindex(columns=["Salary", "Name", "Bonus"], fill_value = 100)

Unnamed: 0,Salary,Name,Bonus
101,50000,Onkar,100
102,65000,Amit,100
104,55000,Sara,100
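
Rows and columns can also be reindexed in a single call by passing both the `index` and `columns` keywords. A minimal sketch on the same sample frame (the name `both` is an assumption for illustration):

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Onkar", "Amit", "Sara"], "Salary": [50000, 65000, 55000]},
    index=[101, 102, 104],
)

# Add row 103 and column "Bonus", and reorder the columns, in one step.
both = df.reindex(index=[101, 102, 103, 104], columns=["Salary", "Name", "Bonus"])

print(both.shape)                   # (4, 3)
print(both.columns.tolist())        # ['Salary', 'Name', 'Bonus']
print(both.loc[103].isna().all())   # True: the new row is empty
print(both["Bonus"].isna().all())   # True: the new column is empty
```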


## 4. Aligning Data — Core Pandas Concept

When pandas operates on two objects, it aligns them by index label first.  
The operation is computed only for labels present in both objects.  
Labels missing from either object -> `NaN` in the result.

In [23]:
s1 = pd.Series([10, 20, 30], index=["A", "B", "C"])
s2 = pd.Series([1, 2, 3], index=["B", "C", "D"])

In [26]:
s1

A    10
B    20
C    30
dtype: int64

In [27]:
s2

B    1
C    2
D    3
dtype: int64

In [24]:
s1 + s2

A     NaN
B    21.0
C    32.0
D     NaN
dtype: float64

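Only B and C appear in both Series, so only those labels get a computed value. A and D each appear in one Series only, so their result is NaN. The result dtype is float64 because NaN is a floating-point value.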
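
If the NaN results are not wanted, one option is the `Series.add()` method with `fill_value`, which substitutes a value for a label missing on one side before adding. A minimal sketch with the same two Series (the name `total` is an assumption for illustration):

```python
import pandas as pd

s1 = pd.Series([10, 20, 30], index=["A", "B", "C"])
s2 = pd.Series([1, 2, 3], index=["B", "C", "D"])

# Treat a label missing from one Series as 0 instead of producing NaN.
total = s1.add(s2, fill_value=0)

print(total)
# A    10.0
# B    21.0
# C    32.0
# D     3.0
# dtype: float64
```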

## 5. Align Two DataFrames — .align()

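Two DataFrames can be aligned with the `.align()` method. It returns both DataFrames as a tuple, reindexed to the same labels; by default (`join='outer'`) that is the union of their labels.  
As with Series, alignment is by index label.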

In [28]:
df1 = pd.DataFrame({"A": [1, 2]}, index=[1, 2])
df2 = pd.DataFrame({"A": [3, 4]}, index=[2, 3])

In [29]:
left, right = df1.align(df2)

In [31]:
left

Unnamed: 0,A
1,1.0
2,2.0
3,


In [32]:
right

Unnamed: 0,A
1,
2,3.0
3,4.0
