# 🐍 Python + Pandas 🐼

#### Using the Indicator Parameter in the Pandas Merge Function

When merging DataFrames in Python, it’s often useful to know whether each row of the result came from the left DataFrame, the right DataFrame, or both.

In [1]:
import pandas as pd

In [2]:
# Left DataFrame: one dict per person.
person_records = [
    {"PersonID": 1, "Name": "Wikus"},
    {"PersonID": 2, "Name": "Thandi"},
    {"PersonID": 3, "Name": "Xeki"}
]

person_df = pd.DataFrame(person_records)

In [3]:
# Right DataFrame: one list per column. PersonID 4 has no match in person_df.
purchase_data = {
    'PersonID': [3, 3, 4],
    'Item': ['Egg', 'Potato', 'Petrol'],
    'Quantity': [5, 10, 15],
    'Price': [10.50, 29.99, 5.50]
}

purchase_df = pd.DataFrame(purchase_data)

Setting the 'indicator' parameter to True adds an extra column called '_merge' to the output DataFrame, recording the source of each row.

In [4]:
final_df = pd.merge(
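    person_df,       # left DataFrame
    purchase_df,     # right DataFrame
    on="PersonID",   # key column shared by both DataFrames
    how="outer",     # keep all rows from both sides, matched or not
    indicator=True   # add the '_merge' column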
)

In [5]:
print(final_df)

   PersonID    Name    Item  Quantity  Price      _merge
0         1   Wikus     NaN       NaN    NaN   left_only
1         2  Thandi     NaN       NaN    NaN   left_only
2         3    Xeki     Egg       5.0  10.50        both
3         3    Xeki  Potato      10.0  29.99        both
4         4     NaN  Petrol      15.0   5.50  right_only
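
The 'indicator' parameter also accepts a string, which is then used as the column name instead of '_merge'. The sketch below rebuilds trimmed-down versions of the two DataFrames so that it runs on its own; the column name "source" and the variable `named_df` are illustrative choices, not part of the original example.

```python
import pandas as pd

# Trimmed-down copies of the DataFrames used above.
person_df = pd.DataFrame({"PersonID": [1, 2, 3], "Name": ["Wikus", "Thandi", "Xeki"]})
purchase_df = pd.DataFrame({"PersonID": [3, 3, 4], "Item": ["Egg", "Potato", "Petrol"]})

# Passing a string names the indicator column ("source" is an arbitrary choice).
named_df = pd.merge(
    person_df,
    purchase_df,
    on="PersonID",
    how="outer",
    indicator="source",
)

print(named_df.columns.tolist())
```

A custom name can help when the result is merged again later, since each merge can then carry its own indicator column.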


The '_merge' column can take three values:
- left_only: the merge key appears only in the left DataFrame.
- right_only: the merge key appears only in the right DataFrame.
- both: the merge key appears in both DataFrames.
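
These values can be used to filter the merged result. The following is a minimal sketch that repeats the setup so it runs on its own; the names `merge_counts`, `no_purchases`, and `unmatched_purchases` are hypothetical and chosen only for illustration.

```python
import pandas as pd

person_df = pd.DataFrame([
    {"PersonID": 1, "Name": "Wikus"},
    {"PersonID": 2, "Name": "Thandi"},
    {"PersonID": 3, "Name": "Xeki"},
])
purchase_df = pd.DataFrame({
    "PersonID": [3, 3, 4],
    "Item": ["Egg", "Potato", "Petrol"],
    "Quantity": [5, 10, 15],
    "Price": [10.50, 29.99, 5.50],
})

final_df = pd.merge(person_df, purchase_df, on="PersonID", how="outer", indicator=True)

# Count how many rows fall into each of the three categories.
merge_counts = final_df["_merge"].value_counts()

# People with no purchases: rows found only in the left DataFrame.
no_purchases = final_df.loc[final_df["_merge"] == "left_only", ["PersonID", "Name"]]

# Purchases with no matching person: rows found only in the right DataFrame.
unmatched_purchases = final_df.loc[final_df["_merge"] == "right_only", ["PersonID", "Item"]]

print(merge_counts)
print(no_purchases)
print(unmatched_purchases)
```

Filtering on left_only in this way is a common way to express an anti-join, which pandas does not offer as a separate 'how' option.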