The join() method in Pandas combines two DataFrames on their index or, optionally, matches a column of the calling DataFrame against the index of the other. It’s simpler than merge() when you're joining by index.

✅ Syntax

DataFrame.join(other, on=None, how='left', lsuffix='', rsuffix='', sort=False)

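    other: The DataFrame to join. A named Series or a list of DataFrames is also accepted.

    on: Optional column (or list of columns) in the calling DataFrame to match against the index of other. If omitted, the join is index-on-index.

    how: 'left' (default), 'right', 'inner', 'outer', or 'cross' (pandas 1.2+).

    lsuffix, rsuffix: Suffixes appended to overlapping column names from the left and right DataFrames. If names overlap and no suffix is given, join() raises a ValueError.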
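
The suffix parameters only matter when both DataFrames share a column name. Here is a minimal sketch of that case; the `left` and `right` frames and the suffix strings are hypothetical and do not appear in the examples of this section:

```python
import pandas as pd

# Hypothetical frames that share the column name 'Score'
left = pd.DataFrame({'Score': [90, 85]}, index=['a', 'b'])
right = pd.DataFrame({'Score': [70, 60]}, index=['a', 'b'])

# Without suffixes, the overlapping column name makes join() raise ValueError
try:
    left.join(right)
except ValueError as exc:
    print('join failed:', exc)

# With suffixes, each 'Score' column gets a distinct name
joined = left.join(right, lsuffix='_left', rsuffix='_right')
print(joined.columns.tolist())
```

Passing only one of the two suffixes is also enough to avoid the error, since the other side keeps its original name.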

In [1]:
# 1. Join on Index (Default Behavior)
import pandas as pd

df1 = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie']
}, index=[1, 2, 3])

df2 = pd.DataFrame({
    'Department': ['HR', 'IT', 'Sales']
}, index=[1, 2, 4])

result = df1.join(df2)
print(result)


      Name Department
1    Alice         HR
2      Bob         IT
3  Charlie        NaN


In [2]:
# 2. Join with how='inner'
result = df1.join(df2, how='inner')
print(result)

    Name Department
1  Alice         HR
2    Bob         IT
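
The remaining `how` values behave the same way. As a rough sketch, the block below rebuilds the two frames from the first example under the illustrative names `names` and `depts`, then compares 'outer' and 'right':

```python
import pandas as pd

# Same data as the first example, under illustrative names
names = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie']}, index=[1, 2, 3])
depts = pd.DataFrame({'Department': ['HR', 'IT', 'Sales']}, index=[1, 2, 4])

# 'outer' keeps the union of both indexes; gaps on either side become NaN
outer = names.join(depts, how='outer')

# 'right' keeps only the index labels present in depts
right = names.join(depts, how='right')

print(outer)
print(right)
```

Label 3 exists only in `names` and label 4 only in `depts`, so the outer result has a missing Department for 3 and a missing Name for 4, while the right result drops label 3 entirely.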


In [3]:
# 3. Join on a Column (not index)
df1 = pd.DataFrame({
    'EmpID': [1, 2, 3],
    'Name': ['Alice', 'Bob', 'Charlie']
})

df2 = pd.DataFrame({
    'EmpID': [1, 2, 4],
    'Department': ['HR', 'IT', 'Finance']
})

# join() always matches against the index of the other DataFrame,
# so make EmpID the index of df2
df2 = df2.set_index('EmpID')

# Match df1's EmpID column against df2's index
result = df1.join(df2, on='EmpID')
print(result)


   EmpID     Name Department
0      1    Alice         HR
1      2      Bob         IT
2      3  Charlie        NaN


join() vs merge()

 
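| Feature          | `join()`                                          | `merge()`                        |
| ---------------- | ------------------------------------------------- | -------------------------------- |
| Default join key | Index                                             | Common columns or specified keys |
| Syntax           | ✅ Shorter for index joins                        | ❌ More verbose                  |
| Multi-key join   | ✅ Only via `on=[...]` against a MultiIndex       | ✅ Supported                     |
| Right-side key   | ❌ Always the index of `other`                    | ✅ Any column or the index       |
| Join types       | ✅ 'left', 'right', 'inner', 'outer', 'cross'     | ✅ Same set                      |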
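
To make the comparison concrete, the column-on-index join from the third example can be written with either method. This is only a sketch; the names `employees` and `departments` are illustrative stand-ins for `df1` and the re-indexed `df2`:

```python
import pandas as pd

employees = pd.DataFrame({
    'EmpID': [1, 2, 3],
    'Name': ['Alice', 'Bob', 'Charlie']
})

# Department table already indexed by EmpID
departments = pd.DataFrame(
    {'Department': ['HR', 'IT', 'Finance']},
    index=pd.Index([1, 2, 4], name='EmpID')
)

# join(): left column against right index, left join by default
via_join = employees.join(departments, on='EmpID')

# merge(): the same operation with every choice spelled out
via_merge = employees.merge(
    departments, left_on='EmpID', right_index=True, how='left'
)

print(via_join.equals(via_merge))
```

Note that merge() defaults to how='inner', so the explicit how='left' is needed to match join()'s default.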
