# Pandas `join()` Function - Detailed Explanation

The `join()` method in pandas combines DataFrames by aligning them **on their indexes**. By default it matches the caller's index against the index of `other`; with the `on` parameter it can instead match a column of the caller against the index of `other`.

## Syntax:
```python
DataFrame.join(other, on=None, how='left', lsuffix='', rsuffix='', sort=False)
```

## Parameters:
- `other`: DataFrame, Series, or list of DataFrames to join with the caller.
- `on`: Column or index level name(s) in the caller to match against the index of `other`. If omitted, the join is index-on-index.
- `how`: The type of join to perform:
  - `'left'`: Keep all rows from the left DataFrame and merge matching rows from the right.
  - `'right'`: Keep all rows from the right DataFrame and merge matching rows from the left.
  - `'outer'`: Keep all rows from both DataFrames, filling in missing values where necessary.
  - `'inner'`: Keep only the rows that exist in both DataFrames.
- `lsuffix`: Suffix to apply to overlapping column names from the left DataFrame.
- `rsuffix`: Suffix to apply to overlapping column names from the right DataFrame.
- `sort`: If `True`, the result is sorted by the join key. If `False` (the default), the row order depends on the join type.
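
The `lsuffix` and `rsuffix` parameters matter as soon as both DataFrames contain a column with the same name. The following is a minimal sketch with made-up data (the `Score` column and the `_left`/`_right` suffixes are illustrative choices, not part of the original example): without a suffix, pandas raises a `ValueError`; with suffixes, both columns are kept under distinct names.

```python
import pandas as pd

# Two small DataFrames that share the column name 'Score' (hypothetical data)
left = pd.DataFrame({'Score': [90, 80]}, index=pd.Index([1, 2], name='ID'))
right = pd.DataFrame({'Score': [70, 60]}, index=pd.Index([1, 2], name='ID'))

# Without suffixes, overlapping column names cannot be resolved
try:
    left.join(right)
    overlap_error = None
except ValueError as exc:
    overlap_error = exc
print(overlap_error)

# With suffixes, each side keeps its own copy of the column
suffixed = left.join(right, lsuffix='_left', rsuffix='_right')
print(suffixed)
```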

## Example DataFrames:


```python
import pandas as pd

# Creating two sample DataFrames
df1 = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
}).set_index('ID')

df2 = pd.DataFrame({
    'ID': [1, 2, 4],
    'Salary': [50000, 60000, 70000]
}).set_index('ID')

# Performing different types of joins
inner_join = df1.join(df2, how='inner')  # Only matching IDs
left_join = df1.join(df2, how='left')  # All IDs from df1
right_join = df1.join(df2, how='right')  # All IDs from df2
outer_join = df1.join(df2, how='outer')  # All IDs from both DataFrames

# Display results
(inner_join, left_join, right_join, outer_join)
```


```
(     Name  Age  Salary
 ID                    
 1   Alice   25   50000
 2     Bob   30   60000,
        Name  Age   Salary
 ID                       
 1     Alice   25  50000.0
 2       Bob   30  60000.0
 3   Charlie   35      NaN,
      Name   Age  Salary
 ID                     
 1   Alice  25.0   50000
 2     Bob  30.0   60000
 4     NaN   NaN   70000,
        Name   Age   Salary
 ID                        
 1     Alice  25.0  50000.0
 2       Bob  30.0  60000.0
 3   Charlie  35.0      NaN
 4       NaN   NaN  70000.0)
```
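
In the left, right, and outer results above, integer columns that gain missing rows are displayed as floats (`50000.0`, `25.0`). This happens because `NaN` cannot be stored in a NumPy integer column, so pandas upcasts the column to `float64`. One possible way to keep integer values, sketched here as an optional variation rather than part of the original example, is to convert the column to the nullable `Int64` dtype before joining.

```python
import pandas as pd

df1 = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
}).set_index('ID')

df2 = pd.DataFrame({
    'ID': [1, 2, 4],
    'Salary': [50000, 60000, 70000]
}).set_index('ID')

# Default behaviour: the missing salary for ID 3 forces float64
default_join = df1.join(df2, how='left')
print(default_join['Salary'].dtype)

# Assumption: nullable integers are acceptable downstream.
# 'Int64' (capital I) can hold missing values as <NA>.
nullable_join = df1.join(df2.astype('Int64'), how='left')
print(nullable_join['Salary'].dtype)
print(nullable_join)
```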

### Using the `on` parameter

```python
import pandas as pd

# Create the first DataFrame (ID stays a regular column)
df1 = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
})

# Create the second DataFrame, also with ID as a regular column
df2 = pd.DataFrame({
    'ID': [1, 2, 4],
    'Salary': [50000, 60000, 70000]
})

# Join df1's "ID" column against df2's index (set to "ID" for the join)
result = df1.join(df2.set_index('ID'), on='ID', how='left')

# Show the result
print(result)
```
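
A join with `on` matches a column of the caller against the index of `other`. As a rough sketch of what that means, the same result can be obtained from `merge()` with `left_on` and `right_index=True`. The variable names `employees` and `salaries` below are illustrative only; the data mirrors the example above.

```python
import pandas as pd

employees = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
})

# The right-hand side must carry the key in its index
salaries = pd.DataFrame(
    {'Salary': [50000, 60000, 70000]},
    index=pd.Index([1, 2, 4], name='ID')
)

# join(): caller's 'ID' column is matched against the index of `salaries`
joined = employees.join(salaries, on='ID', how='left')

# merge(): the same operation spelled out explicitly
merged = employees.merge(salaries, left_on='ID', right_index=True, how='left')

# Compare the two results
print(joined.equals(merged))
print(joined)
```

Note that the result keeps the caller's original index and the `ID` column remains an ordinary column.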

## Exercises

Try to solve the following exercises to deepen your understanding of the `join()` function.

1. **Create your own DataFrames**: Create two DataFrames with a common column as the index and join them using different `how` parameters.
2. **Use the `on` parameter**: Keep the key as a regular column in one DataFrame (instead of its index) and use the `on` parameter to join that column against the other DataFrame's index.
3. **Handle missing values**: Perform an outer join and handle missing values using `.fillna()`.
4. **Use suffixes**: Add columns with the same name to both DataFrames and use `lsuffix` and `rsuffix` to differentiate them.
5. **Sort the results**: Perform a join operation and sort the resulting DataFrame.

Try implementing these exercises in the following code cell.


```python
# Your code here for exercises
import pandas as pd

# Create your DataFrames and test different join operations
```

