## DATA CONCATENATION AND MERGING
Combining DataFrames in pandas is like joining tables in Excel or SQL. pandas gives you two main tools, `merge()` and `concat()`, and which one you use depends on how the datasets are related.

#### Data Merging using `merge()`
- Combines datasets using a common key, like a *SQL JOIN*.
- **Real-world use case**
    1. Orders table + Customers table
    2. Student details + Exam results
    3. Product ID + Price table

- **Syntax: `pd.merge(df1, df2, on='id')`**
    - `on=` names the key column that both DataFrames must contain; rows are matched where the key values are equal.
    - In other words, `on=` is the matching criterion.

- **When to use merge**
    - ✔ Related datasets
    - ✔ Common column exists
    - ✔ You want relational joining
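
As a rough illustration of the first use case above (orders plus customers), the sketch below merges two small made-up tables on a shared `customer_id` column. The table contents and column names are hypothetical, chosen only for this example.

```python
import pandas as pd

# Hypothetical data for illustration only
orders = pd.DataFrame({
    "order_id": [101, 102, 103],
    "customer_id": [1, 2, 1],
    "amount": [250, 120, 80],
})
customers = pd.DataFrame({
    "customer_id": [1, 2],
    "name": ["Asha", "Ben"],
})

# Each order row picks up the name of the customer with the same customer_id
orders_with_names = pd.merge(orders, customers, on="customer_id")
print(orders_with_names)
```

Customer 1 appears in two orders, so that customer's name is repeated on both rows: a merge can be many-to-one, not just one-to-one.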

In [11]:
import pandas as pd

In [12]:
temp_df_us = pd.DataFrame({
    "city": ["New York", "Los Angeles", "Chicago"],
    "temperature_c": [28, 24, 20],
})
temp_df_us

Unnamed: 0,city,temperature_c
0,New York,28
1,Los Angeles,24
2,Chicago,20


In [13]:
windspeed_df_us = pd.DataFrame({
    "city": ["New York", "Los Angeles", "Chicago"],
    "wind_speed_kmh": [20, 18, 39],
})
windspeed_df_us

Unnamed: 0,city,wind_speed_kmh
0,New York,20
1,Los Angeles,18
2,Chicago,39


In [16]:
df = pd.merge(temp_df_us, windspeed_df_us, on='city')
df

Unnamed: 0,city,temperature_c,wind_speed_kmh
0,New York,28,20
1,Los Angeles,24,18
2,Chicago,20,39


**`merge()`** supports join types similar to **SQL joins**, selected with the `how=` parameter.

- **SQL JOINS**![alt text](0_WVRM40nLnJCf0YWa.webp)
- **PANDAS MERGE JOIN(how=)**![alt text](join-types-merge-names.jpg)

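**JOIN TYPES**
- *Inner join* = `how='inner'` : keeps only rows whose key appears in both tables. This is the default.
- *Left join* = `how='left'` : keeps all rows from the left table; right-table columns are `NaN` where there is no match.
- *Right join* = `how='right'` : keeps all rows from the right table; left-table columns are `NaN` where there is no match.
- *Outer join* = `how='outer'` : keeps all rows from both tables, with `NaN` wherever one side has no match.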
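
One way to compare the four join types side by side is to run the same merge with each `how=` value and look at which keys survive. This is a minimal sketch; the `left` and `right` tables and their column names are made up for the example.

```python
import pandas as pd

# Made-up tables: "B" and "C" match, "A" is left-only, "D" is right-only
left = pd.DataFrame({"key": ["A", "B", "C"], "left_val": [1, 2, 3]})
right = pd.DataFrame({"key": ["B", "C", "D"], "right_val": [20, 30, 40]})

# Run the same merge with each join type and record the surviving keys
keys_by_join = {}
for how in ["inner", "left", "right", "outer"]:
    merged = pd.merge(left, right, on="key", how=how)
    keys_by_join[how] = sorted(merged["key"])
    print(how, keys_by_join[how])
```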

In [21]:
df1 = pd.DataFrame({
    'city':["New York", "Los Angeles", "Washington"],
    "temperature_c": [28, 24, 20],
})
df1

Unnamed: 0,city,temperature_c
0,New York,28
1,Los Angeles,24
2,Washington,20


In [20]:
df2 = pd.DataFrame({
    "city": ["New York", "Los Angeles", "Chicago", "Miami"],
    "wind_speed_kmh": [20, 18, 39, 42],
})
df2


Unnamed: 0,city,wind_speed_kmh
0,New York,20
1,Los Angeles,18
2,Chicago,39
3,Miami,42


In [24]:
pd.merge(df1, df2, on='city', how='inner')

Unnamed: 0,city,temperature_c,wind_speed_kmh
0,New York,28,20
1,Los Angeles,24,18


In [23]:
pd.merge(df1, df2, on='city', how='left')


Unnamed: 0,city,temperature_c,wind_speed_kmh
0,New York,28,20.0
1,Los Angeles,24,18.0
2,Washington,20,
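
The `20.0` and `18.0` in the output above are worth a note: the missing value for Washington is stored as `NaN`, which is a float, so pandas converts the whole `wind_speed_kmh` column from integer to float. A minimal check, rebuilding the same two frames so the snippet runs on its own (the name `left_joined` is just for this example):

```python
import pandas as pd

df1 = pd.DataFrame({
    "city": ["New York", "Los Angeles", "Washington"],
    "temperature_c": [28, 24, 20],
})
df2 = pd.DataFrame({
    "city": ["New York", "Los Angeles", "Chicago", "Miami"],
    "wind_speed_kmh": [20, 18, 39, 42],
})

left_joined = pd.merge(df1, df2, on="city", how="left")

print(df2["wind_speed_kmh"].dtype)                 # integer dtype before the merge
print(left_joined["wind_speed_kmh"].dtype)         # float dtype after the merge, because of NaN
print(left_joined["wind_speed_kmh"].isna().sum())  # count of left rows with no match
```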


In [26]:
pd.merge(df1, df2, on='city', how='right')


Unnamed: 0,city,temperature_c,wind_speed_kmh
0,New York,28.0,20
1,Los Angeles,24.0,18
2,Chicago,,39
3,Miami,,42


In [25]:
pd.merge(df1, df2, on='city', how='outer')


Unnamed: 0,city,temperature_c,wind_speed_kmh
0,Chicago,,39.0
1,Los Angeles,24.0,18.0
2,Miami,,42.0
3,New York,28.0,20.0
4,Washington,20.0,
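
To check which table each row of an outer join came from, `pd.merge` accepts `indicator=True`, which adds a `_merge` column with the values `left_only`, `right_only`, or `both`. The sketch below rebuilds the same `df1` and `df2` so it can run on its own; the names `tagged` and `unmatched` are just for this example.

```python
import pandas as pd

df1 = pd.DataFrame({
    "city": ["New York", "Los Angeles", "Washington"],
    "temperature_c": [28, 24, 20],
})
df2 = pd.DataFrame({
    "city": ["New York", "Los Angeles", "Chicago", "Miami"],
    "wind_speed_kmh": [20, 18, 39, 42],
})

# indicator=True adds a "_merge" column describing the source of each row
tagged = pd.merge(df1, df2, on="city", how="outer", indicator=True)
print(tagged[["city", "_merge"]])

# Keep only the cities that appear in exactly one of the two tables
unmatched = tagged[tagged["_merge"] != "both"]
print(sorted(unmatched["city"]))
```

Filtering on `_merge` like this can be a quick way to find keys that are missing from one side before deciding which join type to use.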
