# Concat() and Merge() Methods in Pandas

## 1. Why do we use join() or merge() in Pandas?
In data analysis, we often work with multiple related datasets. The goal of join() or merge() is to combine these datasets on common columns or indexes, similar to joins in SQL.
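As a minimal sketch of the difference between the two methods, the snippet below combines two small tables once with merge() on a shared column and once with join() on the index. The tables `left` and `right` and their column names are hypothetical and are not part of the notebook's data.

```python
import pandas as pd

# Hypothetical toy tables, made up for this illustration.
left = pd.DataFrame({"key": [1, 2, 3], "name": ["a", "b", "c"]})
right = pd.DataFrame({"key": [2, 3, 4], "score": [20, 30, 40]})

# merge(): match rows on a shared column, much like SQL's JOIN ... ON.
by_column = pd.merge(left, right, on="key", how="inner")

# join(): match rows on the index, so the key is moved into the index first.
by_index = left.set_index("key").join(right.set_index("key"), how="inner")

print(by_column)
print(by_index.reset_index())
```

Both calls keep only the keys present in both tables; which one to use mostly depends on whether the key lives in a column or in the index.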

## 2. Key Reasons to Use join() or merge()
📌 1. Combining related data
Example: You have one table with customer info and another with orders. You want to combine them by customer ID.

📌 2. Bringing in additional data
Example: You want to add region or salary data to an employee DataFrame.

📌 3. Data enrichment
By merging datasets, you can enrich your main data with more columns or rows from another source.

📌 4. Handling real-world data structure
In practice, data often comes in normalized form (spread across multiple tables). Merging helps consolidate them for analysis.
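The enrichment case above can be sketched roughly as follows, assuming a hypothetical `employees` table and a `regions` lookup table (both invented for this example). The `validate` argument is optional; it is used here only to show one way of guarding against duplicated keys in the lookup table.

```python
import pandas as pd

# Hypothetical employee table and lookup table, made up for illustration.
employees = pd.DataFrame({
    "EmployeeID": [1, 2, 3],
    "Name": ["Ada", "Bora", "Cem"],
    "RegionCode": ["N", "S", "N"],
})
regions = pd.DataFrame({
    "RegionCode": ["N", "S"],
    "Region": ["North", "South"],
})

# A left merge keeps every employee and adds the matching Region column.
# validate="many_to_one" raises an error if RegionCode repeats in `regions`,
# which guards against accidentally duplicating employee rows.
enriched = employees.merge(
    regions, on="RegionCode", how="left", validate="many_to_one"
)
print(enriched)
```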

In [2]:
# import libraries
import pandas as pd
import numpy as np

In [4]:
customers = {
    "CustomerID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    "FirstName": ["Emre","Ali","Veli","Ayşe","Fatma","Ceren","Hasan","Poyraz","Kuzey","Dilruba", "Deniz","Meryem"],
    "LastName": ["Üstübeç","Karaağaç","Meşe","Sultan","Hatun","Bilgin","Yüksel","Selcan","Uysal","Oltacı","Kasap","Gülsen"],
    "Age": [26, 32, 18, 7, 85, 17, 26, 54, 12, 3, 65, 59]
}


orders = {
    "OrderID":[100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111],
    "CustomerID": [1, 2, 2, 4, 5, 7, 7, 8, 9, 10, 10, 12],
    "OrderDate": ["2025-06-08", "2025-06-09", "2025-08-25", "2025-11-14","2025-11-14","2025-11-14","2025-12-30","2025-12-30","2025-09-22","2025-10-18","2025-02-14","2025-02-14"]
}


df_customers = pd.DataFrame(customers, columns =["CustomerID","FirstName","LastName","Age"])
df_orders = pd.DataFrame(orders, columns = ["OrderID","CustomerID","OrderDate"])

print("*"*20, "CUSTOMERS","*"*20)
print(df_customers,"\n")
print("*"*22, "ORDERS","*"*22)
print(df_orders,"\n")



******************** CUSTOMERS ********************
    CustomerID FirstName  LastName  Age
0            1      Emre   Üstübeç   26
1            2       Ali  Karaağaç   32
2            3      Veli      Meşe   18
3            4      Ayşe    Sultan    7
4            5     Fatma     Hatun   85
5            6     Ceren    Bilgin   17
6            7     Hasan    Yüksel   26
7            8    Poyraz    Selcan   54
8            9     Kuzey     Uysal   12
9           10   Dilruba    Oltacı    3
10          11     Deniz     Kasap   65
11          12    Meryem    Gülsen   59 

********************** ORDERS **********************
    OrderID  CustomerID   OrderDate
0       100           1  2025-06-08
1       101           2  2025-06-09
2       102           2  2025-08-25
3       103           4  2025-11-14
4       104           5  2025-11-14
5       105           7  2025-11-14
6       106           7  2025-12-30
7       107           8  2025-12-30
8       108           9  2025-09-22
9       109          10  2025-10-18
10      110          10  2025-02-14
11      111          12  2025-02-14 

In [5]:
# Merge examples: inner join on the shared CustomerID column
merged_df = pd.merge(df_customers, df_orders, on="CustomerID", how="inner")
print("********************* Merged DataFrame *********************")
print(merged_df) 

********************* Merged DataFrame *********************
    CustomerID FirstName  LastName  Age  OrderID   OrderDate
0            1      Emre   Üstübeç   26      100  2025-06-08
1            2       Ali  Karaağaç   32      101  2025-06-09
2            2       Ali  Karaağaç   32      102  2025-08-25
3            4      Ayşe    Sultan    7      103  2025-11-14
4            5     Fatma     Hatun   85      104  2025-11-14
5            7     Hasan    Yüksel   26      105  2025-11-14
6            7     Hasan    Yüksel   26      106  2025-12-30
7            8    Poyraz    Selcan   54      107  2025-12-30
8            9     Kuzey     Uysal   12      108  2025-09-22
9           10   Dilruba    Oltacı    3      109  2025-10-18
10          10   Dilruba    Oltacı    3      110  2025-02-14
11          12    Meryem    Gülsen   59      111  2025-02-14


In [None]:
# how = "left": keep every customer, even those without orders
left_merged_df = pd.merge(df_customers, df_orders, on="CustomerID", how="left")
print("********************* Left Merged DataFrame *********************")
print(left_merged_df) # get all customers 

********************* Left Merged DataFrame *********************
    CustomerID FirstName  LastName  Age  OrderID   OrderDate
0            1      Emre   Üstübeç   26    100.0  2025-06-08
1            2       Ali  Karaağaç   32    101.0  2025-06-09
2            2       Ali  Karaağaç   32    102.0  2025-08-25
3            3      Veli      Meşe   18      NaN         NaN
4            4      Ayşe    Sultan    7    103.0  2025-11-14
5            5     Fatma     Hatun   85    104.0  2025-11-14
6            6     Ceren    Bilgin   17      NaN         NaN
7            7     Hasan    Yüksel   26    105.0  2025-11-14
8            7     Hasan    Yüksel   26    106.0  2025-12-30
9            8    Poyraz    Selcan   54    107.0  2025-12-30
10           9     Kuzey     Uysal   12    108.0  2025-09-22
11          10   Dilruba    Oltacı    3    109.0  2025-10-18
12          10   Dilruba    Oltacı    3    110.0  2025-02-14
13          11     Deniz     Kasap   65      NaN         NaN
14          12    Meryem    Gülsen   59    111.0  2025-02-14
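The NaN rows in a left merge mark customers with no orders. One way to pick them out explicitly is the `indicator=True` option of pd.merge(), sketched here on a reduced copy of the two tables (`customers_small` and `orders_small` are names chosen for this example only).

```python
import pandas as pd

# Reduced copies of the notebook's tables, kept small for readability.
customers_small = pd.DataFrame(
    {"CustomerID": [1, 2, 3], "FirstName": ["Emre", "Ali", "Veli"]}
)
orders_small = pd.DataFrame(
    {"OrderID": [100, 101, 102], "CustomerID": [1, 2, 2]}
)

# indicator=True adds a "_merge" column with the values
# "both", "left_only" or "right_only" for each output row.
flagged = pd.merge(
    customers_small, orders_small, on="CustomerID", how="left", indicator=True
)

# Customers that appear only in the left table have no orders.
no_orders = flagged.loc[flagged["_merge"] == "left_only", ["CustomerID", "FirstName"]]
print(no_orders)
```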

In [7]:
# how = "right": keep every order, even those without a matching customer
right_merged_df = pd.merge(df_customers, df_orders, on="CustomerID", how="right")
print("********************* Right Merged DataFrame *********************")
print(right_merged_df)  # get all orders 

********************* Right Merged DataFrame *********************
    CustomerID FirstName  LastName  Age  OrderID   OrderDate
0            1      Emre   Üstübeç   26      100  2025-06-08
1            2       Ali  Karaağaç   32      101  2025-06-09
2            2       Ali  Karaağaç   32      102  2025-08-25
3            4      Ayşe    Sultan    7      103  2025-11-14
4            5     Fatma     Hatun   85      104  2025-11-14
5            7     Hasan    Yüksel   26      105  2025-11-14
6            7     Hasan    Yüksel   26      106  2025-12-30
7            8    Poyraz    Selcan   54      107  2025-12-30
8            9     Kuzey     Uysal   12      108  2025-09-22
9           10   Dilruba    Oltacı    3      109  2025-10-18
10          10   Dilruba    Oltacı    3      110  2025-02-14
11          12    Meryem    Gülsen   59      111  2025-02-14
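In the data above every order belongs to an existing customer, so the right merge prints the same rows as the inner merge. The difference only shows up when an order has no matching customer. The sketch below assumes a hypothetical order 112 for a non-existent CustomerID 99 to make that visible.

```python
import pandas as pd

customers_small = pd.DataFrame(
    {"CustomerID": [1, 2], "FirstName": ["Emre", "Ali"]}
)
# OrderID 112 and CustomerID 99 are hypothetical; no such customer exists.
orders_small = pd.DataFrame(
    {"OrderID": [100, 101, 112], "CustomerID": [1, 2, 99]}
)

# The right merge keeps all three orders; the unmatched one gets NaN
# in the columns that come from the customers table.
right_demo = pd.merge(customers_small, orders_small, on="CustomerID", how="right")

# The inner merge drops the unmatched order entirely.
inner_demo = pd.merge(customers_small, orders_small, on="CustomerID", how="inner")

print(right_demo)
print(inner_demo)
```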


In [13]:
# Concat example

student_list1 = {
    "StudentNumber":[123, 128, 132, 147, 160, 256, 412],
    "StudentName":["Cemre","Aytaç","Halil","Can","Defne","Derya","Ozan"],
    "StudentSurname":["Beyoğlu","Arıkan","Öz","Derinler","Saman","Gözüpek","Güner"],
    "StudentGrade":["12A","12A","11A","12B","12C","11A","11B"]
    
}
student_list2 = {
    "StudentNumber":[124, 128, 142, 167, 180, 216, 284],
    "StudentName":["Demir","Kadir","Hülya","Canan","Seğer","Güney","Banu"],
    "StudentSurname":["Türkeş","Mollaoğlu","Köse","Işık","Turan","Canik","Süzgün"],
    "StudentGrade":["9A","10A","10B","10B","10C","11A","12B"]
}

df_students1 = pd.DataFrame(student_list1, columns = ["StudentNumber","StudentName","StudentSurname","StudentGrade"])
df_students2 = pd.DataFrame(student_list2, columns = ["StudentNumber","StudentName","StudentSurname","StudentGrade"])

# set display options so wide frames print without wrapping
pd.set_option("display.max_rows", 250)
pd.set_option("display.max_columns", 250)
pd.set_option("display.width", 300)
# axis=0 (default) stacks rows; axis=1 places the frames side by side as columns
concatenated_df = pd.concat([df_students1, df_students2])
print("************************* Concat Example1 *************************")
print(concatenated_df,"\n")

concatenated_df2 = pd.concat([df_students1, df_students2], axis= 1) 
print("************************************************** Concat Example2 **************************************************")
print(concatenated_df2)

************************* Concat Example1 *************************
   StudentNumber StudentName StudentSurname StudentGrade
0            123       Cemre        Beyoğlu          12A
1            128       Aytaç         Arıkan          12A
2            132       Halil             Öz          11A
3            147         Can       Derinler          12B
4            160       Defne          Saman          12C
5            256       Derya        Gözüpek          11A
6            412        Ozan          Güner          11B
0            124       Demir         Türkeş           9A
1            128       Kadir      Mollaoğlu          10A
2            142       Hülya           Köse          10B
3            167       Canan           Işık          10B
4            180       Seğer          Turan          10C
5            216       Güney          Canik          11A
6            284        Banu         Süzgün          12B 

************************************************** Concat Example2 ********
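Note that in the row-wise result the index labels 0 to 6 appear twice, and StudentNumber 128 occurs in both lists, because concat() neither renumbers nor deduplicates rows. A minimal sketch of two common options, `ignore_index` and `keys`, on a shortened version of the student lists (the names `first` and `second` are chosen for this example):

```python
import pandas as pd

# Shortened versions of the two student lists.
first = pd.DataFrame(
    {"StudentNumber": [123, 128], "StudentName": ["Cemre", "Aytaç"]}
)
second = pd.DataFrame(
    {"StudentNumber": [124, 128], "StudentName": ["Demir", "Kadir"]}
)

# Default: original row labels are kept, so 0 and 1 each appear twice.
stacked = pd.concat([first, second])

# ignore_index=True discards the old labels and numbers the rows 0..n-1.
renumbered = pd.concat([first, second], ignore_index=True)

# keys= adds an outer index level recording which frame each row came from.
labelled = pd.concat([first, second], keys=["list1", "list2"])

# concat() does not deduplicate: StudentNumber 128 is present in both lists.
repeated = renumbered[renumbered["StudentNumber"].duplicated(keep=False)]

print(renumbered)
print(labelled)
print(repeated)
```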