
# 4.4 Left Join Template

1. [Identify the *Left* Table](#1.-Identify-the-Left-Table)    
2. [Identify the *Right* Table](#2.-Identify-the-Right-Table)   
3. [Identify the *Merge* Field](#3.-Identify-the-Merge-Field)   
4. [Merge the Two Dataframes](#4.-Merge-the-Two-Dataframes)  
5. [Verify that the Merge is Correct](#5.-Verify-that-the-Merge-is-Correct)  
6. [Save the Merged Dataframe as a Pickle File](#6.-Save-the-Merged-Dataframe-as-a-Pickle-File)

In [22]:
import pandas as pd
import pickle

# 1. Identify the Left Table  

### Questions to answer:  
1. What is the Left Table? Orders
2. Why is this the Left table/data set? Orders is the table whose rows we want to keep in full; the merge adds employee details to each order.
3. Number of Rows: 196
4. Number of Columns: 5

In [23]:
file = 'Data/w3schools_Data.xlsx'

In [24]:
# Read the Left Table data into a dataframe named: df_Left
df_Left = pd.read_excel(file, 'Orders')

In [25]:
# Display the first rows in the dataframe
df_Left.head()

| | OrderID | CustomerID | EmployeeID | OrderDate | ShipperID |
|---|---|---|---|---|---|
| 0 | 10248 | 90 | 5 | 35250 | 3 |
| 1 | 10249 | 81 | 6 | 35251 | 1 |
| 2 | 10250 | 34 | 4 | 35254 | 2 |
| 3 | 10251 | 84 | 3 | 35254 | 1 |
| 4 | 10252 | 76 | 4 | 35255 | 2 |


In [26]:
print("Number of Rows in the Left dataframe:  ", df_Left.shape[0])
print("Number of Columns in Left dataframe:  ", df_Left.shape[1])

Number of Rows in the Left dataframe:   196
Number of Columns in Left dataframe:   5


# 2. Identify the Right Table  

### Questions to answer:  
1. What is the Right Table? Employees
2. Why is this the Right table/data set? It supplies the employee information that is added to each row of the Orders (left) table.
3. What columns does it contain that we want to add to the columns in the Left table/dataframe? LastName, FirstName, BirthDate, Photo, Notes
4. Number of Rows: 10
5. Number of Columns: 6

In [27]:
# Read the Right Table data into a dataframe named: df_Right
df_Right = pd.read_excel(file, 'Employees')

In [28]:
# Display the first rows in the dataframe
df_Right.head()

| | EmployeeID | LastName | FirstName | BirthDate | Photo | Notes |
|---|---|---|---|---|---|---|
| 0 | 1 | Davolio | Nancy | 25180 | EmpID1.pic | Education includes a BA in psychology from Col... |
| 1 | 2 | Fuller | Andrew | 19043 | EmpID2.pic | Andrew received his BTS commercial and a Ph.D.... |
| 2 | 3 | Leverling | Janet | 23253 | EmpID3.pic | Janet has a BS degree in chemistry from Boston... |
| 3 | 4 | Peacock | Margaret | 21447 | EmpID4.pic | Margaret holds a BA in English literature from... |
| 4 | 5 | Buchanan | Steven | 20152 | EmpID5.pic | Steven Buchanan graduated from St. Andrews Uni... |


In [29]:
print("Number of Rows in the Right dataframe:  ", df_Right.shape[0])
print("Number of Columns in Right dataframe:  ", df_Right.shape[1])

Number of Rows in the Right dataframe:   10
Number of Columns in Right dataframe:   6


# 3. Identify the Merge Field  

### Questions to answer:  
1. What is the Merge Column? EmployeeID
2. Is this Merge Column in both dataframes? Yes, EmployeeID appears in both df_Left and df_Right.
3. Why is this the correct Merge Column? EmployeeID identifies each employee in the Employees table, and each row of the Orders table references an employee by the same EmployeeID, so matching on it attaches the correct employee details to each order.
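
The answers above can also be checked in code before merging. The following is a minimal sketch, not part of the original template: the two small dataframes are hypothetical stand-ins for `df_Left` and `df_Right` (which the notebook reads from Excel), and the variable names `in_both`, `right_key_is_unique`, and `all_keys_matched` are illustrative.

```python
import pandas as pd

# Hypothetical stand-ins for df_Left (Orders) and df_Right (Employees);
# in the notebook these are read from the Excel workbook.
df_Left = pd.DataFrame({
    "OrderID": [10248, 10249, 10250],
    "EmployeeID": [5, 6, 4],
})
df_Right = pd.DataFrame({
    "EmployeeID": [4, 5, 6],
    "LastName": ["Peacock", "Buchanan", "Suyama"],
})

merge_col = "EmployeeID"

# The merge column must exist in both dataframes.
in_both = merge_col in df_Left.columns and merge_col in df_Right.columns

# Each key should appear only once in the right table; otherwise a
# left join would duplicate rows from the left table.
right_key_is_unique = bool(df_Right[merge_col].is_unique)

# Every key used in the left table should have a match in the right table.
all_keys_matched = bool(df_Left[merge_col].isin(df_Right[merge_col]).all())

print("Merge column in both dataframes:", in_both)
print("Right key is unique:", right_key_is_unique)
print("All left keys have a match:", all_keys_matched)
```

If the uniqueness check fails, the merged row count will exceed the left row count, which is worth knowing before running the merge itself.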

# 4. Merge the Two Dataframes 

In [30]:
# Create a new Dataframe from a Left Join of two existing dataframes
df_left_join = pd.merge(df_Left, df_Right, on='EmployeeID', how='left')

In [31]:
# Display the first rows in the merged dataframe
df_left_join.head()

| | OrderID | CustomerID | EmployeeID | OrderDate | ShipperID | LastName | FirstName | BirthDate | Photo | Notes |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 10248 | 90 | 5 | 35250 | 3 | Buchanan | Steven | 20152 | EmpID5.pic | Steven Buchanan graduated from St. Andrews Uni... |
| 1 | 10249 | 81 | 6 | 35251 | 1 | Suyama | Michael | 23194 | EmpID6.pic | Michael is a graduate of Sussex University (MA... |
| 2 | 10250 | 34 | 4 | 35254 | 2 | Peacock | Margaret | 21447 | EmpID4.pic | Margaret holds a BA in English literature from... |
| 3 | 10251 | 84 | 3 | 35254 | 1 | Leverling | Janet | 23253 | EmpID3.pic | Janet has a BS degree in chemistry from Boston... |
| 4 | 10252 | 76 | 4 | 35255 | 2 | Peacock | Margaret | 21447 | EmpID4.pic | Margaret holds a BA in English literature from... |


# 5. Verify that the Merge is Correct  

1. Display the first several rows of the merged dataframe.
2. Display the number of rows and columns in the merged dataframe  


### Questions to answer:  
1. Number of Rows: 196
   1. How many should there be? 196, the same as the Left Table.
2. Number of Columns: 10
   1. How many should there be? 10: the number in the Left Table (5) + the number in the Right Table (6) - 1, because the Merge Column is not duplicated.
3. Are all the original rows and columns from the Left Table still here? Yes, all 196 rows and all 5 original columns are still here.
4. Are the additional columns from the Right table now here? Yes, LastName, FirstName, BirthDate, Photo, and Notes are now here.
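
The row-count expectation holds only when each merge key appears once in the right table. As a hedged sketch of how pandas can check this during the merge, the example below uses the `validate` and `indicator` parameters of `pd.merge`. The dataframes `orders` and `employees` are hypothetical toy data, and EmployeeID 99 is made up to show what an unmatched row looks like.

```python
import pandas as pd

# Hypothetical toy data; EmployeeID 99 deliberately has no match.
orders = pd.DataFrame({"OrderID": [1, 2, 3], "EmployeeID": [5, 6, 99]})
employees = pd.DataFrame({
    "EmployeeID": [5, 6],
    "LastName": ["Buchanan", "Suyama"],
})

merged = pd.merge(
    orders,
    employees,
    on="EmployeeID",
    how="left",
    validate="many_to_one",  # raises MergeError if right keys are not unique
    indicator=True,          # adds a _merge column: "both" or "left_only"
)

# Rows from the left table that found no partner in the right table.
unmatched = merged[merged["_merge"] == "left_only"]

print("Rows in merged dataframe:", len(merged))
print("Unmatched OrderIDs:", unmatched["OrderID"].tolist())
```

With `validate="many_to_one"`, a duplicated key in the right table fails at merge time, and the `_merge` column makes rows whose right-hand columns are all missing easy to find.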

In [32]:
# Display Left dataframe info
print("Left dataframe: ")
print("Number of Rows in Left dataframe:  ", df_Left.shape[0])
print("Number of Cols in Left dataframe:  ", df_Left.shape[1])
print()

# Display Right dataframe info
print("Right dataframe: ")
print("Number of Rows in Right dataframe:  ", df_Right.shape[0])
print("Number of Cols in Right dataframe:  ", df_Right.shape[1])
print()

# Display Merged dataframe info
print("Merged dataframe: ")
print("Number of Rows in Merged dataframe:  ", df_left_join.shape[0], "(Note: This should be same as the Left dataframe)")
print("Number of Cols in Merged dataframe:  ", df_left_join.shape[1], "(Note: This should be Left Cols + Right Cols - 1)")
print()

# Check for correct number of Rows and Cols
expected_merge_cols = df_Left.shape[1] + df_Right.shape[1] - 1

assert df_left_join.shape[0] == df_Left.shape[0], "The Number of Rows in the Merge is NOT Correct!!"
assert df_left_join.shape[1] == expected_merge_cols, "The Number of Columns in the Merge is NOT Correct!!"

print()
print("YOUR MERGE WAS SUCCESSFUL!!!")

Left dataframe: 
Number of Rows in Left dataframe:   196
Number of Cols in Left dataframe:   5

Right dataframe: 
Number of Rows in Right dataframe:   10
Number of Cols in Right dataframe:   6

Merged dataframe: 
Number of Rows in Merged dataframe:   196 (Note: This should be same as the Left dataframe)
Number of Cols in Merged dataframe:   10 (Note: This should be Left Cols + Right Cols - 1)


YOUR MERGE WAS SUCCESSFUL!!!


# 6. Save the Merged Dataframe as a Pickle File

In [33]:
df_left_join.to_pickle('Data/Ex_4.4_Part2_Orders_Employees_Merge.pkl')
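
One way to confirm that the pickle file was written correctly is to read it back with `pd.read_pickle` and compare it with the dataframe in memory. The sketch below is illustrative only: it uses a small hypothetical dataframe and a temporary directory rather than the notebook's `Data/` folder, and the file name `merge_check.pkl` is made up.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for df_left_join.
df = pd.DataFrame({
    "OrderID": [10248, 10249],
    "LastName": ["Buchanan", "Suyama"],
})

# Write to a temporary location so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "merge_check.pkl")
    df.to_pickle(path)
    df_restored = pd.read_pickle(path)

# equals() compares shape, values, and dtypes.
round_trip_ok = df_restored.equals(df)
print("Round trip preserved the dataframe:", round_trip_ok)
```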