In [None]:
# 7.3.2

In [None]:
# Use Different Types of Joins

# Joins are common in every coding language that deals with data. There are several types of joins available for use, 
# depending on what data we want displayed. 

# For example, in the retirement_info table, we want all of the information from the table. 

# We'll be joining it with the dept_emp table, but what information do we want included? 

# Just the dept_no, so we know which department each employee works in.



In [None]:
# When we talk about joining tables, try to imagine one table on the left and a second table on the right, instead of 
# in a list.

In [None]:
# Inner Join

# An inner join, also known as a simple join, will return matching data from two tables. 

# Look at the Departments and Managers tables in our ERD. Imagine that Departments is Table 1 and Managers is Table 2 
# in the diagram. (venn diagram, only data in the area where the diagram overlaps is highlighted)

# It's important to have the order of the tables correct, too. 

# SQL views them as the first and second tables, or the left and right tables.

# Say we want to view the department number and name as well as the managers' employee numbers. 

# We only want the matching data from both tables, though. This means that rows with no match in the other table 
# won't be included in the data that gets returned. 

# So if a department doesn't have a manager, it wouldn't be listed in the data.
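
# A minimal sketch of the inner join described above, assuming Python's built-in sqlite3 module in place of 
# Postgres so it runs anywhere. The rows are hypothetical sample data, not the real Departments and Managers 
# tables: Research (d003) is deliberately left without a manager.

```python
import sqlite3

# Hypothetical sample data; the real tables have more columns and rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_no TEXT PRIMARY KEY, dept_name TEXT);
    CREATE TABLE dept_manager (dept_no TEXT, emp_no INTEGER);
    INSERT INTO departments VALUES
        ('d001', 'Marketing'), ('d002', 'Finance'), ('d003', 'Research');
    INSERT INTO dept_manager VALUES ('d001', 10001), ('d002', 10002);
""")

# Only departments that have a matching dept_manager row are returned.
inner_rows = conn.execute("""
    SELECT departments.dept_no, departments.dept_name, dept_manager.emp_no
    FROM departments
    INNER JOIN dept_manager
        ON departments.dept_no = dept_manager.dept_no
    ORDER BY departments.dept_no;
""").fetchall()

for row in inner_rows:
    print(row)
conn.close()
```

# Research has no row in dept_manager, so it does not appear in the output at all.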


In [None]:
# Left Join

# A left join (or "left outer join") will take all of the data from Table 1 and only the matching data from Table 2. 

# An inner join is different because only the matching data from both tables is included. 

# So how do we visualize a left join?

# (venn diagram, left circle and area of overlap are highlighted)

# When a row in the Departments table has no matching row in the Managers table, a NULL is 
# inserted into that column and row intersection instead. 

# This way, all of the data we want included from Table 1 is still present, and data from Table 2 isn't rearranged 
# and mismatched (blank spaces aren't automatically filled with present data).

# If we wanted to add information from the Managers table to the Departments table, we'd use a left join. 
# This way, every row of the Departments table would be returned with or without manager information. 

# If a department doesn't have a manager, a NULL would appear in place of actual data.
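
# A minimal sketch of a left join, assuming Python's built-in sqlite3 module in place of Postgres. The rows are 
# hypothetical sample data in which Research (d003) has no manager. Python's sqlite3 reports a SQL NULL as None.

```python
import sqlite3

# Hypothetical sample data; the real tables have more columns and rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_no TEXT PRIMARY KEY, dept_name TEXT);
    CREATE TABLE dept_manager (dept_no TEXT, emp_no INTEGER);
    INSERT INTO departments VALUES
        ('d001', 'Marketing'), ('d002', 'Finance'), ('d003', 'Research');
    INSERT INTO dept_manager VALUES ('d001', 10001), ('d002', 10002);
""")

# Every departments row is kept; unmatched manager columns come back as NULL.
left_rows = conn.execute("""
    SELECT departments.dept_no, departments.dept_name, dept_manager.emp_no
    FROM departments
    LEFT JOIN dept_manager
        ON departments.dept_no = dept_manager.dept_no
    ORDER BY departments.dept_no;
""").fetchall()

for row in left_rows:
    print(row)
conn.close()
```

# All three departments are returned, and the Research row carries None (NULL) in the emp_no position.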

In [None]:
# Right Join

# The right join (or "right outer join") is the inverse of a left join. 

# A right join takes all of the data from Table 2 and only the matching data from Table 1. 
# (visually, the opposite of a left join)
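
# One way to see the "inverse" relationship: a right join gives the same rows as a left join with the two tables 
# swapped. The sketch below assumes Python's built-in sqlite3 module and uses the swapped LEFT JOIN form, since 
# older SQLite versions do not accept RIGHT JOIN. The rows are hypothetical; d009 is a manager row with no 
# matching department.

```python
import sqlite3

# Hypothetical sample data; d009 exists only in dept_manager.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_no TEXT PRIMARY KEY, dept_name TEXT);
    CREATE TABLE dept_manager (dept_no TEXT, emp_no INTEGER);
    INSERT INTO departments VALUES ('d001', 'Marketing'), ('d002', 'Finance');
    INSERT INTO dept_manager VALUES ('d001', 10001), ('d009', 10009);
""")

# Equivalent in intent to:
#   FROM departments RIGHT JOIN dept_manager ON ...
# All dept_manager rows (Table 2) are kept; only matching departments appear.
right_rows = conn.execute("""
    SELECT departments.dept_name, dept_manager.dept_no, dept_manager.emp_no
    FROM dept_manager
    LEFT JOIN departments
        ON departments.dept_no = dept_manager.dept_no
    ORDER BY dept_manager.emp_no;
""").fetchall()

for row in right_rows:
    print(row)
conn.close()
```

# Finance has no manager row, so it is dropped, while the unmatched d009 manager row is kept with a NULL name.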


In [None]:
# Full Outer Join

# A full outer join is a comprehensive join that combines all data from both tables.
# (visually, both circles in venn diagram are fully highlighted)

# The results are potentially massive. This is dangerous for a few reasons:

# If the data being returned is extremely large, generating it can bog down or even crash your computer.

# After the results have been returned, navigating the data can also bog down or crash your computer.

# The data has the potential to be full of NULL values.
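
# A minimal sketch of what a full outer join returns, assuming Python's built-in sqlite3 module. Because older 
# SQLite versions lack FULL OUTER JOIN, this emulates it (a swapped-in technique, not the Postgres syntax) with a 
# LEFT JOIN plus the unmatched rows from the second table. The rows are hypothetical sample data.

```python
import sqlite3

# Hypothetical sample data with unmatched rows on both sides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_no TEXT PRIMARY KEY, dept_name TEXT);
    CREATE TABLE dept_manager (dept_no TEXT, emp_no INTEGER);
    INSERT INTO departments VALUES
        ('d001', 'Marketing'), ('d002', 'Finance'), ('d003', 'Research');
    INSERT INTO dept_manager VALUES ('d001', 10001), ('d009', 10009);
""")

# Part 1: every department, with manager data where it matches.
# Part 2: manager rows that matched no department at all.
full_rows = conn.execute("""
    SELECT departments.dept_no, departments.dept_name, dept_manager.emp_no
    FROM departments
    LEFT JOIN dept_manager
        ON departments.dept_no = dept_manager.dept_no
    UNION ALL
    SELECT dept_manager.dept_no, NULL, dept_manager.emp_no
    FROM dept_manager
    LEFT JOIN departments
        ON departments.dept_no = dept_manager.dept_no
    WHERE departments.dept_no IS NULL
    ORDER BY 1;
""").fetchall()

# Count the NULLs to see how sparse the combined result is.
null_count = sum(value is None for row in full_rows for value in row)
for row in full_rows:
    print(row)
print("rows:", len(full_rows), "NULL values:", null_count)
conn.close()
```

# Even with this tiny sample, three of the four rows contain a NULL, which is why the result can be hard to work with.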




In [None]:
# 7.3.3 Joins in Action

# We'll practice not only joining multiple tables, but also a way to clean up code by using aliases, where we assign 
# a nickname to a table.
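
# A minimal sketch of table aliases, assuming Python's built-in sqlite3 module in place of Postgres. The nicknames 
# d and dm are our own choice for illustration, and the rows are hypothetical sample data. Both queries should 
# return the same rows; the aliased one is just shorter to read.

```python
import sqlite3

# Hypothetical sample data; dates are stored as text for simplicity.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_no TEXT PRIMARY KEY, dept_name TEXT);
    CREATE TABLE dept_manager (
        dept_no TEXT, emp_no INTEGER, from_date TEXT, to_date TEXT
    );
    INSERT INTO departments VALUES ('d001', 'Marketing'), ('d002', 'Finance');
    INSERT INTO dept_manager VALUES
        ('d001', 10001, '1985-01-01', '1991-10-01'),
        ('d002', 10002, '1989-12-17', '1999-06-30');
""")

# Full table names everywhere.
long_rows = conn.execute("""
    SELECT departments.dept_name, dept_manager.emp_no,
           dept_manager.from_date, dept_manager.to_date
    FROM departments
    INNER JOIN dept_manager
        ON departments.dept_no = dept_manager.dept_no
    ORDER BY dept_manager.emp_no;
""").fetchall()

# Same query with aliases assigned in FROM and INNER JOIN.
alias_rows = conn.execute("""
    SELECT d.dept_name, dm.emp_no, dm.from_date, dm.to_date
    FROM departments AS d
    INNER JOIN dept_manager AS dm
        ON d.dept_no = dm.dept_no
    ORDER BY dm.emp_no;
""").fetchall()

print(alias_rows == long_rows)
conn.close()
```

# The alias exists only inside the query that defines it; the tables themselves are not renamed.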

In [None]:
# Use Inner Join for Departments and dept_manager Tables

# Let's create a query that will return each department name from the Departments table as well as the employee 
# numbers and the from- and to- dates from the dept_manager table. 

# We'll use an inner join because we want all of the matching rows from both tables.

In [None]:
# -- Joining departments and dept_manager tables
# SELECT departments.dept_name,
#      dept_manager.emp_no,
#      dept_manager.from_date,
#      dept_manager.to_date
# FROM departments
# INNER JOIN dept_manager
# ON departments.dept_no = dept_manager.dept_no;

In [None]:
# The SELECT statement selects only the columns we want to view from each table.

# The FROM clause points to the first table to be joined, Departments (Table 1).

# INNER JOIN points to the second table to be joined, dept_manager (Table 2).

# ON departments.dept_no = dept_manager.dept_no; indicates where Postgres should look for matches.

In [None]:
# As Bobby is working through this first join, he realizes that he overlooked something important: start and end dates. 

In [None]:
# Use Left Join to Capture retirement_info Table

# We'll need to help Bobby recreate this list. 
# Think about what we need to have a fully accurate retirement_info table:

# Employee number
# Employee name (first and last)
# Whether the person is presently employed with PH

# Which tables have this information? Our current retirement_info is already filtered to list only the employees born 
# and hired within the correct time frame. The dept_emp table has the last bit we need. 
# We'll need to perform a join to get this information into one spot. Let's get started.



In [None]:
# First, we start with the SELECT statement.
    
# -- Joining retirement_info and dept_emp tables
# SELECT retirement_info.emp_no,
#     retirement_info.first_name,
#     retirement_info.last_name,
#     dept_emp.to_date

In [None]:
# Next, we assign the left table with FROM.

# FROM retirement_info


In [None]:
# Then we specify the join we'll use. This time, use a LEFT JOIN to include every row of the first table 
# (retirement_info). 

# This also tells Postgres which table is second, or on the right side (dept_emp).

# LEFT JOIN dept_emp


In [None]:
# Now we need to tell Postgres where the two tables are linked with the ON clause.

# ON retirement_info.emp_no = dept_emp.emp_no;

In [None]:
# Full Code Using Left Join to Capture retirement_info Table

# -- Joining retirement_info and dept_emp tables (Left Join)
# SELECT retirement_info.emp_no,
#     retirement_info.first_name,
#     retirement_info.last_name,
#     dept_emp.to_date
# FROM retirement_info
# LEFT JOIN dept_emp
# ON retirement_info.emp_no = dept_emp.emp_no;

In [None]:
# Run the code to see what the data looks like. The "Data Output" tab should contain each of the four columns 
# specified earlier: emp_no (employee number), first_name, last_name, and to_date. 
# Data from two different tables have been successfully merged into one!
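
# A minimal sketch of the same left join run against hypothetical sample rows, assuming Python's built-in sqlite3 
# module in place of Postgres and pgAdmin. The names and dates are made up, and dept_emp is reduced to three 
# columns. It shows two things worth checking in the real output: an employee with several dept_emp rows appears 
# once per row, and an employee with no dept_emp row gets a NULL (None) to_date.

```python
import sqlite3

# Hypothetical sample data, not the real retirement_info or dept_emp rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE retirement_info (
        emp_no INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT
    );
    CREATE TABLE dept_emp (emp_no INTEGER, dept_no TEXT, to_date TEXT);
    INSERT INTO retirement_info VALUES
        (10001, 'Ana', 'Ruiz'), (10002, 'Ben', 'Cho'), (10003, 'Cy', 'Diaz');
    INSERT INTO dept_emp VALUES
        (10001, 'd001', '1995-01-01'),
        (10001, 'd002', '2001-06-30'),
        (10002, 'd001', '1999-12-31');
""")

cursor = conn.execute("""
    SELECT retirement_info.emp_no,
        retirement_info.first_name,
        retirement_info.last_name,
        dept_emp.to_date
    FROM retirement_info
    LEFT JOIN dept_emp
        ON retirement_info.emp_no = dept_emp.emp_no
    ORDER BY retirement_info.emp_no, dept_emp.to_date;
""")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()

print(columns)
for row in rows:
    print(row)
conn.close()
```

# Employee 10001 appears twice (one row per department assignment) and employee 10003 is kept with no to_date.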
    