# JOIN Operations



## Introduction

The practice problems in this lab provide hands-on experience with the different kinds of join operations using the HR database.

**IMPORTANT:** This lab assumes that you have already created and populated the tables below in an HR database schema in your local MySQL database. If you have not done this yet, please open these links to **[create](https://github.com/ttadesusi/IBM-Data-Science-Professional-Certification/blob/master/5.%20Databases%20and%20SQL%20for%20Data%20Science/Script_Create_Tables.sql)** and **[insert/load](https://github.com/ttadesusi/IBM-Data-Science-Professional-Certification/blob/master/5.%20Databases%20and%20SQL%20for%20Data%20Science/PETSALE-CREATE.sql)** the data.

<center>
    <img src="https://drive.google.com/uc?id=1cZkI3_UOQ64vGDzolfJumoT-IKjMTMrj" width="800" alt="Sample HR Database" />
</center>

## Connection to MySQL Database

In [1]:
%load_ext sql

In [2]:
import os 

from dotenv import load_dotenv
load_dotenv() 

myuser = os.environ.get('mysql_username')      # e.g. 'root'
mypassword= os.environ.get('mysql_password')   # e.g. 'sample-password' 

connection_url = 'mysql://{user}:{password}@localhost/hr_database'.format(user=myuser,password=mypassword)

%sql {connection_url}
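
If the password contains characters such as `@` or `/`, a URL built with plain string formatting can be misparsed. The following is a minimal sketch, assuming the connection string is handled by SQLAlchemy (which expects such characters to be percent-encoded). `build_connection_url` is a hypothetical helper introduced here for illustration; it is not part of the lab.

```python
from urllib.parse import quote_plus


def build_connection_url(user, password, host="localhost", database="hr_database"):
    """Return a mysql:// URL with the credentials percent-encoded.

    Hypothetical helper: the host and database defaults mirror the values
    used in the cell above.
    """
    return "mysql://{user}:{password}@{host}/{database}".format(
        user=quote_plus(user),          # encode special characters in the user name
        password=quote_plus(password),  # encode '@', '/', ':' and similar
        host=host,
        database=database,
    )


# Example with a made-up password that contains '@' and '/'
print(build_connection_url("root", "p@ss/word"))
# prints: mysql://root:p%40ss%2Fword@localhost/hr_database
```

The returned string could then be passed to `%sql` in place of `connection_url`.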

In [3]:
%%sql

show full tables

 * mysql://root:***@localhost/hr_database
5 rows affected.


Tables_in_hr_database,Table_type
departments,BASE TABLE
employees,BASE TABLE
job_history,BASE TABLE
jobs,BASE TABLE
locations,BASE TABLE


In [4]:
%%sql

select 
    *
from
    DEPARTMENTS;

 * mysql://root:***@localhost/hr_database
3 rows affected.


DEPT_ID,DEPT_NAME,MANAGER_ID,LOCT_ID
2,Architect Group,30001,L0001
5,Software Group,30002,L0002
7,Design Team,30003,L0003


In [5]:
%%sql

select 
    *
from
    EMPLOYEES;

 * mysql://root:***@localhost/hr_database
10 rows affected.


EMPL_ID,F_NAME,L_NAME,SSN,B_DATE,SEX,ADDRESS,JOBS_ID,SALARY,MANAGER_ID,DEPT_ID
E1001,John,Thomas,123456,1976-09-01,M,"5631 Rice, OakPark,IL",100,100000.0,30001,2
E1002,Alice,James,123457,1972-07-31,F,"980 Berry ln, Elgin,IL",200,80000.0,30002,5
E1003,Steve,Wells,123458,1980-10-08,M,"291 Springs, Gary,IL",300,50000.0,30002,5
E1004,Santosh,Kumar,123459,1985-07-20,M,"511 Aurora Av, Aurora,IL",400,60000.0,30004,5
E1005,Ahmed,Hussain,123410,1981-04-01,M,"216 Oak Tree, Geneva,IL",500,70000.0,30001,2
E1006,Nancy,Allen,123411,1978-06-02,F,"111 Green Pl, Elgin,IL",600,90000.0,30001,2
E1007,Mary,Thomas,123412,1975-05-05,F,"100 Rose Pl, Gary,IL",650,65000.0,30003,7
E1008,Bharath,Gupta,123413,1985-06-05,M,"145 Berry Ln, Naperville,IL",660,65000.0,30003,7
E1009,Andrea,Jones,123414,1990-09-07,F,"120 Fall Creek, Gary,IL",234,70000.0,30003,7
E1010,Ann,Jacob,123415,1982-03-30,F,"111 Britany Springs,Elgin,IL",220,70000.0,30004,5


In [6]:
%%sql

select 
    * 
from 
    JOB_HISTORY;

 * mysql://root:***@localhost/hr_database
10 rows affected.


EMPL_ID,START_DATE,JOBS_ID,DEPT_ID
E1001,2000-01-08,100,2
E1002,2001-01-08,200,5
E1003,2001-08-16,300,5
E1004,2000-08-16,400,5
E1005,2000-05-30,500,2
E1006,2001-08-16,600,2
E1007,2002-05-30,650,7
E1008,2010-06-05,660,7
E1009,2016-08-16,234,7
E1010,2016-08-16,220,5


In [7]:
%%sql

select 
    * 
from 
    JOBS;

 * mysql://root:***@localhost/hr_database
10 rows affected.


JOBS_ID,JOBS_TITLE,MIN_SALARY,MAX_SALARY
100,Sr. Architect,60000.0,100000.0
200,Sr. Software Developer,60000.0,80000.0
220,Sr. Designer,70000.0,90000.0
234,Sr. Designer,70000.0,90000.0
300,Jr.Software Developer,40000.0,60000.0
400,Jr.Software Developer,40000.0,60000.0
500,Jr. Architect,50000.0,70000.0
600,Lead Architect,70000.0,100000.0
650,Jr. Designer,60000.0,70000.0
660,Jr. Designer,60000.0,70000.0


In [8]:
%%sql

select 
    * 
from 
    LOCATIONS;

 * mysql://root:***@localhost/hr_database
3 rows affected.


LOCT_ID,DEPT_ID
L0001,2
L0002,5
L0003,7


## Queries

#### Query 1A: Select the names and job start dates of all employees who work for department number 5.

**Using Subqueries:** The outer query matches each employee to their `job_history` row to get the start date, and the subquery on `job_history` restricts the result set to employees recorded in department 5.

In [9]:
%%sql

select 
    F_NAME, L_NAME, START_DATE
from
    EMPLOYEES E,
    JOB_HISTORY JH
where
    E.EMPL_ID = JH.EMPL_ID
    and E.EMPL_ID in (select
                          EMPL_ID
                      from
                          JOB_HISTORY
                      where
                          DEPT_ID = 5)
order by
    START_DATE, F_NAME;

 * mysql://root:***@localhost/hr_database
4 rows affected.


F_NAME,L_NAME,START_DATE
Santosh,Kumar,2000-08-16
Alice,James,2001-01-08
Steve,Wells,2001-08-16
Ann,Jacob,2016-08-16


**Using Implicit Join:** Joins the rows of both tables (`employees` and `job_history`) by listing them in the `from` clause and matching them in the `where` clause, without using the join operator explicitly.

In [10]:
%%sql

select 
    F_NAME, L_NAME, START_DATE
from
    EMPLOYEES E,
    JOB_HISTORY JH
where
    E.EMPL_ID = JH.EMPL_ID and E.DEPT_ID = 5
order by 
    START_DATE , F_NAME;

 * mysql://root:***@localhost/hr_database
4 rows affected.


F_NAME,L_NAME,START_DATE
Santosh,Kumar,2000-08-16
Alice,James,2001-01-08
Steve,Wells,2001-08-16
Ann,Jacob,2016-08-16


**Using Inner Join:** Displays only the rows of both tables (`job_history` and `employees`) that match the criteria specified in the query.

In [11]:
%%sql

select 
    F_NAME, L_NAME, START_DATE
from
    EMPLOYEES E
        inner join
    JOB_HISTORY JH on E.EMPL_ID = JH.EMPL_ID
where 
    E.DEPT_ID = 5
order by 
    START_DATE, F_NAME;

 * mysql://root:***@localhost/hr_database
4 rows affected.


F_NAME,L_NAME,START_DATE
Santosh,Kumar,2000-08-16
Alice,James,2001-01-08
Steve,Wells,2001-08-16
Ann,Jacob,2016-08-16


**Using Left Outer Join:** Displays all the rows in the left table (`employees`) and combines them with the rows in the right table (`job_history`) that match the criteria specified in the query.

In [12]:
%%sql

select 
    F_NAME, L_NAME, START_DATE
from
    EMPLOYEES E
        left join
    JOB_HISTORY JH on E.EMPL_ID = JH.EMPL_ID
where 
    E.DEPT_ID = 5
order by 
    START_DATE, F_NAME;

 * mysql://root:***@localhost/hr_database
4 rows affected.


F_NAME,L_NAME,START_DATE
Santosh,Kumar,2000-08-16
Alice,James,2001-01-08
Steve,Wells,2001-08-16
Ann,Jacob,2016-08-16


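**Using Right Outer Join:** Displays all the rows in the right table (`job_history`) and combines them with the rows in the left table (`employees`) that match the criteria specified in the query. In this query the `where` clause filters on `E.DEPT_ID`, so any `job_history` row without a matching employee would still be dropped.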

In [13]:
%%sql

select 
    F_NAME, L_NAME, START_DATE
from
    EMPLOYEES E
        right join
    JOB_HISTORY JH on E.EMPL_ID = JH.EMPL_ID
where 
    E.DEPT_ID = 5
order by 
    START_DATE, F_NAME;

 * mysql://root:***@localhost/hr_database
4 rows affected.


F_NAME,L_NAME,START_DATE
Santosh,Kumar,2000-08-16
Alice,James,2001-01-08
Steve,Wells,2001-08-16
Ann,Jacob,2016-08-16


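**Using Full Outer Join:** Displays all the rows from both tables (`employees` and `job_history`). MySQL has no `FULL OUTER JOIN` operator, so the query below emulates it as the `union` of a left join and a right join.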

In [14]:
%%sql

select 
    F_NAME, L_NAME, START_DATE
from
    EMPLOYEES E
        left join
    JOB_HISTORY JH on E.EMPL_ID = JH.EMPL_ID
where 
    E.DEPT_ID = 5
    
union

select 
    F_NAME, L_NAME, START_DATE
from
    EMPLOYEES E
        right join
    JOB_HISTORY JH on E.EMPL_ID = JH.EMPL_ID
where 
    E.DEPT_ID = 5
    
order by 
    START_DATE, F_NAME;

 * mysql://root:***@localhost/hr_database
4 rows affected.


F_NAME,L_NAME,START_DATE
Santosh,Kumar,2000-08-16
Alice,James,2001-01-08
Steve,Wells,2001-08-16
Ann,Jacob,2016-08-16


***The above queries with different types of join operations give the same result because every employee in department 5 has exactly one matching row in `job_history`. Unless a specific join type is requested, any of these join operations can be used to retrieve this information from the data set.***
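
The identical results depend on the data: the join types diverge as soon as one table has a row with no match in the other. The sketch below illustrates this with an in-memory SQLite database (a stand-in for MySQL, chosen only so the example is self-contained). The employee `E9999` is hypothetical and has no `job_history` row.

```python
import sqlite3

# In-memory SQLite stand-in for the HR schema (only the columns needed here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table EMPLOYEES (EMPL_ID text, F_NAME text, DEPT_ID integer);
    create table JOB_HISTORY (EMPL_ID text, START_DATE text);
    -- E9999 is a hypothetical employee with no job history row
    insert into EMPLOYEES values
        ('E1002', 'Alice', 5), ('E1003', 'Steve', 5), ('E9999', 'Newhire', 5);
    insert into JOB_HISTORY values
        ('E1002', '2001-01-08'), ('E1003', '2001-08-16');
""")

query = """
    select F_NAME, START_DATE
    from EMPLOYEES E {join_type} join JOB_HISTORY JH on E.EMPL_ID = JH.EMPL_ID
    where E.DEPT_ID = 5
    order by F_NAME
"""

# Inner join: the unmatched employee disappears.
inner_rows = conn.execute(query.format(join_type="inner")).fetchall()
# Left join: the unmatched employee is kept, with NULL for START_DATE.
left_rows = conn.execute(query.format(join_type="left")).fetchall()

print(inner_rows)  # [('Alice', '2001-01-08'), ('Steve', '2001-08-16')]
print(left_rows)   # [('Alice', '2001-01-08'), ('Newhire', None), ('Steve', '2001-08-16')]
```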

#### Query 1B: Select the names, job start dates, and job titles of all employees who work for department number 5.

**Using Implicit Join:**

In [15]:
%%sql

select
    F_NAME, L_NAME, JOBS_TITLE
from 
    EMPLOYEES E,
    JOBS J
where
    E.JOBS_ID = J.JOBS_ID
    and E.DEPT_ID = 5
order by
    JOBS_TITLE, F_NAME;

 * mysql://root:***@localhost/hr_database
4 rows affected.


F_NAME,L_NAME,JOBS_TITLE
Santosh,Kumar,Jr.Software Developer
Steve,Wells,Jr.Software Developer
Ann,Jacob,Sr. Designer
Alice,James,Sr. Software Developer


**Using Inner Join:**

In [16]:
%%sql

select
    F_NAME, L_NAME, JOBS_TITLE
from 
    EMPLOYEES E
        inner join
    JOBS J on E.JOBS_ID = J.JOBS_ID
where
    E.DEPT_ID = 5
order by
    JOBS_TITLE, F_NAME;

 * mysql://root:***@localhost/hr_database
4 rows affected.


F_NAME,L_NAME,JOBS_TITLE
Santosh,Kumar,Jr.Software Developer
Steve,Wells,Jr.Software Developer
Ann,Jacob,Sr. Designer
Alice,James,Sr. Software Developer
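
The two queries above return names and job titles, but the start date named in the task is stored in `job_history`. One way to include it is to join a third table. The following is a sketch under stated assumptions: it runs against an in-memory SQLite database rather than MySQL, and loads only a subset of the rows and columns shown in the tables earlier in this lab.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table EMPLOYEES (EMPL_ID text, F_NAME text, L_NAME text,
                            JOBS_ID integer, DEPT_ID integer);
    create table JOB_HISTORY (EMPL_ID text, START_DATE text);
    create table JOBS (JOBS_ID integer, JOBS_TITLE text);
    -- Subset of the lab data: the four department 5 employees plus one other
    insert into EMPLOYEES values
        ('E1001', 'John', 'Thomas', 100, 2),
        ('E1002', 'Alice', 'James', 200, 5),
        ('E1003', 'Steve', 'Wells', 300, 5),
        ('E1004', 'Santosh', 'Kumar', 400, 5),
        ('E1010', 'Ann', 'Jacob', 220, 5);
    insert into JOB_HISTORY values
        ('E1001', '2000-01-08'), ('E1002', '2001-01-08'), ('E1003', '2001-08-16'),
        ('E1004', '2000-08-16'), ('E1010', '2016-08-16');
    insert into JOBS values
        (100, 'Sr. Architect'), (200, 'Sr. Software Developer'),
        (220, 'Sr. Designer'), (300, 'Jr.Software Developer'),
        (400, 'Jr.Software Developer');
""")

# Chain two inner joins: employees -> job_history (start date)
# and employees -> jobs (title).
rows = conn.execute("""
    select F_NAME, L_NAME, START_DATE, JOBS_TITLE
    from EMPLOYEES E
        inner join JOB_HISTORY JH on E.EMPL_ID = JH.EMPL_ID
        inner join JOBS J on E.JOBS_ID = J.JOBS_ID
    where E.DEPT_ID = 5
    order by JOBS_TITLE, F_NAME
""").fetchall()

for row in rows:
    print(row)
# ('Santosh', 'Kumar', '2000-08-16', 'Jr.Software Developer')
# ('Steve', 'Wells', '2001-08-16', 'Jr.Software Developer')
# ('Ann', 'Jacob', '2016-08-16', 'Sr. Designer')
# ('Alice', 'James', '2001-01-08', 'Sr. Software Developer')
```

The same `select` statement should run unchanged in a `%%sql` cell against the MySQL schema, since it uses only standard inner joins.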


#### Query 2A: Perform a Left Outer Join on the EMPLOYEES and DEPARTMENTS tables and select employee id, last name, department id and department name for all employees.

In [17]:
%%sql

select 
    EMPL_ID, L_NAME, E.DEPT_ID, DEPT_NAME
from
    EMPLOYEES E
        left join
    DEPARTMENTS D on E.DEPT_ID = D.DEPT_ID
order by
    DEPT_NAME, L_NAME;

 * mysql://root:***@localhost/hr_database
10 rows affected.


EMPL_ID,L_NAME,DEPT_ID,DEPT_NAME
E1006,Allen,2,Architect Group
E1005,Hussain,2,Architect Group
E1001,Thomas,2,Architect Group
E1008,Gupta,7,Design Team
E1009,Jones,7,Design Team
E1007,Thomas,7,Design Team
E1010,Jacob,5,Software Group
E1002,James,5,Software Group
E1004,Kumar,5,Software Group
E1003,Wells,5,Software Group


#### Query 2B: Re-write the query for 2A to limit the result set to include only the rows for employees born before 1980.

In [18]:
%%sql

select 
    EMPL_ID, L_NAME, E.DEPT_ID, DEPT_NAME
from
    EMPLOYEES E
        left join
    DEPARTMENTS D on E.DEPT_ID = D.DEPT_ID
where
    year(B_DATE) < 1980
order by
    DEPT_NAME, L_NAME;

 * mysql://root:***@localhost/hr_database
4 rows affected.


EMPL_ID,L_NAME,DEPT_ID,DEPT_NAME
E1006,Allen,2,Architect Group
E1001,Thomas,2,Architect Group
E1007,Thomas,7,Design Team
E1002,James,5,Software Group


#### Query 2C: Re-write the query for 2A to have the result set include all the employees but department names for only the employees who were born before 1980.

In [19]:
%%sql

select 
    EMPL_ID, L_NAME, E.DEPT_ID, DEPT_NAME
from
    EMPLOYEES E
        left join
    DEPARTMENTS D on E.DEPT_ID = D.DEPT_ID and year(B_DATE) < 1980
order by
    DEPT_NAME, L_NAME;

 * mysql://root:***@localhost/hr_database
10 rows affected.


EMPL_ID,L_NAME,DEPT_ID,DEPT_NAME
E1008,Gupta,7,
E1005,Hussain,2,
E1010,Jacob,5,
E1009,Jones,7,
E1004,Kumar,5,
E1003,Wells,5,
E1006,Allen,2,Architect Group
E1001,Thomas,2,Architect Group
E1007,Thomas,7,Design Team
E1002,James,5,Software Group
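
Queries 2B and 2C differ only in where the birth-date condition is placed: in the `where` clause it removes rows from the result, while in the `on` clause of a left join it only decides whether a department is matched. The sketch below reproduces that difference on three of the employee rows above, using an in-memory SQLite database as a stand-in for MySQL. Because SQLite has no `year()` function, the condition is written as a comparison on the ISO date string; that substitution is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table EMPLOYEES (EMPL_ID text, L_NAME text, B_DATE text, DEPT_ID integer);
    create table DEPARTMENTS (DEPT_ID integer, DEPT_NAME text);
    insert into EMPLOYEES values
        ('E1001', 'Thomas', '1976-09-01', 2),
        ('E1005', 'Hussain', '1981-04-01', 2),
        ('E1002', 'James', '1972-07-31', 5);
    insert into DEPARTMENTS values (2, 'Architect Group'), (5, 'Software Group');
""")

# Condition in WHERE (as in Query 2B): employees born in 1980 or later are removed.
where_rows = conn.execute("""
    select L_NAME, DEPT_NAME
    from EMPLOYEES E left join DEPARTMENTS D on E.DEPT_ID = D.DEPT_ID
    where B_DATE < '1980-01-01'
    order by L_NAME
""").fetchall()

# Condition in ON (as in Query 2C): every employee is kept, but the
# department is only matched for those born before 1980.
on_rows = conn.execute("""
    select L_NAME, DEPT_NAME
    from EMPLOYEES E left join DEPARTMENTS D
        on E.DEPT_ID = D.DEPT_ID and B_DATE < '1980-01-01'
    order by L_NAME
""").fetchall()

print(where_rows)  # [('James', 'Software Group'), ('Thomas', 'Architect Group')]
print(on_rows)     # [('Hussain', None), ('James', 'Software Group'), ('Thomas', 'Architect Group')]
```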


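#### Query 3A: Perform a Full Join on the EMPLOYEES and DEPARTMENTS tables and select the First name, Last name and Department name of all employees.

MySQL does not support `FULL JOIN`, and a bare `join` is an inner join. The query below still returns the same 10 rows a full join would, because every employee belongs to a department and every department has at least one employee.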

In [20]:
%%sql

select 
    F_NAME, L_NAME, DEPT_NAME
from
    EMPLOYEES E
        join
    DEPARTMENTS D on E.DEPT_ID = D.DEPT_ID
order by
    DEPT_NAME, F_NAME;

 * mysql://root:***@localhost/hr_database
10 rows affected.


F_NAME,L_NAME,DEPT_NAME
Ahmed,Hussain,Architect Group
John,Thomas,Architect Group
Nancy,Allen,Architect Group
Andrea,Jones,Design Team
Bharath,Gupta,Design Team
Mary,Thomas,Design Team
Alice,James,Software Group
Ann,Jacob,Software Group
Santosh,Kumar,Software Group
Steve,Wells,Software Group


#### Query 3B: Re-write Query 3A to have the result set include all employee names but department id and department names only for male employees.

**Using Left Join:**

In [21]:
%%sql

select 
    F_NAME, L_NAME, D.DEPT_ID, DEPT_NAME
from
    EMPLOYEES E
        left join
    DEPARTMENTS D on E.DEPT_ID = D.DEPT_ID and SEX = 'M'
order by
    DEPT_NAME, F_NAME;

 * mysql://root:***@localhost/hr_database
10 rows affected.


F_NAME,L_NAME,DEPT_ID,DEPT_NAME
Alice,James,,
Andrea,Jones,,
Ann,Jacob,,
Mary,Thomas,,
Nancy,Allen,,
Ahmed,Hussain,2.0,Architect Group
John,Thomas,2.0,Architect Group
Bharath,Gupta,7.0,Design Team
Santosh,Kumar,5.0,Software Group
Steve,Wells,5.0,Software Group


**Using Full Join:**

In [22]:
%%sql

select 
    F_NAME, L_NAME, D.DEPT_ID, DEPT_NAME
from
    EMPLOYEES E
        left join
    DEPARTMENTS D on E.DEPT_ID = D.DEPT_ID and SEX = 'M'
    
union

select 
    F_NAME, L_NAME, D.DEPT_ID, DEPT_NAME
from
    EMPLOYEES E
        right join
    DEPARTMENTS D on E.DEPT_ID = D.DEPT_ID and SEX = 'M'
    
order by
    DEPT_NAME, F_NAME;

 * mysql://root:***@localhost/hr_database
10 rows affected.


F_NAME,L_NAME,DEPT_ID,DEPT_NAME
Alice,James,,
Andrea,Jones,,
Ann,Jacob,,
Mary,Thomas,,
Nancy,Allen,,
Ahmed,Hussain,2.0,Architect Group
John,Thomas,2.0,Architect Group
Bharath,Gupta,7.0,Design Team
Santosh,Kumar,5.0,Software Group
Steve,Wells,5.0,Software Group
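
In the lab data the full-join emulation returns the same rows as the left join, because no department is left unmatched. The sketch below shows the case where the two differ, using an in-memory SQLite database as a stand-in for MySQL. It emulates the full join as the union of two left joins with the tables swapped, which is equivalent to the left join plus right join used above and avoids depending on `RIGHT JOIN` support in older SQLite versions. The employee `Temp` and the department `Research Lab` are hypothetical rows added only to create unmatched rows on each side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table EMPLOYEES (F_NAME text, DEPT_ID integer);
    create table DEPARTMENTS (DEPT_ID integer, DEPT_NAME text);
    -- 'Temp' has no department; 'Research Lab' has no employees (both hypothetical)
    insert into EMPLOYEES values ('John', 2), ('Temp', null);
    insert into DEPARTMENTS values (2, 'Architect Group'), (9, 'Research Lab');
""")

# Full outer join emulation: all employees with their department (if any),
# plus all departments with their employees (if any); UNION removes duplicates.
full_rows = conn.execute("""
    select F_NAME, DEPT_NAME
    from EMPLOYEES E left join DEPARTMENTS D on E.DEPT_ID = D.DEPT_ID

    union

    select F_NAME, DEPT_NAME
    from DEPARTMENTS D left join EMPLOYEES E on E.DEPT_ID = D.DEPT_ID

    order by F_NAME, DEPT_NAME
""").fetchall()

for row in full_rows:
    print(row)
# (None, 'Research Lab')        <- department with no employee
# ('John', 'Architect Group')   <- matched on both sides
# ('Temp', None)                <- employee with no department
```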


## Summary

In this lab you have learned how to work with different join operations, using SQL queries run from Python to retrieve information from a MySQL database.

## Author

[Temitope Adesusi](https://www.linkedin.com/in/ttadesusi)

## Reference

[IBM Data Science](https://www.coursera.org/professional-certificates/ibm-data-science?) 

[Script to Create Table](https://github.com/ttadesusi/IBM-Data-Science-Professional-Certification/blob/master/5.%20Databases%20and%20SQL%20for%20Data%20Science/Script_Create_Tables.sql)

[Script to Insert Data to Table](https://github.com/ttadesusi/IBM-Data-Science-Professional-Certification/blob/master/5.%20Databases%20and%20SQL%20for%20Data%20Science/PETSALE-CREATE.sql)