# Many to Many Relationships

Think about a Human Resources database for a large company. It would need to track:

* Departments
* Employees
* Titles
* Managers
* Salaries



The start of a data model:

![](manymany.png)

## Evolving a Many to Many

![](intersectionentity.png)

## From Model to Table

```
employees=# \d departments
          Table "public.departments"
  Column   |         Type          | Modifiers
-----------+-----------------------+-----------
 dept_no   | character(4)          | not null
 dept_name | character varying(40) | not null


employees=# \d dept_emp
       Table "public.dept_emp"
  Column   |     Type     | Modifiers
-----------+--------------+-----------
 emp_no    | integer      | not null
 dept_no   | character(4) | not null
 from_date | date         | not null
 to_date   | date         | not null


employees=# \d employees
            Table "public.employees"
   Column   |         Type          | Modifiers
------------+-----------------------+-----------
 emp_no     | integer               | not null
 birth_date | date                  | not null
 first_name | character varying(14) | not null
 last_name  | character varying(16) | not null
 gender     | character varying(1)  | not null
 hire_date  | date                  | not null

```
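
To see how the intersection table turns one many-to-many relationship into two one-to-many relationships, here is a minimal sketch using Python's built-in `sqlite3` module rather than the PostgreSQL database used in this notebook. The primary and foreign key constraints, the trimmed-down `employees` columns, and the sample people are assumptions made for illustration; the `\d` output above does not show keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table departments (
    dept_no   char(4)     primary key,
    dept_name varchar(40) not null
);
create table employees (
    emp_no     integer     primary key,
    first_name varchar(14) not null,
    last_name  varchar(16) not null
);
-- Intersection table: one row per (employee, department) assignment.
-- The key constraints here are assumed, not taken from the real schema.
create table dept_emp (
    emp_no    integer not null references employees (emp_no),
    dept_no   char(4) not null references departments (dept_no),
    from_date date    not null,
    to_date   date    not null,
    primary key (emp_no, dept_no)
);
insert into departments values ('d001', 'Marketing'), ('d007', 'Sales');
-- Hypothetical employees, not rows from the real data set.
insert into employees values (1, 'Ann', 'Example'), (2, 'Bob', 'Sample');
insert into dept_emp values
    (1, 'd001', '1999-08-24', '2000-08-23'),
    (1, 'd007', '2000-08-23', '9999-01-01'),
    (2, 'd007', '1995-01-01', '9999-01-01');
""")

# Walk from employees through the intersection table to departments.
rows = conn.execute("""
    select first_name, dept_name
    from employees
    join dept_emp on employees.emp_no = dept_emp.emp_no
    join departments on dept_emp.dept_no = departments.dept_no
    order by first_name, dept_name
""").fetchall()
print(rows)
```

Ann appears twice because she has two rows in `dept_emp`, and Sales appears twice because two employees point at it. Neither `employees` nor `departments` had to repeat any data to make that possible.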


In [62]:
import warnings
warnings.filterwarnings('ignore')

In [63]:
%load_ext sql


The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [2]:
%sql postgresql://millbr02:@localhost/employees

'Connected: millbr02@employees'

In [93]:
%%sql

select dept_name, count(*) 
from departments natural join dept_emp 
where  now() < to_date
group by dept_name
order by count

9 rows affected.


dept_name,count
Finance,12437
Human Resources,12898
Quality Management,14546
Marketing,14842
Research,15441
Customer Service,17569
Sales,37701
Production,53304
Development,61386


OK, that is mildly interesting, but we can ask something more specific, such as: how many employees currently work for the Marketing department?



In [78]:
%%sql

select dept_name, count(*) 
from departments natural join dept_emp 
where now() < to_date
group by dept_name

9 rows affected.


dept_name,count
Research,15441
Sales,37701
Customer Service,17569
Marketing,14842
Human Resources,12898
Production,53304
Quality Management,14546
Development,61386
Finance,12437


1. **Note:** the ``now()`` function is just a convenience that saves us from manually entering the current date.
2. **Note:** we can use aggregates without a `group by`, realizing that the aggregate then applies to the whole result.
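
Both notes can be tried on a toy data set. The sketch below uses `sqlite3` with a handful of made-up `dept_emp` rows, and because SQLite has no `now()` it uses `date('now')` as a stand-in. The first query has no `group by`, so `count(*)` collapses the whole filtered result into a single number; the second gives one count per department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table departments (dept_no text, dept_name text);
create table dept_emp (emp_no integer, dept_no text, from_date text, to_date text);
insert into departments values ('d001', 'Marketing'), ('d007', 'Sales');
-- Hypothetical assignments; '9999-01-01' marks a current one, as in the real data.
insert into dept_emp values
    (1, 'd001', '1999-08-24', '2000-08-23'),
    (1, 'd007', '2000-08-23', '9999-01-01'),
    (2, 'd007', '1995-01-01', '9999-01-01'),
    (3, 'd001', '1990-01-01', '9999-01-01');
""")

# Aggregate without group by: one number for the whole filtered result.
marketing_now = conn.execute("""
    select count(*)
    from departments natural join dept_emp
    where dept_name = 'Marketing' and date('now') < to_date
""").fetchone()[0]

# Aggregate with group by: one number per department.
per_dept = conn.execute("""
    select dept_name, count(*)
    from departments natural join dept_emp
    where date('now') < to_date
    group by dept_name
    order by dept_name
""").fetchall()

print(marketing_now)  # 1
print(per_dept)       # [('Marketing', 1), ('Sales', 2)]
```

The date comparison works on plain text here only because the dates are stored in ISO `YYYY-MM-DD` form, which sorts the same way as the dates themselves.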

In [80]:
%%sql



2 rows affected.


first_name,last_name,dept_name
Margareta,Markovitch,Marketing
Vishwani,Minakawa,Marketing


In [96]:
%%sql 

select first_name, last_name, from_date, to_date
from departments natural join dept_manager natural join employees 
where dept_name = 'Sales'



2 rows affected.


first_name,last_name,from_date,to_date
Przemyslawa,Kaelbling,1985-01-01,1991-03-07
Hauke,Zhang,1991-03-07,9999-01-01


## The Final LDS

![](employeedb.png)

In [24]:
%%sql

select dept_name, first_name, last_name
from departments natural join dept_manager natural join employees 
where now() < to_date;

9 rows affected.


dept_name,first_name,last_name
Marketing,Vishwani,Minakawa
Finance,Isamu,Legleitner
Human Resources,Karsten,Sigstam
Production,Oscar,Ghazalie
Development,Leon,DasSarma
Quality Management,Dung,Pesch
Sales,Hauke,Zhang
Research,Hilary,Kambil
Customer Service,Yuchang,Weedman


In [64]:
%sql select distinct title from titles

7 rows affected.


title
Technique Leader
Senior Engineer
Staff
Assistant Engineer
Engineer
Senior Staff
Manager


In [58]:
%%sql

select * 
from departments natural join dept_emp 
natural join employees natural join salaries limit 10;




10 rows affected.


emp_no,from_date,to_date,dept_no,dept_name,birth_date,first_name,last_name,gender,hire_date,salary
286829,1994-03-14,1995-01-11,d001,Marketing,1954-11-10,Kazuhisa,Vecchio,F,1986-03-17,48149
476388,1995-09-21,1995-11-28,d001,Marketing,1960-07-01,Elvis,Kroll,M,1991-10-13,65007
210440,1996-09-18,1996-11-10,d001,Marketing,1964-07-11,Chandrasekaran,Vernadat,M,1994-06-10,49760
465625,1996-10-04,1997-07-07,d001,Marketing,1961-10-02,Claudi,Piveteau,M,1996-10-04,49259
19196,2000-01-28,2000-05-12,d001,Marketing,1956-08-10,Leucio,Sury,M,1988-09-03,66890
259294,1991-02-07,1991-05-17,d001,Marketing,1962-03-13,Demos,Peyn,F,1988-01-26,66971
80329,1999-08-06,1999-09-05,d001,Marketing,1959-09-09,Moon,Ponthieu,F,1988-11-07,40000
268194,1994-05-27,1994-06-29,d001,Marketing,1960-02-12,Gila,Aamodt,F,1994-04-06,70796
294469,1995-08-17,1996-05-22,d001,Marketing,1960-07-18,Tetsushi,Biran,M,1987-06-21,68907
12505,1999-08-24,2000-08-23,d001,Marketing,1952-08-24,Barton,Goldhammer,M,1991-04-20,40423


## Question

The above query seems OK, but there is a problem. Can you spot what it is?

We know that our friend Barton Goldhammer worked here for more than a year, so let's use the above query but limit it to him. We want to snoop on our friend's salary history...

In [97]:
%%sql

select * 
from departments natural join dept_emp 
natural join employees natural join salaries
where employees.emp_no = 12505





1 rows affected.


emp_no,from_date,to_date,dept_no,dept_name,birth_date,first_name,last_name,gender,hire_date,salary
12505,1999-08-24,2000-08-23,d001,Marketing,1952-08-24,Barton,Goldhammer,M,1991-04-20,40423


That's even odder. Only one row, yet we know he worked in both Marketing and Sales and was here until at least 2002.

Think about the `natural join` again...
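
One way to check what a `natural join` will match on is to list the column names the two tables share. The sketch below does this in `sqlite3` with empty copies of `dept_emp` and `salaries`; the helper names `column_names` and `shared_columns` are made up for this example, and on PostgreSQL you would look at `\d` output or the catalog instead of SQLite's `pragma table_info`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table dept_emp (emp_no integer, dept_no text, from_date text, to_date text);
create table salaries (emp_no integer, salary integer, from_date text, to_date text);
""")

def column_names(conn, table):
    # pragma table_info returns one row per column; index 1 is the column name.
    # The table name is interpolated directly, so only pass trusted names.
    return [row[1] for row in conn.execute(f"pragma table_info({table})")]

def shared_columns(conn, left, right):
    # Columns with the same name in both tables, in the left table's order.
    right_cols = set(column_names(conn, right))
    return [c for c in column_names(conn, left) if c in right_cols]

shared = shared_columns(conn, "dept_emp", "salaries")
print(shared)  # ['emp_no', 'from_date', 'to_date']
```

A `natural join` requires all of those columns to be equal, so a salary row only survives if its date range is exactly the same as a department assignment's date range.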

## JOIN ON

In [99]:
%%sql

select * 
from departments natural join dept_emp natural join employees 
join salaries on employees.emp_no = salaries.emp_no 
where employees.emp_no = 12505
order by salaries.from_date;



6 rows affected.


emp_no,dept_no,dept_name,from_date,to_date,birth_date,first_name,last_name,gender,hire_date,emp_no_1,salary,from_date_1,to_date_1
12505,d001,Marketing,1999-08-24,2000-08-23,1952-08-24,Barton,Goldhammer,M,1991-04-20,12505,40423,1999-08-24,2000-08-23
12505,d007,Sales,2000-08-23,2002-02-18,1952-08-24,Barton,Goldhammer,M,1991-04-20,12505,40423,1999-08-24,2000-08-23
12505,d001,Marketing,1999-08-24,2000-08-23,1952-08-24,Barton,Goldhammer,M,1991-04-20,12505,43758,2000-08-23,2001-08-23
12505,d007,Sales,2000-08-23,2002-02-18,1952-08-24,Barton,Goldhammer,M,1991-04-20,12505,43758,2000-08-23,2001-08-23
12505,d001,Marketing,1999-08-24,2000-08-23,1952-08-24,Barton,Goldhammer,M,1991-04-20,12505,43942,2001-08-23,2002-02-18
12505,d007,Sales,2000-08-23,2002-02-18,1952-08-24,Barton,Goldhammer,M,1991-04-20,12505,43942,2001-08-23,2002-02-18


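What is the average salary of the Marketing department? The query below reports the minimum, average, and maximum salary for every department. Note that it still uses `natural join` with `salaries`, so it is subject to the same date-matching restriction we just saw.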


In [29]:
%%sql

select dept_name, min(salary), avg(salary), max(salary)
from  departments natural join dept_emp natural join employees natural join salaries
group by dept_name

9 rows affected.


dept_name,min,avg,max
Customer Service,40000,43684.34782608696,91638
Development,40000,49085.105339105336,100285
Finance,40000,61479.84470588235,105985
Human Resources,40000,43069.74278846154,73900
Marketing,40000,59373.38571428571,100242
Production,40000,49057.929073856976,98370
Quality Management,40000,46026.19718309859,79334
Research,40000,49130.98181818182,81790
Sales,40000,69663.25889967638,112513


In [41]:
%%sql



80 rows affected.


first_name,last_name,hire_date
Przemyslawa,Kaelbling,1985-01-01
Adil,Furedi,1985-02-01
Shir,Munck,1985-02-01
Yongdong,Pileggi,1985-02-01
Guenter,Tanemo,1985-02-02
Erzsebet,Schwartzbauer,1985-02-02
Kousuke,Swist,1985-02-02
Chaosheng,Sommen,1985-02-02
Holgard,Pena,1985-02-02
Limsoon,Macedo,1985-02-02


In [44]:
%sql select last_name, count(*) from employees group by last_name order by count(*)  limit 10;

10 rows affected.


last_name,count
Sadowsky,145
Merro,147
Zykh,148
Georgatos,148
Guardalben,148
Rosar,150
Zambonelli,151
Gonthier,151
Nollmann,151
Dulli,151


![](nj_v_j.png)

```
select * from
dept_emp natural join salaries
```

```
emp_no  dept_no from to salary
1       2       X    Y  10
```


```
select * from
dept_emp join salaries on dept_emp.emp_no = salaries.emp_no
```



```
emp_no   dept_no  from_1  to_1 from_2 to_2 salary
1        2        X       Y    X      Y    10
1        3        Y       X    X      Y    10
2        1        X       Y    Y      Z    12

```
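
The same comparison can be run end to end on a tiny data set. This sketch uses `sqlite3` and only Barton Goldhammer's `dept_emp` and `salaries` rows as they appear in the query output earlier in this notebook; everything else about the setup (an in-memory database, text columns for dates) is an assumption made to keep it self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table dept_emp (emp_no integer, dept_no text, from_date text, to_date text);
create table salaries (emp_no integer, salary integer, from_date text, to_date text);
insert into dept_emp values
    (12505, 'd001', '1999-08-24', '2000-08-23'),
    (12505, 'd007', '2000-08-23', '2002-02-18');
insert into salaries values
    (12505, 40423, '1999-08-24', '2000-08-23'),
    (12505, 43758, '2000-08-23', '2001-08-23'),
    (12505, 43942, '2001-08-23', '2002-02-18');
""")

# natural join matches on emp_no AND from_date AND to_date.
natural = conn.execute(
    "select * from dept_emp natural join salaries"
).fetchall()

# join ... on matches on emp_no only.
joined = conn.execute("""
    select * from dept_emp
    join salaries on dept_emp.emp_no = salaries.emp_no
""").fetchall()

print(len(natural))  # 1
print(len(joined))   # 6
```

Only one salary period happens to share both dates with a department assignment, so the `natural join` keeps a single row. The `join ... on` version pairs each of the two assignments with each of the three salary rows, giving the six rows we saw above.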



Example queries:

* find the salary history for Holgard Pena
* find the job title history for Holgard Pena
* current job title counts per department
* average salary by job title, based on current salary and current title
* who has held the most titles?
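
As a hedged sketch of the last question, the query below counts title rows per employee and sorts by that count. It runs on `sqlite3` with toy data: Holgard Pena's two titles are taken from the output in this notebook, while the second employee is hypothetical. Against the real database the same `select` should work, but the result here says nothing about who actually holds the most titles there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table employees (emp_no integer, first_name text, last_name text);
create table titles (emp_no integer, title text, from_date text, to_date text);
insert into employees values
    (241707, 'Holgard', 'Pena'),
    (1, 'Ann', 'Example');          -- hypothetical employee
insert into titles values
    (241707, 'Staff', '1990-04-01', '1996-03-31'),
    (241707, 'Senior Staff', '1996-03-31', '9999-01-01'),
    (1, 'Engineer', '1995-01-01', '9999-01-01');
""")

# Join on emp_no explicitly, group per employee, and sort by title count.
most_titles = conn.execute("""
    select first_name, last_name, count(*) as n_titles
    from employees join titles on employees.emp_no = titles.emp_no
    group by employees.emp_no, first_name, last_name
    order by n_titles desc, employees.emp_no
    limit 1
""").fetchall()
print(most_titles)  # [('Holgard', 'Pena', 2)]
```

Grouping by `emp_no` as well as the name matters because, as the last-name counts above show, names are far from unique. Keep in mind that `limit 1` hides ties; raising the limit or comparing against the maximum count would reveal them.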


In [136]:
%%sql

select  emp_no, to_char(salary,'$999,999.00'), from_date, to_date 
from employees natural join salaries
where first_name = 'Holgard' and last_name='Pena';

13 rows affected.


emp_no,to_char,from_date,to_date
241707,"$ 83,852.00",1990-04-01,1991-04-01
241707,"$ 84,691.00",1991-04-01,1992-03-31
241707,"$ 88,281.00",1992-03-31,1993-03-31
241707,"$ 89,637.00",1993-03-31,1994-03-31
241707,"$ 92,069.00",1994-03-31,1995-03-31
241707,"$ 93,627.00",1995-03-31,1996-03-30
241707,"$ 93,471.00",1996-03-30,1997-03-30
241707,"$ 95,055.00",1997-03-30,1998-03-30
241707,"$ 97,593.00",1998-03-30,1999-03-30
241707,"$ 101,019.00",1999-03-30,2000-03-29


In [108]:
%%sql

select title, from_date, to_date
from employees natural join titles
where first_name = 'Holgard' and last_name = 'Pena';

2 rows affected.


title,from_date,to_date
Senior Staff,1996-03-31,9999-01-01
Staff,1990-04-01,1996-03-31


In [137]:
%%sql

select dept_name, title, count(*)
from departments natural join dept_emp  
   join titles on dept_emp.emp_no = titles.emp_no
where now() < titles.to_date
group by dept_name, title
order by dept_name, title

45 rows affected.


dept_name,title,count
Customer Service,Assistant Engineer,298
Customer Service,Engineer,2362
Customer Service,Manager,4
Customer Service,Senior Engineer,2027
Customer Service,Senior Staff,13925
Customer Service,Staff,16150
Customer Service,Technique Leader,309
Development,Assistant Engineer,7769
Development,Engineer,58135
Development,Manager,2


In [139]:
%%sql

select title, to_char(avg(salary), '999,999.99') AvgSalary
from titles join salaries on titles.emp_no = salaries.emp_no
where now() < salaries.to_date and now() < titles.to_date
group by title
order by avgsalary desc

7 rows affected.


title,avgsalary
Senior Staff,80706.5
Manager,77723.67
Senior Engineer,70823.44
Technique Leader,67506.59
Staff,67330.67
Engineer,59602.74
Assistant Engineer,57317.57


In [126]:
%%sql

select title, avg(salary)
from titles join salaries on titles.emp_no = salaries.emp_no
where now() < salaries.to_date
group by title

7 rows affected.


title,avg
Technique Leader,67507.98233684385
Senior Engineer,70823.40193386236
Staff,77513.73401860919
Assistant Engineer,67433.36252799204
Senior Staff,80705.98552986066
Engineer,67941.02044417664
Manager,79546.25
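
The two averages differ because the second query only requires the salary to be current, not the title. A small `sqlite3` sketch with one hypothetical employee shows the effect; the salary figures are made up and `date('now')` stands in for PostgreSQL's `now()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table titles (emp_no integer, title text, from_date text, to_date text);
create table salaries (emp_no integer, salary integer, from_date text, to_date text);
-- One hypothetical employee, promoted from Staff to Senior Staff.
insert into titles values
    (1, 'Staff', '1990-04-01', '1996-03-31'),
    (1, 'Senior Staff', '1996-03-31', '9999-01-01');
insert into salaries values
    (1, 60000, '1990-04-01', '1996-03-31'),
    (1, 90000, '1996-03-31', '9999-01-01');
""")

# Only the salary must be current: the old title still pairs with today's pay.
salary_only = conn.execute("""
    select title, avg(salary)
    from titles join salaries on titles.emp_no = salaries.emp_no
    where date('now') < salaries.to_date
    group by title order by title
""").fetchall()

# Both the salary and the title must be current.
both_current = conn.execute("""
    select title, avg(salary)
    from titles join salaries on titles.emp_no = salaries.emp_no
    where date('now') < salaries.to_date and date('now') < titles.to_date
    group by title order by title
""").fetchall()

print(salary_only)   # [('Senior Staff', 90000.0), ('Staff', 90000.0)]
print(both_current)  # [('Senior Staff', 90000.0)]
```

In the first result the employee's current salary is credited to a title they no longer hold, which is one reason the junior titles look better paid in the salary-only query above.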


In [129]:
%%sql

select *
from titles join salaries on titles.emp_no = salaries.emp_no
where now() < salaries.to_date and now() < titles.to_date 
limit 20;

20 rows affected.


emp_no,title,from_date,to_date,emp_no_1,salary,from_date_1,to_date_1
95972,Technique Leader,1991-07-20,9999-01-01,95972,65293,2002-07-17,9999-01-01
95973,Senior Engineer,2000-08-20,9999-01-01,95973,62266,2001-08-18,9999-01-01
95974,Engineer,2001-01-21,9999-01-01,95974,53334,2002-01-20,9999-01-01
95976,Senior Staff,1998-08-09,9999-01-01,95976,41528,2001-08-08,9999-01-01
95977,Engineer,1998-11-27,9999-01-01,95977,55132,2001-11-26,9999-01-01
95978,Senior Staff,1994-04-07,9999-01-01,95978,77320,2002-04-03,9999-01-01
95979,Staff,1995-09-15,9999-01-01,95979,54743,2001-09-13,9999-01-01
95980,Senior Engineer,1999-01-02,9999-01-01,95980,58392,2001-12-30,9999-01-01
95981,Senior Engineer,1999-10-07,9999-01-01,95981,60292,2001-10-04,9999-01-01
95982,Senior Staff,2001-11-18,9999-01-01,95982,74959,2001-11-17,9999-01-01
