# Many to Many Relationships

Think about a Human Resources database for a large company. It has to keep track of:

* Departments
* Employees
* Titles
* Managers
* Salaries

The start of a data model

![](manymany.png)

## Evolving a Many to Many

![](intersectionentity.png)

## From Model to Table

```
employees=# \d departments
          Table "public.departments"
  Column   |         Type          | Modifiers
-----------+-----------------------+-----------
 dept_no   | character(4)          | not null
 dept_name | character varying(40) | not null


employees=# \d dept_emp
       Table "public.dept_emp"
  Column   |     Type     | Modifiers
-----------+--------------+-----------
 emp_no    | integer      | not null
 dept_no   | character(4) | not null
 from_date | date         | not null
 to_date   | date         | not null


employees=# \d employees
            Table "public.employees"
   Column   |         Type          | Modifiers
------------+-----------------------+-----------
 emp_no     | integer               | not null
 birth_date | date                  | not null
 first_name | character varying(14) | not null
 last_name  | character varying(16) | not null
 gender     | character varying(1)  | not null
 hire_date  | date                  | not null

```
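The listings above show how the intersection entity becomes a table of its own: `dept_emp` holds one key from each side of the relationship plus the dates of the assignment. Here is a minimal sketch of that shape using Python's built-in `sqlite3` module and an in-memory database. This is an assumption made for illustration, since the notebook itself talks to PostgreSQL. The primary and foreign key constraints are also assumptions, because the `\d` output above does not show them, and the `employees` table is trimmed to three columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

# Assumed constraints: the \d listings above only show NOT NULL.
conn.executescript("""
CREATE TABLE departments (
    dept_no   TEXT PRIMARY KEY,
    dept_name TEXT NOT NULL
);
CREATE TABLE employees (
    emp_no     INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL
);
-- The intersection table: one row per (employee, department) assignment.
CREATE TABLE dept_emp (
    emp_no    INTEGER NOT NULL REFERENCES employees(emp_no),
    dept_no   TEXT    NOT NULL REFERENCES departments(dept_no),
    from_date TEXT    NOT NULL,
    to_date   TEXT    NOT NULL,
    PRIMARY KEY (emp_no, dept_no)
);
""")

conn.executemany("INSERT INTO departments VALUES (?, ?)", [
    ("d001", "Marketing"), ("d005", "Development"), ("d007", "Sales"),
])
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", [
    (10001, "Georgi", "Facello"),
    (10008, "Saniya", "Kalloufi"),
    (12505, "Barton", "Goldhammer"),
])
conn.executemany("INSERT INTO dept_emp VALUES (?, ?, ?, ?)", [
    (10001, "d005", "1986-06-26", "9999-01-01"),
    (10008, "d005", "1998-03-11", "2000-07-31"),
    (12505, "d001", "1999-08-24", "2000-08-23"),
    (12505, "d007", "2000-08-23", "2002-02-18"),
])

# One employee, many departments ...
depts_of_12505 = [row[0] for row in conn.execute(
    "SELECT dept_name FROM departments NATURAL JOIN dept_emp "
    "WHERE emp_no = ? ORDER BY dept_name", (12505,))]

# ... and one department, many employees.
people_in_d005 = [row[0] for row in conn.execute(
    "SELECT last_name FROM employees NATURAL JOIN dept_emp "
    "WHERE dept_no = ? ORDER BY last_name", ("d005",))]

print(depts_of_12505)
print(people_in_d005)
```

Neither `departments` nor `employees` refers to the other directly. Both directions of the relationship are answered by going through `dept_emp`.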


In [62]:
import warnings
warnings.filterwarnings('ignore')

In [63]:
%load_ext sql


The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [2]:
%sql postgresql://millbr02:@localhost/employees

'Connected: millbr02@employees'

In [4]:
%%sql

select * from departments natural join dept_emp where dept_name = 'Marketing' limit 10;

10 rows affected.


dept_no,dept_name,emp_no,from_date,to_date
d001,Marketing,10017,1993-08-03,9999-01-01
d001,Marketing,10055,1992-04-27,1995-07-22
d001,Marketing,10058,1988-04-25,9999-01-01
d001,Marketing,10108,1999-12-06,2001-10-20
d001,Marketing,10140,1991-03-14,9999-01-01
d001,Marketing,10175,1988-09-24,1995-05-24
d001,Marketing,10208,1995-02-05,1999-05-15
d001,Marketing,10228,1993-01-28,9999-01-01
d001,Marketing,10239,1996-05-04,9999-01-01
d001,Marketing,10259,1987-07-25,1994-08-15


OK, that is mildly interesting, but we can ask a better question: how many employees currently work for the Marketing department?



In [6]:
%%sql

select count(*) from departments natural join dept_emp where dept_name = 'Marketing' and now() < to_date;

1 rows affected.


count
14842


**Note:** the ``now()`` function is just a convenience that saves us from manually entering the current date.
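The "currently works here" test relies on open-ended assignments being stored with a far-future `to_date` of 9999-01-01, as the Marketing rows shown earlier suggest. The sketch below replays five of those rows in an in-memory SQLite database. Using SQLite is an assumption for illustration, so `date('now')` stands in for PostgreSQL's `now()`, and dates are stored as ISO text, which compares correctly as strings. The hard-coded date `2024-01-01` is an arbitrary stand-in for a date typed by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dept_emp (emp_no INTEGER, dept_no TEXT, from_date TEXT, to_date TEXT)"
)
conn.executemany("INSERT INTO dept_emp VALUES (?, ?, ?, ?)", [
    (10017, "d001", "1993-08-03", "9999-01-01"),
    (10055, "d001", "1992-04-27", "1995-07-22"),
    (10058, "d001", "1988-04-25", "9999-01-01"),
    (10108, "d001", "1999-12-06", "2001-10-20"),
    (10140, "d001", "1991-03-14", "9999-01-01"),
])

# Let the database supply today's date.
current = conn.execute(
    "SELECT count(*) FROM dept_emp "
    "WHERE dept_no = 'd001' AND date('now') < to_date"
).fetchone()[0]

# The same question with a manually entered date (arbitrary example value).
hard_coded = conn.execute(
    "SELECT count(*) FROM dept_emp WHERE dept_no = 'd001' AND ? < to_date",
    ("2024-01-01",),
).fetchone()[0]

print(current, hard_coded)
```

Both counts agree for these rows. The hard-coded version would drift out of date over time, while the function-based one does not.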

In [10]:
%%sql

select dept_name, count(*) from departments natural join dept_emp group by dept_name order by count;

9 rows affected.


dept_name,count
Finance,17346
Human Resources,17786
Quality Management,20117
Marketing,20211
Research,21126
Customer Service,23580
Sales,52245
Production,73485
Development,85707


In [15]:
%%sql 

select * from departments natural join dept_emp natural join employees limit 10;


10 rows affected.


emp_no,dept_no,dept_name,from_date,to_date,birth_date,first_name,last_name,gender,hire_date
10001,d005,Development,1986-06-26,9999-01-01,1953-09-02,Georgi,Facello,M,1986-06-26
10002,d007,Sales,1996-08-03,9999-01-01,1964-06-02,Bezalel,Simmel,F,1985-11-21
10003,d004,Production,1995-12-03,9999-01-01,1959-12-03,Parto,Bamford,M,1986-08-28
10004,d004,Production,1986-12-01,9999-01-01,1954-05-01,Chirstian,Koblick,M,1986-12-01
10005,d003,Human Resources,1989-09-12,9999-01-01,1955-01-21,Kyoichi,Maliniak,M,1989-09-12
10006,d005,Development,1990-08-05,9999-01-01,1953-04-20,Anneke,Preusig,F,1989-06-02
10007,d008,Research,1989-02-10,9999-01-01,1957-05-23,Tzvetan,Zielinski,F,1989-02-10
10008,d005,Development,1998-03-11,2000-07-31,1958-02-19,Saniya,Kalloufi,M,1994-09-15
10009,d006,Quality Management,1985-02-18,9999-01-01,1952-04-19,Sumant,Peac,F,1985-02-18
10010,d004,Production,1996-11-24,2000-06-26,1963-06-01,Duangkaew,Piveteau,F,1989-08-24


In [22]:
%%sql

select dept_name, last_name 
from departments natural join dept_emp natural join employees 
where first_name = 'Timothy';

0 rows affected.


dept_name,last_name


In [24]:
%%sql

select dept_name, first_name, last_name
from departments natural join dept_manager natural join employees 
where now() < to_date;

9 rows affected.


dept_name,first_name,last_name
Marketing,Vishwani,Minakawa
Finance,Isamu,Legleitner
Human Resources,Karsten,Sigstam
Production,Oscar,Ghazalie
Development,Leon,DasSarma
Quality Management,Dung,Pesch
Sales,Hauke,Zhang
Research,Hilary,Kambil
Customer Service,Yuchang,Weedman


## The Final LDS

![](employeedb.png)

In [64]:
%sql select distinct title from titles

7 rows affected.


title
Technique Leader
Senior Engineer
Staff
Assistant Engineer
Engineer
Senior Staff
Manager


In [58]:
%%sql

select * from departments natural join dept_emp natural join employees natural join salaries limit 10;




10 rows affected.


emp_no,from_date,to_date,dept_no,dept_name,birth_date,first_name,last_name,gender,hire_date,salary
286829,1994-03-14,1995-01-11,d001,Marketing,1954-11-10,Kazuhisa,Vecchio,F,1986-03-17,48149
476388,1995-09-21,1995-11-28,d001,Marketing,1960-07-01,Elvis,Kroll,M,1991-10-13,65007
210440,1996-09-18,1996-11-10,d001,Marketing,1964-07-11,Chandrasekaran,Vernadat,M,1994-06-10,49760
465625,1996-10-04,1997-07-07,d001,Marketing,1961-10-02,Claudi,Piveteau,M,1996-10-04,49259
19196,2000-01-28,2000-05-12,d001,Marketing,1956-08-10,Leucio,Sury,M,1988-09-03,66890
259294,1991-02-07,1991-05-17,d001,Marketing,1962-03-13,Demos,Peyn,F,1988-01-26,66971
80329,1999-08-06,1999-09-05,d001,Marketing,1959-09-09,Moon,Ponthieu,F,1988-11-07,40000
268194,1994-05-27,1994-06-29,d001,Marketing,1960-02-12,Gila,Aamodt,F,1994-04-06,70796
294469,1995-08-17,1996-05-22,d001,Marketing,1960-07-18,Tetsushi,Biran,M,1987-06-21,68907
12505,1999-08-24,2000-08-23,d001,Marketing,1952-08-24,Barton,Goldhammer,M,1991-04-20,40423


## Question

The above query seems OK, but there is a problem.  Can you spot what it is?

We know that our friend Barton Goldhammer worked here for more than a year, so let's reuse the above query but limit it to him. We want to snoop on our friend's salary history...

In [61]:
%%sql

select * from departments natural join dept_emp natural join employees natural join salaries
where employees.emp_no = 12505
 limit 10;




1 rows affected.


emp_no,from_date,to_date,dept_no,dept_name,birth_date,first_name,last_name,gender,hire_date,salary
12505,1999-08-24,2000-08-23,d001,Marketing,1952-08-24,Barton,Goldhammer,M,1991-04-20,40423


That's even odder: only one row, even though we know he worked in both Marketing and Sales and was here until at least 2002.

Think about the `natural join` again...

## JOIN ON

In [65]:
%%sql

select * from departments natural join dept_emp natural join employees join salaries on employees.emp_no = salaries.emp_no 
where employees.emp_no = 12505



6 rows affected.


emp_no,dept_no,dept_name,from_date,to_date,birth_date,first_name,last_name,gender,hire_date,emp_no_1,salary,from_date_1,to_date_1
12505,d001,Marketing,1999-08-24,2000-08-23,1952-08-24,Barton,Goldhammer,M,1991-04-20,12505,40423,1999-08-24,2000-08-23
12505,d001,Marketing,1999-08-24,2000-08-23,1952-08-24,Barton,Goldhammer,M,1991-04-20,12505,43758,2000-08-23,2001-08-23
12505,d001,Marketing,1999-08-24,2000-08-23,1952-08-24,Barton,Goldhammer,M,1991-04-20,12505,43942,2001-08-23,2002-02-18
12505,d007,Sales,2000-08-23,2002-02-18,1952-08-24,Barton,Goldhammer,M,1991-04-20,12505,40423,1999-08-24,2000-08-23
12505,d007,Sales,2000-08-23,2002-02-18,1952-08-24,Barton,Goldhammer,M,1991-04-20,12505,43758,2000-08-23,2001-08-23
12505,d007,Sales,2000-08-23,2002-02-18,1952-08-24,Barton,Goldhammer,M,1991-04-20,12505,43942,2001-08-23,2002-02-18
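To see why the two joins disagree, here is a small sketch that replays Barton's rows in an in-memory SQLite database (an assumption for illustration; the notebook uses PostgreSQL, but `NATURAL JOIN` matches on every shared column name in both). `dept_emp` and `salaries` share `emp_no`, `from_date`, and `to_date`, so the natural join demands that all three be equal. The third query is one possible refinement, not something from the notebook: it pairs each salary with the department whose period overlaps it, assuming the periods are half-open (`from_date` inclusive, `to_date` exclusive).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept_emp (emp_no INTEGER, dept_no TEXT, from_date TEXT, to_date TEXT);
CREATE TABLE salaries (emp_no INTEGER, salary INTEGER, from_date TEXT, to_date TEXT);
""")
conn.executemany("INSERT INTO dept_emp VALUES (?, ?, ?, ?)", [
    (12505, "d001", "1999-08-24", "2000-08-23"),
    (12505, "d007", "2000-08-23", "2002-02-18"),
])
conn.executemany("INSERT INTO salaries VALUES (?, ?, ?, ?)", [
    (12505, 40423, "1999-08-24", "2000-08-23"),
    (12505, 43758, "2000-08-23", "2001-08-23"),
    (12505, 43942, "2001-08-23", "2002-02-18"),
])

# NATURAL JOIN: emp_no, from_date AND to_date must all match.
natural_rows = conn.execute(
    "SELECT dept_no, salary FROM dept_emp NATURAL JOIN salaries"
).fetchall()

# JOIN ... ON emp_no only: every department row meets every salary row.
on_rows = conn.execute(
    "SELECT d.dept_no, s.salary "
    "FROM dept_emp AS d JOIN salaries AS s ON d.emp_no = s.emp_no"
).fetchall()

# Assumed refinement: keep only pairs whose date ranges overlap.
overlap_rows = conn.execute("""
    SELECT d.dept_no, s.salary
    FROM dept_emp AS d
    JOIN salaries AS s
      ON d.emp_no = s.emp_no
     AND s.from_date < d.to_date
     AND d.from_date < s.to_date
    ORDER BY s.from_date
""").fetchall()

print(len(natural_rows), len(on_rows), len(overlap_rows))
```

The natural join keeps one row, the `ON emp_no` join produces all six department and salary combinations, and the overlap condition leaves three rows, one per salary period.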


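What is the average salary of the Marketing department? The query below answers that for every department at once. Be aware that it still joins `salaries` with `natural join`, so it has the same problem we just found: a salary row is counted only when its `from_date` and `to_date` exactly match a `dept_emp` row.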


In [29]:
%%sql

select dept_name, min(salary), avg(salary), max(salary)
from  departments natural join dept_emp natural join employees natural join salaries
group by dept_name

9 rows affected.


dept_name,min,avg,max
Customer Service,40000,43684.34782608696,91638
Development,40000,49085.105339105336,100285
Finance,40000,61479.84470588235,105985
Human Resources,40000,43069.74278846154,73900
Marketing,40000,59373.38571428571,100242
Production,40000,49057.929073856976,98370
Quality Management,40000,46026.19718309859,79334
Research,40000,49130.98181818182,81790
Sales,40000,69663.25889967638,112513
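The choice of join changes which salary rows feed these aggregates. The sketch below uses Barton's rows in an in-memory SQLite database (an assumption for illustration, as is the date-overlap condition, which treats periods as half-open) and compares the per-department average under `NATURAL JOIN` with the average under an explicit join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (dept_no TEXT, dept_name TEXT);
CREATE TABLE dept_emp (emp_no INTEGER, dept_no TEXT, from_date TEXT, to_date TEXT);
CREATE TABLE salaries (emp_no INTEGER, salary INTEGER, from_date TEXT, to_date TEXT);
""")
conn.executemany("INSERT INTO departments VALUES (?, ?)", [
    ("d001", "Marketing"), ("d007", "Sales"),
])
conn.executemany("INSERT INTO dept_emp VALUES (?, ?, ?, ?)", [
    (12505, "d001", "1999-08-24", "2000-08-23"),
    (12505, "d007", "2000-08-23", "2002-02-18"),
])
conn.executemany("INSERT INTO salaries VALUES (?, ?, ?, ?)", [
    (12505, 40423, "1999-08-24", "2000-08-23"),
    (12505, 43758, "2000-08-23", "2001-08-23"),
    (12505, 43942, "2001-08-23", "2002-02-18"),
])

# NATURAL JOIN silently drops salary rows whose dates differ from dept_emp.
natural_avg = conn.execute("""
    SELECT dept_name, avg(salary)
    FROM departments NATURAL JOIN dept_emp NATURAL JOIN salaries
    GROUP BY dept_name
    ORDER BY dept_name
""").fetchall()

# Explicit join on emp_no plus an assumed date-overlap condition.
overlap_avg = conn.execute("""
    SELECT dp.dept_name, avg(s.salary)
    FROM departments AS dp
    JOIN dept_emp AS de ON dp.dept_no = de.dept_no
    JOIN salaries AS s
      ON s.emp_no = de.emp_no
     AND s.from_date < de.to_date
     AND de.from_date < s.to_date
    GROUP BY dp.dept_name
    ORDER BY dp.dept_name
""").fetchall()

print(natural_avg)
print(overlap_avg)
```

With these rows the natural join reports only Marketing, because neither Sales salary period lines up exactly with the Sales assignment. The explicit join includes both departments.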


In [41]:
%%sql



80 rows affected.


first_name,last_name,hire_date
Przemyslawa,Kaelbling,1985-01-01
Adil,Furedi,1985-02-01
Shir,Munck,1985-02-01
Yongdong,Pileggi,1985-02-01
Guenter,Tanemo,1985-02-02
Erzsebet,Schwartzbauer,1985-02-02
Kousuke,Swist,1985-02-02
Chaosheng,Sommen,1985-02-02
Holgard,Pena,1985-02-02
Limsoon,Macedo,1985-02-02


In [44]:
%sql select last_name, count(*) from employees group by last_name order by count(*)  limit 10;

10 rows affected.


last_name,count
Sadowsky,145
Merro,147
Zykh,148
Georgatos,148
Guardalben,148
Rosar,150
Zambonelli,151
Gonthier,151
Nollmann,151
Dulli,151
