# SQL - Conquering Relational Division

You’ve probably encountered relational division before, even if you’re unfamiliar with the term. This lab covers the challenging aspects of relational division and three techniques for solving it, highlighting the pros and cons of each. By the end, you will have a better understanding of how SQL works, be able to recognize relational division problems, and know how to choose and implement a suitable solution.

## Setup

In [1]:
%run ./02-connect.ipynb

In [2]:
%%sql
DROP DATABASE IF EXISTS hr;
CREATE DATABASE hr;

In [2]:
%%sql
USE hr;

In [3]:
%%sql

CREATE TABLE candidates (
    candidate VARCHAR(30) NOT NULL PRIMARY KEY
);

CREATE TABLE roles (role VARCHAR(30) NOT NULL PRIMARY KEY);

CREATE TABLE skillcategories (
    category VARCHAR(30) NOT NULL PRIMARY KEY
);

CREATE TABLE skills (
    skill VARCHAR(30) NOT NULL PRIMARY KEY,
    category VARCHAR(30) NOT NULL REFERENCES skillcategories(category)
);

CREATE TABLE candidateskills (
    candidate VARCHAR(30) NOT NULL REFERENCES candidates(candidate),
    skill VARCHAR(30) NOT NULL REFERENCES skills(skill),
    PRIMARY KEY (candidate, skill)
);

CREATE TABLE roleskills (
    role VARCHAR(30) NOT NULL REFERENCES roles(role),
    skill VARCHAR(30) NOT NULL REFERENCES skills(skill),
    PRIMARY KEY (role, skill)
);

In [None]:
%%sql

INSERT INTO candidates (candidate)
VALUES ('natasha'),
    ('chen'),
    ('praveena'),
    ('kelly'),
    ('darrin');
    
INSERT INTO roles (role)
VALUES ('db architect'),
    ('front end developer'),
    ('office manager');
    
INSERT INTO skillcategories (category)
VALUES ('professional'),
    ('personal');
    
INSERT INTO skills (skill, category)
VALUES ('sql', 'professional'),
    ('db design', 'professional'),
    ('c#', 'professional'),
    ('python', 'professional'),
    ('java', 'professional'),
    ('office', 'professional'),
    ('team player', 'personal'),
    ('leader', 'personal'),
    ('passionate', 'personal');
    
INSERT INTO roleskills (role, skill)
VALUES ('db architect', 'sql'),
    ('db architect', 'db design'),
    ('db architect', 'python'),
    ('db architect', 'team player'),
    ('db architect', 'passionate'),
    ('front end developer', 'java'),
    ('front end developer', 'c#'),
    ('front end developer', 'team player'),
    ('front end developer', 'passionate'),
    ('office manager', 'passionate'),
    ('office manager', 'team player'),
    ('office manager', 'office');

INSERT INTO candidateskills (candidate, skill)
VALUES ('natasha', 'sql'),
    ('natasha', 'db design'),
    ('natasha', 'team player'),
    ('natasha', 'passionate'),
    ('chen', 'sql'),
    ('chen', 'db design'),
    ('chen', 'python'),
    ('chen', 'team player'),
    ('chen', 'passionate'),
    ('praveena', 'java'),
    ('praveena', 'c#'),
    ('praveena', 'team player'),
    ('praveena', 'passionate'),
    ('praveena', 'python'),
    ('kelly', 'passionate'),
    ('kelly', 'leader'),
    ('darrin', 'sql'),
    ('darrin', 'db design'),
    ('darrin', 'c#'),
    ('darrin', 'python'),
    ('darrin', 'java'),
    ('darrin', 'office'),
    ('darrin', 'team player'),
    ('darrin', 'leader'),
    ('darrin', 'passionate');

In [5]:
%sql SELECT * FROM candidates

Unnamed: 0,candidate
0,chen
1,darrin
2,kelly
3,natasha
4,praveena


In [6]:
%sql SELECT * FROM roles

Unnamed: 0,role
0,db architect
1,front end developer
2,office manager


In [7]:
%sql SELECT * FROM skillcategories

Unnamed: 0,category
0,personal
1,professional


In [8]:
%sql SELECT * FROM skills

Unnamed: 0,skill,category
0,c#,professional
1,db design,professional
2,java,professional
3,leader,personal
4,office,professional
5,passionate,personal
6,python,professional
7,sql,professional
8,team player,personal


In [9]:
%sql SELECT * FROM roleskills

Unnamed: 0,role,skill
0,db architect,db design
1,db architect,passionate
2,db architect,python
3,db architect,sql
4,db architect,team player
5,front end developer,c#
6,front end developer,java
7,front end developer,passionate
8,front end developer,team player
9,office manager,office


In [13]:
%sql SELECT * FROM candidateskills

Unnamed: 0,candidate,skill
0,chen,db design
1,chen,passionate
2,chen,python
3,chen,sql
4,chen,team player
5,darrin,c#
6,darrin,db design
7,darrin,java
8,darrin,leader
9,darrin,office


## The Aggregations

### Relational Division using Aggregation

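Q: Does Praveena have all skills? Start by counting the skills she has; she has them all only if this count equals the number of rows in `skills` (9).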

In [3]:
%%sql

SELECT COUNT(*) AS praveenaskills
FROM candidateskills AS cs
WHERE cs.candidate = 'praveena'
GROUP BY cs.candidate;

Unnamed: 0,praveenaskills
0,5
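
The count of 5 answers the question only once it is compared with the total number of skills. The following is a minimal sketch of that comparison. It assumes an in-memory SQLite database (through Python's `sqlite3` module) in place of the lab's `hr` database, and it recreates only the rows the comparison needs; the variable names are illustrative.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the lab's hr database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE skills (skill VARCHAR(30) NOT NULL PRIMARY KEY);
CREATE TABLE candidateskills (
    candidate VARCHAR(30) NOT NULL,
    skill VARCHAR(30) NOT NULL REFERENCES skills(skill),
    PRIMARY KEY (candidate, skill)
);
INSERT INTO skills (skill) VALUES
    ('sql'), ('db design'), ('c#'), ('python'), ('java'),
    ('office'), ('team player'), ('leader'), ('passionate');
-- Only Praveena's rows from the lab data are needed here.
INSERT INTO candidateskills (candidate, skill) VALUES
    ('praveena', 'java'), ('praveena', 'c#'), ('praveena', 'team player'),
    ('praveena', 'passionate'), ('praveena', 'python');
""")

# Dividend size (Praveena's skills) next to divisor size (all skills).
praveena_skills, total_skills = conn.execute("""
    SELECT
        (SELECT COUNT(*) FROM candidateskills WHERE candidate = 'praveena'),
        (SELECT COUNT(*) FROM skills)
""").fetchone()

has_all_skills = praveena_skills == total_skills
print(praveena_skills, total_skills, has_all_skills)  # 5 9 False
```

Because `(candidate, skill)` is the primary key, a candidate cannot list the same skill twice, so equal counts would imply she holds every skill.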


Q: Candidates that fit the DB Architect role

In [8]:
%%sql

SELECT CS.Candidate
FROM candidateskills AS CS
WHERE CS.Skill IN (
        SELECT RS.Skill
        FROM roleskills AS RS
        WHERE RS.Role = 'db architect' -- role names are stored in lowercase
    )
GROUP BY CS.Candidate
HAVING COUNT(*) = (
        SELECT COUNT(*)
        FROM roleskills AS RS1
        WHERE RS1.Role = 'db architect'
    );

Unnamed: 0,Candidate
0,chen
1,darrin
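
To see why comparing counts works, it can help to look at the matched-skill count for every candidate before the `HAVING` filter is applied. The sketch below assumes an in-memory SQLite database (through Python's `sqlite3` module) rather than the lab's `hr` database, and recreates only the two tables the query reads; the Python variable names are illustrative.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the lab's hr database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE roleskills (
    role VARCHAR(30) NOT NULL,
    skill VARCHAR(30) NOT NULL,
    PRIMARY KEY (role, skill)
);
CREATE TABLE candidateskills (
    candidate VARCHAR(30) NOT NULL,
    skill VARCHAR(30) NOT NULL,
    PRIMARY KEY (candidate, skill)
);
""")

# Same rows as the lab data, restricted to the db architect role.
required = ["sql", "db design", "python", "team player", "passionate"]
conn.executemany(
    "INSERT INTO roleskills (role, skill) VALUES ('db architect', ?)",
    [(skill,) for skill in required],
)
candidate_skills = {
    "natasha": ["sql", "db design", "team player", "passionate"],
    "chen": ["sql", "db design", "python", "team player", "passionate"],
    "praveena": ["java", "c#", "team player", "passionate", "python"],
    "kelly": ["passionate", "leader"],
    "darrin": ["sql", "db design", "c#", "python", "java", "office",
               "team player", "leader", "passionate"],
}
conn.executemany(
    "INSERT INTO candidateskills (candidate, skill) VALUES (?, ?)",
    [(c, s) for c, skills in candidate_skills.items() for s in skills],
)

# Count, per candidate, only the skills that the role requires.
matched = conn.execute("""
    SELECT cs.candidate, COUNT(*) AS matched
    FROM candidateskills AS cs
    WHERE cs.skill IN (
            SELECT rs.skill
            FROM roleskills AS rs
            WHERE rs.role = 'db architect'
        )
    GROUP BY cs.candidate
    ORDER BY cs.candidate
""").fetchall()

required_count = conn.execute(
    "SELECT COUNT(*) FROM roleskills WHERE role = 'db architect'"
).fetchone()[0]

# A candidate fits when every required skill was matched.
qualified = [candidate for candidate, n in matched if n == required_count]
print(matched)
print(required_count, qualified)
```

With the lab data, Natasha matches 4 of the 5 required skills (she lacks `python`), so she is filtered out, while Chen and Darrin both match all 5.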


Q: Use a JOIN instead of IN

In [9]:
%%sql

WITH dbarchitectskills AS (
    SELECT rs.skill
    FROM roleskills AS rs
    WHERE rs.role = 'db architect'
)
SELECT cs.candidate
FROM candidateskills AS cs
    INNER JOIN dbarchitectskills AS dbs ON dbs.skill = cs.skill
GROUP BY cs.candidate
HAVING COUNT(*) = (
        SELECT COUNT(*)
        FROM dbarchitectskills
    );

Unnamed: 0,candidate
0,chen
1,darrin


### Exact Division Challenge

Modify the query above so that dividends with remainders are eliminated. Put plainly, Darrin should not appear in the result: he has more skills than the role requires. #EliminateDarrin

Hints:
- An exact relational division is just a standard relational division, with an additional constraint sprinkled on top.
- There was a good reason that I used a JOIN instead of an IN predicate.

In [10]:
%%sql

WITH dbarchitectskills AS (
    SELECT rs.skill
    FROM roleskills AS rs
    WHERE rs.role = 'db architect'
)
SELECT cs.candidate
FROM candidateskills AS cs
    LEFT OUTER JOIN dbarchitectskills AS dbs ON dbs.skill = cs.skill
GROUP BY cs.candidate
HAVING COUNT(dbs.skill) = (
        SELECT COUNT(*)
        FROM dbarchitectskills
    )
    AND COUNT(*) = COUNT(dbs.skill);

Unnamed: 0,candidate
0,chen
