In [1]:
%run helper/setup_notebook.ipynb

Successfully connected to sql_lab database.


In [2]:
%%sql 

SELECT *
FROM employee;

emp_id,first_name,last_name,dept_id,manager_id,office_id
1,Sally,Jones,3,2.0,5
2,Mark,Smith,2,4.0,3
3,John,Andrews,1,4.0,3
4,Michelle,Johnson,2,,5
5,Brian,Grand,2,2.0,3


In [3]:
%%sql 

SELECT * 
FROM department;

dept_id,dept_name
1,Sales
2,IT
3,Support


##### Display each employee's name and the number of employees in their department.

In [4]:
%%sql 

-- step 1. find each department id and the number of employees in that department
SELECT dept_id, COUNT(*) as dept_count
FROM employee
GROUP BY dept_id;

dept_id,dept_count
3,1
2,3
1,1


In [5]:
%%sql 

-- step 2. run an inner join based on employee's department id 

SELECT e.first_name, 
e.last_name, 
d.dept_count
FROM employee e
INNER JOIN (
    SELECT dept_id, COUNT(*) AS dept_count
    FROM employee 
    GROUP BY dept_id 
) d
ON e.dept_id = d.dept_id

first_name,last_name,dept_count
Sally,Jones,1
Mark,Smith,3
John,Andrews,1
Michelle,Johnson,3
Brian,Grand,3


#### A ***`CTE`*** (common table expression) lets us move the subquery out of the main query and define it separately, under its own name.

In [6]:
%%sql 

WITH dept_count_table AS (
    SELECT dept_id, COUNT(*) AS dept_count
    FROM employee
    GROUP BY dept_id
)
SELECT e.first_name,
e.last_name,
dct.dept_count
FROM employee e
INNER JOIN dept_count_table dct ON e.dept_id = dct.dept_id


first_name,last_name,dept_count
Sally,Jones,1
Mark,Smith,3
John,Andrews,1
Michelle,Johnson,3
Brian,Grand,3


#### We can define multiple CTEs in one `WITH` clause, separated by commas (`,`)

In [7]:
%%sql 

WITH dept_count_table AS(
    SELECT dept_id, COUNT(*) AS dept_count
    FROM employee
    GROUP BY dept_id
),
mng_table AS(
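    -- DISTINCT keeps one row per manager and department, so the LEFT JOIN below cannot duplicate employees
    SELECT DISTINCT CONCAT(mng.first_name, ' ', mng.last_name) AS manager_full_name, 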
        mng.emp_id,
        emp.dept_id
    FROM employee mng 
    INNER JOIN employee emp ON mng.emp_id = emp.manager_id
)
SELECT CONCAT(e.first_name,' ',e.last_name) AS employee_full_name,
dct.dept_count,
mt.manager_full_name
FROM employee e
INNER JOIN dept_count_table dct ON e.dept_id = dct.dept_id
LEFT JOIN mng_table mt ON e.dept_id = mt.dept_id AND e.manager_id = mt.emp_id


employee_full_name,dept_count,manager_full_name
Sally Jones,1,Mark Smith
Mark Smith,3,Michelle Johnson
John Andrews,1,Michelle Johnson
Michelle Johnson,3,
Brian Grand,3,Mark Smith


## Key takeaways:

- CTEs can enhance the readability of a query and make it easier to understand by replacing multiple subqueries with named temporary result sets. 

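- Unlike temporary tables, CTEs don't create new database objects: a CTE exists only for the statement that defines it. Its effect on query performance depends on the database engine, which may either merge the CTE into the main query or materialize its result.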
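
As a quick check of the first takeaway, the sketch below rebuilds the `employee` rows shown earlier in an in-memory SQLite database and confirms that the subquery form and the CTE form return the same rows. Using Python's `sqlite3` module instead of the notebook's `sql_lab` connection is an assumption made to keep the example self-contained, and the helper name `make_db` is hypothetical. An `ORDER BY` is added to both queries so the comparison does not depend on row order.

```python
import sqlite3

# Sample rows copied from the employee table shown earlier in the notebook.
EMPLOYEES = [
    (1, "Sally", "Jones", 3, 2, 5),
    (2, "Mark", "Smith", 2, 4, 3),
    (3, "John", "Andrews", 1, 4, 3),
    (4, "Michelle", "Johnson", 2, None, 5),
    (5, "Brian", "Grand", 2, 2, 3),
]


def make_db():
    """Build an in-memory SQLite copy of the employee table (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, first_name TEXT, "
        "last_name TEXT, dept_id INTEGER, manager_id INTEGER, office_id INTEGER)"
    )
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?, ?, ?)", EMPLOYEES)
    return conn


# The count per department is computed in an inline subquery.
SUBQUERY_FORM = """
SELECT e.first_name, e.last_name, d.dept_count
FROM employee e
INNER JOIN (
    SELECT dept_id, COUNT(*) AS dept_count
    FROM employee
    GROUP BY dept_id
) d ON e.dept_id = d.dept_id
ORDER BY e.emp_id
"""

# The same count, moved into a named CTE.
CTE_FORM = """
WITH dept_count_table AS (
    SELECT dept_id, COUNT(*) AS dept_count
    FROM employee
    GROUP BY dept_id
)
SELECT e.first_name, e.last_name, dct.dept_count
FROM employee e
INNER JOIN dept_count_table dct ON e.dept_id = dct.dept_id
ORDER BY e.emp_id
"""

conn = make_db()
subquery_rows = conn.execute(SUBQUERY_FORM).fetchall()
cte_rows = conn.execute(CTE_FORM).fetchall()
conn.close()

# Both forms describe the same result set; only the layout of the query differs.
print(subquery_rows == cte_rows)
for row in cte_rows:
    print(row)
```

The comparison only shows that the two forms are equivalent in their results; it says nothing about how a particular engine plans or executes either one.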

# ***RECURSIVE*** Common Table Expression

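- Operates on its own result.
- Allows you to perform recursive operations in SQL: the CTE's definition refers to the CTE itself. An ***anchor*** query produces the first rows, and a ***recursive*** query is then applied repeatedly to the rows from the previous step until it returns no new rows.
- Used to work with hierarchical data such as an org chart or a file system.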
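
A small counter makes the "operates on its own result" idea concrete before turning to the employee data. This sketch is not part of the notebook's schema: the CTE name `counter` and its column `n` are made up for illustration, and it runs on an in-memory SQLite database through Python's `sqlite3` module, which is an assumption made to keep it self-contained.

```python
import sqlite3

COUNTER_SQL = """
WITH RECURSIVE counter(n) AS (
    SELECT 1                 -- anchor: the starting row
    UNION ALL
    SELECT n + 1             -- recursive step: reads the row from the previous step
    FROM counter
    WHERE n < 5              -- stop condition: no new row once n reaches 5
)
SELECT n FROM counter
ORDER BY n
"""

conn = sqlite3.connect(":memory:")
numbers = [row[0] for row in conn.execute(COUNTER_SQL)]
conn.close()
print(numbers)
```

The `WHERE n < 5` condition is what ends the recursion: once a step produces no row, there is nothing left to feed into the next step.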

In [8]:
%%sql 

-- initial state of the employee table
SELECT emp_id, first_name, manager_id, dept_id 
FROM employee

emp_id,first_name,manager_id,dept_id
1,Sally,2.0,3
2,Mark,4.0,2
3,John,4.0,1
4,Michelle,,2
5,Brian,2.0,2


In [9]:
%%sql 

-- if you want to see the details of the manager 
SELECT e.emp_id, e.first_name, e.manager_id, e.dept_id,
m.emp_id, m.first_name, m.manager_id, m.dept_id
FROM employee e 
LEFT JOIN employee m 
ON e.manager_id = m.emp_id;

emp_id,first_name,manager_id,dept_id,emp_id_1,first_name_1,manager_id_1,dept_id_1
1,Sally,2.0,3,2.0,Mark,4.0,2.0
2,Mark,4.0,2,4.0,Michelle,,2.0
3,John,4.0,1,4.0,Michelle,,2.0
4,Michelle,,2,,,,
5,Brian,2.0,2,2.0,Mark,4.0,2.0


- The issue is that, without a recursive CTE, we can see only one level of the hierarchy unless we keep adding another LEFT JOIN (or UNION ALL branch) for every extra level.
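
To illustrate the limitation, here is a sketch of the fixed-depth alternative: one extra `LEFT JOIN` per level of the hierarchy. It assumes an in-memory SQLite copy of the `employee` table reduced to the three columns the query needs; the aliases `m1` and `m2` and the output column names are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER, first_name TEXT, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        (1, "Sally", 2),
        (2, "Mark", 4),
        (3, "John", 4),
        (4, "Michelle", None),
        (5, "Brian", 2),
    ],
)

# One LEFT JOIN per level: m1 is the direct manager, m2 is the manager's manager.
# A third level would need yet another join, so the query depth is hard-coded.
TWO_LEVELS_SQL = """
SELECT e.first_name,
       m1.first_name AS manager,
       m2.first_name AS managers_manager
FROM employee e
LEFT JOIN employee m1 ON e.manager_id = m1.emp_id
LEFT JOIN employee m2 ON m1.manager_id = m2.emp_id
ORDER BY e.emp_id
"""

two_level_rows = conn.execute(TWO_LEVELS_SQL).fetchall()
conn.close()
for row in two_level_rows:
    print(row)
```

This works for the five sample rows because the hierarchy is only three levels deep. A deeper hierarchy would silently be cut off at the second manager.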

In [13]:
%%sql 

WITH RECURSIVE cteEmp(emp_id, first_name, manager_id, emplevel) AS (
    SELECT emp_id, first_name, manager_id, 1
    FROM employee 
    WHERE manager_id IS NULL
    UNION ALL
    SELECT e.emp_id, e.first_name, e.manager_id, r.emplevel+1
    FROM employee e
    INNER JOIN cteEmp r
    ON e.manager_id = r.emp_id
)
SELECT emp_id,
first_name,
manager_id,
emplevel 
FROM cteEmp
ORDER BY emplevel;

emp_id,first_name,manager_id,emplevel
4,Michelle,,1
2,Mark,4.0,2
3,John,4.0,2
1,Sally,2.0,3
5,Brian,2.0,3


In [21]:
%%sql 


WITH RECURSIVE cteEmp(emp_id, first_name, manager_id, emplevel) AS (
    SELECT emp_id, first_name, manager_id, 1
    FROM employee 
    WHERE manager_id IS NULL
    UNION ALL
    SELECT e.emp_id, e.first_name, e.manager_id, r.emplevel+1
    FROM employee e
    INNER JOIN cteEmp r
    ON e.manager_id = r.emp_id
)
SELECT emp_id,
first_name,
manager_id,
CASE 
    WHEN emplevel = 1 THEN 'CEO'
    WHEN emplevel = 2 THEN 'Managing Director'
    ELSE 'VP'
END AS title
FROM cteEmp
ORDER BY emplevel;

emp_id,first_name,manager_id,title
4,Michelle,,CEO
2,Mark,4.0,Managing Director
3,John,4.0,Managing Director
1,Sally,2.0,VP
5,Brian,2.0,VP



## ***Query Breakdown*** 

- First we define a recursive Common Table Expression (CTE) called ***cteEmp***. The CTE defines four columns: ***emp_id***, ***first_name***, ***manager_id***, and ***emplevel***.

    ```sql
        WITH RECURSIVE cteEmp (emp_id, first_name, manager_id, emplevel) AS ...
    ```


- The first part of the CTE, the anchor member, is a ***SELECT*** statement that retrieves the rows of the ***employee*** table where ***manager_id*** is ***NULL***. These rows are the top level of the employee hierarchy, and the literal ***1*** becomes their ***emplevel***.
    ```sql
        SELECT emp_id, first_name, manager_id, 1
        FROM employee 
        WHERE manager_id IS NULL
    ```

- The second part of the CTE, the recursive member, is also a ***SELECT*** statement. It joins the ***employee*** table to the rows that ***cteEmp*** produced in the previous step using an ***INNER JOIN***, so each step finds the employees who report to someone found in the step before. The ***emplevel*** column is incremented by 1 at each step, ***UNION ALL*** appends each step's rows to the result, and the recursion stops when a step returns no new rows.

    ```sql
        UNION ALL
        SELECT e.emp_id, e.first_name, e.manager_id, r.emplevel+1
        FROM employee e
        INNER JOIN cteEmp r
        ON e.manager_id = r.emp_id
    ```

- The ***SELECT*** statement after the CTE retrieves data from the ***cteEmp*** result set. A ***CASE*** expression assigns a title based on the ***emplevel*** value, and the result set is sorted by ***emplevel***. The output includes the ***emp_id***, ***first_name***, ***manager_id***, and ***title*** columns.

    ```sql
        SELECT emp_id,
        first_name,
        manager_id,
        CASE 
            WHEN emplevel = 1 THEN 'CEO'
            WHEN emplevel = 2 THEN 'Managing Director'
            ELSE 'VP'
        END AS title
        FROM cteEmp
        ORDER BY emplevel;
    ```
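
The breakdown above can also be read as a loop. The following sketch mimics that evaluation in plain Python rather than SQL, using the five sample rows reduced to `(emp_id, first_name, manager_id)`. The function names `walk_hierarchy` and `title_for` are hypothetical, and the loop is only a mental model: a real database engine may evaluate a recursive CTE differently internally.

```python
# Rows from the employee table, reduced to (emp_id, first_name, manager_id).
EMPLOYEES = [
    (1, "Sally", 2),
    (2, "Mark", 4),
    (3, "John", 4),
    (4, "Michelle", None),
    (5, "Brian", 2),
]


def walk_hierarchy(employees):
    """Mimic the anchor and recursive members of cteEmp (hypothetical helper)."""
    # Anchor member: employees without a manager start at level 1.
    current = [
        (emp_id, name, mgr, 1) for emp_id, name, mgr in employees if mgr is None
    ]
    result = list(current)
    # Recursive member: join employee to the rows found in the previous step.
    while current:
        previous_levels = {emp_id: level for emp_id, _, _, level in current}
        current = [
            (emp_id, name, mgr, previous_levels[mgr] + 1)
            for emp_id, name, mgr in employees
            if mgr in previous_levels
        ]
        result.extend(current)  # UNION ALL: append without removing duplicates
    return result


def title_for(level):
    """Same mapping as the CASE expression in the final SELECT."""
    if level == 1:
        return "CEO"
    if level == 2:
        return "Managing Director"
    return "VP"


hierarchy = walk_hierarchy(EMPLOYEES)
for emp_id, name, mgr, level in hierarchy:
    print(emp_id, name, mgr, title_for(level))
```

The loop ends when a step finds no employees reporting to the previous level, which mirrors the recursion stopping once the recursive member returns no rows. Like the SQL version, this sketch assumes the data contains no cycles (for example, two employees listed as each other's manager); with a cycle it would never terminate.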

