# Db2 Product Analytics

IBM is analysing how its employees use the Db2 database by tracking the SQL queries they execute. The objective is to generate data for a histogram showing the number of unique queries run by employees during the third quarter of 2023 (July to September). The output should also count the employees who did not run any queries during this period.

Display the number of unique queries as histogram categories, along with the count of employees who executed that number of unique queries.

# Answer

I'll work with two tables: `queries` & `employees`. The SQL here uses PostgreSQL syntax (`COPY`, `smallserial`, `date_part`). This is the `queries` table.

```
CREATE TABLE queries (
	employee_id smallint,
	query_id integer,
	query_starttime timestamp,
	execution_time smallint
);

COPY queries
FROM '/YourDirectory/queries.csv'
WITH (FORMAT CSV, HEADER);

SELECT * FROM queries;
```

<img src = "queries Table.png" width = "600" style = "margin:auto"/>

This is the `employees` table.

```
CREATE TABLE employees (
	employee_id smallserial,
	full_name text,
	gender varchar(10)
);

COPY employees
FROM '/YourDirectory/employees.csv'
WITH (FORMAT CSV, HEADER);

SELECT * FROM employees;
```

First, I'll join the tables, group by `employee_id` & count the unique queries each employee ran during the third quarter of 2023. The `WHERE` clause also keeps rows where `query_id` is NULL, so employees with no queries at all stay in the result.

```
WITH emp_queries_q3_2023
AS (
	SELECT employees.employee_id,
		   queries.query_id
	FROM queries
	FULL OUTER JOIN employees
		ON queries.employee_id = employees.employee_id
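	-- Keep employees with no matching query (query_id is NULL after the join),
	-- plus any query that started in July, August, or September 2023
	WHERE queries.query_id IS NULL
		OR (date_part('year', queries.query_starttime) = 2023
			AND date_part('month', queries.query_starttime) IN (7, 8, 9))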
)
SELECT employee_id,
	   count(DISTINCT query_id)
FROM emp_queries_q3_2023
GROUP BY employee_id;
```

<img src = "Queries Ran by Employees in Q3 2023.png" width = "600" style = "margin:auto"/>
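The zero counts depend on `count(DISTINCT ...)` skipping NULL values: an employee whose only row has a NULL `query_id` gets a count of 0 rather than 1. Here is a minimal sketch of that behaviour using made-up inline rows (the ids are hypothetical and do not come from the CSV files):

```sql
-- Hypothetical rows: employee 1 ran query 101 twice and query 102 once,
-- employee 2 has no query (NULL query_id, as produced by the outer join)
SELECT employee_id,
	   count(DISTINCT query_id) AS unique_queries
FROM (VALUES (1, 101), (1, 101), (1, 102), (2, NULL))
	AS t (employee_id, query_id)
GROUP BY employee_id
ORDER BY employee_id;
-- employee 1 -> 2 unique queries, employee 2 -> 0
```

Repeated runs of the same `query_id` collapse to one, and the NULL row contributes nothing to the count.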

The per-employee counts can then be wrapped in a second CTE to count how many employees fall into each number of unique queries.

```
WITH emp_queries_q3_2023
AS (
	SELECT employees.employee_id,
		   queries.query_id
	FROM queries
	FULL OUTER JOIN employees
		ON queries.employee_id = employees.employee_id
	WHERE query_id IS NULL
		OR (date_part('year', queries.query_starttime) = 2023
			AND date_part('month', queries.query_starttime) IN (7, 8, 9))
),
emp_unique_queries
AS (
	-- Unique Q3 2023 queries per employee (0 when query_id is NULL)
	SELECT employee_id,
		   count(DISTINCT query_id) AS unique_queries
	FROM emp_queries_q3_2023
	GROUP BY employee_id
)
-- Histogram: number of employees at each unique-query count
SELECT unique_queries,
	   count(employee_id) AS employee_count
FROM emp_unique_queries
GROUP BY unique_queries
ORDER BY unique_queries;
```

<img src = "Count of Employees By Number of Unique Queries.png" width = "600" style = "margin:auto"/>

In the third quarter of 2023, 22 employees ran no queries, 86 employees ran one unique query, 46 employees ran two unique queries, 19 employees ran three unique queries, 4 employees ran four unique queries, & 1 employee ran five unique queries.
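
One caveat about the join above: an employee whose queries all fall outside the third quarter of 2023 does match rows in `queries`, so `query_id` is not NULL, and the `WHERE` clause then removes every one of those rows. That employee drops out of the result instead of landing in the zero bucket. Whether this changes the counts depends on the data. A possible alternative, sketched here as an assumption rather than the approach used above, is to start from `employees`, use a `LEFT JOIN`, and move the date filter into the join condition. The half-open timestamp range is also my own substitution for the `date_part` checks.

```sql
-- Sketch only: same tables and columns as above. The LEFT JOIN and the
-- half-open date range are assumptions, not the original approach.
WITH emp_unique_queries
AS (
	SELECT employees.employee_id,
		   count(DISTINCT queries.query_id) AS unique_queries
	FROM employees
	LEFT JOIN queries
		ON queries.employee_id = employees.employee_id
		-- Filtering inside ON keeps every employee, even those whose
		-- queries all fall outside Q3 2023
		AND queries.query_starttime >= '2023-07-01'
		AND queries.query_starttime < '2023-10-01'
	GROUP BY employees.employee_id
)
SELECT unique_queries,
	   count(*) AS employee_count
FROM emp_unique_queries
GROUP BY unique_queries
ORDER BY unique_queries;
```

Because the filter sits in the `ON` clause, an employee with no qualifying query still produces one row with a NULL `query_id`, which `count(DISTINCT ...)` turns into 0.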