This project was created as part of my learning journey to strengthen my SQL skills, particularly in writing and understanding subqueries. It was completed alongside the SQL for Data Science course on Coursera using PostgreSQL (pgAdmin 14).
- PostgreSQL (local database)
- pgAdmin 14
- Coursera interactive environment (Rhyme)
- Practice using subqueries in different SQL clauses (
SELECT,WHERE,FROM,ORDER BY) - Explore correlated subqueries and layered subqueries
- Apply subqueries to real-world inspired scenarios (e.g., customer purchase analysis, department manager filters)
- Subqueries returning single values, single columns, and single rows
IN,NOT IN, andBETWEENoperators with subqueries- Filtering with subqueries in
WHEREclause - Subqueries in the
FROMclause as derived tables - Alternative query structures using CTEs (
WITHclauses) and modernJOINsyntax
Below are selected examples demonstrating my use of subqueries in this project:
-- Retrieve a list of employees that are not managers (subquery in WHERE clause)
SELECT * FROM employees
WHERE emp_no NOT IN
(SELECT emp_no FROM dept_manager);-- Retrieve a list of managers, their department names, and their names (subquery in FROM clause)
SELECT dm.*, d.dept_name, e.first_name, e.last_name
FROM dept_manager AS dm, employees AS e,
(SELECT dept_no, dept_name FROM departments) AS d
WHERE dm.dept_no = d.dept_no AND e.emp_no = dm.emp_no;-- Retrieve customers that bought the most from the store (subquery with grouping and ordering)
SELECT c.customer_id, c.customer_name, a.cust_count
FROM customers AS c,
(SELECT customer_id, COUNT(*) AS cust_count
FROM sales
GROUP BY customer_id
ORDER BY cust_count DESC
LIMIT 1) AS a
WHERE c.customer_id = a.customer_id;-- Retrieve customers with the most purchases and highest total sales
SELECT c.customer_id, c.customer_name, c.segment, a.cust_count, a.total_sales
FROM customers AS c,
(SELECT customer_id, COUNT(*) AS cust_count, SUM(sales) AS total_sales
FROM sales
GROUP BY customer_id
ORDER BY total_sales DESC, cust_count DESC) AS a
WHERE c.customer_id = a.customer_id
ORDER BY a.total_sales DESC, a.cust_count DESC
LIMIT 5;👉 Full solutions and additional exercises can be found in Mastering-SQL-Subqueries-Working.sql.
-
BETWEEN with subqueries: In the graded assessment, I initially answered that the
BETWEENoperator can be used with subqueries, as this is valid in general SQL when the subqueries return single scalar values for the lower and upper bounds. However, this was marked incorrect in the context of the graded task, as the specific structure of the subqueries in that scenario did not align with that usage. -
JOINs: In the graded assessment, I selected that it is important to join on the related entity when writing JOIN statements. This was marked incorrect because, in SQL, we join on common or related fields (columns), not entire entities (tables). This clarified the importance of precise terminology when describing join conditions.
-
Overall, the graded assessment highlighted the importance of aligning SQL answers not only with general technical correctness but also with the specific examples and logic demonstrated in a given learning context.
- Rhyme environment performance: While the course provided access to the Rhyme interactive environment, I experienced a noticeable delay between my mouse movements and the environment's response. This impacted speed and ease of use. As a result, I chose to complete the majority of the project locally using pgAdmin 14 and PostgreSQL, which provided a smoother and more efficient experience for testing and refining queries.
data/Customers.csv: Customer data used for exercises and testing subqueriesdata/Sales.csv: Original sales data filedata/Sales_CLEAN.csv: Cleaned version of sales data for importscripts/scripts to create tables.txt: Table creation scripts for recreating the database structurescripts/Mastering-SQL-Subqueries-Working.sql: My working file containing solutions to exercises (both my solutions and instructor’s solutions)scripts/Mastering-SQL-Subqueries.sql: Main SQL script for core subquery practicescripts/solution-to-exercises.sql: Instructor-provided or reference solutions to exercisesscripts/EmployeesExcerpt.sql: Example data or queries related to employee exercises
- Coursera: SQL for Data Science (course completed)
- My GitHub profile
- Apply subquery techniques to more complex datasets
- Explore performance optimization for queries with nested subqueries
- Use window functions alongside subqueries for advanced analytics
Feel free to explore more of my work on GitHub or connect with me on LinkedIn.