This repository contains a series of structured SQL queries designed to extract, clean, and analyze data from the AdventureWorks database. The project follows a progressive learning path, starting from basic data retrieval to complex aggregations and debugging legacy code.
The goal of this project was to solve specific business questions using SQL, while maintaining high standards for code readability, documentation, and performance.
- Table Joins: INNER, LEFT, and Composite Joins.
- Aggregations: GROUP BY, HAVING, and complex mathematical functions (AVG, SUM, DATE_DIFF).
- Data Investigation: Identifying and fixing data duplication (Fan-out) and logical errors.
- Code Standards: Implementation of
snake_casealiasing, consistent indentation, and professional commenting.
The project is organized into three main modules:
1_1_product_subcategory_overview.sql: Basic retrieval with inner joins.1_2_product_category_overview.sql: Multilevel joins for categorization.1_3_high_price_active_bikes.sql: Advanced filtering and active product status analysis.
2_1_work_order_analysis.sql: Aggregating workload and costs by location.2_2_work_order_efficiency.sql: Calculating production lead times usingDATE_DIFF.2_3_expensive_work_orders.sql: UsingHAVINGto filter aggregated results.
3_1_special_offer_fix.sql: Solving row duplication issues caused by incomplete join keys.3_2_vendor_info_refactoring.sql: Refactoring poorly written legacy code for better maintainability.
- Modern SQL Syntax: All queries are optimized for BigQuery/Standard SQL.
- Readability: Every script includes a header block detailing the business logic and technical approach.
- Best Practices: Keywords are capitalized, and aliases are descriptive to ensure ease of debugging.
The project utilizes the following tables from the AdventureWorks dataset:
Product,ProductSubcategory,ProductCategoryWorkOrderRouting,LocationSalesOrderDetail,SpecialOfferProduct,SpecialOfferVendor,VendorContact,Address