This project demonstrates how to design and build a modern data warehouse using MySQL, covering ETL, data modeling, and data analytics.
## Features
- Data Sources: Import data from CSV files (e.g., ERP and CRM exports)
- Data Integration: Merge sources into a unified, user-friendly structure
- Data Quality: Cleanse, deduplicate, and standardize data
- ETL Pipelines: Extract, transform, and load workflows
This script truncates and reloads the Bronze Layer tables from CSV files located in the datasets/ folder.
All paths are relative to the project root, making it portable and easy to reuse.
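A minimal sketch of one Bronze load step is shown below; the table and file names are illustrative placeholders, not the project's actual identifiers:

```sql
-- Sketch of a Bronze load step (assumed names: bronze_crm_customers,
-- datasets/source_crm/customers.csv). Truncate, then bulk-load the CSV.
TRUNCATE TABLE bronze_crm_customers;

LOAD DATA LOCAL INFILE 'datasets/source_crm/customers.csv'
INTO TABLE bronze_crm_customers
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;  -- skip the CSV header row
```

`LOAD DATA LOCAL INFILE` requires `local_infile` to be enabled on both the MySQL server and the client.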
This script transforms data from the Bronze layer into a cleaned and conformed Silver layer. It standardizes formats, deduplicates records, validates values, and applies business rules to prepare data for analytic consumption.
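As one hedged example of such a transform, the query below deduplicates on a key and standardizes a coded column; the table and column names (`silver_customers`, `bronze_crm_customers`, `gender`) are assumptions for illustration:

```sql
-- Illustrative Silver transform: keep the latest record per customer,
-- trim whitespace, and map raw gender codes to readable values.
INSERT INTO silver_customers (customer_id, first_name, gender, created_at)
SELECT customer_id,
       TRIM(first_name),
       CASE UPPER(TRIM(gender))
            WHEN 'M' THEN 'Male'
            WHEN 'F' THEN 'Female'
            ELSE 'n/a'             -- default for missing or unknown codes
       END,
       created_at
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY customer_id
                              ORDER BY created_at DESC) AS rn
    FROM bronze_crm_customers
) ranked
WHERE rn = 1;  -- retain only the most recent row per customer_id
```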
This script builds business-ready models in the Gold layer, such as fact and dimension tables. It aggregates, enriches, and optimizes data for reporting, ensuring high performance and clarity for BI and advanced analytics.
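A Gold-layer dimension might be built as a view over conformed Silver tables, along these lines (all object names here are hypothetical):

```sql
-- Illustrative Gold dimension view: integrates CRM and ERP attributes
-- and generates a surrogate key for BI joins.
CREATE OR REPLACE VIEW gold_dim_customers AS
SELECT ROW_NUMBER() OVER (ORDER BY c.customer_id) AS customer_key,  -- surrogate key
       c.customer_id,
       c.first_name,
       COALESCE(c.gender, e.gender, 'n/a') AS gender,  -- CRM value takes precedence
       e.country
FROM silver_customers      c
LEFT JOIN silver_erp_locations e
       ON c.customer_id = e.customer_id;
```

Modeling Gold objects as views keeps them in sync with Silver; they can be materialized into tables later if query performance demands it.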
Goal: Leverage SQL-driven analysis to uncover key business intelligence across multiple dimensions, including:
- Customer patterns and engagement
- Product performance and adoption
- Sales growth and trend analysis
The insights generated from these queries provide stakeholders with clear visibility into operational performance, enabling informed and strategic decision-making.
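For instance, a sales-trend analysis over a Gold fact table could look like the following (the `gold_fact_sales` schema is assumed for illustration):

```sql
-- Example analytical query: monthly sales totals and active customers.
SELECT DATE_FORMAT(order_date, '%Y-%m') AS order_month,
       SUM(sales_amount)                AS total_sales,
       COUNT(DISTINCT customer_key)     AS active_customers
FROM gold_fact_sales
GROUP BY order_month
ORDER BY order_month;
```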
## Project Management & Documentation
This project is organized and maintained in Notion for tracking progress, design documentation, and task planning.
Notion Workspace: https://www.notion.so/Data-Warehouse-Project-28ae9857685080778287e4d04aa080b3?source=copy_link
Licensed under the MIT License; feel free to reuse or adapt.
I'm LA, an aspiring data engineer learning to build clean, scalable data systems. This project reflects my early efforts in data architecture, pipelines, and analytics.