Welcome to the Data Warehouse and Analytics Project repository! 🚀
This project showcases a complete data warehousing and analytics solution, including building a SQLServerManagementStudio data warehouse and generating insights from integrated data sources. It is intended as a portfolio project and demonstrates practical data engineering and analytics skills.
Credits: Project inspired by and based on content from Data with Baraa.
This project includes:
- Data Architecture: Building a data warehouse using the Medallion Architecture with Bronze, Silver, and Gold layers.
- ETL Pipelines: Extracting, transforming, and loading data from multiple source systems into the warehouse.
- Data Modeling: Designing fact and dimension tables for efficient analytical queries.
- Data Analysis: Using SQL queries to generate insights.
- Datasets: Access the project datasets (CSV files).
- SQLServerManagementStudio: Open-source relational database for hosting your data warehouse.
- Notion Project Steps: Access all project phases and tasks.
SQLServerManagementStudio-Data-Warehouse-Project/
│
├── datasets/ # Raw datasets used for the project (ERP and CRM data)
│
├── docs/ # Project documentation and architecture details
│ ├── ETL_process.png # A diagram created in Draw.io showing ETL techniques and methods used in this project
│ ├── data_architecture.png # A diagram created in Draw.io showing the project's architecture
│ ├── data_catalogue.md # Catalog of datasets, including field descriptions and metadata
│ ├── data_flow.png # A diagram created in Draw.io for the data flow diagram
│ ├── data_integration_model.png # A diagram created in Draw.io that shows how tables are related to each other.
│ ├── gold_layer_data_model.png # A diagram created in Draw.io that shows gold layer data model (star schema)
│ ├── naming-conventions.md # Consistent naming guidelines for tables, columns, and files
│
├── scripts/ # SQL scripts for ETL and transformations
│ ├── bronze/ # Scripts for extracting and loading raw data
│ ├── silver/ # Scripts for cleaning and transforming data
│ ├── gold/ # Scripts for creating analytical models
│
├── tests/ # Test scripts and quality files
│
├── README.md # Project overview and instructions
├── LICENSE # License information for the repository
├── .gitignore # Files and directories to be ignored by Git
Build using SQLServerManagementStudio a data warehouse that combines sales data for reporting and analysis.
- Load data from two CSV sources (ERP and CRM).
- Clean and handle data quality issues before analysis.
- Integrate both sources into a single, analytics-friendly data model.
- Use only the latest dataset (no historical tracking required).
- Document the data model for business users and analytics teams.
Develop SQL-based analytics to deliver granular insights into:
- Customer Behavior
- Product Performance
- Sales Trends
These insights empower stakeholders with critical business metrics, enabling strategic decision-making.
The data architecture for this project adheres to the Medallion Architecture, comprising Bronze, Silver, and Gold layers:
- Bronze Layer: Stores raw data in its original form from the source systems. Data is ingested from CSV files into the SQLServerManagementStudio database.
- Silver Layer: Involves data cleansing, standardization, and normalization processes to prepare data for analysis.
- Gold Layer: Hosts business-ready data modeled into a star schema for reporting and analytics.
This project is licensed under the MIT License. You are free to use, modify, and share this project with proper attribution.
Hi there! I'm Adrian El-Rahi, a computer engineering student at Usek university. I am currently applying the SQL I have been taught in GIN300 - Database Course provided by the university. It's my first data engineering project!
