akhila2-code/SQL_datawarehouse_project_1

SQL Data Warehouse Project

📊 Project Overview

This project demonstrates an end-to-end data warehouse pipeline built on a multi-layered architecture with Bronze, Silver, and Gold layers. It integrates data from CRM and ERP systems, cleans and transforms that data, and then applies Exploratory Data Analysis (EDA) and Advanced Data Analysis (ADA) to it.

🧱 Data Architecture

1. Bronze Layer (Raw Data Layer)

  • Data is extracted directly from source systems:

    • CRM (Customer Relationship Management)
    • ERP (Enterprise Resource Planning)
  • Data files included:

    • CRM: cust_info.xlsx, prd_info.xlsx, sales_details.xlsx
    • ERP: CUST_AZ12.xlsx, LOC_A101.xlsx, PX_CAT_G1V2.xlsx
  • This layer stores raw, unprocessed data.
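As a minimal sketch of what "raw, unprocessed" loading can look like, the example below copies source rows into a Bronze table without trimming, casting, or de-duplicating anything. It uses Python's built-in sqlite3 module only so that it runs anywhere; the README does not state which SQL engine the project uses. The table name bronze_crm_cust_info, the column names, and the sample rows are all assumptions made for illustration, not taken from the real cust_info.xlsx.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a CSV export of cust_info.xlsx.
# Column names (cst_id, cst_firstname, cst_gndr) are assumptions.
RAW_CSV = """cst_id,cst_firstname,cst_gndr
1, Anna ,F
2,Ben,M
2,Ben,M
3,Cara,
"""


def load_bronze(conn, csv_text):
    """Load every source row unchanged into a Bronze table (all TEXT columns)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bronze_crm_cust_info ("
        "cst_id TEXT, cst_firstname TEXT, cst_gndr TEXT)"
    )
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn.executemany(
        "INSERT INTO bronze_crm_cust_info "
        "VALUES (:cst_id, :cst_firstname, :cst_gndr)",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_bronze(conn, RAW_CSV)
print(loaded, "rows loaded into bronze_crm_cust_info")
```

Duplicates, stray whitespace, and empty values are kept on purpose here, so that any cleaning decision stays visible in the Silver layer rather than being hidden in the load step.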

2. Silver Layer (Cleaned & Transformed Data Layer)

  • Holds cleaned and standardized data from the Bronze Layer.

  • Key transformations performed:

    • Removed duplicates and null values.
    • Standardized column names and formats.
    • Handled missing values.
    • Created relationships between CRM and ERP datasets.
  • Tables include:

    • crm_sales_details
    • crm_cust_info
    • crm_prd_info
    • erp_cust_az12
    • erp_loc_a101
    • erp_px_cat_g1v2
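The transformations listed above could look roughly like the following sketch, which builds crm_cust_info from a Bronze table by dropping rows without a key, removing exact duplicates, trimming text, and mapping gender codes to readable values. The Bronze table name, the column names, the 'F'/'M' codes, and the 'n/a' placeholder are assumptions for illustration; the SQL runs through Python's sqlite3 module, which may differ from the project's own engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical Bronze input; names and rows are illustrative only.
CREATE TABLE bronze_crm_cust_info (cst_id TEXT, cst_firstname TEXT, cst_gndr TEXT);
INSERT INTO bronze_crm_cust_info VALUES
  ('1', ' Anna ', 'F'),
  ('2', 'Ben', 'M'),
  ('2', 'Ben', 'M'),      -- exact duplicate
  (NULL, 'Ghost', 'M'),   -- missing key
  ('3', 'Cara', '');      -- missing gender

-- Silver table: typed, trimmed, de-duplicated, with standardized values.
CREATE TABLE crm_cust_info AS
SELECT DISTINCT
  CAST(cst_id AS INTEGER) AS cst_id,
  TRIM(cst_firstname)     AS cst_firstname,
  CASE UPPER(TRIM(cst_gndr))
    WHEN 'F' THEN 'Female'
    WHEN 'M' THEN 'Male'
    ELSE 'n/a'            -- assumed placeholder for missing values
  END AS cst_gndr
FROM bronze_crm_cust_info
WHERE cst_id IS NOT NULL;
""")

silver_rows = conn.execute(
    "SELECT cst_id, cst_firstname, cst_gndr FROM crm_cust_info ORDER BY cst_id"
).fetchall()
for row in silver_rows:
    print(row)
```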

3. Gold Layer (Analytical Layer)

  • Contains final, analysis-ready dimension and fact tables.

  • Tables:

    • fact_sales
    • dim_customers
    • dim_products
  • Relationships established between these tables:

    • fact_sales links to both dim_customers and dim_products using foreign keys.
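To illustrate how the foreign-key relationships described above behave, here is a small sketch of a star schema with fact_sales referencing dim_customers and dim_products. The key and measure columns (customer_key, product_key, sales_amount, and so on) and the sample rows are hypothetical, and the SQL is run with Python's sqlite3 module rather than the project's actual engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
-- Column names and rows are assumptions for illustration.
CREATE TABLE dim_customers (customer_key INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE dim_products  (product_key  INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE fact_sales (
  order_number TEXT,
  customer_key INTEGER REFERENCES dim_customers(customer_key),
  product_key  INTEGER REFERENCES dim_products(product_key),
  sales_amount REAL
);
INSERT INTO dim_customers VALUES (1, 'Germany'), (2, 'France');
INSERT INTO dim_products  VALUES (10, 'Bike'), (11, 'Helmet');
INSERT INTO fact_sales VALUES
  ('SO1', 1, 10, 100.0),
  ('SO1', 1, 11, 20.0),
  ('SO2', 2, 10, 150.0);
""")

# Join the fact table to both dimensions and aggregate the measure.
sales_by_country_product = conn.execute("""
    SELECT c.country, p.product_name, SUM(f.sales_amount) AS total_sales
    FROM fact_sales AS f
    JOIN dim_customers AS c ON c.customer_key = f.customer_key
    JOIN dim_products  AS p ON p.product_key  = f.product_key
    GROUP BY c.country, p.product_name
    ORDER BY c.country, p.product_name
""").fetchall()

# A fact row pointing at a customer that does not exist should be rejected.
try:
    conn.execute("INSERT INTO fact_sales VALUES ('SO3', 99, 10, 5.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(sales_by_country_product, orphan_rejected)
```

Declaring the keys in the schema, rather than relying on convention, means a fact row cannot silently reference a missing dimension member.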

🔍 Exploratory Data Analysis (EDA)

Performed in this phase:

  • Analyzed customer demographics (country, gender, marital status).
  • Reviewed sales trends by product and location.
  • Checked for outliers and missing values.
  • Validated data consistency between CRM and ERP sources.
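A few of these checks can be sketched as simple queries: a demographic distribution, a count of missing values per column, and a CRM-versus-ERP consistency check. The column names (cst_gndr, cst_marital_status, cid, gen) and all sample rows are assumptions for illustration; only the table names crm_cust_info and erp_cust_az12 come from the Silver layer list. The SQL runs through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical columns and rows, for illustration only.
CREATE TABLE crm_cust_info (cst_id INTEGER, cst_gndr TEXT, cst_marital_status TEXT);
CREATE TABLE erp_cust_az12 (cid INTEGER, gen TEXT);
INSERT INTO crm_cust_info VALUES
  (1, 'Female', 'Married'),
  (2, 'Male',   NULL),
  (3, 'Female', 'Single'),
  (4, NULL,     'Single');
INSERT INTO erp_cust_az12 VALUES (1, 'Female'), (2, 'Female'), (3, 'Female');
""")

# Demographic distribution: customers per gender, with NULLs made visible.
gender_counts = conn.execute("""
    SELECT COALESCE(cst_gndr, 'n/a') AS gender, COUNT(*) AS customers
    FROM crm_cust_info
    GROUP BY COALESCE(cst_gndr, 'n/a')
    ORDER BY 1
""").fetchall()

# Missing values per column (a comparison evaluates to 1 or 0 in SQLite).
missing = conn.execute("""
    SELECT SUM(cst_gndr IS NULL), SUM(cst_marital_status IS NULL)
    FROM crm_cust_info
""").fetchone()

# Consistency: CRM customers absent from ERP, or whose gender disagrees.
inconsistent_ids = [row[0] for row in conn.execute("""
    SELECT c.cst_id
    FROM crm_cust_info AS c
    LEFT JOIN erp_cust_az12 AS e ON e.cid = c.cst_id
    WHERE e.cid IS NULL OR c.cst_gndr IS NOT e.gen
    ORDER BY c.cst_id
""")]

print(gender_counts, missing, inconsistent_ids)
```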

📈 Advanced Data Analysis (ADA)

  • Top-performing products.
  • Sales distribution by country and category.
  • Customer segmentation based on purchase behavior.
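As a rough sketch of two of these analyses, the queries below rank products by revenue and assign each customer to a segment based on total spend. The column names, the sample rows, the segment labels (VIP, Regular, New), and the spend thresholds are all assumptions chosen for illustration; the README does not say how segments are actually defined. The SQL runs through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical columns and rows, for illustration only.
CREATE TABLE dim_products (product_key INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE fact_sales (customer_key INTEGER, product_key INTEGER, sales_amount REAL);
INSERT INTO dim_products VALUES (10, 'Bike'), (11, 'Helmet'), (12, 'Bottle');
INSERT INTO fact_sales VALUES
  (1, 10, 900.0),
  (1, 11, 150.0),
  (2, 10, 300.0),
  (3, 12, 40.0),
  (3, 11, 50.0);
""")

# Top-performing products by total revenue.
top_products = conn.execute("""
    SELECT p.product_name, SUM(f.sales_amount) AS revenue
    FROM fact_sales AS f
    JOIN dim_products AS p ON p.product_key = f.product_key
    GROUP BY p.product_name
    ORDER BY revenue DESC
    LIMIT 2
""").fetchall()

# Customer segmentation by total spend; thresholds are assumed, not the project's.
segments = conn.execute("""
    SELECT customer_key,
           CASE
             WHEN SUM(sales_amount) >= 1000 THEN 'VIP'
             WHEN SUM(sales_amount) >= 200  THEN 'Regular'
             ELSE 'New'
           END AS segment
    FROM fact_sales
    GROUP BY customer_key
    ORDER BY customer_key
""").fetchall()

print(top_products)
print(segments)
```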

⚙️ Tools & Technologies

  • SQL – for data transformation and analysis
  • Excel / CSV – for source data

SQL_datawarehouse_project_1/
│
├── datasets/
│   ├── source_crm/
│   │   ├── cust_info.xlsx
│   │   ├── prd_info.xlsx
│   │   └── sales_details.xlsx
│   └── source_erp/
│       ├── CUST_AZ12.xlsx
│       ├── LOC_A101.xlsx
│       └── PX_CAT_G1V2.xlsx
│
├── bronze/
├── silver/
├── gold/
└── README.md

✅ Summary

This project demonstrates a complete data warehouse ETL process, from raw data ingestion through to analytical insights. It showcases skills in:

  • SQL Data Modeling
  • ETL and Data Cleaning
  • EDA & ADA
  • Building a structured multi-layer architecture
