## Objective

Develop a modern data warehouse using SQL Server to consolidate sales data, enabling analytical reporting and informed decision-making.

### Specifications

- **Data Sources**: Import data from two source systems (ERP and CRM) provided as CSV files.
- **Data Quality**: Cleanse and resolve data quality issues prior to analysis.
- **Integration**: Combine both sources into a single, user-friendly data model designed for analytical queries.
- **Scope**: Focus on the latest dataset only; historization of data is not required.
- **Documentation**: Provide clear documentation of the data model to support both business stakeholders and analytics teams.

### Selecting a data management paradigm: the _'Medallion'_ architecture

<br>

![data architecture figure 1 image](img/journal_fig1.png)

<br>


| | Bronze Layer | Silver Layer | Gold Layer |
| - | - | - | - |
| **Definition** | Raw, unprocessed data as-is from sources | Clean and standardised data | Business-ready data | 
| **Objective** | Traceability & debugging | (Intermediate layer) Prepare data for analysis | Provide data to be consumed for reporting & analytics |
| **Object Type** | Tables | Tables | Views |
| **Load Method** | Full load (truncate & insert) | Full load (truncate & insert) | None |
| **Data Transformation** | None (as-is) | Data cleaning, standardisation, normalisation, enrichment & derived columns | Data integration, aggregation, business logic & rules |
| **Data Modeling** | None (as-is) | None (as-is) | Star schema, aggregated objects, flat tables |
| **Target Audience** | Data engineers | Data engineers & analysts | Data analysts & business users |  
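
The object types and load methods in the table can be illustrated with a minimal sketch. It is not the project's implementation: an in-memory SQLite database stands in for SQL Server, and every table, view, and column name (`bronze_crm_sales`, `silver_crm_sales`, `gold_sales_by_customer`, and so on) is a hypothetical placeholder. SQLite has no `TRUNCATE` statement, so `DELETE FROM` plays the role of the truncate step here.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; all object names are hypothetical.
conn = sqlite3.connect(":memory:")

conn.executescript("""
    CREATE TABLE bronze_crm_sales (order_id TEXT, customer TEXT, amount TEXT);
    CREATE TABLE silver_crm_sales (order_id INTEGER, customer TEXT, amount REAL);
    -- Gold is a view over silver, so it has no load step of its own.
    CREATE VIEW gold_sales_by_customer AS
        SELECT customer, SUM(amount) AS total_amount
        FROM silver_crm_sales
        GROUP BY customer;
""")


def load_bronze(rows):
    """Full load: empty the table, then insert the source rows as-is."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM bronze_crm_sales")  # stand-in for TRUNCATE
        conn.executemany("INSERT INTO bronze_crm_sales VALUES (?, ?, ?)", rows)


def load_silver():
    """Full load: empty the table, then insert cleaned, typed rows from bronze."""
    with conn:
        conn.execute("DELETE FROM silver_crm_sales")
        conn.execute("""
            INSERT INTO silver_crm_sales (order_id, customer, amount)
            SELECT CAST(order_id AS INTEGER),
                   UPPER(TRIM(customer)),
                   CAST(amount AS REAL)
            FROM bronze_crm_sales
            WHERE TRIM(amount) <> ''
        """)


def read_gold():
    """Query the business-ready view."""
    return conn.execute(
        "SELECT customer, total_amount FROM gold_sales_by_customer ORDER BY customer"
    ).fetchall()


# Hypothetical raw rows: inconsistent casing, stray spaces, one missing amount.
raw_rows = [("1", " alice ", "10.5"), ("2", "ALICE", "4.5"), ("3", "bob", "")]
load_bronze(raw_rows)
load_silver()
print(read_gold())  # [('ALICE', 15.0)]
```

Because both loads empty the target before inserting, re-running them yields the same result rather than duplicating rows, which fits the "latest dataset only" scope. The gold view always reflects the current contents of silver without a separate load.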

<br>

### Layers for Separation of Concerns (SoC)

The layers above give us separation of concerns (SoC): an important principle in which a complex system is broken down into independent parts, each focused on a specific responsibility or operation without overlapping with the others. For a data warehouse, SoC means splitting the architecture into independent layers (such as ingestion, transformation, storage, and consumption) so that each layer handles its own responsibility without interfering with the others.
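
As a rough illustration of the principle, the sketch below gives each stage its own function with a narrow input and output, so one stage can be changed or tested without touching the others. The sample CSV text, the column names, and the function names are all hypothetical, and the storage stage is left out to keep the example short.

```python
import csv
import io

# Hypothetical sample standing in for one source CSV file.
SOURCE_CSV = "order_id,customer,amount\n1, alice ,10.5\n2,ALICE,4.5\n"


def ingest(csv_text):
    """Ingestion: parse the source as-is; no cleaning happens here."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transformation: clean and type rows; knows nothing about files or reports."""
    return [
        {
            "order_id": int(r["order_id"]),
            "customer": r["customer"].strip().upper(),
            "amount": float(r["amount"]),
        }
        for r in rows
    ]


def consume(rows):
    """Consumption: apply reporting logic to rows that are already clean."""
    totals = {}
    for r in rows:
        totals[r["customer"]] = totals.get(r["customer"], 0.0) + r["amount"]
    return totals


def run_pipeline(csv_text):
    """Each stage sees only the previous stage's output."""
    return consume(transform(ingest(csv_text)))


print(run_pipeline(SOURCE_CSV))  # {'ALICE': 15.0}
```

If the cleaning rules change, only `transform` needs editing; `ingest` still returns the raw values untouched, which keeps the source data traceable.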

<br>

![data architecture figure 2 image](img/journal_fig2.png)

### 