This project focuses on implementing a decision support system for monitoring the activity of a healthcare institution. It is structured into four distinct phases ("lots") and leverages technologies such as Snowflake for data warehousing and Power BI for reporting.
- Create and activate a virtual environment:

    # Create the virtual environment
    python -m venv venv
    # Activate it on Windows
    venv\Scripts\activate
    # Activate it on macOS/Linux
    source venv/bin/activate
- Install required packages:

    pip install -r requirements.txt
- Configure environment variables:
  - Create a .env file in the project root
  - Add your Snowflake credentials:

    SNOWFLAKE_USER=your_username
    SNOWFLAKE_PASSWORD=your_password
    SNOWFLAKE_ACCOUNT=your_account
    SNOWFLAKE_WAREHOUSE=your_warehouse
    SNOWFLAKE_ROLE=your_role
- Run the installation and loading scripts in order:

    python src/install_sid.py sql/install_sid.sql
    python src/launch_load_sid.py
    python src/stg_to_wrk.py sql/stg_to_wrk.sql
    python src/wrk_to_socle.py sql/wrk_to_socle.sql

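The four commands above depend on each other, so it can help to chain them and stop at the first failure. The following is only a sketch of that idea, not part of the project: the names `PIPELINE_STEPS`, `build_command`, and `run_pipeline` are hypothetical, and it assumes each script signals failure through a non-zero exit code.

```python
import subprocess
import sys

# Steps copied from the commands above: (script, optional SQL file argument).
PIPELINE_STEPS = [
    ("src/install_sid.py", "sql/install_sid.sql"),
    ("src/launch_load_sid.py", None),
    ("src/stg_to_wrk.py", "sql/stg_to_wrk.sql"),
    ("src/wrk_to_socle.py", "sql/wrk_to_socle.sql"),
]


def build_command(script, sql_file=None, python=sys.executable):
    """Return the argv list for one pipeline step."""
    command = [python, script]
    if sql_file is not None:
        command.append(sql_file)
    return command


def run_pipeline(steps=PIPELINE_STEPS, runner=subprocess.run):
    """Run each step in order and stop at the first non-zero exit code.

    Returns the list of commands that completed successfully.
    The runner argument exists so the function can be exercised without
    launching real processes (assumption: scripts fail with a non-zero code).
    """
    completed = []
    for script, sql_file in steps:
        command = build_command(script, sql_file)
        result = runner(command)
        if result.returncode != 0:
            print(f"Step failed: {' '.join(command)}", file=sys.stderr)
            break
        completed.append(command)
    return completed
```

Calling `run_pipeline()` from the project root with the virtual environment active would run the same four steps as the manual commands.
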
The project is organized into the following main directories:
.
├── data/                    # Data files for ingestion
├── figures/                 # Project diagrams and visualizations
├── script/                  # Utility scripts
├── sql/                     # SQL scripts for database operations
│   ├── install_sid.sql      # Database and schema creation
│   ├── staging.sql          # Staging layer tables
│   ├── stg_to_wrk.sql       # Staging to Working transformations
│   ├── wrk_to_socle.sql     # Working to Core transformations
│   └── socle.sql            # Core layer tables
└── src/                     # Python source code
    ├── commons.py           # Common utilities and connections
    ├── install_sid.py       # Installation script
    ├── launch_load_sid.py   # Data loading script
    ├── stg_to_wrk.py        # Staging to Working transformations
    └── wrk_to_socle.py      # Working to Core transformations

The global architecture includes:
- Data Ingestion (Batch from flat files)
- Staging / ODS Layer
- Core / Working / Historical layers
- Quality and unification mechanisms
- Control and rejection handling
- Orchestration of the entire processing pipeline
- Power BI dashboards
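
To make the "control and rejection handling" item more concrete, here is one possible shape for such a check. This is an illustrative sketch only: the function name, the rule (required fields must be non-empty), and the field names `patient_id` and `admission_date` are hypothetical and not taken from the project.

```python
def split_accepted_rejected(rows, required_fields):
    """Partition rows into accepted rows and rejects annotated with a reason."""
    accepted, rejected = [], []
    for row in rows:
        # A field counts as missing when it is absent, None, or an empty string.
        missing = [f for f in required_fields if row.get(f) in (None, "")]
        if missing:
            rejected.append({"row": row, "reason": "missing: " + ", ".join(missing)})
        else:
            accepted.append(row)
    return accepted, rejected


# Made-up sample rows, for illustration only.
sample_rows = [
    {"patient_id": "P1", "admission_date": "2025-01-02"},
    {"patient_id": "", "admission_date": "2025-01-03"},
]
accepted, rejected = split_accepted_rejected(sample_rows, ["patient_id", "admission_date"])
```

Keeping the rejected rows together with a reason, rather than dropping them, is what allows a later control step to report on them.
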
- Set up project tools: Taïga, VSCode, GitHub, etc.
- Define the physical data model for the data warehouse
- Create UML diagrams for each layer (staging, transition, core)
- SQL scripts in sql/install_sid.sql to create databases and tables
- Python script src/install_sid.py:
  - Can be rerun without errors
  - Recreates STG and WRK tables each time
  - Avoids recreating existing SOC and TCH tables
  - Logs all execution
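
One way to get the rerun behavior described for src/install_sid.py is to choose the DDL prefix per layer. The sketch below assumes that each layer maps to a schema of the same name and that the hypothetical helper `create_table_statement` builds the statements; the actual script may work differently. Both `CREATE OR REPLACE TABLE` and `CREATE TABLE IF NOT EXISTS` are standard Snowflake syntax.

```python
import logging

logger = logging.getLogger("install_sid")

# Layers rebuilt on every run vs. layers preserved once they exist.
REBUILT_LAYERS = {"STG", "WRK"}
PRESERVED_LAYERS = {"SOC", "TCH"}


def create_table_statement(layer, table, columns_sql):
    """Return a CREATE TABLE statement matching the layer's rerun policy."""
    if layer in REBUILT_LAYERS:
        # Dropped and recreated on each run.
        prefix = "CREATE OR REPLACE TABLE"
    elif layer in PRESERVED_LAYERS:
        # Left untouched when the table already exists.
        prefix = "CREATE TABLE IF NOT EXISTS"
    else:
        raise ValueError(f"Unknown layer: {layer}")
    # Assumption: the layer name is also the schema name.
    statement = f"{prefix} {layer}.{table} ({columns_sql})"
    logger.info("Prepared: %s", statement)
    return statement
```

Because neither form fails when the table already exists, running the installation twice in a row should not raise errors.
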
- SQL scripts in sql/staging.sql for staging tables
- Python script src/launch_load_sid.py to automate ingestion
- Common utilities in src/commons.py
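
As an illustration of the kind of connection helper a shared module such as src/commons.py could provide, the sketch below turns the SNOWFLAKE_* variables from the .env file into connection parameters. The function names are hypothetical and the parsing is deliberately minimal; the project's actual code is not shown in this README.

```python
import os

# Variable names as listed in the .env instructions of this README.
REQUIRED_VARS = (
    "SNOWFLAKE_USER",
    "SNOWFLAKE_PASSWORD",
    "SNOWFLAKE_ACCOUNT",
    "SNOWFLAKE_WAREHOUSE",
    "SNOWFLAKE_ROLE",
)


def parse_env_text(text):
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def snowflake_connection_params(env=None):
    """Map SNOWFLAKE_* variables to lowercase keyword names.

    Raises KeyError naming every missing or empty variable.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    # "SNOWFLAKE_USER" -> "user", "SNOWFLAKE_ACCOUNT" -> "account", ...
    return {name[len("SNOWFLAKE_"):].lower(): env[name] for name in REQUIRED_VARS}
```

If the snowflake-connector-python package is installed, the resulting dictionary could be passed as `snowflake.connector.connect(**params)`, since `user`, `password`, `account`, `warehouse`, and `role` are keyword arguments that function accepts.
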
- SQL scripts in sql/stg_to_wrk.sql for WRK layer transformations
- Python script src/stg_to_wrk.py to orchestrate transformations
- SQL scripts in sql/socle.sql for SOC layer
- Python script src/wrk_to_socle.py for final transformations
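
The transformation scripts each take a SQL file as an argument, which suggests a pattern of reading the file and executing its statements in order. The sketch below shows that pattern under stated assumptions: it uses an in-memory SQLite database as a stand-in for Snowflake, splits naively on semicolons (which would break on semicolons inside string literals), and the table and column names are invented for the demo.

```python
import sqlite3


def split_statements(sql_text):
    """Naively split a SQL script on semicolons, dropping empty pieces."""
    return [part.strip() for part in sql_text.split(";") if part.strip()]


def run_sql_script(cursor, sql_text):
    """Execute each statement in order and return how many ran."""
    statements = split_statements(sql_text)
    for statement in statements:
        cursor.execute(statement)
    return len(statements)


# Demo script: a staging table cleaned into a working table (made-up schema).
DEMO_SCRIPT = """
CREATE TABLE stg_visit (visit_id TEXT, visit_date TEXT);
INSERT INTO stg_visit VALUES (' V1 ', '2025-01-02');
CREATE TABLE wrk_visit AS
    SELECT TRIM(visit_id) AS visit_id, visit_date FROM stg_visit;
"""

connection = sqlite3.connect(":memory:")
executed = run_sql_script(connection.cursor(), DEMO_SCRIPT)
rows = connection.execute("SELECT visit_id, visit_date FROM wrk_visit").fetchall()
```

Any DB-API style cursor with an `execute` method could be passed to `run_sql_script` in place of the SQLite one.
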
- Create views on data warehouse tables
- Export data to CSV/XLSX
- Design dashboards using Power BI
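
For the CSV export step, a minimal sketch using only the standard library could look like the following. The helper name `write_csv` and the sample figures are made up for illustration; XLSX output would need a third-party library and is not shown.

```python
import csv
import io


def write_csv(rows, columns, stream):
    """Write a header line followed by the rows to a text stream."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)


# Made-up sample data written to an in-memory buffer.
buffer = io.StringIO()
write_csv([("2025-01", 120), ("2025-02", 135)], ["month", "admissions"], buffer)
csv_text = buffer.getvalue()
```

To write to disk instead, the same function can be given a file opened with `open(path, "w", newline="")`, as the `csv` module documentation recommends.
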
- Lot 1:
  - Environment description
  - Physical data model
  - UML diagrams for each layer
- Lot 2:
  - SQL scripts for DB and table creation
  - SQL scripts for STG data ingestion
  - Python automation scripts
- Lot 3:
  - SQL scripts for WRK/SOC layer loading
  - Python automation scripts for transformations
- Lot 4:
  - Power BI dashboards
- Project presentation date: 20/05/2025
- Project delivery date: 19/06/2025
© 2025 SMART TEAM 5 – All rights reserved.