# Master Notebook 
## Lunara Tech - Full Pipeline


This master notebook serves as the central controller for executing the entire data science workflow for Lunara Tech's ADS-508 project: **"Optimizing Workplace Health Policies Through Predictive Analytics."** It sequentially runs all modular notebooks that correspond to key phases of the project, including:

1. **Data Ingestion** – Uploading data to Amazon S3
2. **Data Processing and Analysis** – Processing the data and performing exploratory data analysis (EDA)
3. **Feature Engineering** – Cleaning, transforming, and preparing datasets
4. **Modeling and Evaluation** – Training and evaluating predictive models
5. **Resource Release** – Releasing AWS resources to reduce costs

Each phase lives in a standalone notebook for clarity, modularity, and reusability. This master notebook orchestrates them with Jupyter’s `%run` magic command so that all phases execute in a fixed, reproducible order. Keeping the phases separate also makes each one easier to debug and iterate on independently.
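The fixed execution order can also be written down as data. The following is a minimal sketch, not the project's actual implementation: `PIPELINE_NOTEBOOKS` and `run_pipeline` are hypothetical names, and the notebook runner is passed in as a callable so the ordering logic can be exercised outside Jupyter.

```python
from typing import Callable, List, Sequence

# Notebook paths in execution order, taken from the %run cells in this notebook.
PIPELINE_NOTEBOOKS: List[str] = [
    "./00_data_upload.ipynb",
    "./01_data_processing_and_analysis_pipeline.ipynb",
    "./02_data_preparation.ipynb",
    "./03_modeling.ipynb",
    "./04_resource_release.ipynb",
]


def run_pipeline(
    notebooks: Sequence[str],
    run_notebook: Callable[[str], object],
) -> List[str]:
    """Run each notebook in order and return the paths that completed.

    `run_notebook` is a hypothetical hook. Any exception it raises
    propagates, so later phases never run after a failed phase.
    """
    completed: List[str] = []
    for path in notebooks:
        print(f"* Running {path}")
        run_notebook(path)
        completed.append(path)
    return completed


# Dry run with a stand-in runner that only records what it was asked to run.
recorded: List[str] = []
run_pipeline(PIPELINE_NOTEBOOKS, recorded.append)
```

Inside a live notebook, the runner could be `lambda path: get_ipython().run_line_magic("run", path)`, which is the function-call form of `%run`.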


In [None]:
# Persist a flag so the child notebooks do not release resources while the master notebook is running the pipeline
IS_PIPELINE_RUN = True 
%store IS_PIPELINE_RUN
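How the child notebooks consume this flag is not shown here. The following is a minimal sketch, assuming each child notebook restores the flag with `%store -r IS_PIPELINE_RUN` and checks it before doing any cleanup. The helper name `should_release_resources` is hypothetical.

```python
def should_release_resources(namespace: dict) -> bool:
    """Return True only when no master-pipeline run is in progress.

    `namespace` is the notebook's variable mapping, for example
    `globals()` after `%store -r IS_PIPELINE_RUN`. A missing flag is
    treated as an interactive run (an assumption of this sketch).
    """
    return not namespace.get("IS_PIPELINE_RUN", False)


# Interactive run: the flag was never stored, so cleanup may proceed.
print(should_release_resources({}))  # True

# Pipeline run: the master notebook stored the flag, so cleanup is deferred.
print(should_release_resources({"IS_PIPELINE_RUN": True}))  # False
```

Under this assumption, only the final resource release step tears anything down during a pipeline run.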

### Step 0: Data Ingestion

In [None]:
print("* Uploading data to S3")
%run ./00_data_upload.ipynb

### Step 1: Data Processing and Analysis

In [None]:
print("* Running data processing and analysis...")
%run ./01_data_processing_and_analysis_pipeline.ipynb

### Step 2: Feature Engineering

In [None]:
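print("* Data preparation...")

# Restore the flag persisted in the first cell
%store -r IS_PIPELINE_RUN

# Look the flag up on its own so that a NameError raised inside the
# child notebook is not mistaken for a missing flag
try:
    pipeline_run = IS_PIPELINE_RUN
except NameError:
    pipeline_run = False
    print("'IS_PIPELINE_RUN' not found - assuming interactive mode.")

if pipeline_run:
    %run ./02_data_preparation.ipynb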

### Step 3: Modeling and Evaluation

In [None]:
print("* Modeling...")
%run ./03_modeling.ipynb

### Final Step: Resource Release

In [None]:
print("* Resource release...")

%run ./04_resource_release.ipynb