Fcevalerio/Roche_Capstone_Project

Experiment Tracking MLOps Pipeline

Capstone Project - MBDS Term 3

Author: Roche - Group 1

Date: March 16, 2026

Project Version: 0.2.0

Overview

This project implements an end-to-end MLOps pipeline for tracking experiments at risk of delay in a laboratory setting. The system predicts operational risk scores for experiments based on workflow logs, instrument telemetry, reagent data, and queue management information.

Key Features

  • Data Ingestion: Automated collection from lab instruments and workflow systems
  • Data Processing: Multi-layer data lake architecture (Bronze, Silver, Gold) with event-driven ETL
  • Risk Prediction: LightGBM final model for experiment delay prediction (holdout ROC-AUC 0.977, PR-AUC 0.94)
  • Real-time Monitoring: Event-driven architecture via AWS EventBridge for continuous risk assessment
  • Automated Alerts: Email notifications via SendGrid for high-risk experiments
  • Interactive Dashboard: Streamlit application with embedded Tableau for experiment monitoring
  • Model Governance: MLflow tracking for model versioning and artifact management
  • Drift Detection: Continuous model performance monitoring with automated retraining via SageMaker
  • Model Training at Scale: Dockerized SageMaker training infrastructure for automated model updates

Business Value

The system helps laboratory managers:

  • Proactively identify experiments likely to experience delays
  • Optimize resource allocation and scheduling
  • Reduce operational costs through predictive maintenance
  • Improve overall lab efficiency and throughput

Project Structure

├── 01-Documents/                       # Project documentation
├── 02-Architecture/                    # Architecture diagrams
│   └── Roche_RFP_Architecture.drawio
├── 03-Data/                            # Data generation and processing
│   ├── 01_generate_workflow_logs.py
│   ├── 02_generate_instrument_telemetry.py
│   ├── 03_generate_reagent_logs.py
│   ├── 04_generate_queue_logs.py
│   ├── 05_dataset_generator.py
│   ├── config.py
│   ├── data_review.ipynb
│   ├── Raw/                            # Raw generated data
│   ├── Processed/                      # Processed datasets with feature importance
│   └── Documents/
├── 04-EDA/                             # Exploratory Data Analysis notebooks
│   ├── EDA_processed_file.ipynb        # Analysis of processed features
│   ├── EDA_workflow.ipynb
│   ├── EDA_telemetry.ipynb
│   ├── EDA_reagent.ipynb
│   ├── EDA_queue.ipynb
│   ├── helpers.py
│   └── figures/                        # EDA visualizations and summaries
├── 05-Experiment/                      # Machine learning experiments
│   ├── ML Final Model.ipynb            # Finalized LightGBM model
│   ├── ML Model I.ipynb
│   ├── ML Model II.ipynb
│   ├── ML Model III.ipynb
│   ├── helpers.py
│   └── ml_files/
├── 06-Deployment/                      # Production deployment code
│   ├── Docs/
│   ├── Experiment_interface/           # Streamlit dashboard application
│   │   ├── app.py
│   │   ├── Dockerfile
│   │   └── requirements.txt
│   ├── Inference_API/                  # Flask ML API for real-time inference
│   │   ├── app.py
│   │   ├── detect_drift.py
│   │   ├── inference.py
│   │   ├── retraining.py
│   │   ├── send_retraining_alert.py
│   │   ├── Dockerfile
│   │   └── requirements.txt
│   ├── Lambda_functions/               # Serverless processing functions
│   │   ├── template.yaml
│   │   ├── consolidate_dataset/        # Data consolidation ETL
│   │   ├── dashboard_data/             # Dashboard data synchronization
│   │   ├── generate_datasets/          # Dataset generation triggers
│   │   ├── run_inference/              # Inference orchestration
│   │   └── send_email_alert/           # Risk alert notifications
│   └── Sagemaker_Training_Image/       # Docker image for SageMaker training
│       ├── Dockerfile
│       ├── requirements.txt
│       └── retraining.py
├── 07-Deliverables/                    # Final project deliverables
│   ├── Roche_G1_Dashboard.twb          # Dashboard embedded in Streamlit
│   ├── Roche_G1_ML_Final_Model.ipynb   # Final trained model and results
│   ├── Roche_G1_Poster.pdf
│   └── Roche_G1_ppt.pdf
├── 00-Backups/                         # Previous versions and backups
├── pyproject.toml                      # Project configuration
├── requirements.txt                    # Dependencies
└── README.md                           # This file

Architecture

High-Level Architecture

The system follows an event-driven lakehouse architecture deployed on AWS, utilizing serverless components and containerized services for scalability and cost-efficiency.

flowchart TD

    Sources["🔬 Lab Instruments & Workflow Systems"]

    subgraph AWS[" "]
        subgraph DataLake["Data Lake Architecture"]
            S3Bronze["S3 Bronze Layer<br/>Raw Data"]
            S3Silver["S3 Silver Layer<br/>Cleaned Data"]
            S3Gold["S3 Gold Layer<br/>Features"]
        end

        subgraph Events["Event Orchestration"]
            EventBridge["AWS EventBridge<br/>Rules Engine"]
        end

        subgraph Serverless["Serverless Processing"]
            LambdaConsolidate["Lambda:<br/>Consolidate"]
            LambdaInference["Lambda:<br/>Inference"]
            LambdaEmail["Lambda:<br/>Email Alert"]
            LambdaDashboard["Lambda:<br/>Dashboard"]
        end

        subgraph MLOps["ML Operations"]
            EC2API["Flask API<br/>EC2 Docker"]
            SageMaker["SageMaker<br/>Training"]
            MLflow["MLflow<br/>Tracking"]
            Models["S3 Models &<br/>Artifacts"]
        end

        subgraph Presentation["Visualization & Alerts"]
            Dashboard["Streamlit<br/>Dashboard"]
            Tableau["Tableau<br/>Embedded"]
            SendGrid["SendGrid<br/>Alerts"]
        end
    end

    CI_CD["GitHub Actions<br/>CI/CD"]

    %% Data Flow
    Sources -->|Raw Data| S3Bronze
    S3Bronze -->|Trigger| EventBridge
    EventBridge -->|Process| LambdaConsolidate
    LambdaConsolidate -->|Store| S3Silver

    S3Silver -->|Trigger| EventBridge
    EventBridge -->|Execute| LambdaInference
    LambdaInference -->|Invoke| EC2API
    EC2API -->|Predict| S3Gold

    EC2API -->|Detect Drift| MLflow
    MLflow -->|Trigger| SageMaker
    SageMaker -->|Update| Models
    EC2API -->|Load| Models

    S3Gold -->|Monitor| EventBridge
    EventBridge -->|Alert| LambdaEmail
    LambdaEmail -->|Send| SendGrid

    EventBridge -->|Sync| LambdaDashboard
    LambdaDashboard -->|Update| Dashboard
    Dashboard -->|Display| Tableau

    CI_CD -->|Deploy| EC2API
    CI_CD -->|Deploy| LambdaConsolidate
    CI_CD -->|Deploy| LambdaInference
    CI_CD -->|Deploy| LambdaEmail
    CI_CD -->|Deploy| LambdaDashboard
    CI_CD -->|Build| SageMaker

Event-driven data pipeline with serverless processing, ML inference, and real-time monitoring dashboards.

Low-Level Architecture

Detailed component interactions showing the complete MLOps workflow with training infrastructure.

flowchart TB

    Sources["🔬 Lab Instruments &<br/>Workflow Systems"]

    subgraph AWS["AWS Event-Driven Lakehouse MLOps Platform"]

        subgraph DataLake["📊 Data Lake"]
            Bronze["S3 Bronze<br/>Raw Data<br/>Ingestion"]
            Silver["S3 Silver<br/>Cleaned &<br/>Transformed"]
            Gold["S3 Gold<br/>Feature Store<br/>Ready for ML"]
            DashboardData["S3 Gold/<br/>dashboard_data"]
        end

        subgraph EventOrch["⚙️ Event Orchestration"]
            EventBridge["AWS EventBridge<br/>Rule Engine"]
        end

        subgraph ETL["🔄 ETL Lambdas"]
            ConsolidateLambda["Lambda:<br/>Consolidate Dataset<br/>Bronze→Silver"]
            GenDataLambda["Lambda:<br/>Generate Datasets"]
        end

        subgraph Inference["🎯 Inference Layer"]
            InferenceLambda["Lambda:<br/>Run Inference<br/>Trigger"]
            FlaskAPI["Flask API<br/>on EC2 Docker<br/>REST Endpoints"]
            Preprocessor["Model<br/>Preprocessor"]
            Model["LightGBM<br/>Final Model<br/>ROC-AUC 0.977 (holdout)"]
        end

        subgraph Training["🚀 Training Infrastructure"]
            DriftDetect["Drift Detection<br/>Module"]
            SageMaker["SageMaker<br/>Training Job<br/>Docker Container"]
            TrainScript["retraining.py<br/>Training Logic"]
        end

        subgraph MLGov["📋 ML Governance"]
            MLflow["MLflow<br/>Tracking Server"]
            RDS[("RDS SQL<br/>Tracking DB")]
            MLArtifacts["S3 ml/<br/>mlflow_artifacts"]
            Models["S3 ml/<br/>models &<br/>preprocessors"]
        end

        subgraph Notifications["📧 Alerts"]
            EmailLambda["Lambda:<br/>Send Risk Email"]
            SendGrid["SendGrid<br/>Email Service"]
        end

        subgraph Dashboard["📈 Visualization"]
            DashboardLambda["Lambda:<br/>Dashboard Sync"]
            ElasticBeanstalk["Elastic Beanstalk<br/>Host"]
            Streamlit["Streamlit<br/>Dashboard"]
            Tableau["Embedded Tableau<br/>Analytics"]
        end

        subgraph CICD["🔧 DevOps"]
            GitHub["GitHub Actions<br/>CI/CD"]
            ECR["ECR<br/>Repositories"]
        end
    end

    %% Data Ingestion Flow
    Sources -->|Raw Data| Bronze
    Bronze -->|Trigger Event| EventBridge
    EventBridge -->|Process| ConsolidateLambda
    ConsolidateLambda -->|Store| Silver

    %% Inference Flow
    Silver -->|Trigger Event| EventBridge
    EventBridge -->|Invoke| InferenceLambda
    InferenceLambda -->|Call API| FlaskAPI
    Models -->|Load| Preprocessor
    Models -->|Load| Model
    Preprocessor -->|Transform| FlaskAPI
    FlaskAPI -->|Predict| Model
    FlaskAPI -->|Store Predictions| Gold

    %% Drift & Retraining
    FlaskAPI -->|Monitor| DriftDetect
    DriftDetect -->|Detected| SageMaker
    SageMaker -->|Execute| TrainScript
    TrainScript -->|Log Metrics| MLflow
    MLflow -->|Track| RDS
    MLflow -->|Store| MLArtifacts
    TrainScript -->|Save| Models
    DriftDetect -->|Alert if Drift| SendGrid

    %% Alert Flow
    Gold -->|Trigger Event| EventBridge
    EventBridge -->|High Risk| EmailLambda
    EmailLambda -->|Send Alert| SendGrid

    %% Dashboard Flow
    Gold -->|Trigger Event| EventBridge
    EventBridge -->|Sync| DashboardLambda
    DashboardLambda -->|Update| DashboardData
    DashboardData -->|Load| Streamlit
    Streamlit -->|Display| Tableau
    ElasticBeanstalk -->|Host| Streamlit

    %% CI/CD
    GitHub -->|Build & Push| ECR
    GitHub -->|Deploy| ConsolidateLambda
    GitHub -->|Deploy| InferenceLambda
    GitHub -->|Deploy| EmailLambda
    GitHub -->|Deploy| DashboardLambda
    GitHub -->|Deploy| FlaskAPI
    GitHub -->|Deploy| ElasticBeanstalk
    ECR -->|Image| SageMaker

Key Components

Data Lake Layers (Medallion Architecture)

  • Bronze Layer: Raw data ingestion from lab instruments and workflow systems
  • Silver Layer: Cleaned, validated, and transformed data after quality checks
  • Gold Layer: Aggregated features and curated datasets ready for ML and analytics

Event Orchestration

  • AWS EventBridge: Decouples services and triggers workflows based on S3 events and custom rules
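
As an illustration of the S3-to-EventBridge trigger, the sketch below shows how a Lambda-style handler might read an S3 "Object Created" event delivered by EventBridge and decide where the consolidated output belongs. The `bronze/` and `silver/` prefixes and the function names are assumptions made for this example; the repository's actual handlers may be organized differently.

```python
from typing import NamedTuple


class S3Object(NamedTuple):
    bucket: str
    key: str


def parse_s3_event(event: dict) -> S3Object:
    """Extract bucket and key from an EventBridge 'Object Created' event from S3."""
    if event.get("source") != "aws.s3" or event.get("detail-type") != "Object Created":
        raise ValueError("not an S3 'Object Created' event")
    detail = event["detail"]
    return S3Object(detail["bucket"]["name"], detail["object"]["key"])


def silver_key_for(bronze_key: str) -> str:
    """Map a Bronze key to its Silver counterpart (prefix names are assumed)."""
    prefix = "bronze/"
    if not bronze_key.startswith(prefix):
        raise ValueError(f"key is outside the Bronze prefix: {bronze_key}")
    return "silver/" + bronze_key[len(prefix):]


def handler(event: dict, context: object = None) -> dict:
    """Lambda-style entry point: report the source object and its target key."""
    obj = parse_s3_event(event)
    return {
        "bucket": obj.bucket,
        "source_key": obj.key,
        "target_key": silver_key_for(obj.key),
    }
```

Rejecting events with an unexpected `source` or `detail-type` keeps a misconfigured rule from silently processing the wrong objects.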

Serverless Data Processing (Lambda Functions)

  • Consolidate Dataset: ETL pipeline aggregating raw data from Bronze to Silver
  • Generate Datasets: Triggers dataset generation and feature engineering on schedule
  • Run Inference: Orchestrates model predictions on new data by calling the Flask API
  • Send Risk Email: Sends high-risk experiment alerts via SendGrid
  • Dashboard Sync: Copies Gold layer data to the dashboard data location (S3 Gold/dashboard_data) that the Streamlit dashboard loads

ML Inference Layer

  • Flask API (EC2 Docker): RESTful API for real-time risk predictions
  • Final Model (LightGBM): Selected production model (holdout ROC-AUC 0.977, PR-AUC 0.94)
  • Model Preprocessor: Standardized feature engineering and transformation

Model Training Infrastructure

  • SageMaker Training: Containerized training environment for model retraining
  • Training Script (retraining.py): Orchestrates model retraining with latest data
  • Drift Detection: Monitors data and model performance drift

ML Governance & Tracking

  • MLflow Tracking Server: Experiment tracking and artifact management
  • RDS SQL Server: Database for MLflow tracking metadata
  • S3 Artifact Storage: Stores models, preprocessors, and training artifacts

Notifications & Monitoring

  • SendGrid Email Service: Delivers alerts for high-risk experiments and drift notifications

Visualization & Analytics

  • Streamlit Dashboard: Interactive web application for monitoring and analytics
  • Embedded Tableau: Advanced analytics and business intelligence visualizations
  • Elastic Beanstalk: Managed hosting for dashboard applications

CI/CD & DevOps

  • GitHub Actions: Automated testing, building, and deployment workflows
  • ECR Repositories: Docker image registry for containerized services

CI/CD Pipeline Architecture

The project utilizes GitHub Actions for continuous integration and deployment, with automated testing and deployment triggered on push to the master branch.

flowchart LR
    subgraph Source["🔀 Source Control"]
        GitHub["GitHub<br/>master"]
    end
    
    subgraph Routing["🔀 Path-Based Routing"]
        PathFilter["Route on<br/>Changed Files"]
    end
    
    subgraph Workflows["⚙️ Workflows"]
        DeployLambdas["Lambdas<br/>deploy_lambdas.yml"]
        DeployMLAPI["ML API<br/>deploy_ml_api.yml"]
        DeployWebsite["Dashboard<br/>deploy_website.yml"]
        DeploySageMaker["SageMaker<br/>deploy_sagemaker_training.yml"]
    end
    
    subgraph AWS["☁️ AWS Deployment"]
        Lambda["Lambda<br/>Functions"]
        EC2["EC2<br/>Docker"]
        Beanstalk["Elastic<br/>Beanstalk"]
        ECRDeploy["ECR<br/>Repository"]
    end

    subgraph Monitoring["📊 Monitoring"]
        Health["Health<br/>Checks"]
        Logs["CloudWatch<br/>Logs"]
    end
    
    GitHub -->|Push| PathFilter
    
    PathFilter -->|Lambda/**| DeployLambdas
    PathFilter -->|API/**| DeployMLAPI
    PathFilter -->|Interface/**| DeployWebsite
    PathFilter -->|Training/**| DeploySageMaker
    
    DeployLambdas -->|Update| Lambda
    DeployMLAPI -->|Deploy| EC2
    DeployWebsite -->|Deploy| Beanstalk
    DeploySageMaker -->|Push| ECRDeploy
    
    Lambda -->|Monitor| Health
    EC2 -->|Monitor| Health
    Beanstalk -->|Monitor| Health
    Health -->|Log| Logs

CI/CD Workflow Configuration

All workflows trigger on push to master branch with path-specific filters to run only when relevant code changes:

  1. deploy_lambdas.yml - Deploys serverless Lambda functions

    • Trigger: Push to 06-Deployment/Lambda_functions/
    • Steps: Checkout → Configure AWS → Login ECR → Build Docker → Push to ECR → Deploy Lambdas
    • Functions: consolidate_dataset, run_inference, send_email_alert, dashboard_data
  2. deploy_ml_api.yml - Deploys Flask ML API to EC2

    • Trigger: Push to 06-Deployment/Inference_API/
    • Steps: Checkout → Configure AWS → Login ECR → Build Docker → Push to ECR → Update EC2 Container
    • Endpoint: http://endpoint:5000 (inference and retraining)
  3. deploy_website.yml - Deploys Streamlit dashboard to Elastic Beanstalk

    • Trigger: Push to 06-Deployment/Experiment_interface/
    • Steps: Checkout → Configure AWS → Install EB CLI → Deploy to Elastic Beanstalk
    • Interface: Interactive web dashboard for experiment monitoring
  4. deploy_sagemaker_training.yml - Builds and pushes SageMaker training image

    • Trigger: Push to 06-Deployment/Sagemaker_Training_Image/
    • Steps: Checkout → Configure AWS → Login ECR → Build Docker → Tag → Push to ECR
    • Usage: SageMaker uses this image for automated model retraining
  5. delete_artifacts.yml - Manual cleanup of build artifacts

    • Trigger: Manual workflow dispatch
    • Steps: Require confirmation → Delete all artifacts from GitHub Actions

Deployment Strategy

  • Automated on push: All deployments run automatically when code is pushed to master with relevant path changes
  • Path-based filtering: Only components with code changes are rebuilt and deployed
  • AWS credentials: All workflows use GitHub Secrets for AWS authentication
  • Docker-based: Services are containerized for consistency across environments
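
The path-based filtering described above can be pictured with a small Python sketch. It is not how GitHub Actions evaluates its path filters internally; it only mirrors the mapping between the directories and workflow files listed in this section. The function name `workflows_to_run` is made up for the illustration.

```python
# Directory prefixes taken from the workflow triggers listed above.
WORKFLOW_PATHS = {
    "deploy_lambdas.yml": "06-Deployment/Lambda_functions/",
    "deploy_ml_api.yml": "06-Deployment/Inference_API/",
    "deploy_website.yml": "06-Deployment/Experiment_interface/",
    "deploy_sagemaker_training.yml": "06-Deployment/Sagemaker_Training_Image/",
}


def workflows_to_run(changed_files: list[str]) -> list[str]:
    """Return the workflows whose path prefix matches at least one changed file."""
    return [
        workflow
        for workflow, prefix in WORKFLOW_PATHS.items()
        if any(path.startswith(prefix) for path in changed_files)
    ]
```

A push that only touches `README.md` matches no prefix, so none of the four deployment workflows would run.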

Data Pipeline

Data Sources

  1. Workflow Logs: Experiment execution details, timing, and status
  2. Instrument Telemetry: Real-time instrument performance metrics
  3. Reagent Logs: Reagent usage and availability tracking
  4. Queue Logs: Laboratory queue management and wait times

Data Processing Flow

  1. Ingestion: Raw data collected from lab systems into S3 Bronze layer
  2. Validation & Cleaning: Data quality checks and basic transformations
  3. Feature Engineering: Aggregation and feature creation for ML models
  4. Storage: Processed data stored in optimized formats (Parquet)
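
To make the consolidation step more concrete, here is a minimal sketch that joins records from the four sources into one row per experiment. The `experiment_id` join key and every field name are assumptions for the example; the real schemas are defined by the generation scripts in 03-Data.

```python
def consolidate(workflow, telemetry, reagent, queue):
    """Join four record lists into one row per experiment_id.

    Workflow rows define which experiments exist. Fields from the other
    sources are merged in under a source-specific prefix. This sketch assumes
    at most one record per experiment per source; later records overwrite
    earlier ones.
    """
    rows = {w["experiment_id"]: dict(w) for w in workflow}
    sources = (("telemetry", telemetry), ("reagent", reagent), ("queue", queue))
    for prefix, records in sources:
        for record in records:
            row = rows.get(record["experiment_id"])
            if row is None:
                continue  # skip records that belong to unknown experiments
            for field, value in record.items():
                if field != "experiment_id":
                    row[f"{prefix}_{field}"] = value
    return list(rows.values())
```

Prefixing the merged fields avoids collisions when two sources use the same column name, such as a timestamp.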

Generated Dataset Features

  • Experiment metadata (type, priority, scientist, instrument)
  • Temporal features (start/end times, duration, delays)
  • Resource utilization metrics
  • Risk scores and predictions

Machine Learning Pipeline

Problem Statement

Predict the operational risk score for experiments, indicating likelihood of delay or failure.

Model Development

Exploratory Data Analysis

  • Univariate and multivariate analysis
  • Correlation analysis and feature importance
  • Time-series analysis of telemetry data
  • Queue congestion pattern identification

Model Selection

  • LightGBM for primary risk prediction (final selection)
  • XGBoost as an alternative gradient-boosting model
  • Ensemble methods for improved accuracy

Business Perspective — Why LightGBM

From a business and operational point of view, we selected LightGBM as the production model because it best satisfies Roche's objectives of minimizing operational cost while remaining deployable and maintainable:

  • Cost-aware decisioning: LightGBM yielded the lowest expected operational misclassification cost at the optimized decision threshold, which directly translates to fewer missed risky experiments and lower overall operational losses.
  • High recall for risk detection: The model delivers strong recall on the risky class (reducing missed alerts), which aligns with the business priority of proactively catching at-risk experiments.
  • Stable, well-calibrated probabilities: Acceptable calibration and stable probability estimates make thresholds reliable for operational workflows and escalation playbooks.
  • Production efficiency & lower infra cost: Faster training and inference with lower memory footprint reduces compute and hosting costs (important for SageMaker jobs, EC2 containers, and Lambda-based orchestration).
  • Operational flexibility: The model adapts well to cost-sensitive threshold tuning, enabling product owners to change FP/FN trade-offs without retraining.
  • Explainability & governance: LightGBM integrates well with SHAP and MLflow for explainability and audit trails, supporting regulatory and stakeholder review.

Together, these business-aligned attributes—cost minimization, high-risk recall, operational efficiency, and explainability—make LightGBM the preferred production choice for Roche's experiment risk pipeline.
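
The cost-sensitive threshold tuning mentioned above can be sketched as follows. The cost values (a missed risky experiment costing five times a false alarm) are placeholders chosen for the example, not figures from the project, and the search simply tries every observed score as a candidate threshold.

```python
def expected_cost(y_true, y_score, threshold, fp_cost, fn_cost):
    """Total misclassification cost when scores >= threshold are flagged risky."""
    cost = 0.0
    for label, score in zip(y_true, y_score):
        predicted_risky = score >= threshold
        if predicted_risky and label == 0:
            cost += fp_cost  # false alarm
        elif not predicted_risky and label == 1:
            cost += fn_cost  # missed risky experiment
    return cost


def best_threshold(y_true, y_score, fp_cost=1.0, fn_cost=5.0):
    """Pick the candidate threshold with the lowest total cost.

    Candidates are the observed scores plus infinity (flag nothing).
    Ties resolve to the lowest threshold.
    """
    candidates = sorted(set(y_score)) + [float("inf")]
    return min(
        candidates,
        key=lambda t: expected_cost(y_true, y_score, t, fp_cost, fn_cost),
    )
```

Because only the threshold changes, the FP/FN trade-off can be revisited with new cost assumptions without retraining the model.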

Feature Engineering

  • Temporal aggregations
  • Categorical encoding
  • Interaction features
  • Time-series derived metrics
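
The feature types listed above could look roughly like the sketch below, which builds a temporal aggregation, a one-hot encoding, and one interaction feature for a single experiment. All field names (`temperature_readings`, `priority`, `queue_wait_min`) are hypothetical; the project's actual features are defined in its notebooks.

```python
from statistics import mean


def telemetry_features(readings):
    """Temporal aggregation: summarise the numeric readings of one experiment."""
    return {
        "reading_mean": mean(readings),
        "reading_max": max(readings),
        "reading_range": max(readings) - min(readings),
    }


def one_hot(value, categories):
    """Categorical encoding: one 0/1 column per known category."""
    return {f"is_{c}": int(value == c) for c in categories}


def build_features(experiment):
    """Combine aggregated, encoded, and interaction features into one dict."""
    features = telemetry_features(experiment["temperature_readings"])
    features.update(one_hot(experiment["priority"], ["low", "medium", "high"]))
    # Interaction feature: queue wait, kept only for high-priority runs.
    features["high_priority_wait"] = features["is_high"] * experiment["queue_wait_min"]
    return features
```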

Model Performance with optimized threshold (test set)

Metric      XGBoost   LightGBM
ROC-AUC     0.981     0.980
Precision   0.811     0.795
Recall      0.908     0.919
F1-Score    0.856     0.852
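
As a quick consistency check, the F1 scores can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The short sketch below does this for both models. The recomputed values agree with the reported ones to within about 0.001, which is the slack that rounding precision and recall to three decimals allows.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
reported = {
    "XGBoost": (0.811, 0.908, 0.856),
    "LightGBM": (0.795, 0.919, 0.852),
}

for model, (precision, recall, f1_reported) in reported.items():
    f1 = f1_score(precision, recall)
    print(f"{model}: recomputed F1 = {f1:.4f}, reported = {f1_reported}")
```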

Model Monitoring

  • Drift Detection: Statistical tests for data drift
  • Performance Monitoring: Continuous evaluation metrics
  • Automated Retraining: Trigger-based model updates via SageMaker Training Image
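
The source does not say which statistical tests `detect_drift.py` uses. As one common option, the sketch below computes the Population Stability Index (PSI) for a single numeric feature, comparing a reference sample (for example, the training data) against a current sample. The bin count and the `eps` floor are assumptions of the example.

```python
import math


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant features

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp so out-of-range current values land in the edge bins.
            index = min(max(int((v - lo) / width), 0), bins - 1)
            counts[index] += 1
        # Floor at eps so empty bins do not produce log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    expected, actual = proportions(reference), proportions(current)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))
```

A frequently quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but suitable cut-offs depend on the feature and should be validated on historical data.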

Deliverables

The 07-Deliverables folder contains:

  • Roche_G1_ML_Final_Model.ipynb: Complete final model implementation with:
    • LightGBM model trained and optimized on full dataset (final production model)
    • Feature importance analysis and visualization
    • Model evaluation metrics and performance analysis
    • Predictions with interpreted risk scores
    • Documentation of model decisions and trade-offs

Deployment

Infrastructure as Code

  • AWS Services: Lambda, EC2, S3, RDS, EventBridge, Elastic Beanstalk, SageMaker
  • Containerization: Docker for API, dashboard, and training services
  • CI/CD: GitHub Actions for automated deployment

Production Components

1. Data Processing (Lambda Functions)

  • consolidate_dataset: ETL pipeline to consolidate and transform raw data from Bronze to Silver layer
  • generate_datasets: Triggers dataset generation and feature engineering workflows
  • run_inference: Orchestrates model inference on processed data, stores predictions in Gold layer
  • send_email_alert: Sends alerts via SendGrid for high-risk experiments
  • dashboard_data: Synchronizes processed data for visualization in dashboards
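
As a sketch of what `send_email_alert` might do, the code below filters predictions by a risk threshold and builds a request body in the shape expected by SendGrid's v3 Mail Send API. The 0.8 threshold, the prediction field names, and the email addresses are assumptions. The example stops short of sending anything.

```python
RISK_THRESHOLD = 0.8  # assumed cut-off; the source does not state one


def high_risk(predictions, threshold=RISK_THRESHOLD):
    """Keep predictions whose risk score meets the alert threshold."""
    return [p for p in predictions if p["risk_score"] >= threshold]


def build_alert_payload(predictions, sender, recipient):
    """Build a SendGrid v3 Mail Send request body for the flagged experiments."""
    flagged = high_risk(predictions)
    if not flagged:
        return None  # nothing to send
    lines = [f"{p['experiment_id']}: risk {p['risk_score']:.2f}" for p in flagged]
    return {
        "personalizations": [{"to": [{"email": recipient}]}],
        "from": {"email": sender},
        "subject": f"{len(flagged)} experiment(s) at high risk of delay",
        "content": [{"type": "text/plain", "value": "\n".join(lines)}],
    }
```

Returning `None` when nothing is flagged lets the caller skip the API call entirely instead of sending an empty alert.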

2. ML API (Flask on EC2)

RESTful API container deployed on EC2 providing:

  • Real-time risk predictions via /process endpoint
  • Model retraining triggers via /retraining endpoint with drift detection
  • Integration with MLflow for model versioning and governance
  • Automated drift detection and retraining orchestration

3. Dashboard Application

Streamlit-based web interface for:

  • Real-time experiment risk visualization
  • Historical trend analysis
  • Model performance monitoring
  • Alert management and investigation

4. Model Training Infrastructure

SageMaker Training Image provides:

  • Dockerized training environment for model retraining
  • Integration with processed datasets from S3
  • Artifact storage in MLflow and S3
  • Automated hyperparameter optimization
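
To show how a retraining job could be launched from the training image, the sketch below assembles the arguments for SageMaker's CreateTrainingJob API. Every value passed in (job name, image URI, role ARN, S3 path) is a placeholder, and the instance type, volume size, and runtime limit are assumptions, not the project's settings.

```python
def training_job_request(job_name, image_uri, role_arn, output_s3_uri,
                         instance_type="ml.m5.large", max_runtime_s=3600):
    """Assemble keyword arguments for SageMaker's CreateTrainingJob API."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,   # image pushed to ECR by the CI/CD workflow
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": max_runtime_s},
    }
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as `boto3.client("sagemaker").create_training_job(**request)`. Building it in a separate function keeps the configuration testable without touching AWS.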

Scalability Considerations

  • Serverless architecture for automatic scaling
  • Event-driven processing for efficient resource utilization
  • Multi-layer caching for improved performance
  • Containerized services for easy horizontal scaling

Installation

Prerequisites

  • Python 3.12 or higher
  • AWS CLI configured with appropriate credentials
  • Docker and Docker Compose (for local deployment)
  • Git

Local Development Setup

  1. Clone the repository:
git clone <repository-url>
cd Roche_Capstone_Project
  2. Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Configure environment variables (if needed):
cp .env.example .env
# Edit .env with your AWS and SendGrid configuration

Data Generation Setup

Run the data generation scripts in sequence from 03-Data directory:

cd 03-Data
python 01_generate_workflow_logs.py
python 02_generate_instrument_telemetry.py
python 03_generate_reagent_logs.py
python 04_generate_queue_logs.py
python 05_dataset_generator.py

This generates synthetic lab data simulating real-world experiment workflows for model training and testing.
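
For readers who want a feel for what such a generator does, here is a minimal, seeded sketch of synthetic workflow logs. It is not the logic of `01_generate_workflow_logs.py`: the column names, value ranges, and the 15-minute delay rule are all invented for the example.

```python
import random


def generate_workflow_logs(n, seed=42):
    """Generate n synthetic workflow-log rows with a reproducible seed."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        planned = rng.randint(30, 240)             # planned duration, minutes
        overrun = max(0, round(rng.gauss(0, 20)))  # extra minutes, never negative
        rows.append({
            "experiment_id": f"EXP-{i:05d}",
            "priority": rng.choice(["low", "medium", "high"]),
            "planned_min": planned,
            "actual_min": planned + overrun,
            "delayed": int(overrun > 15),          # assumed labelling rule
        })
    return rows
```

Using a dedicated `random.Random(seed)` instance makes the output reproducible without changing the global random state.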

Usage

Reviewing the Final Model

For the complete final model implementation and results:

Open 07-Deliverables/Roche_G1_ML_Final_Model.ipynb in Jupyter

Running EDA Analyses

Explore data patterns across different data sources:

cd 04-EDA
jupyter notebook EDA_workflow.ipynb        # Workflow log analysis
jupyter notebook EDA_telemetry.ipynb       # Instrument telemetry analysis
jupyter notebook EDA_reagent.ipynb         # Reagent usage patterns
jupyter notebook EDA_queue.ipynb           # Queue congestion analysis
jupyter notebook EDA_processed_file.ipynb  # Final processed feature analysis

Model Experimentation and Training

Review model development process:

cd 05-Experiment
jupyter notebook "ML Final Model.ipynb"   # Final production model
jupyter notebook "ML Model I.ipynb"       # Initial model iteration
jupyter notebook "ML Model II.ipynb"      # Improved model iteration
jupyter notebook "ML Model III.ipynb"     # Alternative model approaches

Starting the Inference API (Production)

Start the ML API locally for real-time predictions (in production, the same application runs in a Docker container on EC2):

cd 06-Deployment/Inference_API
python app.py

The API will start on http://localhost:5000

Running the Dashboard Application

Start the interactive monitoring dashboard:

cd 06-Deployment/Experiment_interface
streamlit run app.py

The dashboard will open in your default browser

API Endpoints

  • POST /process: Submit experiment data and get risk prediction

    • Input: Experiment features (workflow, telemetry, reagent, queue data)
    • Output: Risk score and prediction confidence
  • POST /retraining: Triggers model retraining if drift is detected

    • Input: Current dataset for drift evaluation
    • Output: Retraining status and updated model metrics
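
A client call to `/process` might look like the sketch below, which uses only the Python standard library. The JSON request and response format is an assumption, since the README does not document the exact schema, and the feature names in any payload would need to match what `inference.py` expects.

```python
import json
import urllib.request


def build_process_request(base_url, features):
    """Prepare a JSON POST to the /process endpoint without sending it."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/process",
        data=json.dumps(features).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url, features, timeout=10):
    """Send the request and decode the JSON response body."""
    request = build_process_request(base_url, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the API running locally, `predict("http://localhost:5000", {...})` would return the decoded response. Separating request construction from sending makes the payload easy to inspect and test offline.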

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Guidelines

  • Follow PEP 8 style guidelines
  • Add tests for new features and Lambda functions
  • Update documentation for API changes and Lambda modifications
  • Ensure all tests pass before submitting PR
  • For SageMaker training changes, test locally with Docker first
  • Update README with any new deployments or configurations

Testing

# Run unit tests
pytest tests/

# Test Docker containers locally
docker build -t experiment-api:latest 06-Deployment/Inference_API/
docker run -p 5000:5000 experiment-api:latest

Recent Updates (March 2026)

  • ✅ Finalized ML model with LightGBM (holdout ROC-AUC: 0.977, PR-AUC: 0.94, Recall: 0.911)

  • ✅ Completed comprehensive EDA across all data sources (workflow, telemetry, reagent, queue)

  • ✅ Implemented SageMaker Training Image for automated model retraining

  • ✅ Enhanced Lambda functions for complete ETL pipeline

  • ✅ Added detailed feature importance analysis and interpretation

  • ✅ Project documentation and deliverables finalized

Known Issues and Future Enhancements

  • Model retraining currently requires manual trigger via API endpoint (future: fully automated via SageMaker schedules)
  • Data generation is synthetic (future: integrate with actual lab systems)
  • Dashboard currently supports single Tableau instance (future: multi-tenant support)

License

This project is licensed under the MIT License - see the LICENSE file for details.


This capstone project demonstrates the application of MLOps principles to solve real-world laboratory management challenges through predictive analytics and automated monitoring systems.
