# Project: Optimizing the Hospital Patient Journey via Process Mining and MLOps

## Introduction and Context
This project aims to analyze operational and clinical flows within the hospital facility using advanced **Process Mining** techniques. The goal is to reconstruct the actual *Patient Journey* starting from **Event Logs**, identifying inefficiencies and bottlenecks. In parallel, we will develop an **MLOps** pipeline to predict critical events or excessive durations.


## Project Objectives
1.  **Process Visualization:** Map the actual flow of activities (admission, exams, etc.).
2.  **Performance Analysis:** Calculate throughput times and identify bottlenecks.
3.  **Anomaly Detection (Outlier Analysis):** Identify non-compliant paths or cases with anomalous timings.
4.  **Predictive Modeling:** Develop a predictive model for waiting times or the next clinical event.
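As a rough illustration of objectives 2 and 3, the sketch below computes a throughput time per case and flags cases outside the usual 1.5 × IQR fences. The event log is synthetic, the XES-style column names (`case:concept:name`, `concept:name`, `time:timestamp`) are an assumption about how the log will be formatted, and the IQR rule is only one possible outlier criterion, not necessarily the one the project will adopt.

```python
import pandas as pd

# Synthetic log (assumption): one Admission and one Discharge event per case.
start = pd.Timestamp("2024-01-01 08:00")
stay_hours = {"P1": 2.0, "P2": 3.0, "P3": 2.5, "P4": 3.5, "P5": 2.0, "P6": 30.0}
rows = []
for case, hours in stay_hours.items():
    rows.append({"case:concept:name": case, "concept:name": "Admission",
                 "time:timestamp": start})
    rows.append({"case:concept:name": case, "concept:name": "Discharge",
                 "time:timestamp": start + pd.Timedelta(hours=hours)})
log = pd.DataFrame(rows)

# Throughput time = last event timestamp minus first event timestamp, per case.
span = log.groupby("case:concept:name")["time:timestamp"].agg(["min", "max"])
throughput_h = (span["max"] - span["min"]).dt.total_seconds() / 3600

# Flag cases outside the 1.5 * IQR fences (illustrative threshold).
q1, q3 = throughput_h.quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (throughput_h < q1 - 1.5 * iqr) | (throughput_h > q3 + 1.5 * iqr)
outliers = throughput_h[is_outlier]
print(outliers)
```

On real data the fences would likely need to be computed per ward or per pathway, since a single global threshold mixes very different kinds of stay.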

## Methodology and Tech Stack
* **Language:** Python 
* **Process Mining:** PM4Py (for discovery and conformance checking).
* **Machine Learning:** - 
* **Data Manipulation:** Pandas, NumPy.
* **Visualization:** Matplotlib, Seaborn, Plotly 
* **MLOps:** MLflow, Git, DVC, Docker

---

## Role Division

To maximize efficiency, we will divide the team into specialized roles.

### Role 1: Data Engineer & MLOps Architect (Elisa)
* **Responsibilities:** Creation of the data ingestion pipeline. Preliminary cleaning and log formatting (XES/CSV). Environment configuration (Git, DVC, Docker).
* **Focus:** "Garbage in, garbage out". Ensures that clean data reaches the analysts.
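A minimal sketch of the cleaning and formatting step might look like the following. The raw column names (`patient_id`, `activity`, `timestamp`) and the sample rows are hypothetical; the target names follow the standard XES attribute keys that process mining tools generally expect. The specific cleaning rules (trimming, title-casing, dropping exact duplicates) are illustrative choices, not agreed project rules.

```python
import pandas as pd

# Hypothetical raw export; real column names will depend on the hospital system.
raw = pd.DataFrame({
    "patient_id": ["P1", "P1", "P2", "P2", "P2"],
    "activity": ["Admission", " triage", "Admission", "Admission", None],
    "timestamp": ["2024-01-01 08:00", "2024-01-01 08:30",
                  "2024-01-02 09:00", "2024-01-02 09:00", "2024-01-02 10:00"],
})

# Mapping from the assumed raw names to the standard XES attribute keys.
RENAME = {
    "patient_id": "case:concept:name",
    "activity": "concept:name",
    "timestamp": "time:timestamp",
}

def clean_event_log(df: pd.DataFrame) -> pd.DataFrame:
    """Rename columns, normalize labels, parse timestamps, drop unusable rows."""
    out = df.rename(columns=RENAME)
    out = out.dropna(subset=["case:concept:name", "concept:name"])
    out = out.assign(**{
        # Normalize activity labels so " triage" and "Triage" are one activity.
        "concept:name": out["concept:name"].str.strip().str.title(),
        # Unparseable timestamps become NaT and are dropped below.
        "time:timestamp": pd.to_datetime(out["time:timestamp"], errors="coerce"),
    })
    out = out.dropna(subset=["time:timestamp"]).drop_duplicates()
    out = out.sort_values(["case:concept:name", "time:timestamp"], kind="stable")
    return out.reset_index(drop=True)

event_log = clean_event_log(raw)
print(event_log)
```

Keeping the cleaning in one pure function makes it easy to version and to re-run whenever a new raw export arrives.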

### Role 2: Data Scientist (Process Mining Analyst and Discovery) (Rosa)
* **Responsibilities:** Using PM4Py to generate process maps and graphs. Analysis of path variants.
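To make "path variants" concrete, here is a small pandas-only sketch, independent of PM4Py: a variant is the ordered sequence of activities of a case, and the analysis counts how many cases follow each sequence. The log is synthetic and the XES-style column names are an assumption about the cleaned data.

```python
import pandas as pd

# Synthetic event log (assumption): three cases, two distinct paths.
log = pd.DataFrame({
    "case:concept:name": ["P1", "P1", "P1", "P2", "P2", "P3", "P3", "P3"],
    "concept:name": ["Admission", "Triage", "Discharge",
                     "Admission", "Discharge",
                     "Admission", "Triage", "Discharge"],
    "time:timestamp": pd.to_datetime([
        "2024-01-01 08:00", "2024-01-01 08:30", "2024-01-01 11:00",
        "2024-01-02 09:00", "2024-01-02 09:40",
        "2024-01-03 10:00", "2024-01-03 10:20", "2024-01-03 12:00",
    ]),
})

# One trace per case: activities in chronological order, joined into a string.
traces = (
    log.sort_values(["case:concept:name", "time:timestamp"])
       .groupby("case:concept:name")["concept:name"]
       .apply(" -> ".join)
)

# Variant frequencies: how many cases follow each distinct path.
variant_counts = traces.value_counts()
print(variant_counts)
```

Rare variants at the bottom of this table are natural candidates for the non-compliant paths mentioned in the objectives.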

### Role 3: Data Scientist (Conformance & Bottlenecks) (Lorenza)
* **Responsibilities:** Temporal performance analysis. Identification of bottlenecks (where do patients wait the most?). Comparison between the theoretical model and the real one.
* **Focus:** Understanding "why is it slow" or "where do we deviate from the standard".
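One simple way to approach the "where do patients wait the most?" question is to measure the time between consecutive events of the same case and average it per transition. The sketch below assumes a synthetic log with one timestamp per activity and XES-style column names; with a single timestamp per event, the measured gap mixes service time and actual waiting, so it is only a first approximation.

```python
import pandas as pd

# Synthetic event log (assumption): two cases following the same path.
log = pd.DataFrame({
    "case:concept:name": ["P1", "P1", "P1", "P2", "P2", "P2"],
    "concept:name": ["Admission", "Triage", "Exam"] * 2,
    "time:timestamp": pd.to_datetime([
        "2024-01-01 08:00", "2024-01-01 08:30", "2024-01-01 10:30",
        "2024-01-02 09:00", "2024-01-02 09:10", "2024-01-02 12:10",
    ]),
}).sort_values(["case:concept:name", "time:timestamp"])

# Pair each event with the next event of the same case.
by_case = log.groupby("case:concept:name")
log["next_activity"] = by_case["concept:name"].shift(-1)
log["next_time"] = by_case["time:timestamp"].shift(-1)

# Keep only rows that have a successor and compute the gap in minutes.
transitions = log.dropna(subset=["next_activity"]).assign(
    gap_min=lambda d: (d["next_time"] - d["time:timestamp"]).dt.total_seconds() / 60
)

# Mean gap per transition, slowest first: the top rows are bottleneck candidates.
bottlenecks = (
    transitions.groupby(["concept:name", "next_activity"])["gap_min"]
               .mean()
               .sort_values(ascending=False)
)
print(bottlenecks)
```

The median or a high percentile may be more robust than the mean here, since hospital waiting times tend to be heavily skewed.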

### Role 4: Feature Engineer & Software Developer (Malik)
* **Responsibilities:** Transforming logs into tabular datasets for ML. Feature creation ("time elapsed since last event", "day of the week"). Statistical analysis of outliers.
* **Focus:** Preparing the ground for prediction.
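The two features named above can be derived directly from the log with pandas. This is a minimal sketch on synthetic data; the XES-style column names and the extra `event_index` feature (position of the event within its case) are assumptions added for illustration.

```python
import pandas as pd

# Synthetic event log (assumption): two cases, two events each.
log = pd.DataFrame({
    "case:concept:name": ["P1", "P1", "P2", "P2"],
    "concept:name": ["Admission", "Triage", "Admission", "Exam"],
    "time:timestamp": pd.to_datetime([
        "2024-01-01 08:00", "2024-01-01 08:30",   # a Monday
        "2024-01-06 09:00", "2024-01-06 09:45",   # a Saturday
    ]),
})

features = (
    log.sort_values(["case:concept:name", "time:timestamp"])
       .reset_index(drop=True)
)
by_case = features.groupby("case:concept:name")

# Seconds elapsed since the previous event of the same case (0 for the first).
features["secs_since_last_event"] = (
    by_case["time:timestamp"].diff().dt.total_seconds().fillna(0.0)
)
# Day of the week as an integer, Monday = 0 ... Sunday = 6.
features["day_of_week"] = features["time:timestamp"].dt.dayofweek
# Position of the event within its case (hypothetical extra feature).
features["event_index"] = by_case.cumcount()
print(features)
```

Each row of `features` then describes one event in the context of its case, which is the tabular shape most ML libraries expect.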

### Role 5: MLOps Engineer (Predictive Modeling) (Gleb)
* **Responsibilities:** Training and validation of predictive models. Hyperparameter tuning and experiment tracking with MLflow.
* **Focus:** Answering the question "what will happen next?".
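Before training any model, it may help to have a trivial baseline for the "what will happen next?" question. The sketch below predicts the next activity as the most frequent successor seen in historical traces. The traces and the helper names (`fit_transition_counts`, `predict_next`) are hypothetical; this is a frequency baseline for comparison, not the predictive model the project intends to build.

```python
from collections import Counter, defaultdict

# Synthetic historical traces (assumption): case id -> ordered activities.
traces = {
    "P1": ["Admission", "Triage", "Exam", "Discharge"],
    "P2": ["Admission", "Triage", "Discharge"],
    "P3": ["Admission", "Triage", "Exam", "Discharge"],
}

def fit_transition_counts(traces):
    """Count how often each activity is directly followed by each other one."""
    counts = defaultdict(Counter)
    for activities in traces.values():
        for current, following in zip(activities, activities[1:]):
            counts[current][following] += 1
    return counts

def predict_next(counts, activity):
    """Return the most frequent successor, or None if the activity has none."""
    if activity not in counts:
        return None
    return counts[activity].most_common(1)[0][0]

counts = fit_transition_counts(traces)
print(predict_next(counts, "Triage"))
```

Any trained model should at least beat this baseline on held-out cases for the added complexity to be justified.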

### Role 6: Project Manager & Data Storyteller (Anna)
* **Responsibilities:** Team coordination. Defining success KPIs. Interpretation of technical results from a business/clinical perspective. Creation of the final dashboard and presentation.
* **Privacy Guardian:** Ethical oversight of data usage and verification of compliance (GDPR) regarding the produced results.