![banner.png](banner.png)

<h2 style="color:#ffffff; background-color:#004aac; padding: 10px; text-align:left; border: 1px solid #004aac;">A. System Overview</h2>

This machine learning system showcases an end-to-end implementation covering the entire lifecycle of a machine learning model. It integrates the MLOps tools and technologies discussed during the DASCI 270 sessions to support data ingestion, preprocessing, training, validation, deployment, and monitoring of the model. The system also includes drift detection to help the model remain effective and accurate over time. This document guides you through interacting with the deployed **Equipment Failure Prediction** system, including making predictions, retrieving model information, and checking data drift, and it documents each component and its functionality.

<h2 style="color:#d7633a; padding: 5px; text-align:left; border: 1px solid #d7633a;">Components</h2>

1. **Machine Learning Model** - This system utilizes XGBoost, a highly efficient and scalable machine learning algorithm for classification tasks. It serves as the core predictive model in our system, specifically designed to predict equipment failure.

2. **Data Pipeline (Dagster)** - Orchestrates the workflow for data ingestion, preprocessing, and preparation. Dagster manages the sequence of these operations to ensure data flows correctly from one process to another, maintaining a clear and manageable execution order.

3. **Experiment Tracking (MLflow)** - Provides a framework to track experiments, including model training runs, parameters, metrics, and artifacts, enabling easier debugging and optimization. It stores models, performance metrics, and custom objects like drift reports, making them easy to access and compare across different runs.

4. **Model Serving (FastAPI)** - Deploys the trained model through a REST API using FastAPI, facilitating easy access to the model’s predictive capabilities. The API handles requests for predictions and provides model metadata, ensuring input validation and structured outputs.

5. **Containerization (Docker)** - Containerizes the services making up the system to ensure consistency and reproducibility across environments. Each isolated environment ("container") contains all necessary dependencies for each service, which can be easily deployed on any system supporting Docker.

6. **Drift Detection (Evidently AI)** - Integrates Evidently AI to monitor the model for any signs of data or concept drift. This component is crucial for maintaining the model's accuracy, providing insights into how the data characteristics and relationships are changing over time.

7. **Data Validation (Pydantic)** - Ensures that the data received by the API matches the expected format and type. This prevents errors during model prediction and ensures reliable model performance.

8. **Testing (Pytest and GitHub Actions)** - Uses Pytest to develop and run tests that validate the correctness of the data processing and feature engineering components. GitHub Actions runs these tests automatically, so that code changes are checked against the same quality standards before they are integrated.

<h2 style="color:#d7633a; padding: 5px; text-align:left; border: 1px solid #d7633a;">Dataset</h2>

This project uses the AI4I 2020 Predictive Maintenance dataset, a synthetic dataset sourced from the UCI Machine Learning Repository: https://archive.ics.uci.edu/dataset/601/ai4i+2020+predictive+maintenance+dataset. It contains 10,000 data points with the following columns/variables:

| Variables          | Type        | Description                                                                                         |
|--------------------|-------------|-----------------------------------------------------------------------------------------------------|
| UDI                | Integer     | Unique identifier ranging from 1 to 10000                                                           |
| Product ID         | Categorical | Consists of a letter (L, M, H) for low, medium, and high product quality variants with a serial number |
| Type               | Categorical | Not specified                                                                                       |
| Air temperature    | Continuous  | Generated using a random walk process, normalized around 300 K with a standard deviation of 2 K      |
| Process temperature| Continuous  | Generated using a random walk process, normalized to a standard deviation of 1 K, plus air temperature + 10 K |
| Rotational speed   | Integer     | Calculated based on a power of 2860 W with normally distributed noise                               |
| Torque             | Continuous  | Normally distributed around 40 Nm with a standard deviation of 10 Nm, no negative values             |
| Tool wear          | Integer     | Quality variants H/M/L add 5/3/2 minutes of tool wear, respectively                                 |
| Machine failure    | Integer     | Indicates whether the machine failed (1) or not (0) for this data point                             |
| TWF                | Integer     | Target feature indicating whether a tool wear failure (TWF) occurred (1) or not (0)                 |
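
The Product ID encoding described in the table can be unpacked with a few lines of Python. The sketch below is illustrative only: the helper name `split_product_id` and the example ID are assumptions, and the project's own preprocessing may treat this column differently.

```python
# Hypothetical helper (not taken from the project code): split a Product ID
# such as "M14860" into its quality-variant letter and its serial number.
QUALITY_VARIANTS = {"L": "low", "M": "medium", "H": "high"}

# Tool wear added per quality variant, in minutes, as listed in the table.
TOOL_WEAR_MINUTES = {"H": 5, "M": 3, "L": 2}


def split_product_id(product_id: str) -> tuple[str, int]:
    """Return (variant_letter, serial_number) for a Product ID string."""
    if len(product_id) < 2:
        raise ValueError(f"Unexpected Product ID format: {product_id!r}")
    variant, serial = product_id[0].upper(), product_id[1:]
    if variant not in QUALITY_VARIANTS or not serial.isdigit():
        raise ValueError(f"Unexpected Product ID format: {product_id!r}")
    return variant, int(serial)


variant, serial = split_product_id("M14860")
print(variant, QUALITY_VARIANTS[variant], serial, TOOL_WEAR_MINUTES[variant])
```

Keeping the variant letter separate from the serial number makes it easy to encode the quality level as a feature while ignoring the serial number if it carries no predictive signal.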


<h2 style="color:#ffffff; background-color:#004aac; padding: 10px; text-align:left; border: 1px solid #004aac;">B. Setup Instructions</h2>

This section provides detailed instructions on how to set up and run the system using Docker Compose. The instructions assume you have Docker and Docker Compose installed on your machine. If not, please install them from the [Docker official website](https://www.docker.com/) before proceeding.

<h2 style="color:#d7633a; padding: 5px; text-align:left; border: 1px solid #d7633a;">Clone the Repository</h2>

First, clone the project repository from GitHub to get the necessary code and configuration files. Use the following command:

```bash
git clone https://github.com/raymundojavajr/ml-fp.git
cd ml-fp
```

These commands clone the repository into a directory named `ml-fp` on your local machine and change into that directory.

<h2 style="color:#d7633a; padding: 5px; text-align:left; border: 1px solid #d7633a;">Set Up the Python Environment</h2>

Create a virtual environment (`.venv`) in the root directory of the project to isolate the project's Python dependencies. This can be conveniently done using uv:

```bash
uv sync
```

It can also be done using pip:

```bash
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
```
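
After creating the environment, a quick check can confirm that the interpreter you are running actually belongs to it. This is a minimal sketch using only the standard library; the function name `in_virtualenv` is chosen here for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
print("Interpreter:", sys.executable)
```

If the first line prints `False`, the packages installed in the next step would go into the base Python installation rather than the project environment.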

<h2 style="color:#d7633a; padding: 5px; text-align:left; border: 1px solid #d7633a;">Install Dependencies/Packages</h2>

Ensure you have the necessary Python packages installed:

```bash
pip install requests pandas mlflow evidently fastapi uvicorn dagster dagster-docker dagster-postgres
```

**List of Libraries used in the pipeline:**



<h2 style="color:#ffffff; background-color:#004aac; padding: 10px; text-align:left; border: 1px solid #004aac;">C. Orchestrating ML Workflows with Dagster</h2>

This section explains the execution of the **Machine Failure Prediction** pipeline, which is fully managed within Dagster. This includes setting up and running the pipeline components via the Dagster UI.

**Pipeline Overview**

The pipeline is designed to handle various stages from data processing to model deployment, ensuring thorough analysis and prediction of machine failures.

**Dagster Assets in the Pipeline**

- **Data Ingestion (Upstream Assets):**
  - `download_machine_data` retrieves raw sensor and operation data from sources.
  - `raw_machine_data` stores the fetched data for subsequent processing.
- **Data Preparation (Downstream from Raw Data):**
  - `cleaned_machine_data` performs data cleaning and preprocessing to prepare for analysis.
- **Model Development (Downstream from Prepared Data):**
  - `train_test_split_data` segments the data into training and testing subsets.
  - `feature_preprocessor` applies necessary transformations for the predictive model.
  - `trained_model` encompasses the training of the failure prediction model.
- **Operational Deployment (Downstream from Model Development):**
  - `model_evaluation` assesses the predictive accuracy and robustness of the model.
  - `saved_model` handles the storage of the validated model.
- **Continuous Improvement (Downstream from Deployment):**
  - `drift_detection_data` simulates potential drift in machine operation data.
  - `cleaned_drift_data` prepares this data for drift analysis.
  - `preprocessed_drift_data` further processes the data to fit analysis models.
  - `drift_reports` creates detailed reports on any detected drift, helping in proactive maintenance.

**Executing the Pipeline**

To materialize all assets in the **Machine Failure Prediction** pipeline, navigate to the **Dagster UI** at `http://localhost:3000` and follow these steps:

- **Start with Upstream Assets:** Initiate the pipeline by materializing assets such as `download_machine_data` and `raw_machine_data`.
- **Advance through the Pipeline:** Continue materializing downstream assets in sequence so that each asset runs only after the assets it depends on.
- **Finalize with Evaluation and Monitoring:** Complete the execution by materializing assets related to model evaluation and drift detection, which are crucial for maintaining the accuracy and relevance of the prediction model.

This structured approach ensures that each component of the pipeline functions efficiently and that dependencies are correctly managed from data collection through to drift detection.
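
The upstream-before-downstream ordering described above can be illustrated with a topological sort. The asset names below come from the asset list in this section, but the dependency edges are assumptions made for illustration; the actual edges are defined in the project's Dagster code and are visible in the Dagster UI lineage graph.

```python
from graphlib import TopologicalSorter

# Each asset maps to the assets it is ASSUMED to depend on (illustrative only).
ASSUMED_DEPENDENCIES = {
    "download_machine_data": set(),
    "raw_machine_data": {"download_machine_data"},
    "cleaned_machine_data": {"raw_machine_data"},
    "train_test_split_data": {"cleaned_machine_data"},
    "feature_preprocessor": {"train_test_split_data"},
    "trained_model": {"feature_preprocessor"},
    "model_evaluation": {"trained_model"},
    "saved_model": {"model_evaluation"},
    "drift_detection_data": {"raw_machine_data"},
    "cleaned_drift_data": {"drift_detection_data"},
    "preprocessed_drift_data": {"cleaned_drift_data", "feature_preprocessor"},
    "drift_reports": {"preprocessed_drift_data", "trained_model"},
}

# static_order() yields every asset only after all of its upstream assets.
materialization_order = list(TopologicalSorter(ASSUMED_DEPENDENCIES).static_order())

for step, asset in enumerate(materialization_order, start=1):
    print(f"{step:2d}. {asset}")
```

Dagster resolves this ordering itself when several assets are materialized together, so the sketch is only meant to show why materializing a downstream asset before its inputs exist would fail.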

<h2 style="color:#ffffff; background-color:#004aac; padding: 10px; text-align:left; border: 1px solid #004aac;">D. Deploying the Full ML Pipeline Stack with Docker Compose</h2>

This project uses Docker Compose to build and run all services together, so the whole stack can be started from a single configuration file.

**Containerized Services in `docker-compose.yaml`**

Below are the services configured in the `docker-compose.yaml` file, outlining their roles and the ports they operate on:

| **Service** | **Purpose** | **Port** |
|-------------|-------------|---------|
| **PostgreSQL (`postgres_service`)** | Database for storing pipeline data | Internal |
| **MinIO (`minio_service`)** | S3-compatible storage for artifacts | `9000`, `9001` |
| **MLflow Tracking Server (`mlflow_service`)** | Manages experiment tracking and artifact logging | `5000` |
| **FastAPI Model Server (`fastapi_service`)** | Hosts the machine failure prediction model | `8000` |
| **Dagster Server (`dagster_service`)** | Orchestrates and monitors the pipeline | `3000` |

### 📌 Step 1: Launch All Services
Execute the following command to build and initiate all services:

```bash
docker-compose up --build -d
```

This command:
- Constructs any necessary Docker images.
- Launches all services in detached mode (`-d`).

### 📌 Step 2: Verify Service Containers
To confirm that all Docker containers are active, run:

```bash
docker ps
```

### 📌 Step 3: Service Access Points
Here are the URLs to access each service's user interface:

| **Service**               | **URL**                                 |
|---------------------------|-----------------------------------------|
| **Dagster UI**            | [http://localhost:3000](http://localhost:3000) |
| **MLflow Tracking UI**    | [http://localhost:5000](http://localhost:5000) |
| **FastAPI Model Server**  | [http://localhost:8000](http://localhost:8000) |
| **MinIO Console**         | [http://localhost:9001](http://localhost:9001) |
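
Beyond `docker ps`, it can be useful to confirm that each published port actually accepts connections. The following is a minimal sketch using only the standard library; the port numbers are taken from the table above, and the helper name `is_port_open` is an assumption introduced for this example.

```python
import socket

# Host ports taken from the service table; adjust if your compose file differs.
SERVICE_PORTS = {
    "Dagster UI": 3000,
    "MLflow Tracking UI": 5000,
    "FastAPI Model Server": 8000,
    "MinIO Console": 9001,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


for name, port in SERVICE_PORTS.items():
    status = "reachable" if is_port_open("localhost", port) else "not reachable"
    print(f"{name:<22} localhost:{port} -> {status}")
```

An open port only shows that something is listening; a service may still be starting up, so the container logs remain the place to look if a UI does not load.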

### 📌 Step 4: Configure MinIO for MLflow
Prior to using Dagster for asset materialization, configure a storage bucket in MinIO:

1. Visit [http://localhost:9001](http://localhost:9001)
2. Log in using:
   - **Username:** `minio_user`
   - **Password:** `minio_password`
3. Create a new bucket named `mlflow`.
4. Add the MinIO access keys you generated to your `.env` file:

   ```ini
   # MinIO Access Keys
   MINIO_ACCESS_KEY=your_generated_access_key
   MINIO_SECRET_ACCESS_KEY=your_generated_secret_key
   ```
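
To make the role of these keys more concrete, the sketch below parses `.env`-style text and maps the MinIO keys to the variable names that S3 clients conventionally read. The mapping and the endpoint URL are assumptions for illustration (port `9000` is assumed to be the MinIO API port from the service table); the project's own configuration may wire this differently.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Placeholder values, matching the .env snippet above.
sample = """
# MinIO Access Keys
MINIO_ACCESS_KEY=your_generated_access_key
MINIO_SECRET_ACCESS_KEY=your_generated_secret_key
"""

env = parse_env_text(sample)

# Assumed mapping: S3-compatible clients usually read the AWS_* names, and
# MLflow reads MLFLOW_S3_ENDPOINT_URL to locate a non-AWS S3 endpoint.
mlflow_env = {
    "AWS_ACCESS_KEY_ID": env["MINIO_ACCESS_KEY"],
    "AWS_SECRET_ACCESS_KEY": env["MINIO_SECRET_ACCESS_KEY"],
    "MLFLOW_S3_ENDPOINT_URL": "http://localhost:9000",
}

print(sorted(mlflow_env))
```

In practice, Docker Compose can pass the `.env` values to the containers directly, so a parser like this is mainly useful when running a client script outside the containers.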

### 📌 Step 5: Materialize Assets in Dagster
With all services operational, proceed to materialize assets:

- Navigate to the [Dagster UI](http://localhost:3000).
- Select "Assets" → "Materialize Selected" to execute the pipeline.

### 📌 Service Management Commands

| **Action**                | **Command**                          |
|---------------------------|--------------------------------------|
| **Restart all services**  | `docker-compose down && docker-compose up -d` |
| **Stop all services**     | `docker-compose down`                |

<h2 style="color:#ffffff; background-color:#004aac; padding: 10px; text-align:left; border: 1px solid #004aac;">E. Happy Path</h2>

<h2 style="color:#d7633a; padding: 5px; text-align:left; border: 1px solid #d7633a;">Prediction Request</h2>

In this section, we demonstrate how to use the FastAPI model server for predictions. We'll send a set of equipment parameters and status data to receive predictions on potential machine failures.

- **Input Data:** This includes a series of variables such as equipment type, air and process temperatures, rotational speed, torque, tool wear, and previous instances of machine failure.
- **API Endpoint:** This is the URL where the FastAPI server accepts prediction requests.
- **Response Handling:** Upon receiving the prediction request, the server provides a forecast indicating whether a machine is likely to fail. This prediction is displayed in the output.

Execute the following cell to initiate a prediction request.

```python
import requests

# FastAPI server URL
FASTAPI_URL = "http://localhost:8000/predict"

# Define the input data
# In order of the features: UDI, Air temperature (K), Process temperature (K), Rotational speed (rpm), Torque (Nm), Tool wear (min), Type encoded, Product ID encoded, Failure Type encoded
# Ensure to adjust the data based on actual possible inputs and features expected by the model
payload = {
    "features": [
        2,          # UDI
        310.5,      # Air temperature in Kelvin
        320.1,      # Process temperature in Kelvin
        2000,       # Rotational speed in rpm
        50.5,       # Torque in Nm
        10,         # Tool wear in minutes
        1,          # Type encoded (categorical, numeric encoding)
        8005,       # Product ID encoded (categorical, numeric encoding)
        0           # Failure Type encoded (target variable, if used for retraining or similar scenarios)
    ]
}

# Send request to FastAPI
response = requests.post(FASTAPI_URL, json=payload)

# Handle response
if response.status_code == 200:
    print("Prediction:", response.json())  # Print the predicted output, likely a machine failure prediction or similar
else:
    print("Error:", response.status_code, response.text)  # Handle errors like 404 or 500 from server
```
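
Because the `features` list is positional, a value placed in the wrong slot would be silently misread by the model. One way to reduce that risk is to build the list from named fields. This is a sketch under the assumption that the feature order matches the comment in the request example above; `build_payload` is a hypothetical helper, not part of the project's API.

```python
# Feature order copied from the comment in the request example above.
FEATURE_ORDER = [
    "UDI",
    "Air temperature (K)",
    "Process temperature (K)",
    "Rotational speed (rpm)",
    "Torque (Nm)",
    "Tool wear (min)",
    "Type encoded",
    "Product ID encoded",
    "Failure Type encoded",
]


def build_payload(record: dict) -> dict:
    """Arrange a named record into the positional 'features' list."""
    missing = [name for name in FEATURE_ORDER if name not in record]
    if missing:
        raise KeyError(f"Missing features: {missing}")
    return {"features": [record[name] for name in FEATURE_ORDER]}


# Keys may be given in any order; build_payload restores the expected order.
record = {
    "Torque (Nm)": 50.5,
    "UDI": 2,
    "Tool wear (min)": 10,
    "Air temperature (K)": 310.5,
    "Process temperature (K)": 320.1,
    "Rotational speed (rpm)": 2000,
    "Type encoded": 1,
    "Product ID encoded": 8005,
    "Failure Type encoded": 0,
}

payload = build_payload(record)
print(payload)
```

The resulting `payload` has the same shape as the one in the request example and can be passed to `requests.post` in the same way.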

#### Response Interpretation

# Insert explanation

<h2 style="color:#d7633a; padding: 5px; text-align:left; border: 1px solid #d7633a;">Model Information Retrieval</h2>

To interact effectively with the FastAPI model server, it helps to retrieve the model's configuration first. The `/model` endpoint returns detailed metadata that describes the model's expected inputs and training parameters.

**Retrieve Model Information**

The model server offers detailed metadata which includes:

* **Input schema** detailing the required format for prediction inputs.
* **Training Hyperparameters** used during the training process.
* **Important features** that significantly impact the prediction outcomes.

Execute the following cell to obtain detailed information about the model hosted on FastAPI:

```python
import requests

# FastAPI server URL for the /model endpoint
FASTAPI_URL = "http://localhost:8000/model"

# Send request to the FastAPI model server to retrieve model metadata
response = requests.get(FASTAPI_URL)

# Handle the response from the server
if response.status_code == 200:
    model_info = response.json()  # Parse the JSON response containing the model details
    print("Model Information:")
    print("Input Schema:", model_info.get("input_schema", "No input schema information found"))
    print("Hyperparameters:", model_info.get("hyperparameters", "No hyperparameters information found"))
    print("Important Features:", model_info.get("important_features", "No important features information found"))
else:
    print("Error:", response.status_code, response.text)  # Print any errors encountered during the request
```

#### Insights on Retrieved Model Information

# Insert insights

<h2 style="color:#ffffff; background-color:#004aac; padding: 10px; text-align:left; border: 1px solid #004aac;">F. Drift Detection Demonstration</h2>

<h2 style="color:#d7633a; padding: 5px; text-align:left; border: 1px solid #d7633a;">Accessing Evidently AI from MLflow to Generate Drift Reports</h2>

Once the `drift_reports` asset is materialized in the **Dagster UI**, the reports are logged to **MLflow**, where they can be found under Experiments → Artifacts.

**How to Access Drift Reports in MLflow**

1. Launch the **MLflow UI** by visiting http://localhost:5000.
2. Proceed to **Experiments** and select the **Most Recent Run**.
3. Navigate to the **Artifacts** tab.
4. Find and download the `drift_report.html` to examine the comprehensive drift analysis.
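
To build intuition for what a drift report quantifies, the sketch below computes a Population Stability Index (PSI) between a reference sample and a shifted sample. PSI is used here only as a simple stand-in: Evidently chooses its own statistical tests, and the synthetic torque-like values are assumptions generated for illustration, not project data.

```python
import math
import random


def population_stability_index(reference, current, bins=10):
    """PSI between two numeric samples, using equal-width bins over the reference range."""
    low, high = min(reference), max(reference)
    width = (high - low) / bins or 1.0  # fall back to 1.0 if the range is zero

    def bin_shares(values):
        counts = [0] * bins
        for value in values:
            # Clamp values outside the reference range into the edge bins.
            index = min(max(int((value - low) / width), 0), bins - 1)
            counts[index] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(count / len(values), 1e-6) for count in counts]

    ref_shares, cur_shares = bin_shares(reference), bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares))


random.seed(0)
# Synthetic torque-like samples in Nm: same spread, mean shifted from 40 to 48.
reference = [random.gauss(40, 10) for _ in range(2000)]
shifted = [random.gauss(48, 10) for _ in range(2000)]

print("PSI, reference vs itself:", population_stability_index(reference, reference))
print("PSI, reference vs shifted:", round(population_stability_index(reference, shifted), 4))
```

A PSI of zero means the two binned distributions are identical, and larger values indicate a larger shift. Any threshold for flagging drift is a convention that should be tuned to the feature and use case.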

<h2 style="color:#d7633a; padding: 5px; text-align:left; border: 1px solid #d7633a;">Drift Result Analysis</h2>

# Insert analysis

<h2 style="color:#ffffff; background-color:#004aac; padding: 10px; text-align:left; border: 1px solid #004aac;">G. Reproducibility</h2>

<h2 style="color:#ffffff; background-color:#004aac; padding: 10px; text-align:left; border: 1px solid #004aac;">H. References</h2>

## Stash

**Step 3: Build the Docker Containers**
Use Docker Compose to build the services defined in the docker-compose.yml file. This includes your Dagster orchestration, MLflow tracking server, and FastAPI application. Run the following command:

```bash
docker-compose build
```

This command builds the Docker images for each service according to the specifications in the Dockerfile and docker-compose.yml files. It installs all necessary dependencies in these images, ensuring each service has what it needs to run.

**Step 4: Run the Services**
Once the images are built, you can start the services with the following command:

```bash
docker-compose up
```

This command starts all services defined in docker-compose.yml. It creates and starts Docker containers for each service. The services will be networked together as defined, allowing them to communicate with each other.

**Step 5: Access the Services**
* FastAPI Application: Access the FastAPI application at http://localhost:8000. You will find the automatically generated API documentation there, which allows you to interact with the API directly through your web browser.
* MLflow Tracking Server: Access the MLflow tracking server at http://localhost:5000. Here you can view and compare different model runs and their metrics.
* Dagster Dagit UI: Access the Dagster orchestration UI at http://localhost:3000. This interface allows you to monitor and control the data pipelines.

**Step 6: Shut Down the Services**
To stop and remove all running containers, use the following command:

```bash
docker-compose down
```
This command stops all containers and removes the containers and networks created by `docker-compose up`. Named volumes and built images are kept unless you add flags such as `--volumes`, which allows for a quick restart of the services if needed.