# Lab: Integrating Prometheus and Grafana with a Machine Learning Model

#### Estimated Time: 25 mins

### **Lab Summary**
In this lab, you will integrate Prometheus and Grafana with a machine learning model serving layer to monitor and visualize its metrics. You will:

- Understand how Prometheus collects metrics from the serving layer.
- Learn to configure Grafana dashboards to visualize metrics.
- Deploy the system using Docker Compose and test the integration.


### **Lab Objectives**

- Configure Prometheus to collect metrics from the ML model serving layer.
- Create Grafana dashboards for visualizing metrics like prediction count, request latency, and error rate.
- Deploy and test the entire setup using Docker Compose.


### **Directory Structure**

```plaintext
Lab6\challenge
├── data
│   └── walmart.csv
├── models
│   ├── random_forest_model.pkl
│   └── scaler.pkl
├── prometheus
│   └── prometheus.yml
├── serving
│   ├── Dockerfile
│   ├── preprocessing.py
│   ├── requirements.txt
│   └── serving.py
├── docker-compose.yml
├── instructions.ipynb
└── test_model.py
```

### **Step 1: Explore the Provided Files**

- **`serving.py`**: Contains the FastAPI code to serve the ML model.
- **`preprocessing.py`**: Preprocesses the data before predictions.
- **`requirements.txt`**: Lists required Python packages.
- **`Dockerfile`**: Builds the containerized environment for serving.
- **`prometheus.yml`**: Configures Prometheus to scrape metrics.
- **`docker-compose.yml`**: Orchestrates the services (Prometheus, Grafana, and the serving layer).
- **`test_model.py`**: Contains test cases to verify predictions.

💡 **Tip**: Explore these files to understand their functionality before proceeding.
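To make "scraping" concrete: Prometheus periodically sends an HTTP GET to a metrics endpoint on the serving layer (conventionally `/metrics`) and parses the plain-text lines it gets back. The sketch below uses only the standard library to show roughly what that text looks like for two of the counters used later in this lab. It is not the code in `serving.py`, which more likely relies on a client library; the `render_metrics` helper, the help strings, and the values are hypothetical.

```python
def render_metrics(counters):
    """Render counters in the Prometheus text exposition format.

    `counters` maps a metric name to a (help_text, value) pair.
    """
    lines = []
    for name, (help_text, value) in sorted(counters.items()):
        lines.append(f"# HELP {name} {help_text}")   # human-readable description
        lines.append(f"# TYPE {name} counter")       # metric type hint for Prometheus
        lines.append(f"{name} {float(value)}")       # the sample itself
    return "\n".join(lines) + "\n"


# Hypothetical values; the real numbers come from the running serving layer.
counters = {
    "predictions_count_total": ("Predictions served (hypothetical help text)", 3),
    "error_count_total": ("Failed requests (hypothetical help text)", 0),
}

exposition = render_metrics(counters)
print(exposition)
```

Each scrape stores one timestamped sample per metric line, which is what the Prometheus and Grafana queries in the later steps read back.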

### **Step 2: Set Up and Run Docker Compose**

In [None]:
# Build the images and start the Prometheus, Grafana, and serving containers
# defined in docker-compose.yml, running them in the background (-d).

!docker-compose up -d --build

In [None]:

# Verify that the services (Prometheus, Grafana, and the serving layer) are running using:

!docker ps
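`docker ps` shows that the containers exist, but not that their ports answer. As an optional extra check, the sketch below tries a TCP connection to the host ports used later in this lab (9094 for Prometheus, 3004 for Grafana). The helper names are hypothetical, and the serving layer's port is left out because it is not stated here; add it from `docker-compose.yml` if you want to check it too.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Host ports taken from the URLs used in the following steps of this lab.
SERVICES = {"Prometheus": 9094, "Grafana": 3004}


def check_services(host="localhost", services=SERVICES):
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```

A port can take a few seconds to open after the container starts, so a single failed check right after `docker-compose up` is not necessarily an error.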

🎯 **Outcome**: All services should be running without errors, as shown in the screenshot below.

![image.png](attachment:image.png)

### **Step 3: Navigate Prometheus Dashboard**

1. Access Prometheus at [http://localhost:9094](http://localhost:9094).

![image.png](attachment:image.png)

2. Explore Targets:
    - Click on the **Status** menu at the top and select **Targets**.
    - Verify that the **model_serving** target (or any other defined job) is listed and its status is **UP**.

    ![image.png](attachment:image.png)

![image.png](attachment:image.png)
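The same target check can be done without the UI through the Prometheus HTTP API, which lists scrape targets at `/api/v1/targets`. The sketch below is one way to summarize that response per job. The function names are hypothetical, and `sample` is a hand-written payload shaped like the API response, not output captured from this lab.

```python
import json
import urllib.request

PROMETHEUS_URL = "http://localhost:9094"  # host port used in this lab


def fetch_targets(base_url=PROMETHEUS_URL):
    """Download the target list from a running Prometheus instance."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=5) as resp:
        return json.load(resp)


def target_health(payload):
    """Map each job name to the list of health states of its targets."""
    health = {}
    for target in payload["data"]["activeTargets"]:
        job = target["labels"].get("job", "unknown")
        health.setdefault(job, []).append(target["health"])
    return health


# Hand-written example payload (assumption: one healthy model_serving target).
sample = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "model_serving"}, "health": "up"},
        ]
    },
}

print(target_health(sample))  # {'model_serving': ['up']}
```

With the stack running, `target_health(fetch_targets())` should report `up` for every job that the **Targets** page shows as UP.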

**Query Metrics**:

- Go to the **Graph** tab in Prometheus.
- In the query input box, type the name of the metric you want to check, such as `predictions_count_total`.
- Click **Execute** to run the query.

![image.png](attachment:image.png)
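The query typed into the **Graph** tab can also be sent to the Prometheus HTTP API at `/api/v1/query`. The sketch below builds the request URL and extracts the values from an instant-query response. The function names are hypothetical, and `sample` is a hand-written payload with made-up values, included only to show the response shape.

```python
import json
import urllib.parse
import urllib.request

PROMETHEUS_URL = "http://localhost:9094"  # host port used in this lab


def query_url(expr, base_url=PROMETHEUS_URL):
    """Build the instant-query URL for a PromQL expression."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def parse_vector(payload):
    """Return (labels, value) pairs from an instant-query response."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload}")
    # Each sample is [unix_timestamp, "value-as-string"].
    return [(item["metric"], float(item["value"][1]))
            for item in payload["data"]["result"]]


def run_query(expr, base_url=PROMETHEUS_URL):
    """Execute the query against a running Prometheus instance."""
    with urllib.request.urlopen(query_url(expr, base_url), timeout=5) as resp:
        return parse_vector(json.load(resp))


# Hand-written example payload (values are made up).
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "predictions_count_total", "job": "model_serving"},
             "value": [1700000000.0, "5"]},
        ],
    },
}

print(query_url("predictions_count_total"))
print(parse_vector(sample))
```

With the stack running, `run_query("predictions_count_total")` should return the same number that the **Graph** tab displays.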

### **Step 4: Configure Grafana Dashboard**

1. Access Grafana at [http://localhost:3004](http://localhost:3004).
2. Log in using the default credentials:
   - Username: `admin`
   - Password: `admin`

![image.png](attachment:image.png)

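3. Add Prometheus as a data source:
   - Navigate to **Settings** → **Data Sources**.
   - Click on **Add a New Data Source**.
   - Select **Prometheus** and enter `http://prometheus:9090` as the URL. Grafana reaches Prometheus over the Docker Compose network, so the URL uses the service name `prometheus` and the container port `9090`, not the host-mapped `localhost:9094`.
   - Scroll down and click on **Save and Test**.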

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)


![image.png](attachment:image.png)

4. Create a new dashboard and add the following metrics:
   - **Prediction Count**: `predictions_count_total`

🎯 **Outcome**: A Grafana dashboard displaying real-time metrics from the serving layer.

![image.png](attachment:image.png)
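The Prometheus data source added through the UI above can also be created through Grafana's HTTP API (`POST /api/datasources`). The sketch below only builds the request so you can inspect it before sending. It assumes the default `admin`/`admin` credentials still apply (Grafana may prompt you to change the password on first login), and the function name and the data source name `Prometheus` are hypothetical choices.

```python
import base64
import json
import urllib.request


def build_datasource_request(grafana_url="http://localhost:3004",
                             user="admin", password="admin"):
    """Build (but do not send) a request that registers Prometheus in Grafana."""
    body = {
        "name": "Prometheus",              # display name, chosen for illustration
        "type": "prometheus",
        "url": "http://prometheus:9090",   # URL as seen from inside the Compose network
        "access": "proxy",                 # Grafana's backend performs the queries
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{grafana_url}/api/datasources",
        data=json.dumps(body).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = build_datasource_request()
print(request.get_method(), request.full_url)

# To actually create the data source while the stack is running:
# urllib.request.urlopen(request, timeout=5)
```

If a data source with the same name already exists, Grafana rejects the request, so this is mainly useful on a fresh container.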

The new dashboard is created with one panel. You can add panels for more metrics as well. Try:

- `error_count_total`
- `request_processing_seconds_count`

![image.png](attachment:image.png)
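Raw counters only ever go up, so panels for error rate and latency usually plot a derived PromQL expression rather than the counter itself. The sketch below composes a few such expressions as strings that you could paste into a Grafana panel query. The 5-minute window is an arbitrary choice, and `request_processing_seconds_sum` is an assumption: it exists only if the latency metric is a Summary or Histogram, which the `_count` suffix suggests but this lab does not confirm.

```python
def rate(metric, window="5m"):
    """PromQL: per-second increase of a counter, averaged over `window`."""
    return f"rate({metric}[{window}])"


def ratio(numerator, denominator):
    """PromQL: divide one expression by another."""
    return f"{numerator} / {denominator}"


PANELS = {
    "Predictions per second": rate("predictions_count_total"),
    "Errors per second": rate("error_count_total"),
    # Share of errors relative to predictions served.
    "Error ratio": ratio(rate("error_count_total"),
                         rate("predictions_count_total")),
    # Assumes a *_sum series exists next to *_count (see the note above).
    "Average latency (s)": ratio(rate("request_processing_seconds_sum"),
                                 rate("request_processing_seconds_count")),
}

for title, expr in PANELS.items():
    print(f"{title}: {expr}")
```

These expressions return no data until at least two scrapes fall inside the window, so send some requests and wait briefly before expecting a line on the graph.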

### **Step 5: Test the Serving Layer**

1. Use `test_model.py` to send sample requests to the serving layer:

In [None]:

!python test_model.py


2. Verify that predictions are returned correctly and metrics like `predictions_count_total` are updated in Prometheus and Grafana.

🎯 **Outcome**: The serving layer processes requests, and the metrics reflect the activity.


**Check in Prometheus**

![image.png](attachment:image.png)

**Check in Grafana**

![image.png](attachment:image.png)
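Metrics do not change in Prometheus the instant a request is served: the new value appears only after the next scrape, at the interval set in `prometheus.yml`. If you script this check, it helps to poll rather than read once. The sketch below is a generic polling helper; `wait_for_increase` and its parameters are hypothetical names, and `read_value` stands for any function that returns the current counter value, for example one that queries the Prometheus HTTP API.

```python
import time


def wait_for_increase(read_value, baseline, timeout=30.0, interval=1.0,
                      sleep=time.sleep):
    """Poll read_value() until it exceeds baseline or the timeout expires.

    Returns the last value read, so the caller can compare it with baseline.
    """
    deadline = time.monotonic() + timeout
    value = read_value()
    while value <= baseline and time.monotonic() < deadline:
        sleep(interval)
        value = read_value()
    return value


# Simulated readings: the counter stays at 3 for two polls, then a scrape
# picks up two new predictions.
readings = iter([3.0, 3.0, 5.0])
final = wait_for_increase(lambda: next(readings), baseline=3.0, interval=0)
print(final)  # 5.0
```

In practice you would record the counter before running `test_model.py`, run the tests, and then call the helper with that recorded value as `baseline`.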

**Clean Up**

In [None]:
!docker-compose down

### **Conclusion**

In this lab, you:
- Integrated Prometheus and Grafana with an ML model serving layer.
- Monitored metrics like prediction count, error rate, and request latency.
- Set up a real-time dashboard in Grafana for visualizing metrics.

You now have a basic understanding of monitoring and visualizing metrics for ML systems in production.