# How to monitor ML Models in Production

* Model quality in production may be affected by changes in the input data (e.g. data processing issues, problems with the data source, ...)
* The model in production constantly receives new data. However, this data might follow a different probability distribution than the data the model was trained on. Applying the original model to the new data distribution can cause a drop in model performance.

**Data Drift**

Data drift is the situation where the model’s input distribution changes.

$P_{t_1}(X) \neq P_{t_2}(X)$
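One way to check this condition for a single numeric feature is to compare its empirical distribution in a reference window ($t_1$) with a current window ($t_2$). The following is a minimal sketch using the two-sample Kolmogorov-Smirnov statistic in plain Python. The sample values, the `ks_statistic` helper and the threshold of 0.3 are illustrative assumptions, not part of the original setup.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # The gap between two step functions can only peak at an observed value.
    for value in ref + cur:
        cdf_ref = bisect_right(ref, value) / len(ref)
        cdf_cur = bisect_right(cur, value) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


# Hypothetical trip distances seen at training time (t1) and in production (t2).
distances_t1 = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
distances_t2 = [3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5]

DRIFT_THRESHOLD = 0.3  # assumed cut-off; tune it per feature
statistic = ks_statistic(distances_t1, distances_t2)
drift_detected = statistic > DRIFT_THRESHOLD
print(f"KS statistic: {statistic:.2f}, drift detected: {drift_detected}")
```

A statistic of 0 means the two samples have identical empirical distributions; values closer to 1 indicate a stronger shift.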

**Concept Drift**

To know what concept drift is, we need a definition of “concept”. Concept stands for the joint probability distribution of a machine learning model’s inputs (X) and outputs (Y). We can express their relationship in the following form:

$P(X, Y) = P(Y) P(X|Y) = P(X) P(Y|X)$

Concept shift happens when the joint distribution of inputs and outputs changes:

$P_{t_1}(X, Y) \neq P_{t_2}(X, Y)$

Concept drift can originate from any of the concept components. The most important source is the posterior class probability $P(Y|X)$, because it describes the relationship between inputs and outputs that the model has learned to approximate. For this reason, the term “concept drift” (or “real concept drift”) is often used for this specific type.
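To make the difference to data drift concrete, here is a small sketch in which $P(X)$ stays the same while $P(Y|X)$ changes. The frozen `model`, the rates of 3 and 4.5 minutes per unit of distance and the sample values are all hypothetical; the point is only that the inputs look unchanged while the error of the model grows.

```python
def model(distance):
    """Frozen model learned at t1: predicted duration in minutes (hypothetical)."""
    return 3.0 * distance


def mean_absolute_error(inputs, targets):
    """Average absolute gap between the model's predictions and the true targets."""
    errors = [abs(model(x) - y) for x, y in zip(inputs, targets)]
    return sum(errors) / len(errors)


# The same inputs are observed in both periods, so P(X) is unchanged ...
distances = [1.0, 2.0, 3.0, 4.0]
# ... but the relationship P(Y|X) changes, e.g. because traffic got heavier.
durations_t1 = [3.0 * x for x in distances]
durations_t2 = [4.5 * x for x in distances]

mae_t1 = mean_absolute_error(distances, durations_t1)
mae_t2 = mean_absolute_error(distances, durations_t2)
print(f"MAE at t1: {mae_t1:.2f}, MAE at t2: {mae_t2:.2f}")
```

A check on the inputs alone would not notice this change, which is why this kind of drift can only be detected once the true outputs become available.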


More details: https://deepchecks.com/data-drift-vs-concept-drift-what-are-the-main-differences/

When this happens, we have to intervene, e.g. retrain the model, fall back to a more robust system, ...

![things to monitor](monitoring_01.png)

### Batch vs. Online serving Models
**Online**
* Online monitoring
* If no real-time update of the model is necessary, batch monitoring may be used in online mode as well

**Batch**
* Batch monitoring
* Most ML models in production operate in batch mode, e.g. the pipeline may use Prefect or Airflow
* You may add a block of calculations after a specific step of the pipeline and run checks on whether the data and the model behave as expected, e.g. calculate some metrics, save them into a database and visualize them
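Such a block of calculations could look roughly like the sketch below. It assumes that the true durations are already available for the batch and uses an in-memory SQLite database from the standard library as a stand-in for whatever database the pipeline really uses. The function names, the table layout and the sample numbers are assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timezone


def calculate_metrics(durations, predictions):
    """Compute a few simple checks for one batch."""
    errors = [abs(p - y) for p, y in zip(predictions, durations)]
    return {
        "n_rows": len(durations),
        "mean_prediction": sum(predictions) / len(predictions),
        "mean_absolute_error": sum(errors) / len(errors),
    }


def save_metrics(connection, metrics):
    """Append one row per metric so the values can be plotted over time."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS metrics (timestamp TEXT, name TEXT, value REAL)"
    )
    timestamp = datetime.now(timezone.utc).isoformat()
    connection.executemany(
        "INSERT INTO metrics VALUES (?, ?, ?)",
        [(timestamp, name, float(value)) for name, value in metrics.items()],
    )
    connection.commit()


connection = sqlite3.connect(":memory:")  # a file path or another database in practice
batch_metrics = calculate_metrics(
    durations=[10.0, 20.0, 30.0], predictions=[12.0, 18.0, 33.0]
)
save_metrics(connection, batch_metrics)
stored = dict(connection.execute("SELECT name, value FROM metrics").fetchall())
print(stored)
```

Storing one row per metric and timestamp keeps the table schema stable when new metrics are added later.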

This is what we want to implement:

![pipeline](monitoring_02.png)

We will continue with the New York taxi ride example. 

We will use MongoDB for logging:
* MongoDB is a NoSQL database, so we can push records that have different numbers of fields
* Our data is in JSON format (no fixed schema)
* To push our data to MongoDB, we need to connect to the host, create a database, create a collection and push our data there
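The steps above can be sketched with `pymongo` as follows. The connection URI, the database name `prediction_service`, the collection name `data` and the example record are assumptions; MongoDB creates the database and the collection on the first insert, so no separate creation call is needed.

```python
def get_collection(
    uri="mongodb://localhost:27017",  # assumed address of the MongoDB host
    db_name="prediction_service",     # assumed database name
    collection_name="data",           # assumed collection name
):
    """Connect to the host and return the collection (created on first insert)."""
    # Imported here so the helper below can be used without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    return client[db_name][collection_name]


def save_to_db(collection, record, prediction):
    """Store the input record together with the model's prediction."""
    document = dict(record)  # copy so the caller's dict is not modified
    document["prediction"] = float(prediction)
    collection.insert_one(document)
    return document


# Usage, assuming a MongoDB instance is reachable at the URI above:
# collection = get_collection()
# save_to_db(collection, {"PULocationID": 10, "DOLocationID": 50, "trip_distance": 40}, 23.5)
```

Because documents in a collection do not need to share a schema, records with additional or missing fields can be stored in the same way.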

We need to implement online monitoring:
* We need to send our data from the prediction service to the monitoring service
* The monitoring service calculates the metrics and saves them into a Prometheus database
* The Prometheus database provides the data to the Grafana service for visualization
* Both Prometheus and Grafana are open source
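The first step, sending data from the prediction service to the monitoring service, could be sketched like this using only the standard library. The URL, the payload layout and the function names are hypothetical; the real endpoint depends on the monitoring service that is used. The sketch deliberately swallows connection errors so that a monitoring outage does not break predictions.

```python
import json
import urllib.request

# Hypothetical address of the monitoring service; adjust to your setup.
MONITORING_URL = "http://localhost:8085/iterate"


def build_monitoring_request(record, prediction, url=MONITORING_URL):
    """Package the record and its prediction as a JSON POST request."""
    payload = dict(record, prediction=float(prediction))
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


def send_to_monitoring(record, prediction, url=MONITORING_URL, timeout=1.0):
    """Send the data; return False instead of raising if monitoring is unreachable."""
    request = build_monitoring_request(record, prediction, url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:  # URLError, HTTPError and timeouts are all OSError subclasses
        return False
```

The short timeout is a design choice: the prediction service should answer its caller quickly even when the monitoring service is slow.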

We can then additionally include batch monitoring:

![batch monitoring](monitoring_03.png)

* Include a Prefect flow that collects data from MongoDB and calculates metrics
* From that, we can create a visual report in HTML
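The logic inside such a flow might look like the following dependency-free sketch. In the real pipeline the records would be read from MongoDB and the functions would be wrapped as Prefect tasks; both are left out here so the example runs on its own. The record keys `duration` and `prediction`, the function names and the report layout are assumptions.

```python
from html import escape
from statistics import mean


def calculate_metrics(records):
    """Summarise logged records (dicts with 'duration' and 'prediction' keys, assumed)."""
    errors = [abs(r["prediction"] - r["duration"]) for r in records]
    return {
        "rows": len(records),
        "mean prediction": round(mean(r["prediction"] for r in records), 2),
        "mean absolute error": round(mean(errors), 2),
    }


def render_report(metrics, title="Batch monitoring report"):
    """Render the metrics as a minimal standalone HTML page."""
    rows = "\n".join(
        f"<tr><td>{escape(str(name))}</td><td>{escape(str(value))}</td></tr>"
        for name, value in metrics.items()
    )
    return (
        f"<html><head><title>{escape(title)}</title></head><body>"
        f"<h1>{escape(title)}</h1>\n<table>\n{rows}\n</table></body></html>"
    )


def save_report(html, path="batch_report.html"):
    """Write the report to disk (not called below, to avoid side effects)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)


# Stand-in for the records that the flow would collect from MongoDB.
records = [
    {"duration": 10.0, "prediction": 12.0},
    {"duration": 20.0, "prediction": 18.0},
]
report_html = render_report(calculate_metrics(records))
```

Calling `save_report(report_html)` writes the page to a file that can be opened in a browser.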

We will use Docker to combine all of this!

### Example: New York Taxi data
* Our example (```predict_flask.py```) uses Flask to expose the model as a web service
* The prediction is returned as JSON
* Update this service with two more steps:
    * logging
    * sending data to the monitoring service
* We call the new file ```app.py``` and save it in ```prediction_service```
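A rough sketch of how the updated service could be structured is shown below. It is not the actual ```app.py```: the placeholder `predict` function, the `/predict` route, the app name and the two hook parameters are assumptions. The hooks stand for the two new steps (logging the record and sending it to the monitoring service) and default to no-ops so the sketch runs without MongoDB or a monitoring service.

```python
def predict(record):
    """Placeholder for the real model: a fixed 3 minutes per unit of distance."""
    return 3.0 * record["trip_distance"]


def handle_record(record, log=lambda doc: None, forward=lambda doc: None):
    """Predict, then run the two new steps: logging and forwarding to monitoring."""
    prediction = float(predict(record))
    document = {**record, "prediction": prediction}
    log(dict(document))      # step 1: logging, e.g. an insert into MongoDB
    forward(dict(document))  # step 2: send the data to the monitoring service
    return {"duration": prediction}


def create_app(log=lambda doc: None, forward=lambda doc: None):
    """Build the Flask app; Flask is imported here so the helpers work without it."""
    from flask import Flask, jsonify, request

    app = Flask("duration-prediction")

    @app.route("/predict", methods=["POST"])
    def predict_endpoint():
        # The prediction is returned to the caller as JSON.
        return jsonify(handle_record(request.get_json(), log=log, forward=forward))

    return app
```

With Flask installed, `create_app().run()` starts the service; passing real functions for `log` and `forward` enables the two additional steps.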