# Monitoring Performance

Performance-monitoring data is generally collected in log files, so events should be logged at the most granular level possible.

Three log files hold the monitoring data. The logging was set up in three steps:

1. Write unit tests for a logger and a logging API endpoint
2. Add logging for train and test
3. Add logging for predict and test
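
The logging steps above can be sketched as follows. This is a minimal sketch, not the project's actual `logger.py`. The function names `append_log_row` and `log_train_event`, the `train-YYYY-MM.log` naming for non-test runs (modelled on `predict-2020-12.log`), and the use of the `csv` module are all assumptions. The column names are taken from the train log output shown later in this notebook.

```python
import csv
import os
import time
import uuid


def append_log_row(log_path, header, row):
    """Append one event to a CSV log, writing the header if the file is new."""
    log_dir = os.path.dirname(log_path)
    if log_dir:
        os.makedirs(log_dir, exist_ok=True)
    write_header = not os.path.exists(log_path)
    with open(log_path, "a", newline="") as handle:
        writer = csv.writer(handle)
        if write_header:
            writer.writerow(header)
        writer.writerow(row)


def log_train_event(log_dir, tag, period, rmse, model_version, note, runtime,
                    test=False):
    """Hypothetical helper: record one training run as a single log row."""
    # Assumed naming: a fixed file for tests, one file per month otherwise.
    name = "train-test.log" if test else "train-{}.log".format(time.strftime("%Y-%m"))
    header = ["unique_id", "timestamp", "tag", "period", "rmse",
              "model_version", "model_version_note", "runtime"]
    row = [str(uuid.uuid4()), time.time(), tag, period, rmse,
           model_version, note, runtime]
    path = os.path.join(log_dir, name)
    append_log_row(path, header, row)
    return path
```

Writing one row per training run, each with its own `unique_id` and `timestamp`, keeps the data at the event level, so it can be aggregated in any way later.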


In [1]:
import os
import sys
import csv
import numpy as np
import pandas as pd

## Reading the log files

The ``logs`` folder holds the three log files. The code cells that follow read these logs and extract the monitoring data.

```
├── app.py
├── Dockerfile
├── model.py
├── cslib.py
├── logger.py
├── README.md
├── requirements.txt
├── models
├── Notebook_Case_Study_part1_ingest_visuallization.ipynb
├── Notebook_Model_Performance_comparison.ipynb
├── Notebook_Monitoring.ipynb
├── Notebook_UnitTest.ipynb
├── UnitTest_allTestCases.py
├── UnitTest_Model.py
├── UnitTests_API.py
├── UnitTests_logger.py
├── templates
│   ├── base.html
│   ├── dashboard.html
│   ├── index.html
│   └── running.html
│
└── logs
    ├── predict-2020-12.log
    ├── predict-test.log
    └── train-test.log
```
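
Instead of hard-coding each path, the log files in the tree above could be discovered and grouped by type. The sketch below assumes the type is the part of the file name before the first hyphen (`train` or `predict`). The helper name `find_logs` is hypothetical.

```python
import glob
import os


def find_logs(log_dir="logs"):
    """Group the *.log files in log_dir by the prefix before the first hyphen."""
    groups = {}
    for path in sorted(glob.glob(os.path.join(log_dir, "*.log"))):
        kind = os.path.basename(path).split("-", 1)[0]  # e.g. "train", "predict"
        groups.setdefault(kind, []).append(path)
    return groups
```

Each returned path can then be passed to `pd.read_csv` as in the cells that follow.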

In [2]:

df = pd.read_csv("logs/train-test.log")
df

Unnamed: 0,unique_id,timestamp,tag,period,rmse,model_version,model_version_note,runtime
0,0af4047d-4a0b-4591-a6c1-473e90c5e311,1609415000.0,netherlands,"('2017-12-01', '2019-05-31')",{'rmse': 0.5},0.1,test model,00:00:01
1,b6dc69ca-9219-44dd-b9d8-0d74e9fce887,1609415000.0,netherlands,"('2017-12-01', '2019-05-31')",{'rmse': 0.5},0.1,test model,00:00:01
2,da66e5e5-7b40-4288-80a6-d6eb6bb45e05,1609415000.0,united_kingdom,"('2017-12-10', '2019-05-30')",{'rmse': 33864.0},0.1,supervised learning model for time-series,000:00:05
3,d7802fe8-3ca8-4e95-a8dc-05851c115a52,1609415000.0,all,"('2017-12-02', '2019-05-30')",{'rmse': 36343.0},0.1,supervised learning model for time-series,000:00:03
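
In the output above, the `rmse` column holds a dictionary written out as a string, and `runtime` is an `HH:MM:SS` string with inconsistent zero padding. Both need parsing before they can be plotted or compared. The following is a minimal sketch of that step. The helper names `parse_rmse` and `runtime_to_seconds` are hypothetical, and the sample values are copied from the rows shown above.

```python
import ast


def parse_rmse(cell):
    """Turn a logged string such as "{'rmse': 0.5}" into the float 0.5."""
    return float(ast.literal_eval(cell)["rmse"])


def runtime_to_seconds(text):
    """Convert an "HH:MM:SS" string (any zero padding) into whole seconds."""
    hours, minutes, seconds = (int(part) for part in text.strip().split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Values copied from the train-test.log output above.
rmse_values = [parse_rmse(c) for c in ["{'rmse': 0.5}", "{'rmse': 33864.0}"]]
runtime_values = [runtime_to_seconds(t) for t in ["00:00:01", "000:00:05"]]
```

With the DataFrame loaded, the same helpers can be applied column-wise, for example `df["rmse"].apply(parse_rmse)`.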


In [3]:
df = pd.read_csv("logs/predict-test.log")
df

Unnamed: 0,unique_id,timestamp,country,y_pred,y_proba,target_date,model_version,runtime
0,ba20ba06-318b-4f06-908e-771660e52f83,1609415000.0,netherlands,[0],"[0.6, 0.4]",2019-01-05,0.1,00:00:02
1,6f0814ab-a4ac-4315-9f59-6c093ee10878,1609415000.0,netherlands,[339.16883333],,2018-08-01,0.1,000:00:23


In [4]:
df = pd.read_csv("logs/predict-2020-12.log")
df

Unnamed: 0,unique_id,timestamp,country,y_pred,y_proba,target_date,model_version,runtime
0,05f6092c-f37d-46da-a40f-391a111035ff,1609345000.0,netherlands,[0],"[0.6, 0.4]",2019-01-05,0.1,00:00:02
1,21ace07e-dd47-49fa-b1df-e2fd24385ec4,1609345000.0,netherlands,[0],"[0.6, 0.4]",2019-01-05,0.1,00:00:02
2,c4e36713-2c89-4326-bc17-f38ee6f8ce70,1609346000.0,netherlands,[0],"[0.6, 0.4]",2019-01-05,0.1,00:00:02
3,e84465e0-30f3-4136-b3ee-04ab8a4e50fc,1609395000.0,netherlands,[0],"[0.6, 0.4]",2019-01-05,0.1,00:00:02
4,c118cc57-c314-4e38-8662-7c36d715135d,1609396000.0,netherlands,[0],"[0.6, 0.4]",2019-01-05,0.1,00:00:02
5,a68f0018-4bcc-49b6-a6a4-a884b297c20a,1609414000.0,netherlands,[0],"[0.6, 0.4]",2019-01-05,0.1,00:00:02
6,3cdacb47-bea4-4e9d-931c-1fb0172abcd0,1609414000.0,netherlands,[0],"[0.6, 0.4]",2019-01-05,0.1,00:00:02
7,d444e348-4e5f-47d8-bf28-f959e71c1955,1609415000.0,netherlands,[0],"[0.6, 0.4]",2019-01-05,0.1,00:00:02
8,84fafcf4-bc02-4af2-98f5-bb3297b6b55d,1609415000.0,netherlands,[0],"[0.6, 0.4]",2019-01-05,0.1,00:00:02
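
One simple monitoring signal that can be read from this log is the number of predictions served per day. The sketch below assumes the `timestamp` column holds Unix epoch seconds in UTC, which is consistent with the December 2020 file name. The function name `daily_prediction_counts` is hypothetical, and the sample timestamps are copied from the output above.

```python
import pandas as pd


def daily_prediction_counts(df):
    """Count log rows per calendar day (UTC) from epoch-second timestamps."""
    when = pd.to_datetime(df["timestamp"], unit="s", utc=True)
    return when.dt.strftime("%Y-%m-%d").value_counts().sort_index()


# Timestamps copied from the predict-2020-12.log output above.
sample = pd.DataFrame({
    "timestamp": [1609345000.0, 1609345000.0, 1609346000.0,
                  1609395000.0, 1609396000.0, 1609414000.0,
                  1609414000.0, 1609415000.0, 1609415000.0],
})
counts = daily_prediction_counts(sample)
```

The same grouping could be extended with `country` or `model_version` to compare traffic across models.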
