# Data Monitoring with Flask and whylogs


In this example we deploy a Flask app that serves an ML model and use whylogs to monitor the data flowing through the application. The model is a PyTorch image classifier based on DenseNet.


The API receives a JSON request containing the path of an image file. The path can point to a file the app can access locally, or to remote storage supported by `smart_open`, such as an `s3` bucket.
The response is also JSON and contains the classification of the image. Because the model is pretrained on ImageNet, the result is always one of the ImageNet classes.


For this example you will need to install:


In [None]:
!pip install Flask==2.0.1 torchvision==0.10.0 whylogs==0.6.1 requests==2.26.0

We begin by loading the model and the `whylogs` logger into memory when the application launches. The function `get_or_create_session` creates a local configuration or loads the default configuration file `.whylabs.yml`.
If you want to send your profiles directly to WhyLabs, set the following environment variables.

```python
import datetime
import json
import os

from flask import Flask, jsonify, request
from torchvision import models
from whylogs import get_or_create_session

# Optional: set these to send profiles directly to WhyLabs.
os.environ["WHYLABS_API_KEY"] = "<API-KEY>"
os.environ["WHYLABS_API_ENDPOINT"] = "<end point override. not required>"
os.environ["WHYLABS_DEFAULT_ORG_ID"] = "<your-org-id>"
os.environ["WHYLABS_DEFAULT_DATASET_ID"] = "<your-default-dataset-id>"

session = get_or_create_session()
# Write out a new profile every 30 seconds.
whylog_logger = session.logger(
    tags={"datasetId": "<override-dataset-id>"},
    dataset_timestamp=datetime.datetime.now(datetime.timezone.utc),
    with_rotation_time="30s",
)

# Maps a class index (as a string) to a (class_id, class_name) pair.
with open("imagenet_class_index.json") as f:
    imagenet_class_index = json.load(f)

model = models.densenet121(pretrained=True)
model.eval()  # switch to inference mode

app = Flask(__name__)
```
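
If you rely on these variables, it can help to check at startup that they are actually set. The sketch below is one way to do that with the standard library only. The helper name `missing_whylabs_settings` is hypothetical, and treating the API key, org ID, and dataset ID as required is an assumption based on the placeholders above (the endpoint override is marked as not required).

```python
import os

# Assumed to be required, based on the placeholders above.
# The endpoint override is optional, so it is not checked.
REQUIRED_WHYLABS_VARS = (
    "WHYLABS_API_KEY",
    "WHYLABS_DEFAULT_ORG_ID",
    "WHYLABS_DEFAULT_DATASET_ID",
)


def missing_whylabs_settings(environ=None):
    """Return the names of the assumed-required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_WHYLABS_VARS if not environ.get(name)]
```

Calling `missing_whylabs_settings()` before creating the session returns an empty list when everything is in place, and otherwise names the variables that still need a value.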



## Receiving Request and Monitoring Inputs

We set the endpoint for our app to `/predict`. This is the main point of interaction with the application, so it is important to monitor all the data coming in. In this case the request JSON includes a path to an image.


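```python
@app.route("/predict", methods=["POST"])
def predict():

    if request.method == "POST":
        # Path of the image to classify, taken from the JSON body.
        filepath = request.json.get("file", None)
        image = fetch_image(filepath)

        # Log the raw inputs: the requested path and the loaded image.
        whylog_logger.log({"filepath": filepath})
        whylog_logger.log_image(image)
```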

whylogs aggregates statistics for the incoming data over each rotation interval (30 seconds with the logger configured above), saving the profiles locally or sending them to WhyLabs for automated monitoring. This lets you monitor changes in the requests being made or the images being loaded.
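
The handler calls `fetch_image`, which is not defined in this example. A minimal sketch is shown below, assuming Pillow and `smart_open` are available in your environment (they are typically pulled in as dependencies of the packages installed earlier, but this is worth checking). The validation rule and the conversion to RGB are assumptions made for illustration, not part of the original app.

```python
def fetch_image(filepath):
    """Load an image from a local path or a URI that smart_open can read.

    Hypothetical helper: the original example does not show its implementation.
    """
    # Reject a missing or empty path before attempting any I/O.
    if not isinstance(filepath, str) or not filepath.strip():
        raise ValueError("the request must include a non-empty 'file' path")

    # Imported inside the function so that the check above does not
    # depend on the image and storage libraries being importable.
    from PIL import Image
    from smart_open import open as open_uri

    with open_uri(filepath, "rb") as f:
        # convert() reads the pixel data while the file is still open and
        # returns a 3-channel image, which is what the model input assumes.
        return Image.open(f).convert("RGB")
```

Raising `ValueError` for a missing path lets the handler decide how to report the problem, for example by returning an HTTP 400 response.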

### Data preparation 

To get the data into the form the model expects, we need to resize it, normalize it, and convert it to a PyTorch tensor.
We will use the transforms module found in torchvision. We could additionally log the transformed data.

```python
import torchvision.transforms as transforms

def transform_image(image):
    my_transforms = transforms.Compose([transforms.Resize(255),
                                        transforms.CenterCrop(224),
                                        transforms.ToTensor(),
                                        transforms.Normalize(
                                            [0.485, 0.456, 0.406],
                                            [0.229, 0.224, 0.225])])
    return my_transforms(image).unsqueeze(0)
```
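
`Normalize` subtracts a per-channel mean and divides by a per-channel standard deviation; the values used above are the usual ImageNet statistics. The plain-Python sketch below reproduces that arithmetic for a single RGB pixel that `ToTensor` has already scaled to the range [0, 1]. The helper name `normalize_pixel` is hypothetical and only serves to illustrate the formula.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Apply (value - mean) / std to each channel of one RGB pixel."""
    return tuple((value - m) / s for value, m, s in zip(rgb, mean, std))


# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel(IMAGENET_MEAN))
# A pure white pixel lands a bit more than two standard deviations above the mean.
print(normalize_pixel((1.0, 1.0, 1.0)))
```
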
### Inference
Putting it all together, we add the additional steps for the model inference and log the output info. 

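```python
import torch  # used for torch.no_grad()


@app.route("/predict", methods=["POST"])
def predict():

    if request.method == "POST":
        filepath = request.json.get("file", None)
        image = fetch_image(filepath)

        # Log the raw inputs.
        whylog_logger.log({"filepath": filepath})
        whylog_logger.log_image(image)

        tensor = transform_image(image)

        whylog_logger.log({"batch_size": tensor.shape[0]})

        # Call the model directly instead of model.forward();
        # no_grad() skips gradient tracking during inference.
        with torch.no_grad():
            outputs = model(tensor)
        # conf is the highest raw score (logit), p_index the matching class index.
        conf, p_index = outputs.max(1)
        predicted_idx = str(p_index.item())
        class_id, class_name = imagenet_class_index[predicted_idx]

        # Log the model outputs.
        whylog_logger.log({"confidence": conf.item()})
        whylog_logger.log({"class_id": class_id})
        whylog_logger.log({"class_name": class_name})

        return jsonify({"class_id": class_id, "class_name": class_name})


if __name__ == "__main__":
    # Start the Flask development server (port 5000 by default).
    app.run()
```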
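
Note that `outputs` holds raw scores (logits), so the value logged as `confidence` is not a probability. If a value between 0 and 1 is preferred, a softmax can be applied first; in the app that could be `torch.nn.functional.softmax(outputs, dim=1)`. The plain-Python sketch below only illustrates the computation; the `softmax` helper and the sample scores are made up for this example.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three classes
probabilities = softmax(logits)

# The predicted class is unchanged by softmax; only the scale of the score differs.
predicted_idx = probabilities.index(max(probabilities))
confidence = probabilities[predicted_idx]
print(predicted_idx, round(confidence, 3))
```

Logging a probability instead of a logit makes the `confidence` profile easier to compare across model versions, since its range is fixed.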

You can test the app by running it locally with

```bash
$~ python my_flask_app.py
```
and then use the snippet below to make requests to the app.

In [None]:
import requests

resp = requests.post("http://localhost:5000/predict",
                     json={"file": '../data/flower2.jpg'})
resp.json()

You can wait 30 seconds until the log rotates, or you can change the rotation time by passing a different `with_rotation_time` when creating the logger.
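
Instead of sleeping for a fixed time, the client can poll until a profile file appears. The sketch below uses only the standard library. `wait_for_profiles` is a hypothetical helper, and its default pattern simply mirrors the `output/*/*/*/*.bin` layout of this example.

```python
import glob
import time


def wait_for_profiles(pattern="output/*/*/*/*.bin", timeout=60.0, poll_interval=1.0):
    """Poll until at least one file matches `pattern` or `timeout` seconds pass.

    Returns the sorted list of matching paths (empty if the timeout expired).
    """
    deadline = time.monotonic() + timeout
    while True:
        files = sorted(glob.glob(pattern))
        if files or time.monotonic() >= deadline:
            return files
        time.sleep(poll_interval)
```

A timeout somewhat longer than the rotation interval leaves room for the first profile to be written.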

In [46]:
import time
import glob

time.sleep(30)
files = sorted(glob.glob("output/*/*/*/*.bin"))
files

['output/my_deployed_model/e5f8620a-8f15-4d41-a0e9-e9845c317377/protobuf/dataset_profile.2021-08-25_13-15-22.bin',
 'output/my_deployed_model/e5f8620a-8f15-4d41-a0e9-e9845c317377/protobuf/dataset_profile.2021-08-25_13-15-52.bin',
 'output/my_deployed_model/e5f8620a-8f15-4d41-a0e9-e9845c317377/protobuf/dataset_profile.2021-08-25_13-16-22.bin',
 'output/my_deployed_model/e5f8620a-8f15-4d41-a0e9-e9845c317377/protobuf/dataset_profile.2021-08-25_13-16-52.bin']
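
Each file name carries a timestamp, so the spacing between profiles can be checked directly. The sketch below parses names like the ones in the listing above using only the standard library. The helper `rotation_gaps` is hypothetical, and the file-name pattern is an assumption inferred from that listing rather than a documented format.

```python
import re
from datetime import datetime

# Pattern inferred from the file names listed above (an assumption).
TIMESTAMP_RE = re.compile(r"dataset_profile\.(\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.bin$")


def rotation_gaps(paths):
    """Return the gaps, in seconds, between consecutive profile timestamps."""
    stamps = []
    for path in paths:
        match = TIMESTAMP_RE.search(path)
        if match:  # skip files that do not follow the assumed naming scheme
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d_%H-%M-%S"))
    stamps.sort()
    return [(later - earlier).total_seconds() for earlier, later in zip(stamps, stamps[1:])]


names = [
    "dataset_profile.2021-08-25_13-15-22.bin",
    "dataset_profile.2021-08-25_13-15-52.bin",
    "dataset_profile.2021-08-25_13-16-22.bin",
]
print(rotation_gaps(names))  # [30.0, 30.0]
```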

You can load your profiles and view them individually with the viewer, or with some of the viz tools in the library.
Better still, send them to WhyLabs, where you can see distributional shifts, anomaly detection, and much more.

In [47]:
from whylogs import DatasetProfile
profile = DatasetProfile.read_protobuf(files[-1])

In [48]:
from whylogs.viz import profile_viewer

In [49]:
profile_viewer(profiles=[profile])

'/var/folders/pr/f715zv8x17b1v5vwydgv2gq40000gq/T/tmpr3be6_r7.html'