# Best practices

![Status](https://img.shields.io/static/v1.svg?label=Status&message=Finished&color=brightgreen)
[![Source](https://img.shields.io/static/v1.svg?label=GitHub&message=Source&color=181717&logo=GitHub)](https://github.com/particle1331/inefficient-networks/blob/master/docs/notebooks/mlops/04-deployment)
[![Stars](https://img.shields.io/github/stars/particle1331/inefficient-networks?style=social)](https://github.com/particle1331/inefficient-networks)

```text
𝗔𝘁𝘁𝗿𝗶𝗯𝘂𝘁𝗶𝗼𝗻: Notes for Module 6 of the MLOps Zoomcamp (2022) by DataTalks.Club.
```

---

## Introduction

- Look at the Lambda function code.
- Write unit tests and integration tests.
- In general, work on the code to make it better from an engineering point of view.

To start, copy all code from chapter 4.

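1. Get the Pipfile and install the dependencies from it.
2. Install pytest.
3. Write a first test in `model_test.py`.
4. Improve the code.
5. Next we want to test `lambda_function`, but the module carries too much unrelated logic and throws errors under test.
6. Move that logic into a new module, `model.py`.

With pytest installed and no tests written yet, running it collects nothing: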

```bash
❯ pytest
====================================================== test session starts =======================================================
platform darwin -- Python 3.9.12, pytest-7.1.2, pluggy-1.0.0
rootdir: /Users/particle1331/code/inefficient-networks/docs/notebooks/mlops/06-best-practices
plugins: anyio-3.6.1
collected 0 items

===================================================== no tests ran in 0.01s ======================================================
```

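This is the initial version of the refactored code. It focuses on prediction only and does not yet write results to the output stream. The handler in `lambda_function` becomes a thin wrapper around the model service: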

```python
import os
import model

RUN_ID = os.getenv('RUN_ID')

model_service = model.init(run_id=RUN_ID)


def lambda_handler(event, context):
    # pylint: disable=unused-argument
    return model_service.lambda_handler(event)
```

The prediction logic itself lives in `model.py`:

```python
import json
import mlflow
import base64


def load_model(run_id):
    model_path = f's3://mlflow-models-ron/1/{run_id}/artifacts/model'
    model = mlflow.pyfunc.load_model(model_path)
    return model


def base64_decode(encoded_data):
    decoded_data = base64.b64decode(encoded_data).decode('utf-8')
    ride_event = json.loads(decoded_data)
    return ride_event


class ModelService:

    def __init__(self, model, model_version=None):
        self.model = model
        self.model_version = model_version

    def prepare_features(self, ride):
        features = {}
        features['PU_DO'] = f"{ride['PULocationID']}_{ride['DOLocationID']}"
        features['trip_distance'] = ride['trip_distance']
        return features

    def predict(self, features):
        pred = self.model.predict(features)
        return float(pred[0])

    def lambda_handler(self, event):
        """Predict on batch of input events."""

        predictions_events = []

        for record in event['Records']:

            # Decode data from input kinesis stream
            encoded_data = record['kinesis']['data']
            ride_event = base64_decode(encoded_data)

            # Pickout id to match input to output
            ride = ride_event['ride']
            ride_id = ride_event['ride_id']

            # Make predictions using model
            features = self.prepare_features(ride)
            prediction = self.predict(features)

            # Package prediction event for output stream
            prediction_event = {
                'model': 'ride_duration_prediction_model',
                'version': self.model_version,
                'prediction': {'ride_duration': prediction, 'ride_id': ride_id},
            }

            predictions_events.append(prediction_event)

        return {'predictions': predictions_events}


def init(run_id: str):
    """Initialize model service."""

    model = load_model(run_id)
    model_service = ModelService(model=model, model_version=run_id)

    return model_service
```
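
Because `ModelService` receives the model through its constructor, the prediction path can be exercised without MLflow or S3 by passing in a stub. The sketch below is self-contained, so it uses a trimmed stand-in for `ModelService` (only `predict` is copied from `model.py`). `ModelMock` and the constant `10.0` are hypothetical, chosen only for illustration. In an actual test file you would import `ModelService` from `model` instead.

```python
class ModelMock:
    """Hypothetical stub that returns the same value for every input."""

    def __init__(self, value):
        self.value = value

    def predict(self, X):
        # Mimic a model that returns a sequence with one prediction
        return [self.value]


class ModelService:
    """Trimmed stand-in for model.ModelService (predict only)."""

    def __init__(self, model, model_version=None):
        self.model = model
        self.model_version = model_version

    def predict(self, features):
        pred = self.model.predict(features)
        return float(pred[0])


def test_predict():
    model_service = ModelService(model=ModelMock(10.0))
    features = {'PU_DO': '140_205', 'trip_distance': 2.05}

    assert model_service.predict(features) == 10.0
```

The stub keeps the test fast and independent of any trained artifact. It checks the plumbing around the model, not the quality of the predictions.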

Rebuild the Docker image and run the container with the refactored code:

```bash
docker build -t stream-model-duration:v2 .
```

```bash
docker run -it --rm -p 8080:8080 --env-file .env stream-model-duration:v2
```

Then run `python test_docker.py` to check that the container still works after the refactor.
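
The contents of `test_docker.py` are not shown in these notes. As a rough sketch of what such a check could look like, the code below builds an event in the shape that `lambda_handler` reads and posts it to the container. It assumes the image is based on an AWS Lambda base image, so that the runtime interface emulator serves the invocation endpoint on port 8080. `build_event` and `invoke` are hypothetical helper names.

```python
import base64
import json
import urllib.request

# Invocation endpoint of the Lambda runtime interface emulator.
# Assumption: the container was built from an AWS Lambda base image.
URL = 'http://localhost:8080/2015-03-31/functions/function/invocations'


def build_event(ride, ride_id):
    """Wrap a ride in the Kinesis record shape that lambda_handler reads."""
    payload = json.dumps({'ride': ride, 'ride_id': ride_id})
    data = base64.b64encode(payload.encode('utf-8')).decode('utf-8')
    return {'Records': [{'kinesis': {'data': data}}]}


def invoke(event, url=URL):
    """POST the event to the running container and return the parsed JSON."""
    request = urllib.request.Request(
        url,
        data=json.dumps(event).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


event = build_event(
    ride={'PULocationID': 130, 'DOLocationID': 205, 'trip_distance': 3.66},
    ride_id=123,
)
```

With the container running, calling `invoke(event)` should return a dictionary with a `predictions` key, matching the return value of `ModelService.lambda_handler`.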

With the logic in `model.py`, we can write unit tests for `base64_decode` and `prepare_features`:

```python
from model import ModelService
from model import base64_decode


def test_base64_decode():
    base64_input = "eyAgICAgICAgICAicmlkZSI6IHsgICAgICAgICAgICAgICJQVUxvY2F0aW9uSUQiOiAxMzAsICAgICAgICAgICAgICAiRE9Mb2NhdGlvbklEIjogMjA1LCAgICAgICAgICAgICAgInRyaXBfZGlzdGFuY2UiOiAzLjY2ICAgICAgICAgIH0sICAgICAgICAgICJyaWRlX2lkIjogMTIzICAgICAgfQ=="

    actual_result = base64_decode(base64_input)
    expected_result = {
        "ride": {
            "PULocationID": 130,
            "DOLocationID": 205,
            "trip_distance": 3.66,
        },
        "ride_id": 123,
    }

    assert actual_result == expected_result


def test_prepare_features():
    """Test preprocessing."""

    ride = {
        "PULocationID": 140,
        "DOLocationID": 205,
        "trip_distance": 2.05
    }

    model_service = ModelService(model=None)

    actual_features = model_service.prepare_features(ride)
    
    expected_features = {
        'PU_DO': '140_205',
        'trip_distance': 2.05,
    }

    assert actual_features == expected_features

```
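
The long string in `test_base64_decode` is a JSON document encoded as base64, which is how Kinesis record data arrives in the Lambda event. A minimal sketch of building such an input from a dictionary follows. `base64_encode` is a hypothetical helper that is not part of `model.py`, and `base64_decode` is repeated here only to keep the example self-contained.

```python
import base64
import json


def base64_encode(ride_event):
    """Hypothetical inverse of base64_decode, for building test inputs."""
    raw = json.dumps(ride_event).encode('utf-8')
    return base64.b64encode(raw).decode('utf-8')


def base64_decode(encoded_data):
    # Same logic as base64_decode in model.py
    decoded_data = base64.b64decode(encoded_data).decode('utf-8')
    return json.loads(decoded_data)


ride_event = {
    'ride': {
        'PULocationID': 130,
        'DOLocationID': 205,
        'trip_distance': 3.66,
    },
    'ride_id': 123,
}

encoded = base64_encode(ride_event)

# Round trip: decoding the encoded event gives back the original dictionary
assert base64_decode(encoded) == ride_event
```

The string produced this way will generally differ from the one hardcoded in the test, since whitespace in the JSON text changes the encoding. What matters for the test is that the decoded dictionary is the same.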