# Model Deployment
Quite a journey to get here! Do you remember that at the end of step 3 (model training) we saved the trained model as a bin file?<br>
Well, now it's time to serve the model to the people who have been waiting for a way to request predictions 😊. __REST APIs__ are the most common way to do this.

In this notebook, we'll review how to do so in the simplest terms.

## Web Service API
Our prediction model needs to be wrapped in a web service that is available to the outside world. When it comes to creating API web services, there are plenty of options. [Flask](https://palletsprojects.com/p/flask/) is a micro web application framework that makes creating web APIs a breeze. [FastAPI](https://fastapi.tiangolo.com/) is another fantastic option.

The API wrapper performs three tasks:
- Receive the input as a JSON string via the POST method
- Load the prediction model from a saved file
- Run the prediction on the input data and return the result as the response

This is the code that achieves the above goals using Flask:

```python
# required library imports
import json
import pickle
from pandas import DataFrame
from numpy import expm1
from flask import Flask, request
from waitress import serve

# path to the folder containing our model
MODEL_PATH = './model/'


def load_model(path_to_model):
    with open(path_to_model, 'rb') as model_file:
        model = pickle.load(model_file)
    return model


def get_prediction(model, input_data):
    # input data is a json string
    # we have to convert it back to a pandas dataframe object
    # scikit-learn's ColumnTransformer only accepts an array or pandas DataFrame
    dict_obj = json.loads(input_data)
    X = DataFrame.from_dict(dict_obj)
    y_pred = expm1(model.predict(X)[0])

    # compose result dictionary and return it as a json string
    result = {
        'prediction': float(y_pred),
    }
    return json.dumps(result)


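app = Flask('prediction_app')


@app.route('/predict', methods=['POST'])
def predict():
    # the client posts a JSON-encoded string, so get_json() returns a str here
    input_json = request.get_json()
    # note: the model is reloaded on every request, which keeps the code simple
    model = load_model(MODEL_PATH+'model.bin')
    prediction_result = get_prediction(model, input_json)

    # get_prediction() already returns a JSON string, so label the response as JSON
    return app.response_class(prediction_result, mimetype='application/json')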



if __name__ == "__main__":
    serve(app, host='0.0.0.0', port=9696)
```

There are three functions in our prediction file.<br>
- The first one, __*load_model()*__, loads the model saved as a pickle file. As a side note, the saved model includes the data preprocessing pipeline.<br>
- The second function, __*get_prediction()*__, runs the prediction using a model and the input data. The __input data must have the same dimensions__ as our cleaned dataset, meaning it must contain exactly the features we used for training.
- The last one, __*predict()*__, is the entry point for our API endpoint, handled by Flask. Notice the _route()_ decorator and its argument above the function definition. The route can be any name you like; the outside world will use it as the API's entry point.
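Since the input must carry exactly the training features, it can help to check the payload before handing it to the model. The following is a minimal sketch, not part of the original script: the helper name `find_feature_mismatch` and the feature list `EXPECTED_FEATURES` (including `cont1`) are hypothetical, and in practice the list would come from the training step.

```python
import json


def find_feature_mismatch(input_json, expected_features):
    """Compare the payload's columns against the features used for training.

    Returns (missing, unexpected) as two sorted lists of column names.
    """
    payload = json.loads(input_json)
    received = set(payload)
    expected = set(expected_features)
    return sorted(expected - received), sorted(received - expected)


# Hypothetical feature list, for illustration only
EXPECTED_FEATURES = ["cat1", "cat2", "cont1"]

missing, unexpected = find_feature_mismatch(
    '{"cat1": {"0": "A"}, "cat2": {"0": "B"}, "id": {"0": 7}}',
    EXPECTED_FEATURES,
)
print(missing)     # ['cont1']
print(unexpected)  # ['id']
```

A check like this could run at the top of _get_prediction()_ and turn a confusing model error into a clear message about which columns are wrong.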

The last line uses the _waitress_ web server to run the Flask app. Waitress is a lightweight WSGI server that has no dependencies except ones which live in the Python standard library. It runs on both Windows and UNIX-based operating systems.
The code above is available in the _script/inference.py_ file. We need to run this file to make our API accessible.<br>
To do so, open a command line inside the _script_ folder and type one of the following commands:
```
python inference.py
# or, equivalently
python -m inference
```

That's it. Now our Flask app is running, waiting to serve requests on port 9696 on every IP address our machine has (that is what the host _0.0.0.0_ means). Note that in the last line of _inference.py_, which calls the _serve()_ function, you can pass any port number as the port parameter, as long as it is not already used by another app or service running on your machine.
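If you are unsure whether a port is already taken, you can probe it before starting the server. This is a rough sketch using only the standard library; the helper name `is_port_free` is an assumption for illustration. It simply tries to bind to the port, so it can report a port as busy for a short while after another program has released it.

```python
import socket


def is_port_free(port, host="0.0.0.0"):
    """Return True if this process can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # typically "address already in use"
            return False
    return True


port = 9696
if not is_port_free(port):
    print(f"Port {port} is busy, pick another one for serve()")
```

Keep in mind that the check and the actual _serve()_ call are two separate steps, so another program could still grab the port in between.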

## Mocking and Testing an API Call
Now that we have our prediction web service running on our machine, let's mock a call from outside and see what we get as a result.<br>
We'll need to provide the API call with some input data. The most accessible data we have is our test dataset. Let's load it and extract a random data point as a JSON string:

```python
import pandas as pd

DATA_PATH = './script/data/'
test_data = pd.read_csv(DATA_PATH+'test_cleaned.csv.gz')
```

```python
# choose a sample row
# it should have all the columns except the 'id' column, which was not used for training
sample = test_data.drop(columns=['id'], errors='ignore').sample(n=1, random_state=1024).to_json()
sample
```

'{"cat1":{"5410":"A"},"cat2":{"5410":"A"},"cat4":{"5410":"A"},"cat5":{"5410":"B"},"cat6":{"5410":"A"},"cat8":{"5410":"A"},"cat9":{"5410":"A"},"cat10":{"5410":"A"},"cat11":{"5410":"A"},"cat12":{"5410":"A"},"cat13":{"5410":"A"},"cat14":{"5410":"A"},"cat16":{"5410":"A"},"cat17":{"5410":"A"},"cat18":{"5410":"A"},"cat19":{"5410":"A"},"cat20":{"5410":"A"},"cat21":{"5410":"A"},"cat23":{"5410":"A"},"cat24":{"5410":"A"},"cat25":{"5410":"A"},"cat26":{"5410":"A"},"cat27":{"5410":"A"},"cat28":{"5410":"A"},"cat29":{"5410":"A"},"cat30":{"5410":"A"},"cat31":{"5410":"A"},"cat32":{"5410":"A"},"cat33":{"5410":"A"},"cat34":{"5410":"A"},"cat35":{"5410":"A"},"cat36":{"5410":"A"},"cat37":{"5410":"A"},"cat38":{"5410":"B"},"cat39":{"5410":"A"},"cat40":{"5410":"A"},"cat41":{"5410":"A"},"cat42":{"5410":"A"},"cat43":{"5410":"A"},"cat44":{"5410":"A"},"cat45":{"5410":"A"},"cat46":{"5410":"A"},"cat47":{"5410":"A"},"cat48":{"5410":"A"},"cat49":{"5410":"A"},"cat50":{"5410":"A"},"cat51":{"5410":"A"},"cat52":{"5410":"A
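The output above is column-oriented: each feature name maps to a small dictionary of row label to value, which is the shape _DataFrame.from_dict()_ reads back on the server side. To make that layout easier to see, here is a small sketch that regroups such a payload by row. The helper `to_records` and the `cont1` feature are hypothetical and used only for illustration.

```python
import json

# Hypothetical two-feature payload in the same column-oriented shape as `sample`
payload = '{"cat1": {"5410": "A"}, "cont1": {"5410": 0.5}}'


def to_records(column_oriented_json):
    """Turn {column: {row_label: value}} into {row_label: {column: value}}."""
    columns = json.loads(column_oriented_json)
    records = {}
    for column, cells in columns.items():
        for row_label, value in cells.items():
            records.setdefault(row_label, {})[column] = value
    return records


print(to_records(payload))  # {'5410': {'cat1': 'A', 'cont1': 0.5}}
```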

It's time to mock an outgoing API request to our prediction endpoint and see the results:

```python
import requests

api_url = "http://localhost:9696/predict"

api_response = requests.post(url=api_url, json=sample).json()
print(api_response)
```


{'prediction': 2284.148193359375}


There you go, we have our API endpoint alive and well!
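If you want to exercise the client side without the real model service running, a stub endpoint can stand in for it. The sketch below uses only the standard library and is an assumption-laden illustration, not part of the project: `StubPredictHandler` and `call_predict` are hypothetical names, and the stub always answers with a fixed prediction of 1.0.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubPredictHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for /predict that always returns a fixed value."""

    def do_POST(self):
        # read the request body so the connection is left in a clean state
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        body = json.dumps({"prediction": 1.0}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the notebook output quiet


def call_predict(url, payload, timeout=5):
    """POST payload as JSON and return the decoded reply; raises on HTTP errors."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # an empty ProxyHandler bypasses any system proxy for this local call
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# port 0 lets the operating system pick any free port for the stub
server = HTTPServer(("127.0.0.1", 0), StubPredictHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

stub_url = f"http://127.0.0.1:{server.server_port}/predict"
result = call_predict(stub_url, '{"cat1": {"0": "A"}}')
print(result)  # {'prediction': 1.0}

server.shutdown()
server.server_close()
```

The timeout and the exception raised on HTTP errors are the parts worth carrying over to a real client, so that a slow or failing service does not go unnoticed.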