In [17]:
from sklearn.datasets import load_boston
import pickle
import requests
import pandas as pd
import numpy as np
import seaborn as sns

# Documentation of ML-API

## Idea

In our third-semester project `Immobilienpreisrechner` we built a machine learning model to predict housing prices as accurately as possible. One task of that project was to implement an API for the model. That API was not well structured, and it could not pre-process incoming data in the same way as the data we used to build and train the model, because we had no clear data pipeline. In this small project for the course Web Data Collection I wanted to fix this and lay the groundwork for a possible roll-out of the model.

## Description of the Service

This service gives me a framework for implementing a machine learning model in the future and rolling it out to a customer or a colleague.
The API covers the relevant operations: prediction and scoring for customers, plus methods for a maintainer to update the model parameters and to delete a model.


### Load Sample Data

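Now we load the sample data imported above into memory [(Pedregosa et al., 2011)](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_boston.html). Note that `load_boston` was deprecated in scikit-learn 1.0 and removed in version 1.2, so this cell requires an older scikit-learn release.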

In [18]:
X, y = load_boston(return_X_y=True)

### Local Test Client

In [19]:
from api import app

In [20]:
app = app.test_client()

In [21]:
r = app.get('/check')

In [22]:
r.get_json()

200

In [50]:
# Print All models on server
response = app.get('/get_all_models')
response.get_json()

['model_1.pkl']

### Create new Model Endpoint

Endpoint: `/create_model`

Creates a new model with a suitable id in the src folder and returns that id for further use.
Because a Python object cannot be uploaded to the API, it is currently only possible to adapt an existing model. For example, if only Lasso regression models are stored in the src folder, you can only adapt the Lasso model.
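How the server picks the id is not shown in this notebook. One plausible scheme, consistent with the run below (where `model_1.pkl` exists and the new model gets id 2), is to take the highest id among the stored files and add one. The helper `next_model_id` and the file-name pattern are assumptions for illustration, not the API's actual implementation.

```python
import re


def next_model_id(filenames):
    """Return one more than the highest id found in names like 'model_3.pkl'.

    Hypothetical helper: files that do not match the pattern are ignored,
    and an empty folder yields id 1.
    """
    ids = []
    for name in filenames:
        match = re.fullmatch(r"model_(\d+)\.pkl", name)
        if match:
            ids.append(int(match.group(1)))
    return max(ids, default=0) + 1


# Same listing as returned by /get_all_models before the new model is created
print(next_model_id(["model_1.pkl"]))
```

A scheme like this never reuses an id while a higher-numbered model still exists, which keeps ids stable for clients that stored them.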

In [51]:
# The URL of our API (no trailing slash, because every path below starts with '/')
base_url = 'https://sko-webapp.herokuapp.com'
requests.get(base_url + '/check')

<Response [200]>

In [52]:
# Print All models on server
response = requests.get(url=base_url+'/get_all_models')
response.json()

['model_1.pkl']

In [26]:
new_model_params = {'alpha': 3}

In [53]:
# Send request to create new model
response = requests.put(url=base_url+'/create_model', json=new_model_params)
print('HTTP-Statuscode:', response.status_code)

HTTP-Statuscode: 200


In [54]:
response.json()

{'model_id': 2}

In [56]:
# Print All models on server
response = requests.get(url=base_url+'/get_all_models')
response.json()

['model_2.pkl', 'model_1.pkl']

### Prediction Endpoint

Endpoint: `/predict`

The first endpoint we check is the prediction endpoint. It accepts data sent with the POST method; the model id is appended to the path, as in `/predict/1`.
We first convert our NumPy array to a DataFrame, which makes it easy to serialize the data to the JSON format the API expects.

In [57]:
# Create dataframe to convert it to json
data = pd.DataFrame(X)
data.head(2)

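|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.00632 | 18.0 | 2.31 | 0.0 | 0.538 | 6.575 | 65.2 | 4.0900 | 1.0 | 296.0 | 15.3 | 396.9 | 4.98 |
| 1 | 0.02731 | 0.0 | 7.07 | 0.0 | 0.469 | 6.421 | 78.9 | 4.9671 | 2.0 | 242.0 | 17.8 | 396.9 | 9.14 |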


In [58]:
# Convert to JSON
data = data.to_json(orient='records')
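To make the payload shape concrete: with `orient='records'`, each row becomes one JSON object whose keys are the column labels as strings. The sketch below imitates that shape with the standard library only, so it can be read without pandas installed. `rows_to_records` is a hypothetical helper, and its whitespace differs slightly from what pandas emits.

```python
import json


def rows_to_records(rows):
    """Serialize a list of rows as a JSON array of objects, one per row.

    Keys are the positional column labels "0", "1", ... as strings,
    which mirrors the record layout for a DataFrame with default columns.
    """
    records = [{str(i): value for i, value in enumerate(row)} for row in rows]
    return json.dumps(records)


# First three columns of the two rows shown in the table above
rows = [[0.00632, 18.0, 2.31], [0.02731, 0.0, 7.07]]
payload = rows_to_records(rows)
print(payload)

# Parsing the string again yields a list with one dict per row
decoded = json.loads(payload)
print(len(decoded), sorted(decoded[0].keys()))
```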

In [59]:
# Send request
response = requests.post(url=base_url+'/predict/1', json=data)
print('HTTP-Statuscode:', response.status_code)

HTTP-Statuscode: 200


In [60]:
print('Predictions : \n', response.json()[:5])

Predictions : 
 [30.003843377016807, 25.025562379053135, 30.567596718601614, 28.607036488728102, 27.94352423287301]


### Score Endpoint

Endpoint: `/score`

Next we test the scoring endpoint. It returns the score the model achieves on the uploaded data, so we have to send both X and y encoded as JSON.

In [61]:
y_ = pd.DataFrame(y)
y_ = y_.to_json(orient='records')

In [62]:
data_ = dict(X=data, y=y_)

In [63]:
response = requests.post(url=base_url+'/score/2', json=data_)
print('HTTP-Statuscode:', response.status_code)

HTTP-Statuscode: 200


In [64]:
print('R2-Score: ',response.json())

R2-Score:  0.7406426641094094
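The notebook labels the returned value as an R2 score. Assuming the API reports the usual coefficient of determination, it can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The function `r_squared` and the toy numbers below are illustrative only and are not taken from the API.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy data: SS_res = 0.75 and SS_tot = 20, so the score is 1 - 0.0375
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.5]
score = r_squared(y_true, y_pred)
print(score)
```

A score of 1.0 means the predictions match the targets exactly, while a model that always predicts the mean of `y_true` scores 0.0.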


### Model Coefficients Endpoint

Endpoint: `/model_params`

This endpoint is reached with a GET request and returns the model parameters. Which parameters are returned depends on the model type, so the endpoint has to be adapted whenever a different kind of model is served.

In [65]:
response = requests.get(url=base_url + '/model_params/2')
print('HTTP-Statuscode:', response.status_code)

HTTP-Statuscode: 200


In [66]:
model_coef = response.json()
model_coef

{'_normalize': False,
 'alpha': 3,
 'coef_': [-0.10801135783679626,
  0.04642045836688073,
  0.020558626367068414,
  2.6867338193449006,
  -17.76661122830023,
  3.809865206809228,
  0.0006922246403443731,
  -1.4755668456002566,
  0.3060494789851648,
  -0.012334593916573912,
  -0.95274723170729,
  0.009311683273793841,
  -0.5247583778554887],
 'copy_X': True,
 'fit_intercept': True,
 'intercept_': 36.45948838508981,
 'max_iter': None,
 'n_features_in_': 13,
 'n_iter_': None,
 'normalize': 'deprecated',
 'positive': False,
 'random_state': None,
 'solver': 'auto',
 'tol': 0.001}
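For a linear model, the `coef_` and `intercept_` values in this response are enough to reproduce a prediction on the client side: the prediction is the intercept plus the dot product of the coefficients with the feature row. The sketch below assumes the served model is linear and uses the first row of the sample data shown earlier; `linear_predict` is a hypothetical helper.

```python
def linear_predict(row, coef, intercept):
    """Return intercept + sum(coef[i] * row[i]) for one feature row."""
    if len(row) != len(coef):
        raise ValueError("row and coef must have the same length")
    return intercept + sum(c * x for c, x in zip(coef, row))


# Values copied from the /model_params response above
coef = [
    -0.10801135783679626, 0.04642045836688073, 0.020558626367068414,
    2.6867338193449006, -17.76661122830023, 3.809865206809228,
    0.0006922246403443731, -1.4755668456002566, 0.3060494789851648,
    -0.012334593916573912, -0.95274723170729, 0.009311683273793841,
    -0.5247583778554887,
]
intercept = 36.45948838508981

# First row of the sample data as displayed by data.head(2)
first_row = [0.00632, 18.0, 2.31, 0.0, 0.538, 6.575, 65.2,
             4.09, 1.0, 296.0, 15.3, 396.9, 4.98]

prediction = linear_predict(first_row, coef, intercept)
print(prediction)
```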

### Update Model Endpoint

Endpoint: `/update_model`

This endpoint is reached with a PUT request, which passes new parameters to the model. As with the endpoint above, it has to be adapted to the model in use, because the parameters depend strongly on the model type. Updating the parameters makes sense when the model needs to be retrained on new data, for instance with time series data.
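Since the request body carries a plain list under the key `params`, it can help to validate that list on the client before sending it. The sketch below checks that the list is numeric and has one entry per feature (13 for the model in this notebook, per `n_features_in_`). `validate_update_payload` is a hypothetical helper; whether the server performs a similar check is not known from this notebook.

```python
import random


def validate_update_payload(payload, n_features):
    """Raise ValueError unless payload['params'] is a numeric list of length n_features."""
    params = payload.get("params")
    if not isinstance(params, list):
        raise ValueError("'params' must be a list")
    if len(params) != n_features:
        raise ValueError(f"expected {n_features} values, got {len(params)}")
    if not all(isinstance(p, (int, float)) for p in params):
        raise ValueError("'params' must contain only numbers")
    return payload


# Build a payload of 13 uniform random coefficients, as in the cell below
rng = random.Random(0)
payload = {"params": [rng.uniform(0.0, 1.0) for _ in range(13)]}
validate_update_payload(payload, n_features=13)
print(len(payload["params"]))
```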

In [67]:
# Update model parameters
response = requests.put(url=base_url+'/update_model/2', json={'params': np.random.uniform(size=len(model_coef['coef_'])).tolist()})
print('HTTP-Statuscode:', response.status_code)

HTTP-Statuscode: 200


In [68]:
# Print response
response.json()

'Parameters updated successfully.'

In [69]:
# Check if model parameters have changed by requesting the model parameters
response = requests.get(url=base_url + '/model_params/2')

In [70]:
new_coef = response.json()

print('New Model Parameters: \n', new_coef['coef_'])
print('Old Model Parameters: \n', model_coef['coef_'])

New Model Parameters: 
 [0.56651304232847, 0.12714359453131496, 0.09690361484609544, 0.6899060453155397, 0.0058304784115867925, 0.884159574622631, 0.4226790077852771, 0.3684488030062578, 0.4328225952860757, 0.5663242044365917, 0.6270367343864853, 0.9845217685172278, 0.7965459917037604]
Old Model Parameters: 
 [-0.10801135783679626, 0.04642045836688073, 0.020558626367068414, 2.6867338193449006, -17.76661122830023, 3.809865206809228, 0.0006922246403443731, -1.4755668456002566, 0.3060494789851648, -0.012334593916573912, -0.95274723170729, 0.009311683273793841, -0.5247583778554887]


Compared with the old parameters, we can see that the coefficients have changed.

### Delete Model Endpoint

Endpoint: `/delete_model`

Because we overwrote the coefficients with random values, the model is no longer usable and should be deleted. For this purpose I created a delete endpoint.

In [71]:
response = requests.delete(url=base_url+ '/delete_model/2')
print('HTTP-Statuscode:', response.status_code)

HTTP-Statuscode: 200


In [72]:
print(response.json())

Model Deleted Successfully.


In [73]:
# Print All models on server
response = requests.get(url=base_url+'/get_all_models')
response.json()

['model_1.pkl']

In [74]:
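# model_delete must hold the id of an existing model; it was never assigned in this notebook, hence the NameError below
response = requests.delete(url=base_url + f'/delete_model/{model_delete}')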
response.json()

NameError: name 'model_delete' is not defined