# Machine learning API

Right now we have 3 database tables: 

* Users
* Requests
* Responses

Additionally, we have created a machine learning model along with its input schema. It is time to build a working API with FastAPI to serve predictions.

# Loading the ML model into memory

The most efficient way to serve an ML model is to load it into memory once, when the FastAPI application starts. A common mistake is to read the model file and the schema file from disk every time a new request comes in and only then apply the model.

We should import the model objects and create any additional objects at the top of the main **app.py** script, where the API object is being created.

The necessary utilities are defined in **ML_API/machine_learning_utils.py**:

```
# Pickle object reading 
import pickle 

# JSON object reading 
import json 

# OS traversal 
import os 

# Input dataframe
import pandas as pd 

# Array math 
import numpy as np 

def load_ml_model(model_dir='ml_model'):
    """
    Loads the model and the schema from the given path
    """
    model, type_dict, feature_list = {}, {}, []

    # Default to an empty schema so the lookups below work when the files are missing
    input_schema = {}
    
    _model_path = os.path.join(model_dir, 'model.pkl')
    _input_schema_path = os.path.join(model_dir, 'input_schema.json')

    # Checking if the files exist and reading them 
    if os.path.exists(_model_path) and os.path.exists(_input_schema_path):
        with open(_model_path, 'rb') as f:
            model = pickle.load(f)
        with open(_input_schema_path, 'r') as f:
            input_schema = json.load(f)
    
    # Extracting the features
    features = input_schema.get('input_schema', {})
    features = features.get('columns', [])

    # Iterating over the list of dictionaries and changing the types.
    # numeric -> float 
    # bo
```
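The comments in the listing describe mapping each schema column's declared type to a Python type. A minimal sketch of that idea could look like the following. The column layout (`name` and `type` keys), the `TYPE_MAP` entries other than `numeric -> float`, and the helper name `build_type_dict` are assumptions made for illustration and are not taken from the actual script.

```python
# Hypothetical mapping from schema type names to Python types.
# Only "numeric -> float" is stated in the listing; "string" is an assumed extra.
TYPE_MAP = {"numeric": float, "string": str}


def build_type_dict(input_schema):
    """Return (type_dict, feature_list) built from a schema dictionary."""
    columns = input_schema.get("input_schema", {}).get("columns", [])
    type_dict, feature_list = {}, []
    for column in columns:
        name = column.get("name")
        if name is None:
            # Skip malformed column entries that have no name
            continue
        # Unknown type names fall back to str in this sketch
        type_dict[name] = TYPE_MAP.get(column.get("type"), str)
        feature_list.append(name)
    return type_dict, feature_list


# Assumed schema layout, used only to exercise the helper
example_schema = {
    "input_schema": {
        "columns": [
            {"name": "age", "type": "numeric"},
            {"name": "city", "type": "string"},
        ]
    }
}
type_dict, feature_list = build_type_dict(example_schema)
```

Keeping the feature list separate from the type dictionary preserves the column order, which usually matters when the features are later passed to a model.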

The loading of the model occurs right before defining the endpoints:

```
...

# Creating the application object 
app = FastAPI()

# Loading the machine learning objects to memory 
ml_model, type_dict, ml_feature_list = load_ml_model()

...
```

Because the objects are loaded this way, they stay in runtime memory and are not read from disk every time a new request comes in. This makes the application much faster.
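The difference between the two approaches can be illustrated without FastAPI at all. The sketch below pickles a stand-in "model" (a plain dictionary, which is an assumption for the sake of a runnable example) and counts how often the disk is read when the model is reloaded per request versus loaded once at module level. All function names here are hypothetical.

```python
import os
import pickle
import tempfile

# Write a stand-in "model" to disk so the example is self-contained.
model_dir = tempfile.mkdtemp()
model_path = os.path.join(model_dir, "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump({"coef": [0.5, 1.5]}, f)

disk_reads = 0


def read_model_from_disk(path):
    """Read the pickled object and count how often the disk is hit."""
    global disk_reads
    disk_reads += 1
    with open(path, "rb") as f:
        return pickle.load(f)


def predict_reloading(x):
    """Anti-pattern: every call reads the model file again."""
    model = read_model_from_disk(model_path)
    return sum(c * v for c, v in zip(model["coef"], x))


# Preferred: read once at import time and reuse the object afterwards.
MODEL = read_model_from_disk(model_path)


def predict_cached(x):
    """Uses the module-level MODEL object; no disk access per call."""
    return sum(c * v for c, v in zip(MODEL["coef"], x))


# Simulate 100 requests against each variant and count the disk reads.
reads_before = disk_reads
for _ in range(100):
    predict_cached([1.0, 2.0])
cached_reads = disk_reads - reads_before

reads_before = disk_reads
for _ in range(100):
    predict_reloading([1.0, 2.0])
reloading_reads = disk_reads - reads_before
```

With a real model file the per-request variant also pays the deserialization cost on every call, which tends to dominate for larger models.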

# API usage flowchart

A typical flow of the API is the following: 

* Register a user: 

![registration](media/registration.png)

The output of the registration logic is a JWT token, which we attach to each request to our API.
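As a rough sketch of what attaching the token can look like on the client side, the snippet below builds (but does not send) a request using only the standard library. The `/predict` path, the `Bearer` scheme in the `Authorization` header, the base URL, and the payload are assumptions for illustration; the actual endpoint names and header format are defined in **app.py**.

```python
import json
import urllib.request


def build_prediction_request(base_url, token, payload):
    """Build a POST request that carries the JWT token in its headers."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",  # hypothetical endpoint path
        data=body,
        headers={
            # Assumed header scheme; adjust to whatever the API expects
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; a real token comes from the registration step
request = build_prediction_request(
    "http://localhost:8000", "<jwt-token>", {"age": 42}
)
```

Passing the returned object to `urllib.request.urlopen` would send it, provided the API is running at the given address.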

* Prediction flow: 



![api-flow](media/API-flow.png)

Each request to the API needs to have the JWT token attached to it. The data for the API is sent along with the token, and the following flow starts: 

1) The user is authenticated. 

2) If the user is authenticated, the request data is validated against the input schema of the ML model. 

3) If the data is valid, the prediction is made.

4) The final response is sent. 

Along the way, the information is logged to the **Requests** and **Responses** tables. 
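The four steps can be sketched in plain Python, independent of FastAPI. Everything in this block is a hypothetical stand-in: the token store, the type dictionary, the placeholder model, the status codes, and the two lists that play the role of the **Requests** and **Responses** tables. The exact points at which the real application writes to the tables are also an assumption.

```python
# In-memory stand-ins for the Requests and Responses tables.
requests_log, responses_log = [], []

VALID_TOKENS = {"token-abc": "alice"}          # hypothetical token store
TYPE_DICT = {"age": float, "income": float}    # hypothetical input schema


def model_predict(features):
    """Placeholder for the real model's predict call."""
    return features["age"] * 0.5 + features["income"] * 0.25


def handle_prediction(token, data):
    # 1) Authenticate the user.
    user = VALID_TOKENS.get(token)
    if user is None:
        return {"status": 401, "detail": "invalid token"}
    requests_log.append({"user": user, "data": data})

    # 2) Validate the request data by casting each expected feature.
    try:
        features = {name: cast(data[name]) for name, cast in TYPE_DICT.items()}
    except (KeyError, TypeError, ValueError):
        response = {"status": 422, "detail": "invalid input"}
    else:
        # 3) Make the prediction.
        response = {"status": 200, "prediction": model_predict(features)}

    # 4) Log and send the final response.
    responses_log.append({"user": user, "response": response})
    return response
```

In this sketch an unauthenticated call is rejected before anything is logged, while a request with bad data is logged together with its error response.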

All the code is available in the **app.py** script in the ML_API directory, so let's apply the flowchart above!

# API usage

## Creating a user 