# REST API Design 
It is important to note first that if this functionality were deployed to a production environment, we would use a database such as MongoDB, especially since this service works with JSON documents. For simplicity, and to complete this challenge on time, I have used a dictionary in place of the database. I chose a dictionary because looking up a value by its key is very similar to how a database is queried.

To create this service I have used Flask, which is well suited to developing web applications. I have aimed to map each operation to the appropriate HTTP method (GET, POST, PUT or DELETE) in order to create a functioning service. My aim was to approach this as if it were a product, so I considered everything a user would require when using the API to classify data samples in real time. I have implemented the following functionality to illustrate how this API could be used effectively.

<b>1.</b> Classify new samples using the selected Machine Learning algorithm (POST).

<b>2.</b> Get the name of the Machine Learning Algorithm currently being utilised by the API (GET).

<b>3.</b> Allow the algorithm to be retrained on a new batch of data (POST).

<b>4.</b> Change the Machine Learning Algorithm being used (POST).

<b>5.</b> Return all of the data in the database (GET).

<b>6.</b> Update the data for a specified sample index in the database (PUT).

<b>7.</b> Delete the data at a specified sample index in the database (DELETE).
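
As a rough illustration of the request body used for classification, the sketch below builds a payload in the `index`/`data` layout that the POST handler reads and then decodes it back into a feature dictionary. The helper names `build_payload` and `decode_payload` and the feature names and values are made up for illustration. The double JSON encoding is an assumption, based on the handler calling `json.loads` on the body it has already parsed.

```python
import json


def build_payload(features):
    """Encode a {name: value} mapping in the index/data layout the handler reads."""
    body = {"index": list(features.keys()), "data": list(features.values())}
    # Assumption: the client sends a JSON string, because the handler calls
    # json.loads on the result of request.get_json().
    return json.dumps(body)


def decode_payload(payload):
    """Mirror the handler: rebuild the {name: value} mapping from index/data."""
    d = json.loads(payload)
    return dict(zip(d["index"], d["data"]))


# Hypothetical feature names and values, for illustration only.
example = {"Age": 32, "Balance": 1500}
payload = build_payload(example)
decoded = decode_payload(payload)
```

The round trip returns the original mapping, which is the per-feature dictionary the handler turns into a one-row DataFrame before calling the model.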

Each of these has been implemented in the code below; however, there are many other items that could be seen as desirable for such an API. The key here is to enable functionality that reduces downtime of the service.

<b>1.</b> Allow for the data at a specified sample index to be returned.

<b>2.</b> The implementation of some form of authentication would be highly desirable, e.g. Authorization headers or API keys.

<b>3.</b> The ability to completely wipe all the data from the database.

<b>4.</b> The ability to run more advanced database searches.

<b>5.</b> Reboot the database.
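
Some of these additions are small when the store is a dictionary. The sketch below shows one possible shape for items 1 to 3, written as plain functions rather than Flask resources. The function names, the sample contents and the `API_KEY` value are all hypothetical, and a real service would read the key from configuration rather than hard-coding it.

```python
import hmac

# Stand-in for the in-memory "database"; the contents are made up.
samples = {0: [1, 2, 3], 1: [4, 5, 6]}

# Hypothetical key, hard-coded only to keep the sketch self-contained.
API_KEY = "example-key"


def get_sample(index):
    """Return the data stored at one sample index, or None if it is unknown."""
    return samples.get(index)


def wipe_samples():
    """Remove every sample from the store."""
    samples.clear()


def is_authorised(supplied_key):
    """Check a client-supplied API key against the expected one."""
    # compare_digest takes the same time whether or not the keys match early on
    return hmac.compare_digest(supplied_key, API_KEY)


found = get_sample(1)
missing = get_sample(9)
good_key = is_authorised("example-key")
bad_key = is_authorised("wrong-key")
wipe_samples()
```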

Finally, it is worth noting that I have aimed to use HTTP status codes where possible as a means of detecting errors in my testing, but I am sure a different system would be more appropriate in production.
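
One way to make failures visible through the status code, and not only through the message text, is to return a `(body, status)` pair in the same form as the `return (out, 200)` used by the classification handler. The following is a minimal sketch of that idea for deletion, written as a plain function; the name `delete_sample`, the response bodies and the choice of 404 for an unknown index are assumptions, not part of the implementation.

```python
from http import HTTPStatus


def delete_sample(store, index):
    """Delete one sample and report the outcome as a (body, status) pair."""
    if index not in store:
        # Unknown index: report 404 Not Found instead of a message with status 200
        return {"error": "sample index not in the database"}, HTTPStatus.NOT_FOUND.value
    del store[index]
    return {"deleted": index}, HTTPStatus.OK.value


store = {0: [1, 2, 3]}  # made-up contents
first = delete_sample(store, 0)
second = delete_sample(store, 0)
```

With this shape a test client can check the status code directly and does not need to match on the error string.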

In [208]:
import json
import pickle as p

import numpy as np
import pandas as pd
from flask import Flask, request
from flask_restful import Api, Resource

APP = Flask(__name__)
API = Api(APP)
sampleID = 0

#This is the machine learning algorithm we will start off with
with open('RandomForest.pickle', 'rb') as model_file:
    loaded_model = p.load(model_file)
samples = {}

#This will include the functionality to do with the Machine Learning Algorithm
class Algorithm(Resource):
       
    #POST - Here we will classify a new sample
    @staticmethod
    def post():
        
        #Use a global sample ID here so we can keep track across numerous requests
        global sampleID
        
        #load in the request 
        json_ = request.get_json()
        d = json.loads(json_)
        
        #create a dictionary to store the data
        dictionary = {}
          
        #load the JSON formatted data of the sample we wish to classify and put it into a dictionary 
        for i in range(len(d['data'])):
            name = d['index'][i]
            dictionary[name] = d['data'][i]
        
        #create a dataframe of the sample so we can classify it
        query = pd.DataFrame(dictionary, index=[0])
    
        #get the prediction from the model
        prediction = loaded_model.predict(query)
        
        #what we are going to return
        out = {'Prediction': int(prediction[0])}
        
        #append the predicted output to the sample data
        #(list + NumPy scalar would add the prediction to every feature instead)
        sample = np.append(d['data'], prediction[0])
        
        #add the sample to the dictionary (essentially the database in this case)
        samples[sampleID] = sample
        #increment the sample ID 
        sampleID += 1
        
        #Return the data and a code to say it went OK 
        return (out, 200)
    
    #GET - This will return the name of the model we are using currently
    @staticmethod
    def get():
        
        #return the model name AND a code to say it went OK
        return type(loaded_model).__name__, 200
    
#This is to retrain the model (create a new class because the POST method is taken above)
class Retrain(Resource):

    #POST - Here we will retrain the model with new data
    @staticmethod
    def post():
        
        #get the json request data which is the new data to train on
        json_ = request.get_json()
        d = json.loads(json_)
        
        #separate the training data into features and labels
        train_data = pd.DataFrame(data = d['data'], columns = d['columns'])
        y_train = train_data['CarInsurance']
        X_train = train_data.drop(labels='CarInsurance', axis=1)
        
        #train the algorithm on the data
        loaded_model.fit(X_train, y_train)
        
        #get the accuracy 
        score = loaded_model.score(X_train, y_train)
                
        #A score of 0.5 or lower means the model is only as good as random, so we should flag it
        #perhaps this message isn't the best but it is just to show the idea
        if score <= 0.50:
            return "Request Completed (200) - Poor Model Performance"
        
        return 200
    

class Change_Model(Resource):

    #POST - Here we will change the model we are using
    #Note this is for an already trained model; if we were to do this in production we would pass an object that we could
    #also train
    @staticmethod
    def post():
        
        #The model is a global variable so we can update it and it will stay updated in the following requests
        global loaded_model
    
        #get the JSON request data which is the model filename 
        json_ = request.get_json()
        d = json.loads(json_)
        
        #here we load the pre-trained model
        with open(str(d['model']), 'rb') as model_file:
            loaded_model = p.load(model_file)
        
        #Do a quick check to see if the new model name is now correct 
        substring = str(d['model'])[:7]
        
        #Do some error checking (I know for production grade code better error detection will be required)
        if substring in type(loaded_model).__name__:
            #went okay
            return 200
        
        #or else there was an error
        return "Error - Model did not update correctly"
    
    
class Database(Resource):

    #GET - Here we will return all of the data in the database
    @staticmethod
    def get():
        
        #if there are no samples we cannot return any data
        if len(samples) == 0:
            return "The database is empty"
        
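        #else take all the samples and convert to JSON and return
        #(samples stored by POST are NumPy arrays, samples stored by PUT are plain lists)
        json_str = json.dumps({k: (v.tolist() if hasattr(v, 'tolist') else v) for k, v in samples.items()})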

        return json_str
    
    
    #PUT - Here we will update the database sample at a specified index with new data
    @staticmethod
    def put():
        
        #get the JSON data which will be the index of the sample we want to change and its new data
        json_ = request.get_json()
        d = json.loads(json_)
        
        #unpack the data we got into the index and the data 
        valuesList = list(d.values())
        index = valuesList[14]
        valuesList = valuesList[:14]
        
            
        #Check if the index of the sample is in the database and if not throw an error
        if index not in samples:
            return "Error - The sample index supplied is not in the database"
        #else set the sample to be the new data 
        else:
            samples[index] = valuesList
            
        #return OK 
        return 200
    
    #DELETE - This query will delete the specified record 
    @staticmethod
    def delete():
        
        #get the JSON request data
        json_ = request.get_json()
        index = json.loads(json_)
        
        print(index)
            
        #Check if the index of the sample is in the database and if not throw an error
        if index not in samples:
            return "Error - The sample index supplied is not in the database"
        
        #otherwise delete the desired sample
        else:
            del samples[index]
            
        #If the sample is still there after deleting throw an error
        if index in samples:
            return "Deleting this sample did not work"
            
        return 200

#Add the resources of the API
API.add_resource(Algorithm, '/algorithm')
API.add_resource(Retrain, '/retrain')
API.add_resource(Change_Model, '/change_model')
API.add_resource(Database, '/database')

#Main method to run the API 
if __name__ == '__main__':
    APP.run(debug=True, port=1080, use_reloader=False)

 * Serving Flask app "__main__" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: on


 * Running on http://127.0.0.1:1080/ (Press CTRL+C to quit)
