## Deploying the model in production

#### Some examples and ideas on how to put a data science solution into production

#### Tags:
    Data: no data involved
    Technologies: DWH, REST API, Python
    Techniques: deploying the model into production
    
#### Resources:
[Deploying a Machine Learning Model as a REST API](https://towardsdatascience.com/deploying-a-machine-learning-model-as-a-rest-api-4a03b865c166)

[How to build an API for a machine learning model in 5 minutes using Flask](https://medium.com/vantageai/how-to-build-an-api-for-a-machine-learning-model-in-5-minutes-using-flask-eb72d8cb4504)

[Building a basic RESTful API in Python](https://www.codementor.io/sagaragarwal94/building-a-basic-restful-api-in-python-58k02xsiq)



## Process and teams involved

Deploying the model into production is the last step in creating a data science solution. It is an important step because it is the first point at which the solution becomes available to other interested parties. It can also be a difficult one, since the data science team usually either needs the support of a DevOps team or has to acquire the deployment knowledge itself.

In most cases the best approach is to create a project/product team that consists not only of data scientists but also of data engineers, big data engineers, data analysts and domain knowledge owners. This usually involves an agile project/product organization structure (Scrum, for example) that can be both flexible and efficient. Other approaches exist, but they all seem to go in the same direction.

Depending on the goal of the project and the team setup, putting the model into production may or may not be part of the data scientist's job. After the model is deployed, maintaining it becomes a responsibility of a DevOps team, so they will need to understand how and when to update the model or make changes.

## Tracking value achieved

As users start working with the model in the production environment there will be some feedback, especially at first. This feedback can be used to gauge the usefulness of the model and to identify areas for improvement. This is also the stage where the value of the data science work is finally realized. It is a good idea to set up measures that can be tracked, so that the actual or expected value of the data science project (ROI) can be measured.
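
As a rough illustration of tracking such a measure, the sketch below computes a cumulative ROI per period using the standard definition (gain minus cost, divided by cost). The function name `cumulative_roi` and the monthly figures are made up for this example; how gains are attributed to the model is an assumption each project has to define for itself.

```python
def cumulative_roi(gains, costs):
    """Return the cumulative ROI after each period.

    ROI is computed as (total gain - total cost) / total cost,
    using everything gained and spent up to that period.
    """
    total_gain = 0.0
    total_cost = 0.0
    result = []
    for gain, cost in zip(gains, costs):
        total_gain += gain
        total_cost += cost
        if total_cost <= 0:
            raise ValueError("cumulative cost must be positive to compute ROI")
        result.append((total_gain - total_cost) / total_cost)
    return result


if __name__ == "__main__":
    # Hypothetical monthly figures: value attributed to the model and project cost.
    monthly_gains = [0, 20000, 60000]
    monthly_costs = [40000, 10000, 10000]
    for month, value in enumerate(cumulative_roi(monthly_gains, monthly_costs), start=1):
        print(f"month {month}: cumulative ROI = {value:.0%}")
```

With these invented numbers the project starts at a loss and turns positive in the third month, which is the kind of trend such a measure is meant to make visible.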

## Options to deploy

### Exposing a solution as an API

One of the most common ways to share data is over the Web using HTTP. Many services expose an API as a web service so that results can be delivered over the web. A consumer can then send a request to the API and get a result back.

Depending on the needs, the API might have to be implemented as a global or a local service. It may also need to be highly available (think many requests at once), and the model may need to be updated very frequently. Both requirements call for more infrastructure resources. Some of these steps can be outsourced to external services, but the ROI of such an investment needs to be calculated carefully.

With Python we have the option of creating a microservice API web server using Flask. A small project showing how to create a REST API that predicts whether a star is a pulsar is linked below:

[RESTful API with Flask](https://github.com/scenthr/sasa-pavkovic-portfolio/blob/master/deploying-model/Flask-REST-API/)

**How to:**

1. Go to the command line and start the Flask REST API by running
    
`python Flask_REST_API_web_server.py`

    
2. Consume the API and get predictions by running
    
`python consume_API.py`
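
The linked project contains the actual server and client scripts. Independently of that code, a minimal sketch of what a Flask prediction endpoint might look like is shown here. The route `/predict`, the feature names in `FEATURES` and the rule inside `predict_pulsar` are all hypothetical stand-ins; a real service would load a trained model instead of using a hard-coded threshold.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical input features; the real project may expect different ones.
FEATURES = ["mean_profile", "std_profile"]


def predict_pulsar(features):
    """Placeholder for a trained model: a made-up threshold rule."""
    return int(features["mean_profile"] < 80.0)


@app.route("/predict", methods=["POST"])
def predict():
    # silent=True returns None instead of raising on a missing or invalid JSON body.
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict):
        return jsonify(error="expected a JSON object"), 400

    missing = [name for name in FEATURES if name not in payload]
    if missing:
        return jsonify(error=f"missing features: {missing}"), 400

    try:
        features = {name: float(payload[name]) for name in FEATURES}
    except (TypeError, ValueError):
        return jsonify(error="features must be numeric"), 400

    return jsonify(is_pulsar=predict_pulsar(features))


if __name__ == "__main__":
    # Local development server only; not meant for production traffic.
    app.run(host="127.0.0.1", port=5000)
```

With the server running, a client would send a POST request with a JSON body such as `{"mean_profile": 50.0, "std_profile": 40.0}` to `http://127.0.0.1:5000/predict` and read the `is_pulsar` field from the JSON response.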
    
### Embedding a solution in an RDBMS

Today most companies have some form of a data warehouse (DWH), a place where data from the different tools and applications in the company's environment is aggregated. The results of a prediction model can also be stored as a separate table in the DWH. This can work well in the initial phases of an implementation, when the need for a fresh model is not very high and it makes sense to pre-calculate the predictions (e.g. the probability that a customer makes a purchase in the next X period).

Users who are already familiar with querying the data can then simply join another table into their queries to use the results of a predictive model in their daily initiatives and decisions.

Python can be used to make the predictions, and the results can be stored in an RDBMS using one of the Python libraries that connect to different databases. An example of a small project can be found below:

[Storing the results of a model in an RDBMS](https://github.com/scenthr/sasa-pavkovic-portfolio/blob/master/deploying-model/model-results-to-rdbms/model-results-to-rdbms.ipynb)
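
Separately from the linked notebook, the idea of pre-calculating predictions and writing them to a table can be sketched with Python's built-in `sqlite3` module. SQLite stands in here for whatever database the DWH actually runs on, and the table name `purchase_predictions`, its columns and the scoring rule in `score_customer` are assumptions made up for this example.

```python
import sqlite3
from datetime import datetime, timezone


def score_customer(recency_days):
    """Placeholder for a trained model: a made-up purchase probability."""
    return round(1.0 / (1.0 + recency_days / 30.0), 4)


def store_predictions(conn, customers):
    """Score (customer_id, recency_days) pairs and upsert them into a table."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS purchase_predictions (
            customer_id INTEGER PRIMARY KEY,
            purchase_probability REAL NOT NULL,
            scored_at TEXT NOT NULL
        )
        """
    )
    scored_at = datetime.now(timezone.utc).isoformat()
    rows = [(cid, score_customer(recency), scored_at) for cid, recency in customers]
    # The connection context manager commits on success and rolls back on error.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO purchase_predictions VALUES (?, ?, ?)", rows
        )
    return len(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database for demonstration
    store_predictions(conn, [(1, 5), (2, 60), (3, 365)])
    query = (
        "SELECT customer_id, purchase_probability FROM purchase_predictions "
        "ORDER BY purchase_probability DESC"
    )
    for row in conn.execute(query):
        print(row)
    conn.close()
```

Because each customer has one row keyed by `customer_id`, re-running the scoring job overwrites the previous predictions, and analysts can join the table to their existing customer queries on that key.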

