# AUEB M.Sc. in Data Science (part-time)

**Course**: Practical Data Science

**Semester**: Fall 2018

**3rd homework**: Deploying a ML Model

**Author**: Spiros Politis

---

## Description

In this homework you will deploy an ML model behind a web API. This is the typical process followed in [Microservices Architectures](https://en.wikipedia.org/wiki/Microservices).

This homework - although minimal compared to the previous two - is essential.

It will teach you how to deliver your models not only to business stakeholders in a presentation format (as you did in the 1st homework) and to fellow analysts & data scientists in <mark>.ipynb</mark> format (as you did in the 2nd homework), but also to other language-agnostic components (people or machines) of a modern tech architecture.

## Questions

### Model Persistence

Refer back to the last model (the "best one") you ended up with in the 2nd homework.

You should extend your notebook to persist this model on disk.

The standard way to do object serialisation (persistence) in Python is via the [pickle module](https://docs.python.org/3/library/pickle.html).

For scikit-learn based models, though, the recommended way is to use the <mark>joblib</mark> package. Its <mark>dump</mark>/<mark>load</mark> API closely mirrors that of <mark>pickle</mark>, but it also accepts a file path directly.

Using the <mark>joblib</mark> package, save your model in a binary file named <mark>load_forecasting_model_v010.joblib</mark>.

This process is described in the official scikit-learn reference [here](https://scikit-learn.org/stable/modules/model_persistence.html).
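The persistence step can be sketched roughly as follows. This is a minimal illustration, not a reference solution: the <mark>LinearRegression</mark> fitted on toy data is only a stand-in for your actual model from the 2nd homework, <mark>persist_model</mark> is a hypothetical helper name, and the temporary directory is used only so that the example leaves no files behind.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# File name required by the assignment.
MODEL_FILENAME = "load_forecasting_model_v010.joblib"


def persist_model(model, directory):
    """Serialise `model` into `directory` and return the full file path.

    Hypothetical helper, introduced only for this illustration.
    """
    path = os.path.join(directory, MODEL_FILENAME)
    joblib.dump(model, path)
    return path


# Stand-in model (assumption): replace with your best model from homework 2.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
model = LinearRegression().fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = persist_model(model, tmp_dir)
    # Round trip: the restored model should predict exactly like the original.
    restored = joblib.load(model_path)
    same_predictions = np.allclose(model.predict(X), restored.predict(X))
```

In your notebook you would most likely call <mark>joblib.dump</mark> once on the fitted model, writing to the working directory instead of a temporary one.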

### Web API

Next, you will expose a Web API that allows external users to access your model's predictions via a <mark>GET /forecast?param=1&param=2</mark> request. To do so, you should use [Flask](http://flask.pocoo.org/docs/1.0/quickstart/).

Your [Flask](http://flask.pocoo.org/docs/1.0/quickstart/) app should be in a <mark>load_forecasting_web.py</mark> file.

Once the <mark>python load_forecasting_web.py</mark> command is issued, the following should happen:

- A Flask app is initialized

- The <mark>load_forecasting_model_v010.joblib</mark> should be loaded, deserialized and stored in a global variable (this is not best practice but it's fine for our purposes).

- A <mark>GET /forecast</mark> endpoint with the "appropriate" parameters should be added to the Flask app.
The response of that endpoint should be served in JSON format (use <mark>flask.jsonify</mark>). The response payload should contain at least the following keys: <mark>predicted_value</mark>, <mark>model_label</mark> (a human-readable label for your model). You should also add any other metadata you think a user should know about.

The "appropriate" parameters depend on the features you used during training.

When run via the <mark>python load_forecasting_web.py</mark> command, the server should bind to [http://127.0.0.1:9181](http://127.0.0.1:9181) and be accessible via a normal browser.

You should include an example request URL in your README.md file.
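As an illustration, an example request URL can be assembled with the standard library. The parameter names and values here are hypothetical placeholders; substitute the ones your endpoint actually accepts.

```python
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:9181/forecast"

# Hypothetical query parameters: replace with your model's real features.
params = {"temperature": 21.5, "hour": 14}

# urlencode escapes each value and joins the pairs with "&".
example_url = f"{BASE_URL}?{urlencode(params)}"
print(example_url)
# prints http://127.0.0.1:9181/forecast?temperature=21.5&hour=14
```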

## Submission instructions

Your submission should include the following:

- The <mark>load_forecasting_model_v010.joblib</mark> file
- The <mark>load_forecasting_web.py</mark> file
- A <mark>requirements.txt</mark> file
- A <mark>README.md</mark> file

**Anything not written in that README.md file will not be taken into consideration.**

## Honor code

You understand that this is an individual homework, and as such you must carry it out alone. You may seek help on the Internet, by Googling or searching on Stack Overflow, for general questions pertaining to the use of Python and pandas libraries and idioms. However, it is not acceptable to ask direct questions about the homework itself, where people would effectively solve your problem by answering them. You may discuss with your colleagues in order to better understand the questions, if they are not clear enough, but you should not ask them to share their answers with you, or to help you by giving specific advice.

You should be able to justify **why** you wrote that particular line of code.

---