# Module 2: Data Engineering
## Sprint 3: Deploying Machine Learning Models
## Deploying a machine learning model

## Background

This is the last lesson of this sprint. You did a great job learning about web applications, APIs, and application deployment. You should now be able to create and deploy a Flask-based application, serve trained models, and run inference through a REST API. Today you will test what you have learned by completing the final task of this sprint: deploying a machine learning model and tracking its performance.

## Predicting the price of a house
Today you will train a simple regression model that predicts house prices. Use [this](https://scipy-lectures.org/packages/scikit-learn/auto_examples/plot_boston_prediction.html) provided example and copy the training code from it. The task is not about training the model but about what comes next in the pipeline, so you do not need to worry about this part.

## Creating API using Flask
In this step, you will create an API for the trained model. Create one `POST` route to access the model. The route should return a JSON response that includes the model's predictions.
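
As a rough illustration, a `POST` route of this kind could look like the sketch below. The `/price` path and the `predicted` and `error` keys mirror the test requests further down in this document. Everything else is an assumption made for the example: `SumModel` is a hypothetical stand-in for the trained regressor, and the validation is deliberately minimal.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


class SumModel:
    """Hypothetical stand-in for the trained regressor (not a real model)."""

    def predict(self, rows):
        # Returns one number per input row, like a regressor would.
        return [sum(row) for row in rows]


model = SumModel()


@app.route("/price", methods=["POST"])
def price():
    # force=True parses the body as JSON even without a JSON Content-Type
    # header; silent=True yields None instead of raising on malformed JSON.
    payload = request.get_json(force=True, silent=True)
    if not isinstance(payload, list) or not payload:
        return jsonify({"error": "Body must be a non-empty JSON array"}), 400

    # Accept either a batch of rows (2D) or a single flat row (1D).
    is_batch = all(isinstance(item, list) for item in payload)
    rows = payload if is_batch else [payload]

    # Reject anything that is not a number before it reaches the model.
    if not all(isinstance(v, (int, float)) for row in rows for v in row):
        return jsonify({"error": "All values must be numbers"}), 400

    predictions = model.predict(rows)
    return jsonify({"predicted": [float(p) for p in predictions]})
```

Parsing with `force=True` matters when the client sends the body with `data=json.dumps(...)`, because `requests` then does not set a JSON `Content-Type` header.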

## Creating the inference pipeline
You will need to make sure that data is passed to the model in the right format. To do this, create a function that takes the JSON data received in the request and transforms it into a format that the model understands.
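
One possible shape for such a function is sketched below, using only the standard library. The function name `to_model_input` and the constant `N_FEATURES` are hypothetical; the value 13 is an assumption taken from the 13-value rows used in the tests later in this document. The function accepts either a single flat row or a list of rows and always returns a 2D list of floats.

```python
import json

N_FEATURES = 13  # assumption: matches the 13-value rows in the tests below


def to_model_input(raw_body):
    """Parse a JSON request body into a 2D list of float feature rows.

    Raises ValueError on malformed JSON or on rows of the wrong shape
    (json.JSONDecodeError is itself a subclass of ValueError).
    """
    data = json.loads(raw_body)
    if not isinstance(data, list) or not data:
        raise ValueError("input must be a non-empty JSON array")

    # Wrap a single flat row so the model always receives a 2D structure.
    is_batch = all(isinstance(item, list) for item in data)
    rows = data if is_batch else [data]

    cleaned = []
    for row in rows:
        if len(row) != N_FEATURES:
            raise ValueError(f"each row must contain {N_FEATURES} values")
        for value in row:
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError("all feature values must be numeric")
        cleaned.append([float(value) for value in row])
    return cleaned
```

The returned list of lists can be passed to a scikit-learn regressor's `predict` method directly, or converted to a NumPy array first. In a route handler, a `ValueError` raised here could be turned into a `400` response.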

## Loading model
In this step, you will load the trained regressor. Use pickle for this task. Keep the pickled model file in the repository, as it will be needed when deploying your application to Heroku.
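
The save and load round trip can be sketched as follows. To keep the example free of third-party dependencies, a plain dictionary stands in for the fitted estimator; a scikit-learn regressor is pickled with the same two calls. The helper names `save_model` and `load_model`, the file name `model.pkl`, and the numbers in the dictionary are all hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path):
    """Serialize the model object to a binary file."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a model object previously written by save_model."""
    # Only unpickle files you created yourself: loading a pickle
    # can execute arbitrary code.
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical stand-in for a fitted regressor (made-up numbers).
trained = {"intercept": 1.5, "coef": [0.25, -0.5]}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.pkl"  # hypothetical file name
    save_model(trained, model_path)
    restored = load_model(model_path)

print(restored == trained)
```

In the application itself, the model would be loaded once at startup rather than inside the route, so that each request does not re-read the file. A pickled scikit-learn model should also be loaded with the same library version that was used to train it.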

## Deploying the application
This is the part where you deploy your application using Heroku. You did this task in the last lesson, and the steps should not differ much from those you followed when deploying the Tesla Factory API.

## Concepts to explore
* Creating an API using Flask
* Saving and loading trained models
* Creating an inference pipeline for the trained model
* Deploying a Flask application using Heroku

## Requirements
* The model should be trained to predict house prices
* An API should be created that accepts data through `POST` requests and returns the model's predictions in the response
* The application should be deployed and publicly accessible (provide a link to it)

## Evaluation criteria
1. Model is trained and is able to perform price predictions
2. API is created using Flask
3. Correct preprocessing pipeline is made
4. Model is successfully loaded and is reachable through `POST` type route
5. Application is deployed and reachable
6. Provided source code meets the "Clean Code" standards. There are no secrets/passwords left in the code.



---



## Github link to the code and documentation: [Github/Vilkamini](https://github.com/VilkaMini/turingcollege/tree/main/TC%20235%20files)

### Link to the deployed program: https://tc-235-house-prices.herokuapp.com/price (Only POST method is allowed)


# Project:

In [None]:
import requests
import json

## Test #1:
One 1-dimensional array.

In [None]:
house_no_1 = [6.320e-03, 
              1.800e+01, 
              2.310e+00, 
              0.000e+00, 
              5.380e-01, 
              6.575e+00, 
              6.520e+01, 
              4.090e+00, 
              1.000e+00, 
              2.960e+02, 
              1.530e+01, 
              3.969e+02, 
              4.980e+00]

In [None]:
response = requests.post("https://tc-235-house-prices.herokuapp.com/price", data=json.dumps(house_no_1))
print(response)
print(json.loads(response.text)["predicted"])

<Response [200]>
[30.360034097439428]


## Test #2:
One 2-dimensional array containing two 1-dimensional arrays.

In [None]:
house_no_2_and_3 = [[2.9850e-02, 0.0000e+00, 2.1800e+00, 0.0000e+00, 
                     4.5800e-01, 6.4300e+00, 5.8700e+01, 6.0622e+00, 
                     3.0000e+00, 2.2200e+02, 1.8700e+01, 3.9412e+02, 
                     5.2100e+00],
                    [1.00245, 0.     , 8.14   , 0.     , 0.538  , 6.674  ,
                     87.3   , 4.239  , 4.     , 307.   , 21.    , 380.23 ,
                     11.98   ],
                   ]

In [None]:
response = requests.post("https://tc-235-house-prices.herokuapp.com/price", data=json.dumps(house_no_2_and_3))
print(response)
print(json.loads(response.text)["predicted"])

<Response [200]>
[25.161733394421084, 20.374045650950944]


## Test #3:
Wrong input (missing one value)

In [None]:
house_no_4 = [4.3370e-02,
              2.1000e+01,
              5.6400e+00,
              0.0000e+00,
              4.3900e-01,
              6.1150e+00,
              6.3000e+01,
              6.8147e+00,
              4.0000e+00,
              2.4300e+02,
              1.6800e+01,
              3.9397e+02]

In [None]:
response = requests.post("https://tc-235-house-prices.herokuapp.com/price", data=json.dumps(house_no_4))
print(response)
print(json.loads(response.text)["error"])

<Response [400]>
INPUT MUST BE A 2D ARRAY



## Sample correction questions

During a correction, you may be asked questions that test your understanding of the covered topics.

- What is a REST API? How can you use one to create an interface for your trained models?
- How should you choose the right cloud services provider? Explain the key differences between Heroku and GCP.
- Why is having an E2E Machine Learning Pipeline important? What are the main parts of it?
- How should you ensure the security of the deployed applications?