# DSCI 525 - Web and Cloud Computing

***Milestone 4:*** In this milestone, you will deploy the machine learning model you trained in milestone 3.

Milestone 4 checklist:

- [X] Use an EC2 instance.
- [X] Develop your API here in this notebook.
- [X] Copy it to the ```app.py``` file on the EC2 instance.
- [X] Run your API for other consumers and test among your colleagues.
- [X] Summarize your journey.

```python
## Import all the packages that you need
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
```

## 1. Develop your API

rubric={mechanics:45}

### Solution for task 1

```python
from flask import Flask, request, jsonify
import joblib

app = Flask(__name__)

# 1. Load your model here
model = joblib.load("model.joblib")

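# 2. Define a prediction function
def return_prediction(input_data):
    """Return the model's prediction for one observation (a flat list of features)."""
    # predict() expects a 2D input, so wrap the single observation in a list;
    # float() turns the NumPy scalar into a plain Python float for jsonify
    return float(model.predict([input_data])[0])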

# 3. Set up home page using basic html
@app.route("/")
def index():
    # feel free to customize this if you like
    return """
    <h1>Welcome to our rain prediction service</h1>
    To use this service, send a JSON POST request to the /predict URL with 5 climate model outputs.
    """

# 4. define a new route which will accept POST requests and return model predictions
@app.route('/predict', methods=['POST'])
def rainfall_prediction():
    content = request.json  # this extracts the JSON content we sent
    prediction = return_prediction(content['data'])
    
    # return whatever data you wish, it can be just the prediction
    # or it can be the prediction plus the input data, it's up to you
    results = {
        "input": content["data"],
        "prediction": prediction
    }  
    
    return jsonify(results)
```
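The home page describes a service that takes 5 climate model outputs, but the `/predict` route above hands `content['data']` to the model without checking it. Below is a minimal sketch of a validation helper, assuming the payload has the shape `{"data": [x1, ..., x5]}`. The names `parse_features` and `N_FEATURES` are hypothetical and not part of the original app.

```python
from numbers import Real

N_FEATURES = 5  # assumption: the model takes 5 climate model outputs


def parse_features(payload, n_features=N_FEATURES):
    """Return the feature list from a JSON payload as floats, or raise ValueError."""
    if not isinstance(payload, dict) or "data" not in payload:
        raise ValueError('expected a JSON object with a "data" key')
    data = payload["data"]
    if not isinstance(data, list) or len(data) != n_features:
        raise ValueError(f'"data" must be a list of {n_features} numbers')
    # bool is a subclass of int, so reject it explicitly
    if any(isinstance(x, bool) or not isinstance(x, Real) for x in data):
        raise ValueError('"data" must contain only numbers')
    return [float(x) for x in data]
```

Inside the route, one option would be to call `parse_features(request.json)` and, on `ValueError`, return `jsonify({"error": str(err)}), 400` so that malformed input produces a 400 response rather than a server error.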

## 2. Deploy your API

rubric={mechanics:40}

<img src="img/milestone4.PNG" alt="" height="900" width="900"> 
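Once the app is running on the EC2 instance, colleagues can exercise the endpoint from their own machines. The following is a rough Python equivalent of a curl call, using only the standard library. It assumes the payload shape `{"data": [...]}` used by the `/predict` route; the helper names `build_predict_request` and `get_prediction` are hypothetical.

```python
import json
import urllib.request


def build_predict_request(base_url, features):
    """Build a JSON POST request for the /predict route."""
    body = json.dumps({"data": list(features)}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_prediction(base_url, features, timeout=10):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_predict_request(base_url, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `get_prediction("http://<ec2-public-ip>:8080", [1.0, 2.0, 3.0, 4.0, 5.0])` would then return the dictionary produced by the route. The host and port here are placeholders: substitute your instance's public address and the port your app listens on, and make sure the instance's security group allows inbound traffic on that port.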

## 3. Summarize your journey from Milestone 1 to Milestone 4
rubric={mechanics:10}
> - In Milestone 1, we downloaded data from the web and learned how frustrating it can be to work with large data using common wrangling libraries like pandas and the tidyverse, and with "vanilla" file formats such as .csv. After this experience, we were ready to move on to more sophisticated methods of working with big data!
> - In Milestone 2, we set up the infrastructure needed to process and store big data in the cloud (scale-up). Using the AWS user interface, we set up our own EC2 instance and S3 bucket and connected to the cloud from our local terminal. We also prepared the data for the ML model in the next milestone.
> - In Milestone 3, we set up the distributed infrastructure for a scale-out solution using EMR and Spark, again using the AWS user interface and connecting via our local terminal. We used Spark's MLlib to identify the best hyperparameters, then trained our ML model with these hyperparameters and saved the trained model to S3.
> - In Milestone 4, we deployed our ML model using Flask so that it can be used by other "consumers".

## 4. Submission instructions
rubric={mechanics:5}

In the textbox provided on Canvas, please put a link where TAs can find the following:
- [X] This notebook with solutions to ```1 & 3```
- [X] Screenshot of
    - [X] Output after trying curl. Here is a [sample](https://github.ubc.ca/MDS-2020-21/DSCI_525_web-cloud-comp_students/blob/master/Milestones/milestone4/images/curl_deploy_sample.png). This is just an example; your input/output doesn't have to look like this, and you can design it the way you like. At a minimum, it should show your prediction value.