# ML Model Training and Deployment 


We create a new project to keep all dependencies and packages for our deployment environment.

Load the Julia environment from `Project.toml` and `Manifest.toml`:

In [None]:
]instantiate

After training and evaluation, the model should be deployed to serve the scores and predictions.


The model is usually embedded into a larger application or exposed through a web service. Both approaches need additional logic to prepare the input data properly and to return the prediction to the user in an appropriate form.
* **JSON-based web service** - JSON payload with input observation is provided to the web service and the JSON with the prediction is returned back
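
The JSON exchange described above can be sketched with JSON.jl. This is a minimal illustration, not the service itself: the two feature names are borrowed from the dataset used in this notebook, the values are illustrative, and the `0.0` prediction is a placeholder standing in for a real model call.

```julia
using JSON

# Hypothetical request payload: illustrative values for two of the features.
request_body = JSON.json(Dict("RM" => 6.575, "LSTAT" => 4.98))

# What a service does on receipt: parse the JSON text into a dictionary-like object.
observation = JSON.parse(request_body)

# Hypothetical response: echo the input and attach a placeholder prediction.
# A real service would call the trained model here instead of using 0.0.
response_body = JSON.json(Dict("input" => observation, "prediction" => 0.0))
println(response_body)
```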

## Model building

We'll build a regression model to predict the median house value in the Boston suburbs.

The dataset comes from [UCI repository](https://archive.ics.uci.edu/ml/machine-learning-databases/housing/).

Attribute Information:

1. CRIM - per capita crime rate by town
2. ZN - proportion of residential land zoned for lots over 25,000 sq.ft.
3. INDUS - proportion of non-retail business acres per town
4. CHAS - Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
5. NOX - nitric oxides concentration (parts per 10 million)
6. RM - average number of rooms per dwelling
7. AGE - proportion of owner-occupied units built prior to 1940
8. DIS - weighted distances to five Boston employment centres
9. RAD - index of accessibility to radial highways
10. TAX - full-value property-tax rate per \$10,000
11. PTRATIO - pupil-teacher ratio by town
12. B - 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
13. LSTAT - \% lower status of the population
14. **MEDV - Median value of owner-occupied homes in \$1000's**


Model building proceeds in three steps:

1. Load data
2. Preprocessing (not implemented at present)
3. Model Training
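
Preprocessing is not implemented in this notebook. One common option would be to standardize each feature, which might look like the sketch below. It is an assumption for illustration only, not a step the notebook performs, and it uses a toy matrix (`X_toy`, hypothetical values) with features in rows and observations in columns.

```julia
using Statistics

# Toy feature matrix: 2 features (rows) × 3 observations (columns), illustrative values.
X_toy = [1.0  2.0  3.0;
         10.0 20.0 60.0]

# Per-feature mean and standard deviation, computed across observations (dims = 2).
μ = mean(X_toy, dims = 2)
σ = std(X_toy, dims = 2)

# Standardize: every feature (row) ends up with mean 0 and standard deviation 1.
X_scaled = (X_toy .- μ) ./ σ
println(X_scaled)
```

If such a step were added, `μ` and `σ` would have to be stored with the model and applied to every incoming observation at prediction time.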

In [None]:
using CSV
using DataFrames

In [None]:
load_data(path) = CSV.File(path) |> DataFrame

In [None]:
houses = load_data("housing.csv")

In [None]:
names(houses)

In [None]:
using GLM

# model 1 - linear regression model
reg = lm(@formula(MEDV ~ CRIM + INDUS + CHAS + RM + AGE + DIS + TAX + LSTAT), houses::DataFrame)

In [None]:
# take first row of data
first_row = DataFrame(houses[1,[:CRIM, :INDUS, :CHAS, :RM, :AGE, :DIS, :TAX, :LSTAT]])
# first_row = DataFrame(houses[1,:])
# predict value
predict(reg, first_row)

In [None]:
using BSON: @save
@save "reg.bson" reg

In [None]:
test_dict = Dict("DIS" => 4.09,"CRIM" => 0.00632,"INDUS" => 2.31,"RM" => 6.575,"AGE" => 65.2,"CHAS" => 0.0,"TAX" => 296.0,"LSTAT" => 4.98)

In [None]:
predict(reg, DataFrame(test_dict ))

In [None]:
using LinearAlgebra
using BSON: @load
reg = nothing
@load "reg.bson" reg
predict(reg, DataFrame(test_dict))

Next, let's put the data into matrix form for a neural network model.

In [None]:
X = transpose(Matrix(houses[!,Not(:MEDV)]))
y = transpose(houses.MEDV);

In [None]:
using Flux
using ProgressMeter

# Neural network model: one dense hidden layer with a ReLU activation function
model = Chain(Dense(13 => 42, relu), Dense(42 => 1))
loss(x, y) = Flux.Losses.mse(model(x), y)
parameters = Flux.params(model)
data = [(X, y)]
opt = Flux.Adam(0.002)
@showprogress for epoch in 1:50_000
    Flux.train!(loss, parameters, data, opt)
end
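
The loop above uses Flux's implicit-parameter style (`Flux.params`). Newer Flux releases favor an explicit style in which the optimiser state is built from the model and the loss takes the model as its first argument. The sketch below shows that variant on random stand-in data; the names `X_demo`, `y_demo`, `demo_model` and `demo_loss` are hypothetical, and whether the implicit style still works depends on the installed Flux version.

```julia
using Flux

# Illustrative random data with the same layout as X and y: 13 features × 20 observations.
X_demo = rand(Float32, 13, 20)
y_demo = rand(Float32, 1, 20)

demo_model = Chain(Dense(13 => 42, relu), Dense(42 => 1))

# Explicit style: the optimiser state is created from the model itself...
opt_state = Flux.setup(Flux.Adam(0.002), demo_model)

# ...and the loss receives the model as its first argument.
demo_loss(m, x, y) = Flux.Losses.mse(m(x), y)

for epoch in 1:100
    Flux.train!(demo_loss, demo_model, [(X_demo, y_demo)], opt_state)
end

println("final training loss: ", demo_loss(demo_model, X_demo, y_demo))
```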

In [None]:
first_row_matrix = X[:,1]
println("from NN model: ", model(first_row_matrix)[1])

In [None]:
# model evaluation 
using Statistics

RMSE(y, ŷ) = sqrt(mean((y-ŷ).^2));
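
As a quick sanity check of the definition, here is a small worked example on toy numbers (illustrative values, not taken from the housing data). The function is repeated under the lower-case name `rmse` so the snippet runs on its own.

```julia
using Statistics

# Same formula as RMSE above, repeated so the snippet is self-contained.
rmse(y, ŷ) = sqrt(mean((y .- ŷ) .^ 2))

y_true = [3.0, 5.0]
y_pred = [1.0, 5.0]

# Errors are [2, 0], squared errors [4, 0], their mean is 2, so the result is sqrt(2).
println(rmse(y_true, y_pred))  # ≈ 1.4142
```

For a single observation the formula reduces to the absolute error `abs(y - ŷ)`.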

In [None]:
# for regression 
RMSE(y,transpose(predict(reg, houses)))

In [None]:
# for neural network
RMSE(y, model(X))

In [None]:
RMSE(y[1],transpose(predict(reg, DataFrame(first_row))[1]))

In [None]:
RMSE(y[1],model(first_row_matrix)[1])

In [None]:
using BSON: @save
@save "nn_model.bson" model

In [None]:
using BSON: @load
model = nothing
@load "nn_model.bson" model
println("from NN model: ", model(first_row_matrix)[1])

In [None]:
# Saving first observation from the training dataset into `house.json` file
using JSON
open("house.json","w") do f
    JSON.print(f, Dict(names(houses)[begin:end-1] .=> X[:,1]), 4)
end

## Simple REST API

We start with simple routing in Genie.

Every route returns JSON as its response.

The `/getapi` route uses the GET method and echoes back the variables sent in the query string.


In [None]:
using Genie, Genie.Renderer.Json
using Genie.Requests # for method GET and POST
using JSON

route("/") do 
  (:message => "Hello Julia!") |> Json.json
end

route("/getapi", method=GET) do
  vars = getpayload()
  (:variables => vars) |> Json.json
end

# start the server; async=true keeps it from blocking the Jupyter kernel
up(8000, async = true)

After starting the server, you can use `curl` or another tool capable of sending and receiving HTTP requests to interact with the API.

In [None]:
;curl http://localhost:8000/

In [None]:
;curl http://localhost:8000/getapi\?\&val1=43\&val2=3

In [None]:
using HTTP
resp = HTTP.get("http://localhost:8000")
println(resp.status)
println(String(resp.body))

The server is running asynchronously in Jupyter. When you are finished, run the `down()` command to turn it off.

In [None]:
down()

In [None]:
using Genie, Genie.Requests, Genie.Renderer.Json
using Flux
using BSON: @load
using GLM
using DataFrames
using LinearAlgebra


@load "nn_model.bson" model

@load "reg.bson" reg

route("/") do
"""<div style="white-space:pre">To receive a prediction, send a POST request with a JSON payload.

Example:
>> curl -X POST -d @house.json -H "Content-Type: application/json" http://localhost:8000/
>> cat house.json
{
    "CRIM": 0.00632,
    "TAX": 296.0,
    "CHAS": 0.0,
    "B": 396.9,
    "LSTAT": 4.98,
    "AGE": 65.2,
    "INDUS": 2.31,
    "RM": 6.575,
    "DIS": 4.09,
    "ZN": 18.0,
    "NOX": 0.538,
    "PTRATIO": 15.3,
    "RAD": 1.0
}</div>"""
end

route("/", method = POST) do
    
    input_data = jsonpayload()
    keys_json = keys(input_data)
    columns = ["CRIM","ZN","INDUS","CHAS","NOX","RM","AGE","DIS","RAD","TAX","PTRATIO","B","LSTAT"]
    missing_fields = [k for k in columns if k ∉ keys_json]
    
    if length(missing_fields) != 0
        missing_str = join(missing_fields, ",")
        Json.json(:error => "The fields: $missing_str are missing from the JSON payload. " *
            "The prediction cannot be returned.")
    else
        try
            Json.json(Dict("input" => input_data,
                        "prediction" => model([input_data[f] for f in columns])[1])
                     )
        catch e
            Json.json(:error => "Ooops! There was a problem while generating a prediction.")
        end
    end
end

route("/glm") do
"""<div style="white-space:pre">To receive a prediction from the GLM linear model, send a POST request with a JSON payload.

First row:
{
    "CRIM": 0.00632,
    "TAX": 296.0,
    "CHAS": 0.0,
    "B": 396.9,
    "LSTAT": 4.98,
    "AGE": 65.2,
    "INDUS": 2.31,
    "RM": 6.575,
    "DIS": 4.09,
    "ZN": 18.0,
    "NOX": 0.538,
    "PTRATIO": 15.3,
    "RAD": 1.0
}</div>"""
    
end

route("/glm", method = POST) do
    input_data = jsonpayload()
    try
        # return a JSON object with the same keys as the neural network route
        Json.json(Dict("input" => input_data, "prediction" => predict(reg, DataFrame(input_data))))
    catch e
        (:error => "Ooops! There was a problem while generating a prediction.") |> Json.json
    end
end


# start the server; async=true keeps it from blocking the Jupyter kernel
up(port=8000, async=true)
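
The POST request can also be sent from Julia with HTTP.jl instead of `curl`. This is a minimal sketch, assuming the server started above is still running on port 8000. The payload repeats the first observation of the dataset, keyed by the upper-case column names that the POST route checks for.

```julia
using HTTP, JSON

# First observation of the dataset, with the field names required by the POST route.
payload = Dict(
    "CRIM" => 0.00632, "ZN" => 18.0, "INDUS" => 2.31, "CHAS" => 0.0,
    "NOX" => 0.538, "RM" => 6.575, "AGE" => 65.2, "DIS" => 4.09,
    "RAD" => 1.0, "TAX" => 296.0, "PTRATIO" => 15.3, "B" => 396.9,
    "LSTAT" => 4.98,
)

# Send the payload as JSON; the header tells Genie to parse the body as JSON.
resp = HTTP.post("http://localhost:8000/",
                 ["Content-Type" => "application/json"],
                 JSON.json(payload))

println(resp.status)
println(String(resp.body))
```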

In [None]:
;cat house.json

In [None]:
;curl -X POST -d @house.json -H "Content-Type: application/json" http://localhost:8000/

In [None]:
;curl -X POST -d @house.json -H "Content-Type: application/json" http://localhost:8000/glm/

In [None]:
down()

## Docker container 

In [None]:
] generate Docker

In [None]:
;cd Docker

In [None]:
;pwd

In [None]:
] activate .

### I will use just the simple GLM model

In [None]:
] add "Genie" "BSON" "GLM" "DataFrames"  "LinearAlgebra"

In [None]:
;cd ..

Add your BSON file with the model and create a new `app.jl` file with the Genie server.
Remember to change the `async` setting so that the server blocks and keeps the process running:
```julia
 up(port=8000, async=false)
```
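
A possible `app.jl` might look like the sketch below. It is not the author's actual file: it assumes that `reg.bson` sits next to `app.jl`, serves only the GLM model, and binds to `0.0.0.0` (an assumption, so that the port can be published from inside a container).

```julia
# app.jl: a minimal sketch under the assumptions stated above.
using Genie, Genie.Requests, Genie.Renderer.Json
using BSON: @load
using GLM, DataFrames, LinearAlgebra

# Assumes reg.bson was copied next to app.jl.
@load "reg.bson" reg

route("/glm", method = POST) do
    input_data = jsonpayload()
    try
        prediction = predict(reg, DataFrame(input_data))
        Json.json(Dict("input" => input_data, "prediction" => prediction))
    catch e
        Json.json(Dict("error" => "There was a problem while generating a prediction."))
    end
end

# async = false blocks here, which keeps the process (and the container) alive.
# The host "0.0.0.0" is an assumption; the notebook itself uses the default host.
up(8000, "0.0.0.0"; async = false)
```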