<a href="https://colab.research.google.com/github/LeonardoGoncRibeiro/05_AppliedMachineLearning/blob/main/07_MLOps_API.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# MLOps: Machine Learning and APIs

In this course, we will learn what MLOps is and how to create APIs using Flask. Then, we will learn how to serve ML models through APIs. We will also see how to better manage the dependencies of our model, and how to serialize an ML model.

In short, we will learn what to do after we have built a model: how can other people use our model to their advantage? This is a basic course on how to create a first API; for a deeper understanding of web development, the reader is referred to other courses.

## MLOps, or Machine Learning Operations

MLOps concerns what we should do after our model is built. For instance, let's say we built a classification model, which takes some information as input and outputs the most likely event. We want any person to be able to use this model, and in an easy manner.

Thus, MLOps is the set of actions we should take to make sure that our model can be used in our business and stays relevant for our final customer. After building our model, through continuous feedback, we should iteratively deploy, operate, integrate, and even rebuild it, so that it keeps representing reality well.

## API, or Application Programming Interface

An API is an interface that the final user uses to access our model in a simplified way. For instance, if we want to know the fastest way to reach a certain point, we can use the Google Maps API. Basically, the user sends a *request* to the API, and the API sends back a *response*.

The API allows communication between services, and allows our complex model to be used by a user who does not necessarily know how the model works.
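To make the request/response idea concrete, here is a minimal sketch that uses only the Python standard library instead of Flask. The handler class `EchoHandler`, the helper `call_api`, and the `/route` path are hypothetical names chosen for illustration; the server is started on a free local port and answers a single GET request with a small JSON body.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET request with a small JSON body."""

    def do_GET(self):
        body = json.dumps({"path": self.path, "message": "hello"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # Silence the default per-request logging


def call_api(path):
    """Start a throwaway local server, send one GET request, return (status, parsed JSON)."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # Port 0 lets the OS pick a free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # An opener without proxies, so the request always goes straight to localhost
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open("http://127.0.0.1:{}{}".format(port, path)) as response:
            return response.status, json.loads(response.read().decode("utf-8"))
    finally:
        server.shutdown()
        server.server_close()


status, payload = call_api("/route")
print(status, payload)
```

The client only needs to know the address and the shape of the response, not how the server produces it. This is the same separation we rely on when serving a model.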

To develop our APIs, we will use the Flask package, together with flask-ngrok and pyngrok to expose the app from Google Colab:

In [None]:
!pip install flask-ngrok

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting flask-ngrok
  Downloading flask_ngrok-0.0.25-py3-none-any.whl (3.1 kB)
Installing collected packages: flask-ngrok
Successfully installed flask-ngrok-0.0.25


In [None]:
!pip install pyngrok

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting pyngrok
  Downloading pyngrok-5.1.0.tar.gz (745 kB)
Building wheels for collected packages: pyngrok
  Building wheel for pyngrok (setup.py) ... done
  Created wheel for pyngrok: filename=pyngrok-5.1.0-py3-none-any.whl size=19007 sha256=62eaa13e173c9df2c5be795faac941cd44514ee634796df5feebba1ec21b3a6b
  Stored in directory: /root/.cache/pip/wheels/bf/e6/af/ccf6598ecefecd44104069371795cb9b3afbcd16987f6ccfb3
Successfully built pyngrok
Installing collected packages: pyngrok
Successfully installed pyngrok-5.1.0


In [None]:
from flask import Flask
from flask_ngrok import run_with_ngrok
from pyngrok import ngrok

Nice! Now, to create our first API, we need to instantiate a Flask object:

In [None]:
my_first_app = Flask('first_app')
run_with_ngrok(my_first_app)

In [None]:
port = 5000
auth_token = '2ATxMOi0B3lpLciSMzBb9eOAejh_7uTTU7DVWwQH18eA3RnkQ'
ngrok.set_auth_token(auth_token)
public_url = ngrok.connect(port).public_url



We should then define the routes of our API. 

In [None]:
@my_first_app.route('/')
def home( ):
  return "My first API."

Now, to run our app, we can do:

In [None]:
my_first_app.run( )

 * Serving Flask app "first_app" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Running on http://407a-35-186-174-7.ngrok.io
 * Traffic stats available on http://127.0.0.1:4040
127.0.0.1 - - [12/Jun/2022 21:01:20] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [12/Jun/2022 21:01:21] "GET /favicon.ico HTTP/1.1" 404 -
127.0.0.1 - - [12/Jun/2022 21:01:22] "GET / HTTP/1.1" 200 -


Nice! So, we have been able to create our first API (which can be reached from Google Colab through the ngrok link). This API simply returns the text "My first API.", which is shown on our screen.
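We can also check a route without starting a server or opening a tunnel, using Flask's built-in test client. The sketch below assumes a separate, hypothetical app named `sketch_app` so that it does not interfere with `my_first_app`.

```python
from flask import Flask

app = Flask("sketch_app")  # Hypothetical app name, separate from my_first_app


@app.route("/")
def home():
    return "My first API."


# The test client sends requests to the app in-process, without starting a server
client = app.test_client()
response = client.get("/")
print(response.status_code, response.get_data(as_text=True))
```

This is handy in a notebook, since `app.run( )` blocks the cell until the server is stopped.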

Note that there are other great frameworks for web development using Python, such as Django or Pyramid.

https://www.djangoproject.com/

https://trypyramid.com/

# Creating an API for a simple sentiment analysis model

Just as an example, we will develop a Sentiment Analysis model here. Sentiment Analysis tests whether a given mention (or post) is positive, negative, or neutral.

Let's create our model:

In [None]:
from textblob import TextBlob

text = "Python is great for Machine Learning."

tb = TextBlob(text)

In [None]:
tb.sentiment.polarity

0.8
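TextBlob's polarity is a score between -1 (most negative) and 1 (most positive). If we want the positive/negative/neutral labels mentioned above, we have to map the score ourselves. The sketch below is one possible mapping; the function name `label_polarity` and the threshold of 0.1 are assumptions for illustration, not something TextBlob defines.

```python
def label_polarity(polarity, threshold=0.1):
    """Map a polarity score in [-1, 1] to a sentiment label.

    The threshold of 0.1 is an arbitrary assumption for illustration.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


for score in (0.8, 0.0, -0.5):
    print(score, label_polarity(score))
```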

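Note that TextBlob can also be used for translating text (polarity analysis should be used only on English text). This relies on the `translate` method of the TextBlob version used here; more recent TextBlob releases have removed it, so the cells below may fail on a current install: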

In [None]:
text_pt = "Python é ótimo para Aprendizado de Máquina"

tb_pt = TextBlob(text_pt)

tb_en = tb_pt.translate(from_lang = 'pt', to = 'en')

In [None]:
tb_en

TextBlob("Python is great for machine learning")

In [None]:
tb_en.polarity

0.8

Nice! So, we can create a sentiment analysis model using the TextBlob library. Let's try to use it in our API:

In [None]:
my_first_app = Flask(__name__)
run_with_ngrok(my_first_app)

port = 5000
auth_token = '2ATxMOi0B3lpLciSMzBb9eOAejh_7uTTU7DVWwQH18eA3RnkQ'
ngrok.set_auth_token(auth_token)
public_url = ngrok.connect(port).public_url

@my_first_app.route('/')
def home( ):
  return "My first API."

@my_first_app.route('/sentiment/<text>')
def sentiment(text):
  tb_pt = TextBlob(text)
  tb_en = tb_pt.translate(from_lang = 'pt', to = 'en')
  pol = tb_en.polarity
  return "polarity: {}".format(pol)

my_first_app.run( )

 * Serving Flask app "__main__" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Running on http://61a8-35-186-174-7.ngrok.io
 * Traffic stats available on http://127.0.0.1:4040
127.0.0.1 - - [12/Jun/2022 21:04:02] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [12/Jun/2022 21:04:02] "GET /favicon.ico HTTP/1.1" 404 -
127.0.0.1 - - [12/Jun/2022 21:04:03] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [12/Jun/2022 21:04:12] "GET /sentiment/Python%20é%20ótimo HTTP/1.1" 200 -
127.0.0.1 - - [12/Jun/2022 21:04:12] "GET /favicon.ico HTTP/1.1" 404 -
127.0.0.1 - - [12/Jun/2022 21:04:13] "GET /sentiment/Python%20é%20ótimo HTTP/1.1" 200 -


Nice! This way, we create a route for our API, where we can pass a text in Portuguese and get its polarity. For instance, going to:

https://61a8-35-186-174-7.ngrok.io/sentiment/Python%20%C3%A9%20%C3%B3timo

we get the polarity for "Python é ótimo" (or "Python is great", in English), which is 0.8.

Note that, here, the text is passed directly in the URL.
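Since the text travels inside the URL, spaces and accented characters have to be percent-encoded, which is why "Python é ótimo" appears as `Python%20%C3%A9%20%C3%B3timo` in the link above. Browsers do this for us, but we can reproduce it with the standard library. In this sketch, `base_url` is a hypothetical placeholder for the ngrok address of your own run.

```python
from urllib.parse import quote, unquote

text = "Python é ótimo"

# Percent-encode the text so it can be placed safely in the URL path
encoded = quote(text)
print(encoded)

# Hypothetical base URL: replace it with the ngrok address printed by your own run
base_url = "http://example.ngrok.io"
print("{}/sentiment/{}".format(base_url, encoded))

# Decoding recovers the original text, which is what the route receives
print(unquote(encoded))
```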

# Creating an API for a regression model

Ok, we have shown how to create an API from a simple model using TextBlob. Usually, however, we need to train and fit our own model. Let's see how we can train a model and use it in our API.

Here, we will use a housing prices dataset from Kaggle:

In [None]:
from google.colab import files
uploaded = files.upload( )

Saving casas.csv to casas.csv


In [None]:
import pandas as pd
import io

prices = pd.read_csv(io.BytesIO(uploaded['casas.csv']))

In [None]:
prices.head( )

| Unnamed: 0 | tamanho | ano | garagem | preco |
|---|---|---|---|---|
| 0 | 159.0 | 2003 | 2 | 208500 |
| 1 | 117.0 | 1976 | 2 | 181500 |
| 2 | 166.0 | 2001 | 2 | 223500 |
| 3 | 160.0 | 1915 | 3 | 140000 |
| 4 | 204.0 | 2000 | 3 | 250000 |


Ok, so we have information about the size (`tamanho`), the year (`ano`), and the garage size (`garagem`) of each house, along with its price (`preco`). Here, we will build a very simple model using only one variable:

In [None]:
prices_tmp = prices[['tamanho', 'preco']].copy( )

Now, let's separate our independent and target variables:

In [None]:
X = prices_tmp.drop('preco', axis = 1)
y = prices_tmp.preco.copy( )

Now, let's make our train/test split:

In [None]:
from sklearn.model_selection import train_test_split

SEED = 42

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = SEED)

Finally, let's fit a simple linear regression model:

In [None]:
from sklearn.linear_model import LinearRegression

model_LinReg = LinearRegression( )
model_LinReg.fit( X_train, y_train )

LinearRegression()
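To see what fitting a one-variable linear regression amounts to, here is a sketch of the ordinary least squares closed form in plain Python. It uses only the five rows displayed by `prices.head( )` above, so its numbers will not match those of `model_LinReg`, which is trained on the full training split. The helper names `fit_simple_linear_regression` and `predict` are hypothetical.

```python
# (tamanho, preco) pairs taken from the first five rows shown by prices.head()
sizes = [159.0, 117.0, 166.0, 160.0, 204.0]
house_prices = [208500, 181500, 223500, 140000, 250000]


def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the least squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    variance = sum((xi - mean_x) ** 2 for xi in x)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_simple_linear_regression(sizes, house_prices)


def predict(size):
    """Predict a price from a size using the fitted line."""
    return slope * size + intercept


print(slope, intercept, predict(120))
```

A useful sanity check is that the fitted line always passes through the point given by the mean size and the mean price.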

Nice! Now, let's try to deploy this model using an API. We basically have to bring the model-fitting code into the API code. Note that the model should be fitted beforehand, outside the endpoint, since we do not want to repeat model training every time someone asks for a prediction. The endpoint is then left with just the prediction:

In [None]:
my_first_app = Flask(__name__)
run_with_ngrok(my_first_app)

prices = pd.read_csv('casas.csv')
prices_tmp = prices[['tamanho', 'preco']].copy( )
X = prices_tmp.drop('preco', axis = 1)
y = prices_tmp.preco.copy( )
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = SEED)
model_LinReg = LinearRegression( )
model_LinReg.fit( X_train, y_train )

port = 5000
auth_token = '2ATxMOi0B3lpLciSMzBb9eOAejh_7uTTU7DVWwQH18eA3RnkQ'
ngrok.set_auth_token(auth_token)
public_url = ngrok.connect(port).public_url

@my_first_app.route('/')
def home( ):
  return "My first API."

@my_first_app.route('/sentiment/<text>')
def sentiment(text):
  tb_pt = TextBlob(text)
  tb_en = tb_pt.translate(from_lang = 'pt', to = 'en')
  pol = tb_en.polarity
  return "polarity: {}".format(pol)

@my_first_app.route('/linreg/<int:size>')
def get_price(size):
  x_pred = size
  y_pred = model_LinReg.predict([[x_pred]])[0]
  return "prediction: {}".format(y_pred)

my_first_app.run( )

 * Serving Flask app "__main__" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Running on http://88e7-35-186-174-7.ngrok.io
 * Traffic stats available on http://127.0.0.1:4040
127.0.0.1 - - [12/Jun/2022 21:31:00] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [12/Jun/2022 21:31:01] "GET /favicon.ico HTTP/1.1" 404 -
  "X does not have valid feature names, but"
127.0.0.1 - - [12/Jun/2022 21:31:09] "GET /linreg/120 HTTP/1.1" 200 -
127.0.0.1 - - [12/Jun/2022 21:31:10] "GET /favicon.ico HTTP/1.1" 404 -
  "X does not have valid feature names, but"
127.0.0.1 - - [12/Jun/2022 21:31:12] "GET /linreg/120 HTTP/1.1" 200 -


Nice! Our model works fine: for a house with 120 m², it returns a price of 157377.06185641736. This is the same value we find when we predict using our model directly:

In [None]:
model_LinReg.predict([[120]])[0]

  "X does not have valid feature names, but"


157377.06185641736

Nice!
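The `<int:size>` part of the route is a Flask URL converter: it only matches integer path segments and hands the function an `int` instead of a string. The sketch below shows this behavior with the test client. The app `converter_sketch` and the `fake_predict` function (with made-up coefficients) are hypothetical stand-ins for `my_first_app` and `model_LinReg.predict`.

```python
from flask import Flask

app = Flask("converter_sketch")  # Hypothetical app, separate from my_first_app


def fake_predict(size):
    """Stand-in for model_LinReg.predict; the coefficients are made up."""
    return 1000.0 * size + 50000.0


@app.route("/linreg/<int:size>")
def get_price(size):
    # Flask has already converted `size` from text to int here
    return "prediction: {}".format(fake_predict(size))


client = app.test_client()
ok = client.get("/linreg/120")
bad = client.get("/linreg/abc")  # Not an integer, so no route matches
print(ok.status_code, ok.get_data(as_text=True))
print(bad.status_code)
```

A non-integer value never reaches our function, so the prediction code does not need to validate the type itself.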

## Using multiple inputs

We can still try to improve our model a little. Let's try to implement a model with more than one variable. Thus, we have to send a request with more than one value to our server. Basically, we will send a JSON, with all our features. This JSON will have the format:

In [None]:
features = {
    "tamanho" : 120,
    "ano" : 2001,
    "garagem" : 2
}
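Note that the keys of a JSON object carry no guaranteed order, while the model expects the values in the same column order used in training. A minimal sketch of the round trip, using only the standard library, could look like this (the check for missing keys is an addition for illustration):

```python
import json

X_cols = ["tamanho", "ano", "garagem"]

# What the client sends: the features dictionary serialized as a JSON string
payload = json.dumps({"garagem": 2, "tamanho": 120, "ano": 2001})

# What the server does: parse the JSON and order the values like the training columns
received = json.loads(payload)
missing = [col for col in X_cols if col not in received]
if missing:
    raise ValueError("missing features: {}".format(missing))

row = [received[col] for col in X_cols]
print(row)
```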

So, let's update our endpoint:

In [None]:
from flask import request, jsonify      # `request` gives access to the incoming request (and its JSON body); `jsonify` builds a JSON response

my_first_app = Flask(__name__)
run_with_ngrok(my_first_app)

X_cols = ['tamanho', 'ano', 'garagem']

prices = pd.read_csv('casas.csv')
X = prices[X_cols]                      # Select the features explicitly, so the 'Unnamed: 0' column is not used as a feature
y = prices.preco.copy( )
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = SEED)
model_LinReg = LinearRegression( )
model_LinReg.fit( X_train, y_train )

port = 5000
auth_token = '2ATxMOi0B3lpLciSMzBb9eOAejh_7uTTU7DVWwQH18eA3RnkQ'
ngrok.set_auth_token(auth_token)
public_url = ngrok.connect(port).public_url

@my_first_app.route('/')
def home( ):
  return "My first API."

@my_first_app.route('/sentiment/<text>')
def sentiment(text):
  tb_pt = TextBlob(text)
  tb_en = tb_pt.translate(from_lang = 'pt', to = 'en')
  pol = tb_en.polarity
  return "polarity: {}".format(pol)

@my_first_app.route('/linreg/', methods  = ['POST'])    # Now, we are using the POST method to receive new entries
def linreg( ):
  x_pred = request.get_json( )
  x_input = [x_pred[col] for col in X_cols]    # Keep the same column order used in training (and avoid shadowing the built-in input)
  y_pred = model_LinReg.predict([x_input])[0]
  return jsonify(price = y_pred)

my_first_app.run( )

 * Serving Flask app "__main__" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Running on http://5857-35-186-174-7.ngrok.io
 * Traffic stats available on http://127.0.0.1:4040
  "X does not have valid feature names, but"
127.0.0.1 - - [12/Jun/2022 21:54:37] "POST //linreg/ HTTP/1.1" 200 -


Now that we use the POST method, we can't call the endpoint directly from the browser's URL bar. To test our API, we have to use another tool, such as Postman:

https://web.postman.co/
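Another option is Flask's test client, which can send a POST request with a JSON body to the app in-process. The sketch below also returns a 400 status when the body is missing features, which is an addition for illustration rather than part of the endpoint above. The app `post_sketch` and `fake_predict` (with made-up weights) are hypothetical stand-ins for `my_first_app` and `model_LinReg.predict`.

```python
from flask import Flask, jsonify, request

app = Flask("post_sketch")  # Hypothetical app, separate from my_first_app
X_cols = ["tamanho", "ano", "garagem"]


def fake_predict(row):
    """Stand-in for model_LinReg.predict; the weights are made up."""
    weights = [1000.0, 10.0, 5000.0]
    return sum(w * value for w, value in zip(weights, row))


@app.route("/linreg/", methods=["POST"])
def linreg():
    data = request.get_json(silent=True)  # None when the body is not valid JSON
    if not isinstance(data, dict) or any(col not in data for col in X_cols):
        return jsonify(error="expected JSON with keys {}".format(X_cols)), 400
    row = [data[col] for col in X_cols]
    return jsonify(price=fake_predict(row))


client = app.test_client()
good = client.post("/linreg/", json={"tamanho": 120, "ano": 2001, "garagem": 2})
bad = client.post("/linreg/", json={"tamanho": 120})
print(good.status_code, good.get_json())
print(bad.status_code, bad.get_json())
```

Without such a check, a missing key would raise a `KeyError` inside the endpoint and the client would only see a generic server error.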

## Using a pre-built model

Note that, here, we have built our model in the same code in which we developed the API. However, if model fitting is too expensive, refitting the model every time the API starts may become impractical.

We can instead create the model in a separate file, and then *serialize* it so that we can later load it in another part of our code. To serialize our model, we can use the pickle library.

In [None]:
import pickle

Now, let's create our model again:

In [None]:
prices = pd.read_csv('casas.csv')
X = prices[['tamanho', 'ano', 'garagem']]    # Select the features explicitly, so the 'Unnamed: 0' column is not used as a feature
y = prices.preco.copy( )
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = SEED)
model_LinReg = LinearRegression( )
model_LinReg.fit( X_train, y_train )

LinearRegression()

Now, we can serialize our model using the pickle.dump( ) function:

In [None]:
with open('model_LinReg.sav', 'wb') as f:    # The with block closes the file once the model is written
  pickle.dump(model_LinReg, f)

Nice! Now, we can just load our serialized model in our API:

In [None]:
my_first_app = Flask(__name__)
run_with_ngrok(my_first_app)

with open('model_LinReg.sav', 'rb') as f:    # Only unpickle files that come from a source you trust
  model_LinReg = pickle.load(f)

X_cols = ['tamanho', 'ano', 'garagem']

port = 5000
auth_token = '2ATxMOi0B3lpLciSMzBb9eOAejh_7uTTU7DVWwQH18eA3RnkQ'
ngrok.set_auth_token(auth_token)
public_url = ngrok.connect(port).public_url

@my_first_app.route('/')
def home( ):
  return "My first API."

@my_first_app.route('/sentiment/<text>')
def sentiment(text):
  tb_pt = TextBlob(text)
  tb_en = tb_pt.translate(from_lang = 'pt', to = 'en')
  pol = tb_en.polarity
  return "polarity: {}".format(pol)

@my_first_app.route('/linreg/', methods  = ['POST'])    # Now, we are using the POST method to receive new entries
def linreg( ):
  x_pred = request.get_json( )
  x_input = [x_pred[col] for col in X_cols]    # Keep the same column order used in training (and avoid shadowing the built-in input)
  y_pred = model_LinReg.predict([x_input])[0]
  return jsonify(price = y_pred)

my_first_app.run( )

 * Serving Flask app "__main__" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Running on http://c82d-35-186-174-7.ngrok.io
 * Traffic stats available on http://127.0.0.1:4040
127.0.0.1 - - [12/Jun/2022 22:29:33] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [12/Jun/2022 22:29:34] "GET /favicon.ico HTTP/1.1" 404 -
127.0.0.1 - - [12/Jun/2022 22:29:35] "GET / HTTP/1.1" 200 -
  "X does not have valid feature names, but"
127.0.0.1 - - [12/Jun/2022 22:29:43] "POST /linreg/ HTTP/1.1" 200 -


Nice! Everything worked out, even though we now just imported the serialized model instead of building it from scratch!
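The same dump/load round trip can be tried on any Python object. In the sketch below, a plain dictionary of made-up parameters stands in for the fitted model, and the file name `model_sketch.sav` is hypothetical; the file is written to a temporary directory so nothing is left behind.

```python
import os
import pickle
import tempfile

# Stand-in for a fitted model: a plain dict of made-up parameters
model_params = {"coef": [1000.0, 10.0, 5000.0], "intercept": 2500.0}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model_sketch.sav")  # Hypothetical file name

    # "Training" side: write the object to disk
    with open(path, "wb") as f:
        pickle.dump(model_params, f)

    # "API" side: load it back without redoing any training
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == model_params)
```

Keep in mind that unpickling can execute arbitrary code, so a serialized model should only be loaded from a source we trust, ideally with the same library versions that were used to create it.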

Now, just for learning purposes, let's introduce a basic authentication procedure:

In [None]:
!pip install flask-basicauth

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting flask-basicauth
  Downloading Flask-BasicAuth-0.2.0.tar.gz (16 kB)
Building wheels for collected packages: flask-basicauth
  Building wheel for flask-basicauth (setup.py) ... done
  Created wheel for flask-basicauth: filename=Flask_BasicAuth-0.2.0-py3-none-any.whl size=4243 sha256=97cb3c2b779008318bfcc9399672a815fdd65a28e6d9db31fd392cdb5c818592
  Stored in directory: /root/.cache/pip/wheels/d5/08/a3/19638d90fdf01258ede772449bcbde424839459749acb977b6
Successfully built flask-basicauth
Installing collected packages: flask-basicauth
Successfully installed flask-basicauth-0.2.0


Now, we can just add a few lines of code to set up basic authentication:

In [None]:
from flask_basicauth import BasicAuth

my_first_app = Flask(__name__)
run_with_ngrok(my_first_app)
my_first_app.config['BASIC_AUTH_USERNAME'] = 'Leo'
my_first_app.config['BASIC_AUTH_PASSWORD'] = 'pswd'

with open('model_LinReg.sav', 'rb') as f:    # Only unpickle files that come from a source you trust
  model_LinReg = pickle.load(f)

X_cols = ['tamanho', 'ano', 'garagem']

basic_auth = BasicAuth(my_first_app)

port = 5000
auth_token = '2ATxMOi0B3lpLciSMzBb9eOAejh_7uTTU7DVWwQH18eA3RnkQ'
ngrok.set_auth_token(auth_token)
public_url = ngrok.connect(port).public_url

@my_first_app.route('/')
def home( ):
  return "My first API."

@my_first_app.route('/sentiment/<text>')
@basic_auth.required
def sentiment(text):
  tb_pt = TextBlob(text)
  tb_en = tb_pt.translate(from_lang = 'pt', to = 'en')
  pol = tb_en.polarity
  return "polarity: {}".format(pol)

@my_first_app.route('/linreg/', methods  = ['POST'])    # Now, we are using the POST method to receive new entries
def linreg( ):
  x_pred = request.get_json( )
  x_input = [x_pred[col] for col in X_cols]    # Keep the same column order used in training (and avoid shadowing the built-in input)
  y_pred = model_LinReg.predict([x_input])[0]
  return jsonify(price = y_pred)

my_first_app.run( )

 * Serving Flask app "__main__" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Running on http://1e92-35-186-174-7.ngrok.io
 * Traffic stats available on http://127.0.0.1:4040
127.0.0.1 - - [12/Jun/2022 22:39:57] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [12/Jun/2022 22:39:58] "GET /favicon.ico HTTP/1.1" 404 -
127.0.0.1 - - [12/Jun/2022 22:39:59] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [12/Jun/2022 22:40:08] "GET /sentiment/Python%20é%20ótimo HTTP/1.1" 401 -
127.0.0.1 - - [12/Jun/2022 22:40:15] "GET /sentiment/Python%20é%20ótimo HTTP/1.1" 401 -
127.0.0.1 - - [12/Jun/2022 22:40:17] "GET /sentiment/Python%20é%20ótimo HTTP/1.1" 200 -
127.0.0.1 - - [12/Jun/2022 22:40:18] "GET /favicon.ico HTTP/1.1" 404 -


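Now, our sentiment analysis endpoint requires basic authentication before it runs! In the log above, the first requests get a 401 (unauthorized) response, and the request succeeds with 200 once the credentials are provided.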
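Under the hood, HTTP basic authentication sends the username and password in an `Authorization` header, joined by a colon and Base64-encoded. The sketch below builds and parses such a header with the standard library; the helper names `basic_auth_header` and `parse_basic_auth_header` are hypothetical.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of the Authorization header used by HTTP basic authentication."""
    credentials = "{}:{}".format(username, password).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def parse_basic_auth_header(header):
    """Recover (username, password) from a basic authentication header value."""
    scheme, _, encoded = header.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a basic authentication header")
    username, _, password = base64.b64decode(encoded).decode("utf-8").partition(":")
    return username, password


header = basic_auth_header("Leo", "pswd")
print(header)
print(parse_basic_auth_header(header))
```

Base64 is an encoding, not encryption: anyone who sees the header can recover the password. This is why basic authentication should only be used over HTTPS, and why real credentials should not be hard-coded in the notebook.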

# Testing requests using Python

We can use the requests library to test our APIs from Python. For instance, we can do:

In [None]:
import requests

url = 'http://c82d-35-186-174-7.ngrok.io/linreg/'
json_request = {"tamanho" : 120, "ano" : 2001, "garagem" : 2}

response = requests.post(url, json = json_request)

Then, we can get the result from this response using:

In [None]:
response.json( )

If the endpoint uses basic authentication, we can pass the credentials with:

In [None]:
url = 'http://c82d-35-186-174-7.ngrok.io/sentiment/Python é ótimo'

auth = requests.auth.HTTPBasicAuth('Leo', 'pswd')

response = requests.get(url, auth = auth)