# Introduction to Models as Web Endpoints

## Predictive model as a web endpoint
For a machine learning model to be useful, you need a way of sharing its results with other services and applications within your organization. While you can precompute results and save them to a database using a batch pipeline approach, it’s often necessary to respond to requests in real time with up-to-date information. One way of achieving this goal is to set up a predictive model as a web endpoint that can be invoked from other services.

This chapter shows how to set up this functionality for both scikit-learn and Keras models, and introduces Python tools that can help scale up this functionality.

## Hosting and consuming web endpoints
It’s good to build experience in both hosting and consuming web endpoints when building out model pipelines with Python. In some cases, a predictive model needs to pull data points from other services before making a prediction, such as additional attributes about a user’s history that serve as input to feature engineering. In this chapter, we’ll focus on JSON-based services because JSON is a popular data format and maps naturally onto Python’s data types.
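
As a rough illustration of how JSON maps onto Python's built-in types, the sketch below round-trips a small record through the standard library's ``json`` module. The record and its field names (``user_id``, ``sessions``, ``is_subscriber``, ``last_purchase``) are hypothetical and stand in for the kind of user attributes a model might pull from another service.

```python
import json

# Hypothetical user attributes; these field names are assumptions for illustration.
user_record = {
    "user_id": 42,
    "sessions": [3, 5, 8],
    "is_subscriber": True,
    "last_purchase": None,
}

# Serialize the dictionary to a JSON string, as a service would before responding.
payload = json.dumps(user_record)

# Parse the JSON string back into Python objects, as a caller would.
decoded = json.loads(payload)

print(payload)
print(type(decoded), type(decoded["sessions"]), type(decoded["is_subscriber"]))
```

JSON objects come back as dictionaries, arrays as lists, ``true``/``false`` as booleans, and ``null`` as ``None``, so the decoded value compares equal to the original record.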

## Model as an endpoint
A model as an endpoint is a system that returns a prediction in response to a set of parameters. These parameters can be a feature vector, an image, or another type of data used as input to a predictive model. The endpoint then makes a prediction and returns the result, typically as a JSON payload. Setting up a model this way has three benefits: other systems can use the predictive model, results are available in real time, and the model can be used within a broader data pipeline.
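
To make the request and response shape concrete, here is a minimal sketch of the logic behind such an endpoint, without any web framework. The ``predict`` and ``handle_request`` functions, the feature names, and the weights are all hypothetical; a real endpoint would call a trained model instead of this hand-written logistic formula.

```python
import json
import math

# Hypothetical weights standing in for a trained model (assumption for illustration).
WEIGHTS = {"sessions": 0.4, "purchases": 1.2}
BIAS = -2.0


def predict(params: dict) -> float:
    """Return a probability-like score from a dictionary of features."""
    score = BIAS + sum(
        weight * float(params.get(name, 0.0)) for name, weight in WEIGHTS.items()
    )
    return 1.0 / (1.0 + math.exp(-score))


def handle_request(body: str) -> str:
    """Parse a JSON request body, run the model, and return a JSON response."""
    params = json.loads(body)
    return json.dumps({"success": True, "prediction": predict(params)})


response = handle_request('{"sessions": 5, "purchases": 1}')
print(response)
```

Keeping the prediction logic in a plain function like this means the same code can later be wrapped by a web framework or called from a batch pipeline.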

## Chapter learning outcomes
In this chapter, we will:

- Call web services using Python.
- Set up endpoints.
- Save models so that they can be used in production environments.
- Host scikit-learn and Keras predictive models.
- Scale up a service with Gunicorn and Heroku.
- Build an interactive web application with Plotly Dash.

# Web Services

Before we host a predictive model, we’ll use Python to call a web service and to process the result. After showing how to process a web response, we’ll set up our own service that echoes the passed-in message back to the caller.

## Libraries to install
There are a few different libraries we’ll need to install for the examples in this chapter:

In [2]:
# !pip install  requests==2.23.0 Flask==1.1.4 gunicorn==20.1.0 mlflow==1.25.1 pillow==9.1.0 dash==2.3.1

## Functionalities
These libraries provide the following functionalities:

- Requests: provides functions for making HTTP requests, such as GET and POST
- Flask: enables Python functions to be exposed as HTTP endpoints
- Gunicorn: a WSGI server that enables hosting Flask apps in production environments
- MLflow: a model library that provides model persistence
- Pillow: a fork of the Python Imaging Library (PIL)
- Dash: enables writing interactive web apps in Python

Many of the tools for building web services in the Python ecosystem work well with Flask. For example, Gunicorn can be used to host Flask applications at production scale, and the Dash library builds on top of Flask.

## Heroku 
To get started with making web requests in Python, we’ll use the Cat Facts Heroku app.

Heroku is a cloud platform that works well for hosting Python applications. We’ll explore it later in this chapter.

## Example 
The Cat Facts service exposes a simple API that returns a JSON response containing interesting tidbits about felines. We can use the ``/facts/random`` endpoint to retrieve a random fact with the requests library:

In [3]:
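import requests

# Perform an HTTP GET against the Cat Facts random-fact endpoint
result = requests.get("http://cat-fact.herokuapp.com/facts/random")
print(result)                 # response object, including the status code
print(result.json())          # full JSON payload as a Python dictionary
print(result.json()['text'])  # value stored under the 'text' key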

<Response [200]>
{'status': {'verified': True, 'sentCount': 1}, '_id': '591f98783b90f7150a19c1a0', '__v': 0, 'text': 'The average lifespan of an outdoor-only cat is about 3 to 5 years while an indoor-only cat can live 16 years or much longer.', 'source': 'api', 'updatedAt': '2020-08-23T20:20:01.611Z', 'type': 'cat', 'createdAt': '2018-06-03T20:20:02.291Z', 'deleted': False, 'used': False, 'user': '5a9ac18c7478810ea6c06381'}
The average lifespan of an outdoor-only cat is about 3 to 5 years while an indoor-only cat can live 16 years or much longer.


This snippet loads the requests library and then uses the ``get`` function to perform an HTTP GET request for the passed-in URL. The result is a response object that provides a status code and, if available, a payload. In this case, the payload can be processed using the ``json`` function, which returns it as a Python dictionary. The three print statements show the response object with its status code, the full payload, and the value of the ``text`` key in the returned dictionary.
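
Because the payload is an ordinary dictionary, nested fields can be read with standard indexing. The sketch below parses a trimmed copy of the response shown above with the standard ``json`` module, so it runs without a network call. ``dict.get`` is used for keys that may be absent, and the ``missing_key`` lookup is purely illustrative.

```python
import json

# Trimmed copy of the Cat Facts payload shown above, as a raw JSON string.
raw = '''
{
  "status": {"verified": true, "sentCount": 1},
  "_id": "591f98783b90f7150a19c1a0",
  "text": "The average lifespan of an outdoor-only cat is about 3 to 5 years while an indoor-only cat can live 16 years or much longer.",
  "type": "cat",
  "deleted": false
}
'''

fact = json.loads(raw)

# Nested JSON objects become nested dictionaries.
verified = fact["status"]["verified"]

# dict.get returns a default instead of raising KeyError for absent keys.
missing = fact.get("missing_key", "n/a")

print(verified, fact["type"], missing)
```

When calling a live service, the same lookups apply to the dictionary returned by ``result.json()``.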

# Echo Service