# Flask demo

We will use Flask to build our web service.

A minimal demo, `ping.py`:

```python
## ping.py
from flask import Flask
app = Flask('ping')

@app.route('/ping', methods=['GET'])
def ping():
    return 'PONG'


if __name__ == '__main__':
    # only runs when called from the CLI as a script
    # or with python -m
    app.run(debug=True, host='0.0.0.0', port=9222)
```

Once the server is running, send a request from another terminal:

```bash
curl http://localhost:9222/ping
```

`curl` sends a `GET` request by default. Because the URL path is `/ping`, Flask routes the request to the `ping()` function, and its return value, `PONG`, is printed in our terminal.

## HTTP methods

HTTP (HyperText Transfer Protocol) defines a set of request methods that tell the server what the client wants done:

* GET
* POST
* PUT
* HEAD
* DELETE
* PATCH
* OPTIONS
* CONNECT
* TRACE

For example, a client sends an HTTP request to a server, and depending on the method, the server may reply with a different response. The response contains a status code and, where applicable, the requested content.

`GET` requests data from the server. `POST` sends data to the server. They are the two most common methods.
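To make the difference between `GET` and `POST` concrete, here is a minimal sketch that uses only the Python standard library (not Flask). The `EchoHandler` class, the `/echo` path, and the payload are hypothetical names made up for this illustration. The server answers `GET` with plain text and answers `POST` by echoing back the JSON body it received.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: GET returns text, POST echoes the JSON body."""

    def _send(self, status, body, content_type):
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # A GET request carries no body; the server simply returns data.
        self._send(200, b"PONG", "text/plain")

    def do_POST(self):
        # A POST request carries a body; Content-Length says how much to read.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"received": payload}).encode("utf-8")
        self._send(200, body, "application/json")

    def log_message(self, format, *args):
        pass  # keep the example quiet


def send(port, method, path, payload=None):
    """Send one request to the local server and return (status, text)."""
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    body = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {} if body is None else {"Content-Type": "application/json"}
    try:
        conn.request(method, path, body=body, headers=headers)
        resp = conn.getresponse()
        return resp.status, resp.read().decode("utf-8")
    finally:
        conn.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        get_result = send(server.server_port, "GET", "/ping")
        post_result = send(server.server_port, "POST", "/echo", {"name": "example"})
    finally:
        server.shutdown()
        server.server_close()
    return get_result, post_result
```

Calling `demo()` returns the status and body of both responses. The same client data reaches the server only in the `POST` case, because `GET` sends no body.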

## Serving a model with Flask

First, our imports. The shebang `#!/usr/bin/env python` uses the `env` command to look up the `python` executable on the `PATH` instead of hard-coding its location. Since we have multiple versions of Python across our conda envs, this is very useful: it picks up the Python of the currently active conda env.

```python
#!/usr/bin/env python
# coding: utf-8

## predict.py
import pickle
import numpy as np

from flask import Flask, request, jsonify
```

The function that makes a prediction for a single customer:
```python
def predict_single(customer, dv, model):
    X = dv.transform([customer])
    y_pred = model.predict_proba(X)[:, 1]
    return y_pred[0]
```

Loading the serialized, pre-trained model:
```python
with open('churn-model.bin', 'rb') as f_in:
    dv, model = pickle.load(f_in)
```
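The unpacking `dv, model = pickle.load(f_in)` implies that the file holds a two-item tuple. The training script is not shown here, so the save step below is an assumption about how the file was written. The helpers `save_artifacts` and `load_artifacts` are hypothetical names, and plain dictionaries stand in for the real vectorizer and model.

```python
import pickle
import tempfile
from pathlib import Path


def save_artifacts(path, dv, model):
    # Assumed save step: store both objects as one tuple so they load together.
    with open(path, "wb") as f_out:
        pickle.dump((dv, model), f_out)


def load_artifacts(path):
    # Mirrors the loading code above: unpack the tuple back into two names.
    with open(path, "rb") as f_in:
        dv, model = pickle.load(f_in)
    return dv, model


def demo():
    # Stand-ins only: plain dicts instead of a fitted vectorizer and model.
    dv_stub = {"kind": "vectorizer-stand-in"}
    model_stub = {"kind": "model-stand-in"}
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "churn-model.bin"
        save_artifacts(path, dv_stub, model_stub)
        return load_artifacts(path)
```

Two things to keep in mind with `pickle`: only load files you trust, since unpickling can execute arbitrary code, and the classes of the pickled objects must be importable in the environment that loads them.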

Running the web service. Notice the `POST` method: the request must carry a JSON body with the customer's features, since the client is POSTing data to the server. In turn, the web service responds with JSON containing the prediction:
```python
app = Flask('churn')

@app.route('/predict', methods=['POST'])
def predict():
    customer = request.get_json()

    prediction = predict_single(customer, dv, model)
    churn = prediction >= 0.5
    
    result = {
        ## converts numpy types to python types
        ## otherwise json cannot serialize the values.
        'churn_probability': float(prediction),
        'churn': bool(churn),
    }

    return jsonify(result)


if __name__ == '__main__':
    app.run(debug=True, host='0.0.0.0', port=9696)

```
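To call the service, a client has to send a `POST` request with a JSON body and a `Content-Type: application/json` header, which is what `request.get_json()` expects by default. Below is a minimal client sketch using only the standard library. The URL assumes the service runs locally on port 9696, and the customer fields are hypothetical; the real keys must match the features the model was trained on.

```python
import json
from urllib.request import Request, urlopen

# Assumed address: the port from app.run(...) above, reached on the same machine.
URL = "http://localhost:9696/predict"


def build_predict_request(customer, url=URL):
    """Build a POST request whose body is the customer encoded as JSON."""
    body = json.dumps(customer).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(url, data=body, headers=headers, method="POST")


def request_prediction(customer, url=URL):
    """Send the request and decode the JSON response from the service."""
    with urlopen(build_predict_request(customer, url)) as resp:
        return json.loads(resp.read())


# Hypothetical customer record, for illustration only.
customer = {"contract": "month-to-month", "tenure": 12, "monthlycharges": 29.85}

# With predict.py running in another terminal:
# result = request_prediction(customer)
# print(result["churn_probability"], result["churn"])
```

The network call is left commented out so the block can be imported without a running server.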


## Development vs Production Server

Running the `ping.py` script above prints a warning that this is a development server. Production requires a WSGI server instead.

In short, Flask's built-in server is for development and testing only. `gunicorn` is an example of a WSGI server. What is **WSGI (Web Server Gateway Interface)**? It is a specification describing how a web server communicates with web apps, and how web apps can be chained together to process a request. The idea here is that it *abstracts away whichever web framework or web server developers want to work with*: if both sides follow the WSGI spec, server and app can communicate.

*A WSGI server acts as the interface between the app and the web server*. See [Python PEP 3333](https://peps.python.org/pep-3333/).
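To see what the spec asks of an app, here is a bare WSGI application with no framework at all, written against PEP 3333 using the standard library's `wsgiref`. The names `simple_app` and `call_app` are hypothetical. `call_app` plays the role of the server: it builds the `environ` dictionary, calls the app, and collects the response.

```python
from wsgiref.util import setup_testing_defaults


def simple_app(environ, start_response):
    """A bare WSGI application: a callable taking (environ, start_response)."""
    if environ["REQUEST_METHOD"] == "GET" and environ["PATH_INFO"] == "/ping":
        status, body = "200 OK", b"PONG"
    else:
        status, body = "404 Not Found", b"not found"
    headers = [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]  # the response body is an iterable of bytes


def call_app(app, path, method="GET"):
    """Play the server's role: build environ, call the app, collect the output."""
    environ = {"PATH_INFO": path, "REQUEST_METHOD": method}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

A `Flask` object is itself a callable of this shape, which is why any WSGI server can run it without knowing anything about Flask.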

To use `gunicorn` for the prediction app above:

```bash
# bash
gunicorn --bind 0.0.0.0:9696 predict:app
```

And that's it. `predict` is the module name (the file `predict.py` without its extension), and `app` is our `Flask` object.
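The `module:attribute` form can be read as "import this module, then fetch this name from it". The sketch below illustrates that idea with `importlib`. It is a simplified illustration, not gunicorn's actual loader, and `resolve_target` is a hypothetical helper. It is demonstrated on a standard-library target so it runs anywhere.

```python
import importlib


def resolve_target(target):
    """Split 'module:attribute', import the module, and return the attribute."""
    module_name, _, attr_name = target.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Demonstrated on a stdlib target; "predict:app" would resolve the same way
# if predict.py were importable from the current directory.
dumps = resolve_target("json:dumps")
```

This is also why the command should be run from the directory that contains `predict.py`: the module has to be importable.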

### Windows

`gunicorn` does not run on Windows. Use `mod_wsgi` or `waitress` (through its `waitress-serve` command) instead:

```shell
# shell
waitress-serve --listen=0.0.0.0:9696 predict:app
```