# AUA, DS 229 – MLOps
### Week 10 – Flask / REST API / OpenAPI & Swagger UI 

***

In [None]:
# !pip install Flask==2.1.3
# !pip install "connexion[swagger-ui]==2.14.2"

## What is an API?

API stands for "Application Programming Interface." **An API provides a way for different software programs to communicate with each other and exchange information**.

APIs allow developers to create applications that can access data and functionality from other systems or services without needing to understand how those systems work. In this way, **APIs can simplify the development process, allowing developers to focus on creating the user interface and business logic of their applications rather than worrying about the details of how different systems work together**.

For example, an e-commerce website might use an API provided by a payment processing company to process customer payments. The website would send a request to the API with the customer's payment information, and the API would handle the details of processing the payment with the payment processing company's system. The e-commerce website would then receive a response from the API indicating whether the payment was successful or not.

<center><img src="./images/what_is_an_api.png" width=800 height = 700/></center>

[[image source](https://www.geeksforgeeks.org/what-is-an-api/)]

### REST API
REST API stands for Representational State Transfer Application Programming Interface. **It is a type of API that uses the HTTP protocol to enable communication between clients and servers**.

**REST is a set of architectural principles that define how web standards such as HTTP and URLs should be used to create scalable and maintainable web services**. RESTful APIs follow these principles to create a standard way for applications to communicate with each other over the internet. The principles were defined by Roy Fielding in his doctoral dissertation, and they are widely accepted as the best practices for designing web services.

> **HTTP (Hypertext Transfer Protocol) is a protocol used to transfer data over the internet** (a protocol is a set of rules or procedures for transmitting data between electronic devices, such as computers). It is the foundation of data communication on the World Wide Web and allows clients and servers to communicate and exchange data.  
> **HTTP is based on a request-response model, where a client sends a request to a server, and the server sends a response back to the client**. HTTP requests can be sent using different HTTP methods, such as **GET** (read), **POST** (create), **PUT** (update) and **DELETE** (delete), depending on the type of action that needs to be performed.  
> HTTP requests and responses are made up of headers and a body. The headers contain metadata about the request or response, such as the content type or length, while the body contains the actual data being transferred.
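
The request-response cycle described above can be sketched with nothing but the Python standard library. This is a minimal illustration, not part of the course application: the `EchoHandler` class and the `/hello` path are made up for the example, and the server runs on a free local port chosen by the operating system.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a small JSON body."""

    def do_GET(self):
        body = json.dumps({"path": self.path, "method": self.command}).encode("utf-8")
        self.send_response(200)                              # status line
        self.send_header("Content-Type", "application/json")  # headers (metadata)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)                               # body (actual data)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client sends a GET request and reads the response.
with urlopen(f"http://127.0.0.1:{server.server_port}/hello") as response:
    status = response.status
    content_type = response.headers["Content-Type"]
    payload = json.loads(response.read())

server.shutdown()
server.server_close()

print(status, content_type, payload)
```

The client only sees the status code, the headers, and the body; how the server produced them stays hidden behind the protocol.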

The main principles of RESTful APIs are:

- **Client-server architecture**: The client and the server must be separate from each other, with a clear separation of concerns. This means that the client should not have to know anything about the server's internal implementation.

- **Statelessness**: Each request from the client to the server must contain all the necessary information to complete the request. The server does not store any client state between requests, which makes it easier to scale the service.

- **Cacheability**: Responses from the server can be cached by clients, which can improve performance and reduce server load.

- **Layered system**: The architecture may include intermediate layers between the client and the server, such as load balancers or proxies. The client interacts with the service in the same way regardless of whether it is connected directly to the end server or to an intermediary.

- **Uniform interface**: All resources are accessed in the same standardized way: each resource is identified by a URL, manipulated through representations (such as JSON) using the standard HTTP methods, and exchanged in self-descriptive messages. This consistency lets clients interact with the server without knowing anything about the server's internal implementation.

- **Code on demand**: This principle is optional and not always used. It allows the server to send executable code to the client to be executed within the client's context.

By following these principles, RESTful APIs can create a standardized way for applications to communicate over the internet, making it easier to create scalable and maintainable web services.
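
To make the mapping between HTTP methods and resource operations concrete, here is a framework-free sketch. The `handle_request` function and the in-memory `items` store are hypothetical names invented for this illustration; a real service would sit behind an HTTP server and a database. Note that every call carries all the information needed to process it, which is the statelessness principle in miniature.

```python
def handle_request(store, method, item_id=None, body=None):
    """Return (status_code, payload) for one request against an item collection."""
    if method == "POST":  # create a new resource
        new_id = max(store, default=0) + 1
        store[new_id] = body
        return 201, {"id": new_id, **body}
    if item_id not in store:
        return 404, {"error": "not found"}
    if method == "GET":  # read an existing resource
        return 200, {"id": item_id, **store[item_id]}
    if method == "PUT":  # replace an existing resource
        store[item_id] = body
        return 200, {"id": item_id, **body}
    if method == "DELETE":  # remove an existing resource
        del store[item_id]
        return 204, None
    return 405, {"error": "method not allowed"}


items = {}  # resource state lives in the store, not in a per-client session
created = handle_request(items, "POST", body={"name": "alice"})
fetched = handle_request(items, "GET", item_id=1)
updated = handle_request(items, "PUT", item_id=1, body={"name": "bob"})
deleted = handle_request(items, "DELETE", item_id=1)
missing = handle_request(items, "GET", item_id=1)

for result in (created, fetched, updated, deleted, missing):
    print(result)
```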


There are several advantages of using RESTful APIs over non-RESTful APIs:

1) **Scalability**: RESTful APIs are designed to be scalable, as they are stateless and cacheable. Because no client state is kept on the server, any server instance can answer any request, so capacity can be increased by adding instances, and cached responses reduce the load further.
2) **Flexibility**: RESTful APIs are flexible in terms of the data formats they can handle. They can use different data formats, such as JSON, XML, or HTML, making them compatible with a wide range of client applications.
3) **Easy to Understand**: RESTful APIs use standard HTTP methods (GET, POST, PUT, DELETE) and URLs to communicate with the server. This makes it easy for developers to understand how to use the API and reduces the learning curve.
4) **Separation of Concerns**: RESTful APIs separate the client from the server, with a clear separation of concerns. This makes it easier to modify the server without affecting the client, or vice versa.
5) **Reusability**: RESTful APIs promote reusability by breaking down resources into smaller, reusable components. This means that different parts of the API can be used by multiple clients or applications.
6) **Interoperability**: RESTful APIs are interoperable, meaning that they can work with different technologies, platforms, and programming languages. This makes it easier to integrate different applications and services.


## [Flask](https://flask.palletsprojects.com/en/2.2.x/)

<center><img src="./images/flask.png" width=350 height = 500/></center>

**Flask is a lightweight web framework for Python**. It is designed to make building web applications and APIs easy and quick. Flask provides a simple and flexible architecture, allowing developers to easily extend or modify its functionality to suit their needs.

Some key features of Flask include:

- **Routing**: Flask allows developers to define URLs for their web applications and map them to specific functions that will handle the requests.
- **Templating**: Flask comes with a built-in template engine, Jinja2, which makes it easy to render HTML templates with dynamic content.
- **Built-in development server**: Flask includes a development server that makes it easy to test and debug applications during development.
- **Extensibility**: Flask can be extended with various third-party extensions that provide additional functionality, such as authentication, database integration, and caching.
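
As a minimal sketch of the routing feature, the snippet below maps two URLs to Python functions. The route names and the `square` endpoint are invented for illustration and are not taken from `app/flask_main.py`. Instead of starting the development server, the example uses Flask's built-in test client so that it can run anywhere.

```python
from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/")
def index():
    # A plain string is returned as an HTML/text response.
    return "API is up"


@app.route("/square/<int:number>")
def square(number):
    # The <int:number> part of the URL is parsed and passed in as an int.
    return jsonify({"number": number, "square": number * number})


# The test client sends requests to the app without opening a network port.
with app.test_client() as client:
    response = client.get("/square/7")
    print(response.status_code, response.get_json())
```

To serve the same app over HTTP during development, one would call `app.run(port=8000)` instead of using the test client.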


### [OpenAPI](https://spec.openapis.org/oas/latest.html)

<center><img src="./images/openapi_swagger.png" width=350 height = 500/></center>

**OpenAPI is a broadly adopted industry standard for describing modern APIs**. This standard, formerly named **Swagger**, is used to describe, produce, consume, and visualize APIs in a vendor-neutral format. It defines a standard, language-agnostic interface for RESTful APIs, allowing developers to communicate about APIs and share code and documentation. OpenAPI allows you to describe the entire API, including endpoints, request/response formats, security schemes, and other details, in a human-readable format. With an OpenAPI specification, developers can generate client libraries and server stubs in multiple programming languages, perform automated testing, and validate compliance with the API standard. OpenAPI promotes a consistent and standardized approach to API design and helps developers build better and more interoperable applications.

Two major versions of the specification are in common use today: 2 and 3. Version 2 is the original Swagger specification and is still quite common thanks to the many tools that support it. Version 3 is the newer one and the first released by the OpenAPI Initiative (OAI).

Suppose you are a developer at a bank and your task is to write a banking application that handles all payments by communicating with the API that serves payments in the system. In this case, you first need to understand the functionality that the payment API provides: mainly the supported operations and calls, together with their inputs and outputs. This is where OpenAPI can be helpful.


The **OpenAPI Specification** defines how to describe a REST API interface. In practice, the description is written as an **OpenAPI definition**, which is nothing more than a YAML / JSON file that describes what an API can do. It is standardized and human-readable. Well-written API documentation is vital for developers, since understanding the usage of even a single endpoint by digging through the code may take hours.
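
To give a feel for what such a definition contains, the sketch below builds a tiny OpenAPI 3 document as a Python dictionary and prints it as JSON. The title, the `/api` base URL, and the `/hello` path are assumptions made up for the example; the course's actual `swagger.yml` may differ.

```python
import json

# A minimal, hypothetical OpenAPI 3 definition with a single GET endpoint.
openapi_definition = {
    "openapi": "3.0.0",  # version of the OpenAPI Specification
    "info": {
        "title": "Hello API",
        "description": "A one-endpoint example API",
        "version": "1.0.0",  # version of the API itself
    },
    "servers": [{"url": "/api"}],  # root path for all endpoints
    "paths": {
        "/hello": {
            "get": {
                "operationId": "endpoints.hello_world",
                "summary": "Return a greeting",
                "responses": {
                    "200": {"description": "Greeting returned successfully"}
                },
            }
        }
    },
}

definition_json = json.dumps(openapi_definition, indent=2)
print(definition_json)
```

The same structure could be written in YAML; JSON is used here only because the standard library can produce it without extra dependencies.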


### [Connexion](https://github.com/spec-first/connexion)

Connexion is a framework that automagically handles HTTP requests based on OpenAPI Specification (formerly known as Swagger Spec) of your API described in YAML format. Connexion allows you to write an OpenAPI specification, then maps the endpoints to your Python functions; this makes it unique, as many tools generate the specification based on your Python code. You can describe your REST API in as much detail as you want; then Connexion guarantees that it will work as you specified.

[[source](https://github.com/spec-first/connexion)]


In short, OpenAPI is a specification for describing RESTful APIs, Swagger is a set of tools for designing and documenting APIs that conform to the OpenAPI specification, and Connexion is a Python module that simplifies the process of building APIs using the OpenAPI specification.


***

## Developing an API to serve an ML prediction system

In [None]:
import pickle
from datetime import datetime
import requests
import json

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.datasets import load_breast_cancer
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import classification_report
from sklearn.preprocessing import StandardScaler

%matplotlib inline

In [None]:
cancer = load_breast_cancer()

features = ["mean concavity", "worst area", "mean area"]
df_feat = pd.DataFrame(cancer["data"], columns=cancer["feature_names"])
df_target = pd.DataFrame(cancer["target"], columns=["Cancer"])

df_feat = df_feat[features]

In [None]:
df_feat.head()

In [None]:
df_target["Cancer"].unique()  # 0 - malignant, 1 - benign (see cancer["target_names"]).

In [None]:
data_to_viz = pd.concat([df_feat, df_target], axis=1)
for feature in df_feat.columns:
    sns.displot(data=data_to_viz, x=feature, hue="Cancer", kind="kde", height=4)
plt.show()

In [None]:
X_train, X_test, y_train, y_test = train_test_split(df_feat, np.ravel(df_target), 
                                                    test_size=0.20, random_state=5)

print(f"Train size: {X_train.shape[0]}")
print(f"Test size: {X_test.shape[0]}")

In [None]:
scaler = StandardScaler()
X_train_transformed = scaler.fit_transform(X_train)

In [None]:
X_test_transformed = scaler.transform(X_test)

In [None]:
param_grid = {"C": [0.1, 1, 10, 100, 1000],
              "gamma": [1, 0.1, 0.01, 0.001, 0.0001],
              "kernel": ["rbf", "linear", "poly"]}
model = GridSearchCV(SVC(), param_grid, refit=True, cv=3, verbose=1)
model.fit(X_train_transformed, y_train)

In [None]:
model.best_params_

In [None]:
pred = model.predict(X_test_transformed)
print(classification_report(y_test, pred))

In [None]:
import os

# Saving the model and data pre-processor object.
# Make sure the target directory exists before writing to it.
os.makedirs("app/checkpoints", exist_ok=True)

with open("app/checkpoints/model.pkl", "wb") as f:
    pickle.dump(model, f)

with open("app/checkpoints/scaler.pkl", "wb") as f:
    pickle.dump(scaler, f)

today = datetime.today()
metadata = {
    "problem": "classification",
    "n_classes": 2,
    # scikit-learn encodes this dataset as 0 = malignant, 1 = benign
    # (see cancer["target_names"]).
    "label2class": {
        0: "malignant",
        1: "benign"
    },
    "model": "SVM",
    "datetime": str(today.date())
}
with open("app/checkpoints/metadata.pkl", "wb") as f:
    pickle.dump(metadata, f)
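
On the serving side, the API has to load these artifacts back before it can answer requests. The following is a minimal sketch of that round trip for the metadata only. It writes to a temporary directory rather than `app/checkpoints`, and the `predicted_label` value is a stand-in for a real model prediction, so none of it reflects the actual contents of `app/flask_main.py`.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical metadata; scikit-learn encodes 0 = malignant, 1 = benign.
metadata = {
    "problem": "classification",
    "label2class": {0: "malignant", 1: "benign"},
}

with tempfile.TemporaryDirectory() as checkpoint_dir:
    metadata_path = Path(checkpoint_dir) / "metadata.pkl"

    # Save the object (what the training notebook does).
    with open(metadata_path, "wb") as f:
        pickle.dump(metadata, f)

    # Load it back (what the API would do at startup).
    with open(metadata_path, "rb") as f:
        loaded_metadata = pickle.load(f)

predicted_label = 1  # stand-in for the output of model.predict(...)
predicted_class = loaded_metadata["label2class"][predicted_label]
print(predicted_class)  # benign
```

Pickle files can execute arbitrary code when loaded, so only unpickle files that come from a trusted source.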

<div class="alert alert-block alert-danger">
<b>Action</b>:
    <b>Open a terminal and run</b>: `python app/flask_main.py` and go to http://127.0.0.1:8000/
    
Go through `app/flask_main.py` file and explain the components that build the API.
</div> 

Let's fetch the metadata of the API we wrote. The endpoint returns it as JSON, a data format supported by almost every programming language. To send the request we can use the **requests** package.

In [None]:
# Other methods like PUT/POST will be rejected with an error response (405 Method Not Allowed) since the endpoint supports only the GET method.
response = requests.get("http://127.0.0.1:8000/metadata_json")
result = json.loads(response.text)
result

## OpenAPI (Swagger UI)

Now that we are familiar with Flask, it is time to dive into OpenAPI, a standard for describing APIs.
The OpenAPI Specification is an API description format for REST APIs with a rich feature set, of which we will cover only the basics. By using OpenAPI together with Swagger UI, it is possible to generate a graphical user interface (GUI) for exploring the API. This involves creating a configuration file that your Flask application can read.

The configuration file for Swagger is a YAML or JSON file that includes your OpenAPI definitions. It contains all the details required to configure your server for URL endpoint definition, input parameter validation, and output response data validation.

<mark> Now open `swagger.yml` file and read the following explanations in parallel.</mark>

To define an API, you first specify the version of the OpenAPI Specification you are using, via the **openapi** keyword. The version string matters because the OpenAPI structure evolves over time: just as new features are added to each new version of Python, new keywords may be added to the specification and outdated ones removed. The other fields used in the file are:

- **title**: the header shown in Swagger UI, accompanied by a description and the API version.
- **url**: the root path for all endpoints (similar to WORKDIR in a Dockerfile).
- **paths**: the list of all endpoints in our API. Within each endpoint, the **operationId** field names the Python function that serves it. For example, the operationId `endpoints.hello_world` says that the `hello_world` function in the `endpoints` file/module serves the endpoint under which that operationId is listed.
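
The idea of turning an operationId string into a callable can be sketched in a few lines. This is only an illustration of the concept, not Connexion's actual implementation; the `resolve_operation` helper is a hypothetical name, and `os.path.basename` stands in for `endpoints.hello_world` so that the example runs without the course files.

```python
import importlib


def resolve_operation(operation_id):
    """Split 'package.module.function' and return the function object."""
    module_name, _, function_name = operation_id.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# Stand-in operationId pointing at a standard-library function.
handler = resolve_operation("os.path.basename")
result = handler("/app/endpoints.py")
print(result)  # endpoints.py
```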


<div class="alert alert-block alert-danger">
<b>Action</b>:
    <b>Open a terminal and run</b>: `python app/openapi_main.py` and go to http://127.0.0.1:8000/
    
Go through `app/openapi_main.py`, `app/endpoints.py` and `swagger.yml` files and explain the components that build the API.
</div> 


<div class="alert alert-block alert-danger">
<b>Action</b>:
    <b>Open Swagger UI</b>: go to http://127.0.0.1:8000/ui
    
</div> 

In [None]:
# An example with Python requests:
response = requests.get("http://127.0.0.1:8000/predict", 
                        params={"mean_concavity": 1.2, 
                                "worst_area": 0.6, 
                                "mean_area": 4.2})
result = json.loads(response.text)
result

# http://127.0.0.1:8000/predict?mean_concavity=1.2&worst_area=0.6&mean_area=4.2
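
The `params` dictionary passed to `requests.get` ends up in the URL as a query string. A small standard-library sketch shows the encoding and the reverse step that a server has to perform; the conversion to `float` is an assumption about how the endpoint treats its inputs.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

params = {"mean_concavity": 1.2, "worst_area": 0.6, "mean_area": 4.2}

# Client side: encode the parameters into the query string.
url = "http://127.0.0.1:8000/predict?" + urlencode(params)
print(url)

# Server side: query values always arrive as strings and must be converted.
query = parse_qs(urlsplit(url).query)
features = {name: float(values[0]) for name, values in query.items()}
print(features)
```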

## Containerization

In [None]:
# Run this command from the app/ directory.
!docker build -t cancer_pred_api .

If an application listens on port X inside a container, that port is reachable only from within the container by default. **This means that from the host we cannot reach the application**. To map port Y on the host to port X in the container, publish the port when starting the container:  
`docker run -p Y:X <image-name>`

In [None]:
!docker run -p 8000:8000 cancer_pred_api
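
One quick way to confirm from the host that the published port is reachable is to attempt a TCP connection. The helper below is a hypothetical utility written for this illustration; it only checks that something accepts connections on the port, not that the API behind it responds correctly.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# True only while a container (or any other process) is listening on port 8000.
print(is_port_open("127.0.0.1", 8000))
```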

## Docker Compose

In [None]:
!docker-compose up  

## Summary

OpenAPI is a specification for describing RESTful APIs. It defines a standard for describing API endpoints, data models, and request/response schemas using JSON or YAML files. OpenAPI was formerly known as Swagger, but the specification was renamed after it was donated to the OpenAPI Initiative.

Swagger is a set of open-source tools for designing, building, and documenting RESTful APIs that conform to the OpenAPI specification. The Swagger toolset includes a range of tools, including Swagger UI for visualizing and interacting with API endpoints, and Swagger Codegen for generating client and server code based on the OpenAPI specification.

Connexion is a Python module that simplifies the process of building RESTful APIs using the OpenAPI specification. It allows developers to define API endpoints, data models, and request/response schemas using YAML or JSON files that conform to the OpenAPI specification. Rather than generating code, Connexion maps each endpoint in the specification to the Python function named by its operationId and validates requests against the specification. It builds on Flask and supports the authentication mechanisms declared in the specification.

# References
- [What is a REST API?](https://www.ibm.com/topics/rest-apis)
- [Docker](https://www.docker.com/)