# BentoML vs FastAPI: Good-bye FastAPI For Machine Learning!
## Objective comparison between BentoML and FastAPI

### Notes

FastAPI vs BentoML

Similar Features
- Pydantic validation
- Swagger UI
- async
- Starlette

ML Differentiators
- Running models in separate processes
    - https://modelserving.com/blog/breaking-up-with-flask-amp-fastapi-why-ml-model-serving-requires-a-specialized-framework

### Motivation

### But what is an API?

The world is full of APIs. Yet, strangely, most sources do a poor job of explaining what an API is. When you google "what is an API?", you get answers like "API stands for Application Programming Interface" (as if we cared), or, as Wikipedia puts it:

> An API is a way for two or more computer programs to communicate with each other.

For me, at a high level, an API is just a plain-old URL. An example? 

How about ChatGPT? The link chat.openai.com/chat is a URL that lets you interact with OpenAI's ChatGPT model. You send requests to the API through the website's UI by typing prompts.

But APIs don't have to have fancy UIs. They can simply be URLs that programmers send requests to in order to perform a variety of tasks, like generating an image from a prompt. For example, I've built an API that returns a cuteness score given a pet's image. Here is its URL:

PASTE THE URL HERE

It has a PREDICT endpoint, but an API can have as many endpoints as it needs, each performing a different function.

An API is just a way to hide complex programming logic, like ML models with billions of parameters, behind a simple interface, so that users can interact with it without knowing how the thing behind it was built.
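To make the "URL hiding some logic" idea concrete, here is a rough sketch that uses only Python's standard library rather than FastAPI or BentoML. Everything in it is an assumption made for illustration: the `/predict` path, the `cuteness_score` function (a toy stand-in, not a real model), and the helper names `start_server` and `call_predict`.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def cuteness_score(image_bytes: bytes) -> float:
    """Hypothetical stand-in for a real model: maps any bytes to a score in [0, 1]."""
    return (sum(image_bytes) % 101) / 100


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The only endpoint this toy API exposes is POST /predict.
        if self.path != "/predict":
            self.send_error(404, "Unknown endpoint")
            return
        length = int(self.headers.get("Content-Length", 0))
        image_bytes = self.rfile.read(length)
        payload = json.dumps({"cuteness": cuteness_score(image_bytes)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def start_server() -> HTTPServer:
    """Start the API on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_predict(port: int, image_bytes: bytes) -> dict:
    """Play the client: send bytes to /predict and decode the JSON reply."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("POST", "/predict", body=image_bytes)
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()


if __name__ == "__main__":
    server = start_server()
    print(call_predict(server.server_port, b"pretend these are image bytes"))
    server.shutdown()
```

The caller only ever sees a URL path and a JSON reply. Whether a one-line function or a billion-parameter model sits behind `/predict` makes no difference to them.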

So, in this article, we compare FastAPI, the king of API frameworks, which is generally used for web applications, with BentoML, a relatively young library that specializes in deploying machine learning models as APIs.

### Similar features

Before we go into the differences in terms of machine learning use cases, let's quickly discuss FastAPI and BentoML's similar features:

- Starlette: both are built on Starlette, the same powerful ASGI web framework, which makes them fast and easy to use.
- Automatic documentation with Swagger UI: both generate API documentation automatically, using the standard API docs format called OpenAPI.
- Asynchronous requests: both support asynchronous request handling for heavily I/O-bound APIs. This means they can handle multiple requests concurrently instead of one after another.
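
The asynchronous point is easiest to see in isolation. The sketch below uses plain `asyncio` rather than FastAPI or BentoML, so it shows only the underlying idea, not either framework's API. The `handle_request` coroutine and its 0.2-second delay are made up for the demo and stand in for an I/O-bound handler, such as one waiting on a database.

```python
import asyncio
import time


async def handle_request(request_id: int, io_delay: float = 0.2) -> str:
    """Simulate an I/O-bound handler (hypothetical; the delay stands in for real I/O)."""
    # While this coroutine waits, the event loop is free to run other handlers.
    await asyncio.sleep(io_delay)
    return f"response {request_id}"


async def serve_concurrently(n_requests: int):
    """Run n handlers concurrently and report the responses and total wall time."""
    start = time.perf_counter()
    responses = await asyncio.gather(
        *(handle_request(i) for i in range(n_requests))
    )
    return list(responses), time.perf_counter() - start


if __name__ == "__main__":
    responses, elapsed = asyncio.run(serve_concurrently(5))
    print(responses)
    print(f"5 requests finished in {elapsed:.2f}s")
```

Because the five waits overlap, the total time should stay close to a single 0.2-second delay rather than five of them added together. This only helps with waiting on I/O; CPU-heavy work such as model inference would still block the event loop.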


These are the basic features required by modern API frameworks. The real differentiators between BentoML and FastAPI are in machine learning use cases.

### Saving/loading models
https://docs.bentoml.org/en/latest/concepts/model.html

### Integration with other ML frameworks

https://docs.bentoml.org/en/latest/frameworks/index.html

### Input/Output
https://docs.bentoml.org/en/latest/reference/api_io_descriptors.html

### Dependency management

### Model registry

### Deploying the service
https://docs.bentoml.org/en/latest/concepts/deploy.html

### GPU serving
https://docs.bentoml.org/en/latest/guides/gpu.html

### Batching inputs
https://docs.bentoml.org/en/latest/guides/batching.html