# Deployment & Service Considerations

## Serving Modes

The serving mode is dictated by the latency your users can accept. Services that
tolerate long latency, such as daily weather forecasts, are less technically demanding
than near-real-time or real-time services such as fraud detection. The primary modes are:

* **Batch Processing**: Predictions are computed ahead of time in batches rather than
  in response to individual events.
* **Near Real-Time**: Stream processing. Acceptable latency is measured in minutes.
* **Real-Time**: Acceptable latency is under a second; a late prediction is useless.
  Consider **edge deployment**, where the model is deployed to the user's device.

***

## API Engineering

Two applications are involved in a service:

1. Client application
2. Server application

The server app hosts the model and exposes its functionality to the client via the API.

Values submitted by the end user must be validated before they reach the model. Model
outputs can likewise be checked against expectations in an output validation stage.
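
The two validation stages might look like the following minimal sketch. It uses only the standard library rather than any particular framework. The payload fields (`amount`, `country_code`), the `fake_model` scoring rule, and the accepted ranges are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictionRequest:
    """Hypothetical request payload for a fraud-detection style model."""
    amount: float
    country_code: str


def validate_input(payload: dict) -> PredictionRequest:
    """Reject malformed requests before they reach the model."""
    amount = payload.get("amount")
    country_code = payload.get("country_code")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        raise ValueError("amount must be a number")
    if amount < 0:
        raise ValueError("amount must be non-negative")
    if not isinstance(country_code, str) or len(country_code) != 2:
        raise ValueError("country_code must be a two-letter string")
    return PredictionRequest(float(amount), country_code.upper())


def validate_output(probability: float) -> float:
    """Check that the model returned a usable probability in [0, 1]."""
    if not isinstance(probability, float) or not 0.0 <= probability <= 1.0:
        raise ValueError(f"model returned an invalid probability: {probability!r}")
    return probability


def fake_model(request: PredictionRequest) -> float:
    """Stand-in for a real model; the scoring rule is invented."""
    return min(request.amount / 10_000.0, 1.0)


def predict(payload: dict) -> float:
    """Input validation, then the model, then output validation."""
    request = validate_input(payload)
    return validate_output(fake_model(request))
```

A server application would typically translate the `ValueError` from `validate_input` into a client error response, and treat a failure in `validate_output` as a server-side fault worth alerting on.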

**FastAPI** is a popular Python framework for engineering an API.

<img src="https://fastapi.tiangolo.com/img/logo-margin/logo-teal.png" alt="FastAPI">

***

## Testing

Several forms of testing can be employed to deploy with confidence:

* Unit tests
* Integration tests
* Smoke tests
* Load tests
* Stress tests
* User acceptance tests

Unit tests are run in the dev environment while you are actively developing the code.
They should also be run in a dedicated test environment, such as your CI pipeline,
e.g. GitHub Actions.
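
A unit test exercises one small piece of code in isolation. The sketch below uses the standard library's `unittest` module. The `scale_amount` function is a hypothetical preprocessing step invented for this example, and `run_tests` is just one way a CI job could invoke the suite.

```python
import unittest


def scale_amount(amount: float, maximum: float = 10_000.0) -> float:
    """Hypothetical preprocessing step: clip and scale an amount into [0, 1]."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return min(max(amount, 0.0), maximum) / maximum


class ScaleAmountTests(unittest.TestCase):
    def test_value_in_range_is_scaled(self):
        self.assertAlmostEqual(scale_amount(2_500.0), 0.25)

    def test_values_are_clipped(self):
        self.assertEqual(scale_amount(-5.0), 0.0)
        self.assertEqual(scale_amount(50_000.0), 1.0)

    def test_invalid_maximum_raises(self):
        with self.assertRaises(ValueError):
            scale_amount(1.0, maximum=0.0)


def run_tests() -> unittest.TestResult:
    """Run the suite programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ScaleAmountTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The same test class can be picked up from the command line with `python -m unittest`, so one suite serves both the dev environment and the CI run.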

The other forms of testing should occur in a **staging environment**. This should be an
exact clone of the production environment, except that it holds a representative sample
of the data rather than the full dataset.

Integration tests check how the different components of the model work together and
with external resources.

Smoke tests check whether the application can be started & deployed without issue.
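
A smoke test can be as small as starting the service and confirming it answers one request. The sketch below uses a toy standard-library HTTP server as a stand-in for the real application. The `/health` path and the `HealthHandler` class are assumptions for illustration, not part of any specific framework.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Toy server app exposing a hypothetical /health endpoint."""

    def do_GET(self):
        if self.path == "/health":
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, *args):
        pass  # keep the test output quiet


def smoke_test() -> bool:
    """Start the server, hit /health once, and report whether it returned 200."""
    server = HTTPServer(("127.0.0.1", 0), HealthHandler)  # port 0: OS picks a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/health")
        status = conn.getresponse().status
        conn.close()
        return status == 200
    finally:
        server.shutdown()
        server.server_close()
```

In a staging environment the same check would point at the deployed service's address instead of a server started inside the test.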

Load tests check whether the service can cope with the expected concurrent demand.
Increasing the demand beyond normal conditions is stress testing.
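
The relationship between the two can be sketched with a thread pool that fires concurrent requests and records latencies. Here `fake_predict` is a stand-in for a call to the deployed service, and the user counts are arbitrary assumptions. Dedicated load-testing tools would normally replace this in practice.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_predict(x: float) -> float:
    """Stand-in for a request to the service (assumption: about 10 ms of work)."""
    time.sleep(0.01)
    return x * 2.0


def run_load(concurrent_users: int, requests_per_user: int) -> dict:
    """Issue requests from a pool of workers and summarise the latencies."""

    def timed_call(i: int) -> float:
        start = time.perf_counter()
        fake_predict(float(i))
        return time.perf_counter() - start

    total = concurrent_users * requests_per_user
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        latencies = sorted(pool.map(timed_call, range(total)))
    p95 = latencies[min(total - 1, int(0.95 * total))]
    return {"requests": total, "p95_seconds": p95, "max_seconds": latencies[-1]}


# Load test: the demand you expect. Stress test: the same harness, pushed further.
load_report = run_load(concurrent_users=10, requests_per_user=5)
stress_report = run_load(concurrent_users=50, requests_per_user=2)
```

Comparing the 95th-percentile latency of the two reports against your acceptable latency shows how much headroom the service has before it degrades.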
