# Offline or online?

In this lesson, you learned that one of the most basic differences between serving modes is whether their execution is scheduled or triggered by events and user requests.
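
The contrast between scheduled and triggered execution can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `predict`, `run_nightly_batch`, `handle_request`, and the feature names are all hypothetical stand-ins for a real model and a real scheduler or web framework.

```python
def predict(features):
    """Hypothetical stand-in for a trained weather model."""
    return round(0.5 * features["temp_today"] + 0.5 * features["temp_avg"], 1)

def run_nightly_batch(locations):
    """Batch (offline) serving: score every known input in one scheduled run."""
    return {name: predict(feats) for name, feats in locations.items()}

def handle_request(features):
    """Online serving: score a single input at the moment a client asks for it."""
    return {"forecast": predict(features)}

locations = {
    "Oslo": {"temp_today": 4.0, "temp_avg": 2.0},
    "Rome": {"temp_today": 18.0, "temp_avg": 16.0},
}

# A scheduler (for example cron) would trigger this once every evening.
batch_forecasts = run_nightly_batch(locations)

# A web framework would call this whenever a request arrives.
online_forecast = handle_request({"temp_today": 10.0, "temp_avg": 8.0})
```

Both paths call the same model. What differs is who decides when it runs: a clock in the batch case, a client in the online case.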

Your final approach should be dictated by the needs of your users, without unnecessary complexities that add no value.

For example, if you have built a model for generating weather predictions that your users would like to receive only once a day, every evening, the most reasonable choice would be:

### Possible Answers


    Stream processing
    
    
    "Batch" a.k.a. "offline" a.k.a. "static" prediction {Answer}
    
    
    "Online" a.k.a. "dynamic" prediction

# When time matters - a bit

You have learned how the acceptable latency of your Machine Learning service will impact the choice of the serving mode you will implement.

Sometimes users can wait for days, even weeks. Sometimes, a second is too much.

The lower the expected latency, the bigger the engineering challenges and the higher the cost of your service. Therefore, avoid over-engineering and match the design of your ML service to what the users require and are willing to pay for.

For example, say you are building an ML service for analyzing and summarizing large .pdf documents. If your users tell you that they would like to receive the outputs of your service within 5 minutes of making a request to it, the most reasonable serving mode for your use case would be:

### Possible Answers


    Real-time prediction
    
    
    Near-real-time prediction a.k.a. stream processing {Answer}
    
    
    Offline batch prediction

# Client-server

In the video, you learned that an API is a point of contact between a client and a server.

In this context, what exactly is a client?

### Possible Answers


    Any application that uses your ML model via the API {Answer}
    
    
    The user that pays for the API usage, irrespective of who the actual consumers are.
    
    
    The application that exposes your model to the outside world via the API.

# API functionalities

You learned that the API, being the contact point between your ML service and the outside world, has many important safeguarding functionalities.

If a client application sends a request containing only two of the four required features, you must be able to detect that immediately and inform the client that they have made a mistake.

Otherwise, your service may return a generic error, and it may seem like there is an issue on your side.

This check is a basic functionality of every serious API and we call it:

### Possible Answers


    Output validation
    
    
    Authentication
    
    
    Input validation {Answer}
    
    
    Throttling
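
A minimal sketch of such a check, assuming a request arrives as a dictionary of features. The feature names, the `validate_request` and `handle_request` helpers, and the response shape are hypothetical, chosen only to illustrate the idea; a real API would usually rely on its framework's validation tools.

```python
# Hypothetical feature names, for illustration only.
REQUIRED_FEATURES = ("age", "income", "tenure", "balance")

def validate_request(payload):
    """Return a list of problems; an empty list means the input is valid."""
    errors = []
    for name in REQUIRED_FEATURES:
        if name not in payload:
            errors.append(f"missing required feature: {name}")
        elif isinstance(payload[name], bool) or not isinstance(payload[name], (int, float)):
            errors.append(f"feature {name} must be numeric")
    return errors

def handle_request(payload):
    """Reject bad input with a client-side error before the model is called."""
    errors = validate_request(payload)
    if errors:
        # 400 tells the client that the mistake is in their request.
        return {"status": 400, "errors": errors}
    return {"status": 200, "prediction": 0.0}  # placeholder for a real model call

# Only two of the four required features are present.
response = handle_request({"age": 42, "income": 55000})
print(response)
```

Because the response names the missing features, the client can fix the request without guessing whether the fault is on the server side.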

# Which test is it?

If you want to extensively test how a new component of your ML application works together with other internal and external components, which type of tests would you write for that purpose, and in which environment should you run them?

### Possible Answers


    Smoke tests in the production environment
    
    
    Integration tests in the staging environment {Answer}
    
    
    Unit tests in the production environment
    
    
    Smoke tests in the staging environment
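
To make the distinction concrete, here is a small sketch contrasting a unit test with an integration test. The `preprocess`, `model`, and `serve` functions are hypothetical components invented for this example; in practice the tests would be run by a test runner such as pytest against a staging deployment.

```python
def preprocess(raw):
    """Component 1: turn a raw request into a numeric feature vector."""
    return [float(raw["temp"]), float(raw["humidity"])]

def model(features):
    """Component 2: hypothetical model, here just a weighted sum."""
    return 0.8 * features[0] + 0.1 * features[1]

def serve(raw):
    """The service wires both components together."""
    return {"forecast": round(model(preprocess(raw)), 2)}

def test_unit_preprocess():
    # Unit test: checks one component in isolation.
    assert preprocess({"temp": "10", "humidity": "50"}) == [10.0, 50.0]

def test_integration_serve():
    # Integration test: checks that the components work together end to end.
    assert serve({"temp": "10", "humidity": "50"}) == {"forecast": 13.0}

test_unit_preprocess()
test_integration_serve()
```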

# Progression through environments

What is the sequence of environments through which we deploy and test our models until we finally make our service available to the end user?

![Answer](images/ch_03-01.png)

# Tests per environment

In the last video, you learned about the different types of tests your ML application should be subjected to. You also heard about different environments that help you in this process of testing and bringing your app from development to production.

Do you still remember which environment served which purpose?

![Answer](images/ch_03-02.png)

# A fitting deployment strategy

You have realized that your current model is severely degraded, so you have collected a new batch of training data and trained a new model.

Your model pipeline hasn't changed in any way. You have just re-trained it on new data. Therefore, you are confident it is safe to immediately start serving this new, better-performing model to all your users.

Which deployment strategy makes the most sense in this case?

### Possible Answers


    Canary deployment
    
    
    Blue/green deployment {Answer}
    
    
    Shadow deployment
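
The idea of switching all traffic at once can be sketched as follows. This is a toy illustration under stated assumptions: `BlueGreenRouter` is a hypothetical class, and the two "models" are plain functions. In a real system the switch usually happens at the load balancer or routing layer, not in application code.

```python
class BlueGreenRouter:
    """Keeps two model versions loaded and sends all traffic to one of them."""

    def __init__(self, blue, green):
        self.models = {"blue": blue, "green": green}
        self.live = "blue"  # the version currently serving every request

    def predict(self, x):
        return self.models[self.live](x)

    def switch(self):
        """Move all traffic to the other version in a single step."""
        self.live = "green" if self.live == "blue" else "blue"

def old_model(x):
    return x * 2

def new_model(x):
    return x * 3

router = BlueGreenRouter(blue=old_model, green=new_model)
before = router.predict(10)       # served by the old model
router.switch()
after = router.predict(10)        # served by the new model
router.switch()                   # rollback is the same one-step switch
rolled_back = router.predict(10)  # served by the old model again
```

Since the old version stays loaded, rolling back is as quick as the original switch.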

# Order of risk

Order the deployment strategies from highest to lowest risk in terms of user exposure to potential unexpected behavior.

![Answer](images/ch_03-03.png)

# Shadow of the shadow

Although shadow deployment carries the lowest risk of exposing users to unexpected behavior, it still has a downside. There is no free lunch.

Do you remember what it is?

### Possible Answers


    It makes rollback in case of errors very difficult.
    
    
    It can't work in batch mode.
    
    
    Every request is processed by both the old and new model, which can increase the service latency. {Answer}
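
A minimal sketch of why this happens, assuming both models are plain callables. The `shadow_predict` function and the log format are hypothetical, meant only to show the request flow.

```python
def shadow_predict(features, live_model, shadow_model, shadow_log):
    """Serve the live model's answer while also running the shadow model."""
    response = live_model(features)
    try:
        # The shadow prediction is only recorded, never returned to the user.
        shadow_result = shadow_model(features)
        shadow_log.append({"features": features, "live": response, "shadow": shadow_result})
    except Exception as exc:
        # A failing shadow model must not break the live response.
        shadow_log.append({"features": features, "live": response, "error": repr(exc)})
    return response

def live_model(x):
    return x + 1

def shadow_model(x):
    return x * 10

log = []
result = shadow_predict(2, live_model, shadow_model, log)
```

In this sketch the shadow model runs before the response is returned, so its run time adds directly to the latency the user sees. Running it in the background can reduce that, but each request still costs two model executions.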