# 09. PyTorch Model Deployment

Let's bring FoodVision to life and make it publicly accessible.

**We are going to deploy our FoodVision model to the internet as a usable app!**


## What is machine learning model deployment?

***Machine learning model deployment** is the process of making your machine learning model accessible for others.*

For example, someone could take a photo of food on their smartphone and have our FoodVision model classify it as pizza, steak, or sushi.

As another example, *an operating system may lower its resource consumption based on a machine learning model's predictions of how much power someone generally uses at specific times of day.*

These models can also interact with other models and programs. For example, a Tesla car's computer vision system will interact with the car's route planning program, and the route planning program will in turn get inputs and feedback from the driver.


## Why deploy a machine learning model?

One of the most important philosophical questions in machine learning is:
***if a machine learning model never leaves a notebook, does it exist?***

---
Deploying a model is as important as training one.

Although you can get a pretty good idea of how your model is going to function by evaluating it on a well-crafted test set or visualizing its results, you never really know how it will perform until you release it into the wild.

***Having people who've never used your model interact with it will often reveal edge cases you never thought of during training.***

For example, what happens if someone were to upload a photo that wasn't of food to our FoodVision model?

One solution would be to create another model that classifies images as "food" or "not food" and pass the target image through that model first.

Then if the image is of "food", it goes to our FoodVision model and gets classified into pizza, steak, or sushi.

And if it's "not food", a message is displayed.
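The two-stage idea above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: `classify_with_food_gate`, `NOT_FOOD_MESSAGE`, and the two toy models are hypothetical names made up for this example, and in a real app the two callables would wrap trained PyTorch models.

```python
from typing import Callable

# Hypothetical message shown when the gate model rejects an image.
NOT_FOOD_MESSAGE = "No food detected. Please try another photo."


def classify_with_food_gate(
    image: object,
    is_food: Callable[[object], bool],
    foodvision: Callable[[object], str],
) -> str:
    """Run the food/not-food gate first, then FoodVision only if it passes."""
    if not is_food(image):
        return NOT_FOOD_MESSAGE
    return foodvision(image)


# Stand-in models for illustration only (assumptions, not trained models).
def toy_is_food(image: object) -> bool:
    return image in {"pizza.jpg", "steak.jpg", "sushi.jpg"}


def toy_foodvision(image: object) -> str:
    # Pretend the filename is the prediction.
    return str(image).split(".")[0]


print(classify_with_food_gate("sushi.jpg", toy_is_food, toy_foodvision))
print(classify_with_food_gate("dog.jpg", toy_is_food, toy_foodvision))
```

Passing the models in as callables keeps the routing logic separate from how each model is built, so either model can be swapped out without touching the gate.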

But what if these predictions were wrong?

What happens then?

You can see how these questions could keep going.

This highlights the importance of model deployment: it helps you figure out errors in your model that aren't obvious during training/testing.

---

***Once you've got a good model, deployment is a good next step. Monitoring then involves seeing how your model performs on the most important data split: data from the real world.***

## Different types of machine learning model deployment

Whole books could be written on the different types of machine learning model deployment (and many good ones are listed in [PyTorch Extra Resources](https://www.learnpytorch.io/pytorch_extra_resources/#resources-for-machine-learning-and-deep-learning-engineering)).

---
Let's start with a simple question:
> What is the most ideal scenario for our machine learning model to be used?

And then work backward from there.

In the case of FoodVision, our ideal scenario might be:
- Someone takes a photo on a mobile device (through an app or web browser).
- The prediction comes back fast.

Easy.

So we have two main criteria:
1. The model should work on a mobile device.
2. The model should make predictions *fast* (because a slow app is a boring app).

And of course, depending on our use case, our requirements may vary.

We may notice the above two points break down into another two questions:
1. **Where's it going to go?** - As in, where is it going to be stored?
2. **How's it going to function?** - As in, does it return predictions immediately, or do they come later?

![](09-deployment-questions-to-ask.png)

*When starting to deploy machine learning models, it's helpful to start by asking what's the most ideal use case and then work backwards from there, asking where the model is going to go and then how it's going to function.*

### Where's it going to go?

When you deploy your machine learning model, where does it live?

The main debate here is usually on-device (also called edge/in the browser) or on the cloud (a computer/server that isn't the actual device someone/something calls the model from).

Both have their pros and cons.

| Deployment location| Pros| Cons|
|:--|:--|:--|
| **On-device (edge/in the browser)** | Can be very fast (since no data leaves the device)|Limited compute power (larger models take longer to run)|
| - | Privacy preserving (again no data has to leave the device)| Limited storage space (smaller model size required)|
|-|No internet connection required (sometimes)|Device-specific skills often required|
| **On cloud** | Near unlimited compute power (can scale up when needed)|Costs can get out of hand (if proper scaling limits aren't enforced)|
|-|Can deploy one model and use everywhere (via API)|Predictions can be slower due to data having to leave device and predictions having to come back (network latency)|
|-|Links into existing cloud ecosystem|Data has to leave device (this may cause privacy concerns)|

There are more details to these but I've left resources in the extra-curriculum to learn more.

Let's give an example.

If we're deploying FoodVision as an app, we want it to perform well and fast.

<div class='alert alert-success'>

So which model would we prefer?
1. A model on-device that performs at 95% accuracy with an inference time (latency) of one second per prediction.
2. A model on the cloud that performs at 98% accuracy with an inference time of 10 seconds per prediction (bigger, better model but takes longer to compute).

</div>

We've made these numbers up but they showcase a potential difference between on-device and on the cloud.

***Option 1** could potentially be a smaller, less performant model that runs fast because it's able to fit on a mobile device.*

***Option 2** could potentially be a larger, more performant model that requires more compute and storage, but it takes a bit longer to run because we have to send data off the device and get it back (so even though the actual prediction might be fast, the network time and data transfer have to be factored in).*

**For FoodVision, we'd likely prefer option 1, because the small hit in performance is outweighed by the faster inference speed.**

![](09-model-deployment-on-device-vs-cloud.png)

*In the case of a Tesla car's computer vision system, which would be better? A smaller model that performs well on device (model is on the car) or a larger model that performs better that's on the cloud? In this case, you'd much prefer the model being on the car. The extra network time it would take for data to go from the car to the cloud and then back to the car just wouldn't be worth it (or potentially even impossible with poor signal areas).*
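The trade-off can be made concrete with a small calculation. The sketch below reuses the made-up numbers from the two options and treats end-to-end latency as model inference time plus network round trip. The `DeploymentOption` class, the `fastest_meeting_accuracy` helper, and the split of the cloud option's 10 seconds into 2 seconds of compute and 8 seconds of network time are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class DeploymentOption:
    name: str
    accuracy: float                # fraction of correct predictions (made-up)
    inference_seconds: float       # time the model itself takes to predict
    network_seconds: float = 0.0   # round trip to a server, zero when on-device

    @property
    def total_latency(self) -> float:
        """End-to-end time a user waits for one prediction."""
        return self.inference_seconds + self.network_seconds


def fastest_meeting_accuracy(
    options: Sequence[DeploymentOption], min_accuracy: float
) -> Optional[DeploymentOption]:
    """Return the lowest-latency option that meets the accuracy bar, if any."""
    eligible = [o for o in options if o.accuracy >= min_accuracy]
    if not eligible:
        return None
    return min(eligible, key=lambda o: o.total_latency)


on_device = DeploymentOption("on-device", accuracy=0.95, inference_seconds=1.0)
# Assumed split of the 10 seconds: 2 s of compute plus 8 s of network time.
cloud = DeploymentOption("cloud", accuracy=0.98, inference_seconds=2.0, network_seconds=8.0)

for option in (on_device, cloud):
    print(f"{option.name}: {option.accuracy:.0%} accuracy, {option.total_latency:.1f} s per prediction")

best = fastest_meeting_accuracy([on_device, cloud], min_accuracy=0.90)
print(f"Preferred for FoodVision: {best.name}")
```

If the accuracy requirement were raised above 95%, the same helper would return the cloud option instead, which shows how the preferred deployment location depends on the requirements of the use case.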

***Note**: For a full example of seeing what it's like to deploy a PyTorch model to an edge device, see the PyTorch tutorial on achieving real-time inference (30fps+) with a computer vision model on a Raspberry Pi.*