# Model Deployment

Before we get into an example of model deployment, we should cover a few Python techniques which will be helpful.

## Serialization

Sometimes we want to save Python objects in the same way we save data or text files. To do this, we use a process called **[serialization](https://realpython.com/python-serialize-data/#get-an-overview-of-data-serialization)**. In short, we save only the necessary aspects of an object to a file so that, when the file is loaded, the rest of the object can be recreated exactly as it was when it was saved. The packages imported into the Python session are used in this recreation, so there is no need to save "package-based" content (e.g., the `.mean()` or `.groupby()` methods of a `pd.DataFrame` are not stored in the serialized file; they come from pandas when the object is loaded).

[There are many ways to serialize objects in Python](https://realpython.com/python-pickle-module/#serialization-in-python), but one of the most common and efficient ways is to [serialize with the pickle module](https://www.datacamp.com/tutorial/pickle-python-tutorial#serializing-python-data-structures-with-pickle-lists).

Pickle writes an object out as a stream of *bytes*, so we open the file in binary mode both to save the object (`'wb'`) and to load it again (`'rb'`):

In [1]:
import pickle

In [2]:
# an example object to save
my_object = ['this', 'is', 'an', 'object']

In [3]:
# 'wb' opens the file for Writing in Binary mode
with open("./my_object.pickle", 'wb') as f:
    pickle.dump(my_object, f)

In [4]:
# 'rb' opens the file for Reading in Binary mode
with open("./my_object.pickle", 'rb') as f:
    my_object = pickle.load(f)

In [5]:
my_object

['this', 'is', 'an', 'object']
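
The same round trip can also be done in memory, without a file, which makes it easier to see what pickle produces. The sketch below uses `pickle.dumps` and `pickle.loads` from the standard library; the `record` dictionary is made up purely for illustration.

```python
import pickle
from datetime import date

# A made-up object mixing built-in types with a standard-library class
record = {"name": "example", "values": [1, 2, 3], "saved_on": date(2024, 1, 15)}

# dumps() returns the serialized object as a bytes value instead of writing a file
payload = pickle.dumps(record)
print(type(payload).__name__)

# loads() rebuilds an equal, but separate, object from those bytes
restored = pickle.loads(payload)
print(restored == record)
print(restored is record)
```

The restored dictionary compares equal to the original but is a new object. Note that the `date` class itself is not stored in the bytes; Python looks it up in the `datetime` module when loading.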

## Decorators

Streamlit uses **[decorator functions](https://pythonbasics.org/decorators/)** to cache objects. In short, a decorator function "wraps" some other function, so that every time the *wrapped* function is called, the *wrapper* runs in its place and decides when (or whether) to call the original. For more information on these, I recommend the first two main sections of the [RealPython article](https://realpython.com/primer-on-python-decorators) by Geir Arne Hjelle.

For a **simple example**, the following code is adapted from the above article.

In [6]:
def intro_outro(func):
    # we need to make a function to return a function ...
    def wrapper():
        print("This is an introduction.")
        func()
        print("Thank you for your time!")

    # return the "wrapped" version of the function
    return wrapper

def say_whee():
    print("Whee!")

# wrapped version of the function
@intro_outro
def wrap_say_whee():
    print("Whee!")

In [7]:
say_whee()

Whee!


In [8]:
wrap_say_whee()

This is an introduction.
Whee!
Thank you for your time!
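
It may help to see that the `@` syntax is only shorthand. The sketch below repeats the `intro_outro` decorator from above and applies it both ways; the names `manual_whee` and `decorated_whee` are introduced here just for the comparison.

```python
def intro_outro(func):
    def wrapper():
        print("This is an introduction.")
        func()
        print("Thank you for your time!")
    return wrapper

def say_whee():
    print("Whee!")

# Manual decoration: pass the function in and keep the wrapper that comes back
manual_whee = intro_outro(say_whee)

# The @ syntax does the same thing at definition time
@intro_outro
def decorated_whee():
    print("Whee!")

manual_whee()
decorated_whee()

# Both names now refer to the inner function defined inside intro_outro
print(manual_whee.__name__, decorated_whee.__name__)
```

Both calls print the same three lines, and both names report `wrapper` as their `__name__`, because what we are calling is the inner function returned by the decorator.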


As a **more involved example**, we can pass arguments and keyword arguments to the wrapped function within the decorator wrapper using `*args` and `**kwargs`. Recall that in a function definition, this notation *collects* positional arguments into a tuple and keyword arguments into a dictionary; in a function call, the same notation *unpacks* them again.

In [16]:
def do_twice(func):
    def wrapper_do_twice(*args, **kwargs):
        func(*args, **kwargs)
        func(*args, **kwargs)
    return wrapper_do_twice

@do_twice
def say_something(statement="this is what I want to say."):
    print(statement)

In [17]:
say_something('this is also what I want to say.')

this is also what I want to say.
this is also what I want to say.


(Optional) For an even more **complex example**, take a look at the ways you can [define a decorator with arguments](https://realpython.com/primer-on-python-decorators/#defining-decorators-with-arguments), or even [define a decorator with *optional* arguments](https://realpython.com/primer-on-python-decorators/#creating-decorators-with-optional-arguments).

## Remote Storage

When running applications in the cloud, **avoid saving project data to GitHub**, along with anything else that might be large (e.g., large models). One immediate reason is that GitHub is not built for storing lots of data (or a copy of that data for each commit!), and large files make GitHub operations unwieldy, since every push is an upload and every pull is a download. Further, data and models usually require some level of security, and anything in a public GitHub repository is available to anyone. Instead, store your data (or objects) with a cloud storage service.

The most common cloud services are Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure. If your organization isn't using in-house, proprietary servers, it is almost surely using one of these three options. However, aside from the fact that these tools are not free, they are also out of the scope of this course. Instead, we use Google Drive, as described in the Simple Streamlit example linked below.
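
As a rough sketch of the general idea (keeping a serialized object out of the repository and fetching it at run time), the code below downloads bytes from a URL and unpickles them. The helper name `load_remote_pickle` is hypothetical, and a temporary local file addressed with a `file://` URL stands in for real cloud storage so that the example runs without credentials; the actual storage setup for this course is the one described in the Simple Streamlit README.

```python
import pickle
import tempfile
import urllib.request
from pathlib import Path

def load_remote_pickle(url):
    """Download bytes from `url` and unpickle them (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        return pickle.loads(response.read())

# Stand-in for cloud storage: a temporary local file reached through a file:// URL
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "my_object.pickle"
    path.write_bytes(pickle.dumps(['this', 'is', 'an', 'object']))
    loaded = load_remote_pickle(path.as_uri())

print(loaded)
```

Only unpickle files that come from a source you trust, since loading a pickle can execute arbitrary code.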

# Simple Streamlit

As a supplement to this lab, we have a [simple Streamlit web app](https://github.com/leontoddjohnson/simple_streamlit) that demonstrates how to deploy a model that uses data stored on Backblaze:

1. **[Fork](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo?tool=webui)** the [simple_streamlit](https://github.com/leontoddjohnson/simple_streamlit) repository on GitHub.
2. Then, **[clone](https://docs.github.com/en/repositories/creating-and-managing-repositories/cloning-a-repository?tool=desktop)** that fork (from your GitHub account) onto your local machine using GitHub Desktop.
3. Follow the steps in the `README.md` file, and use that fork as your own template.

# Explore

Test your understanding of this week's content with the following explorations.

*Note: unless otherwise noted, **explorations are completely optional and will not be reviewed.***

## Exploration 1

Write a `@timer` decorator that will print the time it takes for a function to run. You can use the `time` Python module to do this.