# Sending Python Objects through Space

![star_wars](../.img/star_wars.JPG)

What's the best way to send someone an instance of a Python object over the internet?

As part of [HackThisAI](https://github.com/JosephTLucas/HackThisAI), players may be challenged to steal or invert machine learning models. To check their solution, I had to be able to import and compare their model objects with mine. In figuring out the best way to do it, I learned a bit about serialization and namespaces.

## Setup

Let's assume the user wants to create and send us this object:

```python
from sklearn.tree import DecisionTreeClassifier

class Star_Destroyer:
    def __init__(self):
        self.ammo = 100
        self.cls = DecisionTreeClassifier()

    def shoot_laser(self):
        self.ammo -= 1

    def train_model(self):
        return self.cls.get_depth()
```

Furthermore, since players will want to submit _trained_ models, I needed to import not just a class but an instance of that class, with its state intact. So let's shoot the laser, which changes the instance's state, and then save and submit the object.

```python
sd = Star_Destroyer()
sd.shoot_laser()
```

How should we [serialize](https://stackoverflow.com/questions/633402/what-is-serialization) `sd` to send it?

## Pickle

[Pickle](https://docs.python.org/3/library/pickle.html) is the standard way to do this.

***
It's worth reading the first several paragraphs of the pickle documentation. They discuss marshalling, security and binary vs. text serialization.
***

```python
import pickle

with open("setup/thing.pkl", "wb") as f:
    pickle.dump(sd, f)
```

After saving the item to a file, the player can send it using normal HTTP methods. After we receive it, we'd expect to be able to `load` and use the pickle.

```python
import pickle

with open("setup/thing.pkl", "rb") as f:
    sd = pickle.load(f)
```

```text
AttributeError: Can't get attribute 'Star_Destroyer' on <module '__main__'>
```

Hmm, what do we make of that error? To [Stack Overflow](https://stackoverflow.com/questions/27732354/unable-to-load-files-using-pickle-and-multiple-modules)!

> Remember that pickle doesn't actually store information about how a class/object is constructed, and needs access to the class when unpickling.
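
One way to see this for yourself is to look inside the pickled bytes. The snippet below is a minimal sketch using a trimmed-down `Star_Destroyer` (no scikit-learn attribute, to keep it self-contained). It checks which names end up in the payload: the class name and the instance state are there, but the method is not.

```python
import pickle


class Star_Destroyer:
    def __init__(self):
        self.ammo = 100

    def shoot_laser(self):
        self.ammo -= 1


sd = Star_Destroyer()
sd.shoot_laser()
payload = pickle.dumps(sd)

# The class is stored by name only, so its name appears in the payload...
has_class_name = b"Star_Destroyer" in payload
# ...and so does the instance state (the attribute name "ammo")...
has_state = b"ammo" in payload
# ...but the method is not serialized at all.
has_method = b"shoot_laser" in payload

print(has_class_name, has_state, has_method)
```

In other words, the pickle holds a pointer to "the class called `Star_Destroyer` in some module" plus the instance's attributes. The receiver has to supply the class itself.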

Okay... it looks like we may need to import our class and dump it from a helper.

A helper might look like this:

```python
import pickle

from example import Star_Destroyer

sd = Star_Destroyer()
sd.shoot_laser()

with open("dumped_thing.pkl", "wb") as f:
    pickle.dump(sd, f)
```

Does this new object perform better after being sent across the internet?

```python
import pickle

with open("setup/dumped_thing.pkl", "rb") as f:
    sd = pickle.load(f)
```

```text
ModuleNotFoundError: No module named 'example'
```

Huh, still doesn't work. It looks like the pickled object still references the old namespace. Several of the solutions in the Stack Overflow link discuss writing a custom unpickler. However, that technique only redirects import paths. That won't work for the CTF because we don't know what namespaces existed in the players' context. Furthermore, we might not even have access to their various imports. We need something totally independent.
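
For reference, the redirection idea can be sketched roughly as follows. This is an illustration under assumptions, not the code from the Stack Overflow answers: `RedirectingUnpickler`, `make_module`, `player_module`, and `our_module` are hypothetical names, and the modules are built in memory so the example is self-contained. The only real API it relies on is `pickle.Unpickler.find_class`, the hook pickle calls to resolve a module and class name.

```python
import io
import pickle
import sys
import types

CLASS_SOURCE = '''
class Star_Destroyer:
    def __init__(self):
        self.ammo = 100

    def shoot_laser(self):
        self.ammo -= 1
'''


def make_module(name):
    """Build an in-memory module (hypothetical stand-in for a real .py file)."""
    module = types.ModuleType(name)
    exec(CLASS_SOURCE, module.__dict__)
    sys.modules[name] = module
    return module


class RedirectingUnpickler(pickle.Unpickler):
    """Remap module names stored in the pickle before looking up classes."""

    def __init__(self, file, module_map):
        super().__init__(file)
        self.module_map = module_map

    def find_class(self, module, name):
        return super().find_class(self.module_map.get(module, module), name)


# Player side: the class lives in a module that we will not have.
player = make_module("player_module")
sd = player.Star_Destroyer()
sd.shoot_laser()
payload = pickle.dumps(sd)
del sys.modules["player_module"]

# Our side: we must already own an equivalent class under our own module name.
make_module("our_module")
unpickler = RedirectingUnpickler(io.BytesIO(payload), {"player_module": "our_module"})
restored = unpickler.load()
print(restored.ammo)
```

The catch shows up in the last step: the redirect only works because `our_module` already contains a compatible `Star_Destroyer`, which is exactly what we can't assume in the CTF.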

## Joblib

[Joblib](https://joblib.readthedocs.io/en/latest/) is another great library for binary serialization. In fact, it's [recommended by Scikit-Learn](https://scikit-learn.org/stable/modules/model_persistence.html) because it is "more efficient on objects that carry large numpy arrays internally as is often the case for fitted scikit-learn estimators".

```python
import joblib

with open("setup/thing.joblib", "rb") as f:
    sd = joblib.load(f)
```

```text
AttributeError: Can't get attribute 'Star_Destroyer' on <module '__main__'>
```

Unfortunately, while it is more efficient, ["joblib.dump() and joblib.load() are based on the Python pickle serialization model"](https://joblib.readthedocs.io/en/latest/persistence.html). We won't find our solution here.

## Dill

In the shadows of the Stack Overflow replies, you'll see references to [dill](https://pypi.org/project/dill/). I'm usually reluctant to add other dependencies, but I was desperate enough to give this a try. I was particularly encouraged by their description:

> In addition to pickling python objects, dill provides the ability to save the state of an interpreter session in a single command. Hence, it would be feasable to save an interpreter session, close the interpreter, ship the pickled file to another computer, open a new interpreter, unpickle the session and thus continue from the ‘saved’ state of the original interpreter session.

> dill can be used to store python objects to a file, but the primary usage is to send python objects across the network as a byte stream. dill is quite flexible, and allows arbitrary user defined classes and functions to be serialized.

```python
import dill

with open("setup/thing.dill", "rb") as f:
    sd = dill.load(f)

print(sd.ammo)
```

```text
99
```


That's what we wanted to see! We can load the `Star_Destroyer` object, and it even has the state it had when it was exported (we'd fired the laser once).
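
The dump step on the player's side isn't shown above, so here is a hedged sketch of the full round trip. It assumes `dill` is installed and uses a trimmed-down `Star_Destroyer` without the scikit-learn attribute. The `SENDER` script and the temporary file path are made up for illustration. The player's code runs in a separate interpreter, so the loading process never sees the class definition.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

import dill  # third-party: pip install dill

# Hypothetical player script: defines the class in __main__ and dumps an instance.
SENDER = textwrap.dedent('''
    import sys
    import dill

    class Star_Destroyer:
        def __init__(self):
            self.ammo = 100

        def shoot_laser(self):
            self.ammo -= 1

    sd = Star_Destroyer()
    sd.shoot_laser()

    with open(sys.argv[1], "wb") as f:
        dill.dump(sd, f)
''')

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "thing.dill"
    # Player side: a separate interpreter that owns the class definition.
    subprocess.run([sys.executable, "-c", SENDER, str(path)], check=True)
    # Our side: this process has no Star_Destroyer class of its own.
    with open(path, "rb") as f:
        restored = dill.load(f)

ammo_after_load = restored.ammo
restored.shoot_laser()  # the method travelled with the object too
print(type(restored).__name__, ammo_after_load, restored.ammo)
```

One caveat, as far as I understand dill's defaults: the class definition is only shipped along when the class lives in `__main__`. A class imported from a regular module is still stored by reference, so players who define their model in a separate module could hit the same `ModuleNotFoundError` as before.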

## Conclusion

Is there a better way to do this? I'm all ears.

If I had players submit whole `.py` files, we would still have dependency challenges.

I think the "best" solution would probably be to have the players expose an API that I can query to compare the models. However, this requires a bit more development from the players and increases the barrier to entry.
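
To make the API idea a bit more concrete, here is a rough sketch of what the comparison could look like. Everything in it is hypothetical: `agreement_rate`, the toy models, and the probe inputs are placeholders, and a real version would wrap an HTTP call to the player's endpoint in the `submission` callable.

```python
from typing import Callable, Sequence

Predictor = Callable[[Sequence[float]], int]


def agreement_rate(reference: Predictor, submission: Predictor,
                   probes: Sequence[Sequence[float]]) -> float:
    """Fraction of probe inputs on which both predictors return the same label."""
    if not probes:
        raise ValueError("need at least one probe input")
    matches = sum(reference(x) == submission(x) for x in probes)
    return matches / len(probes)


# Toy stand-ins: in practice `submission` would call the player's API.
def reference_model(x):
    return int(x[0] > 0.5)


def stolen_model(x):
    return int(x[0] > 0.6)


probes = [[0.1], [0.55], [0.7], [0.9]]
score = agreement_rate(reference_model, stolen_model, probes)
print(score)
```

Comparing predictions rather than objects sidesteps serialization entirely, at the cost of the extra work for players mentioned above.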