# Python Pickle - Data Serialization

Often, while working on a program (perhaps one with a long runtime or multiple steps), you want to stop partway through. Maybe a later part isn't finished yet, or you need to step away from the computer. In these cases it helps to preserve your program state for later, and the `pickle` module can do this. Pickling serializes a Python object into a binary format that is easy to read back in at a later date.

## Dump/Load (File)

The `pickle` module closely follows the API of the Python `json` module. To store the representation of a Python object in a file, you use the `dump` function. The file can be read with the `load` function at a later time to get exactly the Python representation back.

In [6]:
import pickle # alternately, could write 
# from pickle import load, dump
# The alternate way may conflict with the json load and dump
import os
import tempfile

my_data = {'hi': [1, 2, 'asdf'], 3: 17}

# get a temporary file so we don't clobber any data accidentally
# You could use your own filename instead so you know where the
# data is going.
# mkstemp returns an open OS-level file descriptor and the file's path
fd, filename = tempfile.mkstemp()
# we reopen the file by name below, so close the descriptor right away
os.close(fd)

# open the temporary file for writing bytes to
with open(filename, 'wb') as f:
    print('Writing pickle to', filename)
    pickle.dump(my_data, f)

# open the temporary file for reading bytes from
with open(filename, 'rb') as f:
    print('Reading pickle from', filename)
    my_data2 = pickle.load(f)

# clean up the temporary file
os.remove(filename)
# look! our data!
print(my_data2)
    

Writing pickle to /tmp/tmp0olp9atb
Reading pickle from /tmp/tmp0olp9atb
{'hi': [1, 2, 'asdf'], 3: 17}
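
One way to see what "exactly the Python representation" means is to push the same dictionary through both `json` and `pickle`. The sketch below is only an illustration: it uses in-memory buffers from the `io` module instead of real files, and the variable names are made up for this example.

```python
import io
import json
import pickle

my_data = {'hi': [1, 2, 'asdf'], 3: 17}

# json.dump writes text, so it needs a text buffer
text_buffer = io.StringIO()
json.dump(my_data, text_buffer)
text_buffer.seek(0)
from_json = json.load(text_buffer)

# pickle.dump writes bytes, so it needs a bytes buffer
byte_buffer = io.BytesIO()
pickle.dump(my_data, byte_buffer)
byte_buffer.seek(0)
from_pickle = pickle.load(byte_buffer)

print(from_json)    # {'hi': [1, 2, 'asdf'], '3': 17}
print(from_pickle)  # {'hi': [1, 2, 'asdf'], 3: 17}
```

JSON object keys are always strings, so the integer key `3` comes back as `'3'`. The pickled copy keeps the original key type and compares equal to `my_data`.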


Pickling to a file is useful for caching the results of an expensive calculation or network API call, so you can keep working with the data later without waiting on it again. It is also an excellent idea when you are demoing a project: saving data from an external source ahead of time means you don't have to pray to the demo deity of your choice that the network will work perfectly.
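
The caching idea could look something like the sketch below. The helper `load_or_compute`, the stand-in `expensive_operation`, and the file name `result.pickle` are all hypothetical names chosen for this example, not part of the `pickle` module.

```python
import os
import pickle
import tempfile


def load_or_compute(cache_path, compute):
    """Return the pickled result at cache_path, computing it if missing."""
    if os.path.exists(cache_path):
        with open(cache_path, 'rb') as f:
            return pickle.load(f)
    result = compute()
    with open(cache_path, 'wb') as f:
        pickle.dump(result, f)
    return result


calls = []  # records how many times the expensive work actually runs


def expensive_operation():
    # stand-in for a slow calculation or network API call
    calls.append(1)
    return {'answer': sum(range(1000))}


# a temporary directory keeps the example from clobbering real files
with tempfile.TemporaryDirectory() as tmp_dir:
    cache_path = os.path.join(tmp_dir, 'result.pickle')
    first = load_or_compute(cache_path, expensive_operation)   # computes
    second = load_or_compute(cache_path, expensive_operation)  # reads cache

print(first == second)  # True
print(len(calls))       # 1
```

The second call never runs `expensive_operation`; it only reads the pickle written by the first call. In a real project you would point `cache_path` at a file that outlives the program, and delete it whenever the cached data goes stale.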

## Dumps/Loads (String)

The `pickle` module also supports dumping to and loading from in-memory byte strings (`bytes` objects) with `dumps` and `loads`. This can be incredibly useful if you want to share Python state between running Python processes (on the same computer or across a network) instead of using a different serialization protocol like Google's Protocol Buffers (which are EXCELLENT in the author's opinion if you need microservices implemented in different languages to talk to each other). Only unpickle data from a source you trust, since loading a pickle can execute arbitrary code.

In [7]:
import pickle # alternately, could write 
# from pickle import loads, dumps
# The alternate way may conflict with the json loads and dumps

my_data = {'hi': [1, 2, 'asdf'], 3: 17}
# dump the data to a pickle string
pickle_str = pickle.dumps(my_data)
print("Preserved pickle:", pickle_str)

my_data2 = pickle.loads(pickle_str)
print("Unpickled data:", my_data2)

Preserved pickle: b'\x80\x03}q\x00(X\x02\x00\x00\x00hiq\x01]q\x02(K\x01K\x02X\x04\x00\x00\x00asdfq\x03eK\x03K\x11u.'
Unpickled data: {'hi': [1, 2, 'asdf'], 3: 17}
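
To make the "share state between processes" idea concrete, here is a minimal sketch that sends a pickled object to a second Python process over its standard input and reads a pickled reply from its standard output. The child program `CHILD_SOURCE` and the helper `round_trip_through_child` are hypothetical names invented for this example, and a pipe to a subprocess is just one of several possible transports.

```python
import pickle
import subprocess
import sys

# Hypothetical child program: reads a pickle from stdin, adds a key,
# and writes the updated object back to stdout as a pickle.
CHILD_SOURCE = """
import pickle, sys
data = pickle.load(sys.stdin.buffer)
data['seen_by_child'] = True
pickle.dump(data, sys.stdout.buffer)
"""


def round_trip_through_child(obj):
    """Send obj to a child Python process and return what it sends back."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=pickle.dumps(obj),   # bytes written to the child's stdin
        stdout=subprocess.PIPE,    # capture the child's pickled reply
        check=True,                # raise if the child exits with an error
    )
    return pickle.loads(result.stdout)


my_data = {'hi': [1, 2, 'asdf'], 3: 17}
reply = round_trip_through_child(my_data)
print(reply)  # {'hi': [1, 2, 'asdf'], 3: 17, 'seen_by_child': True}
```

Both sides exchange nothing but bytes, so the same `dumps`/`loads` pair would work over a socket or a message queue as well.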
