### Serialization Basics
Pickling is Python's name for serialization. Pickling converts an object hierarchy into a byte stream (usually not human-readable) that can be stored or transmitted; unpickling reverses the process.

### Pickle Package

In [1]:
import pickle

class Animal():
    def __init__(self, name, paws):
        self.number_of_paws = paws
        self.name = name
        
class Sheep(Animal):
    def __init__(self, name, paws):
        # Delegate to Animal so both attributes are set in one place
        super().__init__(name, paws)
        
dolly = Sheep('Dolly', 4)
pickled_dolly = pickle.dumps(dolly)
pickled_dolly

b'\x80\x03c__main__\nSheep\nq\x00)\x81q\x01}q\x02(X\x0e\x00\x00\x00number_of_pawsq\x03K\x04X\x04\x00\x00\x00nameq\x04X\x05\x00\x00\x00Dollyq\x05ub.'

In [3]:
with open('pickling.pkl', 'wb') as f:
    pickle.dump(dolly, f)
    
with open('pickling.pkl', 'rb') as f:
    unpickled_obj = pickle.load(f)
    print(unpickled_obj.name)

Dolly


Note that the `Sheep` class must be available in the same module when the data is deserialized; otherwise unpickling fails with an error (typically `AttributeError`).
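The requirement on the class being present can be checked directly. The following is a minimal sketch, assuming it runs as a script or notebook cell; `Goat` is a hypothetical class used only for illustration. It pickles an instance, removes the class from the module, and shows that unpickling then fails.

```python
import pickle

class Goat:  # hypothetical class, defined at the top level of the module
    def __init__(self, name):
        self.name = name

data = pickle.dumps(Goat("Billy"))

# The pickle stores only a reference to the class (module name + class name),
# so unpickling works while that name can still be looked up.
print(pickle.loads(data).name)

# Simulate loading the data in a program that does not define Goat.
del Goat
try:
    pickle.loads(data)
    load_error = None
except AttributeError as exc:
    load_error = exc

print(type(load_error).__name__)
```

The byte stream never contains the class body, which is why the lookup has to succeed at load time.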

### What can be pickled and unpickled?
- None, True, and False
- integers, floating-point numbers, complex numbers
- strings, bytes, and bytearrays
- tuples, lists, sets, and dictionaries containing only picklable objects
- functions defined at the top level of a module (using `def`, not `lambda`)
- built-in functions defined at the top level of a module
- classes that are defined at the top level of a module
- instances of such classes whose `__dict__` or the result of calling `__getstate__()` is picklable

Attempting to pickle an unpicklable object raises an exception: `PicklingError` in most cases, although some objects (such as locks or open files) raise `TypeError` instead.
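A quick way to find out whether a given object can be pickled is simply to try. The sketch below uses a hypothetical helper, `is_picklable`, that is not part of the `pickle` module. It catches the exception types that a failed pickling attempt commonly raises.

```python
import pickle
import threading

def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds (hypothetical helper)."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True

# Containers of picklable built-in values are fine
print(is_picklable({"numbers": [1, 2.5, 3j], "flag": None}))

# A lambda has no importable name, so it cannot be pickled by reference
print(is_picklable(lambda x: x))

# A lock wraps operating-system state and cannot be pickled
print(is_picklable(threading.Lock()))
```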

Use the `pickletools` module to inspect how pickled data is stored.

In [2]:
import pickletools

pickletools.dis(pickled_dolly)

    0: \x80 PROTO      3
    2: c    GLOBAL     '__main__ Sheep'
   18: q    BINPUT     0
   20: )    EMPTY_TUPLE
   21: \x81 NEWOBJ
   22: q    BINPUT     1
   24: }    EMPTY_DICT
   25: q    BINPUT     2
   27: (    MARK
   28: X        BINUNICODE 'number_of_paws'
   47: q        BINPUT     3
   49: K        BININT1    4
   51: X        BINUNICODE 'name'
   60: q        BINPUT     4
   62: X        BINUNICODE 'Dolly'
   72: q        BINPUT     5
   74: u        SETITEMS   (MARK at 27)
   75: b    BUILD
   76: .    STOP
highest protocol among opcodes = 2


### Pickling and Security
The pickle documentation states: "The pickle module is not intended to be secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source."

When a class instance is pickled, only data specific to the instance is stored in the pickle. This means that Python code and class variables are not included in the pickle stream.
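The following minimal sketch illustrates this with a hypothetical `Config` class (an assumption made for the example). The class variable is changed after pickling, and the unpickled instance picks up the new value because that value was never part of the byte stream.

```python
import pickle

class Config:  # hypothetical class for illustration
    version = 1  # class variable: lives on the class, not in the pickle

    def __init__(self, name):
        self.name = name  # instance attribute: stored in the pickle

data = pickle.dumps(Config("demo"))

# The class variable's name does not appear anywhere in the byte stream
print(b"version" in data)

# Change the class after pickling; the restored instance sees the new value
Config.version = 2
restored = pickle.loads(data)
print(restored.name, restored.version)
```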

The main source of pickle's vulnerability is the `__reduce__` method. Certain objects, such as files and threads, cannot be pickled by default; defining `__reduce__` lets a class specify how its instances are pickled. The method should return a tuple whose first element is a callable (often the class itself) and whose second element is a tuple of arguments for that callable. `__reduce__` itself runs at pickling time. At unpickling time, pickle calls the stored callable with the stored arguments to rebuild the object, so a crafted pickle can make the unpickler call any importable callable.

In [4]:
class UsingReduce:
    def __init__(self, value):
        self.value = value

    def __reduce__(self):
        return (self.__class__, ('Always this value',))

testObject = UsingReduce('test')

pickledTestObject = pickle.dumps(testObject)
unpickledTestObject = pickle.loads(pickledTestObject)

unpickledTestObject.value

'Always this value'
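`__reduce__` is also how an object that holds an unpicklable resource can be made picklable. The sketch below assumes a hypothetical `Counter` class that owns a `threading.Lock`. Its `__reduce__` leaves the lock out of the pickle, and `__init__` creates a fresh one when the object is rebuilt.

```python
import pickle
import threading

class Counter:
    """Hypothetical class holding a lock, which pickle cannot serialize."""

    def __init__(self, count=0):
        self.count = count
        self.lock = threading.Lock()

    def __reduce__(self):
        # Store only the callable and its arguments; the lock is omitted
        # and recreated when Counter(count) is called during unpickling.
        return (self.__class__, (self.count,))

original = Counter(5)
restored = pickle.loads(pickle.dumps(original))

print(restored.count)
print(restored.lock is original.lock)
```

Without the `__reduce__` method, `pickle.dumps(original)` would fail because the instance's `__dict__` contains the lock.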

In [6]:
import os
class CmdExecutor:
    def __reduce__(self):
        return (os.system, ('echo HI',))

# Pickle it to a file
with open('test.pkl', 'wb') as f:
    pickle.dump(CmdExecutor(), f)

In [7]:
# We can restart the shell now and import only pickle; there is no need to import os
with open('test.pkl', 'rb') as f:
    pickle.load(f)  # runs os.system('echo HI')
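One mitigation, described in the pickle documentation as restricting globals, is to subclass `pickle.Unpickler` and override `find_class` so that only an explicit allow-list of names can be resolved. The sketch below follows that pattern. The allow-list `SAFE_BUILTINS` and the helper `restricted_loads` are assumptions made for this example, and the approach reduces the risk rather than making untrusted pickles safe.

```python
import builtins
import io
import os
import pickle

# Assumed allow-list for this example; adjust it to the data you expect
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Resolve only allow-listed built-ins; refuse every other global
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data):
    """Like pickle.loads, but with the restricted find_class above."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

class CmdExecutor:
    def __reduce__(self):
        return (os.system, ("echo HI",))

payload = pickle.dumps(CmdExecutor())

try:
    restricted_loads(payload)  # os.system is never looked up or called
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(blocked)
print(restricted_loads(pickle.dumps({"a": [1, 2]})))
```

Plain containers and numbers need no global lookup at all, so they still load normally through the restricted unpickler.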