Pickling

Pickling is Python's name for serialization. Pickling converts an object hierarchy into a byte stream (a binary format, usually not human readable) that can be stored. To pickle an object, we just need to import the pickle module and call the dumps() function, passing the object to be pickled as its argument; dumps() returns the pickled data as a bytes object.

In [3]:
import pickle

class Animal:
    def __init__(self, number_of_paws, color):
        self.number_of_paws = number_of_paws
        self.color = color

class Sheep(Animal):
    def __init__(self, color):
        Animal.__init__(self, 4, color)

mary = Sheep("white")

print (str.format("My sheep mary is {0} and has {1} paws", mary.color, mary.number_of_paws))
my_pickled_mary = pickle.dumps(mary)

print ("Would you like to see her pickled? Here she is!")
print (my_pickled_mary)


My sheep mary is white and has 4 paws
Would you like to see her pickled? Here she is!
b'\x80\x04\x95A\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x05Sheep\x94\x93\x94)\x81\x94}\x94(\x8c\x0enumber_of_paws\x94K\x04\x8c\x05color\x94\x8c\x05white\x94ub.'


In [4]:
import pickle

class Animal:
    def __init__(self, number_of_paws, color):
        self.number_of_paws = number_of_paws
        self.color = color

class Sheep(Animal):
    def __init__(self, color):
        Animal.__init__(self, 4, color)

mary = Sheep("white")

print (str.format("My sheep mary is {0} and has {1} paws", mary.color, mary.number_of_paws))
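# dumps() returns the pickled bytes in memory
my_pickled_mary = pickle.dumps(mary)

# dump() writes the pickled bytes to an open binary file and returns None,
# so its result is not assigned; the with block closes the file for us
with open('my_pickled_mary.bin', mode='wb') as binary_file:
    pickle.dump(mary, binary_file)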

My sheep mary is white and has 4 paws


Unpickling

The process that takes a byte stream and converts it back into an object hierarchy is called unpickling.

Unpickling is done with the loads() function of the pickle module, which rebuilds a complete object hierarchy from a simple bytes object; its counterpart load() reads the pickled data from an open binary file instead. Let’s try the loads() function on the example above:

In [5]:
import pickle

class Animal:
    def __init__(self, number_of_paws, color):
        self.number_of_paws = number_of_paws
        self.color = color

class Sheep(Animal):
    def __init__(self, color):
        Animal.__init__(self, 4, color)

# Step 1: Let's create the sheep Mary
mary = Sheep("white")

# Step 2: Let's pickle Mary
my_pickled_mary = pickle.dumps(mary)

# Step 3: Now, let's unpickle our sheep Mary, creating another instance, another sheep... Dolly!
dolly = pickle.loads(my_pickled_mary)

# Dolly and Mary are two different objects: if we set another color for Dolly,
# there are no consequences for Mary

dolly.color = "black"

print (str.format("Dolly is {0} ", dolly.color))
print (str.format("Mary is {0} ", mary.color))

Dolly is black 
Mary is white 
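
The file-based counterpart of loads() is load(). As a minimal sketch, the round trip through a file could look like the following; the `flock` dictionary and the file name `my_pickled_flock.bin` are hypothetical and exist only for illustration, and a temporary directory is used so that nothing is left on disk.

```python
import os
import pickle
import tempfile

# Hypothetical data, used only for illustration
flock = {"mary": {"color": "white", "number_of_paws": 4}}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "my_pickled_flock.bin")

    # dump() writes the pickled bytes to a file opened in binary mode
    with open(path, mode="wb") as binary_file:
        pickle.dump(flock, binary_file)

    # load() reads the bytes back and rebuilds the object hierarchy
    with open(path, mode="rb") as binary_file:
        restored_flock = pickle.load(binary_file)

print(restored_flock == flock)  # the content is the same
print(restored_flock is flock)  # but it is a different object
```

As with Dolly and Mary, the restored object has the same content as the original but is a separate instance.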


In [6]:
import pickle
my_custom_pickle = bytes("this is unpicklable", encoding="UTF-8")

# this next statement will raise a _pickle.UnpicklingError
my_new_object = pickle.loads(my_custom_pickle)


UnpicklingError: could not find MARK
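
If the bytes may not be a valid pickle, the call can be wrapped in a try/except block. The sketch below is one possible way to do it: the helper name `try_unpickle` is hypothetical, and the extra exception types are caught because the pickle documentation notes that unpickling can also raise errors other than UnpicklingError.

```python
import pickle

def try_unpickle(data):
    # Hypothetical helper: returns (object, None) on success
    # or (None, error) when the bytes cannot be unpickled.
    try:
        return pickle.loads(data), None
    except (pickle.UnpicklingError, EOFError, AttributeError,
            ImportError, IndexError) as error:
        return None, error

my_custom_pickle = bytes("this is unpicklable", encoding="UTF-8")

my_new_object, error = try_unpickle(my_custom_pickle)
if error is not None:
    print("Could not unpickle the data:", error)

# A valid pickle goes through the same helper without an error
my_list, error = try_unpickle(pickle.dumps([1, 2, 3]))
print(my_list)
```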

When the hierarchy of the object you want to pickle contains an unpicklable object, the entire object cannot be serialized (and stored). Fortunately, Python offers two convenient methods for specifying what you want to pickle and how to re-initialize, during unpickling, the objects that you haven’t pickled. These methods are __getstate__() and __setstate__().

In [4]:
import pickle

class my_zen_class:
    # class-level default, used when an instance has no counter of its own
    number_of_meditations = 0

    def __init__(self, name):
        self.number_of_meditations = 0
        self.name = name

    def meditate(self):
        # count one more meditation
        self.number_of_meditations += 1

    def __getstate__(self):
        # this method is called when the object is about to be
        # pickled, to know what to pickle
        state = self.__dict__.copy()
        # You will never get the Buddha state if you count
        # meditations, so
        # don't pickle this counter, the next time you will just
        # start meditating from scratch :)
        del state['number_of_meditations']

        return state

    def __setstate__(self, state):
        # this method is called when the object is being
        # unpickled
        # if you need some initialization after the
        # unpickling you can do it here

        self.__dict__.update(state)

# start meditating
my_zen_object = my_zen_class("Dave")
for i in range(100):
    my_zen_object.meditate()

# Now I pickle my meditation experience
print(str.format("I'm {0}, and I've meditated {1} times", my_zen_object.name, my_zen_object.number_of_meditations))
my_pickled_zen_object = pickle.dumps(my_zen_object)
my_zen_object = None

# Now I get my meditation experience back
my_new_zen_object = pickle.loads(my_pickled_zen_object)

# As expected, the number_of_meditations counter
# has not been restored because it hasn't been pickled;
# reading it falls back to the class attribute, which is 0

print(str.format("I'm {0}, and I don't have a beginner mind yet because I've meditated only {1} times", my_new_zen_object.name, my_new_zen_object.number_of_meditations))


I'm Dave, and I've meditated 100 times
I'm Dave, and I don't have a beginner mind yet because I've meditated only 0 times
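
The same pair of methods also covers the case where an attribute really cannot be pickled. The following is a minimal sketch under stated assumptions: the class `ClickCounter` and its attributes are hypothetical, and a `threading.Lock` stands in for the unpicklable member. `__getstate__()` drops the lock before pickling and `__setstate__()` creates a fresh one during unpickling.

```python
import pickle
import threading

class ClickCounter:
    # Hypothetical class: it holds a lock, which pickle cannot serialize
    def __init__(self, name):
        self.name = name
        self.count = 0
        self.lock = threading.Lock()

    def increment(self):
        with self.lock:
            self.count += 1

    def __getstate__(self):
        # Pickle everything except the lock
        state = self.__dict__.copy()
        del state["lock"]
        return state

    def __setstate__(self, state):
        # Restore the pickled attributes, then re-create the lock
        self.__dict__.update(state)
        self.lock = threading.Lock()

# A lock on its own cannot be pickled
try:
    pickle.dumps(threading.Lock())
except TypeError as error:
    print("Cannot pickle a lock:", error)

counter = ClickCounter("clicks")
counter.increment()

# Thanks to __getstate__ and __setstate__ the whole object round-trips
restored = pickle.loads(pickle.dumps(counter))
restored.increment()
print(restored.name, restored.count)
```

Without the two methods, `pickle.dumps(counter)` would fail with the same TypeError, because the lock is part of the instance's `__dict__`.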
