# Making pickle reliable with copyreg

In [1]:
import logging

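The pickle module can serialize Python objects into a stream of bytes, and deserialize those bytes back into Python objects.

**Example:** A Python object represents a player's progress in a game. The GameState class holds the player's current level and the number of lives remaining.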

In [2]:
class GameState(object):
    def __init__(self):
        self.level = 0
        self.lives = 4

In [3]:
state = GameState()
state.level += 1  # Player beat a level
state.lives -= 1  # Player had to try again

When the player quits the game, the program can save the game state to a file so that it can be resumed later.

In [4]:
import pickle
state_path = 'game_state.bin'
with open(state_path, 'wb') as f:
    pickle.dump(state, f)

The load function reads this file and restores the GameState object.

In [5]:
with open(state_path, 'rb') as f:
    state_after = pickle.load(f)
print(state_after.__dict__)

{'level': 1, 'lives': 3}


**New requirement:** To encourage players to chase high scores, add scoring to the game. A points field is added to GameState to hold the player's score.

In [6]:
class GameState(object):
    def __init__(self):
        self.level = 0
        self.lives = 4
        self.points = 0

In [7]:
state = GameState()
serialized = pickle.dumps(state)
state_after = pickle.loads(serialized)
print(state_after.__dict__)

{'level': 0, 'lives': 4, 'points': 0}


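**The problem:** If a player resumes from a save file that was written in the old GameState format, the restored object has no points attribute, even though it is an instance of the new GameState class.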

In [8]:
with open(state_path, 'rb') as f:
    state_after = pickle.load(f)
print(state_after.__dict__)

{'level': 1, 'lives': 3}


In [9]:
assert isinstance(state_after, GameState)

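**Analysis:** This behavior is a side effect of how the pickle module works. Its main purpose is to make serializing objects easy, so by default it restores the saved `__dict__` directly and does not call `__init__`. Attributes that were added to the class after the data was saved are therefore never set.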
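The default behavior can be observed directly. The sketch below uses a hypothetical Counter class (not part of the game example) whose only job is to count how often its constructor runs; it assumes the code is run as a top-level script so that pickle can locate the class by name.

```python
import pickle


class Counter:
    """Hypothetical class used only to observe whether __init__ runs."""
    init_calls = 0

    def __init__(self):
        Counter.init_calls += 1
        self.value = 0


original = Counter()  # the constructor runs once here
restored = pickle.loads(pickle.dumps(original))

# The restored object was rebuilt from the saved __dict__;
# the constructor was not called a second time.
print(Counter.init_calls)
print(restored.__dict__)
```

Because the constructor is skipped, any default that lives only in `__init__` cannot reach objects restored from older data.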

## Approach 1: Provide default values for missing attributes

In [10]:
class GameState(object):
    def __init__(self, level=0, lives=4, points=0):
        self.level = level
        self.lives = lives
        self.points = points

In [11]:
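# Takes a GameState object and converts it into a tuple of parameters.
# The returned tuple contains the function to use for unpickling
# and the arguments to pass to that unpickle function.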
def pickle_game_state(game_state):
    kwargs = game_state.__dict__
    return unpickle_game_state, (kwargs,)

In [12]:
# Takes the serialized data and parameters passed on by pickle_game_state
# and returns the corresponding GameState object.
def unpickle_game_state(kwargs):
    return GameState(**kwargs)

In [13]:
# Register the pickle_game_state function
import copyreg
copyreg.pickle(GameState, pickle_game_state)

In [14]:
state = GameState()
state.points += 1000
serialized = pickle.dumps(state)
state_after = pickle.loads(serialized)
print(state_after.__dict__)

{'level': 0, 'lives': 4, 'points': 1000}
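To see what the registered function hands to pickle, it can be called by hand. This is a minimal sketch that repeats the definitions from the cells above and involves no actual pickling; it only shows the shape of the returned tuple and that calling the returned function with the returned arguments rebuilds an equivalent object through the constructor.

```python
class GameState:
    def __init__(self, level=0, lives=4, points=0):
        self.level = level
        self.lives = lives
        self.points = points


def unpickle_game_state(kwargs):
    # Rebuild through the constructor so that defaults apply
    return GameState(**kwargs)


def pickle_game_state(game_state):
    # (callable, args): pickle stores a reference to the callable plus the args
    return unpickle_game_state, (game_state.__dict__,)


state = GameState(points=1000)
func, args = pickle_game_state(state)
rebuilt = func(*args)  # what unpickling does with the stored pair

print(func.__name__)
print(args)
print(rebuilt.__dict__)
```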


**Requirement change:** Give the player a number of magic scrolls.

In [15]:
class GameState(object):
    def __init__(self, level=0, lives=4, points=0, magic=5):
        self.level = level
        self.lives = lives
        self.points = points
        self.magic = magic

In [16]:
state_after = pickle.loads(serialized)
print(state_after.__dict__)

{'level': 0, 'lives': 4, 'points': 1000, 'magic': 5}
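The magic field appears because unpickling now goes through the constructor, which fills in the default for any keyword the old data lacks. The notebook cells depend on state carried over from earlier cells, so here is the same flow as one self-contained sketch. It assumes it is run as a top-level script, and it redefines GameState in place to stand in for a later release of the game.

```python
import copyreg
import pickle


class GameState:
    def __init__(self, level=0, lives=4, points=0):
        self.level = level
        self.lives = lives
        self.points = points


def unpickle_game_state(kwargs):
    # Looks up whichever GameState is current when the data is loaded
    return GameState(**kwargs)


def pickle_game_state(game_state):
    return unpickle_game_state, (game_state.__dict__,)


copyreg.pickle(GameState, pickle_game_state)

state = GameState()
state.points += 1000
old_data = pickle.dumps(state)  # written by the "old" class


class GameState:  # stands in for the next release, with one extra field
    def __init__(self, level=0, lives=4, points=0, magic=5):
        self.level = level
        self.lives = lives
        self.points = points
        self.magic = magic


restored = pickle.loads(old_data)
print(restored.__dict__)
```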


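## Approach 2: Manage classes with version numbers

**Requirement change:** The game should not limit the number of lives a player has, so the lives field is removed from the game.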

In [17]:
class GameState(object):
    def __init__(self, level=0, points=0, magic=5):
        self.level = level
        self.points = points
        self.magic = magic

In [18]:
try:
    pickle.loads(serialized)
except Exception:
    logging.exception('Expected')
else:
    assert False

ERROR:root:Expected
Traceback (most recent call last):
  File "<ipython-input-18-c5ccc301c321>", line 2, in <module>
    pickle.loads(serialized)
  File "<ipython-input-12-d4450a2dd838>", line 4, in unpickle_game_state
    return GameState(**kwargs)
TypeError: __init__() got an unexpected keyword argument 'lives'


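After the constructor is changed, the program can no longer deserialize old game data, because every field in the old data is passed through unpickle_game_state to the GameState constructor, which no longer accepts lives.

**Solution:** Modify the pickle_game_state function to add a parameter that represents the version number. When a new-style GameState object is pickled, pickle_game_state adds a version parameter with the value 2 to the serialized data.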

In [19]:
def pickle_game_state(game_state):
    kwargs = dict(game_state.__dict__)  # copy, so the live object gains no 'version' attribute
    kwargs['version'] = 2
    return unpickle_game_state, (kwargs,)

In [20]:
def unpickle_game_state(kwargs):
    version = kwargs.pop('version', 1)
    if version == 1:
        kwargs.pop('lives')
    return GameState(**kwargs)

In [21]:
copyreg.pickle(GameState, pickle_game_state)
state_after = pickle.loads(serialized)
print(state_after.__dict__)

{'level': 0, 'points': 1000, 'magic': 5}
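The versioning scheme can be exercised in one place. The following is a sketch under a few assumptions: it is run as a top-level script, the version-1 dictionary is hand-written to resemble old data, and the reducer copies `__dict__` before adding the version key so that pickling does not alter the object being saved.

```python
import copyreg
import pickle


class GameState:
    def __init__(self, level=0, points=0, magic=5):
        self.level = level
        self.points = points
        self.magic = magic


def unpickle_game_state(kwargs):
    version = kwargs.pop('version', 1)  # data without a version counts as v1
    if version == 1:
        kwargs.pop('lives', None)  # v1 data may still carry the removed field
    return GameState(**kwargs)


def pickle_game_state(game_state):
    kwargs = dict(game_state.__dict__)  # copy; leave the live object untouched
    kwargs['version'] = 2
    return unpickle_game_state, (kwargs,)


copyreg.pickle(GameState, pickle_game_state)

# Hand-written stand-in for data saved by version 1 of the class
v1_kwargs = {'level': 0, 'lives': 4, 'points': 1000}
from_v1 = unpickle_game_state(dict(v1_kwargs))

# A real round trip with the current class, tagged as version 2
state = GameState(level=2, points=50)
from_v2 = pickle.loads(pickle.dumps(state))

print(from_v1.__dict__)
print(from_v2.__dict__)
print('version' in state.__dict__)
```

Each future change to the class would then bump the version number and add a matching branch in unpickle_game_state.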


## Approach 3: Stable import paths

**Problem:** With pickle, once a class is renamed, existing serialized data can no longer be deserialized.

In [22]:
copyreg.dispatch_table.clear()
state = GameState()
serialized = pickle.dumps(state)
del GameState
class BetterGameState(object):
    def __init__(self, level=0, points=0, magic=5):
        self.level = level
        self.points = points
        self.magic = magic

In [23]:
try:
    pickle.loads(serialized)
except Exception:
    logging.exception('Expected')
else:
    assert False

ERROR:root:Expected
Traceback (most recent call last):
  File "<ipython-input-23-c5ccc301c321>", line 2, in <module>
    pickle.loads(serialized)
AttributeError: Can't get attribute 'GameState' on <module '__main__'>


In [24]:
print(serialized[:25])

b'\x80\x03c__main__\nGameState\nq\x00)'
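The byte string above can also be decoded opcode by opcode with the standard pickletools module, which makes the embedded class reference easier to read. This sketch redefines a GameState class for self-containment and pins protocol 3 only so that the stream matches the output shown above; newer protocols record the same module and class name with different opcodes.

```python
import pickle
import pickletools


class GameState:
    def __init__(self, level=0, points=0, magic=5):
        self.level = level
        self.points = points
        self.magic = magic


# Protocol 3 is chosen only to match the stream printed above
serialized = pickle.dumps(GameState(), protocol=3)

# A GLOBAL opcode carries "module name" as one space-separated string
class_refs = [arg for opcode, arg, pos in pickletools.genops(serialized)
              if opcode.name == 'GLOBAL']
print(class_refs)
```

The only global reference in the stream is the class itself, which is why renaming or moving the class breaks loading.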


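The serialized data embeds the import path of the class, which is why the renamed class cannot be found. copyreg solves this problem as well: it lets you give the unpickling function a stable identifier, so that the serialized data refers to that function instead of to the class.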

In [25]:
copyreg.pickle(BetterGameState, pickle_game_state)

In [26]:
state = BetterGameState()
serialized = pickle.dumps(state)
print(serialized[:35])

b'\x80\x03c__main__\nunpickle_game_state\nq\x00}'
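A full round trip after the rename also needs the unpickle function to build the new class. The sketch below assumes unpickle_game_state has been updated to construct BetterGameState (the cells above register the reducer but do not show that change) and that the code runs as a top-level script. It checks that the stream names the function and not the class.

```python
import copyreg
import pickle


class BetterGameState:
    def __init__(self, level=0, points=0, magic=5):
        self.level = level
        self.points = points
        self.magic = magic


def unpickle_game_state(kwargs):
    # Assumed update: construct the renamed class
    kwargs.pop('version', None)
    return BetterGameState(**kwargs)


def pickle_game_state(game_state):
    kwargs = dict(game_state.__dict__)
    kwargs['version'] = 2
    return unpickle_game_state, (kwargs,)


copyreg.pickle(BetterGameState, pickle_game_state)

serialized = pickle.dumps(BetterGameState(points=7))
names_function = b'unpickle_game_state' in serialized
names_class = b'BetterGameState' in serialized
restored = pickle.loads(serialized)

print(names_function, names_class)
print(restored.__dict__)
```

The trade-off is that the module and name of unpickle_game_state become the fixed point: the class can be renamed freely, but the function has to stay importable under the same path for old data to keep loading.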
