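> **Serialization**, in the context of data processing in computer science, is the process of converting a data structure or object state into a format that can be stored and later restored to its original state, in the same or another computer environment. When the resulting bytes are reread according to the serialization format, they can be used to create a semantically identical copy of the original object. For many objects, such as complex objects that make extensive use of references, this reconstruction is not straightforward.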

# Pickle

> The data format used by the `pickle` module is *Python-specific*.

## Pickling With A File

**Saving Data to a Pickle File**

In [1]:
# shell 1
entry = {}
entry['title'] = 'Dive into history, 2009 edition'
entry['article_link'] = 'http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition'
entry['comments_link'] = None
entry['internal_id'] = b'\xDE\xD5\xB4\xF8'
entry['tags'] = ('diveintopython', 'docbook', 'html')
entry['published'] = True
import time
entry['published_date'] = time.strptime('Fri Mar 27 22:20:42 2009')

import pickle
with open('entry.pickle', 'wb') as f:
    pickle.dump(entry, f)

**Loading Data from a Pickle File**

In [2]:
# shell 2
import pickle
with open('entry.pickle', 'rb') as f:
    entry = pickle.load(f)
    
entry

{'title': 'Dive into history, 2009 edition',
 'article_link': 'http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition',
 'comments_link': None,
 'internal_id': b'\xde\xd5\xb4\xf8',
 'tags': ('diveintopython', 'docbook', 'html'),
 'published': True,
 'published_date': time.struct_time(tm_year=2009, tm_mon=3, tm_mday=27, tm_hour=22, tm_min=20, tm_sec=42, tm_wday=4, tm_yday=86, tm_isdst=-1)}

> `pickle.dump()` takes an object and a writable binary stream and performs serialization; `pickle.load()` takes a readable binary stream and performs deserialization.
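
Because both functions only need a stream object, the stream does not have to be a file on disk. A minimal sketch, assuming an in-memory `io.BytesIO` buffer and a made-up `record` dictionary (neither appears in the cells above):

```python
import io
import pickle

# Hypothetical sample data, not taken from the cells above.
record = {'id': 256, 'tags': ('a', 'b'), 'raw': b'\xde\xd5'}

# Any object with a binary write() method can stand in for a file.
buffer = io.BytesIO()
pickle.dump(record, buffer)

# Rewind to the start of the buffer before reading the pickle back.
buffer.seek(0)
restored = pickle.load(buffer)

print(restored == record)  # True: the round trip preserves the value
```

Unlike JSON, the pickle round trip keeps the tuple a tuple and the bytes value as bytes.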

## Pickling Without a File

In [3]:
b = pickle.dumps(entry)
b

b'\x80\x04\x95J\x01\x00\x00\x00\x00\x00\x00}\x94(\x8c\x05title\x94\x8c\x1fDive into history, 2009 edition\x94\x8c\x0carticle_link\x94\x8cJhttp://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition\x94\x8c\rcomments_link\x94N\x8c\x0binternal_id\x94C\x04\xde\xd5\xb4\xf8\x94\x8c\x04tags\x94\x8c\x0ediveintopython\x94\x8c\x07docbook\x94\x8c\x04html\x94\x87\x94\x8c\tpublished\x94\x88\x8c\x0epublished_date\x94\x8c\x04time\x94\x8c\x0bstruct_time\x94\x93\x94(M\xd9\x07K\x03K\x1bK\x16K\x14K*K\x04KVJ\xff\xff\xff\xfft\x94}\x94(\x8c\x07tm_zone\x94N\x8c\ttm_gmtoff\x94Nu\x86\x94R\x94u.'

In [4]:
entry3 = pickle.loads(b)
entry3

{'title': 'Dive into history, 2009 edition',
 'article_link': 'http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition',
 'comments_link': None,
 'internal_id': b'\xde\xd5\xb4\xf8',
 'tags': ('diveintopython', 'docbook', 'html'),
 'published': True,
 'published_date': time.struct_time(tm_year=2009, tm_mon=3, tm_mday=27, tm_hour=22, tm_min=20, tm_sec=42, tm_wday=4, tm_yday=86, tm_isdst=-1)}

> `pickle.dumps()` performs serialization and returns a `bytes` object instead of writing to a stream; `pickle.loads()` performs deserialization from a `bytes` object.
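
The in-memory round trip can be sketched as follows. The `sample` dictionary is a made-up example, and the explicit `protocol=4` is an assumption chosen to match the `\x80\x04` prefix visible in the output above:

```python
import pickle

# Hypothetical sample data for illustration.
sample = {'title': 'example', 'internal_id': b'\xde\xd5\xb4\xf8'}

# Serialize to bytes, pinning the protocol version explicitly.
data = pickle.dumps(sample, protocol=4)

# Pickles using protocol 2 or later start with the PROTO opcode (0x80)
# followed by the protocol number.
print(data[:2])        # b'\x80\x04'

# Deserialize: the result is an equal but separate object.
copy = pickle.loads(data)
print(copy == sample)  # True
print(copy is sample)  # False
```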

# JSON

1. *JSON* is text-based.
2. *JSON* allows arbitrary amounts of whitespace between values.
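
Both points can be checked with a small sketch. The two JSON strings below are made-up examples; they differ only in whitespace and parse to the same Python value:

```python
import json

# Two hypothetical documents that differ only in whitespace between values.
compact = '{"id":256,"tags":["a","b"]}'
spaced = '''
{
    "id"   :   256,
    "tags" : [ "a" ,
               "b" ]
}
'''

# Whitespace between values is ignored by the parser.
same = json.loads(compact) == json.loads(spaced)
print(same)  # True

# indent= asks the encoder to add whitespace for readability.
pretty = json.dumps(json.loads(compact), indent=2)
print(pretty)
```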

## Saving Data to a JSON File

In [None]:
# shell 1
basic_entry = {}
basic_entry['id'] = 256
basic_entry['title'] = 'Dive into history, 2009 edition'
basic_entry['tags'] = ('diveintopython', 'docbook', 'html')
basic_entry['published'] = True
basic_entry['comments_link'] = None
import json
with open('basic.json', mode='w', encoding='utf-8') as f:
    json.dump(basic_entry, f)   

## Mapping of Python Datatypes to JSON

| JSON  | Python 3  |  Notes |
| :----:  |  :----:     | :----:  |
| object |  dictionary    |      |
| array |  list    |      |
| string |  string    |      |
| integer |  integer    |      |
| real number |  float    |      |
| true |  True    |*      |
| false |  False    |*      |
| null |  None    |*      |

\* All JSON values are case-sensitive.
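
The mapping and the case-sensitivity note can be illustrated with a short sketch. The `values` dictionary and its keys are invented for the example:

```python
import json

# Hypothetical values covering the rows of the table above.
values = {'flag_on': True, 'flag_off': False, 'missing': None,
          'count': 3, 'ratio': 0.5, 'name': 'docbook', 'items': [1, 2]}

# Python's True/False/None are written as lowercase true/false/null.
text = json.dumps(values)
print(text)

# JSON literals are case-sensitive: Python-style 'True' is not valid JSON.
try:
    json.loads('True')
    rejected = False
except json.JSONDecodeError as err:
    rejected = True
    print('rejected:', err)
```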

In [6]:
a  = {'a':1}
repr(a)


"{'a': 1}"

## Serializing Datatypes Unsupported by JSON

> Define your own "mini-serialization format": write a function that converts each unsupported datatype into a datatype that JSON does support, and pass it to the `json` module.

In [29]:
entry = {'comments_link': None,
 'internal_id': b'\xDE\xD5\xB4\xF8',
 'title': 'Dive into history, 2009 edition',
 'tags': ('diveintopython', 'docbook', 'html'),
 'article_link': 'http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition',
 'published': True}
entry

{'comments_link': None,
 'internal_id': b'\xde\xd5\xb4\xf8',
 'title': 'Dive into history, 2009 edition',
 'tags': ('diveintopython', 'docbook', 'html'),
 'article_link': 'http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition',
 'published': True}

In [30]:

#shell 1
import json
def to_json(python_object):
    if isinstance(python_object, bytes):
        return {'__class__': 'bytes',
                '__value__': list(python_object)}
    raise TypeError(repr(python_object) + ' is not JSON serializable')
    
with open('entry.json', 'w', encoding='utf-8') as f:
    json.dump(entry, f, default=to_json)
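
To see what the `default=` hook changes, the sketch below serializes to a string instead of a file. It repeats an equivalent converter under the hypothetical name `bytes_to_json` so that it runs on its own, and uses a made-up one-key `sample` dictionary:

```python
import json

# Hypothetical stand-in for the converter above, repeated so this runs alone.
def bytes_to_json(python_object):
    if isinstance(python_object, bytes):
        return {'__class__': 'bytes', '__value__': list(python_object)}
    raise TypeError(repr(python_object) + ' is not JSON serializable')

sample = {'internal_id': b'\xDE\xD5\xB4\xF8'}

# Without a default= hook, the encoder rejects bytes values.
try:
    json.dumps(sample)
    failed = False
except TypeError as err:
    failed = True
    print('TypeError:', err)

# With the hook, each bytes value becomes a tagged JSON object.
text = json.dumps(sample, default=bytes_to_json)
print(text)
# {"internal_id": {"__class__": "bytes", "__value__": [222, 213, 180, 248]}}
```

The `__class__` tag is what lets a matching `object_hook` recognize the value and rebuild the `bytes` object on load.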

## Loading Data from a JSON File

In [31]:
# shell 2
entry = None
def from_json(json_object):                             
    if '__class__' in json_object:                
        if json_object['__class__'] == 'bytes':
            return bytes(json_object['__value__'])
    return json_object

with open('entry.json', 'r', encoding='utf-8') as f:
    entry = json.load(f, object_hook=from_json)
    
entry

{'comments_link': None,
 'internal_id': b'\xde\xd5\xb4\xf8',
 'title': 'Dive into history, 2009 edition',
 'tags': ['diveintopython', 'docbook', 'html'],
 'article_link': 'http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition',
 'published': True}

> The `json` module silently converts both tuples and lists into JSON arrays during serialization, so the `tags` tuple comes back as a list after the round trip.
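
A minimal sketch of this behavior, using the `tags` value from the entry. The explicit `tuple()` call at the end is one possible way to restore the original type when the caller knows a tuple is expected:

```python
import json

tags = ('diveintopython', 'docbook', 'html')

# Tuples and lists produce the same JSON array text.
as_tuple = json.dumps(tags)
as_list = json.dumps(list(tags))
print(as_tuple == as_list)      # True

# Deserialization always yields a list, so the tuple type is lost.
restored = json.loads(as_tuple)
print(type(restored).__name__)  # list

# Converting back by hand recovers the original value.
print(tuple(restored) == tags)  # True
```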