## Chapter 8: Data Has to Go Somewhere

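#### File Input/Output
* fileobj = open(filename, mode): open() returns a file object; mode is a short string that says how the file will be used (read, write, append) and whether it is text or binary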

![open](http://owz0zbwsq.bkt.clouddn.com/file_open.png)
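
The mode table itself lives in the image above. As a minimal sketch of the most common mode letters, the cell below writes, appends to, and reads back a scratch file. The temporary directory and the file name `modes_demo.txt` are assumptions made for the example, not part of the original notes.

```python
import os
import tempfile

# Work in a throwaway directory so no real file is touched.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'modes_demo.txt')  # hypothetical file name

# 'w' creates the file, or truncates it if it already exists.
with open(path, 'wt') as f:
    f.write('first\n')

# 'a' appends to the end instead of truncating.
with open(path, 'at') as f:
    f.write('second\n')

# 'x' refuses to overwrite: it raises FileExistsError if the file exists.
try:
    open(path, 'xt')
    exclusive_failed = False
except FileExistsError:
    exclusive_failed = True

# 'r' reads; 't' returns str, 'b' returns bytes.
with open(path, 'rt') as f:
    text = f.read()
with open(path, 'rb') as f:
    raw = f.read()

print(repr(text), type(raw).__name__, exclusive_failed)
```

The second letter (`t` or `b`) can be left out, in which case text mode is used.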

* Writing a text file with write()

In [3]:
poem = '''There was a young lady named Bright,
Whose speed was far faster than light;
She started one day
In a relative way,
And returned on the previous night.'''
len(poem)

150

In [4]:
fout = open('relativity','wt')
fout.write(poem)
fout.close()

In [7]:
fout = open('relativity','wt')
print(poem,file=fout,sep='',end='')
fout.close()

* If the source string is very large, write it in chunks until every character has been written

In [12]:
fout = open('relativity','wt')
size = len(poem)
offset = 0
chunk = 100
while offset < size:
    # Slicing past the end of the string is safe: the last chunk is just shorter
    fout.write(poem[offset:offset+chunk])
    offset += chunk
fout.close()

* Reading a text file with read(), readline(), or readlines()

In [15]:
fin = open('relativity','rt')
poem = fin.read()
fin.close()
len(poem)

150

In [16]:
poem = ''
fin = open('relativity','rt')
chunk = 100
while True:
    fragment = fin.read(chunk)
    if not fragment:
        break
    poem += fragment
fin.close()
len(poem)

150

In [17]:
poem = ''
fin = open('relativity','rt')
while True:
    fragment = fin.readline()
    if not fragment:
        break
    poem += fragment
fin.close()
len(poem)

150
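
The cells above cover read() and readline(), but not readlines(). As a small sketch, readlines() reads the whole file at once and returns a list with one string per line, keeping the trailing newlines. The file is written to a temporary directory here (an assumption, so that the example is self-contained).

```python
import os
import tempfile

poem = '''There was a young lady named Bright,
Whose speed was far faster than light;
She started one day
In a relative way,
And returned on the previous night.'''

# Scratch copy of the 'relativity' file, kept out of the working directory.
path = os.path.join(tempfile.mkdtemp(), 'relativity')
with open(path, 'wt') as fout:
    fout.write(poem)

with open(path, 'rt') as fin:
    lines = fin.readlines()  # one string per line, newlines included

print(len(lines), 'lines read')
for line in lines:
    print(line, end='')  # each line already ends with '\n' (except the last)
```

Because everything is loaded into memory, readlines() suits small files; for large ones, iterating over the file object line by line is gentler.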

* The simplest way to read a text file is to iterate over the file object, which yields one line at a time

In [18]:
poem = ''
fin = open('relativity','rt')
for line in fin:
    poem += line
fin.close()
len(poem)

150

* Writing a binary file with write()

In [2]:
bdata = bytes(range(0,256))
len(bdata)

256

In [5]:
fout = open('bfile','wb')
size = len(bdata)
offset = 0
chunk = 100
while offset < size:
    # Slicing past the end of the bytes object is safe: the last chunk is just shorter
    fout.write(bdata[offset:offset+chunk])
    offset += chunk
fout.close()

* Reading a binary file with read()

In [11]:
fin = open('bfile','rb')
bdata = fin.read()
fin.close()
len(bdata)

256

* Closing files automatically with with

In [12]:
with open('bfile','rb') as f:
    f.read()
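
A quick way to see what with does is to check the file object's closed attribute inside and after the block. The sketch below first creates its own copy of bfile in a temporary directory (an assumption, so that it runs on its own).

```python
import os
import tempfile

# Create a scratch binary file holding the bytes 0..255.
path = os.path.join(tempfile.mkdtemp(), 'bfile')
with open(path, 'wb') as fout:
    fout.write(bytes(range(0, 256)))

with open(path, 'rb') as f:
    bdata = f.read()
    open_inside = not f.closed   # still open inside the block

closed_after = f.closed          # closed as soon as the block ends

print(len(bdata), open_inside, closed_after)
```

The file is closed even if the body of the with block raises an exception, which is why no explicit close() call is needed.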

* Changing position with seek()

In [13]:
fin = open('bfile','rb')
fin.tell()

0

In [14]:
fin.seek(111)

111

In [15]:
fin.tell()

111

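* seek(offset, origin): offset is the number of bytes to move and origin is where to count from: 0 (the default) is the start of the file, 1 is the current position, and 2 is the end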
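
As a short sketch of the three origin values, the cell below reaches the last byte of a 256-byte file in three different ways. It uses the os.SEEK_SET, os.SEEK_CUR, and os.SEEK_END constants (equal to 0, 1, and 2) and writes its own scratch file in a temporary directory, which is an assumption made for the example.

```python
import os
import tempfile

# Scratch binary file holding the bytes 0..255.
path = os.path.join(tempfile.mkdtemp(), 'bfile')
with open(path, 'wb') as fout:
    fout.write(bytes(range(0, 256)))

with open(path, 'rb') as fin:
    # origin 0: count from the start of the file
    fin.seek(255, os.SEEK_SET)
    last = fin.read()

    # origin 2: count back from the end of the file
    fin.seek(-1, os.SEEK_END)
    also_last = fin.read()

    # origin 1: move relative to the current position
    fin.seek(254, os.SEEK_SET)
    fin.seek(1, os.SEEK_CUR)
    pos = fin.tell()

print(last, also_last, pos)
```

Relative seeks like these are most useful in binary mode; in text mode only offsets returned by tell() (or zero) are reliable.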

#### Structured Text Files

* CSV

In [17]:
import csv
villains = [
    ['Doctor', 'No'],
    ['Rosa', 'Klebb'],
    ['Mister', 'Big'],
    ['Auric', 'Goldfinger'],
    ['Ernst', 'Blofeld'],
]
# newline='' lets the csv module control line endings (avoids blank rows on Windows)
with open('villains','wt',newline='') as fout:
    csvout = csv.writer(fout)
    csvout.writerows(villains)

In [18]:
import csv
with open('villains','rt') as fin:
    cin = csv.reader(fin)
    villains = [row for row in cin]
villains

[['Doctor', 'No'],
 ['Rosa', 'Klebb'],
 ['Mister', 'Big'],
 ['Auric', 'Goldfinger'],
 ['Ernst', 'Blofeld']]

* The data can be a collection of dictionaries as well as a collection of lists

In [20]:
import csv
with open('villains','rt') as fin:
    cin = csv.DictReader(fin,fieldnames=['first','last'])
    villains = [row for row in cin]
    
villains

[{'first': 'Doctor', 'last': 'No'},
 {'first': 'Rosa', 'last': 'Klebb'},
 {'first': 'Mister', 'last': 'Big'},
 {'first': 'Auric', 'last': 'Goldfinger'},
 {'first': 'Ernst', 'last': 'Blofeld'}]

In [24]:
import csv
villains = [{'first': 'Doctor', 'last': 'No'},
 {'first': 'Rosa', 'last': 'Klebb'},
 {'first': 'Mister', 'last': 'Big'},
 {'first': 'Auric', 'last': 'Goldfinger'},
 {'first': 'Ernst', 'last': 'Blofeld'}]

# newline='' lets the csv module control line endings (avoids blank rows on Windows)
with open('villains','wt',newline='') as fout:
    cout = csv.DictWriter(fout,['first','last'])
    cout.writeheader()
    cout.writerows(villains)
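
Once the file starts with a header row, DictReader can be called without fieldnames: it then takes the keys from the first line. A minimal round-trip sketch, using a scratch file in a temporary directory (an assumption, so that the real villains file is left alone):

```python
import csv
import os
import tempfile

villains = [
    {'first': 'Doctor', 'last': 'No'},
    {'first': 'Rosa', 'last': 'Klebb'},
    {'first': 'Mister', 'last': 'Big'},
]

path = os.path.join(tempfile.mkdtemp(), 'villains')

# Write a header row followed by one row per dictionary.
with open(path, 'wt', newline='') as fout:
    cout = csv.DictWriter(fout, ['first', 'last'])
    cout.writeheader()
    cout.writerows(villains)

# No fieldnames argument: the first row of the file supplies the keys.
with open(path, 'rt', newline='') as fin:
    cin = csv.DictReader(fin)
    rows = [dict(row) for row in cin]

print(rows)
```

Passing fieldnames explicitly when the file already has a header would instead return the header line as the first data row.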

* XML

In [25]:
import xml.etree.ElementTree as et

tree = et.ElementTree(file='menu.xml')
root = tree.getroot()
root.tag

'menu'

In [27]:
for child in root:
    print('tag:',child.tag,'attributes:',child.attrib)
    for grandchild in child:
        print('\ttag:',grandchild.tag,'attributes:',grandchild.attrib)

tag: breakfast attributes: {'hours': '7-11'}
	tag: item attributes: {'price': '$6.00'}
	tag: item attributes: {'price': '$4.00'}
tag: lunch attributes: {'hours': '11-3'}
	tag: item attributes: {'price': '$5.00'}
tag: dinner attributes: {'hours': '3-10'}
	tag: item attributes: {'price': '8.00'}


In [28]:
# Number of menu sections
len(root)

3

In [29]:
# Number of breakfast items
len(root[0])

2
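
menu.xml itself is not reproduced in these notes. As a rough sketch, the hypothetical document below has the same tags and attributes as the output above (the item names are invented for illustration) and is parsed from a string with et.fromstring() instead of from a file.

```python
import xml.etree.ElementTree as et

# Hypothetical stand-in for menu.xml: the structure and attributes follow the
# output shown above, but the item names are made up for this example.
menu_xml = '''<menu>
  <breakfast hours="7-11">
    <item price="$6.00">breakfast burritos</item>
    <item price="$4.00">pancakes</item>
  </breakfast>
  <lunch hours="11-3">
    <item price="$5.00">hamburger</item>
  </lunch>
  <dinner hours="3-10">
    <item price="8.00">spaghetti</item>
  </dinner>
</menu>'''

root = et.fromstring(menu_xml)   # returns the root Element directly

# iter() walks the whole tree, so nested loops are not needed to reach items.
items = [(item.get('price'), item.text) for item in root.iter('item')]

print(root.tag, len(root), len(root[0]))
for price, name in items:
    print(price, name)
```

get() reads a single attribute and text gives the content between the opening and closing tags.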

Other XML libraries in the Python standard library include:
* xml.dom
* xml.sax

* JSON

In [36]:
import requests
import json

response = requests.get('http://www.bsbj.net/iv/data.json')
data = response.json()
dogs_json = json.dumps(data)
dogs_json

'{"dogs": [{"dog_id": 1, "image": "http://www.bsbj.net/iv/dogs/dog1.jpg", "sort": 6, "date": "2016-08-08 16:12:47", "detail": "\\u79cb\\u7530\\u72ac\\uff08\\u65e5\\u8bed\\uff1a\\u79cb\\u7530\\u72ac\\uff0f\\u3042\\u304d\\u305f\\u3044\\u306c\\uff0f\\u30a2\\u30ad\\u30bf\\u30a4\\u30cc Akita Inu *\\uff09\\u662f\\u65e5\\u672c\\u72ac\\u7684\\u4e00\\u79cd\\uff0c\\u662f\\u56fd\\u5bb6\\u5929\\u7136\\u7eaa\\u5ff5\\u7269\\u4e4b\\u516d\\u79cd\\u65e5\\u672c\\u72ac\\u4e2d\\u552f\\u4e00\\u7684\\u5927\\u578b\\u72ac\\u79cd\\u3002", "name": "\\u79cb\\u7530\\u72ac"}, {"dog_id": 2, "image": "http://www.bsbj.net/iv/dogs/dog2.jpg", "sort": 8, "date": "2016-08-08 16:12:47", "detail": "\\u963f\\u62c9\\u65af\\u52a0\\u96ea\\u6a47\\u72ac\\uff08\\u82f1\\u8bed\\uff1aAlaskan Malamute\\uff09\\u53c8\\u79f0\\u963f\\u62c9\\u65af\\u52a0\\u9a6c\\u62c9\\u7a46\\uff0c\\u662f\\u6700\\u53e4\\u8001\\u7684\\u96ea\\u6a47\\u72ac\\u4e4b\\u4e00\\u3002", "name": "\\u963f\\u62c9\\u65af\\u52a0\\u96ea\\u6a47\\u72ac"}, {"dog_id": 3, "ima
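
The \u escapes in the output above appear because json.dumps() escapes non-ASCII characters by default. As a small sketch, passing ensure_ascii=False keeps them readable. The dictionary below is a cut-down sample shaped like the downloaded data, so that no network request is needed.

```python
import json

# Cut-down sample in the same shape as the downloaded data.
data = {'dogs': [{'dog_id': 1, 'name': '秋田犬'}]}

escaped = json.dumps(data)                       # non-ASCII becomes \uXXXX
readable = json.dumps(data, ensure_ascii=False)  # non-ASCII kept as-is

print(escaped)
print(readable)

# Both strings decode back to the same Python object.
same = json.loads(escaped) == json.loads(readable) == data
print(same)
```

Either form is valid JSON; the escaped form is safer when the output has to pass through ASCII-only channels.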

In [38]:
dogs = json.loads(dogs_json)
dogs

{'dogs': [{'date': '2016-08-08 16:12:47',
   'detail': '秋田犬（日语：秋田犬／あきたいぬ／アキタイヌ Akita Inu *）是日本犬的一种，是国家天然纪念物之六种日本犬中唯一的大型犬种。',
   'dog_id': 1,
   'image': 'http://www.bsbj.net/iv/dogs/dog1.jpg',
   'name': '秋田犬',
   'sort': 6},
  {'date': '2016-08-08 16:12:47',
   'detail': '阿拉斯加雪橇犬（英语：Alaskan Malamute）又称阿拉斯加马拉穆，是最古老的雪橇犬之一。',
   'dog_id': 2,
   'image': 'http://www.bsbj.net/iv/dogs/dog2.jpg',
   'name': '阿拉斯加雪橇犬',
   'sort': 8},
  {'date': '2016-08-08 16:12:47',
   'detail': '美国爱斯基摩犬（英语：American Eskimo Dog）是一种来自德国的玩赏犬犬种。',
   'dog_id': 3,
   'image': 'http://www.bsbj.net/iv/dogs/dog3.jpg',
   'name': '美国爱斯基摩犬',
   'sort': 9},
  {'date': '2016-08-08 16:12:47',
   'detail': '都柏文，又称杜宾犬（德语：Dobermann，英语：Doberman Pinscher）是原产于德国的一种中大型犬，大约在1890年由Karl Friedrich Louis Dobermann所培育出来。是最常被用来作为军事用途的军犬。',
   'dog_id': 4,
   'image': 'http://www.bsbj.net/iv/dogs/dog4.jpg',
   'name': '杜宾犬',
   'sort': 7},
  {'date': '2016-08-08 16:12:47',
   'detail': '英国可卡犬是猎犬的一个品种。英国可卡犬是一个活跃的、和蔼的猎犬，有着很好的身材和 马肩隆。',
 

* YAML