## Opening and closing files

In [3]:
f = open('./hello.py') # open() opens the file and returns a file-like object

In [4]:
f.read()   # read the file with read()

"print('hello world')\n"

In [5]:
f.close() # close() closes the file

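## Operations on file objects

* Read
* Write

Which operations are allowed depends strongly on the mode the file was opened with.

How the open mode relates to the allowed operations: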

Character Meaning
------------------------------------------------------------------------
* 'r'       open for reading (default)
* 'w'       open for writing, **truncating the file first**
* 'x'       create a new file and open it for writing
* 'a'       open for writing, appending to the end of the file if it exists
* 'b'       binary mode
* 't'       text mode (default)
* '+'       open a disk file for updating (reading and writing)
* 'U'       universal newline mode (deprecated)


In [10]:
f = open('./hello.py', mode='r')

In [11]:
f.write('test')  # mode=r: not writable

UnsupportedOperation: not writable

In [12]:
f.read() # mode=r: readable

"print('hello world')\n"

In [13]:
f.close()

In [15]:
f = open('./not_exist.txt', mode='r') # mode=r: raises FileNotFoundError if the file does not exist

FileNotFoundError: [Errno 2] No such file or directory: './not_exist.txt'

In [16]:
%cat hello.py

print('hello world')


In [17]:
f = open('./hello.py', mode='w')

In [18]:
f.read() # mode=w: not readable

UnsupportedOperation: not readable

In [19]:
f.write('abcd') # mode=w: writable

4

In [20]:
f.close()

In [22]:
%cat hello.py # mode=w truncates the original file

abcd

In [23]:
f = open('./hello.py', mode='w') # mode=w truncates the file on open, even if nothing is written

In [24]:
f.close()

In [25]:
%cat hello.py

In [26]:
f = open('./not_exist.txt', mode='w') # mode=w creates the file if it does not exist

In [27]:
f.close()

In [28]:
%ls ./not_exist.txt

./not_exist.txt


In [30]:
f = open('./hello.py', mode='x') # mode=x: raises FileExistsError if the file exists

FileExistsError: [Errno 17] File exists: './hello.py'

In [31]:
%rm ./not_exist.txt

In [32]:
f = open('./not_exist.txt', mode='x') # mode=x always creates a new file

In [33]:
f.read() # mode=x: not readable

UnsupportedOperation: not readable

In [34]:
f.write('abcd') # mode=x: writable

4

In [35]:
f.close()

In [36]:
%cat not_exist.txt

abcd

In [38]:
%cat hello.py

print('hell world')


In [39]:
f = open('./hello.py', mode='a')

In [40]:
f.read() # mode=a: not readable

UnsupportedOperation: not readable

In [41]:
f.write('abcd') # mode=a: writable

4

In [42]:
f.close()

In [43]:
%cat hello.py # mode=a appends written content to the end of the file

print('hell world')
abcd

In [44]:
%rm not_exist.txt

In [45]:
f = open('./not_exist.txt', mode='a')

In [46]:
f.write('abcd')

4

In [47]:
f.close()

In [48]:
%cat not_exist.txt

abcd

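Modes that control reading and writing:

* r  read-only; the file must exist
* w  write-only; truncates the file first and creates it if it does not exist
* x  write-only; the file must not exist
* a  write-only; appends to the end of the file and creates it if it does not exist

* Reading and writing: only r is readable and not writable; the others are writable and not readable
* Missing file: only r raises an exception; the others create a new file
* Existing file: only x raises an exception
* Existing content: only w truncates the file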
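The four basic modes can be checked in one pass. The sketch below is only an illustration: the helper name `probe_modes` and the file name `demo.txt` are made up for this example, and everything runs inside a temporary directory so no real file is touched.

```python
import os
import tempfile


def probe_modes():
    """Return a dict recording how r, x, a and w behave (hypothetical helper)."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'demo.txt')

        # 'r' requires the file to exist.
        try:
            open(path, 'r')
        except FileNotFoundError:
            results['r_missing'] = 'FileNotFoundError'

        # 'x' creates the file, then refuses to open it a second time.
        with open(path, 'x') as f:
            f.write('one')
        try:
            open(path, 'x')
        except FileExistsError:
            results['x_existing'] = 'FileExistsError'

        # 'a' keeps the existing content and appends to it.
        with open(path, 'a') as f:
            f.write('two')
        with open(path) as f:
            results['after_append'] = f.read()

        # 'w' truncates as soon as the file is opened, even with no write.
        with open(path, 'w'):
            pass
        with open(path) as f:
            results['after_w'] = f.read()
    return results


mode_results = probe_modes()
print(mode_results)
```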

In [53]:
f = open('./hello.py', mode='rt')

In [54]:
s = f.read() # mode=t: read() returns a str

In [57]:
type(s)

str

In [58]:
f.close()

In [65]:
f = open('./hello.py', mode='rb')

In [66]:
s = f.read() # mode=b: read() returns bytes

In [67]:
s

b"print('hell world')\nabcd"

In [68]:
type(s)

bytes

In [69]:
f.close()

* mode=t  operates on characters
* mode=b  operates on bytes
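A small sketch of the difference between the two modes, assuming UTF-8. The session here relies on the platform's default encoding; the example passes `encoding='utf-8'` explicitly so the byte count does not depend on the machine. The file name is a placeholder.

```python
import os
import tempfile

text = '马哥教育'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.txt')

    # Text mode: write() takes a str and returns the number of characters.
    with open(path, 'wt', encoding='utf-8') as f:
        chars_written = f.write(text)

    # Binary mode: read() returns the raw bytes stored on disk.
    with open(path, 'rb') as f:
        raw = f.read()

# Decoding the bytes by hand gives back the original string.
decoded = raw.decode('utf-8')
print(chars_written, len(raw), decoded)
```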

In [70]:
f = open('hello.py', mode='wt')

In [71]:
f.write('马哥教育')

4

In [72]:
f.close()

In [73]:
f = open('hello.py', mode='wb')

In [74]:
f.write('马哥教育') # mode=b: write() requires a bytes argument

TypeError: a bytes-like object is required, not 'str'

In [75]:
f.write('马哥教育'.encode()) # written as bytes: 12 bytes

12

In [76]:
f.close()

In [77]:
f = open('hello.py', mode='rw') # exactly one of r, w, x, a may be used

ValueError: must have exactly one of create/read/write/append mode

In [78]:
f = open('hello.py', mode='r+') # mode=r+: readable and writable

In [79]:
f.read()

'马哥教育'

In [80]:
f.write('haha')

4

In [81]:
f.close()

In [82]:
%cat hello.py

马哥教育haha

In [83]:
f = open('hello.py', mode='w+') # mode=w+: readable and writable; truncates the file

In [84]:
f.read()

''

In [85]:
f.write('haha')

4

In [86]:
f.close()

In [87]:
%cat hello.py

haha

In [88]:
f = open('hello.py', mode='r+')

In [89]:
f.write('he')

2

In [90]:
f.read()

'ha'

In [91]:
f.close()

In [92]:
%cat hello.py

heha

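When a file is opened, the interpreter keeps a pointer to a position in the file.

Reads and writes always start at the pointer, move forward, and advance the pointer.

With mode=r, the pointer starts at 0 (the beginning of the file).

With mode=a, the pointer starts at EOF (the end of the file).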

In [93]:
f = open('hello.py', mode='a+')

In [94]:
f.read()

''

In [95]:
f.write('heihei')

6

In [96]:
f.close()

In [97]:
%cat hello.py

hehaheihei

In [98]:
f = open('hello.py', mode='+') # + cannot be used alone; mode must contain exactly one of r, w, x, a

ValueError: Must have exactly one of create/read/write/append mode and at most one plus

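When mode contains +, the missing operation is added: a read-only mode also becomes writable, and a write-only mode also becomes readable. + changes nothing else about the mode's behaviour.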
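The three `+` variants can be compared side by side. This is a minimal sketch with a placeholder file in a temporary directory; it only restates what the session above already demonstrated.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.txt')
    with open(path, 'w') as f:
        f.write('haha')

    # r+: the pointer starts at 0, so the write overwrites the first characters.
    with open(path, 'r+') as f:
        f.write('he')
    with open(path) as f:
        after_r_plus = f.read()

    # a+: the pointer starts at EOF, so read() is empty until we seek back.
    with open(path, 'a+') as f:
        read_at_eof = f.read()
        f.seek(0)
        read_from_start = f.read()

    # w+: readable, but the file was already truncated on open.
    with open(path, 'w+') as f:
        read_after_truncate = f.read()

print(after_r_plus, repr(read_at_eof), read_from_start, repr(read_after_truncate))
```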

In [100]:
f = open('hello.py') # rt

In [101]:
f.tell() # get the current position of the file pointer

0

In [102]:
f.read()

'hehaheihei'

In [103]:
f.tell()

10

In [104]:
f.close()

In [105]:
f = open('hello.py', mode='a') # at

In [106]:
f.tell()

10

In [107]:
f.close()

In [108]:
f = open('hello.py') # rt

In [109]:
help(f.seek)

Help on built-in function seek:

seek(cookie, whence=0, /) method of _io.TextIOWrapper instance
    Change stream position.
    
    Change the stream position to the given byte offset. The offset is
    interpreted relative to the position indicated by whence.  Values
    for whence are:
    
    * 0 -- start of stream (the default); offset should be zero or positive
    * 1 -- current stream position; offset may be negative
    * 2 -- end of stream; offset is usually negative
    
    Return the new absolute position.



In [110]:
f.seek(4, 0)

4

In [111]:
f.tell()

4

In [112]:
f.read()

'heihei'

In [113]:
f.seek(0, 0)

0

In [114]:
f.read()

'hehaheihei'

In [115]:
f.tell()

10

In [116]:
f.seek(2)

2

In [117]:
f.tell()

2

In [118]:
f.seek(4, 1) # not supported

UnsupportedOperation: can't do nonzero cur-relative seeks

In [119]:
f.seek(4, 2) # not supported

UnsupportedOperation: can't do nonzero end-relative seeks

In [120]:
f.seek(0, 1) # supported

2

In [121]:
f.seek(0, 2) # supported

10

In [122]:
f.tell()

10

In [123]:
f.close()

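#### mode=t

* The file pointer moves by an offset that behaves like a byte offset for this UTF-8 file; officially it is an opaque value that should come from tell()
* When whence is start (0, the default), offset should be zero or a position returned by tell(); it cannot be negative
* When whence is current (1) or end (2), offset must be 0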

In [124]:
f = open('hello.py', mode='w')

In [125]:
f.write('马哥教育')

4

In [126]:
f.close()

In [127]:
f = open('hello.py', mode='rb') # rb

In [128]:
f.tell()

0

In [132]:
f.seek(3) # operates in bytes

3

In [133]:
f.read().decode()

'哥教育'

In [134]:
f.seek(3)

3

In [135]:
f.seek(3, 1) # whence=current (1): offset may be any integer

6

In [136]:
f.read().decode()

'教育'

In [137]:
f.seek(3, 2)

15

In [138]:
f.read().decode()

''

In [139]:
f.seek(-3, 2) # whence=end (2): offset may be any integer

9

In [140]:
f.read().decode()

'育'

In [141]:
f.seek(13) # seeking past the end is allowed

13

In [142]:
f.tell()

13

In [144]:
f.seek(-13, 2) # seeking before the start is not allowed

OSError: [Errno 22] Invalid argument

In [145]:
f.close()

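#### mode=b

* The file pointer moves in bytes
* When whence is start (0, the default), offset may be any non-negative integer
* When whence is current (1) or end (2), offset may be any integer, as long as the resulting position is not negative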
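The seek rules for the two modes can be put next to each other. The sketch below uses the named constants `io.SEEK_SET`, `io.SEEK_CUR` and `io.SEEK_END` instead of 0, 1 and 2; the file name is a placeholder.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.txt')
    with open(path, 'wb') as f:
        f.write('马哥教育'.encode('utf-8'))  # 12 bytes, 3 per character

    # Binary mode: every whence value accepts a nonzero offset.
    with open(path, 'rb') as f:
        f.seek(3, io.SEEK_SET)
        from_start = f.read().decode('utf-8')
        f.seek(-3, io.SEEK_END)
        from_end = f.read().decode('utf-8')
        f.seek(0)
        f.seek(6, io.SEEK_CUR)
        from_current = f.read().decode('utf-8')

    # Text mode: a nonzero offset relative to the current position is rejected.
    with open(path, 'rt', encoding='utf-8') as f:
        try:
            f.seek(3, io.SEEK_CUR)
            text_error = None
        except io.UnsupportedOperation as exc:
            text_error = type(exc).__name__

print(from_start, from_end, from_current, text_error)
```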

In [146]:
f = open('hello.py', mode='a+')

In [147]:
f.tell()

12

In [148]:
f.close()

In [149]:
f = open('hello.py')

In [150]:
f.tell()

0

In [151]:
f.read()

'马哥教育'

In [153]:
f.tell() # here tell() counts bytes, not characters (12 for 4 characters)

12

In [154]:
f.close()

In [155]:
f = open('hello.py', mode='a+')

In [156]:
f.seek(0, 1)

12

In [157]:
f.seek(0)

0

In [158]:
f.seek(3)

3

In [159]:
f.read()

'哥教育'

In [160]:
f.seek(13)

13

In [161]:
f.write('abc')

3

In [162]:
f.seek(0)

0

In [163]:
f.read()

'马哥教育abc'

In [164]:
f.tell()

15

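Seeking past the end of the file does not raise an exception, and tell() then reports a position beyond EOF. The file above was opened in append mode, so the write still landed at EOF.

In append mode, writes always start at EOF, whatever tell() reports. In the other writable modes a write starts at the current position; writing after seeking past EOF extends the file and leaves a gap that reads back as zero bytes.

* The file pointer works in bytes
* tell() returns the current file pointer position
* seek() moves the file pointer
* The whence argument: SEEK_SET (0) moves offset bytes from the start, SEEK_CUR (1) moves offset bytes from the current position, SEEK_END (2) moves offset bytes from EOF (so offset is usually negative)
* offset is an integer
* In text mode, when whence is SEEK_CUR or SEEK_END, offset must be 0
* The file pointer cannot be negative
* Reads start at the file pointer (pos) and move forward
* Writes start at the file pointer and move forward, except in append mode
* In append mode, writes always start at EOF, wherever the file pointer is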
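One detail worth checking by hand is where a write lands after a seek. The sketch below, using a placeholder file, contrasts append mode with `r+`: in append mode the write goes to EOF regardless of the pointer, while in `r+` a write past EOF extends the file and the skipped range reads back as zero bytes.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.bin')
    with open(path, 'wb') as f:
        f.write(b'abc')

    # Append mode: seek(0) affects reads, but the write still lands at EOF.
    with open(path, 'a+b') as f:
        f.seek(0)
        f.write(b'123')
    with open(path, 'rb') as f:
        after_append = f.read()

    # r+ mode: the file is 6 bytes long; writing at offset 8 leaves a 2-byte gap.
    with open(path, 'r+b') as f:
        f.seek(8)
        f.write(b'Z')
    with open(path, 'rb') as f:
        after_gap = f.read()

print(after_append, after_gap)
```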

In [168]:
f.seek(2)

2

In [169]:
f.read()

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xac in position 0: invalid start byte

In [170]:
f.close()

In [171]:
%cat hello.py

马哥教育abc

In [172]:
f = open('hello.py', 'a+')

In [173]:
f.tell()

15

In [174]:
f.seek(0)

0

In [176]:
f.tell()

0

In [178]:
f.write('123')

3

In [179]:
f.close()

In [180]:
%cat hello.py

马哥教育abc123

In [181]:
f = open('hello.py', 'a+')

In [182]:
f.seek(0)

0

In [183]:
f.read()

'马哥教育abc123'

In [184]:
f.write('abc')

3

In [185]:
f.seek(0)

0

In [186]:
f.write('abc')

3

In [187]:
f.close()

In [188]:
%cat hello.py

马哥教育abc123abcabc

In [190]:
f = open('hello.py', 'wb')

In [191]:
f.write(b'abc')

3

In [192]:
%cat hello.py

In [193]:
f.flush() # flush() flushes the buffer

In [194]:
%cat hello.py

abc

In [195]:
f.write(b'abc')

3

In [196]:
%cat hello.py

abc

In [197]:
f.close() # close() also flushes the buffer

In [198]:
%cat hello.py

abcabc

In [199]:
f = open('hello.py', 'wb', buffering=5)

In [201]:
f.write(b'abc')

3

In [202]:
%cat hello.py

In [203]:
f.write(b'abc') # if the buffer cannot hold the new bytes, it is flushed first and the new bytes are then buffered

3

In [204]:
%cat hello.py

abc

In [205]:
f.write(b'xx')

2

In [206]:
%cat hello.py

abc

In [208]:
f.close()

In [209]:
f = open('hello.py', 'wb', buffering=0)

In [210]:
f.write(b'abc')

3

In [211]:
%cat hello.py

abc

In [212]:
f.close()

In [213]:
f = open('hello.py', 'wb', buffering=5)

In [214]:
f.write(b'abcdefgh')

8

In [215]:
%cat hello.py

abcdefgh

In [216]:
f.close()

In [217]:
import io

In [218]:
io.DEFAULT_BUFFER_SIZE

8192

In [232]:
f = open('hello.py', 'wt', buffering=1) # text mode: buffering=1 selects line buffering

In [233]:
f.write('abc')

3

In [234]:
%cat hello.py

In [235]:
f.write('\n') # text mode: writing a newline flushes the buffer

1

In [236]:
%cat hello.py

abc


In [237]:
f.close()

In [238]:
f = open('hello.py', 'wt', buffering=5)

In [239]:
f.write('abc')

3

In [240]:
%cat hello.py

In [241]:
f.write('\n')

1

In [242]:
%cat hello.py

In [243]:
f.write('abc')

3

In [244]:
%cat hello.py

In [245]:
f.write('a' * io.DEFAULT_BUFFER_SIZE)

8192

In [246]:
%cat hello.py

abc
abcaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

In [247]:
f.write('a' * (io.DEFAULT_BUFFER_SIZE -1))

8191

In [248]:
f.close()

In [249]:
f = open('hello.py', 'wt', buffering=5) # text mode with buffering > 1: the effective buffer is io.DEFAULT_BUFFER_SIZE; when it fills up, the buffer is flushed together with the current write

In [250]:
f.write('a' * (io.DEFAULT_BUFFER_SIZE))

8192

In [251]:
%cat hello.py

In [252]:
f.write('b')

1

In [253]:
%cat hello.py

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

In [254]:
f.close()

In [255]:
f = open('hello.py', 'wt', buffering=0) # text mode cannot be unbuffered

ValueError: can't have unbuffered text I/O

In [256]:
import sys

In [258]:
sys.stderr.write('abc')

abc

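#### buffering=-1
* Binary mode: DEFAULT_BUFFER_SIZE
* Text mode: DEFAULT_BUFFER_SIZE

#### buffering=0
* Binary mode: unbuffered
* Text mode: not allowed

#### buffering=1
* Binary mode: line buffering is not supported, so the default buffer size is used
* Text mode: line buffering

#### buffering>1
* Binary mode: a buffer of buffering bytes
* Text mode: the text layer keeps its own buffer, observed above to hold io.DEFAULT_BUFFER_SIZE characters, so small values have no visible effect

* Binary mode: before each write, check whether the remaining buffer space can hold the new bytes; if not, flush first and then put the new bytes into the buffer. If the new bytes are larger than the whole buffer, they are written straight through
* Text mode: with line buffering, flush on every newline; without it, once the buffered text plus the new text exceeds the buffer size, the buffer and the new text are flushed together

Special file objects such as sys.stderr have their own flushing behaviour.

flush() and close() force the buffer to be flushed.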
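Instead of `%cat`, the effect of buffering can be observed from the file size on disk. This is a sketch under the assumption that `os.path.getsize` reflects only data already handed to the operating system; the file name is a placeholder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.bin')

    # Default buffering: three bytes stay in the buffer until flush().
    with open(path, 'wb') as f:
        f.write(b'abc')
        size_before_flush = os.path.getsize(path)
        f.flush()
        size_after_flush = os.path.getsize(path)

    # buffering=0 (binary only): every write goes straight to the file.
    with open(path, 'wb', buffering=0) as f:
        f.write(b'abc')
        size_unbuffered = os.path.getsize(path)

print(size_before_flush, size_after_flush, size_unbuffered)
```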

In [259]:
f = open('hello.py')

In [265]:
f.readline() # read one line

''

In [266]:
f.readlines()

[]

In [267]:
f.close()

In [268]:
%cat hello.py

In [269]:
f = open('hello.py', 'r+')

In [270]:
f.readable()

True

In [271]:
help(f.read)

Help on built-in function read:

read(size=-1, /) method of _io.TextIOWrapper instance
    Read at most n characters from stream.
    
    Read from underlying buffer until we have n characters or we hit EOF.
    If n is negative or omitted, read until EOF.



In [272]:
f.read(4)

'aaaa'

In [273]:
f.read(4)

'\nbbb'

In [274]:
f.read(-1)

'b\ncccc\ndddd\n'

In [275]:
f.seek(0)

0

In [276]:
f.read(0)

''

In [277]:
f.readline()

'aaaa\n'

In [278]:
f.readline()

'bbbb\n'

In [279]:
help(f.readline)

Help on built-in function readline:

readline(size=-1, /) method of _io.TextIOWrapper instance
    Read until newline or EOF.
    
    Returns an empty string if EOF is hit immediately.



In [281]:
f.readline(1) # reads min(size, EOF-pos) characters, returning early at a newline

'c'

In [282]:
f.seek(0)

0

In [283]:
f.readlines()

['aaaa\n', 'bbbb\n', 'cccc\n', 'dddd\n']

In [284]:
f.seek(0)

0

In [285]:
for line in f: # a file object is iterable; each iteration yields one line
    print(line)

aaaa

bbbb

cccc

dddd



In [286]:
f.writable()

True

In [287]:
f.write('abc\n')

4

In [288]:
help(f.write)

Help on built-in function write:

write(text, /) method of _io.TextIOWrapper instance
    Write string to stream.
    Returns the number of characters written (which is always equal to
    the length of the string).



In [289]:
help(f.writelines)

Help on built-in function writelines:

writelines(lines, /) method of _io.TextIOWrapper instance



In [290]:
f.writelines(['abc\n', 'cbd\n']) # write several strings in one call

In [291]:
f.flush()

In [292]:
%cat hello.py

aaaa
bbbb
cccc
dddd
abc
abc
cbd


In [294]:
f.seekable() # whether the file pointer can be moved

True

In [296]:
f.fileno()

56

In [297]:
f.isatty()

False

In [298]:
sys.stderr.isatty()

False

In [299]:
f.mode

'r+'

In [300]:
f.name

'hello.py'

In [302]:
f.close()

In [303]:
f = open('hello.py', mode='r+b')

In [304]:
f.readline()

b'aaaa\n'

In [305]:
f.readlines()

[b'bbbb\n', b'cccc\n', b'dddd\n', b'abc\n', b'abc\n', b'cbd\n']

In [306]:
help(f.readinto)

Help on built-in function readinto:

readinto(buffer, /) method of _io.BufferedRandom instance



In [311]:
f2 = open('haha.txt', 'r+b')

In [318]:
f.seek(0)

0

In [321]:
f.close()

In [1]:
lst = []

In [2]:
for x in range(2000):
    lst.append(open('haha.txt'))

OSError: [Errno 24] Too many open files: 'haha.txt'

In [3]:
len(lst)

971

In [4]:
open('haha.txt')

OSError: [Errno 24] Too many open files: 'haha.txt'

In [5]:
for x in lst:
    x.close()

In [6]:
f = open('haha.txt')

In [7]:
f.close()

In [8]:
f = open('haha.txt')
f.write('12345')
f.close()

UnsupportedOperation: not writable

In [10]:
f.fileno()

53

In [11]:
f.closed

False

In [12]:
f.close()

In [13]:
f = open('haha.txt')
try:
    f.write('12345')
finally:
    f.close()

UnsupportedOperation: not writable

In [14]:
f.closed

True

## Context management

In [15]:
with open('haha.txt') as f:
    f.write('')

UnsupportedOperation: not writable

In [16]:
with open('haha.txt') as fh:
    pass

In [17]:
fh

<_io.TextIOWrapper name='haha.txt' mode='r' encoding='UTF-8'>

In [18]:
fh.closed

True

The context manager closes the file automatically on exit, but it does not open a new scope.
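Both properties can be seen in one short sketch: the file is closed even when the body raises, and the name bound by `as` is still visible afterwards. The file name is a placeholder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.txt')
    with open(path, 'w') as f:
        f.write('12345')

    error_name = None
    try:
        with open(path) as fh:   # read-only by default
            fh.write('x')        # raises io.UnsupportedOperation
    except OSError as exc:       # UnsupportedOperation is a subclass of OSError
        error_name = type(exc).__name__

    # fh is still bound here, and the with statement has already closed it.
    closed_after_error = fh.closed

print(error_name, closed_after_error)
```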

In [19]:
f = open('haha.txt')
with f:
    pass

In [20]:
f.closed

True

In [21]:
f.close()

In [22]:
f.name

'haha.txt'

In [23]:
from io import StringIO

In [25]:
sio = StringIO()

In [26]:
sio.readable()

True

In [27]:
sio.writable()

True

In [28]:
sio.seekable()

True

In [29]:
sio.write('abcd')

4

In [30]:
sio.seek(0)

0

In [31]:
sio.read()

'abcd'

File-like objects

In [32]:
from io import BytesIO

In [33]:
bio = BytesIO()

In [35]:
bio.write(b'abcd')

4

In [36]:
bio.seek(0)

0

In [37]:
bio.read()

b'abcd'

In [39]:
bio.getvalue() # getvalue() returns the entire contents, wherever the file pointer is

b'abcd'
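The practical value of file-like objects is that code written against the file interface accepts them unchanged. In the sketch below, `count_lines` is a hypothetical helper invented for this example; it works on anything that can be iterated line by line.

```python
import io


def count_lines(stream):
    """Count lines in any object that yields lines when iterated (hypothetical helper)."""
    return sum(1 for _ in stream)


# The same function works for in-memory text and in-memory bytes.
text_count = count_lines(io.StringIO('a\nb\nc\n'))
bytes_count = count_lines(io.BytesIO(b'a\nb\n'))

# getvalue() ignores the file pointer: read() has already moved it to EOF.
bio = io.BytesIO(b'abcd')
bio.read()
value_at_eof = bio.getvalue()

print(text_count, bytes_count, value_at_eof)
```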

In [40]:
import socket

In [41]:
bio.close()

In [42]:
bio.getvalue()

ValueError: I/O operation on closed file.

In [43]:
bio = BytesIO()

In [44]:
buf = bio.getbuffer()

In [45]:
bio.close()

BufferError: Existing exports of data: object cannot be re-sized

In [46]:
help(bio.getbuffer)

Help on built-in function getbuffer:

getbuffer() method of _io.BytesIO instance
    Get a read-write view over the contents of the BytesIO object.



In [47]:
buf.release()

In [48]:
bio.close()

## Path operations

In [49]:
import os

os.path manipulates paths as strings.

In [51]:
import pathlib # manipulates paths in an object-oriented way

In [52]:
cwd = pathlib.Path('.')

In [53]:
cwd

PosixPath('.')

### Directory operations

In [55]:
cwd.is_dir()

True

In [56]:
cwd.iterdir()

<generator object Path.iterdir at 0x7fde380dde60>

In [58]:
for f in cwd.iterdir():  # iterdir() does not recurse into subdirectories
    print(type(f))
    print(f)

<class 'pathlib.PosixPath'>
.python-version
<class 'pathlib.PosixPath'>
.ipynb_checkpoints
<class 'pathlib.PosixPath'>
基本语法.ipynb
<class 'pathlib.PosixPath'>
hello.py
<class 'pathlib.PosixPath'>
第一周作业解析.ipynb
<class 'pathlib.PosixPath'>
列表及其常用操作.ipynb
<class 'pathlib.PosixPath'>
元组及其操作.ipynb
<class 'pathlib.PosixPath'>
字符串及其操作.ipynb
<class 'pathlib.PosixPath'>
字符串格式化.ipynb
<class 'pathlib.PosixPath'>
字符串与bytes.ipynb
<class 'pathlib.PosixPath'>
线性结构与切片.ipynb
<class 'pathlib.PosixPath'>
第二周作业解析.ipynb
<class 'pathlib.PosixPath'>
解构与封装.ipynb
<class 'pathlib.PosixPath'>
集合与集合操作.ipynb
<class 'pathlib.PosixPath'>
字典及其操作.ipynb
<class 'pathlib.PosixPath'>
拉链法实现字典.ipynb
<class 'pathlib.PosixPath'>
解析式.ipynb
<class 'pathlib.PosixPath'>
可迭代对象与迭代器.ipynb
<class 'pathlib.PosixPath'>
第三周作业解析.ipynb
<class 'pathlib.PosixPath'>
函数.ipynb
<class 'pathlib.PosixPath'>
第四周作业解析.ipynb
<class 'pathlib.PosixPath'>
高阶函数.ipynb
<class 'pathlib.PosixPath'>
类型提示.ipynb
<class 'pathlib.PosixPath'>
functools.ipynb
<class

In [59]:
cwd.mkdir('abcd') # wrong: the first argument of mkdir() is the mode, not a name

TypeError: an integer is required (got type str)

In [60]:
d = pathlib.Path('./abcd')

In [61]:
d.exists()

False

In [62]:
d.mkdir(0o755)

In [64]:
%ls -ld ./abcd

drwxr-xr-x 2 comyn comyn 6 Jan 15 09:41 ./abcd/


In [65]:
help(d.rmdir) # rm

Help on method rmdir in module pathlib:

rmdir() method of pathlib.PosixPath instance
    Remove this directory.  The directory must be empty.



In [66]:
help(d.mkdir)

Help on method mkdir in module pathlib:

mkdir(mode=511, parents=False, exist_ok=False) method of pathlib.PosixPath instance



In [69]:
d = pathlib.Path('./ab/cd/ef')

In [70]:
d.mkdir()

FileNotFoundError: [Errno 2] No such file or directory: 'ab/cd/ef'

In [71]:
d.mkdir(parents=True) # creates parent directories automatically, like mkdir -p

In [73]:
d.mkdir(parents=True, exist_ok=True) # mkdir -p

In [76]:
d = pathlib.Path('./abcd/')

In [77]:
d.rmdir()

In [78]:
d = pathlib.Path('./ab')

In [79]:
d.rmdir() # the directory must be empty

OSError: [Errno 39] Directory not empty: 'ab'
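The mkdir and rmdir rules above can be replayed safely inside a temporary directory. This is a sketch only; the directory names mirror the session but are otherwise arbitrary.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    nested = root / 'ab' / 'cd' / 'ef'

    # Without parents=True, a missing parent directory is an error.
    try:
        nested.mkdir()
        missing_parent = None
    except FileNotFoundError as exc:
        missing_parent = type(exc).__name__

    nested.mkdir(parents=True)
    nested.mkdir(parents=True, exist_ok=True)  # second call is a no-op
    created = nested.is_dir()

    # rmdir() refuses to remove a directory that still has entries.
    try:
        (root / 'ab').rmdir()
        not_empty = False
    except OSError:
        not_empty = True

    # Remove from the innermost directory outwards.
    for d in (nested, nested.parent, nested.parent.parent):
        d.rmdir()
    removed = not (root / 'ab').exists()

print(missing_parent, created, not_empty, removed)
```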

### General operations

In [80]:
f = pathlib.Path('./ab/cd/a.txt')

In [81]:
f.exists() # check whether the path exists

False

In [82]:
f.is_file() # when the path does not exist, every is_* method returns False

False

In [83]:
f.is_dir()

False

In [84]:
f = pathlib.Path('./hello.py')

In [85]:
f.is_file()

True

In [86]:
f.is_absolute()

False

In [87]:
f.absolute()

PosixPath('/home/comyn/workspace/hello.py')

In [89]:
f.absolute().as_uri()

'file:///home/comyn/workspace/hello.py'

In [90]:
help(f.chmod)

Help on method chmod in module pathlib:

chmod(mode) method of pathlib.PosixPath instance
    Change the permissions of the path, like os.chmod().



In [91]:
f.cwd()

PosixPath('/home/comyn/workspace')

In [93]:
f.drive # Windows only

''

In [94]:
f.expanduser()

PosixPath('hello.py')

In [95]:
pathlib.Path('~').expanduser()

PosixPath('/home/comyn')

In [96]:
f.home()

PosixPath('/home/comyn')

In [97]:
help(f.lchmod) # if the path is a symbolic link, changes the permissions of the link itself

Help on method lchmod in module pathlib:

lchmod(mode) method of pathlib.PosixPath instance
    Like chmod(), except if the path points to a symlink, the symlink's
    permissions are changed, rather than its target's.



In [98]:
f.name # basename

'hello.py'

In [99]:
f.home().name

'comyn'

In [101]:
f.owner()

'comyn'

In [107]:
f.home().parent # dirname

PosixPath('/home')

In [111]:
f.home().parts

('/', 'home', 'comyn')

In [113]:
f.home().root

'/'

In [114]:
f.suffix

'.py'

In [115]:
f.suffixes

['.py']

In [116]:
f.stat() # stat

os.stat_result(st_mode=33204, st_ino=47744811, st_dev=2051, st_nlink=1, st_uid=1000, st_gid=1000, st_size=32, st_atime=1484386325, st_mtime=1484386317, st_ctime=1484386317)

In [117]:
f.lstat() # like stat(), but does not follow symbolic links

os.stat_result(st_mode=33204, st_ino=47744811, st_dev=2051, st_nlink=1, st_uid=1000, st_gid=1000, st_size=32, st_atime=1484386325, st_mtime=1484386317, st_ctime=1484386317)

In [118]:
d = pathlib.Path('.')

In [120]:
for x in d.glob('**/*.py'): # wildcard matching
    print(x)

hello.py


In [121]:
help(d.rglob)

Help on method rglob in module pathlib:

rglob(pattern) method of pathlib.PosixPath instance
    Recursively yield all existing files (of any kind, including
    directories) matching the given pattern, anywhere in this subtree.



join

In [122]:
'/' + 'comyn' + '/' + 'workspace'

'/comyn/workspace'

In [123]:
pathlib.Path('/', 'home', 'comyn', 'workspace')

PosixPath('/home/comyn/workspace')

In [126]:
print(pathlib.PureWindowsPath('c:', '/windows', 'system32'))

c:\windows\system32


In [127]:
pathlib.Path('/', '/home', 'comyn')

PosixPath('/home/comyn')

In [128]:
def get_home(user):
    return pathlib.Path('/home', user)
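Paths can also be joined with the `/` operator. The sketch below uses `PurePosixPath` so that the result is the same on every platform; this `get_home` is a variant of the function above written for the example, and it shows that an absolute component discards everything before it.

```python
import posixpath
from pathlib import PurePosixPath


def get_home(user):
    """Variant of get_home() using the / operator (illustrative only)."""
    return PurePosixPath('/home') / user


workspace = get_home('comyn') / 'workspace'

# An absolute component resets the path, in pathlib and in os.path alike.
reset = PurePosixPath('/home', '/etc', 'passwd')
reset_str = posixpath.join('/home', '/etc', 'passwd')

print(workspace, reset, reset_str)
```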

### copy, move, rm

In [130]:
import shutil

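* shutil.copyfileobj  # operates on file objects
* shutil.copyfile     # copies content only
* shutil.copymode     # copies permission bits only
* shutil.copystat     # copies metadata only
* shutil.copy         # copies content and permissions: copyfile + copymode
* shutil.copy2        # copies content and metadata: copyfile + copystat

**The functions above work on files.**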
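The difference between `shutil.copy` and `shutil.copy2` shows up in the modification time. The sketch below sets an old timestamp on a placeholder source file and checks which copy keeps it; it assumes a filesystem that stores modification times with at least one-second precision.

```python
import os
import shutil
import tempfile

OLD_TIME = 1_000_000_000  # an arbitrary timestamp from 2001

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'src.txt')
    with open(src, 'w') as f:
        f.write('data')
    os.utime(src, (OLD_TIME, OLD_TIME))

    # copy(): content and permission bits; copy2(): content and metadata.
    plain = shutil.copy(src, os.path.join(tmp, 'copy.txt'))
    full = shutil.copy2(src, os.path.join(tmp, 'copy2.txt'))

    with open(plain) as f:
        plain_content = f.read()
    with open(full) as f:
        full_content = f.read()

    plain_keeps_mtime = int(os.path.getmtime(plain)) == OLD_TIME
    full_keeps_mtime = int(os.path.getmtime(full)) == OLD_TIME

print(plain_content, full_content, plain_keeps_mtime, full_keeps_mtime)
```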

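shutil.copytree # recursively copies a directory; the copy_function argument selects how each file is copied

* shutil.rmtree  # recursively deletes a directory. ignore_errors controls whether errors are ignored. onerror is the error handler and only takes effect when ignore_errors is False; with ignore_errors=False and no onerror, an error raises an exception. With ignore_errors=True, errors are silently ignored

shutil.move # if the destination is on the same filesystem, os.rename is used; otherwise the source is copied (with copytree for directories) and then removed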
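The directory-level functions can be tried together on a throwaway tree. This is a minimal sketch; all directory and file names are placeholders.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'src')
    os.makedirs(os.path.join(src, 'sub'))
    with open(os.path.join(src, 'sub', 'a.txt'), 'w') as f:
        f.write('data')

    # copytree() copies the whole directory; dst must not exist yet.
    dst = os.path.join(tmp, 'dst')
    shutil.copytree(src, dst)
    copied = os.path.isfile(os.path.join(dst, 'sub', 'a.txt'))

    # move() relocates the directory.
    moved = os.path.join(tmp, 'moved')
    shutil.move(dst, moved)
    move_ok = os.path.isdir(moved) and not os.path.exists(dst)

    # rmtree() deletes recursively; with ignore_errors=True a missing
    # directory is not reported as an error.
    shutil.rmtree(moved)
    shutil.rmtree(moved, ignore_errors=True)
    removed = not os.path.exists(moved)

print(copied, move_ok, removed)
```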

## Serialization and deserialization

* Serialization: converting an object into data
* Deserialization: converting data back into an object

In [131]:
import pickle

In [132]:
class A:
    def print(self):
        print('xxxx')

In [133]:
a = A()

In [134]:
pickle.dumps(a)

b'\x80\x03c__main__\nA\nq\x00)\x81q\x01.'

In [135]:
b = pickle.dumps(a)

In [136]:
pickle.loads(b)

<__main__.A at 0x7fde3812fd68>

In [137]:
a

<__main__.A at 0x7fde3812f2e8>

In [138]:
aa = pickle.loads(b)

In [139]:
a.print()

xxxx


In [140]:
aa.print()

xxxx


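When an object is deserialized, its class must be available in the loading program.

Although it is an object that gets pickled, what is actually serialized is only its data (plus a reference to its class), not the class's code.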
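This can be checked by looking inside the pickled bytes. In the sketch below, `Point` and its method `norm1` are invented for the example, and the code is assumed to run as a script so that pickle can find the class by name.

```python
import pickle


class Point:
    """Hypothetical class used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def norm1(self):
        return abs(self.x) + abs(self.y)


p = Point(3, -4)
payload = pickle.dumps(p)
q = pickle.loads(payload)

# The copy carries the same data but is a different object.
same_data = (q.x, q.y) == (p.x, p.y)
distinct = q is not p

# The payload names the class, but contains none of its method code.
class_named = b'Point' in payload
method_in_payload = b'norm1' in payload

print(same_data, distinct, class_named, method_in_payload, q.norm1())
```

Because only the class name is stored, `q.norm1()` works here solely because `Point` is still defined in the program that calls `pickle.loads`.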

In [141]:
obj = list(range(10))

In [142]:
obj

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [143]:
pickle.dumps(obj)

b'\x80\x03]q\x00(K\x00K\x01K\x02K\x03K\x04K\x05K\x06K\x07K\x08K\te.'

Serialization and deserialization deal with types and data.

In [144]:
class RPC:
    def __init__(self):
        self.data = []
    
    def server(self):
        self.data = list(range(10))
    
    def client(self):
        print(self.data)

In [145]:
s = RPC()

In [146]:
s.server()

In [147]:
s.data

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [148]:
pickle.dumps(s)

b'\x80\x03c__main__\nRPC\nq\x00)\x81q\x01}q\x02X\x04\x00\x00\x00dataq\x03]q\x04(K\x00K\x01K\x02K\x03K\x04K\x05K\x06K\x07K\x08K\tesb.'

The serialization protocol used by pickle is specific to Python.

In [149]:

import json

JSON             Python
-------------    -------
* object         dict
* array          list
* number (int)   int
* number (real)  float
* string         str

In [150]:
d = {'a': 1, 'b': [1, 2, 3]}

In [151]:
json.dumps(d)

'{"a": 1, "b": [1, 2, 3]}'

In [153]:
json.loads('{"a": 1, "b": [1, 2, 3]}')

{'a': 1, 'b': [1, 2, 3]}
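Because JSON has fewer types than Python, a round trip is not always lossless. The sketch below feeds in a few values that fall outside the mapping table; the sample dictionary is made up for the example.

```python
import json

original = {'a': 1, 'b': (1, 2, 3), 'c': 1.5, 'd': None, 1: 'int key'}

encoded = json.dumps(original)
decoded = json.loads(encoded)

# Tuples come back as lists, and non-string keys come back as strings.
print(encoded)
print(decoded)
```

If exact Python types matter, pickle keeps them, at the cost of being readable only from Python.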

In [154]:
pickle.dumps(d)

b'\x80\x03}q\x00(X\x01\x00\x00\x00aq\x01K\x01X\x01\x00\x00\x00bq\x02]q\x03(K\x01K\x02K\x03eu.'

In [156]:
help(json.loads)

Help on function loads in module json:

loads(s, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)
    Deserialize ``s`` (a ``str`` instance containing a JSON
    document) to a Python object.
    
    ``object_hook`` is an optional function that will be called with the
    result of any object literal decode (a ``dict``). The return value of
    ``object_hook`` will be used instead of the ``dict``. This feature
    can be used to implement custom decoders (e.g. JSON-RPC class hinting).
    
    ``object_pairs_hook`` is an optional function that will be called with the
    result of any object literal decoded with an ordered list of pairs.  The
    return value of ``object_pairs_hook`` will be used instead of the ``dict``.
    This feature can be used to implement custom decoders that rely on the
    order that the key and value pairs are decoded (for example,
    collections.OrderedDict will remember the ord

IDL: interface definition language

thrift protobuf avro

grpc