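- Opening and closing files
- Reading and writing files
- Context management
- File-like objects
- Path operations
- Serialization and deserialization

IO: input/output

In this session, IO refers specifically to file IO.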

## Opening and closing files

In [1]:
help(open)

Help on built-in function open in module io:

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
    Open file and return a stream.  Raise IOError upon failure.
    
    file is either a text or byte string giving the name (and the path
    if the file isn't in the current working directory) of the file to
    be opened or an integer file descriptor of the file to be
    wrapped. (If a file descriptor is given, it is closed when the
    returned I/O object is closed, unless closefd is set to False.)
    
    mode is an optional string that specifies the mode in which the file
    is opened. It defaults to 'r' which means open for reading in text
    mode.  Other common values are 'w' for writing (truncating the file if
    it already exists), 'x' for creating and writing to a new file, and
    'a' for appending (which on some Unix systems, means that all writes
    append to the end of the file regardless of the current seek position

In [3]:
f = open('./hello.py')

In [4]:
f

<_io.TextIOWrapper name='./hello.py' mode='r' encoding='UTF-8'>

In [5]:
f.read()

''

In [6]:
f.close()

In [7]:
f.read()

ValueError: I/O operation on closed file.
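
The outline above lists context management, and the cells above show that a file object is unusable once it is closed. As a minimal sketch (the scratch path under a temporary directory is an assumption made for illustration, not a file from this notebook), a `with` block closes the file automatically, so a forgotten `close()` cannot leak the handle:

```python
import os
import tempfile

# Hypothetical scratch file so the example does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "hello.py")

with open(path, "w", encoding="utf-8") as f:
    f.write("print('hello world')\n")

with open(path, encoding="utf-8") as f:
    content = f.read()
    open_inside = not f.closed   # still open inside the block

closed_after = f.closed          # closed automatically when the block exits

try:
    f.read()                     # same failure as calling read() after close()
except ValueError as exc:
    error_message = str(exc)

print(closed_after, error_message)
```

The file is closed even if the body of the `with` block raises an exception, which is the main reason to prefer it over a manual `close()`.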

## File object operations

- Read
- Write


In [8]:
f = open('./hello.py')

In [9]:
f.write('test') # mode='r' is read-only

UnsupportedOperation: not writable

In [10]:
f.close()

In [11]:
f = open('./hello.py', mode='r')

In [12]:
f.write('test')

UnsupportedOperation: not writable

In [13]:
f.read()

''

In [14]:
f.close()

In [15]:
f = open('./not_exist.py', mode='r')

FileNotFoundError: [Errno 2] No such file or directory: './not_exist.py'

In [25]:
f = open('./hello.py', mode='w') # mode='w': the file is not readable

In [26]:
%cat hello.py

In [27]:
f.read()

UnsupportedOperation: not readable

In [28]:
f.write('aaaaa')

5

In [29]:
f.close()

In [30]:
%cat hello.py

aaaaa

In [31]:
f = open('./hello.py', mode='w') # mode='w' truncates the file as soon as it is opened, even if nothing is written

In [32]:
f.close()

In [33]:
%cat hello.py

In [34]:
f = open('./not_exist.py', mode='w') # mode='w' truncates on open even if nothing is written, and creates the file if it does not exist

In [35]:
f.close()

In [36]:
%ls ./not_exist.py

./not_exist.py


In [37]:
f = open('./hello.py', mode='x') # raises an error when the file already exists

FileExistsError: [Errno 17] File exists: './hello.py'

In [38]:
%rm ./not_exist.py

In [39]:
f = open('./not_exist.py', mode='x') # mode='x' raises an error if the file exists; the new file is writable but not readable

In [40]:
f.read()

UnsupportedOperation: not readable

In [41]:
f.write('abcd')

4

In [42]:
f.close()

In [43]:
%cat not_exist.py

abcd

In [45]:
%cat hello.py

print('hello world')


In [46]:
f = open('./hello.py', mode='a') # mode='a': not readable, writable, and writes are appended

In [47]:
f.read()

UnsupportedOperation: not readable

In [48]:
f.write('abcd')

4

In [49]:
f.close()

In [50]:
%cat hello.py

print('hello world')
abcd

In [51]:
%rm ./not_exist.py

In [52]:
f = open('./not_exist.py', mode='a')

In [53]:
f.write('abcd')

4

In [54]:
f.close()

In [55]:
%cat not_exist.py

abcd

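Modes that control reading and writing:
- r: read-only; the file must exist
- w: write-only; truncates the file, and creates it if it does not exist
- x: write-only; the file must not exist
- a: write-only; appends to the end of the file, and creates it if it does not exist

Viewed along other dimensions:
- Readability and writability: only r is readable and not writable; all the others are writable and not readable
- When the file does not exist: only r raises an exception (FileNotFoundError); all the others create a new file
- When the file exists: only x raises an exception (FileExistsError)
- Effect on existing content: only w truncates the file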
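
The mode rules summarized above can be checked in one pass. This is a sketch that assumes a throwaway file (`demo.txt`, a hypothetical name) in a fresh temporary directory, so it does not depend on the `hello.py` used in the cells:

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.txt")   # hypothetical file name

results = {}

# 'r' requires the file to exist.
try:
    open(path, "r")
except FileNotFoundError:
    results["r_missing"] = "FileNotFoundError"

# 'x' creates the file, and fails if it already exists.
with open(path, "x") as f:
    f.write("one")
try:
    open(path, "x")
except FileExistsError:
    results["x_existing"] = "FileExistsError"

# 'a' keeps the existing content and writes at the end.
with open(path, "a") as f:
    f.write("two")
with open(path) as f:
    results["after_append"] = f.read()

# 'w' truncates on open, even when nothing is written.
with open(path, "w"):
    pass
results["size_after_w"] = os.path.getsize(path)

print(results)
```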

In [56]:
%cat hello.py

print('hello world')
abcd

In [57]:
f = open('./hello.py', mode='a')

In [58]:
f.write('qwer')

4

In [60]:
f.close()

In [61]:
%cat hello.py

print('hello world')
abcdqwer

In [62]:
f = open('./hello.py', mode='rt') # 't' opens the file in text mode

In [63]:
s = f.read()

In [64]:
s

"print('hello world')\nabcdqwer"

In [65]:
f.close()

In [66]:
f = open('./hello.py', mode='rb')

In [67]:
s = f.read()

In [68]:
s

b"print('hello world')\nabcdqwer"

- mode=t: operates on characters (str)
- mode=b: operates on bytes

In [69]:
f.close()

In [70]:
f = open('hello.py',mode='wb')

In [73]:
f.write('马哥教育'.encode())

12

In [77]:
f.close()

In [75]:
%cat hello.py

马哥教育

In [78]:
f = open('hello.py',mode='wt')

In [79]:
f.write('马哥教育')

4

In [80]:
f.close()

In [81]:
%cat hello.py

马哥教育
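
The difference between the return values 12 and 4 above comes from the unit being counted. The following sketch repeats the experiment with an explicit UTF-8 encoding (the encoding and the temporary path are assumptions; the cells above rely on the platform default):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.py")  # hypothetical scratch file
text = "马哥教育"

# Binary mode: write() takes bytes and returns the number of bytes written.
with open(path, "wb") as f:
    bytes_written = f.write(text.encode("utf-8"))

# Text mode: write() takes str and returns the number of characters written.
with open(path, "wt", encoding="utf-8") as f:
    chars_written = f.write(text)

# The bytes on disk are identical either way; only the API unit differs.
with open(path, "rb") as f:
    raw = f.read()
with open(path, "rt", encoding="utf-8") as f:
    decoded = f.read()

print(bytes_written, chars_written, len(raw), len(decoded))
```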

In [82]:
f = open('./hello.py', mode='rw') # exactly one of r, w, x, a may be given

ValueError: must have exactly one of create/read/write/append mode

In [83]:
f = open('./hello.py', mode='r+') # mode='r+' is readable and writable; writes start at the current file pointer (at EOF here, after read())

In [84]:
f.read()

'马哥教育'

In [85]:
f.write('haha')

4

In [86]:
f.read()

''

In [87]:
f.close()

In [88]:
%cat hello.py

马哥教育haha

In [89]:
f = open('./hello.py', mode='w+') # w+ also truncates the file first

In [90]:
f.read()

''

In [91]:
f.write('haha')

4

In [92]:
f.close()

In [93]:
%cat hello.py

haha

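**Because w+ truncates the file first, r+ is usually the mode to use when opening an existing file for both reading and writing.**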

In [94]:
f = open('./hello.py', mode='r+')

In [95]:
f.write('he')

2

In [96]:
f.read()

'ha'

In [97]:
f.close()

In [98]:
%cat hello.py

heha
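
The `r+` and `w+` behavior shown above can be compared side by side. This sketch uses a temporary file (an assumption for illustration) and calls `seek(0)` explicitly between writing and reading rather than relying on where the pointer happens to be:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.py")  # hypothetical scratch file

with open(path, "w") as f:
    f.write("haha")

# r+ keeps the content; a write at position 0 overwrites in place.
with open(path, "r+") as f:
    f.write("he")
    f.seek(0)
    r_plus_content = f.read()

# w+ is also readable and writable, but the content is gone as soon as it opens.
with open(path, "w+") as f:
    w_plus_content = f.read()

print(repr(r_plus_content), repr(w_plus_content))
```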

EOF, end of file

In [99]:
f = open('./hello.py', mode='a+')

In [100]:
f.read()

''

In [101]:
f.write('heihei')

6

In [102]:
f.close()

In [103]:
%cat hello.py

hehaheihei

In [104]:
f = open('hello.py', mode='+') # exactly one of r, w, x, a is required, and '+' cannot be used on its own

ValueError: Must have exactly one of create/read/write/append mode and at most one plus

### File pointer

In [105]:
f = open('./hello.py') # mode=rt

In [106]:
f.tell()

0

In [107]:
f.read()

'hehaheihei'

In [108]:
f.tell()

10

In [109]:
f.read()

''

In [110]:
f.close()

In [111]:
f = open('./hello.py', mode='a')

In [112]:
f.tell()

10

In [113]:
f.close()

In [114]:
f = open('./hello.py')

In [115]:
help(f.seek)

Help on built-in function seek:

seek(cookie, whence=0, /) method of _io.TextIOWrapper instance
    Change stream position.
    
    Change the stream position to the given byte offset. The offset is
    interpreted relative to the position indicated by whence.  Values
    for whence are:
    
    * 0 -- start of stream (the default); offset should be zero or positive
    * 1 -- current stream position; offset may be negative
    * 2 -- end of stream; offset is usually negative
    
    Return the new absolute position.



In [116]:
f.tell()

0

In [117]:
f.read()

'hehaheihei'

In [118]:
f.tell()

10

In [119]:
f.seek(0, 0)

0

In [120]:
f.tell()

0

In [121]:
f.read()

'hehaheihei'

In [122]:
f.seek(4, 0)

4

In [123]:
f.tell()

4

In [124]:
f.read()

'heihei'

In [125]:
f.seek(0, 0)

0

In [126]:
f.read()

'hehaheihei'

In [127]:
f.seek(4, 1)

UnsupportedOperation: can't do nonzero cur-relative seeks

In [128]:
f.seek(4, 2)

UnsupportedOperation: can't do nonzero end-relative seeks

In [129]:
f.seek(0, 1)

10

In [130]:
f.seek(0, 2)

10

In [131]:
f.tell()

10

In [132]:
f.close()

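mode=t
- The file pointer moves in bytes
- When whence (the second argument) is start (0, the default), the pointer can be moved to any position; offset (the first argument) can be any non-negative integer
- When whence is current (1) or end (2), offset must be 0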
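
The text-mode rules above can be reproduced as follows. This is a sketch that assumes an ASCII-only temporary file, so byte offsets and character offsets coincide:

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.py")  # hypothetical scratch file
with open(path, "w", encoding="utf-8") as f:
    f.write("hehaheihei")

with open(path, "rt", encoding="utf-8") as f:
    f.seek(4, os.SEEK_SET)            # absolute seeks are allowed
    tail = f.read()

    try:
        f.seek(4, os.SEEK_CUR)        # nonzero relative seek is rejected in text mode
    except io.UnsupportedOperation as exc:
        cur_error = str(exc)

    end_pos = f.seek(0, os.SEEK_END)  # a zero offset is still allowed

print(tail, cur_error, end_pos)
```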

In [133]:
f = open('./hello.py', mode='w')

In [134]:
f.write('马哥教育')

4

In [135]:
f.close()

In [137]:
f = open('./hello.py', mode='rb')

In [138]:
f.tell()

0

In [139]:
f.seek(3)

3

In [140]:
f.read()

b'\xe5\x93\xa5\xe6\x95\x99\xe8\x82\xb2'

In [141]:
f.seek(3)

3

In [142]:
f.read().decode()

'哥教育'

In [143]:
f.seek(3)

3

In [144]:
f.seek(3, 1)

6

In [145]:
f.seek(3, 2)

15

In [146]:
f.read()

b''

In [147]:
f.seek(-3, 2)

9

In [148]:
f.read().decode()

'育'

In [149]:
f.seek(13)

13

In [150]:
f.seek(-13, 2)

OSError: [Errno 22] Invalid argument

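mode=b
- The file pointer moves in bytes
- When whence is start (0), the pointer can be moved to any position; offset can be any non-negative integer
- When whence is current (1) or end (2), offset can be any integer, positive or negative, as long as the resulting position is not negative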
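
In binary mode the relative seeks work, which the sketch below replays on a temporary file (an assumption for illustration) holding the same 12 UTF-8 bytes as the cells above. The `OSError` at the end is what the notebook observed on its platform; the exact exception may differ on other operating systems.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.py")  # hypothetical scratch file
with open(path, "wb") as f:
    f.write("马哥教育".encode("utf-8"))                # 4 characters, 12 bytes

negative_rejected = False
with open(path, "rb") as f:
    f.seek(3)
    pos_cur = f.seek(3, os.SEEK_CUR)     # 3 bytes forward from position 3
    pos_end = f.seek(-3, os.SEEK_END)    # 3 bytes before EOF
    last_char = f.read().decode("utf-8")

    try:
        f.seek(-13, os.SEEK_END)         # would land before the start of the file
    except OSError:
        negative_rejected = True

print(pos_cur, pos_end, last_char, negative_rejected)
```

Because offsets are bytes, seeking into the middle of a multi-byte character and then decoding would fail; the offsets here are multiples of 3 because each of these characters takes 3 bytes in UTF-8.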

In [151]:
f.close()

In [152]:
f = open('./hello.py', mode='a+')

In [153]:
f.seek(13)

13

In [154]:
f.write('abc')

3

In [155]:
f.seek(0)

0

In [156]:
f.read()

'马哥教育abc'

In [157]:
f.tell()

15

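Moving the file pointer:
- The file pointer is measured in bytes
- tell() returns the current position of the file pointer
- seek() moves the file pointer
- whence can be start (0), current (1), or end (2); these values have named constants, SEEK_SET (0), SEEK_CUR (1), and SEEK_END (2), available in the os and io modules
    - SEEK_SET (0): move offset bytes forward from the start of the file
    - SEEK_CUR (1): move offset bytes relative to the current position
    - SEEK_END (2): move offset bytes relative to EOF
- offset is an integer
- In text mode (t), when whence is SEEK_CUR or SEEK_END, offset must be 0
- The file pointer cannot be negative
- Reads proceed forward from the file pointer
- Writes proceed forward from the file pointer; in the a+ example above the pointer had been moved past EOF, yet the data still landed at EOF
- When the file is opened in mode a, writes always start at EOF, no matter where the file pointer is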
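
The last two rules are easy to confuse, so here is a small sketch contrasting append mode with `r+`. It assumes a temporary binary file and uses the `os.SEEK_*` constants mentioned above:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.py")  # hypothetical scratch file
with open(path, "wb") as f:
    f.write(b"0123456789")

# The named constants are plain integers.
constants = (os.SEEK_SET, os.SEEK_CUR, os.SEEK_END)

# Append mode: the pointer is moved to 0, but the write still lands at EOF.
with open(path, "ab+") as f:
    f.seek(0, os.SEEK_SET)
    f.write(b"abc")
    f.seek(0, os.SEEK_SET)
    appended = f.read()

# r+ mode: the write lands at the file pointer and overwrites in place.
with open(path, "rb+") as f:
    f.seek(2, os.SEEK_SET)
    f.write(b"XY")
    f.seek(0, os.SEEK_SET)
    overwritten = f.read()

print(constants, appended, overwritten)
```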

In [160]:
f = open('hello.py', 'wb')

In [161]:
f.write(b'abc')

3

In [162]:
%cat hello.py

In [163]:
f.flush() # write the buffered data out to the file

In [164]:
%cat hello.py

abc

In [165]:
f.write(b'apapapa')

7

In [166]:
%cat hello.py

abc

In [167]:
f.close() # close() flushes implicitly

In [168]:
%cat hello.py

abcapapapa

In [169]:
f = open('./hello.py', 'wb', buffering=5)

In [170]:
f.write(b'abc')

3

In [171]:
%cat hello.py

In [172]:
f.write(b'abc')

3

In [173]:
%cat hello.py

abc

In [174]:
f.close()

In [175]:
f = open('./hello.py', 'wb', buffering=5)

In [176]:
f.write(b'a')

1

In [177]:
f.write(b'b')

1

In [178]:
%cat hello.py

In [180]:
f.write(b'q')

1

In [181]:
f.write(b'ab')

2

In [182]:
%cat hello.py

In [183]:
f.write(b'1')

1

In [184]:
%cat hello.py

abqab

In [185]:
f.close()

In [186]:
f = open('hello.py', 'wb', buffering=0)

In [187]:
f.write(b'a')

1

In [188]:
%cat hello.py

a

In [189]:
f.close()

In [190]:
f = open('hello.py', 'wb', buffering=0)

In [191]:
f.write(b'abcdefgh')

8

In [192]:
%cat hello.py

abcdefgh

In [193]:
import io

In [194]:
io.DEFAULT_BUFFER_SIZE

8192

In [195]:
f = open('hello.py', 'wt', buffering=1) # line buffering is used only when buffering=1

In [196]:
f.write('abc')

3

In [197]:
%cat hello.py

In [198]:
f.write('\n')

1

In [199]:
%cat hello.py

abc


In [200]:
f.close()

In [201]:
f = open('hello.py', 'wt', buffering=5)

In [202]:
f.write('abc')

3

In [203]:
%cat hello.py

In [204]:
f.write('\n')

1

In [205]:
%cat hello.py

In [206]:
f.write('a' * io.DEFAULT_BUFFER_SIZE)

8192

In [207]:
f.close()

In [208]:
f = open('hello.py', 'wt', buffering=5)

In [209]:
f.write('a' * io.DEFAULT_BUFFER_SIZE)

8192

In [210]:
%cat hello.py

In [211]:
f.write('b')

1

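**In text mode with buffering > 1, the buffer behaves as if its size were io.DEFAULT_BUFFER_SIZE; once the buffer is full, it is flushed together with the data from the current write.**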

In [213]:
f.close()

In [214]:
f = open('hello.py', 'wt', buffering=0)

ValueError: can't have unbuffered text I/O

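- buffering = -1
    - Binary mode: io.DEFAULT_BUFFER_SIZE
    - Text mode: io.DEFAULT_BUFFER_SIZE
- buffering = 0
    - Binary mode: buffering is turned off (unbuffered)
    - Text mode: not allowed
- buffering = 1
    - Binary mode: line buffering is not available; the default buffer size is used (Python 3.8+ emits a RuntimeWarning)
    - Text mode: line buffering
- buffering > 1
    - Binary mode: a buffer of `buffering` bytes
    - Text mode: io.DEFAULT_BUFFER_SIZE
- When a flush happens
    - Binary mode: check whether the remaining space in the buffer can hold the bytes being written; if not, flush first and then put the new bytes into the buffer; if the new bytes are larger than the whole buffer, they are flushed straight through
    - Text mode: with line buffering, a newline triggers a flush; without line buffering, once the buffered data plus the current write exceeds the buffer size, the buffer and the current write are flushed together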

**flush() and close() force the buffer to be flushed.**
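
A sketch of the flush behavior described above, using the file size on disk instead of `%cat`. It assumes CPython's default buffer is larger than the few bytes written here, and the temporary path is hypothetical:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.py")  # hypothetical scratch file

# Default buffering: small writes stay in memory until flush() or close().
with open(path, "wb") as f:
    f.write(b"abc")
    size_before_flush = os.path.getsize(path)  # typically 0: still buffered
    f.flush()
    size_after_flush = os.path.getsize(path)
    f.write(b"defg")
size_after_close = os.path.getsize(path)       # close() flushed the rest

# buffering=0 (binary only): every write goes straight to the file.
with open(path, "wb", buffering=0) as raw:
    raw.write(b"a")
    unbuffered_size = os.path.getsize(path)

# Text mode refuses buffering=0.
try:
    open(path, "rt", buffering=0)
except ValueError as exc:
    text_unbuffered_error = str(exc)

print(size_before_flush, size_after_flush, size_after_close, unbuffered_size)
```

`flush()` hands the data to the operating system; it does not by itself guarantee that the data has reached the physical disk.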

In [215]:
f = open('hello.py', 'wt+', buffering=1)

In [216]:
f.write('abcd')

4

In [217]:
f.read()

''

In [218]:
f.seek(0)

0

In [219]:
f.tell()

0

In [220]:
f.read()

'abcd'

In [221]:
f.close()

In [222]:
f = open('hello.py', 'r+')

In [223]:
f.read(4)

'sldf'

In [224]:
f.read(4) # continues from the file pointer

'jsld'

In [225]:
f.read(-1)

'kfslkkfsalkdfl\n'

In [226]:
f.seek(0)

0

In [227]:
f.read(0)

''

In [228]:
f.readline()

'sldfjsldkfslkkfsalkdfl\n'

In [229]:
help(f.readline)

Help on built-in function readline:

readline(size=-1, /) method of _io.TextIOWrapper instance
    Read until newline or EOF.
    
    Returns an empty string if EOF is hit immediately.



In [233]:
f.seek(0)

0

In [235]:
f.readline(2)

'ld'

In [236]:
f.readlines()

['fjsldkfslkkfsalkdfl\n']

In [237]:
f.seek(0)

0

In [238]:
for line in f:
    print(line)

sldfjsldkfslkkfsalkdfl
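
The reading calls used in the cells above (`read(n)`, `readline()`, `readlines()`, and iteration) all consume from the current file pointer. A compact sketch on a three-line temporary file (the file and its content are assumptions for illustration):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.py")  # hypothetical scratch file
with open(path, "w", encoding="utf-8") as f:
    f.write("first\nsecond\nthird\n")

with open(path, encoding="utf-8") as f:
    head = f.read(3)               # at most 3 characters
    rest_of_line = f.readline()    # continues up to and including the newline
    remaining = f.readlines()      # every remaining line, as a list

    f.seek(0)
    # Iterating yields one line at a time; each line keeps its trailing '\n',
    # which is why print(line) in the cell above shows an extra blank line.
    stripped = [line.rstrip("\n") for line in f]

print(head, repr(rest_of_line), remaining, stripped)
```

Iterating over the file object is usually preferable to `readlines()` for large files, since it does not load every line into memory at once.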



In [239]:
f.writable()

True

In [240]:
f.write('abc\n')

4

In [241]:
f.writelines(['abc\n', 'sdfsdfsdf\n'])

In [242]:
f.flush()

In [243]:
%cat hello.py

sldfjsldkfslkkfsalkdfl
abc
abc
sdfsdfsdf
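
The strings passed to `writelines` above already end with `\n`. A minimal sketch (temporary file and sample strings are assumptions) showing that `writelines` itself adds no separators:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.py")  # hypothetical scratch file
lines = ["abc", "def"]

# writelines() writes the items back to back, with nothing in between.
with open(path, "w", encoding="utf-8") as f:
    f.writelines(lines)
with open(path, encoding="utf-8") as f:
    joined = f.read()

# Add the newline yourself if each item should be on its own line.
with open(path, "w", encoding="utf-8") as f:
    f.writelines(line + "\n" for line in lines)
with open(path, encoding="utf-8") as f:
    separated = f.read()

print(repr(joined), repr(separated))
```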


In [244]:
f.seekable()

True

In [245]:
import sys

In [246]:
sys.stderr.seekable()

False

In [247]:
f.fileno()

56

In [248]:
f.isatty()

False

In [249]:
f.name

'hello.py'

In [250]:
f.buffer

<_io.BufferedRandom name='hello.py'>

In [251]:
f.truncate()

41

In [252]:
f.close()

In [253]:
f = open('hello.py', 'r+b')

In [263]:
f1 = open('test.txt', 'r+b')

In [258]:
f1.write(b'sdfsdfsfsdf')

11

In [259]:
f1.close()

In [262]:
f.seek(0)

0

In [264]:
f.readinto(f1)

TypeError: readinto() argument must be read-write bytes-like object, not _io.BufferedRandom

In [265]:
f.close()

In [266]:
f1.close()

In [267]:
f1 = open('test.txt', 'r+b')

In [268]:
f = open('hello.py', 'r+b')

In [269]:
buffer = bytearray()

In [270]:
f.readinto(buffer)

0
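
`readinto` returned 0 above because `bytearray()` has length zero, and `readinto` fills at most `len(buffer)` bytes. A sketch with a pre-sized buffer (the temporary file and its content are assumptions for illustration):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.py")  # hypothetical scratch file
with open(path, "wb") as f:
    f.write(b"hello world")

empty = bytearray()     # length 0: there is no room to read into
sized = bytearray(5)    # room for 5 bytes

with open(path, "rb") as f:
    n_empty = f.readinto(empty)
    n_sized = f.readinto(sized)   # fills the buffer in place, returns the count

print(n_empty, n_sized, bytes(sized))
```

The argument must be a writable bytes-like object such as `bytearray` or `memoryview`; passing another file object raises the `TypeError` shown earlier.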

In [271]:
f.close()

In [273]:
f1.close()