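# Chapter 18: Reading and Writing Files

Learn how to read and write files so that a program can persist data: save it now and load it again later.

## Why Do We Need File Operations?

While a program runs, its data lives in memory, and that data is gone as soon as the program exits. To keep data around (user information, game progress, settings, and so on), we have to write it to a file.

**What file operations are used for**:
- **Configuration files** - store program settings
- **Log files** - record what the program did while running
- **Data storage** - save user data
- **Data exchange** - share data with other programs
- **Backup and recovery** - back up data and restore it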

In [None]:
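# Without file operations - the data is lost
score = 100
# Once the program exits, score is gone

# With file operations - the data persists
with open("score.txt", "w", encoding="utf-8") as f:
    f.write(str(score))
# The next run of the program can read score back from the file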
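
To make the "read it back next time" step concrete, here is a minimal sketch of the full round trip. It assumes a temporary directory (so nothing is left on disk) and simulates the two program runs with two separate `with` blocks; the variable names are illustrative only.

```python
import os
import tempfile

# Work in a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    score_path = os.path.join(tmp_dir, "score.txt")

    # First "run": save the score as text.
    with open(score_path, "w", encoding="utf-8") as f:
        f.write(str(100))

    # Later "run": load it back. Files store text, so convert back to int.
    with open(score_path, "r", encoding="utf-8") as f:
        loaded_score = int(f.read())

print(loaded_score)  # 100
```

Note the `int()` conversion: a text file only stores characters, so any number written with `str()` has to be converted back after reading.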

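## Opening Files

### Basic Syntax

```python
# Basic form
file = open(filename, mode, encoding=encoding)
```

**Parameters**:
- `filename` - the file name (may include a path)
- `mode` - the open mode (read, write, append, and so on)
- `encoding` - the text encoding (`"utf-8"` is recommended); pass it as a keyword argument, because the third positional parameter of `open()` is `buffering`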

In [None]:
# Basic form, example
file = open("example.txt", "r", encoding="utf-8")
content = file.read()
file.close()  # Remember to close the file!
print(content)

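### Using the with Statement (Recommended)

**Why use with?**
- The file is closed automatically, even if an error occurs
- The code is shorter
- It avoids resource leaks

In [None]:
# ✅ Recommended - the file is closed automatically
with open("example.txt", "r", encoding="utf-8") as file:
    content = file.read()
    print(content)
# The file is closed automatically when the with block ends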
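
The claim that `with` closes the file even when something goes wrong can be checked through the file object's `closed` attribute. The sketch below is a small demonstration under assumed names (`demo_path`, a simulated `ValueError`), not part of the original example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "example.txt")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("hello\n")

    # Normal exit from the with block closes the file.
    with open(demo_path, "r", encoding="utf-8") as f:
        f.read()
    closed_after_block = f.closed

    # An exception raised inside the block still closes the file.
    try:
        with open(demo_path, "r", encoding="utf-8") as g:
            raise ValueError("simulated failure")
    except ValueError:
        closed_after_error = g.closed

print(closed_after_block, closed_after_error)  # True True
```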

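### File Modes

```python
# Text modes
'r'   # Read only (default); the file must exist
'w'   # Write; overwrites the file if it exists, creates it if not
'a'   # Append; writes at the end if the file exists, creates it if not
'x'   # Exclusive creation; raises FileExistsError if the file exists

# Read-write modes
'r+'  # Read and write; the file must exist
'w+'  # Read and write; overwrites the file if it exists, creates it if not
'a+'  # Read and write; writes go to the end if the file exists, creates it if not

# Binary modes (add b)
'rb'  # Binary read
'wb'  # Binary write
'ab'  # Binary append
```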

In [None]:
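# Read-only mode
with open("file.txt", "r", encoding="utf-8") as f:
    content = f.read()

# Write mode (overwrites the existing file!)
with open("output.txt", "w", encoding="utf-8") as f:
    f.write("new content")

# Append mode (does not overwrite)
with open("log.txt", "a", encoding="utf-8") as f:
    f.write("a new line\n")

# Binary mode (for images, audio, and so on); no encoding argument here
with open("image.png", "rb") as f:
    data = f.read()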
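
The `'x'` mode from the table is not shown in the cell above. As a rough sketch (the file name `report.txt` is hypothetical), it can protect an existing file from being overwritten by accident:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    report_path = os.path.join(tmp_dir, "report.txt")  # hypothetical file name

    # First attempt: the file does not exist yet, so 'x' creates it.
    with open(report_path, "x", encoding="utf-8") as f:
        f.write("first version\n")

    # Second attempt: the file exists, so 'x' refuses to touch it.
    try:
        with open(report_path, "x", encoding="utf-8") as f:
            f.write("second version\n")
        overwritten = True
    except FileExistsError:
        overwritten = False

    with open(report_path, "r", encoding="utf-8") as f:
        final_content = f.read()

print(overwritten)          # False
print(repr(final_content))  # 'first version\n'
```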

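## Reading Files

### Method 1: read() - Read Everything

**When to use**: small files, or when the whole content has to be processed at once

In [None]:
# Read the whole file
with open("file.txt", "r", encoding="utf-8") as f:
    content = f.read()
    print(content)

# Read a given number of characters
with open("file.txt", "r", encoding="utf-8") as f:
    first_10_chars = f.read(10)  # Read the first 10 characters (fewer if the file is shorter)
    print(first_10_chars)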

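### Method 2: readline() - Read One Line

**When to use**: line-by-line processing, or very large files

In [None]:
# Read one line at a time
with open("file.txt", "r", encoding="utf-8") as f:
    line1 = f.readline()  # First line
    line2 = f.readline()  # Second line
    print(line1)  # Each line still ends with "\n", so print() shows an extra blank line
    print(line2)

# Read all lines (one by one)
with open("file.txt", "r", encoding="utf-8") as f:
    while True:
        line = f.readline()
        if not line:  # An empty string means end of file
            break
        print(line.strip())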

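### Method 3: readlines() - Read All Lines

**When to use**: when all lines need to be accessed more than once

In [None]:
# Returns a list in which each element is one line
with open("file.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()
    print(type(lines))  # <class 'list'>
    for line in lines:
        print(line.strip())  # strip() removes the trailing newline (and other surrounding whitespace)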

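### Method 4: Iterating over the File Object (Recommended)

**Advantages**:
- Memory efficient (lines are read one at a time)
- Concise code
- Splits the file into lines for you (each line still ends with its `\n`, hence the `strip()` below)

In [None]:
# The most Pythonic way
with open("file.txt", "r", encoding="utf-8") as f:
    for line in f:
        print(line.strip())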

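### Side-by-Side Comparison

In [None]:
# Assume file.txt contains:
# first line
# second line
# third line

# read() - read everything
with open("file.txt", "r", encoding="utf-8") as f:
    content = f.read()
    print(repr(content))  # 'first line\nsecond line\nthird line\n'

# readlines() - read into a list
with open("file.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()
    print(lines)  # ['first line\n', 'second line\n', 'third line\n']

# Iteration - process line by line
with open("file.txt", "r", encoding="utf-8") as f:
    for line in f:
        print(line.strip())  # Prints the three lines, one per output line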

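## Writing Files

### Method 1: write() - Write a String

**Things to note**:
- `write()` does not add a newline automatically; add `\n` yourself
- `'w'` mode erases the existing content of the file
- `'a'` mode appends to the end of the file

In [None]:
# Overwrite
with open("output.txt", "w", encoding="utf-8") as f:
    f.write("first line\n")
    f.write("second line\n")
    # Note: write() does not add a newline automatically

# Append (use the same encoding the file was written with)
with open("output.txt", "a", encoding="utf-8") as f:
    f.write("third line\n")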

### Method 2: writelines() - Write Multiple Lines

In [None]:
lines = ["first line\n", "second line\n", "third line\n"]

with open("output.txt", "w", encoding="utf-8") as f:
    f.writelines(lines)
# Note: every string must carry its own \n
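
Because `writelines()` adds no newlines, a list whose items lack `\n` would be written as one long line. The sketch below shows two common ways around that, under the assumption that `items` holds plain strings without line endings; either option alone is enough, both are used here only for comparison.

```python
import os
import tempfile

items = ["first line", "second line", "third line"]  # no trailing "\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "output.txt")

    with open(out_path, "w", encoding="utf-8") as f:
        # Option 1: add the newline while handing the lines to writelines().
        f.writelines(item + "\n" for item in items)
        # Option 2: print() appends "\n" for us when given file=...
        for item in items:
            print(item, file=f)

    with open(out_path, "r", encoding="utf-8") as f:
        written_lines = f.read().splitlines()

print(len(written_lines))  # 6: each option wrote the three items once
```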

### Formatted Writing

In [None]:
# Using an f-string
name = "张三"
score = 85
with open("result.txt", "w", encoding="utf-8") as f:
    f.write(f"Name: {name}\n")
    f.write(f"Score: {score}\n")

# Using format() (this overwrites the result.txt written above)
with open("result.txt", "w", encoding="utf-8") as f:
    f.write("Name: {}\n".format(name))
    f.write("Score: {}\n".format(score))

# Writing numbers (they must be converted to strings first)
numbers = [1, 2, 3, 4, 5]
with open("numbers.txt", "w", encoding="utf-8") as f:
    for num in numbers:
        f.write(str(num) + "\n")

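### Overwrite vs. Append

In [None]:
# Overwrite mode 'w' - deletes the existing content
with open("test.txt", "w", encoding="utf-8") as f:
    f.write("new content\n")
# The previous content of the file is gone

# Append mode 'a' - keeps the existing content
with open("test.txt", "a", encoding="utf-8") as f:
    f.write("appended content\n")
# The previous content is kept and the new content is added at the end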

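## File Paths

### Relative vs. Absolute Paths

In [None]:
# Relative paths (relative to the current working directory)
with open("file.txt", "r", encoding="utf-8") as f:
    content = f.read()

with open("data/file.txt", "r", encoding="utf-8") as f:
    content = f.read()

# Absolute path (forward slashes also work on Windows)
with open("C:/Users/username/file.txt", "r", encoding="utf-8") as f:
    content = f.read()

# Windows path with escaped backslashes
with open("C:\\Users\\username\\file.txt", "r", encoding="utf-8") as f:
    content = f.read()

# Raw string (recommended for Windows paths that use backslashes)
with open(r"C:\Users\username\file.txt", "r", encoding="utf-8") as f:
    content = f.read()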

### Handling Paths with os.path

In [None]:
import os

# Current working directory
current_dir = os.getcwd()
print(current_dir)

# Join path components (cross-platform)
file_path = os.path.join("data", "files", "example.txt")
print(file_path)  # data/files/example.txt (Windows: data\files\example.txt)

# Check whether a file exists
if os.path.exists("file.txt"):
    print("File exists")
else:
    print("File does not exist")

# Check whether a path is a file or a directory
if os.path.isfile("test.txt"):
    print("This is a file")

if os.path.isdir("data"):
    print("This is a directory")

# Get file information
if os.path.exists("file.txt"):
    size = os.path.getsize("file.txt")
    print(f"File size: {size} bytes")

# Split a path
path = "/home/user/documents/file.txt"
dir_path = os.path.dirname(path)   # /home/user/documents
filename = os.path.basename(path)  # file.txt
print(dir_path, filename)

### Creating Directories

In [None]:
import os

# Create a single directory level
if not os.path.exists("data"):
    os.mkdir("data")

# Create nested directories
os.makedirs("data/files/backup", exist_ok=True)
# exist_ok=True: no error if the directory already exists

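## File Positioning

### seek() and tell()

**seek() parameters**:
```python
# seek(offset, whence)
# whence: 0 - from the start of the file, 1 - from the current position, 2 - from the end of the file
# In text mode, seek from the start only (offset 0 or a value returned by tell());
# with whence=1 or whence=2 the offset must be 0. Use binary mode for arbitrary byte offsets.
```

In [None]:
with open("file.txt", "r", encoding="utf-8") as f:
    # tell() - get the current position
    print(f.tell())  # 0

    # Read 10 characters
    content = f.read(10)
    print(f.tell())  # Typically 10 for plain ASCII text; in text mode tell() is an opaque position, not a character count

    # seek() - move to a given position
    f.seek(0)  # Back to the start of the file
    print(f.tell())  # 0

    # Read again from the start
    content = f.read()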
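
The three `whence` values are easiest to see in binary mode, where positions are plain byte offsets. This is a minimal sketch on an assumed 16-byte sample file; `os.SEEK_SET`, `os.SEEK_CUR`, and `os.SEEK_END` are the named constants for 0, 1, and 2.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    data_path = os.path.join(tmp_dir, "data.bin")
    with open(data_path, "wb") as f:
        f.write(b"0123456789ABCDEF")  # 16 bytes of sample data

    with open(data_path, "rb") as f:
        f.seek(4)                      # whence defaults to os.SEEK_SET (0)
        pos_after_start = f.tell()     # 4
        f.seek(2, os.SEEK_CUR)         # 2 bytes forward from the current position
        pos_after_relative = f.tell()  # 6
        f.seek(-3, os.SEEK_END)        # 3 bytes before the end of the file
        tail = f.read()                # the last 3 bytes

print(pos_after_start, pos_after_relative, tail)  # 4 6 b'DEF'
```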

## Binary Files

### Reading Binary Files

In [None]:
# Read an image
with open("image.png", "rb") as f:
    data = f.read()
    print(type(data))  # <class 'bytes'>
    print(len(data))   # Number of bytes

# Copy an image
with open("source.png", "rb") as src:
    with open("copy.png", "wb") as dst:
        dst.write(src.read())

### Text vs. Binary

In [None]:
# Text mode (handles newline translation and decoding automatically)
with open("text.txt", "r", encoding="utf-8") as f:
    content = f.read()  # str

# Binary mode (raw bytes)
with open("text.txt", "rb") as f:
    content = f.read()  # bytes
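
The difference between `str` and `bytes` becomes visible with non-ASCII text. The following sketch (sample content chosen for illustration) writes two Chinese characters and reads them back in both modes; decoding the raw bytes with the right encoding gives the same string that text mode returns.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    text_path = os.path.join(tmp_dir, "text.txt")
    with open(text_path, "w", encoding="utf-8") as f:
        f.write("张三")  # two characters, no newline

    with open(text_path, "r", encoding="utf-8") as f:
        as_text = f.read()   # str
    with open(text_path, "rb") as f:
        as_bytes = f.read()  # bytes

print(len(as_text))   # 2 characters
print(len(as_bytes))  # 6 bytes: each of these characters takes 3 bytes in UTF-8
print(as_bytes.decode("utf-8") == as_text)  # True
```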

## Practical Examples

### Example 1: Reading a Configuration File

In [None]:
def read_config(filename):
    """
    Read a configuration file.
    Format: key=value
    """
    config = {}
    try:
        with open(filename, "r", encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                # Skip blank lines and comments
                if not line or line.startswith("#"):
                    continue
                if "=" in line:
                    key, value = line.split("=", 1)
                    config[key.strip()] = value.strip()
    except FileNotFoundError:
        print(f"Configuration file not found: {filename}")
    return config

# Contents of config.txt:
# host=localhost
# port=8080
# # This is a comment
# debug=true

config = read_config("config.txt")
print(config)  # {'host': 'localhost', 'port': '8080', 'debug': 'true'} (all values are strings)

### Example 2: A Logging System

In [None]:
from datetime import datetime

class Logger:
    """A simple logger."""

    def __init__(self, filename="app.log"):
        self.filename = filename

    def _write_log(self, level, message):
        """Append one log line to the file."""
        timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        log_line = f"[{timestamp}] [{level}] {message}\n"
        with open(self.filename, "a", encoding="utf-8") as f:
            f.write(log_line)

    def info(self, message):
        """Log an informational message."""
        self._write_log("INFO", message)

    def warning(self, message):
        """Log a warning."""
        self._write_log("WARNING", message)

    def error(self, message):
        """Log an error."""
        self._write_log("ERROR", message)

# Usage
logger = Logger("app.log")
logger.info("Program started")
logger.warning("Memory usage is high")
logger.error("Database connection failed")
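
For comparison, the standard library's `logging` module can produce the same line format with a `FileHandler`. This is only a sketch of one possible setup: the logger name `chapter18_demo` and the temporary log path are assumptions, and the hand-written class above remains a fine way to learn file appending.

```python
import logging
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    log_path = os.path.join(tmp_dir, "app.log")

    std_logger = logging.getLogger("chapter18_demo")  # hypothetical logger name
    std_logger.setLevel(logging.INFO)
    std_logger.propagate = False  # keep the messages out of the root logger

    handler = logging.FileHandler(log_path, mode="a", encoding="utf-8")
    handler.setFormatter(logging.Formatter(
        "[%(asctime)s] [%(levelname)s] %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    ))
    std_logger.addHandler(handler)

    std_logger.info("Program started")
    std_logger.warning("Memory usage is high")
    std_logger.error("Database connection failed")

    # Release the file before the temporary directory is removed.
    std_logger.removeHandler(handler)
    handler.close()

    with open(log_path, "r", encoding="utf-8") as f:
        log_lines = f.read().splitlines()

print(len(log_lines))  # 3
```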

### Example 3: Simple CSV Handling

In [None]:
def read_csv(filename):
    """Read a simple CSV file (no quoted fields or commas inside values)."""
    data = []
    with open(filename, "r", encoding="utf-8") as f:
        lines = f.readlines()
        if not lines:
            return data

        # The first line is the header
        headers = [h.strip() for h in lines[0].strip().split(",")]

        # Process the data rows
        for line in lines[1:]:
            if not line.strip():  # Skip blank lines
                continue
            values = [v.strip() for v in line.strip().split(",")]
            row = dict(zip(headers, values))
            data.append(row)

    return data

def write_csv(filename, data, headers):
    """Write a simple CSV file."""
    with open(filename, "w", encoding="utf-8") as f:
        # Write the header
        f.write(",".join(headers) + "\n")

        # Write the data rows
        for row in data:
            values = [str(row.get(h, "")) for h in headers]
            f.write(",".join(values) + "\n")

# Usage
students = [
    {"name": "张三", "age": "18", "score": "85"},
    {"name": "李四", "age": "19", "score": "92"},
    {"name": "王五", "age": "20", "score": "78"}
]

write_csv("students.csv", students, ["name", "age", "score"])
data = read_csv("students.csv")
print(data)
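
Splitting on `","` breaks as soon as a value itself contains a comma. The standard library's `csv` module handles quoting for such cases; the sketch below uses `csv.DictWriter` and `csv.DictReader` on made-up sample rows (the row with a comma in the name is an assumption added to show the difference).

```python
import csv
import os
import tempfile

rows = [
    {"name": "Zhang, San", "age": "18", "score": "85"},  # note the comma inside the name
    {"name": "Li Si", "age": "19", "score": "92"},
]
fieldnames = ["name", "age", "score"]

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "students.csv")

    # newline="" lets the csv module control line endings itself.
    with open(csv_path, "w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    with open(csv_path, "r", encoding="utf-8", newline="") as f:
        loaded_rows = [dict(row) for row in csv.DictReader(f)]

print(loaded_rows[0]["name"])  # Zhang, San
```

The value with the embedded comma survives the round trip because the writer quotes it and the reader removes the quotes again.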

### Example 4: A File Copy Tool

In [None]:
import os

def copy_file(source, destination):
    """Copy a file."""
    if not os.path.exists(source):
        print(f"Source file not found: {source}")
        return False

    try:
        with open(source, "rb") as src:
            with open(destination, "wb") as dst:
                # Copy in chunks (suitable for large files)
                chunk_size = 1024 * 1024  # 1 MB
                while True:
                    chunk = src.read(chunk_size)
                    if not chunk:
                        break
                    dst.write(chunk)

        print(f"Copied: {source} -> {destination}")
        return True
    except Exception as e:
        print(f"Copy failed: {e}")
        return False

# Usage (the destination directory must exist before copying)
os.makedirs("backup", exist_ok=True)
copy_file("source.txt", "backup/source.txt")
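
In everyday code the same job is usually delegated to the standard library's `shutil` module. A minimal sketch, assuming sample paths inside a temporary directory:

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    source_path = os.path.join(tmp_dir, "source.txt")
    backup_path = os.path.join(tmp_dir, "backup", "source.txt")

    with open(source_path, "w", encoding="utf-8") as f:
        f.write("important data\n")

    # The destination directory must exist before copying.
    os.makedirs(os.path.dirname(backup_path), exist_ok=True)
    shutil.copy2(source_path, backup_path)  # copies the content and, where possible, file metadata

    with open(backup_path, "r", encoding="utf-8") as f:
        backup_content = f.read()

print(repr(backup_content))  # 'important data\n'
```

Writing the chunked loop by hand is still worthwhile as an exercise, because it shows how large files can be processed without loading them into memory at once.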

### Example 5: Word Count

In [None]:
def count_words(filename):
    """Count how often each word appears in a file."""
    word_count = {}

    try:
        with open(filename, "r", encoding="utf-8") as f:
            for line in f:
                # Convert to lowercase and split on whitespace
                words = line.lower().split()
                for word in words:
                    # Strip surrounding punctuation
                    word = word.strip(",.!?;:'\"")
                    if word:
                        word_count[word] = word_count.get(word, 0) + 1
    except FileNotFoundError:
        print(f"File not found: {filename}")
        return {}

    return word_count

def print_top_words(word_count, n=10):
    """Print the n most frequent words."""
    sorted_words = sorted(word_count.items(), key=lambda x: x[1], reverse=True)
    print(f"\nTop {n} most frequent words:")
    for word, count in sorted_words[:n]:
        print(f"{word}: {count} times")

# Usage
words = count_words("article.txt")
print_top_words(words, 5)
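
The dictionary bookkeeping and the sorting step can also be handed to `collections.Counter`. The function name `count_words_with_counter` and the sample article text below are illustrative assumptions; the cleaning rules are the same as in `count_words`.

```python
from collections import Counter
import os
import tempfile

def count_words_with_counter(filename):
    """Count words with Counter, using the same cleaning rules as count_words."""
    counts = Counter()
    with open(filename, "r", encoding="utf-8") as f:
        for line in f:
            cleaned = (w.strip(",.!?;:'\"") for w in line.lower().split())
            counts.update(w for w in cleaned if w)
    return counts

with tempfile.TemporaryDirectory() as tmp_dir:
    article_path = os.path.join(tmp_dir, "article.txt")
    with open(article_path, "w", encoding="utf-8") as f:
        f.write("Python is fun. Python is simple!\nIs Python easy? Yes, it is.\n")
    top_words = count_words_with_counter(article_path).most_common(2)

print(top_words)  # [('is', 4), ('python', 3)]
```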

### Example 6: A Simple Notebook

In [None]:
from datetime import datetime

class Notebook:
    """A simple note-taking program."""

    def __init__(self, filename="notes.txt"):
        self.filename = filename

    def add_note(self, note):
        """Add a note."""
        timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        with open(self.filename, "a", encoding="utf-8") as f:
            f.write(f"[{timestamp}]\n{note}\n{'-'*50}\n")
        print("Note saved")

    def view_notes(self):
        """Show all notes."""
        try:
            with open(self.filename, "r", encoding="utf-8") as f:
                content = f.read()
                if content:
                    print(content)
                else:
                    print("No notes yet")
        except FileNotFoundError:
            print("No notes yet")

    def clear_notes(self):
        """Delete all notes."""
        # Opening in 'w' mode truncates the file; nothing needs to be written
        with open(self.filename, "w", encoding="utf-8") as f:
            pass
        print("All notes cleared")

# Usage
notebook = Notebook()
notebook.add_note("Today I learned about file operations in Python")
notebook.add_note("Tomorrow I will continue with exception handling")
notebook.view_notes()

### Example 7: Merging Files

In [None]:
def merge_files(input_files, output_file):
    """Merge several files into one."""
    with open(output_file, "w", encoding="utf-8") as out:
        for filename in input_files:
            try:
                with open(filename, "r", encoding="utf-8") as f:
                    content = f.read()
                    out.write(f"\n{'='*50}\n")
                    out.write(f"From file: {filename}\n")
                    out.write(f"{'='*50}\n")
                    out.write(content)
                    out.write("\n")
            except FileNotFoundError:
                print(f"File not found: {filename}")

    print(f"Files merged into: {output_file}")

# Usage
files = ["file1.txt", "file2.txt", "file3.txt"]
merge_files(files, "merged.txt")

### Example 8: Searching for a Keyword in Files

In [None]:
import os

def search_in_file(filename, keyword):
    """Search a file for a keyword (case-insensitive)."""
    results = []
    try:
        with open(filename, "r", encoding="utf-8") as f:
            for line_num, line in enumerate(f, 1):
                if keyword.lower() in line.lower():
                    results.append((line_num, line.strip()))
    except FileNotFoundError:
        print(f"File not found: {filename}")
        return []

    return results

def search_in_directory(directory, keyword, extension=".txt"):
    """Search all files with the given extension in a directory."""
    print(f"Searching for '{keyword}' in {directory}...\n")

    for filename in os.listdir(directory):
        if filename.endswith(extension):
            filepath = os.path.join(directory, filename)
            results = search_in_file(filepath, keyword)
            if results:
                print(f"\nFile: {filename}")
                for line_num, line in results:
                    print(f"  Line {line_num}: {line}")

# Usage
results = search_in_file("article.txt", "python")
for line_num, line in results:
    print(f"Line {line_num}: {line}")

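## Exception Handling

### Common File Exceptions

In [None]:
# FileNotFoundError - the file does not exist
try:
    with open("nonexistent.txt", "r", encoding="utf-8") as f:
        content = f.read()
except FileNotFoundError:
    print("File not found")

# PermissionError - no permission (for example, writing to /root as a regular user on Linux)
try:
    with open("/root/file.txt", "w", encoding="utf-8") as f:
        f.write("test")
except PermissionError:
    print("No permission to write the file")

# UnicodeDecodeError - wrong encoding
try:
    with open("file.txt", "r", encoding="utf-8") as f:
        content = f.read()
except UnicodeDecodeError:
    print("Could not decode the file as UTF-8, trying another encoding")
    with open("file.txt", "r", encoding="gbk") as f:
        content = f.read()

# OSError - general I/O error (IOError has been an alias of OSError since Python 3.3)
try:
    with open("file.txt", "r", encoding="utf-8") as f:
        content = f.read()
except OSError as e:
    print(f"Error reading file: {e}")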
### Safe File Operations

In [None]:
def safe_read_file(filename, default=""):
    """Read a file, returning a default value on failure."""
    try:
        with open(filename, "r", encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        print(f"File not found: {filename}")
        return default
    except Exception as e:
        print(f"Error reading file: {e}")
        return default

def safe_write_file(filename, content):
    """Write a file, returning True on success and False on failure."""
    try:
        with open(filename, "w", encoding="utf-8") as f:
            f.write(content)
        return True
    except Exception as e:
        print(f"Error writing file: {e}")
        return False

# Usage
content = safe_read_file("config.txt", default="# default configuration")
success = safe_write_file("output.txt", "Hello World")
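
Since `'w'` mode empties the file before anything is written, a crash in the middle of a write can leave a damaged file behind. One common way to reduce that risk, sketched here under assumed names (`atomic_write_text`, `settings.txt`), is to write to a temporary file in the same directory and then swap it into place with `os.replace()`.

```python
import os
import tempfile

def atomic_write_text(filename, content):
    """Write content to a temporary file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(filename))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(content)
        os.replace(tmp_path, filename)  # replaces the target in a single step
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

with tempfile.TemporaryDirectory() as tmp_dir:
    settings_path = os.path.join(tmp_dir, "settings.txt")  # hypothetical file
    atomic_write_text(settings_path, "version=1\n")
    atomic_write_text(settings_path, "version=2\n")
    with open(settings_path, "r", encoding="utf-8") as f:
        final_settings = f.read()
    leftovers = [n for n in os.listdir(tmp_dir) if n.endswith(".tmp")]

print(repr(final_settings), leftovers)  # 'version=2\n' []
```

Readers of the file then see either the old content or the new content, not a half-written mixture.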

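## Best Practices

### 1. Always Use the with Statement

In [None]:
# ✅ Recommended
with open("file.txt", "r", encoding="utf-8") as f:
    content = f.read()

### 2. Specify the Encoding Explicitly

In [None]:
# ✅ Recommended
with open("file.txt", "r", encoding="utf-8") as f:
    content = f.read()

### 3. Handle Exceptions

In [None]:
# ✅ Recommended
try:
    with open("file.txt", "r", encoding="utf-8") as f:
        content = f.read()
except FileNotFoundError:
    print("File not found")

### 4. Process Large Files Line by Line

In [None]:
# ✅ Recommended - memory friendly
with open("large_file.txt", "r", encoding="utf-8") as f:
    for line in f:
        print(line.strip())

### 5. Use pathlib (Python 3.4+; read_text/write_text need 3.5+)

In [None]:
from pathlib import Path

# Modern path handling
file = Path("data") / "file.txt"

# Make sure the parent directory exists
file.parent.mkdir(parents=True, exist_ok=True)

# Write
file.write_text("content", encoding="utf-8")

# Read
content = file.read_text(encoding="utf-8")

# Check existence
if file.exists():
    print("File exists")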

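## Chapter Highlights

- ✅ Understand the basic concepts of file operations
- ✅ Open, read, and write files
- ✅ Manage files with the with statement
- ✅ Work with the different file modes
- ✅ Understand path handling
- ✅ Handle file exceptions
- ✅ Apply the best practices

**Remember**
- Always use the with statement
- Specify encoding="utf-8" explicitly
- 'w' mode erases the file
- Process large files line by line
- Handle file exceptions
- Watch out for path separator differences between operating systems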