# Reading and Writing Text Files

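- Text files usually have the .txt extension
- Content: typically only letters, digits, punctuation, spaces, tabs (\t), and newlines (\n); no styles, fonts, or other formatting attributes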
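- Character encoding: UTF-8 is the usual default on macOS; on Windows the default can depend on the application and the system locale, so choose UTF-8 explicitly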
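- Example: /Users/jacky/demo_path/files/txt/定風波.txt
  - Open the file in Notepad to read it (hidden characters are not shown)
  - Open the file in Word to read it (hidden characters can be shown) (select UTF-8 encoding)
  - Read the file with a Python program (inspecting the variable reveals the hidden characters)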

In [None]:
# Read the file's contents with a Python program
from pathlib import Path
file_path = Path.home() / "demo_path" / "files" / "txt" / "定風波.txt"

with open(file_path, 'r', encoding='utf-8') as f:
    foo = f.read()
    print(repr(foo)) # show the string's repr, with escapes such as \n visible
    print(foo) # show the string as rendered text

# Open Text File

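- Alternative 1: **f = open(file_name, mode, encoding=encoding)**
- Alternative 2: **with open(file_name, mode, encoding=encoding) as f:**
- mode: defaults to r
  - r &nbsp;&nbsp;&nbsp; Read - Default value. Opens a file for reading; error if the file does not exist<br>
  - a &nbsp;&nbsp;&nbsp; Append - Opens a file for appending; creates the file if it does not exist (new content is added at the end)<br>
  - w &nbsp;&nbsp;&nbsp; Write - Opens a file for writing; creates the file if it does not exist (existing content is overwritten)<br>
  - x &nbsp;&nbsp;&nbsp; Exclusive creation - Creates a new file; error if the file already exists
- encoding: the default is platform-dependent (usually utf-8 on macOS and Linux, often a locale code page on Windows), so pass encoding='utf-8' explicitly
- file_name: given as an absolute or a relative path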
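
Mode r is the only mode listed above that fails when the file is missing. The following is a minimal sketch of catching that error, assuming a hypothetical helper `try_read` and hypothetical file names inside a temporary directory (none of these come from the course files):

```python
import tempfile
from pathlib import Path

def try_read(path):
    """Return the file's text, or None if the file does not exist."""
    try:
        with open(path, 'r', encoding='utf-8') as f:
            return f.read()
    except FileNotFoundError:
        return None

with tempfile.TemporaryDirectory() as tmp:
    missing = Path(tmp) / "no_such_file.txt"   # hypothetical name, never created
    result_missing = try_read(missing)

    existing = Path(tmp) / "hello.txt"         # hypothetical name
    with open(existing, 'w', encoding='utf-8') as f:
        f.write("hello\n")
    result_existing = try_read(existing)

print(result_missing)         # None
print(repr(result_existing))  # 'hello\n'
```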


In [None]:
# Alternative 1
# On macOS, open the file by absolute path (note the forward slashes /); remember to close the file
# Mode 'r' (read) opens the file for reading
from pathlib import Path
file_path = Path.home() / "demo_path" / "files" / "txt" / "定風波.txt"
f = open(file_path,'r', encoding = 'utf-8')
print(f.read())
f.close()

In [None]:
# Alternative 2: with open() closes the file automatically, so no close() call is needed
from pathlib import Path
file_path = Path.home() / "demo_path" / "files" / "txt" / "定風波.txt"
with open(file_path,'r', encoding = 'utf-8') as f:
    print(f.read())
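
The difference between the two alternatives can be checked through the file object's `closed` attribute. This is a small sketch, assuming a hypothetical file demo.txt created in a temporary directory:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.txt"   # hypothetical file name
    demo_path.write_text("line 1\n", encoding='utf-8')

    # Alternative 1: the file stays open until close() is called
    f1 = open(demo_path, 'r', encoding='utf-8')
    closed_before = f1.closed
    f1.close()
    closed_after = f1.closed

    # Alternative 2: the with block closes the file when it ends
    with open(demo_path, 'r', encoding='utf-8') as f2:
        inside = f2.closed
    outside = f2.closed

print(closed_before, closed_after)  # False True
print(inside, outside)              # False True
```

The with form also closes the file when an exception is raised inside the block, which is why it is usually preferred.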

# Open for read

- read()
- read(int)
- readline()
- readlines()
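
The four calls differ in how much they consume, and each one continues from where the previous call stopped. Here is a minimal sketch on a file with known content, assuming a hypothetical demo.txt in a temporary directory:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.txt"   # hypothetical file name
    demo_path.write_text("abc\ndef\nghi\n", encoding='utf-8')

    with open(demo_path, 'r', encoding='utf-8') as f:
        whole = f.read()              # the entire file as one string

    with open(demo_path, 'r', encoding='utf-8') as f:
        first_two = f.read(2)         # the next 2 characters
        rest_of_line = f.readline()   # from the current position to the end of the line
        remaining = f.readlines()     # all remaining lines as a list
        at_end = f.read()             # nothing left, so an empty string

print(repr(whole))         # 'abc\ndef\nghi\n'
print(repr(first_two))     # 'ab'
print(repr(rest_of_line))  # 'c\n'
print(remaining)           # ['def\n', 'ghi\n']
print(repr(at_end))        # ''
```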

In [None]:
# Read all contents
from pathlib import Path
file_path = Path.home() / "demo_path" / "files" / "txt" / "定風波.txt"
with open(file_path, 'r', encoding = 'utf-8') as f:
    foo = f.read()
    print(foo)

In [None]:
# Read 3, 10, 2 characters
from pathlib import Path
file_path = Path.home() / "demo_path" / "files" / "txt" / "定風波.txt"
with open(file_path, 'r', encoding = 'utf-8') as f:
    print(f.read(3))
    print(f.read(10))
    print(f.read(2))

In [None]:
# Skip the first line, then print the next four lines
from pathlib import Path
file_path = Path.home() / "demo_path" / "files" / "txt" / "定風波.txt"
with open(file_path, 'r', encoding = 'utf-8') as f:
    f.readline()
    print(f.readline(), end='')
    print(f.readline(), end='')
    print(f.readline(), end='')
    print(f.readline(), end='')

In [None]:
# Read all lines into a list, then print them one by one with a for loop
from pathlib import Path
file_path = Path.home() / "demo_path" / "files" / "txt" / "定風波.txt"
with open(file_path, 'r', encoding = 'utf-8') as f:
    lines = f.readlines()
    for line in lines:
        print(line, end='')        
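
As an alternative to readlines(), a file object can be looped over directly, which yields one line at a time without building the whole list first. This is a sketch under the assumption of a hypothetical poem.txt with three short lines:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "poem.txt"   # hypothetical file name
    demo_path.write_text("first\nsecond\nthird\n", encoding='utf-8')

    numbered = []
    with open(demo_path, 'r', encoding='utf-8') as f:
        # The file object is its own line iterator
        for number, line in enumerate(f, start=1):
            stripped = line.rstrip('\n')   # drop the trailing newline
            numbered.append(f"{number}: {stripped}")

print(numbered)  # ['1: first', '2: second', '3: third']
```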

# Open for write

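- Mode w: creates the file if it does not exist; if it exists, the existing content is overwritten
- Mode a: creates the file if it does not exist; if it exists, new content is added after the existing content
- Mode x: creates the file if it does not exist; if it exists, raises an error (FileExistsError)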
- write(str)
- writelines(list)

In [None]:
# Mode w: creates the file if it does not exist; overwrites the existing content if it does
from pathlib import Path
file_path = Path.home() / "demo_path" / "files" / "txt" / "demofile.txt"

with open(file_path, 'w', encoding='utf-8') as f:
    f.write("write first line\n")
with open(file_path, 'r', encoding='utf-8') as f:
    print(f.read())

In [None]:
# Mode a: creates the file if it does not exist; adds after the existing content if it does
from pathlib import Path
file_path = Path.home() / "demo_path" / "files" / "txt" / "demofile.txt"
with open(file_path, 'a', encoding='utf-8') as f:
    f.write("append second line\n")
with open(file_path, 'r', encoding='utf-8') as f:
    print(f.read())

In [None]:
# Write again with mode w: the existing content is replaced
from pathlib import Path
file_path = Path.home() / "demo_path" / "files" / "txt" / "demofile.txt"
with open(file_path, 'w', encoding='utf-8') as f:
    f.write("write again. existed content has been cleared\n")
with open(file_path, 'r', encoding='utf-8') as f:
    print(f.read())

In [None]:
# Mode x: creates the file if it does not exist; raises FileExistsError if it already exists
from pathlib import Path
file_path = Path.home() / "demo_path" / "files" / "txt" / "demofile.txt"
with open(file_path, 'x', encoding='utf-8') as f:
    f.write('use x parameter\n')
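
When the target may already exist, the error from mode x can be caught instead of stopping the program. The following is a minimal sketch, assuming a hypothetical helper `create_once` and a temporary directory rather than the course folder:

```python
import tempfile
from pathlib import Path

def create_once(path, text):
    """Create path with text; return False and leave the file alone if it exists."""
    try:
        with open(path, 'x', encoding='utf-8') as f:
            f.write(text)
        return True
    except FileExistsError:
        return False

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "demofile.txt"
    first = create_once(target, "use x parameter\n")              # file is created
    second = create_once(target, "this text is never written\n")  # file already exists
    content = target.read_text(encoding='utf-8')

print(first, second)  # True False
print(repr(content))  # 'use x parameter\n'
```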

# Write strings to file

In [None]:
# writelines(list)
from pathlib import Path
foo = ['Hello', 'a', 'Great', 'World!']
file_path = Path.home() / "demo_path" / "files" / "txt" / "demofile.txt"
with open(file_path, 'w', encoding='utf-8') as f:
    f.writelines(foo) # writes each list element to the file; no newline is added between elements
with open(file_path, 'r', encoding='utf-8') as f:
    print(f.read())

In [None]:
# writelines(list)
from pathlib import Path
foo = ['Hello\n', 'a\n', 'Great\n', 'World!']
file_path = Path.home() / "demo_path" / "files" / "txt" / "demofile.txt"
with open(file_path, 'w', encoding='utf-8') as f:
    f.writelines(foo) # every element except the last ends with '\n', so the words land on separate lines
with open(file_path, 'r', encoding='utf-8') as f:
    print(f.read())
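
If the list elements do not already end with '\n', the newlines can be added while writing. Two common ways are sketched here, assuming a temporary directory instead of the course folder; the variable names are illustrative only:

```python
import tempfile
from pathlib import Path

words = ['Hello', 'a', 'Great', 'World!']

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "demofile.txt"

    # Option 1: writelines() accepts any iterable of strings
    with open(out_path, 'w', encoding='utf-8') as f:
        f.writelines(word + '\n' for word in words)
    via_writelines = out_path.read_text(encoding='utf-8')

    # Option 2: join the elements first, then write one string
    with open(out_path, 'w', encoding='utf-8') as f:
        f.write('\n'.join(words) + '\n')
    via_join = out_path.read_text(encoding='utf-8')

print(repr(via_writelines))        # 'Hello\na\nGreat\nWorld!\n'
print(via_writelines == via_join)  # True
```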

# Lab

Assume a file named 'demo_file.txt'.
1. Write 'Apple' and 'Google' to this file
2. Read this file
3. Append '小米' and '三星' to this file
4. Read the file again

# W3Schools

- [File Handling](https://www.w3schools.com/python/exercise.asp?x=xrcise_file_handling1)
- [Open File](https://www.w3schools.com/python/exercise.asp?x=xrcise_file_open1)
- [Write to File](https://www.w3schools.com/python/exercise.asp?x=xrcise_file_write1)

# Backup

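f.seek(offset, whence): moves the file pointer (the current read/write position)
- offset: the displacement, counted in bytes for files opened in binary mode; in text mode, use only 0 or a value previously returned by f.tell()
- whence: the reference position from which offset is measured:
  - 0 (default): start of the file.
  - 1: current pointer position (text mode allows only offset 0).
  - 2: end of the file (text mode allows only offset 0).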
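
The three whence values are easiest to see on a file opened in binary mode, where offsets are plain byte counts. Below is a minimal sketch, assuming a hypothetical file bytes.bin created in a temporary directory:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "bytes.bin"   # hypothetical file name
    demo_path.write_bytes(b"0123456789")

    with open(demo_path, 'rb') as f:
        f.seek(3, 0)              # 3 bytes from the start
        from_start = f.read(2)    # reads bytes 3 and 4, pointer is now at 5
        f.seek(2, 1)              # 2 bytes forward from the current position, to 7
        from_current = f.read(1)  # reads byte 7, pointer is now at 8
        f.seek(-3, 2)             # 3 bytes before the end, back to 7
        from_end = f.read()       # reads to the end of the file
        final_pos = f.tell()      # total length in bytes

print(from_start)    # b'34'
print(from_current)  # b'7'
print(from_end)      # b'789'
print(final_pos)     # 10
```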

In [None]:
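# Move the file pointer
from pathlib import Path
file_path = Path.home() / "demo_path" / "files" / "txt" / "定風波.txt"
with open(file_path, 'r', encoding='utf-8') as f:
    print(f.read(7))
    pos = f.tell()   # opaque position marker; for UTF-8 text it typically equals the byte offset
    print(pos)
    f.seek(0, 0)     # rewind to the start; an arbitrary byte offset such as 10 could land inside a multi-byte character
    print(f.read(3))
    f.seek(pos, 0)   # return to the saved position
    print(f.read(3), end='')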

In [None]:
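'''
The number of bytes per character depends on the encoding
UTF-8: each character takes 1 to 4 bytes.
UTF-16: each character takes 2 or 4 bytes.
Big5: each Chinese character usually takes 2 bytes.
'''

text = "你好世界"  # Chinese text
utf8_bytes = text.encode('utf-8')  # UTF-8 encoding
utf16_bytes = text.encode('utf-16')  # UTF-16 encoding (the 'utf-16' codec prepends a 2-byte BOM)
big5_bytes = text.encode('big5')  # Big5 encoding

print('utf-8:', len(utf8_bytes))  # byte count in UTF-8
print('utf-16:', len(utf16_bytes))  # byte count in UTF-16, including the BOM
print('big5:', len(big5_bytes))  # byte count in Big5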
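
The byte counts can be worked out by hand: each of these four characters takes 3 bytes in UTF-8 and 2 bytes in UTF-16 and Big5. The sketch below adds the 'utf-16-le' codec for comparison (an addition for illustration, not part of the cell above), because it writes no byte order mark:

```python
text = "你好世界"  # 4 characters

sizes = {
    'utf-8': len(text.encode('utf-8')),          # 4 x 3 bytes
    'utf-16': len(text.encode('utf-16')),        # 2-byte BOM + 4 x 2 bytes
    'utf-16-le': len(text.encode('utf-16-le')),  # 4 x 2 bytes, no BOM
    'big5': len(text.encode('big5')),            # 4 x 2 bytes
}

for name, size in sizes.items():
    print(name, size)
# utf-8 12
# utf-16 10
# utf-16-le 8
# big5 8
```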