# Lecture 10: Text Files

Learning Objectives: 

* Use the `os` module to manipulate file paths.
* Understand how to open, close, read, and write text files in Python.
* Know what a context manager is and why it’s useful.
* Use a context manager to handle text files safely.

## What's a Text File? 

* A text file stores data as **plain** text.
* Examples:

| **Extension** | **File Type** | **What's Stored** | **Usage** |
|:---|:---|:---|:---|
| `.txt` | General Text | Unstructured plain text | Notes, simple data storage, documentation |
| `.md` | Markdown | Text with simple formatting syntax | Documentation, README files, formatted notes |
| `.csv` | Comma-Separated Values | Tabular data | Spreadsheets, data analysis, database export |
| `.json` | JavaScript Object Notation | Structured data in key-value pairs | Web APIs, config files, data transfer between applications |
| `.log` | Log File | Event messages, often with timestamps | Debugging, system event tracking, audit logs |

* Examples of files that are **not** plain text: `.pdf`, `.doc`, `.xls`

## File Paths
* A file path specifies where a file is located on your machine.
* **Absolute** path: the full path from the root of the file system
    * Windows: `C:\Users\Username\Documents\example.txt`
    * macOS/Linux: `/Users/Username/Documents/example.txt`
* **Relative** path: a path relative to the current working directory (the directory the script is run from).

In [None]:
# os is a built-in module that provides functions for interacting with the operating system,
# such as file and directory manipulation
import os
# returns the absolute path of the current working directory
os.getcwd()

In [None]:
# lists files and directories in the given path (defaults to the current working directory)
os.listdir()

In [None]:
# obtain files in the 'files' directory
# you can use absolute path
os.listdir('/Users/yiyinshen/Documents/CS/368/dev-cs368-python-fa24/10-text-files/files')

In [None]:
# or use relative path 
# you are currently at '/Users/yiyinshen/Documents/CS/368/dev-cs368-python-fa24/10-text-files'
# the relative path is resolved against the current working directory
os.listdir('files')

In [None]:
# relative path to example.txt in files directory
# you can use string concatenation
'files' + '/' + 'example.txt'

In [None]:
# joins one or more path components
# considers the operating system
os.path.join('files', 'example.txt')
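
The difference between relative and absolute paths can be checked directly with a few other `os.path` helpers. The sketch below reuses the `files/example.txt` path from above and only manipulates path strings, so it does not assume the file actually exists.

```python
import os

# Build a relative path in an OS-aware way (same components as above).
relative_path = os.path.join('files', 'example.txt')

# Resolve it against the current working directory to get an absolute path.
absolute_path = os.path.abspath(relative_path)

print(os.path.isabs(relative_path))  # False
print(os.path.isabs(absolute_path))  # True

# Split a path into its directory, file name, and extension.
print(os.path.dirname(relative_path))      # files
print(os.path.basename(relative_path))     # example.txt
print(os.path.splitext(relative_path)[1])  # .txt
```

The value of `absolute_path` depends on where the code is run, which is why it is not shown here.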

## Open, Close, Read, Write, and Create Files

### Open/close files
* Open a file with `open(file_path, mode)`, where `mode` is one of:
    * `'r'`: read (default)
    * `'w'`: write (truncates an existing file)
    * `'a'`: append
    * `'r+'`: read and write
* After `open()`, Python asks the OS for a file handle and allocates a buffer in memory for the file based on the mode.
* **Always** `close()` the file to flush pending writes and release the file handle and buffer.

In [None]:
example_txt_path = os.path.join('files', 'example.txt')
file = open(example_txt_path)
# perform operations
file.close()
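
To see how the four modes differ, here is a minimal sketch that runs each of them against one file. The temporary directory and the file name `modes_demo.txt` are assumptions made for this example so that no real file is touched.

```python
import os
import shutil
import tempfile

# Throwaway directory and hypothetical file name, used only for this sketch.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'modes_demo.txt')

# 'w' creates the file (or truncates an existing one) before writing.
file = open(demo_path, 'w')
file.write('first\n')
file.close()

# 'a' keeps the existing content and writes at the end.
file = open(demo_path, 'a')
file.write('second\n')
file.close()

# 'r+' opens an existing file for reading and writing.
# The position starts at the beginning, so this write
# overwrites the first five characters in place.
file = open(demo_path, 'r+')
file.write('FIRST')
file.close()

# 'r' (the default) is read-only.
file = open(demo_path)
modes_result = file.read()
file.close()

print(repr(modes_result))  # 'FIRST\nsecond\n'
print(file.closed)         # True

shutil.rmtree(tmp_dir)  # clean up the throwaway directory
```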

### Read files
* The OS loads (caches) file contents from disk into a memory buffer.

In [None]:
file = open(example_txt_path)
# read() reads all contents as a single string.
content = file.read()
file.close()
content

In [None]:
content.split('\n')

In [None]:
file = open(example_txt_path)
# readline() reads one line at a time.
line = file.readline()
while line:
    print(line, end='')  # each line already ends with '\n'
    line = file.readline()
file.close()

In [None]:
file = open(example_txt_path)
# readlines() reads all lines into a list.
lines = file.readlines()
file.close()
lines
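
A file object can also be looped over directly, which yields one line at a time without loading the whole file. The sketch below shows that pattern and how to drop the trailing newline from each line. The sample file and its contents are assumptions made for this example.

```python
import os
import shutil
import tempfile

# Hypothetical sample file in a throwaway directory.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, 'sample.txt')
file = open(sample_path, 'w')
file.write('alpha\nbeta\ngamma\n')
file.close()

# Iterating over the file object reads one line per loop step.
file = open(sample_path)
stripped_lines = []
for line in file:
    stripped_lines.append(line.rstrip('\n'))  # drop the trailing newline
file.close()
print(stripped_lines)  # ['alpha', 'beta', 'gamma']

# split('\n') leaves an empty string after the final newline;
# splitlines() does not.
file = open(sample_path)
content = file.read()
file.close()
print(content.split('\n'))   # ['alpha', 'beta', 'gamma', '']
print(content.splitlines())  # ['alpha', 'beta', 'gamma']

shutil.rmtree(tmp_dir)  # clean up the throwaway directory
```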

### Write files 
* `write()` puts content into a memory buffer first, not directly on disk.
* The buffer is flushed (written out) to disk when the file is closed, when the buffer fills up, or on request with `flush()`.

In [None]:
# opening in 'w' mode truncates the file, so write() replaces the old content;
# in 'a' mode write() appends to the end
file = open(example_txt_path, 'w')
file.write("This is a new line of text.\n")
file.close()

In [None]:
lines = ["Line 1\n", "Line 2\n", "Line 3\n"]
file = open(example_txt_path, 'a')
# writelines() writes a list of strings all at once; it does not add newlines itself
file.writelines(lines)
file.close()

In [None]:
file = open(example_txt_path, 'w')
file.write("Line 1\n")
file.write("Line 2\n")
# file.flush()
# file.write("Line 3\n")
# file.close()

In [None]:
file.close()
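
The effect of buffering can be observed by checking the file's size on disk before and after `flush()`. This is a sketch that assumes Python's default buffering and a small write. The temporary directory and the file name `buffer_demo.txt` are made up for the example.

```python
import os
import shutil
import tempfile

# Throwaway directory and hypothetical file name, used only for this sketch.
tmp_dir = tempfile.mkdtemp()
buffer_path = os.path.join(tmp_dir, 'buffer_demo.txt')

writer = open(buffer_path, 'w')
writer.write('Line 1\n')

# With default buffering, a small write is still in memory at this point.
size_before_flush = os.path.getsize(buffer_path)

# flush() pushes the buffered content out to the file.
writer.flush()
size_after_flush = os.path.getsize(buffer_path)

# close() flushes whatever is still buffered.
writer.write('Line 2\n')
writer.close()
size_after_close = os.path.getsize(buffer_path)

print(size_before_flush)                    # 0
print(size_after_flush > 0)                 # True
print(size_after_close > size_after_flush)  # True

shutil.rmtree(tmp_dir)  # clean up the throwaway directory
```

The exact byte counts depend on the platform's line endings, so the sketch only compares sizes.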

### Create files


In [None]:
# read mode ('r', the default): raises FileNotFoundError if the file doesn't exist
file = open('nonexist.txt')
file.close()

In [None]:
# 'w' and 'a' modes: the file is created automatically if it doesn't exist
# ('r+' requires an existing file and raises FileNotFoundError otherwise)
file = open('nonexist.txt', 'a')
file.close()
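
The sketch below tries each mode on a file that does not exist and records what happens. The temporary directory and the `missing_*.txt` file names are assumptions made for this example.

```python
import os
import shutil
import tempfile

# Throwaway directory, used only for this sketch.
tmp_dir = tempfile.mkdtemp()

creation_results = {}
for mode in ['r', 'r+', 'w', 'a']:
    # Hypothetical file name; a fresh, nonexistent file for each mode.
    name = 'missing_' + mode.replace('+', 'plus') + '.txt'
    path = os.path.join(tmp_dir, name)
    try:
        file = open(path, mode)
        file.close()
        creation_results[mode] = 'created'
    except FileNotFoundError:
        creation_results[mode] = 'FileNotFoundError'

print(creation_results)
# {'r': 'FileNotFoundError', 'r+': 'FileNotFoundError', 'w': 'created', 'a': 'created'}

shutil.rmtree(tmp_dir)  # clean up the throwaway directory
```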

## Context Manager

* Context managers automatically close files after operations, even if an error occurs.

In [None]:
# if the code crashes before the file is closed,
# what has been written may not be flushed from the memory buffer to disk
file = open('error.txt', 'w')
file.write('error')
assert 1 == 2
file.close()

In [None]:
file.close()

In [None]:
with open('error.txt', 'w') as file:
    # enter the context manager 
    file.write('error')
    assert 1 == 2
# exit the context manager
# file is closed automatically even if there's an error
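
A `with` block behaves roughly like a `try`/`finally` around the file operations. The sketch below puts both versions side by side. It raises a `ValueError` as a stand-in for any runtime error and catches it only so the example runs to completion. The temporary directory is an assumption made for this example.

```python
import os
import shutil
import tempfile

# Throwaway directory, used only for this sketch.
tmp_dir = tempfile.mkdtemp()
error_path = os.path.join(tmp_dir, 'error.txt')

# Manual version: try/finally guarantees that close() runs.
manual_file = open(error_path, 'w')
try:
    manual_file.write('error')
    raise ValueError('simulated failure')  # stand-in for any runtime error
except ValueError:
    pass  # caught only so this sketch keeps running
finally:
    manual_file.close()  # runs whether or not an error occurred

# with version: same guarantee, less code.
try:
    with open(error_path, 'w') as with_file:
        with_file.write('error')
        raise ValueError('simulated failure')
except ValueError:
    pass

print(manual_file.closed, with_file.closed)  # True True

# Closing flushed the buffer, so the text reached the disk.
check = open(error_path)
saved_text = check.read()
check.close()
print(saved_text)  # error

shutil.rmtree(tmp_dir)  # clean up the throwaway directory
```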

* Context managers (used through `with` statements) implicitly invoke two special methods:
    * `__enter__()`: Defines setup actions when entering the context.
    * `__exit__()`: Defines cleanup actions when exiting the context.

In [None]:
class FileManager:
    def __init__(self, filename, mode):
        self.filename = filename
        self.mode = mode

    def __enter__(self):
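        # setup: open the file and hand it to the with block
        self.file = open(self.filename, self.mode)
        return self.file

    def __exit__(self, exc_type, exc_value, traceback):
        # cleanup: close the file, whether or not an error occurred
        self.file.close()
        return False  # False means any exception is not suppressed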

In [None]:
with FileManager("context.txt", "w") as f: # invokes FileManager.__enter__()
    f.write("Hello, World!")
# invokes FileManager.__exit__()
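
To watch the two special methods run, here is a sketch of a variant that records each step. `TracingFileManager` and its `events` list are hypothetical names introduced for this example, and the file is written to a temporary directory.

```python
import os
import shutil
import tempfile

class TracingFileManager:
    """Hypothetical variant of FileManager that records each step."""

    def __init__(self, filename, mode):
        self.filename = filename
        self.mode = mode
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        self.file = open(self.filename, self.mode)
        return self.file

    def __exit__(self, exc_type, exc_value, traceback):
        self.file.close()
        # exc_type is None when the block finished without an error.
        label = exc_type.__name__ if exc_type else 'no error'
        self.events.append('exit: ' + label)
        return False  # do not suppress the exception

tmp_dir = tempfile.mkdtemp()  # throwaway directory for this sketch
manager = TracingFileManager(os.path.join(tmp_dir, 'context.txt'), 'w')

try:
    with manager as f:
        f.write('Hello, World!')
        raise RuntimeError('simulated failure')
except RuntimeError:
    # __exit__ returned False, so the error reached this handler.
    manager.events.append('error propagated')

print(manager.events)       # ['enter', 'exit: RuntimeError', 'error propagated']
print(manager.file.closed)  # True

shutil.rmtree(tmp_dir)  # clean up the throwaway directory
```

Returning `True` from `__exit__()` would suppress the exception instead, which is rarely what you want for file handling.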