# Lecture 10 Text Files

Learning Objectives: 

* Use the os module to manipulate file paths
* Understand how to open, close, read, and write text files in Python.
* Know what a context manager is and why it’s useful.
* Use a context manager to handle text files safely.

## What's a Text File? 

* A file that stores data as **plain** text
* Examples:

| **Extension** | **File Type** | **What's Stored** | **Usage** |
|:---|:---|:---|:---|
| `.txt` | General Text | Unstructured plain text | Notes, simple data storage, documentation |
| `.md` | Markdown | Text with simple formatting syntax | Documentation, README files, formatted notes |
| `.csv` | Comma-Separated Values | Tabular data | Spreadsheets, data analysis, database export |
| `.json` | JavaScript Object Notation | Structured data in key-value pairs | Web APIs, config files, data transfer between applications |
| `.log` | Log File | Event messages, often with timestamps | Debugging, system event tracking, audit logs |

* Examples of files that are **not** plain text: `.pdf`, `.doc`, `.xls`

## File Paths
* Specify where a file is located on your machine
* **Absolute** Path: Full path from the root of the file system
    * Windows: C:\Users\Username\Documents\example.txt
    * macOS/Linux: /Users/Username/Documents/example.txt
* **Relative** Path: Path relative to the current working directory (where the script is running).
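To make the absolute/relative distinction concrete, here is a minimal sketch using `os.path.isabs()` and `os.path.abspath()`. The path `files/example.txt` is only an illustration and does not need to exist for this to run.

```python
import os

# Relative path used for illustration; the file itself is not opened
relative_path = os.path.join('files', 'example.txt')

# abspath() resolves a relative path against the current working directory
absolute_path = os.path.abspath(relative_path)

print(os.path.isabs(relative_path))   # False
print(os.path.isabs(absolute_path))   # True
print(absolute_path)                  # depends on where the code is running
```

The same relative path resolves to a different absolute path whenever the current working directory changes.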

In [None]:
# os is a built-in module that provides functions for interacting with the operating system,
# such as file and directory manipulation
import os
# returns the absolute path of the current working directory
os.getcwd()

In [None]:
# lists files and directories in the specified path
# (the current working directory if no path is given)
os.listdir()

In [None]:
# obtain files in the 'files' directory
# you can use absolute path
os.listdir('/Users/yiyinshen/Documents/CS/368/dev-cs368-python-fa24/10-text-files/???')

In [None]:
# or use a relative path
# you are currently at '/Users/yiyinshen/Documents/CS/368/dev-cs368-python-fa24/10-text-files'
# the path is resolved relative to the current working directory
os.listdir('files')

In [None]:
# relative path to example.txt in files directory
# you can use string concatenation
'files' + '/' + 'example.txt'

In [None]:
# joins one or more path components
# using the path separator of the current operating system
os.path.join('files', 'example.txt')

## Open, Close, Read, Write, and Create Files

### Open/close files
* Open with the syntax `open(file_path, mode)`, where the modes are
    * `'r'`: read (default)
    * `'w'`: write (truncates an existing file)
    * `'a'`: append
    * `'r+'`: read and write (no truncation)
    * `'w+'`: write and read (truncates an existing file)
    * `'a+'`: append and read
* On `open()`, Python asks the OS for a handle to the file and allocates an in-memory buffer for it, based on the mode
* **Always** `close()` to flush pending writes and free the handle and the allocated memory
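The difference between `'w'` and `'a'` can be sketched with a throwaway file. The directory and the file name `modes_demo.txt` below are made up for this illustration.

```python
import os
import shutil
import tempfile

# Work in a temporary directory so no real files are touched
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'modes_demo.txt')  # hypothetical file name

file = open(demo_path, 'w')   # creates the file (or empties an existing one)
file.write('first\n')
file.close()

file = open(demo_path, 'a')   # keeps existing content; writes go to the end
file.write('second\n')
file.close()

file = open(demo_path)        # 'r' is the default mode
content_after_append = file.read()
file.close()

file = open(demo_path, 'w')   # opening in 'w' again discards the old content
file.write('third\n')
file.close()

file = open(demo_path)
content_after_rewrite = file.read()
file.close()

print(repr(content_after_append))    # 'first\nsecond\n'
print(repr(content_after_rewrite))   # 'third\n'

shutil.rmtree(tmp_dir)  # clean up the temporary directory
```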

In [None]:
example_txt_path = os.path.join('files', 'example.txt')
file = open(example_txt_path)
# perform operations
file.close()

### Read files
* The OS caches (loads) file content from disk into a memory buffer, and reads are served from that buffer

In [None]:
file = open(example_txt_path)
# read() reads all contents as a single string.
content = file.read()
file.close()
content

In [None]:
content.split('\n')

In [None]:
file = open(example_txt_path)
# readline() reads one line at a time and returns '' at the end of the file.
line = file.readline()
while line:
    print(line)
    line = file.readline()
file.close()

In [None]:
file = open(example_txt_path)
# readlines() reads all lines into a list.
lines = file.readlines()
file.close()
lines
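Besides `readline()` and `readlines()`, a file object can be looped over directly, which yields one line per iteration. The sketch below creates its own sample file (the name `sample.txt` is made up), so it does not depend on `files/example.txt`.

```python
import os
import shutil
import tempfile

# Create a small sample file in a temporary directory
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, 'sample.txt')  # hypothetical file name
file = open(sample_path, 'w')
file.write('alpha\nbeta\ngamma\n')
file.close()

file = open(sample_path)
stripped_lines = []
for line in file:                             # one line per iteration
    stripped_lines.append(line.rstrip('\n'))  # drop the trailing newline
file.close()

print(stripped_lines)  # ['alpha', 'beta', 'gamma']

shutil.rmtree(tmp_dir)  # clean up the temporary directory
```

Each line keeps its trailing `'\n'`, which is why `rstrip('\n')` is applied here.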

### Write files 
* `write()` puts content into the memory buffer first, not directly on disk
* The buffer is flushed (written back) to disk when it fills up, when the file is closed, or by request (`flush()`)
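The buffering described above can be observed by checking the file size on disk before and after `flush()`. This is a minimal sketch that assumes default buffering and a write much smaller than the buffer. The file name `buffer_demo.txt` is made up.

```python
import os
import shutil
import tempfile

tmp_dir = tempfile.mkdtemp()
buffer_path = os.path.join(tmp_dir, 'buffer_demo.txt')  # hypothetical file name

file = open(buffer_path, 'w')
file.write('Line 1\n')                            # stays in the memory buffer
size_before_flush = os.path.getsize(buffer_path)  # nothing on disk yet

file.flush()                                      # push the buffer to disk
size_after_flush = os.path.getsize(buffer_path)
file.close()

print(size_before_flush)      # 0
print(size_after_flush > 0)   # True

shutil.rmtree(tmp_dir)  # clean up the temporary directory
```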

In [None]:
write_example_txt_path = os.path.join('files', 'write_example.txt')

In [None]:
# opening in 'w' mode truncates the file; opening in 'a' mode keeps it and appends
# write() writes a single string and does not add a newline
file = open(write_example_txt_path, 'w')
file.write("This is a new line of text.\n")
file.close()

In [None]:
lines = ["Line 1\n", "Line 2\n", "Line 3\n"]
file = open(write_example_txt_path, 'a')
# writelines() writes a list of strings all at once (it does not add newlines).
file.writelines(lines)
file.close()

In [None]:
file = open(write_example_txt_path, 'w')
file.write("Line 1\n")
file.write("Line 2\n")
# file.flush() # flush content from memory buffer to disk
# file.write("Line 3\n")
# file.close()

In [None]:
file.close()

### Hybrid modes

In [None]:
r_plus_example_txt_path = os.path.join('files', 'r_plus_example.txt')
file = open(r_plus_example_txt_path, 'r+')
content = file.readline()   # Read existing content
file.write("\nNew line appended.")   # Write new content; for a small file like this it lands at the end
file.close()
content

In [None]:
w_plus_example_txt_path = os.path.join('files', 'w_plus_example.txt')
file = open(w_plus_example_txt_path, 'w+')   # Opening in 'w+' truncates existing content in the file
content_before_write = file.read()   # Empty string: the file was just truncated
file.write("New line appended.")
# file.seek(0) # Move pointer to the beginning of the file
content_after_write = file.read()   # Empty unless seek(0) is called first
file.close()
content_before_write, content_after_write

In [None]:
a_plus_example_txt_path = os.path.join('files', 'a_plus_example.txt')
file = open(a_plus_example_txt_path, 'a+')
content_before_seek = file.read()   # Existing content is not truncated/overwritten, but the pointer starts at the end of the file
# file.seek(0)
content_after_seek = file.read()
file.write("\nNew line appended.")   # Write new content at the end
file.close()
content_before_seek, content_after_seek
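The role of the file pointer in the hybrid modes can be sketched with `seek()` on a throwaway file opened in `'w+'`. The file name `pointer_demo.txt` is made up for this illustration.

```python
import os
import shutil
import tempfile

tmp_dir = tempfile.mkdtemp()
pointer_path = os.path.join(tmp_dir, 'pointer_demo.txt')  # hypothetical file name

file = open(pointer_path, 'w+')
file.write('hello\n')              # the pointer is now at the end of the file
read_without_seek = file.read()    # nothing left to read after the pointer

file.seek(0)                       # move the pointer back to the beginning
read_after_seek = file.read()      # now the written content is returned
file.close()

print(repr(read_without_seek))   # ''
print(repr(read_after_seek))     # 'hello\n'

shutil.rmtree(tmp_dir)  # clean up the temporary directory
```

Reading always starts from the current pointer position, so an empty result usually means the pointer is already at the end.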

### Create files


In [None]:
# r, r+ mode: FileNotFoundError
file = open('nonexist.txt', 'r+')
file.close()

In [None]:
# w, a, w+, a+ modes: the file is automatically created if it doesn’t exist
file = open('nonexist.txt', 'a')
file.close()
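The two cases above can be put side by side: catching the `FileNotFoundError` raised by `'r'` mode, then letting `'a'` mode create the file. This is a sketch using a temporary directory, and the file name `missing.txt` is made up.

```python
import os
import shutil
import tempfile

tmp_dir = tempfile.mkdtemp()
missing_path = os.path.join(tmp_dir, 'missing.txt')  # hypothetical file name

# 'r' mode refuses to open a file that does not exist
try:
    file = open(missing_path, 'r')
    file.close()
    read_failed = False
except FileNotFoundError:
    read_failed = True

exists_before = os.path.exists(missing_path)

# 'a' mode creates the file when it is missing
file = open(missing_path, 'a')
file.close()
exists_after = os.path.exists(missing_path)

print(read_failed)     # True
print(exists_before)   # False
print(exists_after)    # True

shutil.rmtree(tmp_dir)  # clean up the temporary directory
```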

## Context Manager

* Context managers automatically close files after operations, even if an error occurs.

In [None]:
# if the code crashes before the file is closed,
# what has been written stays in the memory buffer and is not flushed to disk
file = open('error.txt', 'w')
file.write('error')
assert 1 == 2
file.close()

In [None]:
file.close()

In [None]:
with open('error.txt', 'w') as file:
    # enter the context manager
    file.write('error')
    assert 1 == 2
# exit the context manager
# file is closed automatically even if there's an error
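One way to check this claim is to catch the error and inspect the file afterwards. The sketch below raises a `ValueError` as a stand-in for any failure and writes to a throwaway file whose name, `with_demo.txt`, is made up.

```python
import os
import shutil
import tempfile

tmp_dir = tempfile.mkdtemp()
with_demo_path = os.path.join(tmp_dir, 'with_demo.txt')  # hypothetical file name

error_caught = False
try:
    with open(with_demo_path, 'w') as file:
        file.write('error')
        raise ValueError('simulated failure')  # stand-in for any error
except ValueError:
    error_caught = True

is_closed = file.closed   # the with statement closed the file on the way out

# Closing flushed the buffer, so the text reached the disk
with open(with_demo_path) as saved_file:
    saved_content = saved_file.read()

print(error_caught)    # True
print(is_closed)       # True
print(saved_content)   # error

shutil.rmtree(tmp_dir)  # clean up the temporary directory
```

The exception still propagates out of the `with` block; the context manager only guarantees that the file is closed first.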

* Context managers (used through `with` statements) implicitly invoke two special methods
    * `__enter__()`: Defines setup actions when entering the context.
    * `__exit__()`: Defines cleanup actions when exiting the context.

In [None]:
class FileManager:
    def __init__(self, filename, mode):
        self.filename = filename
        self.mode = mode

    def __enter__(self):
        self.file = open(self.filename, self.mode)
        return self.file

    def __exit__(self, exc_type, exc_value, traceback):
        self.file.close()
        return False  # False means exceptions raised inside the with block are not suppressed

In [None]:
with FileManager("context.txt", "w") as f: # invokes FileManager.__enter__()
    f.write("Hello, World!")
# invokes FileManager.__exit__()