# What is File Handling?

File handling in Python lets us **read from** and **write to** files stored on our computer or server. This is essential for working with datasets, saving model outputs, logging results, and processing large amounts of data that live on disk rather than in our program’s memory.

In AI/ML, file handling is everywhere: from loading training data in `.txt` or `.csv` files, to saving model predictions or checkpoints. We will often interact with raw data stored in files, clean or transform it, then save our results for later use or sharing.

Python provides simple and powerful built-in functions to open files in different modes (read, write, append, etc.), process them line-by-line or in bulk, and close them safely to avoid data loss.

Mastering these basics prepares us to handle more complex tasks later, like working with structured data using Pandas, handling JSON files from APIs, or saving binary files like model weights.

### File Modes

| Mode | Description |
| --- | --- |
| `'r'` | Read (default) |
| `'w'` | Write (overwrite existing) |
| `'a'` | Append (add to file end) |
| `'x'` | Create new file (error if exists) |
| `'rb'` | Read binary |
| `'wb'` | Write binary |

### File Handling Concepts & Syntax

**Open a file:** The built-in `open()` function opens a file. It requires the filename and optionally takes a mode:

- `'r'` for reading (default mode)
- `'w'` for writing (creates a new file or truncates an existing one)
- `'a'` for appending
- `'x'` for exclusive creation (fails if file exists)

In [None]:
file = open('data.txt', 'r')  # Open file in read mode
content = file.read()         # Read the entire content of the file as a string
print(content)
file.close()                  # Close the file to free resources

# Output: Prints the entire content of 'data.txt' as a string.
# (Example: "This is the content of data.txt file.")

**Explanation:**

- `open()` returns a file object that we can operate on.
- `file.read()` reads the whole file content.
- `file.close()` is important to release system resources.
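
To see what the file object reports before and after `close()`, we can inspect its `mode` and `closed` attributes. This is a minimal sketch: it writes a throwaway file in a temporary directory first, and the file name `example.txt` is made up for illustration.

```python
import os
import shutil
import tempfile

# Create a small throwaway file so the example does not depend on 'data.txt'.
# 'example.txt' is a hypothetical file name.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'example.txt')
file = open(path, 'w')
file.write("hello")
file.close()

file = open(path, 'r')
mode_seen = file.mode             # 'r'
closed_before = file.closed       # False: the file is still open
content = file.read()
file.close()
closed_after = file.closed        # True: close() released the file

print(mode_seen, closed_before, closed_after, content)  # r False True hello
shutil.rmtree(tmp_dir)            # Remove the temporary directory
```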

**Open a File Using the `with` Statement:** The `with` statement is a context manager that **automatically handles opening and closing the file**, even if errors occur inside the block. This makes our code safer and cleaner.

In [None]:
with open('data.txt', 'r') as file:
    content = file.read()
    print(content)
# File is automatically closed here

# Output: Prints the entire content of 'data.txt'.
# File is automatically closed after the block.

**Why use `with`?**

- Ensures files are closed properly.
- Avoids resource leaks.
- Cleaner syntax.
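
The claim that `with` closes the file even when an error occurs can be checked directly. The sketch below raises a deliberate `ValueError` inside the block and then inspects `file.closed`; the temporary file and its name are assumptions made for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example.txt')  # hypothetical file name
    with open(path, 'w') as file:
        file.write("some data\n")

    try:
        with open(path, 'r') as file:
            first_line = file.readline()
            raise ValueError("simulated failure while processing")
    except ValueError as error:
        print("Caught:", error)

    # The with block closed the file even though an exception was raised
    closed_after_error = file.closed
    print(closed_after_error)  # True
```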

**Read File Line by Line:** Sometimes files are too large to read at once or we want to process line-by-line.

In [None]:
with open('data.txt', 'r') as file:
    for line in file:
        print(line.strip())  # strip() removes leading and trailing whitespace, including the newline

# Output: Prints each line of 'data.txt' without its trailing newline, so no blank lines appear between them.
# (Example:
# Line 1 of file
# Line 2 of file
# Line 3 of file
# )
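
One detail worth knowing: `strip()` with no arguments removes all leading and trailing whitespace, not only the newline. If indentation or trailing spaces matter, `rstrip('\n')` removes just the newline. A small sketch with a made-up sample line:

```python
line = "    indented value  \n"   # made-up sample line

stripped = line.strip()           # all surrounding whitespace removed
newline_only = line.rstrip('\n')  # only the trailing newline removed

print(repr(stripped))      # 'indented value'
print(repr(newline_only))  # '    indented value  '
```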

**Write to a File (Overwrite Mode):** Writing mode `'w'` creates a new file or **overwrites** existing content.

In [None]:
with open('output.txt', 'w') as file:
    file.write("Hello AI World!\n")
    file.write("Writing to files is easy.\n")
    
# Output: Creates or overwrites 'output.txt' with two lines:
# Hello AI World!
# Writing to files is easy.

**Append to a File (Add Content):** Appending mode `'a'` adds new content to the **end of the file** without erasing existing data.

In [None]:
with open('output.txt', 'a') as file:
    file.write("This line is appended.\n")
    
# Output: Adds the line "This line is appended." to the end of 'output.txt' without deleting existing content.
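
To compare `'w'` and `'a'` side by side, we can write to the same file several times and read the result back. This sketch uses a temporary directory so it does not touch a real `output.txt`; the "old line" text is made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'output.txt')

    with open(path, 'w') as file:
        file.write("old line\n")
    with open(path, 'w') as file:       # a second 'w' discards "old line"
        file.write("Hello AI World!\n")
    with open(path, 'a') as file:       # 'a' keeps what is already there
        file.write("This line is appended.\n")

    with open(path, 'r') as file:
        final_content = file.read()

print(final_content)
# Hello AI World!
# This line is appended.
```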

**Read All Lines as a List:** `.readlines()` reads the entire file and returns a list where each element is a line (including newline characters).

In [None]:
with open('data.txt', 'r') as file:
    lines = file.readlines()
    print(lines)
    
# Output: Prints a list of all lines including newline characters.
# Example:
# ['Line 1 of file\n', 'Line 2 of file\n', 'Line 3 of file\n']
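
If we want the lines without their newline characters, one option is `read().splitlines()`. The sketch below compares both approaches on a small temporary file whose content mirrors the example above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.txt')
    with open(path, 'w') as file:
        file.write("Line 1 of file\nLine 2 of file\nLine 3 of file\n")

    with open(path, 'r') as file:
        lines_with_newlines = file.readlines()
    with open(path, 'r') as file:
        lines_clean = file.read().splitlines()

print(lines_with_newlines)  # ['Line 1 of file\n', 'Line 2 of file\n', 'Line 3 of file\n']
print(lines_clean)          # ['Line 1 of file', 'Line 2 of file', 'Line 3 of file']
```

Both approaches load the whole file into memory, so they suit small and medium files.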

**Write a List of Strings to a File:** We can write multiple lines at once using `.writelines()`. Note that it does not add newline characters, so we need to include `\n` ourselves.

In [None]:
lines = ["Line 1\n", "Line 2\n", "Line 3\n"]
with open('output.txt', 'w') as file:
    file.writelines(lines)
    
# Output: Creates or overwrites 'output.txt' with the three lines exactly as given.
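
What happens when the strings have no newline characters? The sketch below shows that `.writelines()` runs them together, and one way to add the newlines while writing. The list values and the file name are made up for illustration.

```python
import os
import tempfile

names = ["alpha", "beta", "gamma"]  # made-up values without newline characters

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'names.txt')  # hypothetical file name

    with open(path, 'w') as file:
        file.writelines(names)                 # no newlines are added for us
    with open(path, 'r') as file:
        run_together = file.read()

    with open(path, 'w') as file:
        file.writelines(name + "\n" for name in names)  # add '\n' to each item
    with open(path, 'r') as file:
        one_per_line = file.read()

print(repr(run_together))  # 'alphabetagamma'
print(repr(one_per_line))  # 'alpha\nbeta\ngamma\n'
```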

**Handling Errors with Try/Except/Finally:** File operations may raise errors (like file not found). We handle exceptions to avoid crashes and always close files safely.

In [None]:
file = None                       # Lets 'finally' know whether open() succeeded
try:
    file = open('data.txt', 'r')
    data = file.read()
    print(data)
except FileNotFoundError:
    print("File not found!")
finally:
    if file is not None:          # Only close the file if it was actually opened
        file.close()

# Output:
# - Prints the contents of 'data.txt' if it exists.
# - Prints "File not found!" if the file doesn't exist.
# The 'finally' block always runs and closes the file if it was opened.
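
Error handling also combines well with the `with` statement, which removes the need for a `finally` block. A minimal sketch, assuming a helper named `read_text_or_none` (a hypothetical name, not a built-in):

```python
import os
import tempfile

def read_text_or_none(path):
    """Return the file's text, or None if the file does not exist."""
    try:
        with open(path, 'r') as file:   # closed automatically, even on errors
            return file.read()
    except FileNotFoundError:
        print("File not found!")
        return None

with tempfile.TemporaryDirectory() as tmp_dir:
    existing_path = os.path.join(tmp_dir, 'data.txt')
    missing_path = os.path.join(tmp_dir, 'missing.txt')  # never created
    with open(existing_path, 'w') as file:
        file.write("ok")

    existing_result = read_text_or_none(existing_path)
    missing_result = read_text_or_none(missing_path)

print(existing_result)  # ok
print(missing_result)   # None
```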

**Working with Binary Files (e.g., images):** Open files in binary mode (`'rb'` or `'wb'`) when working with images, audio, or any non-text data.

In [None]:
with open('image.jpg', 'rb') as file:
    data = file.read()
    print(type(data))  # <class 'bytes'>
    
# Output: Prints "<class 'bytes'>" indicating that the file data is read as bytes (binary data).
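
Writing in `'wb'` mode works the same way in reverse: we pass a `bytes` object instead of a string. The sketch below round-trips a few arbitrary byte values through a temporary file; the file name `blob.bin` is made up.

```python
import os
import tempfile

payload = bytes([0, 255, 16, 32])   # arbitrary made-up byte values

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'blob.bin')  # hypothetical file name

    with open(path, 'wb') as file:
        file.write(payload)          # binary mode expects bytes, not str
    with open(path, 'rb') as file:
        data = file.read()

print(type(data))       # <class 'bytes'>
print(data == payload)  # True
print(list(data))       # [0, 255, 16, 32]
```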

**Creating a New File (Fail if Exists):** Mode `'x'` creates a new file but raises a `FileExistsError` if the file already exists, which is useful for avoiding accidental overwrites.

In [None]:
with open('newfile.txt', 'x') as file:
    file.write("Created a new file!")
    
# Output:
# - Creates 'newfile.txt' and writes the text if it doesn't exist.
# - Raises FileExistsError if 'newfile.txt' already exists.
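
In practice we usually catch the `FileExistsError` rather than let the program stop. This sketch tries to create the same file twice inside a temporary directory and records what happened each time.

```python
import os
import tempfile

messages = []
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'newfile.txt')

    for attempt in (1, 2):
        try:
            with open(path, 'x') as file:
                file.write("Created a new file!")
            messages.append("created")
        except FileExistsError:
            # The second attempt lands here; the existing file is untouched
            messages.append("already exists")

print(messages)  # ['created', 'already exists']
```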

**Reading Large Files Efficiently Using Generators:** For very large files, we use a generator function to read line by line lazily without loading the entire file into memory.

In [None]:
def read_large_file(file_path):
    with open(file_path, 'r') as file:
        for line in file:
            yield line.strip()

for line in read_large_file('large_data.txt'):
    print(line)
   
# Output: Prints each line from 'large_data.txt' one by one, stripped of leading and trailing whitespace.
# Efficient for large files as it doesn't load the whole file into memory.

**Note:**  `yield` is used in a Python function to make it a **generator**. Instead of returning all data at once, the function **produces one item at a time**, pausing its state between each item. This helps us **read large files efficiently** by processing one line at a time without loading the entire file into memory.
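
The pausing behaviour can be observed by pulling items manually with `next()`. The sketch below reuses `read_large_file` on a tiny temporary file (the content is made up) to show that lines are produced only when requested.

```python
import os
import tempfile

def read_large_file(file_path):
    with open(file_path, 'r') as file:
        for line in file:
            yield line.strip()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'large_data.txt')
    with open(path, 'w') as file:
        file.write("first\nsecond\nthird\n")

    lines = read_large_file(path)   # creates the generator; nothing is read yet
    first = next(lines)             # runs the function up to the first yield
    second = next(lines)            # resumes where it paused
    remaining = list(lines)         # consumes whatever is left

print(first, second, remaining)  # first second ['third']
```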

**Handling File Paths Using the `os` Module:** For cross-platform compatibility, we use `os.path.join()` to build file paths correctly.

In [None]:
import os

path = os.path.join('folder', 'data.txt')
with open(path, 'r') as file:
    print(file.read())
    
# Output: Prints the content of the file 'data.txt' located inside 'folder' directory.
# os.path.join() ensures the correct path formatting across operating systems.
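
To see what `os.path.join()` actually produces, we can compare its result with the platform's separator, `os.sep`. The sketch also checks `os.path.exists()` before opening, which is one simple way to avoid an error when `folder/data.txt` is not present.

```python
import os

path = os.path.join('folder', 'data.txt')

# The parts are joined with the separator of the current operating system
uses_platform_separator = (path == 'folder' + os.sep + 'data.txt')
print(uses_platform_separator)  # True

if os.path.exists(path):
    with open(path, 'r') as file:
        print(file.read())
else:
    print("No such file:", path)
```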

### Exercises

Q1. Read a text file and print its entire content.

In [None]:
with open('text.txt', 'r') as file:
    print(file.read())

# Output: Prints entire content of 'text.txt'.

Q2. Write three lines of text to a new file.

In [None]:
lines = ["Sujit\n", "Chaudhary\n", "AI/ML\n"]
with open('output.txt', 'w') as file:
    file.writelines(lines)
    
# Output: Creates/overwrites 'output.txt' with 3 lines: Sujit, Chaudhary, AI/ML.

Q3. Append a new line of text to an existing file.

In [None]:
with open('text.txt', 'a') as file:
    file.write("appended.\n")
	
# Output: Adds a line "appended." at the end of 'text.txt'.

Q4. Count and print the number of lines in a file.

In [None]:
count = 0
with open('text.txt', 'r') as file:
    for _ in file:
        count += 1
print("Number of lines:", count)

# Output: Prints the number of lines present in 'text.txt'.

### Summary

File handling is a fundamental skill for AI/ML practitioners to load, process, and save data efficiently. Whether working with raw datasets, saving predictions, or managing logs, knowing how to read and write files properly helps us build scalable and robust workflows.

The `with` statement is the recommended way to manage files safely, because it closes them for us even if we forget or an error occurs.

Once we master these basics, we’ll be ready to handle structured data with Pandas, JSON from APIs, and even save machine learning models to disk for reuse.