# Python File Handling & Text Manipulation

Welcome to a **beginner-friendly** tutorial on **File Handling & Text Manipulation** in Python, with a short section on **AI** at the end.
In this notebook, we'll cover:

1. **Basic File I/O Modes**: `r`, `w`, `a`, etc.
2. **Reading**, **Writing**, and **Appending** text.
3. **Reading lines** vs. reading the entire file.
4. **Text Manipulation** techniques: split, strip, search, replace, and simple regex usage.
5. **Exercises** to reinforce learning.
6. A final section on how to **use AI** to implement similar code.

Let's get started!

## 1. Python File Modes Overview

When you open a file with Python's built-in `open()` function, you specify a **mode**:

- **`r`** (read): Opens a file for reading. This is the default mode. If the file does not exist, `FileNotFoundError` is raised.
- **`w`** (write): Opens a file for writing. **Overwrites** the file if it exists, or **creates** a new one if it doesn't.
- **`a`** (append): Opens a file in **append** mode. Written data is always added to the end of the file, and the file is **created** if it doesn't exist.
- **`x`** (exclusive creation): Creates a file but **fails** if it already exists.
- **`r+`** (read/write): Opens an existing file for both reading and writing without truncating it. Raises `FileNotFoundError` if the file does not exist.
- **`w+`** (write/read): Like `w` but also allows reading. Overwrites if file exists.
- **`a+`** (append/read): Like `a` but also allows reading from the file.
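
The behaviour of the most common modes can be checked with a small experiment. The sketch below is illustrative only: it works in a temporary directory, and the file name `modes_demo.txt` is a made-up example, not part of this notebook's files.

```python
import os
import tempfile

# Work inside a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes_demo.txt")  # hypothetical file name

    # 'r' on a missing file raises FileNotFoundError.
    try:
        open(path, "r", encoding="utf-8")
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True

    # 'x' creates the file, then fails when asked to create it again.
    with open(path, "x", encoding="utf-8") as f:
        f.write("first\n")
    try:
        open(path, "x", encoding="utf-8")
        exists_raised = False
    except FileExistsError:
        exists_raised = True

    # 'a' keeps the existing text and adds to the end.
    with open(path, "a", encoding="utf-8") as f:
        f.write("second\n")
    with open(path, "r", encoding="utf-8") as f:
        after_append = f.read()

    # 'w' truncates the file before writing.
    with open(path, "w", encoding="utf-8") as f:
        f.write("replaced\n")
    with open(path, "r", encoding="utf-8") as f:
        after_write = f.read()

print(missing_raised, exists_raised)
print(repr(after_append))
print(repr(after_write))
```

Catching the specific exception types (`FileNotFoundError`, `FileExistsError`) rather than a bare `except` keeps unrelated errors visible.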

## 2. Basic Reading & Writing Functions
Below are simple helper functions that read, write, and append text in **UTF-8**. 

1. **read_utf8_file**: returns the entire file content as a **single string**.
2. **write_utf8_file**: **overwrites** a file with new content.
3. **append_utf8_file**: **appends** content to the end of a file (creating it if it doesn’t exist).
4. **read_lines_utf8_file**: returns a list of lines instead of one big string.


In [None]:
def read_utf8_file(file_path):
    """
    Reads a file as UTF-8-encoded text, returning its content as a string.
    """
    # The with-statement closes the file even if read() raises an exception.
    with open(file_path, 'r', encoding='utf-8') as file:
        return file.read()

def write_utf8_file(file_path, content):
    """
    Writes a string to a file in UTF-8 encoding.
    Overwrites if the file already exists.
    """
    with open(file_path, 'w', encoding='utf-8') as file:
        file.write(content)

def append_utf8_file(file_path, content):
    """
    Appends a string to a file in UTF-8 encoding.
    Creates the file if it does not exist.
    """
    with open(file_path, 'a', encoding='utf-8') as file:
        file.write(content)

def read_lines_utf8_file(file_path):
    """
    Reads a file as UTF-8-encoded text, returning its content as a list of lines.
    Each line keeps its trailing newline character, if it has one.
    """
    with open(file_path, 'r', encoding='utf-8') as file:
        return file.readlines()

# Let's create a simple text file for demonstration.
write_utf8_file("demo.txt", "Hello World!\nThis is a demo file.\nLine 3.")
print("demo.txt created with sample content.")

## 3. Demonstrating Different Modes
### 3.1 Reading (`r`)
If the file exists, reading is straightforward:


In [None]:
# Let's read our demo.txt
content = read_utf8_file("demo.txt")
print("CONTENT OF demo.txt:\n" + content)

### 3.2 Writing (`w`)
Opening a file in **write** mode overwrites it if it exists, or **creates** it if not.

In [None]:
# Overwrite the content of 'demo.txt'
write_utf8_file("demo.txt", "New content here.\nOverwritten!")

# Let's read again to confirm
new_content = read_utf8_file("demo.txt")
print("After overwriting, demo.txt contains:\n" + new_content)

### 3.3 Appending (`a`)
Appending will place new text at the **end** of the file, creating it if it doesn’t exist.

In [None]:
# Append a new line
append_utf8_file("demo.txt", "\nAppending a new line!")

# Confirm the appended content
appended_content = read_utf8_file("demo.txt")
print("After appending, demo.txt contains:\n" + appended_content)
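
Note that `write()` never adds a line break on its own, which is why the appended text above starts with `"\n"`. One way to avoid thinking about this each time is a small wrapper that always ends the appended text with a newline. This is a minimal sketch: `append_line` and the file name `log_demo.txt` are hypothetical names introduced for illustration.

```python
import os
import tempfile

def append_line(file_path, line):
    """Hypothetical helper: append `line` and terminate it with one newline."""
    with open(file_path, "a", encoding="utf-8") as f:
        # Drop any newline the caller already supplied, then add exactly one.
        f.write(line.rstrip("\n") + "\n")

with tempfile.TemporaryDirectory() as tmp_dir:
    log_path = os.path.join(tmp_dir, "log_demo.txt")  # hypothetical file name
    for entry in ["started", "working", "finished"]:
        append_line(log_path, entry)
    with open(log_path, "r", encoding="utf-8") as f:
        log_text = f.read()

print(repr(log_text))
```

Ending every record with a newline, rather than starting each one with it, means the file never begins with an empty line.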

### 3.4 Reading lines (`readlines()`)
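Sometimes, you need to process a file **line by line**. `readlines()` returns a list of strings, one per line, and each string keeps its trailing newline character (`\n`) if the line has one.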

In [None]:
# Let's show how read_lines_utf8_file works
lines = read_lines_utf8_file("demo.txt")
print("Type of 'lines':", type(lines))
print("Number of lines:", len(lines))
print("Lines:", lines)
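
`readlines()` loads every line into memory at once. For large files, a common alternative is to loop over the file object itself, which yields one line at a time. The sketch below assumes a hypothetical helper `count_nonblank_lines` and a temporary file created only for the demonstration.

```python
import os
import tempfile

def count_nonblank_lines(file_path):
    """Hypothetical helper: count lines that contain more than whitespace."""
    count = 0
    with open(file_path, "r", encoding="utf-8") as f:
        for line in f:  # yields one line at a time, newline included
            if line.strip():
                count += 1
    return count

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "lines_demo.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write("alpha\n\nbeta\n   \ngamma")
    nonblank = count_nonblank_lines(path)

print(nonblank)
```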

## 4. Text Manipulation in Files

Often, you’ll **read** text from a file, **manipulate** it, and then **write** or **append** it. Below are some common operations:

1. **Splitting** lines or paragraphs.
2. **Stripping** whitespace.
3. **Replacing** substrings.
4. **Searching** for patterns, sometimes using **regex**.

### 4.1 Splitting
- `str.split()` without arguments splits on **any whitespace**.
- `str.split('\n')` splits explicitly on newlines.
- `str.split('\n\n')` might help isolate paragraphs if they're separated by **blank lines**.

In [None]:
# Example of splitting on newlines
demo_text = read_utf8_file("demo.txt")
split_lines = demo_text.split("\n")
print("Split by newlines:", split_lines)
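
The split variants behave differently when the text ends with a newline or contains blank lines. The following sketch uses a made-up two-paragraph string to compare them, and also shows `str.splitlines()`, which is not covered above but avoids the trailing empty string that `split("\n")` produces.

```python
text = "First paragraph, line one.\nFirst paragraph, line two.\n\nSecond paragraph.\n"

# No argument: split on any run of whitespace, so blank lines vanish.
words = text.split()

# Explicit "\n": the final newline leaves an empty string at the end.
by_newline = text.split("\n")

# splitlines(): one entry per line, no trailing empty string.
by_splitlines = text.splitlines()

# Strip first so the trailing newline does not end up in the last paragraph.
paragraphs = text.strip().split("\n\n")

print(len(words), len(by_newline), len(by_splitlines))
print(paragraphs)
```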

### 4.2 Strip, Replace
- **strip()** removes leading and trailing whitespace.
- **replace(old, new)** replaces occurrences of one substring with another.

In [None]:
line_example = "   Hello, world!   "
stripped_line = line_example.strip()
print("Original:", repr(line_example))
print("Stripped:", repr(stripped_line))

replaced_line = stripped_line.replace("world", "Python")
print("Replaced:", replaced_line)
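
A few related variations are worth knowing: `lstrip()` and `rstrip()` trim only one side, `strip()` accepts a set of characters to remove, and `replace()` is case-sensitive and takes an optional count. The sample string below is invented for illustration.

```python
raw = "\t  python, Python, python!  \n"

left_only = raw.lstrip()      # trims leading whitespace only
right_only = raw.rstrip()     # trims trailing whitespace only
trimmed = raw.strip()         # trims both sides

# strip() with an argument removes those characters from both ends.
no_bang = trimmed.strip("!")

# replace() is case-sensitive; the third argument limits the number of replacements.
first_only = trimmed.replace("python", "snake", 1)

# Lowercasing first makes the replacement effectively case-insensitive.
all_lower = trimmed.lower().replace("python", "snake")

print(repr(trimmed))
print(first_only)
print(all_lower)
```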

### 4.3 Searching & Regex
For more complex searches (like patterns), Python’s built-in **`re`** module helps. Example usage:
```python
import re
result = re.findall(r"\b\w+\b", "Hello, world!")  # find all words
```

We'll show a brief example below.

In [None]:
import re

sample_text = "Email me at test@example.com or admin@example.org."  
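# Let's find all email addresses (simplified pattern, not a full validator).
# Each dot in the domain must be followed by more characters, so the
# sentence's final period is not captured as part of the last address.
emails = re.findall(r"[\w.-]+@[\w-]+(?:\.[\w-]+)+", sample_text)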
print("Found emails:", emails)
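
Besides `re.findall`, the `re` module offers `search` (first match only), `sub` (search and replace), and `finditer` (match objects with groups). The sketch below reuses the sample sentence with a simplified email pattern. The pattern is an assumption chosen for illustration and is not a complete email validator.

```python
import re

sample_text = "Email me at test@example.com or admin@example.org."

# Simplified pattern for illustration; real email syntax is more permissive.
email_pattern = re.compile(r"[\w.-]+@[\w-]+(?:\.[\w-]+)+")

# search() returns the first match object, or None if nothing matches.
first_match = email_pattern.search(sample_text)
first_email = first_match.group() if first_match else None

# sub() replaces every match with the given text.
masked = email_pattern.sub("<hidden>", sample_text)

# finditer() yields match objects; group(1) is the parenthesised domain part.
domains = [m.group(1) for m in re.finditer(r"@([\w-]+(?:\.[\w-]+)+)", sample_text)]

print(first_email)
print(masked)
print(domains)
```

Compiling the pattern once with `re.compile` is convenient when the same expression is used for several operations.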

## 5. Exercises

### Exercise 5.1: Reading & Transforming Lines
1. **Write** a file called `exercise_input.txt` with at least 3 lines of text.
2. **Read** the file line by line.
3. For each line:
   - **strip** whitespace,
   - convert to **lowercase**,
   - **replace** any occurrence of the word "python" with "snake" (just for fun).
4. **Print** each transformed line.

### Exercise 5.2: Splitting Paragraphs
1. Create or modify `exercise_input.txt` so it has **two paragraphs** separated by a blank line.
2. **Read** the file in one go.
3. Split on double-newlines (`"\n\n"`).
4. Print each paragraph on a separate line.

### Exercise 5.3: Simple Regex Search
1. Use the `re` module to **find** all words that begin with a capital letter.
2. Print them out.

Below is a skeleton for **Exercise 5.1**. Replace the `# your code here` comment with your own code.

In [None]:
# EXERCISE 5.1 SKELETON
def exercise_read_and_transform(file_path):
    """
    1. Read the file line by line.
    2. For each line: strip whitespace, lower, replace 'python' -> 'snake'.
    3. Print the transformed line.
    """
    lines = read_lines_utf8_file(file_path)

    for line in lines:
        # your code here
        pass  # placeholder so the cell runs; remove it once you add your code

# Try calling exercise_read_and_transform("exercise_input.txt") after you create the file!
# exercise_read_and_transform("exercise_input.txt")

## 6. Using AI to Implement Similar Code

Now that you understand the fundamentals of file handling and text manipulation, you can leverage **AI** to generate or refactor your code. Below is an example **prompt** you could provide to an AI tool (like ChatGPT, GitHub Copilot, etc.) to create a function that does the reading, line-by-line transformation, and printing.

### AI Prompt (Comment)
```
# Please generate a Python function named 'process_file' that:
# 1. Reads 'exercise_input.txt' in UTF-8.
# 2. Splits the file by lines.
# 3. For each line, strips whitespace, converts to lowercase, and replaces 'python' with 'snake'.
# 4. Prints the transformed lines.
# 5. Uses 'open()' with 'r' mode.
```

_Below is an example of what the AI might produce._

In [15]:
# (Example) AI-Generated Implementation
def process_file():