## File Operation - Read and Write Files 

File handling is a crucial part of any programming language. Python provides built-in functions and methods to read from and write to files, both text and binary. This notebook covers the basics of file handling, including reading and writing text files and binary files.

In [3]:
# This example reads the whole file into a single string

with open("text_to_read.txt", "r") as file:
    file_contents = file.read()
    print(f"Reading the content of the file: \n\n {file_contents}")

Reading the content of the file: 

 This is a sample text file which will be used to for illustration purpose
--------------------------------------------------------------------------------

Python is a powerful, high-level programming language valued for its simplicity, flexibility, and extensive ecosystem. 
Its clean syntax enables rapid development, while dynamic typing and vast libraries make it highly adaptable. 
Python excels in automation, data analysis, web applications, and especially artificial intelligence and machine learning,
thanks to frameworks like TensorFlow, PyTorch, and Scikit-learn. It supports object-oriented, functional,
and procedural paradigms, making it versatile for diverse use cases. With strong community support, seamless integration 
with other technologies, and cross-platform portability, Python empowers developers to build efficient, scalable, 
and innovative solutions across AI, data science, and enterprise applications.


### Let's look at several ways to read the file

#### 1. Using `for` loop directly on file object (most Pythonic ✅)

In [6]:
with open("text_to_read.txt", "r") as f:
    for line in f: # reads the file line by line
        print(line.strip()) # strip() returns a copy of the string with leading and trailing whitespace (including the newline) removed

This is a sample text file which will be used to for illustration purpose
--------------------------------------------------------------------------------

Python is a powerful, high-level programming language valued for its simplicity, flexibility, and extensive ecosystem.
Its clean syntax enables rapid development, while dynamic typing and vast libraries make it highly adaptable.
Python excels in automation, data analysis, web applications, and especially artificial intelligence and machine learning,
thanks to frameworks like TensorFlow, PyTorch, and Scikit-learn. It supports object-oriented, functional,
and procedural paradigms, making it versatile for diverse use cases. With strong community support, seamless integration
with other technologies, and cross-platform portability, Python empowers developers to build efficient, scalable,
and innovative solutions across AI, data science, and enterprise applications.


##### Advantages
- Efficient: doesn’t load the whole file into memory.
- Best for large files.
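
As a small illustration of why line-by-line iteration suits large files, the sketch below counts lines and finds the first matching line without loading the whole file. The file name `big.txt` and its contents are hypothetical and created in a temporary folder only for this example.

```python
import os
import tempfile

# Hypothetical sample file, created only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "big.txt")
with open(path, "w") as f:
    for i in range(1000):
        f.write(f"row {i}\n")

# Count lines while holding only one line in memory at a time.
with open(path, "r") as f:
    line_count = sum(1 for _ in f)

# enumerate() adds line numbers; break stops reading as soon as we are done.
first_match = None
with open(path, "r") as f:
    for number, line in enumerate(f, start=1):
        if line.strip() == "row 41":
            first_match = number
            break

print(line_count, first_match)
```

Because the loop stops at the first match, the rest of the file is never read.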



#### 2. Using `while` loop with `readline()`

In [9]:
with open('text_to_read.txt', 'r') as f:
    line = f.readline()
    while line:
        print(line.strip())
        line = f.readline()

This is a sample text file which will be used to for illustration purpose
--------------------------------------------------------------------------------

Python is a powerful, high-level programming language valued for its simplicity, flexibility, and extensive ecosystem.
Its clean syntax enables rapid development, while dynamic typing and vast libraries make it highly adaptable.
Python excels in automation, data analysis, web applications, and especially artificial intelligence and machine learning,
thanks to frameworks like TensorFlow, PyTorch, and Scikit-learn. It supports object-oriented, functional,
and procedural paradigms, making it versatile for diverse use cases. With strong community support, seamless integration
with other technologies, and cross-platform portability, Python empowers developers to build efficient, scalable,
and innovative solutions across AI, data science, and enterprise applications.


##### Advantages
- Gives you more control (e.g., skipping lines).
- Still reads one line at a time.
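
The "skipping lines" case mentioned above could look like the following sketch. It assumes a hypothetical file whose first two lines are a title and a separator; the file is created in a temporary folder so the example is self-contained.

```python
import os
import tempfile

# Hypothetical sample file, created only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "with_header.txt")
with open(path, "w") as f:
    f.write("Title line\n----------\nfirst record\nsecond record\n")

records = []
with open(path, "r") as f:
    f.readline()            # skip the title line
    f.readline()            # skip the separator line
    line = f.readline()
    while line:             # readline() returns "" only at end of file
        records.append(line.strip())
        line = f.readline()

print(records)
```

A blank line in the file is returned as `"\n"`, which is truthy, so the loop ends only at the real end of the file.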

#### 3. Using `readlines()` with `for` loop

In [10]:
with open('text_to_read.txt', 'r') as f:
    for line in f.readlines():
        print(line.strip())

This is a sample text file which will be used to for illustration purpose
--------------------------------------------------------------------------------

Python is a powerful, high-level programming language valued for its simplicity, flexibility, and extensive ecosystem.
Its clean syntax enables rapid development, while dynamic typing and vast libraries make it highly adaptable.
Python excels in automation, data analysis, web applications, and especially artificial intelligence and machine learning,
thanks to frameworks like TensorFlow, PyTorch, and Scikit-learn. It supports object-oriented, functional,
and procedural paradigms, making it versatile for diverse use cases. With strong community support, seamless integration
with other technologies, and cross-platform portability, Python empowers developers to build efficient, scalable,
and innovative solutions across AI, data science, and enterprise applications.
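
Unlike the first two approaches, `readlines()` reads every line into a list up front, so memory use grows with the file size. In exchange, you get `len()`, indexing, and slicing. A minimal sketch, using a hypothetical three-line file created in a temporary folder:

```python
import os
import tempfile

# Hypothetical sample file, created only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "three_lines.txt")
with open(path, "w") as f:
    f.write("alpha\nbeta\ngamma\n")

with open(path, "r") as f:
    all_lines = f.readlines()   # the whole file is now held in a list

print(len(all_lines))           # number of lines
print(all_lines[-1].strip())    # last line, reached by index
print(all_lines[:2])            # first two lines, newlines still attached
```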


#### 4. Using `yield`

In [14]:
def read_lines(filename):
    with open(filename, 'r') as f:
        for line in f:
            yield line.strip()

# Example usage:
for line in read_lines('text_to_read.txt'):
    print(line)

Python do contain vairous native libraries that shall we used for Machine learning


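##### How it works:
- `yield` turns a function into a generator function; calling it returns a generator object without running the body yet.
- First loop iteration → runs up to the first `yield` and returns the first line.
- Next loop iteration → resumes after the `yield`, returns the next line.
- Continues until the file ends.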

##### Benefits:
1. Memory Efficiency
    - Doesn’t load the whole file into memory (important for large files).
    - Reads and processes one line at a time.
2. Lazy Evaluation
    - Only generates values when needed.
    - Saves time if you don’t need the entire dataset.
3. Cleaner Code
    - No need to manually manage indexes or read chunks.
    - Loops like `for line in read_lines("text_to_read.txt")` just work.
4. Pipeline Flexibility
    - Generators can be chained together for data processing pipelines (filter, transform, aggregate).
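
The pipeline idea can be sketched as below. `read_lines` is the generator from this section; `non_empty` and `upper` are hypothetical helper stages added only for illustration, and the input file is created in a temporary folder.

```python
import os
import tempfile

def read_lines(filename):
    """Yield stripped lines one at a time (same generator as above)."""
    with open(filename, "r") as f:
        for line in f:
            yield line.strip()

def non_empty(lines):
    """Filter stage: drop lines that are empty after stripping."""
    for line in lines:
        if line:
            yield line

def upper(lines):
    """Transform stage: upper-case each line."""
    for line in lines:
        yield line.upper()

# Hypothetical sample file, created only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "pipeline.txt")
with open(path, "w") as f:
    f.write("first\n\nsecond\n   \nthird\n")

# Nothing is read until list() starts pulling values through the chain.
result = list(upper(non_empty(read_lines(path))))
print(result)
```

Each stage handles one line at a time, so the chain keeps the memory benefit of the original generator.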

### Writing content to a file
The file `text_to_read.txt` already exists, so I will reuse it to illustrate the write operation. Note that opening a file in `w` mode truncates it, so the existing content is lost.

In [None]:
with open('text_to_read.txt', 'w') as file:
    file.write("Python do contain vairous native libraries that shall we used for Machine learning\n")

# Let us read the file back; the existing content should have been overwritten
for line in read_lines('text_to_read.txt'): # read_lines is the generator function defined earlier
    print(line)


Python do contain vairous native libraries that shall we used for Machine learning


#### Using `writelines()`

- It writes a list (or any iterable) of strings to a file.
- It does not add newline characters automatically; you must include `\n` if you want line breaks.

In [16]:
lines = ["First line\n", "Second line\n", "Third line\n"]

with open("text_to_read.txt", "w") as f:
    f.writelines(lines)

for line in read_lines('text_to_read.txt'):
    print(line)


First line
Second line
Third line
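
To see what happens when the strings lack `\n`, here is a minimal sketch. The file `no_newlines.txt` is hypothetical and lives in a temporary folder; the generator expression is one possible way to add the line breaks, not the only one.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "no_newlines.txt")
items = ["First line", "Second line", "Third line"]   # no trailing "\n"

# writelines() writes the strings back to back, exactly as given.
with open(path, "w") as f:
    f.writelines(items)
with open(path, "r") as f:
    joined = f.read()
print(joined)

# Adding "\n" to each item on the fly gives one line per item.
with open(path, "w") as f:
    f.writelines(item + "\n" for item in items)
with open(path, "r") as f:
    separated = f.read().splitlines()
print(separated)
```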


#### Appending text to an existing file
To append text to an existing file, open it in append mode (`a`).

In [17]:
# Using write()
with open('text_to_read.txt', 'a') as f:
    f.write("Python do contain vairous native libraries that shall we used for Machine learning\n")

for line in read_lines('text_to_read.txt'):
    print(line)

First line
Second line
Third line
Python do contain vairous native libraries that shall we used for Machine learning


In [18]:
# Using writelines()
new_lines = ["Fourth line\n", "Fifth line\n"]

with open('text_to_read.txt', 'a') as f:
    f.writelines(new_lines)

for line in read_lines('text_to_read.txt'):
    print(line) 

First line
Second line
Third line
Python do contain vairous native libraries that shall we used for Machine learning
Fourth line
Fifth line
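
The difference between `a` and `w` can be checked side by side. The sketch below uses a hypothetical file `modes.txt` in a temporary folder so that `text_to_read.txt` is left untouched.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")

# "a" creates the file if it does not exist yet.
with open(path, "a") as f:
    f.write("one\n")

# A second "a" keeps the existing content and writes at the end.
with open(path, "a") as f:
    f.write("two\n")
with open(path, "r") as f:
    after_append = f.read()

# "w" truncates the file before writing.
with open(path, "w") as f:
    f.write("three\n")
with open(path, "r") as f:
    after_write = f.read()

print(repr(after_append))
print(repr(after_write))
```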


### Binary file I/O

Binary file I/O matters for any data that is not plain text (images, audio, models, executables, etc.). Python offers the following binary modes:

- `rb` → Read Binary
    - Opens file for reading in binary mode.
    - File must already exist.
- `wb` → Write Binary
    - Opens file for writing in binary mode.
    - Overwrites the file if it exists, or creates a new one.
- `ab` → Append Binary
    - Opens file for appending in binary mode.
    - Writes data at the end without erasing existing content; creates the file if it does not exist.
- `rb+` → Read & Write Binary
    - Opens file for both reading and writing in binary mode.
    - File must exist.
- `wb+` → Write & Read Binary
    - Opens file for reading and writing in binary mode.
    - Overwrites file if it exists, otherwise creates new.
- `ab+` → Append & Read Binary
    - Opens file for reading and appending in binary mode.
    - File pointer starts at the end, so call `seek(0)` before reading existing data; writes always go to the end.
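
A few of these modes can be tried out on a scratch file. The sketch below uses a hypothetical `modes.bin` in a temporary folder and covers `rb` on a missing file, `wb`, `ab+`, and `rb+`.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.bin")

# "rb" requires the file to exist.
try:
    with open(path, "rb"):
        pass
    missing = False
except FileNotFoundError:
    missing = True

# "wb" creates the file (or overwrites an existing one).
with open(path, "wb") as f:
    f.write(b"AB")

# "ab+" appends at the end; seek(0) is needed before reading from the start.
with open(path, "ab+") as f:
    f.write(b"CD")
    f.seek(0)
    combined = f.read()

# "rb+" allows overwriting bytes in place without truncating the file.
with open(path, "rb+") as f:
    f.seek(1)
    f.write(b"z")
    f.seek(0)
    patched = f.read()

print(missing, combined, patched)
```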

In [19]:
# 1. Writing raw bytes to a binary file
data = bytes([65, 66, 67, 68])   # Equivalent to ASCII 'ABCD'

with open("sample.bin", "wb") as f:
    f.write(data)   # Writes raw bytes to the file

In [21]:
# 2. Reading the entire binary file
with open("sample.bin", "rb") as f:
    content = f.read()           # Reads all bytes
    print(content)               # b'ABCD'
    print(list(content))         # [65, 66, 67, 68] (decimal values)

b'ABCD'
[65, 66, 67, 68]


In [22]:
# 3. Reading binary file in chunks (useful for large files)
with open("sample.bin", "rb") as f:
    chunk = f.read(2)            # Read 2 bytes at a time
    while chunk:                 # Loop until EOF
        print(chunk)             # Print the chunk
        chunk = f.read(2)

b'AB'
b'CD'
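
A common practical use of chunked reading is computing a checksum of a file that may be too large to load at once. The following is a sketch under assumptions: the helper name `sha256_of_file`, the 4096-byte chunk size, and the sample file are all hypothetical choices; `hashlib.sha256` is from the standard library.

```python
import hashlib
import os
import tempfile

def sha256_of_file(filename, chunk_size=4096):
    """Return the SHA-256 hex digest of a file, reading it chunk by chunk."""
    digest = hashlib.sha256()
    with open(filename, "rb") as f:
        chunk = f.read(chunk_size)
        while chunk:                  # read() returns b"" at end of file
            digest.update(chunk)
            chunk = f.read(chunk_size)
    return digest.hexdigest()

# Hypothetical sample file, created only for this sketch (10,240 bytes).
payload = bytes(range(256)) * 40
path = os.path.join(tempfile.mkdtemp(), "payload.bin")
with open(path, "wb") as f:
    f.write(payload)

checksum = sha256_of_file(path)
print(checksum == hashlib.sha256(payload).hexdigest())
```

Feeding the hash one chunk at a time gives the same digest as hashing all the bytes in one call.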


In [23]:
# 4. Copying a binary file (common use case: images, PDFs, audio)
with open("source.jpeg", "rb") as src, open("copy.jpeg", "wb") as dst:
    # iter(callable, sentinel) keeps calling the function (callable)
    # until it returns the sentinel value (here: b"")
    #
    # - lambda: src.read(4096) → small anonymous function that reads 4096 bytes at a time
    # - b"" (empty bytes) → sentinel; when src.read() returns b"", it means EOF (end of file)
    #
    # So this loop keeps yielding 4KB chunks until the file ends.
    for chunk in iter(lambda: src.read(4096), b""):  
        dst.write(chunk)  # Write each chunk to the destination file
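
The standard library also provides `shutil.copyfileobj`, which performs the same chunked copy between two open file objects. A minimal sketch, using hypothetical file names in a temporary folder and random bytes in place of a real image:

```python
import os
import shutil
import tempfile

folder = tempfile.mkdtemp()
src_path = os.path.join(folder, "source.bin")   # stand-in for source.jpeg
dst_path = os.path.join(folder, "copy.bin")

with open(src_path, "wb") as f:
    f.write(os.urandom(10000))                  # arbitrary binary content

# The third argument is the buffer size used for each read.
with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
    shutil.copyfileobj(src, dst, 4096)

with open(src_path, "rb") as a, open(dst_path, "rb") as b:
    identical = a.read() == b.read()
print(identical)
```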


In [24]:
# Open the image in binary read mode
with open("copy.jpeg", "rb") as f:
    img_bytes = f.read()   # read the entire image as bytes

print(type(img_bytes))     # <class 'bytes'>
print(len(img_bytes))      # total number of bytes in the image
print(img_bytes[:20])      # first 20 bytes (JPEG header)


<class 'bytes'>
3487371
b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x01\x01,\x01,\x00\x00'
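
The leading bytes `\xff\xd8\xff` in the output above are the JPEG start-of-image marker followed by the first byte of the next marker. Reading just those bytes is a cheap way to sanity-check a file type. The helper `looks_like_jpeg` and both sample files below are hypothetical; the check is a heuristic, not a full validation.

```python
import os
import tempfile

JPEG_SIGNATURE = b"\xff\xd8\xff"

def looks_like_jpeg(filename):
    """Return True if the file starts with the JPEG signature bytes."""
    with open(filename, "rb") as f:
        return f.read(len(JPEG_SIGNATURE)) == JPEG_SIGNATURE

# Hypothetical sample files, created only for this sketch.
folder = tempfile.mkdtemp()
fake_jpeg = os.path.join(folder, "fake.jpeg")
not_jpeg = os.path.join(folder, "notes.txt")
with open(fake_jpeg, "wb") as f:
    f.write(b"\xff\xd8\xff\xe0\x00\x10JFIF\x00")   # header bytes only, not a real image
with open(not_jpeg, "wb") as f:
    f.write(b"plain text")

print(looks_like_jpeg(fake_jpeg), looks_like_jpeg(not_jpeg))
```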


In [25]:
# 5. Using seek() and tell() for positioning in a binary file
with open("sample.bin", "rb") as f:
    # tell() → returns the current file pointer position (in bytes)
    # Initially, when we open a file, the pointer is at the start (position 0)
    print("Initial position:", f.tell())   # Output: 0 (start of file)

    # seek(offset, whence) → moves the file pointer
    # - offset = how many bytes to move
    # - whence = reference point (default is 0 → beginning of file)
    #   whence = 0 → beginning of file
    #   whence = 1 → current position
    #   whence = 2 → end of file
    #
    # Here: f.seek(2) → move pointer to byte offset 2, i.e. the 3rd byte (offsets are 0-based)
    f.seek(2)                              

    # tell() again → shows the new position
    print("After seek:", f.tell())         # Output: 2

    # f.read(1) → read 1 byte from current pointer position (byte index 2)
    print("Byte at position 2:", f.read(1))  


Initial position: 0
After seek: 2
Byte at position 2: b'C'
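
The other two `whence` values can be sketched the same way. The example below uses the named constants `os.SEEK_CUR` (1) and `os.SEEK_END` (2) and a hypothetical four-byte file in a temporary folder.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "sample.bin")
with open(path, "wb") as f:
    f.write(b"ABCD")

with open(path, "rb") as f:
    # whence = 2: offset is relative to the end of the file.
    f.seek(0, os.SEEK_END)
    size = f.tell()              # position at the end equals the file size

    f.seek(-1, os.SEEK_END)      # one byte before the end
    last_byte = f.read(1)

    # whence = 1: offset is relative to the current position.
    f.seek(1)                    # absolute offset 1
    f.seek(1, os.SEEK_CUR)       # move 1 byte forward → offset 2
    third_byte = f.read(1)

print(size, last_byte, third_byte)
```

Negative offsets relative to the end or current position work here because the file is opened in binary mode.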


In [None]:
# 6. Writing and reading structured binary data (using struct)

import struct
# ------------------------------------------------------
# Writing: pack Python values into binary format
# ------------------------------------------------------
with open("numbers.bin", "wb") as f:
    # struct.pack(format, v1, v2, ...) → converts Python values into raw bytes
    #
    # "i" → C int (4 bytes on common platforms)
    # "f" → C float (4 bytes, single precision)
    #
    # Here, we write 123 as a 4-byte integer, then 3.14 as a 4-byte float.
    f.write(struct.pack("i", 123))    # Write integer 123 → 4 bytes
    f.write(struct.pack("f", 3.14))   # Write float 3.14 → 4 bytes

In [29]:
# ------------------------------------------------------
# Reading: unpack binary data back into Python values
# ------------------------------------------------------
with open("numbers.bin", "rb") as f:
    # Read 4 bytes for the integer
    int_bytes = f.read(4)             
    # Read 4 bytes for the float
    float_bytes = f.read(4)           

    # struct.unpack(format, bytes) → converts raw bytes back into Python values
    #
    # struct.unpack() always returns a tuple → so we take [0] to get the value
    num = struct.unpack("i", int_bytes)[0]
    flt = struct.unpack("f", float_bytes)[0]

    print("Read integer:", num)   # 123
    print("Read float:", flt)     # 3.140000104904175 (3.14 stored as a 32-bit float)

Read integer: 123
Read float: 3.140000104904175
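
The format strings above use the platform's native byte order and sizes. If the file has to be read on another machine, one option is to state the byte order explicitly and pack both values as a single record. This is a sketch, assuming a little-endian layout (`<`) is what you want; the name `record_format` is a hypothetical choice.

```python
import struct

# "<" → little-endian with standard sizes: "i" is 4 bytes, "f" is 4 bytes.
record_format = "<if"
record_size = struct.calcsize(record_format)

# Pack both values in one call, then unpack them in one call.
packed = struct.pack(record_format, 123, 3.14)
num, flt = struct.unpack(record_format, packed)

print(record_size, len(packed))
print(num, round(flt, 2))
```

`struct.calcsize` tells you how many bytes to read from a file for one record, so `f.read(record_size)` followed by `struct.unpack` would replace the two separate reads.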
