# Advanced File Operations in Python

This notebook walks through several advanced file operations in Python. It covers the following topics:

1. Working with Binary Files
2. File and Directory Metadata
3. Handling File and Directory Paths
4. Searching for Files
5. Creating and Removing Directories
6. Format Operators
7. Modifying Files

Let's get started!

## 1. Working with Binary Files

In Python, you can open a file in binary mode by appending `'b'` to the mode string when calling `open()`. This allows you to read and write binary data to and from the file.

Let's create a binary file and write some binary data to it.

In [None]:
# Open a binary file for writing; the with block closes it automatically
with open('binary_file', 'wb') as file:
    # Write some binary data to the file
    file.write(b'Hello, world!')

In the code above, `b'Hello, world!'` is a bytes object: an immutable sequence of integers in the range 0 to 255. When you open a file in binary mode, you read and write bytes objects rather than strings.

Now, let's read the binary data we just wrote to the file.

In [None]:
# Open the binary file for reading; the with block closes it automatically
with open('binary_file', 'rb') as file:
    # Read the binary data from the file
    data = file.read()

# Print the binary data
print(data)

As you can see, when we read the binary data from the file, we get a bytes object that is identical to the one we wrote to the file.
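
To make the difference between bytes and text concrete, here is a minimal sketch of how a bytes object behaves. It uses only built-in operations, and the variable names are illustrative.

```python
data = b'Hello, world!'

# Indexing a bytes object yields an integer in the range 0-255
first_byte = data[0]
print(first_byte)          # 72, the ASCII code for 'H'

# Slicing yields another bytes object
print(data[:5])            # b'Hello'

# Decode bytes into a str, and encode a str back into bytes
text = data.decode('utf-8')
print(text)                # Hello, world!
round_trip = text.encode('utf-8')
print(round_trip == data)  # True
```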

## 2. File and Directory Metadata

The `os` module provides functions to get metadata about files and directories. For example, you can use `os.stat()` to get information such as the size of a file, the time it was last accessed, and the time it was last modified.

In [None]:
import os

# Get metadata about a file
metadata = os.stat('binary_file')

# Print the size of the file in bytes
print('Size:', metadata.st_size)

# Print the time the file was last accessed
print('Last accessed:', metadata.st_atime)

# Print the time the file was last modified
print('Last modified:', metadata.st_mtime)

The `os.stat()` function returns an `os.stat_result` object, which contains several attributes that hold information about the file. In the code above, we accessed the following attributes:

- `st_size`: The size of the file in bytes.
- `st_atime`: The time the file was last accessed.
- `st_mtime`: The time the file was last modified.

The times are returned as floating-point numbers representing seconds since the epoch (January 1, 1970, 00:00:00 UTC).
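
Raw epoch seconds are hard to read, so a common follow-up is to convert them to a `datetime`. The sketch below assumes a throwaway temporary file (a hypothetical stand-in for `binary_file`) so that it runs on its own.

```python
import os
import tempfile
from datetime import datetime, timezone

# Create a throwaway file so the example does not depend on earlier cells
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'Hello, world!')
    tmp_path = tmp.name

metadata = os.stat(tmp_path)

# Convert the modification timestamp to a timezone-aware UTC datetime
modified = datetime.fromtimestamp(metadata.st_mtime, tz=timezone.utc)
print('Size:', metadata.st_size)
print('Last modified:', modified.isoformat())

# Clean up the temporary file
os.remove(tmp_path)
```

Passing `tz=timezone.utc` keeps the result unambiguous; without it, `fromtimestamp()` returns a naive datetime in local time.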

## 3. Handling File and Directory Paths

The `os.path` module contains many useful functions for manipulating file and directory paths.

In [None]:
# Get the absolute path of a file
print('Absolute path:', os.path.abspath('binary_file'))

# Check if a path is absolute
print('Is absolute?', os.path.isabs('/home'))

# Get the directory name of a path
print('Directory name:', os.path.dirname(os.path.abspath('binary_file')))

# Get the base name of a path
print('Base name:', os.path.basename(os.path.abspath('binary_file')))

# Check if a path exists
print('Does path exist?', os.path.exists('/home'))

# Check if a path is a file
print('Is path a file?', os.path.isfile('/home'))

# Check if a path is a directory
print('Is path a directory?', os.path.isdir('/home'))

The example above uses the following `os.path` functions:

- `os.path.abspath(path)`: Returns the absolute path of `path`.
- `os.path.isabs(path)`: Returns `True` if `path` is an absolute pathname.
- `os.path.dirname(path)`: Returns the directory name of `path`.
- `os.path.basename(path)`: Returns the base name of `path`.
- `os.path.exists(path)`: Returns `True` if `path` refers to an existing path.
- `os.path.isfile(path)`: Returns `True` if `path` is an existing regular file.
- `os.path.isdir(path)`: Returns `True` if `path` is an existing directory.
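
As a small illustration of how these functions fit together, the sketch below splits a path into its directory and base name and joins them back. The path `reports/2024/summary.txt` is hypothetical and does not need to exist, because these functions only manipulate strings.

```python
import os

# Hypothetical path, built with os.path.join so the separator matches the platform
path = os.path.join('reports', '2024', 'summary.txt')

directory = os.path.dirname(path)   # everything before the last separator
name = os.path.basename(path)       # everything after the last separator

print(directory)
print(name)  # summary.txt

# Joining the two parts gives back the original path
print(os.path.join(directory, name) == path)  # True

# A relative path is not absolute, but the result of abspath() is
print(os.path.isabs(path))                   # False
print(os.path.isabs(os.path.abspath(path)))  # True
```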

## 4. Searching for Files

The `os` module also provides functions to search for files. For example, you can use `os.walk()` to generate the file names in a directory tree by walking the tree either top-down or bottom-up.

In [None]:
# Walk the current directory
for dirpath, dirnames, filenames in os.walk('.'):
    print(f'Found directory: {dirpath}')
    for file_name in filenames:
        print(file_name)

For each directory in the tree rooted at the directory `top` (including `top` itself), `os.walk(top)` yields a 3-tuple `(dirpath, dirnames, filenames)`.

- `dirpath` is a string, the path to the directory.
- `dirnames` is a list of the names of the subdirectories in `dirpath` (excluding '.' and '..').
- `filenames` is a list of the names of the non-directory files in `dirpath`.

Note that the names in the lists contain no path components. To get a full path (which begins with `top`) to a file or directory in `dirpath`, call `os.path.join(dirpath, name)`.
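
Combining `os.walk()` with `os.path.join()` gives a simple file search. The helper `find_files` below is a hypothetical name for illustration, and the directory tree is a throwaway one created just for the example.

```python
import os
import tempfile

def find_files(top, suffix):
    """Return sorted full paths of files under top whose names end with suffix."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            if name.endswith(suffix):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)

# Build a small throwaway tree to search
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, 'sub'))
    for relative in ('a.txt', 'b.log', os.path.join('sub', 'c.txt')):
        with open(os.path.join(top, relative), 'w') as f:
            f.write('example\n')

    found = find_files(top, '.txt')
    # Show the matches relative to the search root
    relative_found = [os.path.relpath(p, top) for p in found]
    print(relative_found)
```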

## 5. Creating and Removing Directories

The `os` module also provides functions to create and remove directories.

In [None]:
# Create a directory
os.mkdir('new_directory')

# Check if the directory was created
print('Does directory exist?', os.path.exists('new_directory'))

# Remove the directory
os.rmdir('new_directory')

# Check if the directory was removed
print('Does directory exist?', os.path.exists('new_directory'))
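
Two details are worth knowing when using these functions: `os.mkdir()` raises `FileExistsError` if the directory already exists, and `os.rmdir()` only removes empty directories. The sketch below demonstrates both inside a temporary directory; the flag names are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, 'new_directory')
    os.mkdir(target)

    # Creating the same directory again raises FileExistsError
    try:
        os.mkdir(target)
        already_existed = False
    except FileExistsError:
        already_existed = True

    # os.rmdir() refuses to remove a directory that still has contents
    with open(os.path.join(target, 'note.txt'), 'w') as f:
        f.write('not empty\n')
    try:
        os.rmdir(target)
        removed_non_empty = True
    except OSError:
        removed_non_empty = False

    # Once the directory is empty, os.rmdir() succeeds
    os.remove(os.path.join(target, 'note.txt'))
    os.rmdir(target)
    exists_after = os.path.exists(target)

print(already_existed, removed_non_empty, exists_after)  # True False False
```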

## 6. Format Operators

Python's `%` operator formats a string by substituting values for the conversion specifiers it contains. The right-hand operand is either a single value or a tuple with one value per specifier. Here are some basic specifiers you should know:

- `%s` - String (or any object with a string representation, like numbers)
- `%d` - Integers
- `%f` - Floating point numbers
- `%.<number of digits>f` - Floating point numbers with a fixed amount of digits to the right of the dot.
- `%x`/`%X` - Integers in hex representation (lowercase/uppercase)
- `%e`/`%E` - Floating point numbers in scientific notation (lowercase/uppercase)
- `%g`/`%G` - Floating point numbers in scientific notation or in fixed decimal notation depending on the value and the precision.

Let's see some examples.

In [None]:
# String
name = 'John'
print('Hello, %s!' % name)

# Integer
age = 20
print('I am %d years old.' % age)

# Floating point number
pi = 3.14159
print('Pi is approximately %f.' % pi)

# Floating point number with fixed amount of digits
print('Pi is approximately %.2f.' % pi)

# Integer in hex representation
number = 255
print('The number %d is %x in hexadecimal.' % (number, number))

# Floating point number in scientific notation
number = 0.000123
print('The number is %e in scientific notation.' % number)

# Floating point number in scientific notation or in fixed decimal notation
number1 = 0.000123
number2 = 1234.56789
print('The number is %g in scientific notation or in fixed decimal notation.' % number1)
print('The number is %g in scientific notation or in fixed decimal notation.' % number2)
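
The behavior of `%g` and of tuple arguments is easier to see with a few more values. The following sketch uses made-up numbers and names purely for illustration.

```python
# %g switches to scientific notation for very small or very large magnitudes
small = '%g' % 0.0000123
medium = '%g' % 1234.56789
large = '%g' % 1234567.0
print(small)   # 1.23e-05
print(medium)  # 1234.57
print(large)   # 1.23457e+06

# Several values are passed as a tuple, one per specifier
summary = '%s is %d years old and %.1f m tall.' % ('John', 20, 1.8)
print(summary)  # John is 20 years old and 1.8 m tall.

# A value that is itself a tuple must be wrapped in a one-element tuple
point = (1, 2)
wrapped = 'Point: %s' % (point,)
print(wrapped)  # Point: (1, 2)
```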

## 7. Modifying Files

In this section, we will cover how to add new content to a file, append to a file, remove lines from a file, delete a file, and copy a file.

In [None]:
# Writing to a file
with open('test_file.txt', 'w') as f:
    f.write('Hello, world!\n')

# Appending to a file
with open('test_file.txt', 'a') as f:
    f.write('Goodbye, world!\n')

# Reading the file
with open('test_file.txt', 'r') as f:
    print(f.read())

In the above example, we first wrote to a file using the 'w' mode, which stands for 'write'. This mode will create the file if it doesn't exist, or overwrite it if it does.

Then, we appended to the file using the 'a' mode, which stands for 'append'. This mode will write at the end of the file without truncating it, creating it if necessary.

Finally, we read the file using the 'r' mode, which stands for 'read'. This is the default mode if you don't specify one.

The output shows that the file contains the lines we wrote and appended.
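
The difference between `'w'` and `'a'` is clearest when the file already has content. This is a minimal sketch that uses a temporary directory and a hypothetical file name, `modes.txt`.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'modes.txt')

    with open(path, 'w') as f:
        f.write('first\n')

    # Reopening with 'w' truncates the file, so 'first' is lost
    with open(path, 'w') as f:
        f.write('second\n')

    # Reopening with 'a' keeps the existing content and adds to the end
    with open(path, 'a') as f:
        f.write('third\n')

    with open(path) as f:  # 'r' is the default mode
        content = f.read()

print(repr(content))  # 'second\nthird\n'
```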

Now let's see how to remove lines from a file.

In [None]:
# Removing lines from a file
with open('test_file.txt', 'r') as f:
    lines = f.readlines()

with open('test_file.txt', 'w') as f:
    for line in lines:
        if line.strip() != 'Goodbye, world!':
            f.write(line)

# Reading the file
with open('test_file.txt', 'r') as f:
    print(f.read())

In the above example, we first read all the lines from the file into a list. Then, we opened the file in write mode, which erased all its content, and wrote back only the lines that didn't match the string 'Goodbye, world!'.

The output shows that the line 'Goodbye, world!' was successfully removed from the file.
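
The approach above loads every line into memory, which may not suit very large files. One possible alternative, sketched here under the assumption that a temporary file in the same directory is acceptable, is to stream the lines and then swap the files with `os.replace()`. The helper name `remove_matching_lines` is hypothetical.

```python
import os
import tempfile

def remove_matching_lines(path, target):
    """Rewrite path without lines whose stripped text equals target."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the kept lines to a temporary file in the same directory
    with open(path, 'r') as src, tempfile.NamedTemporaryFile(
            'w', dir=directory, delete=False) as dst:
        for line in src:  # iterates one line at a time
            if line.strip() != target:
                dst.write(line)
        temp_path = dst.name
    # Swap the temporary file into place
    os.replace(temp_path, path)

# Usage on a throwaway file
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'test_file.txt')
    with open(path, 'w') as f:
        f.write('Hello, world!\nGoodbye, world!\n')

    remove_matching_lines(path, 'Goodbye, world!')

    with open(path) as f:
        remaining = f.read()

print(repr(remaining))  # 'Hello, world!\n'
```

Note that the replacement file is a new file, so it may not keep the original's permissions or ownership.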

Now let's see how to delete a file.

In [None]:
import os

# Deleting a file
os.remove('test_file.txt')

# Checking if the file exists
print(os.path.exists('test_file.txt'))

In the above example, we used the `os.remove` function to delete the file. Then, we checked if the file exists using `os.path.exists`, which returned `False`, indicating that the file was successfully deleted.
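
Calling `os.remove()` on a path that does not exist raises `FileNotFoundError`. A minimal sketch of a tolerant wrapper follows; `remove_if_exists` is a hypothetical helper name, not part of the standard library.

```python
import os
import tempfile

def remove_if_exists(path):
    """Delete path and report whether a file was actually removed."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return False

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'test_file.txt')
    with open(path, 'w') as f:
        f.write('Hello, world!\n')

    first = remove_if_exists(path)   # the file exists, so it is removed
    second = remove_if_exists(path)  # the file is already gone

print(first, second)  # True False
```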

Now let's see how to copy a file.

In [None]:
import shutil

# Creating a file to copy
with open('source_file.txt', 'w') as f:
    f.write('This is the source file.\n')

# Copying the file
shutil.copy('source_file.txt', 'destination_file.txt')

# Reading the copied file
with open('destination_file.txt', 'r') as f:
    content = f.read()

# Printing the copied content
print(content)

In the above example, we used the `shutil.copy` function to copy the file. We first created a source file, then copied it to a destination file, and finally read the copied file. The output shows that the copied file contains the same content as the source file.
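
`shutil.copy` also accepts a directory as the destination and returns the path of the new file. The sketch below shows this with a temporary directory; the `backup` folder name is an assumption made for the example.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    source = os.path.join(tmp_dir, 'source_file.txt')
    with open(source, 'w') as f:
        f.write('This is the source file.\n')

    # If the destination is a directory, the file keeps its base name
    backup_dir = os.path.join(tmp_dir, 'backup')
    os.mkdir(backup_dir)
    copied_path = shutil.copy(source, backup_dir)

    with open(copied_path) as f:
        copied_content = f.read()
    copied_name = os.path.basename(copied_path)

print(copied_name)           # source_file.txt
print(repr(copied_content))  # 'This is the source file.\n'
```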