
# Python File Handling - From Basic to Advanced

This notebook guides you through Python file handling, starting with basic concepts and progressively moving to more advanced topics. Each section builds on previous knowledge with simple code examples.

**1. Understanding File Objects and Methods**

In Python, we access files through file objects. These objects provide methods to interact with files.


Key file methods include:
- open() - Creates a file object
- read() - Reads file content
- write() - Writes to a file
- close() - Closes the file
- tell() - Returns current position
- seek() - Changes position in file
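
As a rough illustration of how the methods listed above fit together, here is a minimal sketch that uses each of them once. The file name `methods_demo.txt` is a hypothetical name chosen for this example, and the `'w+'` mode is used only so the same file object can both write and read.

```python
# 'methods_demo.txt' is a hypothetical file name used only for this sketch.
file = open('methods_demo.txt', 'w+')  # open() creates the file object
file.write("Hello, file!")             # write() puts text into the file
end_position = file.tell()             # tell() reports the current position
file.seek(0)                           # seek() moves back to the start
content = file.read()                  # read() returns everything from the current position
file.close()                           # close() releases the file

print(content)
print(file.closed)
```

After `close()`, the `closed` attribute of the file object is `True`, and further reads or writes on it raise `ValueError`.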

In [1]:
print("File handling starts with creating a file object using open()")

File handling starts with creating a file object using open()


**2. Opening and Closing Files**

**Basic File Opening**

The first step in file handling is opening a file to create a file object.


- Basic syntax: open(filename, mode)
- Mode determines what operations are allowed

In [5]:
# Opening a file for reading (default mode is 'r').
# The file must already exist, otherwise open() raises FileNotFoundError.
file = open('example.txt', 'r')

# Always close files when done to free system resources
file.close()

print("File was opened and closed")

File was opened and closed


**File Opening Modes**


**Different file modes:**
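- 'r' - Read (default) - File must exist
- 'w' - Write - Creates a new file or overwrites an existing one
- 'a' - Append - Adds to the end of the file; creates the file if it does not exist
- 'r+' - Read and write - File must exist; existing content is kept
- 'w+' - Read and write - Creates a new file or overwrites an existing one
- 'a+' - Read and append - Creates the file if it does not exist; writes always go to the end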

- Adding 'b' to any mode opens the file in binary mode
- Examples: 'rb', 'wb', 'ab'

In [3]:
print("Available file modes: r, w, a, r+, w+, a+")
print("Add 'b' for binary mode: rb, wb, ab")

Available file modes: r, w, a, r+, w+, a+
Add 'b' for binary mode: rb, wb, ab
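
To make the difference between the modes more concrete, the following sketch writes to one file with `'w'`, `'a'`, and `'r+'` in turn. The file names `modes_demo.txt` and `no_such_file_demo.txt` are hypothetical and exist only for this illustration.

```python
# 'modes_demo.txt' is a hypothetical file name for this sketch.
with open('modes_demo.txt', 'w') as f:   # 'w' creates the file
    f.write("first\n")

with open('modes_demo.txt', 'w') as f:   # 'w' again: the previous content is discarded
    f.write("second\n")

with open('modes_demo.txt', 'a') as f:   # 'a' keeps the content and adds to the end
    f.write("third\n")

with open('modes_demo.txt', 'r+') as f:  # 'r+' reads and writes without truncating
    before = f.read()
    f.seek(0)
    f.write("SECOND")                    # overwrites the first 6 characters in place

with open('modes_demo.txt', 'r') as f:
    after = f.read()

print(repr(before))  # 'second\nthird\n'
print(repr(after))   # 'SECOND\nthird\n'

# 'r+' does not create files: opening a missing file fails.
try:
    open('no_such_file_demo.txt', 'r+')
    missing_raises = False
except FileNotFoundError:
    missing_raises = True
print(missing_raises)  # True
```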


**Using the 'with' Statement (Recommended)**


- The 'with' statement is the recommended way to open files
- It automatically closes the file when the block ends

In [4]:
# Example using 'with' statement
with open('example.txt', 'w') as file:
    file.write("This is an example file.\n")
    file.write("Using the 'with' statement is recommended.\n")
    file.write("It automatically closes the file when done.")

print("File created using 'with' statement. The file is now automatically closed.")

File created using 'with' statement. The file is now automatically closed.
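
One way to check the automatic closing described above is to inspect the `closed` attribute after the block, including the case where an error occurs inside it. This is a small sketch; `with_demo.txt` is a hypothetical file name and the `ValueError` is raised on purpose to simulate a failure.

```python
# 'with_demo.txt' is a hypothetical file name for this sketch.
try:
    with open('with_demo.txt', 'w') as f:
        f.write("partial data")
        raise ValueError("simulated failure")  # deliberate error inside the block
except ValueError:
    pass

# The file is closed even though the block ended with an exception.
print(f.closed)  # True
```

Without `with`, the same guarantee needs a `try` / `finally` that calls `close()` by hand, which is why the `with` form is usually preferred.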


**3. Writing to Files**

**Basic File Writing**

In [6]:
# Let's start by writing to a file
with open('notes.txt', 'w') as file:
    file.write("This is my first line.\n")
    file.write("This is my second line.\n")
    file.write("This is my third line.")

print("Successfully wrote to notes.txt")

Successfully wrote to notes.txt


**Writing Multiple Lines at Once**

In [7]:
# Writing multiple lines using writelines()
lines = [
    "This is line one.\n",
    "This is line two.\n",
    "This is line three.\n",
    "This is line four."
]

with open('multiple_lines.txt', 'w') as file:
    file.writelines(lines)

print("Successfully wrote multiple lines at once")

Successfully wrote multiple lines at once
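
Note that `writelines()` does not add line endings for you, which is why the list above carries its own `\n` characters. The sketch below shows the difference, using a hypothetical list `items` and a hypothetical file name `writelines_demo.txt`.

```python
items = ["apple", "banana", "cherry"]  # hypothetical data without newlines

# Written as-is, the strings end up back to back on one line.
with open('writelines_demo.txt', 'w') as file:
    file.writelines(items)
with open('writelines_demo.txt', 'r') as file:
    joined = file.read()

# Adding "\n" to each item gives one item per line.
with open('writelines_demo.txt', 'w') as file:
    file.writelines(item + "\n" for item in items)
with open('writelines_demo.txt', 'r') as file:
    separated = file.read()

print(repr(joined))     # 'applebananacherry'
print(repr(separated))  # 'apple\nbanana\ncherry\n'
```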


**Appending to Files**

In [8]:
# Appending to an existing file
with open('notes.txt', 'a') as file:
    file.write("\nThis line is appended to the file.\n")
    file.write("We can add as many lines as we want.")

print("Successfully appended to notes.txt")

Successfully appended to notes.txt


**4. Reading Files**

**Reading an Entire File**

In [9]:
# Reading the entire file content at once
with open('notes.txt', 'r') as file:
    content = file.read()

print("Content of notes.txt:")
print(content)

Content of notes.txt:
This is my first line.
This is my second line.
This is my third line.
This line is appended to the file.
We can add as many lines as we want.


**Reading a File Line by Line**

In [11]:
# Method 1: Reading line by line with a for loop
print("\nReading line by line using a for loop:")
with open('notes.txt', 'r') as file:
    for line in file:
        print(f"Line: {line.strip()}")  # strip() removes leading and trailing whitespace, including the newline


Reading line by line using a for loop:
Line: This is my first line.
Line: This is my second line.
Line: This is my third line.
Line: This line is appended to the file.
Line: We can add as many lines as we want.


In [12]:
# Method 2: Using readline() method
print("\nReading line by line using readline():")
with open('notes.txt', 'r') as file:
    line = file.readline()
    while line:
        print(f"Line: {line.strip()}")
        line = file.readline()


Reading line by line using readline():
Line: This is my first line.
Line: This is my second line.
Line: This is my third line.
Line: This line is appended to the file.
Line: We can add as many lines as we want.


In [13]:
# Method 3: Using readlines() to get a list of all lines
print("\nReading all lines into a list:")
with open('notes.txt', 'r') as file:
    lines = file.readlines()

print(f"Number of lines: {len(lines)}")
for i, line in enumerate(lines):
    print(f"Line {i+1}: {line.strip()}")


Reading all lines into a list:
Number of lines: 5
Line 1: This is my first line.
Line 2: This is my second line.
Line 3: This is my third line.
Line 4: This line is appended to the file.
Line 5: We can add as many lines as we want.


**Reading a Specific Amount of Data**

In [14]:
# Reading a specific number of characters
with open('notes.txt', 'r') as file:
    # Read first 10 characters
    first_part = file.read(10)
    # Read next 10 characters
    second_part = file.read(10)

print("\nReading in chunks:")
print(f"First 10 characters: '{first_part}'")
print(f"Next 10 characters: '{second_part}'")


Reading in chunks:
First 10 characters: 'This is my'
Next 10 characters: ' first lin'
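
The same `read(size)` call can be repeated in a loop to process a large file piece by piece instead of loading it all at once. The helper `read_in_chunks`, its `chunk_size` parameter, and the file name `chunk_demo.txt` are hypothetical names introduced for this sketch.

```python
def read_in_chunks(path, chunk_size=16):
    """Yield successive pieces of a text file, each at most chunk_size characters."""
    with open(path, 'r') as file:
        while True:
            chunk = file.read(chunk_size)
            if not chunk:  # read() returns an empty string at end of file
                break
            yield chunk

# Create a 50-character sample file for the demonstration.
with open('chunk_demo.txt', 'w') as file:
    file.write("abcdefghij" * 5)

chunks = list(read_in_chunks('chunk_demo.txt', chunk_size=16))
print([len(chunk) for chunk in chunks])  # [16, 16, 16, 2]
```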


**5. File Positions and Navigation**

**Understanding File Position**


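- File position refers to the current location in the file
- Initially the position is at the beginning (position 0)
- As you read or write, the position moves forward
- In binary mode the position is a byte offset; in text mode tell() returns an opaque value that is only guaranteed to be meaningful when passed back to seek()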

In [15]:
with open('notes.txt', 'r') as file:
    print("\nFile position demonstration:")

    # Check initial position
    position = file.tell()
    print(f"Initial position: {position}")

    # Read 10 characters
    data = file.read(10)
    print(f"Read: '{data}'")

    # Check new position
    position = file.tell()
    print(f"New position: {position}")


File position demonstration:
Initial position: 0
Read: 'This is my'
New position: 10


**Moving Around in Files (Seeking)**

In [16]:
# Moving to specific positions using seek()
with open('notes.txt', 'r') as file:
    print("\nMoving around in files:")

    # Move to position 5
    file.seek(5)
    print(f"After seek(5), position is: {file.tell()}")
    print(f"Reading from position 5: '{file.read(10)}'")

    # Move to beginning
    file.seek(0)
    print(f"After seek(0), back to beginning: {file.tell()}")
    print(f"Reading from beginning: '{file.read(10)}'")

    # Move to position 15
    file.seek(15)
    print(f"After seek(15), position is: {file.tell()}")
    print(f"Reading from position 15: '{file.read(10)}'")


Moving around in files:
After seek(5), position is: 5
Reading from position 5: 'is my firs'
After seek(0), back to beginning: 0
Reading from beginning: 'This is my'
After seek(15), position is: 15
Reading from position 15: 't line.
Th'


**Sequential vs. Random Access**


- Sequential access: reading/writing data in order
- Random access: jumping to specific positions using seek()

In [17]:
print("\nSequential vs. Random Access:")


Sequential vs. Random Access:


In [18]:
# Sequential access example (most common)
with open('notes.txt', 'r') as file:
    print("Sequential access - reading first 3 lines:")
    for _ in range(3):
        print(file.readline().strip())

Sequential access - reading first 3 lines:
This is my first line.
This is my second line.
This is my third line.


In [19]:
# Random access example
with open('notes.txt', 'r') as file:
    print("\nRandom access - reading from different positions:")

    # Read from beginning
    file.seek(0)
    print(f"From position 0: '{file.read(10)}'")

    # Jump to middle
    file.seek(20)
    print(f"From position 20: '{file.read(10)}'")

    # Jump to another position
    file.seek(40)
    print(f"From position 40: '{file.read(10)}'")


Random access - reading from different positions:
From position 0: 'This is my'
From position 20: 'e.
This is'
From position 40: ' line.
Thi'
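
Random access is most useful when every record has the same size, because the position of record `n` can then be computed directly. The following is a minimal sketch under that assumption; `RECORD_SIZE`, `read_record`, and `records_demo.bin` are hypothetical names, and the records are padded with spaces to a fixed 8 bytes.

```python
RECORD_SIZE = 8  # assumption: every record occupies exactly 8 bytes

names = ["Alice", "Bob", "Charlie", "Dana"]  # hypothetical sample data

# Write each name padded with spaces to RECORD_SIZE bytes.
with open('records_demo.bin', 'wb') as file:
    for name in names:
        file.write(name.encode('ascii').ljust(RECORD_SIZE))

def read_record(path, index):
    """Return the record at the given index without reading the ones before it."""
    with open(path, 'rb') as file:
        file.seek(index * RECORD_SIZE)  # jump straight to the record
        return file.read(RECORD_SIZE).decode('ascii').rstrip()

print(read_record('records_demo.bin', 2))  # Charlie
print(read_record('records_demo.bin', 0))  # Alice
```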


**6. Working with Binary Files**

**Understanding Binary Mode**


- Text mode reads and writes str and handles encoding and decoding for you; binary mode reads and writes raw bytes
- Use binary mode ('b') for non-text files like images

In [20]:
print("\nBinary mode is used for non-text files")
print("Add 'b' to the mode: 'rb', 'wb', 'ab'")


Binary mode is used for non-text files
Add 'b' to the mode: 'rb', 'wb', 'ab'
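
The difference between characters and bytes becomes visible as soon as the text contains a non-ASCII character. In this sketch the same file is read once in text mode and once in binary mode; `text_vs_bytes_demo.txt` is a hypothetical file name, and UTF-8 is chosen explicitly so the result does not depend on the platform default.

```python
text = "caf\u00e9"  # 'café': 4 characters, but 'é' takes 2 bytes in UTF-8

with open('text_vs_bytes_demo.txt', 'w', encoding='utf-8') as file:
    file.write(text)

with open('text_vs_bytes_demo.txt', 'r', encoding='utf-8') as file:
    as_text = file.read()   # str: decoded characters

with open('text_vs_bytes_demo.txt', 'rb') as file:
    as_bytes = file.read()  # bytes: raw content of the file

print(len(as_text), len(as_bytes))  # 4 5
print(as_bytes)                     # b'caf\xc3\xa9'
```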


**Writing Binary Data**

In [21]:
# Writing binary data
with open('binary_example.bin', 'wb') as file:
    # Create some binary data
    binary_data = bytes([65, 66, 67, 68, 69])  # ASCII for 'ABCDE'
    file.write(binary_data)

    # Writing string as binary requires encoding
    text = "Hello, Binary World!"
    file.write(text.encode('utf-8'))

print("Binary data written to binary_example.bin")

Binary data written to binary_example.bin


**Reading Binary Data**

In [22]:
# Reading binary data
with open('binary_example.bin', 'rb') as file:
    # Read binary content
    binary_content = file.read()

print("\nBinary data read:")
print(f"Raw bytes: {binary_content}")
print(f"Decoded as text: {binary_content.decode('utf-8')}")


Binary data read:
Raw bytes: b'ABCDEHello, Binary World!'
Decoded as text: ABCDEHello, Binary World!


**Working with Image Files**

In [23]:
# Example of copying an image file
def copy_image(source_path, destination_path):
    # Open source in binary read mode
    with open(source_path, 'rb') as source:
        # Read binary data
        image_data = source.read()

        # Open destination in binary write mode
        with open(destination_path, 'wb') as destination:
            # Write binary data
            destination.write(image_data)

    print(f"Image copied from {source_path} to {destination_path}")

In [24]:
# Create a sample binary file to act as our "image"
with open('sample.jpg', 'wb') as file:
    # Write some dummy data to represent an image
    file.write(bytes(range(0, 100)))  # Just sample bytes

print("\nCreated a sample binary file")

# Copy the "image"
copy_image('sample.jpg', 'sample_copy.jpg')


Created a sample binary file
Image copied from sample.jpg to sample_copy.jpg
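
`copy_image` reads the whole source file into memory, which is fine for small files. For larger ones, a possible alternative is to copy in fixed-size chunks with the standard library's `shutil.copyfileobj`. The function name `copy_file_in_chunks`, the 64 KiB chunk size, and the file names below are assumptions made for this sketch.

```python
import shutil

def copy_file_in_chunks(source_path, destination_path, chunk_size=64 * 1024):
    """Copy a binary file without holding all of it in memory at once."""
    with open(source_path, 'rb') as source, open(destination_path, 'wb') as destination:
        shutil.copyfileobj(source, destination, chunk_size)

# Create a 256,000-byte sample file (hypothetical name) and copy it.
with open('big_demo.bin', 'wb') as file:
    file.write(bytes(range(256)) * 1000)

copy_file_in_chunks('big_demo.bin', 'big_demo_copy.bin')

# Check that the copy matches the original byte for byte.
with open('big_demo.bin', 'rb') as a, open('big_demo_copy.bin', 'rb') as b:
    identical = a.read() == b.read()
print(identical)  # True
```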


**7. Working with CSV Files**

**Understanding CSV Format**


- CSV = Comma-Separated Values
- Used for storing tabular data

**Example CSV content:**
- Name,Age,City
- John,30,New York
- Alice,25,Boston

In [25]:
import csv  # Python's built-in CSV module

print("\nCSV files store tabular data like spreadsheets")


CSV files store tabular data like spreadsheets
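
One reason to use the `csv` module rather than splitting lines on commas is that a field may itself contain a comma. The sketch below writes to an in-memory `io.StringIO` buffer instead of a real file; the sample row is hypothetical.

```python
import csv
import io

buffer = io.StringIO()  # in-memory text "file"
writer = csv.writer(buffer)
writer.writerow(['Name', 'City'])
writer.writerow(['John', 'Austin, Texas'])  # this field contains a comma

csv_text = buffer.getvalue()
print(csv_text)  # the second field of the data row is wrapped in double quotes

# Reading the text back restores the original two fields.
rows = list(csv.reader(io.StringIO(csv_text, newline='')))
print(rows[1])  # ['John', 'Austin, Texas']
```

A plain `line.split(',')` on the same data row would return three pieces instead of two.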


**Writing CSV Files**

In [26]:
# Writing a simple CSV file
with open('people.csv', 'w', newline='') as file:
    # Create a CSV writer
    writer = csv.writer(file)

    # Write header row
    writer.writerow(['Name', 'Age', 'City'])

    # Write data rows
    writer.writerow(['John', 30, 'New York'])
    writer.writerow(['Alice', 25, 'Boston'])
    writer.writerow(['Bob', 35, 'Chicago'])
    writer.writerow(['Emma', 28, 'Los Angeles'])

print("CSV file 'people.csv' created")

CSV file 'people.csv' created


**Reading CSV Files**

In [27]:
# Reading from a CSV file
with open('people.csv', 'r', newline='') as file:
    # Create a CSV reader
    reader = csv.reader(file)

    print("\nReading CSV file 'people.csv':")

    # Read and display each row
    for i, row in enumerate(reader):
        if i == 0:
            print(f"Header: {row}")
        else:
            print(f"Row {i}: {row}")


Reading CSV file 'people.csv':
Header: ['Name', 'Age', 'City']
Row 1: ['John', '30', 'New York']
Row 2: ['Alice', '25', 'Boston']
Row 3: ['Bob', '35', 'Chicago']
Row 4: ['Emma', '28', 'Los Angeles']


**Using CSV DictReader and DictWriter**


- CSV DictReader and DictWriter work with dictionaries
- This is often more convenient than working with lists

In [28]:
# Writing CSV with DictWriter
with open('products.csv', 'w', newline='') as file:
    # Define field names
    fieldnames = ['Product', 'Price', 'Quantity']

    # Create a DictWriter
    writer = csv.DictWriter(file, fieldnames=fieldnames)

    # Write header
    writer.writeheader()

    # Write rows
    writer.writerow({'Product': 'Laptop', 'Price': 1200, 'Quantity': 5})
    writer.writerow({'Product': 'Phone', 'Price': 800, 'Quantity': 10})
    writer.writerow({'Product': 'Tablet', 'Price': 500, 'Quantity': 7})

print("\nCreated 'products.csv' using DictWriter")


Created 'products.csv' using DictWriter


In [29]:
# Reading CSV with DictReader
with open('products.csv', 'r', newline='') as file:
    # Create a DictReader
    reader = csv.DictReader(file)

    print("\nReading 'products.csv' using DictReader:")
    for row in reader:
        print(f"Product: {row['Product']}, Price: ${row['Price']}, Quantity: {row['Quantity']}")


Reading 'products.csv' using DictReader:
Product: Laptop, Price: $1200, Quantity: 5
Product: Phone, Price: $800, Quantity: 10
Product: Tablet, Price: $500, Quantity: 7
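
Values read with `DictReader` (or `csv.reader`) always come back as strings, even if numbers were written. A short sketch of converting them before doing arithmetic follows; `inventory_demo.csv` and the variable `inventory_value` are hypothetical names.

```python
import csv

# Write a small sample file (hypothetical name and data).
with open('inventory_demo.csv', 'w', newline='') as file:
    writer = csv.DictWriter(file, fieldnames=['Product', 'Price', 'Quantity'])
    writer.writeheader()
    writer.writerow({'Product': 'Laptop', 'Price': 1200, 'Quantity': 5})
    writer.writerow({'Product': 'Phone', 'Price': 800, 'Quantity': 10})

with open('inventory_demo.csv', 'r', newline='') as file:
    rows = list(csv.DictReader(file))

print(type(rows[0]['Price']))  # <class 'str'>

# Convert to int before multiplying.
inventory_value = sum(int(row['Price']) * int(row['Quantity']) for row in rows)
print(inventory_value)  # 14000
```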


**8. Practical Applications**

**Example 1: Simple Note-Taking App**

In [30]:
# Simple Note-Taking App
def add_note(note):
    with open('my_notes.txt', 'a') as file:
        file.write(note + '\n')
    print("Note added!")

In [31]:
def read_notes():
    try:
        with open('my_notes.txt', 'r') as file:
            notes = file.readlines()

            print("\nYour Notes:")
            print("-----------")

            if notes:
                for i, note in enumerate(notes, 1):
                    print(f"{i}. {note.strip()}")
            else:
                print("No notes found!")
    except FileNotFoundError:
        print("No notes file found. Add some notes first!")

In [32]:
# Add some notes
print("\nSimple Note-Taking App:")
add_note("Remember to buy groceries")
add_note("Call mom on Sunday")
add_note("Finish Python homework")

# Read all notes
read_notes()


Simple Note-Taking App:
Note added!
Note added!
Note added!

Your Notes:
-----------
1. Remember to buy groceries
2. Call mom on Sunday
3. Finish Python homework


**Example 2: Simple Data Analysis with CSV**

In [37]:
# Simple Data Analysis with CSV
import csv

In [38]:
# Create a CSV file with sales data
with open('sales.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(['Date', 'Product', 'Amount'])
    writer.writerow(['2023-01-01', 'Laptop', 1200])
    writer.writerow(['2023-01-01', 'Phone', 800])
    writer.writerow(['2023-01-02', 'Laptop', 1200])
    writer.writerow(['2023-01-02', 'Tablet', 500])
    writer.writerow(['2023-01-03', 'Phone', 800])

print("\nCreated sales.csv with sample data")


Created sales.csv with sample data


In [39]:
# Analyze sales data
with open('sales.csv', 'r', newline='') as file:
    reader = csv.reader(file)
    next(reader)  # Skip header row

    # Calculate total sales
    total_sales = 0
    product_sales = {}

    for row in reader:
        date, product, amount = row
        amount = int(amount)
        total_sales += amount

        # Track sales by product
        if product in product_sales:
            product_sales[product] += amount
        else:
            product_sales[product] = amount

    # Print results
    print("\nSales Analysis:")
    print(f"Total Sales: ${total_sales}")
    print("\nSales by Product:")
    for product, amount in product_sales.items():
        print(f"{product}: ${amount}")


Sales Analysis:
Total Sales: $4500

Sales by Product:
Laptop: $2400
Phone: $1600
Tablet: $500
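
The per-product totals above can also be accumulated with `collections.defaultdict`, which removes the `if product in product_sales` check. This is one possible variant, not a replacement for the cell above; it reads the same sample data from an in-memory string and adds a per-day total as a hypothetical extra.

```python
import csv
import io
from collections import defaultdict

# Same sample data as sales.csv, kept in memory for this sketch.
sales_csv = """Date,Product,Amount
2023-01-01,Laptop,1200
2023-01-01,Phone,800
2023-01-02,Laptop,1200
2023-01-02,Tablet,500
2023-01-03,Phone,800
"""

product_sales = defaultdict(int)  # missing keys start at 0
daily_sales = defaultdict(int)

for row in csv.DictReader(io.StringIO(sales_csv)):
    amount = int(row['Amount'])
    product_sales[row['Product']] += amount
    daily_sales[row['Date']] += amount

total_sales = sum(product_sales.values())
print(total_sales)          # 4500
print(dict(product_sales))  # {'Laptop': 2400, 'Phone': 1600, 'Tablet': 500}
print(dict(daily_sales))    # {'2023-01-01': 2000, '2023-01-02': 1700, '2023-01-03': 800}
```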


**Example 3: Simple Log File Generator and Analyzer**

In [34]:
# Simple Log File Generator and Analyzer
import random
from datetime import datetime, timedelta

In [35]:
# Generate a sample log file
def generate_log(days=3, entries_per_day=5):
    log_types = ['INFO', 'WARNING', 'ERROR']
    messages = {
        'INFO': ['Server started', 'User logged in', 'Database connected', 'Request processed'],
        'WARNING': ['Slow response', 'High memory usage', 'Connection unstable'],
        'ERROR': ['Database error', 'Connection lost', 'Service unavailable']
    }

    with open('server.log', 'w') as log_file:
        # Start date (`days` days ago; 3 by default)
        date = datetime.now() - timedelta(days=days)

        for day in range(days):
            current_date = date + timedelta(days=day)

            for _ in range(entries_per_day):
                # Generate random time
                hour = random.randint(0, 23)
                minute = random.randint(0, 59)
                second = random.randint(0, 59)

                # Format timestamp
                timestamp = current_date.replace(hour=hour, minute=minute, second=second)
                timestamp_str = timestamp.strftime('%Y-%m-%d %H:%M:%S')

                # Select random log type and message
                log_type = random.choice(log_types)
                message = random.choice(messages[log_type])

                # Write log entry
                log_file.write(f"{timestamp_str} {log_type} {message}\n")

    print("\nGenerated sample log file: server.log")

In [36]:
# Analyze log file
def analyze_log():
    log_counts = {'INFO': 0, 'WARNING': 0, 'ERROR': 0}

    with open('server.log', 'r') as log_file:
        for line in log_file:
            for log_type in log_counts:
                if f" {log_type} " in line:
                    log_counts[log_type] += 1

    print("\nLog Analysis Results:")
    total = sum(log_counts.values())
    for log_type, count in log_counts.items():
        percentage = (count / total) * 100 if total > 0 else 0
        print(f"{log_type}: {count} entries ({percentage:.1f}%)")

# Generate and analyze logs
generate_log()
analyze_log()


Generated sample log file: server.log

Log Analysis Results:
INFO: 5 entries (33.3%)
WARNING: 7 entries (46.7%)
ERROR: 3 entries (20.0%)
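
`analyze_log` finds the level by searching each line for a substring, which could miscount if a message happened to contain a level name. A possible alternative is to split each line into its fields, relying on the `date time LEVEL message` layout that `generate_log` writes. The function `count_levels`, the file `parse_demo.log`, and the sample lines are hypothetical.

```python
from collections import Counter

# Hypothetical sample lines in the same layout as server.log.
sample_log = [
    "2024-01-01 10:00:00 INFO Server started",
    "2024-01-01 10:05:00 WARNING Slow response",
    "2024-01-01 10:06:00 ERROR Database error",
    "2024-01-01 10:07:00 INFO User logged in",
]
with open('parse_demo.log', 'w') as log_file:
    log_file.write("\n".join(sample_log) + "\n")

def count_levels(path):
    """Count log entries per level by reading the third space-separated field."""
    counts = Counter()
    with open(path, 'r') as log_file:
        for line in log_file:
            parts = line.rstrip("\n").split(" ", 3)  # date, time, level, message
            if len(parts) == 4:                      # skip malformed lines
                counts[parts[2]] += 1
    return counts

counts = count_levels('parse_demo.log')
print(counts['INFO'], counts['WARNING'], counts['ERROR'])  # 2 1 1
```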


## Summary

This notebook covered the core of Python file handling:

1. **Basic File Operations**:
   - Opening and closing files
   - Writing to files
   - Reading from files

2. **File Positioning**:
   - Sequential access
   - Random access using seek()

3. **Different File Types**:
   - Text files
   - Binary files
   - CSV files

4. **File Modes**:
   - Read, write, and append modes
   - Update modes (r+, w+, a+)
   - Binary modes (rb, wb, ab)

5. **Practical Applications**:
   - Note-taking app
   - Data analysis with CSV
   - Log file generator and analyzer

These file handling skills form the foundation for data processing, configuration management, logging, and many other programming tasks in Python.