# Working with Existing Course Examples

This notebook revisits the examples from the original course materials and shows how they fit into what we have since learned about file programming.

## 🔄 File Modes in Practice (from class01.py)

Let's revisit the original examples and understand them better with our new knowledge:

In [None]:
# Import our examples module
import sys
sys.path.append('../examples')

from basic_file_operations import (
    demo_write_mode, demo_append_mode, demo_exclusive_mode,
    demo_binary_write_mode, demo_read_plus_write_mode
)

# Run the original demonstrations with our enhanced understanding
print("🔄 Running Original Course Examples with Enhanced Understanding")
print("=" * 70)

demo_write_mode()
demo_append_mode()
demo_exclusive_mode()
demo_binary_write_mode()
demo_read_plus_write_mode()
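
The functions imported above live in `basic_file_operations`, which is not shown in this notebook. As a rough, self-contained sketch of the behaviors those mode names refer to (this is not the module's actual code, and the scratch file names are hypothetical):

```python
import os
import tempfile

# Hypothetical scratch directory; not part of the course's sample_files.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "modes_demo.txt")

# 'w' creates the file, or truncates it if it already exists.
with open(path, "w", encoding="utf-8") as f:
    f.write("first\n")

# 'a' keeps existing content and writes at the end.
with open(path, "a", encoding="utf-8") as f:
    f.write("second\n")

# 'x' refuses to overwrite: it raises FileExistsError if the file exists.
try:
    with open(path, "x", encoding="utf-8") as f:
        f.write("never written\n")
    exclusive_failed = False
except FileExistsError:
    exclusive_failed = True

# 'r+' opens for reading and writing without truncating;
# writes start at the current position (offset 0 here).
with open(path, "r+", encoding="utf-8") as f:
    f.write("FIRST")

with open(path, "r", encoding="utf-8") as f:
    final_text = f.read()

# 'wb' writes bytes rather than str; no encoding or newline translation applies.
bin_path = os.path.join(workdir, "modes_demo.bin")
with open(bin_path, "wb") as f:
    f.write(bytes([0, 1, 2, 255]))
binary_size = os.path.getsize(bin_path)

print(exclusive_failed)
print(repr(final_text))
print(binary_size)
```

The `'r+'` write replaces only the first five characters, so the second line written in append mode survives.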

## 📊 Performance Analysis (from class02.py)

The original course included a performance comparison between pickle and text files. Let's expand on this:

In [None]:
import pickle
import json
import time
import os

def enhanced_pickle_vs_text_comparison():
    """Enhanced version of the original pickle vs text comparison"""
    
    # Create more complex test data
    test_data = {
        'user_info': {
            'name': 'John Doe',
            'age': 30,
            'email': 'john@example.com',
            'preferences': ['python', 'data_science', 'machine_learning']
        },
        'session_data': {
            'login_time': time.time(),
            'actions': ['login', 'view_dashboard', 'edit_profile', 'logout'],
            'settings': {'theme': 'dark', 'notifications': True}
        },
        'large_dataset': list(range(10000)),  # Simulate larger data
        'metadata': {
            'version': '2.1.0',
            'created_by': 'file_programming_course',
            'encoding': 'utf-8'
        }
    }
    
    print("📊 Enhanced Pickle vs Text File Comparison")
    print("=" * 50)
    
    # Test pickle operations
    pickle_file = '../sample_files/enhanced_data.pkl'
    
    # Pickle write (perf_counter is the high-resolution clock intended for timing)
    start_time = time.perf_counter()
    with open(pickle_file, 'wb') as f:
        pickle.dump(test_data, f)
    pickle_write_time = time.perf_counter() - start_time
    
    # Pickle read
    start_time = time.perf_counter()
    with open(pickle_file, 'rb') as f:
        pickle_loaded = pickle.load(f)
    pickle_read_time = time.perf_counter() - start_time
    
    # Test JSON operations
    json_file = '../sample_files/enhanced_data.json'
    
    # JSON write
    start_time = time.perf_counter()
    with open(json_file, 'w', encoding='utf-8') as f:
        json.dump(test_data, f, indent=2)
    json_write_time = time.perf_counter() - start_time
    
    # JSON read
    start_time = time.perf_counter()
    with open(json_file, 'r', encoding='utf-8') as f:
        json_loaded = json.load(f)
    json_read_time = time.perf_counter() - start_time
    
    # Get file sizes
    pickle_size = os.path.getsize(pickle_file)
    json_size = os.path.getsize(json_file)
    
    # Display results
    print(f"⏱️ Performance Results:")
    print(f"   Pickle write: {pickle_write_time:.6f} seconds")
    print(f"   JSON write:   {json_write_time:.6f} seconds")
    print(f"   Pickle read:  {pickle_read_time:.6f} seconds")
    print(f"   JSON read:    {json_read_time:.6f} seconds")
    
    print(f"\n💾 File Size Comparison:")
    print(f"   Pickle file: {pickle_size:,} bytes")
    print(f"   JSON file:   {json_size:,} bytes")
    print(f"   Size ratio:  {json_size/pickle_size:.2f}x (JSON vs Pickle)")
    
    print(f"\n🔍 Data Integrity Check:")
    print(f"   Pickle data matches: {pickle_loaded == test_data}")
    print(f"   JSON data matches:   {json_loaded == test_data}")
    
    # Show the start of each file (first bytes / first characters)
    print("\n📄 File Content Preview:")
    with open(pickle_file, 'rb') as f:
        pickle_preview = f.read(50)
    print(f"   Pickle (binary): {pickle_preview}...")
    
    with open(json_file, 'r', encoding='utf-8') as f:
        json_preview = f.read(200)
    print(f"   JSON (text): {json_preview}...")

enhanced_pickle_vs_text_comparison()
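
The integrity check above passes for both formats because the test data contains only strings, numbers, booleans, lists, and string-keyed dictionaries. A small sketch with a hypothetical record shows where the two formats differ once other Python types are involved:

```python
import json
import pickle

# Hypothetical record using types that JSON has no direct equivalent for:
# a tuple value and an integer dictionary key.
record = {"point": (1, 2), 42: "answer"}

pickle_copy = pickle.loads(pickle.dumps(record))
json_copy = json.loads(json.dumps(record))

print(pickle_copy == record)  # pickle preserves tuples and int keys
print(json_copy == record)    # JSON turns the tuple into a list and the key into "42"
print(json_copy)
```

The trade-off runs the other way for safety and portability: JSON is readable from any language, while pickle data should only be loaded from sources you trust, because unpickling can execute arbitrary code.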

## 🔍 File Seeking Operations (from class_03.py)

The original course touched on file seeking. Let's explore this more thoroughly:

In [None]:
from advanced_file_techniques import demonstrate_file_seeking, performance_comparison_with_seeking

print("🎯 Enhanced File Seeking Demonstration")
print("=" * 50)

# Run the enhanced seeking demonstrations
demonstrate_file_seeking()
performance_comparison_with_seeking()
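
The seeking functions above come from `advanced_file_techniques`, which is not reproduced here. As a minimal sketch of the underlying idea, assuming a hypothetical file of fixed-width 10-byte records, `seek()` and `tell()` allow jumping directly to a record instead of reading everything before it:

```python
import os
import tempfile

# Hypothetical fixed-width layout: "rec" + 6 digits + newline = 10 bytes.
RECORD_SIZE = 10
records = [f"rec{i:06d}\n".encode("ascii") for i in range(100)]

fd, path = tempfile.mkstemp(suffix=".dat")
with os.fdopen(fd, "wb") as f:
    f.writelines(records)

with open(path, "rb") as f:
    # Jump straight to record 42 without reading the records before it.
    f.seek(42 * RECORD_SIZE)
    record_42 = f.read(RECORD_SIZE)
    position_after = f.tell()  # byte offset just past record 42

    # os.SEEK_END positions relative to the end of the file.
    f.seek(-RECORD_SIZE, os.SEEK_END)
    last_record = f.read(RECORD_SIZE)

os.remove(path)
print(record_42, position_after, last_record)
```

Binary mode is used because byte offsets are only guaranteed to be meaningful there; in text mode, `seek()` should be given values previously returned by `tell()`.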

## 📝 File Analysis Homework Solutions

Let's solve the homework problems from the original course with proper implementations:

In [None]:
from file_analysis_homework import (
    count_words, count_lines, count_characters, count_sentences,
    count_vowels, count_consonants, count_uppercase_letters,
    count_digits, count_spaces, count_tabs, count_newlines,
    count_punctuation, analyze_file_comprehensive
)

# Create a comprehensive test file
test_content = """Hello World! This is a comprehensive test file.
It contains UPPERCASE and lowercase letters, numbers like 123 and 456.
There are punctuation marks: periods, commas, exclamation points!
Some lines have\ttabs\tand   multiple   spaces.
We also have vowels (a, e, i, o, u) and consonants (b, c, d, f, g...).

This file tests all our analysis functions.
Can you count everything correctly? Let's see!"""

test_file = '../sample_files/comprehensive_test.txt'
with open(test_file, 'w', encoding='utf-8') as f:
    f.write(test_content)

print("📊 Comprehensive File Analysis (Homework Solutions)")
print("=" * 60)

# Run comprehensive analysis
analyze_file_comprehensive(test_file)

print("\n🎯 Individual Function Tests:")
functions_to_test = [
    ('Words', count_words),
    ('Lines', count_lines),
    ('Characters', count_characters),
    ('Sentences', count_sentences),
    ('Vowels', count_vowels),
    ('Consonants', count_consonants),
    ('Uppercase', count_uppercase_letters),
    ('Digits', count_digits),
    ('Spaces', count_spaces),
    ('Tabs', count_tabs),
    ('Newlines', count_newlines),
    ('Punctuation', count_punctuation)
]

for name, func in functions_to_test:
    result = func(test_file)
    print(f"   {name:12}: {result:4,}")
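
The counting functions are imported from `file_analysis_homework` and their source is not shown here. One possible way to write such functions, given only that each takes a file path and returns an integer, is sketched below. The names ending in `_sketch` and the helper `count_matching` are hypothetical and are not the module's implementations:

```python
import os
import tempfile

def count_matching(path, predicate):
    """Count characters in a text file for which predicate(char) is true."""
    with open(path, "r", encoding="utf-8") as f:
        return sum(1 for line in f for ch in line if predicate(ch))

def count_vowels_sketch(path):
    return count_matching(path, lambda ch: ch.lower() in "aeiou")

def count_digits_sketch(path):
    return count_matching(path, str.isdigit)

def count_words_sketch(path):
    # Words are whitespace-separated tokens, so tabs and repeated spaces are handled.
    with open(path, "r", encoding="utf-8") as f:
        return sum(len(line.split()) for line in f)

# Small scratch file so the counts can be checked by hand.
fd, sample_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w", encoding="utf-8") as f:
    f.write("Hello World 123\nTabs\tand spaces\n")

vowels = count_vowels_sketch(sample_path)
digits = count_digits_sketch(sample_path)
words = count_words_sketch(sample_path)
os.remove(sample_path)
print(vowels, digits, words)
```

Each function reads the file line by line, so the same approach works for files too large to load at once.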

## 🌟 Demonstrating File Extension Independence

Let's prove that file extensions don't define content, as mentioned in the original course:

In [None]:
def demonstrate_extension_independence():
    """Show that file extensions don't define content"""
    
    print("🎭 File Extension Independence Demonstration")
    print("=" * 50)
    
    # Create the same content with different extensions
    content = "This is the same content, regardless of extension!"
    
    files_with_different_extensions = [
        '../sample_files/same_content.txt',
        '../sample_files/same_content.random_world',
        '../sample_files/same_content.fake_video',
        '../sample_files/same_content.not_an_image',
        '../sample_files/same_content.whatever'
    ]
    
    # Write the same content to all files
    for filepath in files_with_different_extensions:
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(content)
    
    print("✅ Created files with identical content but different extensions:")
    
    # Read and verify all files have the same content
    for filepath in files_with_different_extensions:
        with open(filepath, 'r', encoding='utf-8') as f:
            read_content = f.read()
        
        extension = filepath.split('.')[-1]
        matches = read_content == content
        status = '✅' if matches else '❌'  # only show a check mark when the comparison passed
        print(f"   .{extension:15} → Content matches: {matches} {status}")
    
    print("\n💡 Key Insight: The extension doesn't change the content!")
    print("   An extension is only a naming hint. Tools that need to be sure of a")
    print("   file's type inspect its content, for example its leading signature bytes.")
    
    # Demonstrate with binary content too
    print("\n🔢 Same demonstration with binary content:")
    
    binary_content = b'\x89PNG\r\n\x1a\n'  # PNG file signature
    binary_files = [
        '../sample_files/fake_png.txt',
        '../sample_files/fake_png.doc',
        '../sample_files/fake_png.mp3'
    ]
    
    for filepath in binary_files:
        with open(filepath, 'wb') as f:
            f.write(binary_content)
    
    for filepath in binary_files:
        with open(filepath, 'rb') as f:
            read_binary = f.read()
        
        extension = filepath.split('.')[-1]
        is_png_signature = read_binary.startswith(binary_content)  # full 8-byte signature
        status = '✅' if is_png_signature else '❌'
        print(f"   .{extension:3} file → Starts with PNG signature: {is_png_signature} {status}")
    
    print("\n🎯 Conclusion: A file's content is independent of its extension!")

demonstrate_extension_independence()
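
To make the point about content-based detection concrete, here is a minimal sketch of signature sniffing. The `SIGNATURES` table and the `sniff_type` function are illustrative names introduced for this example, and the table covers only a handful of well-known formats:

```python
import os
import tempfile

# A few well-known leading signatures ("magic bytes"); illustrative, not exhaustive.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"%PDF-": "PDF document",
    b"PK\x03\x04": "ZIP archive",
    b"GIF87a": "GIF image",
    b"GIF89a": "GIF image",
}

def sniff_type(path):
    """Guess a file's type from its first bytes, ignoring the extension."""
    with open(path, "rb") as f:
        head = f.read(16)
    for signature, label in SIGNATURES.items():
        if head.startswith(signature):
            return label
    return "unknown"

# Write a PNG signature into a file with a misleading .mp3 extension.
fd, misleading_path = tempfile.mkstemp(suffix=".mp3")
with os.fdopen(fd, "wb") as f:
    f.write(b"\x89PNG\r\n\x1a\n")

detected = sniff_type(misleading_path)
os.remove(misleading_path)
print(detected)
```

A matching signature only suggests the format; a file that starts with the PNG signature, like the 8-byte files created above, is not necessarily a valid image.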

## 🔧 Advanced Binary Operations

Building on the binary file concepts from the original course:

In [None]:
from advanced_file_techniques import demonstrate_binary_operations, demonstrate_struct_operations

print("🔢 Advanced Binary File Operations")
print("=" * 50)

# Run binary demonstrations
demonstrate_binary_operations()
demonstrate_struct_operations()
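
The two functions above are defined in `advanced_file_techniques`. For readers without that module at hand, the core idea behind `struct` can be sketched in a few lines, assuming a hypothetical record layout of an ID, a score, and a 4-byte tag:

```python
import struct

# Hypothetical record layout: little-endian ("<"), unsigned 16-bit id ("H"),
# 32-bit float score ("f"), and a 4-byte tag ("4s").
RECORD_FORMAT = "<Hf4s"
record_size = struct.calcsize(RECORD_FORMAT)

# pack() turns Python values into a fixed-size bytes object...
packed = struct.pack(RECORD_FORMAT, 7, 1.5, b"DATA")

# ...and unpack() reverses it, given the same format string.
record_id, score, tag = struct.unpack(RECORD_FORMAT, packed)

print(record_size, packed, (record_id, score, tag))
```

Specifying the byte order explicitly (`<` here) avoids platform-dependent padding and keeps the file layout the same across machines.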

## 📈 Memory-Efficient File Processing

For large files, we need efficient processing techniques:

In [None]:
def demonstrate_efficient_processing():
    """Show memory-efficient file processing techniques"""
    
    print("🚀 Memory-Efficient File Processing")
    print("=" * 40)
    
    # Create a moderately large file for demonstration
    large_file = '../sample_files/processing_demo.txt'
    
    print("📝 Creating demonstration file...")
    with open(large_file, 'w', encoding='utf-8') as f:
        for i in range(5000):
            f.write(f"Line {i+1:04d}: This is sample content for processing demonstration.\n")
    
    file_size = os.path.getsize(large_file)
    print(f"✅ Created file with {file_size:,} bytes")
    
    # Method 1: Process line by line (memory efficient)
    print("\n🔄 Method 1: Line-by-line processing (memory efficient)")
    start_time = time.perf_counter()
    
    line_count = 0
    word_count = 0
    char_count = 0
    
    with open(large_file, 'r', encoding='utf-8') as f:
        for line in f:
            line_count += 1
            word_count += len(line.split())
            char_count += len(line)
    
    line_time = time.perf_counter() - start_time
    
    print(f"   Processed {line_count:,} lines")
    print(f"   Found {word_count:,} words")
    print(f"   Counted {char_count:,} characters")
    print(f"   Time taken: {line_time:.3f} seconds")
    
    # Method 2: Process in chunks (for binary or when you need more control)
    print("\n🔄 Method 2: Chunk-based processing")
    start_time = time.perf_counter()
    
    chunk_size = 4096  # 4 KiB chunks
    total_bytes = 0
    chunk_count = 0
    
    with open(large_file, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
            chunk_count += 1
    
    chunk_time = time.perf_counter() - start_time
    
    print(f"   Processed {chunk_count:,} chunks")
    print(f"   Total bytes: {total_bytes:,}")
    print(f"   Time taken: {chunk_time:.3f} seconds")
    
    # Method 3: Generator-based processing (as lazy as Method 1, but reusable and composable)
    print("\n🔄 Method 3: Generator-based processing (lazy and composable)")
    
    def process_file_generator(filepath):
        """Generator that yields processed lines"""
        with open(filepath, 'r', encoding='utf-8') as f:
            for line_num, line in enumerate(f, 1):
                # Process each line and yield results
                words = line.split()
                yield {
                    'line_number': line_num,
                    'word_count': len(words),
                    'char_count': len(line),
                    'first_word': words[0] if words else ''
                }
    
    start_time = time.perf_counter()
    
    total_words = 0
    total_chars = 0
    processed_lines = 0
    
    for line_info in process_file_generator(large_file):
        total_words += line_info['word_count']
        total_chars += line_info['char_count']
        processed_lines += 1
        
        # Show progress every 1000 lines
        if processed_lines % 1000 == 0:
            print(f"   Processed {processed_lines:,} lines...")
    
    generator_time = time.perf_counter() - start_time
    
    print(f"   Final results: {processed_lines:,} lines, {total_words:,} words, {total_chars:,} chars")
    print(f"   Time taken: {generator_time:.3f} seconds")
    
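    # The three methods do different amounts of work, so these single-run timings are only indicative
    print("\n📊 Performance Summary (single run, indicative only):")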
    print(f"   Line-by-line: {line_time:.3f}s")
    print(f"   Chunk-based:  {chunk_time:.3f}s")
    print(f"   Generator:    {generator_time:.3f}s")
    
    # Clean up
    os.remove(large_file)
    print(f"\n🧹 Cleaned up demonstration file")

demonstrate_efficient_processing()
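
Chunk-based reading becomes more useful when each chunk feeds an incremental computation. As one hedged example (the function name `sha256_of_file` and the 64 KiB chunk size are choices made for this sketch), a file can be hashed while holding only one chunk in memory at a time:

```python
import hashlib
import os
import tempfile
from functools import partial

def sha256_of_file(path, chunk_size=64 * 1024):
    """Hash a file while holding at most one chunk in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter(callable, sentinel) keeps calling f.read(chunk_size)
        # until it returns b"" at end of file.
        for chunk in iter(partial(f.read, chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Scratch file of about 600 KB, spanning several chunks.
payload = b"sample line\n" * 50_000
fd, big_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(payload)

chunked_digest = sha256_of_file(big_path)
whole_digest = hashlib.sha256(payload).hexdigest()
os.remove(big_path)

print(chunked_digest == whole_digest)
```

The chunked digest matches the digest of the whole payload, which confirms that splitting the read does not change the result.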

## 🎯 Key Takeaways from Original Course Integration

1. **File modes matter** - Understanding when to use 'r', 'w', 'a', 'x', and their combinations
2. **Performance considerations** - Pickle vs JSON, seeking vs full reads
3. **File extensions are just hints** - Content determines file type, not extension
4. **Memory efficiency** - Process large files line-by-line or in chunks
5. **Binary vs text** - Know when to use each mode
6. **Error handling** - Always use context managers and handle exceptions
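
The last takeaway can be sketched as follows. The helper name `read_text_or_default` and its fallback behavior are assumptions made for illustration; the appropriate response to a failure depends on the application:

```python
import os
import tempfile

def read_text_or_default(path, default=""):
    """Return the file's text, or `default` if it is missing or unreadable."""
    try:
        # The context manager closes the file even if read() raises.
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return default
    except (PermissionError, UnicodeDecodeError) as exc:
        print(f"Could not read {path}: {exc}")
        return default

workdir = tempfile.mkdtemp()

# A path that does not exist falls back to the default.
missing_result = read_text_or_default(
    os.path.join(workdir, "missing.txt"), default="<no data>"
)

# An existing file is returned as-is.
present_path = os.path.join(workdir, "present.txt")
with open(present_path, "w", encoding="utf-8") as f:
    f.write("hello")
present_result = read_text_or_default(present_path)

print(missing_result, present_result)
```

Catching specific exception types, rather than a bare `except`, keeps unrelated bugs from being silently swallowed.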

## 🔜 What's Next?

The original course materials provide a solid foundation. In the next notebooks, we'll explore:
- Modern file handling with `pathlib`
- Working with structured data (CSV, JSON)
- Advanced binary file formats
- File system operations and management

This integration shows how fundamental concepts build into advanced file programming skills!