# Virtual Nodes: Lazy Operations

**Duration:** 30 minutes  
**Level:** Intermediate

Learn about virtual nodes: lazily evaluated nodes for concatenating files, generating diffs, and building archives.

## What You'll Learn

- What virtual nodes are and why they're useful
- `iternode()` - Lazy concatenation of multiple files
- `diffnode()` - Lazy unified diff generation
- `zip()` - Creating ZIP archives
- Building documents dynamically
- Comparing file versions

## What are Virtual Nodes?

Virtual nodes are special nodes that **don't correspond to physical files**. They:
- Store references to other nodes
- Compute content **on-demand** (lazy evaluation)
- Are **read-only** (can't write to them)
- Always report `exists` as `False`

Let's explore! 🔮

In [None]:
from genro_storage import StorageManager

storage = StorageManager()
storage.configure([{'name': 'mem', 'type': 'memory'}])

print("✓ Storage ready")

## 1. Your First Virtual Node: iternode

Concatenate multiple files without creating intermediate files:

In [None]:
# Create source files
part1 = storage.node('mem:part1.txt')
part1.write_text('This is part one. ')

part2 = storage.node('mem:part2.txt')
part2.write_text('This is part two. ')

part3 = storage.node('mem:part3.txt')
part3.write_text('This is part three.')

# Create virtual concatenation node
combined = storage.iternode(part1, part2, part3)

print(f"Is virtual: exists={combined.exists}")
print("\nCombined content:")
print(combined.read_text())

## 2. Lazy Evaluation

Content is read only when you access it, so changes made to a source after the virtual node is created are still picked up:

In [None]:
# Create source
source = storage.node('mem:source.txt')
source.write_text('Original')

# Create iternode
lazy = storage.iternode(source)

# Modify source AFTER creating iternode
source.write_text('Modified')

# Read from iternode - gets current content!
print(f"Iternode reads: {lazy.read_text()}")
print("\n✓ Changes to source are reflected in virtual node!")
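To see why the virtual node picks up the later write, here is a minimal standard-library sketch of the idea. It is not how `genro_storage` implements `iternode()`; `LazyConcat` and the `files` dict are hypothetical stand-ins. The point is that the object stores references (keys), not content, and resolves them only when `read_text()` is called.

```python
# Conceptual sketch only: LazyConcat and `files` are hypothetical,
# not part of genro_storage.
files = {}  # stands in for an in-memory storage backend


class LazyConcat:
    """Holds references to entries in `files` and reads them on demand."""

    def __init__(self, *keys):
        self._keys = list(keys)  # references only, no content is copied

    def append(self, key):
        self._keys.append(key)

    def read_text(self):
        # Content is looked up at call time, so later writes are visible.
        return "".join(files[key] for key in self._keys)


files["source.txt"] = "Original"
lazy_sketch = LazyConcat("source.txt")

files["source.txt"] = "Modified"  # change the source after creating the node
lazy_result = lazy_sketch.read_text()
print(lazy_result)
```

Had the constructor copied the content instead of the key, the read would still return the original text.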

## 3. Building Documents Dynamically

Start from an empty `iternode()` and add nodes one at a time with `append()`:

In [None]:
# Start with empty builder
builder = storage.iternode()

# Add header
header = storage.node('mem:header.txt')
header.write_text('=== REPORT ===\n\n')
builder.append(header)

# Add sections dynamically
for i in range(1, 4):
    section = storage.node(f'mem:section{i}.txt')
    section.write_text(f'Section {i}\nContent here...\n\n')
    builder.append(section)

# Add footer
footer = storage.node('mem:footer.txt')
footer.write_text('=== END ===')
builder.append(footer)

# Materialize the document
print("Built document:")
print(builder.read_text())

## 4. Extend with Multiple Nodes

Use `extend()` to add several nodes in one call:

In [None]:
# Create multiple files
files = []
for i in range(5):
    f = storage.node(f'mem:log{i}.txt')
    f.write_text(f'Log entry {i}\n')
    files.append(f)

# Build log file
log_builder = storage.iternode()
log_builder.extend(*files)  # Add all at once

print("Combined log:")
print(log_builder.read_text())

## 5. Saving Virtual Node Content

Use `copy()` to materialize a virtual node's content into a real file:

In [None]:
# Build content
intro = storage.node('mem:intro.txt')
intro.write_text('Introduction\n')

body = storage.node('mem:body.txt')
body.write_text('Main content\n')

conclusion = storage.node('mem:conclusion.txt')
conclusion.write_text('Conclusion\n')

document = storage.iternode(intro, body, conclusion)

# Save to real file
final = storage.node('mem:final_document.txt')
document.copy(final)

print("✓ Document saved")
print(f"Final file exists: {final.exists}")
print(f"Size: {final.size} bytes")
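Materializing a lazy concatenation amounts to streaming each source into the destination in order. The sketch below shows that with standard-library streams. `iter_chunks` and `materialize` are hypothetical helpers, and whether `copy()` streams in chunks internally is an assumption, not something this tutorial confirms.

```python
import io


def iter_chunks(sources, chunk_size=4):
    """Yield the bytes of each source stream in order, chunk by chunk."""
    for src in sources:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            yield chunk


def materialize(sources, dest, chunk_size=4):
    """Write the concatenated sources to dest and return the byte count."""
    written = 0
    for chunk in iter_chunks(sources, chunk_size):
        written += dest.write(chunk)
    return written


sketch_sources = [
    io.BytesIO(b"Introduction\n"),
    io.BytesIO(b"Main content\n"),
    io.BytesIO(b"Conclusion\n"),
]
sketch_dest = io.BytesIO()
total_written = materialize(sketch_sources, sketch_dest)
print(total_written, sketch_dest.getvalue())
```

Because only one chunk is held in memory at a time, this approach never needs the full concatenated content in memory.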

## 6. Diffnode: Comparing Files

Generate unified diffs between files:

In [None]:
# Create two versions
v1 = storage.node('mem:config_v1.txt')
v1.write_text('''database:
  host: localhost
  port: 5432
  name: myapp
cache:
  enabled: true
  ttl: 3600
''')

v2 = storage.node('mem:config_v2.txt')
v2.write_text('''database:
  host: prod-server.example.com
  port: 5432
  name: myapp_prod
cache:
  enabled: true
  ttl: 7200
''')

# Create diff
diff = storage.diffnode(v1, v2)

print("Unified diff:")
print(diff.read_text())
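For reference, this is what a unified diff of the same two configs looks like when produced with Python's standard `difflib` module. That `diffnode()` emits the same headers and context size is an assumption; treat this as an illustration of the unified diff format rather than the library's exact output.

```python
import difflib

old_config_text = """database:
  host: localhost
  port: 5432
  name: myapp
cache:
  enabled: true
  ttl: 3600
"""
new_config_text = """database:
  host: prod-server.example.com
  port: 5432
  name: myapp_prod
cache:
  enabled: true
  ttl: 7200
"""

# unified_diff works on lists of lines; keepends=True preserves newlines
diff_lines = list(
    difflib.unified_diff(
        old_config_text.splitlines(keepends=True),
        new_config_text.splitlines(keepends=True),
        fromfile="config_v1.txt",
        tofile="config_v2.txt",
    )
)
diff_text = "".join(diff_lines)
print(diff_text)
```

Lines prefixed with `-` come from the first file, lines prefixed with `+` from the second, and unprefixed lines are unchanged context.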

## 7. Saving Diffs

Save diff output to a file:

In [None]:
# Create diff file
diff_file = storage.node('mem:changes.diff')
diff.copy(diff_file)

print("✓ Diff saved to file")
print(f"File size: {diff_file.size} bytes")
print("\nContent:")
print(diff_file.read_text())

## 8. Diffing Identical Files

Diffing two identical files produces an empty result:

In [None]:
# Two identical files
same1 = storage.node('mem:same1.txt')
same1.write_text('Identical content\n')

same2 = storage.node('mem:same2.txt')
same2.write_text('Identical content\n')

# Create diff
no_diff = storage.diffnode(same1, same2)
result = no_diff.read_text()

print(f"Diff output length: {len(result)} chars")
if result:
    print(f"Diff: {result}")
else:
    print("✓ Files are identical, no diff")
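The same behavior can be observed with the standard `difflib` module, which yields no lines at all (not even file headers) when both inputs match. This is a stdlib illustration only; it assumes nothing about how `diffnode()` is built.

```python
import difflib

identical_lines = "Identical content\n".splitlines(keepends=True)

# No hunks means no output, so joining the generator gives an empty string.
identical_diff = "".join(difflib.unified_diff(identical_lines, identical_lines))
print(repr(identical_diff))
```

An empty string is falsy in Python, which is why a plain `if result:` check is enough to detect "no changes".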

## 9. Binary Files and Diff

`diffnode()` detects binary files and raises `ValueError` when you try to read the diff:

In [None]:
# Create binary files
bin1 = storage.node('mem:file1.bin')
bin1.write_bytes(b'\x00\x01\x02\x03')

bin2 = storage.node('mem:file2.bin')
bin2.write_bytes(b'\x04\x05\x06\x07')

# Try to diff binary files
binary_diff = storage.diffnode(bin1, bin2)

try:
    binary_diff.read_text()
except ValueError as e:
    print(f"✓ Binary files rejected: {e}")
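How `diffnode()` decides that a file is binary is not stated in this tutorial. A common heuristic, sketched below with a hypothetical `looks_binary` helper, is to treat data as binary if it contains a NUL byte or fails to decode as UTF-8.

```python
def looks_binary(data: bytes) -> bool:
    """Heuristic: NUL bytes or invalid UTF-8 suggest binary content."""
    if b"\x00" in data:
        return True
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


print(looks_binary(b"\x00\x01\x02\x03"))  # contains a NUL byte
print(looks_binary(b"\xff\xfe\xfd"))      # not valid UTF-8
print(looks_binary(b"plain text\n"))      # ordinary text
```

Heuristics like this can misclassify data: bytes such as `b'\x04\x05\x06\x07'` contain no NUL and decode as valid UTF-8, so this particular sketch would treat them as text.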

## 10. Creating ZIP Archives

Call `zip()` on a node to get a ZIP archive as bytes:

In [None]:
# Create files to archive
doc1 = storage.node('mem:document1.txt')
doc1.write_text('First document')

doc2 = storage.node('mem:document2.txt')
doc2.write_text('Second document')

doc3 = storage.node('mem:document3.txt')
doc3.write_text('Third document')

# Create iternode with all files
archive = storage.iternode(doc1, doc2, doc3)

# Generate ZIP
zip_bytes = archive.zip()

print(f"✓ ZIP created: {len(zip_bytes)} bytes")
print(f"Starts with ZIP signature: {zip_bytes[:2] == b'PK'}")
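The `PK` check works because a ZIP archive with at least one entry starts with the local file header signature `PK\x03\x04`. The sketch below builds a comparable archive in memory with the standard `zipfile` module. It illustrates the format only and makes no claim about how `zip()` is implemented; `build_zip` is a hypothetical helper.

```python
import io
import zipfile


def build_zip(entries):
    """Return ZIP bytes for a mapping of archive name -> content bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, content in entries.items():
            zf.writestr(name, content)
    return buffer.getvalue()


demo_zip = build_zip({
    "document1.txt": b"First document",
    "document2.txt": b"Second document",
})
print(demo_zip[:4])
```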

## 11. Saving ZIP Archives

Write ZIP to a file:

In [None]:
# Save ZIP to file
zip_file = storage.node('mem:backup.zip')
zip_file.write_bytes(zip_bytes)

print("✓ ZIP file saved")
print(f"File: {zip_file.fullpath}")
print(f"Size: {zip_file.size} bytes")

# Verify it's a valid ZIP
import zipfile
import io

with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
    print("\nZIP contains:")
    for name in zf.namelist():
        print(f"  - {name}")

## 12. ZIP a Single File

You can also ZIP individual files:

In [None]:
# Create a file
report = storage.node('mem:report.pdf')
report.write_text('PDF content here...')

# ZIP it directly
zip_bytes = report.zip()

# Check contents
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
    print(f"ZIP contains: {zf.namelist()}")
    print(f"Content: {zf.read('report.pdf').decode()}")

## 13. ZIP a Directory

ZIP an entire directory tree:

In [None]:
# Create directory structure
project = storage.node('mem:project')
project.mkdir()

project.child('README.md').write_text('# My Project')
project.child('main.py').write_text('print("hello")')

src = project.child('src')
src.mkdir()
src.child('app.py').write_text('# App code')
src.child('utils.py').write_text('# Utils')

# ZIP entire directory
zip_bytes = project.zip()

# Inspect
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
    print("Project ZIP contains:")
    for name in sorted(zf.namelist()):
        print(f"  {name}")
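For comparison, here is how a directory tree on the local filesystem can be zipped with the standard library. `zip_tree` is a hypothetical helper, and storing entry names relative to the directory root is a choice made for this sketch; this tutorial does not show which naming `project.zip()` uses.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def zip_tree(root: Path) -> bytes:
    """Zip every file under root, storing paths relative to root."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # as_posix() keeps forward slashes in entry names on all platforms
                zf.write(path, arcname=path.relative_to(root).as_posix())
    return buffer.getvalue()


with tempfile.TemporaryDirectory() as tmp:
    project_dir = Path(tmp) / "project"
    (project_dir / "src").mkdir(parents=True)
    (project_dir / "README.md").write_text("# My Project")
    (project_dir / "src" / "app.py").write_text("# App code")
    tree_zip = zip_tree(project_dir)

with zipfile.ZipFile(io.BytesIO(tree_zip)) as zf:
    tree_names = sorted(zf.namelist())
print(tree_names)
```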

## 14. Practical Example: Report Generator

Build a report from multiple data sources:

In [None]:
def generate_report(storage, data):
    """Generate a report from data"""
    builder = storage.iternode()
    
    # Header
    header = storage.node('mem:_report_header.txt')
    header.write_text(f"{'='*50}\n{data['title']}\n{'='*50}\n\n")
    builder.append(header)
    
    # Sections
    for i, section in enumerate(data['sections']):
        sec_node = storage.node(f'mem:_section_{i}.txt')
        sec_node.write_text(f"## {section['title']}\n{section['content']}\n\n")
        builder.append(sec_node)
    
    # Footer
    footer = storage.node('mem:_report_footer.txt')
    footer.write_text(f"\nGenerated: {data.get('date', 'today')}")
    builder.append(footer)
    
    return builder

# Use it
report_data = {
    'title': 'Q4 2024 Sales Report',
    'sections': [
        {'title': 'Summary', 'content': 'Revenue increased by 15%'},
        {'title': 'Details', 'content': 'Top products: A, B, C'},
        {'title': 'Forecast', 'content': 'Expected growth: 20%'}
    ],
    'date': '2025-01-15'
}

report = generate_report(storage, report_data)

print("Generated report:")
print(report.read_text())

# Save it
final_report = storage.node('mem:q4_report.txt')
report.copy(final_report)
print(f"\n✓ Report saved: {final_report.size} bytes")

## 15. Practical Example: Version Tracking

Track changes between config versions:

In [None]:
def track_config_changes(storage, old_config, new_config, version):
    """Save config and generate changelog"""
    # Save new config
    config_file = storage.node(f'mem:config_v{version}.txt')
    config_file.write_text(new_config)
    
    # If we have previous version, generate diff
    if old_config:
        old_file = storage.node(f'mem:config_v{version-1}.txt')
        old_file.write_text(old_config)
        
        diff = storage.diffnode(old_file, config_file)
        changelog = storage.node(f'mem:changelog_v{version}.diff')
        diff.copy(changelog)
        
        return changelog
    return None

# Track changes
v1_config = "timeout: 30\nretries: 3\n"
v2_config = "timeout: 60\nretries: 5\n"

changelog = track_config_changes(storage, v1_config, v2_config, 2)

if changelog:
    print("Changes in v2:")
    print(changelog.read_text())
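A changelog is often easier to scan with a short summary next to it. The sketch below counts added and removed lines between two versions using the standard `difflib.SequenceMatcher`; `summarize_changes` is a hypothetical helper that works on plain strings and does not depend on `diffnode()`.

```python
import difflib


def summarize_changes(old_text, new_text):
    """Count added and removed lines between two text versions."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    added = removed = 0
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1  # lines present only in the old version
        if tag in ("replace", "insert"):
            added += j2 - j1    # lines present only in the new version
    return {"added": added, "removed": removed}


change_summary = summarize_changes(
    "timeout: 30\nretries: 3\n",
    "timeout: 60\nretries: 5\n",
)
print(change_summary)
```

Counting from opcodes rather than from `+`/`-` prefixes in diff text avoids miscounting content lines that themselves begin with `+` or `-`.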

## 16. Try It Yourself! 🎯

**Exercise 1:** Create a function that builds an HTML page from sections:

In [None]:
def build_html_page(storage, title, sections):
    """
    Build HTML from sections.
    sections = [{'heading': 'H1', 'content': 'text'}, ...]
    """
    # Your code here
    pass

# Test it
# page = build_html_page(storage, 'My Page', [
#     {'heading': 'Welcome', 'content': 'Hello world'},
#     {'heading': 'About', 'content': 'This is my site'}
# ])
# print(page.read_text())

**Exercise 2:** Create a backup function that ZIPs files and saves the archive under a timestamped name:

In [None]:
from datetime import datetime

def backup_files(storage, files, backup_name):
    """
    Create ZIP backup of files with timestamp.
    Returns the backup node.
    """
    # Your code here
    pass

# Test it
# files_to_backup = [
#     storage.node('mem:important1.txt'),
#     storage.node('mem:important2.txt')
# ]
# backup = backup_files(storage, files_to_backup, 'daily_backup')
# print(f"Backup: {backup.fullpath}, {backup.size} bytes")
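A hint for the timestamp part of Exercise 2: `datetime` objects support format codes directly inside f-strings. The helper name `timestamped_name` and the `name_YYYYMMDD_HHMMSS.zip` pattern are just one possible convention, not something the library requires.

```python
from datetime import datetime


def timestamped_name(backup_name, when=None):
    """Build a file name such as 'daily_backup_20240115_093000.zip'."""
    when = when or datetime.now()
    return f"{backup_name}_{when:%Y%m%d_%H%M%S}.zip"


example_name = timestamped_name("daily_backup", datetime(2024, 1, 15, 9, 30, 0))
print(example_name)
```

Putting the year first means that sorting backup names alphabetically also sorts them chronologically.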

**Exercise 3:** Compare three config versions and generate a summary:

In [None]:
def compare_versions(storage, v1, v2, v3):
    """
    Compare three versions and return summary of changes.
    Return dict with 'v1_to_v2' and 'v2_to_v3' diffs.
    """
    # Your code here
    pass

## Summary

You've mastered virtual nodes:

- ✓ `iternode()` for lazy concatenation
- ✓ `append()` and `extend()` for dynamic building
- ✓ `diffnode()` for comparing files
- ✓ `zip()` for creating archives
- ✓ Lazy evaluation principles
- ✓ Building documents dynamically
- ✓ Version tracking and comparison

## Key Concepts

- **Virtual nodes** have no physical storage
- **Lazy evaluation** - content computed on-demand
- **Read-only** - cannot write to virtual nodes
- **`exists` is always `False`** for virtual nodes
- Use `copy()` to materialize content

## When to Use

**Use iternode when:**
- Building documents from multiple sources
- Creating reports with dynamic sections
- Avoiding intermediate files
- Creating archives from multiple files

**Use diffnode when:**
- Comparing file versions
- Generating changelogs
- Tracking configuration changes
- Creating patch files

**Use zip() when:**
- Creating backups
- Packaging multiple files
- Compressing data for transfer

## What's Next?

Continue to:

- **[05_copy_strategies.ipynb](05_copy_strategies.ipynb)** - Smart copying with skip strategies
- **[06_versioning.ipynb](06_versioning.ipynb)** - S3 file versioning

Happy virtualizing! ✨