# ManifoldOS Extension System

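This notebook demonstrates the ManifoldOS extension architecture: registering extensions, discovering capabilities, managing their lifecycle, and using the DuckDB storage extension.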

## What We'll Cover

1. **Extension Basics** - Understanding the extension system
2. **Backward Compatibility** - Old code still works
3. **New Configuration** - Modern extension setup
4. **Capability Discovery** - Query available features
5. **Extension Lifecycle** - Initialize, use, cleanup
6. **Storage Extension** - DuckDB deep dive
7. **Best Practices** - Production patterns

In [1]:
# Setup
import sys
from pathlib import Path

project_root = Path.cwd()
if str(project_root) not in sys.path:
    sys.path.insert(0, str(project_root))

print(f"Project root: {project_root}")

Project root: /home/alexmy/SGS/SGS_lib/hllset_manifold


## 1. Extension System Basics

The extension system provides a pluggable architecture for external resources. Extensions are registered by name in an `ExtensionRegistry`, which starts out empty.

In [2]:
from core.extensions import ExtensionRegistry, DuckDBStorageExtension

# Create extension registry
registry = ExtensionRegistry()

print("✓ Extension system loaded")
print(f"  Registered extensions: {registry.list_extensions()}")
print(f"  Available capabilities: {registry.list_capabilities()}")

✓ Extension system loaded
  Registered extensions: []
  Available capabilities: {}


### Manual Extension Registration

In [3]:
# Create and register storage extension
storage_ext = DuckDBStorageExtension()
success = registry.register('storage', storage_ext, config={'db_path': ':memory:'})

print(f"Registration: {'✓ Success' if success else '✗ Failed'}")

if success:
    # Get extension info
    info = storage_ext.get_info()
    print(f"\nExtension Info:")
    print(f"  Name: {info.name}")
    print(f"  Version: {info.version}")
    print(f"  Description: {info.description}")
    print(f"\nCapabilities:")
    for cap, available in storage_ext.get_capabilities().items():
        status = '✓' if available else '✗'
        print(f"  {status} {cap}")

✓ Extension registered: storage v1.4.4
Registration: ✓ Success

Extension Info:
  Name: DuckDB Storage
  Version: 1.4.4
  Description: Embedded SQL database for LUT persistence

Capabilities:
  ✓ persistent_storage
  ✓ lut_queries
  ✓ metadata_tracking
  ✓ sql_queries
  ✓ transactions
  ✓ analytics
  ✓ content_addressable
  ✓ append_only
  ✓ idempotent


### Query Capabilities

In [4]:
# Check what's available
print("Capability Check:\n")

capabilities_to_check = [
    'persistent_storage',
    'lut_queries',
    'sql_queries',
    'caching',  # Not available
    'monitoring'  # Not available
]

for cap in capabilities_to_check:
    available = registry.has_capability(cap)
    status = '✓' if available else '✗'
    print(f"  {status} {cap}")

# List all capabilities
print(f"\nAll available capabilities:")
for cap, extensions in registry.list_capabilities().items():
    print(f"  {cap}: {extensions}")

Capability Check:

  ✓ persistent_storage
  ✓ lut_queries
  ✓ sql_queries
  ✗ caching
  ✗ monitoring

All available capabilities:
  persistent_storage: ['storage']
  lut_queries: ['storage']
  metadata_tracking: ['storage']
  sql_queries: ['storage']
  transactions: ['storage']
  analytics: ['storage']
  content_addressable: ['storage']
  append_only: ['storage']
  idempotent: ['storage']
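
The listing above maps each capability name to the extensions that provide it. The following is a minimal sketch of how such an index could be built. `ToyRegistry` and its internals are hypothetical and are not the actual `core.extensions` implementation; only the method names `register`, `has_capability`, and `list_capabilities` mirror the calls used in this notebook.

```python
from collections import defaultdict


class ToyRegistry:
    """Hypothetical stand-in for ExtensionRegistry (illustration only)."""

    def __init__(self):
        self._extensions = {}                # extension name -> capability dict
        self._providers = defaultdict(list)  # capability -> [extension names]

    def register(self, name, capabilities):
        """Record an extension and index the capabilities it reports as available."""
        self._extensions[name] = capabilities
        for cap, available in capabilities.items():
            if available:
                self._providers[cap].append(name)
        return True

    def has_capability(self, cap):
        # .get() avoids creating an empty entry for unknown capabilities
        return bool(self._providers.get(cap))

    def list_capabilities(self):
        return {cap: list(names) for cap, names in self._providers.items() if names}


toy = ToyRegistry()
toy.register('storage', {'persistent_storage': True, 'sql_queries': True, 'caching': False})

print(toy.has_capability('sql_queries'))
print(toy.has_capability('caching'))
print(toy.list_capabilities())
```

Indexing by capability rather than by extension lets callers ask "can anything do X?" without knowing which extension is installed.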


## 2. Backward Compatibility

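Code that passes `lut_db_path` directly to `ManifoldOS` continues to work without changes. The storage extension is registered in the background, and the legacy `lut_store` attribute remains available.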

In [5]:
from core.manifold_os import ManifoldOS

# Old style - still works!
print("[Old Style Configuration]\n")
os_old = ManifoldOS(lut_db_path=':memory:')

print(f"\nManifoldOS initialized")
print(f"  lut_store available: {os_old.lut_store is not None}")
print(f"  Extension registry: {len(os_old.extensions.list_extensions())} extensions")

# Old API still works
if os_old.lut_store:
    stats = os_old.lut_store.get_stats()
    print(f"\n✓ Old API (os.lut_store) still works")
    print(f"  Token hashes: {stats['total_token_hashes']}")

[Old Style Configuration]

✓ Extension registered: storage v1.4.4
✓ Extension registered: storage v1.4.4

ManifoldOS initialized
  lut_store available: True
  Extension registry: 1 extensions

✓ Old API (os.lut_store) still works
  Token hashes: 0
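
One way a constructor could support both styles is to translate the legacy argument into the new configuration shape before anything else runs. The sketch below is an assumption about how that might look; `normalize_extension_config` is a hypothetical helper, and the real `ManifoldOS` constructor may handle this differently.

```python
def normalize_extension_config(lut_db_path=None, extensions=None):
    """Merge a legacy `lut_db_path` argument into an `extensions` dict.

    Hypothetical helper for illustration; not part of ManifoldOS.
    """
    merged = dict(extensions or {})
    # Only synthesize a storage entry when the caller used the legacy argument
    # and did not configure storage explicitly.
    if lut_db_path is not None and 'storage' not in merged:
        merged['storage'] = {'type': 'duckdb', 'db_path': lut_db_path}
    return merged


legacy = normalize_extension_config(lut_db_path=':memory:')
explicit = normalize_extension_config(
    extensions={'storage': {'type': 'duckdb', 'db_path': './data/metadata.duckdb'}}
)
print(legacy)
print(explicit)
```

In this sketch an explicit `storage` entry takes precedence over the legacy argument, which is a design choice rather than documented behavior.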


## 3. New Extension Configuration

Extensions can also be configured explicitly through the `extensions` parameter, a dictionary keyed by extension name.

In [6]:
# New style - explicit extensions
print("[New Style Configuration]\n")

os_new = ManifoldOS(extensions={
    'storage': {
        'type': 'duckdb',
        'db_path': ':memory:'
    }
})

print("ManifoldOS initialized with extensions")
print(f"\nExtensions registered:")
for ext_name in os_new.extensions.list_extensions():
    ext = os_new.extensions.get(ext_name)
    info = ext.get_info()
    print(f"  • {ext_name}: {info.name} v{info.version}")

print(f"\nAvailable capabilities:")
for cap, exts in os_new.extensions.list_capabilities().items():
    print(f"  • {cap}")

[New Style Configuration]

✓ Extension registered: storage v1.4.4
✓ Extension registered: storage v1.4.4
ManifoldOS initialized with extensions

Extensions registered:
  • storage: DuckDB Storage v1.4.4

Available capabilities:
  • persistent_storage
  • lut_queries
  • metadata_tracking
  • sql_queries
  • transactions
  • analytics
  • content_addressable
  • append_only
  • idempotent
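
The `'type': 'duckdb'` field suggests that a type name is resolved to an extension class. A minimal sketch of that kind of lookup follows; `ToyDuckDBStorage`, `EXTENSION_TYPES`, and `build_extensions` are hypothetical names, and the placeholder class does not open a database.

```python
class ToyDuckDBStorage:
    """Hypothetical placeholder; it only stores its settings."""

    def __init__(self, db_path=':memory:', **options):
        self.db_path = db_path
        self.options = options


# Hypothetical lookup table from the 'type' field to a class.
EXTENSION_TYPES = {'duckdb': ToyDuckDBStorage}


def build_extensions(config):
    """Instantiate one object per entry, failing loudly on unknown types."""
    built = {}
    for name, settings in config.items():
        settings = dict(settings)  # copy so the caller's dict is untouched
        type_name = settings.pop('type', None)
        if type_name not in EXTENSION_TYPES:
            raise ValueError(f"Unknown extension type for {name!r}: {type_name!r}")
        built[name] = EXTENSION_TYPES[type_name](**settings)
    return built


built = build_extensions({'storage': {'type': 'duckdb', 'db_path': ':memory:'}})
print(type(built['storage']).__name__, built['storage'].db_path)
```

Rejecting unknown types at construction time surfaces configuration typos early instead of at first use.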


## 4. Using Extensions

Retrieve an extension from the registry with `get`, confirm it is available, and then call its methods directly.

In [7]:
# Get storage extension
storage = os_new.extensions.get('storage')

if storage and storage.is_available():
    print("Storage Extension Active\n")
    
    # Ingest some data
    test_data = [
        "customer premium revenue growth",
        "enterprise software cloud platform",
        "analytics dashboard metrics performance"
    ]
    
    print("Ingesting test data:\n")
    for i, data in enumerate(test_data, 1):
        rep = os_new.ingest(data, metadata={
            'source': 'demo',
            'batch': i
        })
        
        if rep and hasattr(rep, 'hllsets') and rep.hllsets and 1 in rep.hllsets:
            hash_val = rep.hllsets[1].name
            print(f"  {i}. {data[:30]:30} | {hash_val[:16]}...")
    
    # Get storage stats
    stats = storage.get_stats()
    print(f"\nStorage Statistics:")
    print(f"  Total token hashes: {stats['total_token_hashes']}")
    print(f"  N-groups: {stats['n_groups']}")
else:
    print("⚠ Storage extension not available")

Storage Extension Active

Ingesting test data:

  ✓ LUT committed: n=1, hash=3b1500132d733dc3..., id=4
  ✓ LUT committed: n=2, hash=79e03d9dff5e45ba..., id=3
  ✓ LUT committed: n=3, hash=23d43e48e2bd8b17..., id=2
  1. customer premium revenue growt | 3b1500132d733dc3...
  ✓ LUT committed: n=1, hash=08288c85d0287022..., id=4
  ✓ LUT committed: n=2, hash=8e405f5b90924eea..., id=3
  ✓ LUT committed: n=3, hash=3e8ef665a404f80e..., id=2
  2. enterprise software cloud plat | 08288c85d0287022...
  ✓ LUT committed: n=1, hash=398c2fb114c7ef9b..., id=4
  ✓ LUT committed: n=2, hash=7cac31c19b439daa..., id=3
  ✓ LUT committed: n=3, hash=3035fe860012a098..., id=2
  3. analytics dashboard metrics pe | 398c2fb114c7ef9b...

Storage Statistics:
  Total token hashes: 27
  N-groups: {1: 12, 2: 9, 3: 6}
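
The statistics above are consistent with a simple n-gram count. Each test string has four whitespace-separated tokens, so it yields 4 unigrams, 3 bigrams, and 2 trigrams, and none repeat across the three strings. The sketch below reproduces the numbers under that assumption; it is an inference from the output, not a description of how ManifoldOS tokenizes or hashes input.

```python
test_data = [
    "customer premium revenue growth",
    "enterprise software cloud platform",
    "analytics dashboard metrics performance",
]


def ngrams(tokens, n):
    """Return the contiguous n-token tuples of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


distinct = {n: set() for n in (1, 2, 3)}
for text in test_data:
    tokens = text.split()  # assumption: plain whitespace tokenization
    for n in distinct:
        distinct[n].update(ngrams(tokens, n))

n_groups = {n: len(grams) for n, grams in distinct.items()}
print(n_groups)                # {1: 12, 2: 9, 3: 6}
print(sum(n_groups.values()))  # 27
```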


## 5. Extension Lifecycle

`ExtensionRegistry` can be used as a context manager, so registered extensions are cleaned up automatically when the block exits.

In [8]:
# Context manager pattern
print("Using Extension Registry with Context Manager\n")

with ExtensionRegistry() as reg:
    # Register extension
    ext = DuckDBStorageExtension()
    reg.register('storage', ext, {'db_path': ':memory:'})
    
    print(f"Extensions active: {reg.list_extensions()}")
    
    # Use extension
    if reg.has('storage'):
        storage = reg.get('storage')
        print(f"✓ Storage ready: {storage.is_available()}")
    
    # Cleanup happens automatically on exit

print("\n✓ Extensions cleaned up automatically")

Using Extension Registry with Context Manager

✓ Extension registered: storage v1.4.4
Extensions active: ['storage']
✓ Storage ready: True

✓ Extensions cleaned up automatically
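
To illustrate what automatic cleanup involves, here is a minimal sketch of a registry implementing the context manager protocol. `ToyExtension` and `LifecycleRegistry` are hypothetical, and the `initialize` and `cleanup` method names are assumptions rather than the confirmed `core.extensions` interface.

```python
class ToyExtension:
    """Hypothetical extension that only tracks whether it is open."""

    def __init__(self):
        self.available = False

    def initialize(self, config):
        self.available = True
        return True

    def cleanup(self):
        self.available = False


class LifecycleRegistry:
    """Hypothetical registry; not the real ExtensionRegistry."""

    def __init__(self):
        self._extensions = {}

    def register(self, name, ext, config=None):
        if ext.initialize(config or {}):
            self._extensions[name] = ext
            return True
        return False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Clean up in reverse registration order, even if the block raised.
        for ext in reversed(list(self._extensions.values())):
            ext.cleanup()
        self._extensions.clear()
        return False  # do not swallow exceptions


ext = ToyExtension()
with LifecycleRegistry() as reg:
    reg.register('storage', ext, {'db_path': ':memory:'})
    print(ext.available)

print(ext.available)
```

Because `__exit__` runs on both normal and exceptional exit, resources such as database connections are released even when the body fails.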


## 7. Best Practices

Production patterns for extension usage.

In [10]:
# Pattern 2: Configuration management
print("Pattern 2: Configuration Management\n")

# Development config
dev_config = {
    'extensions': {
        'storage': {
            'type': 'duckdb',
            'db_path': ':memory:'  # Fast, no persistence
        }
    }
}

# Production config
prod_config = {
    'storage_path': Path('./data'),
    'extensions': {
        'storage': {
            'type': 'duckdb',
            'db_path': './data/metadata.duckdb',
            'threads': 4
        }
    }
}

# Use appropriate config
config = dev_config  # Switch based on environment
os_configured = ManifoldOS(**config)

print(f"Environment: {'Development' if config is dev_config else 'Production'}")
print(f"Extensions: {os_configured.extensions.list_extensions()}")
print("\n✓ Configuration-driven setup")

Pattern 2: Configuration Management

✓ Extension registered: storage v1.4.4
✓ Extension registered: storage v1.4.4
Environment: Development
Extensions: ['storage']

✓ Configuration-driven setup
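
The environment switch can be driven by an environment variable instead of editing the notebook. The sketch below assumes a variable named `MANIFOLD_ENV`, which is hypothetical, and reuses the two configurations shown above; `select_config` is an illustrative helper, not part of ManifoldOS.

```python
from os import environ
from pathlib import Path

CONFIGS = {
    'development': {
        'extensions': {'storage': {'type': 'duckdb', 'db_path': ':memory:'}},
    },
    'production': {
        'storage_path': Path('./data'),
        'extensions': {
            'storage': {'type': 'duckdb', 'db_path': './data/metadata.duckdb', 'threads': 4},
        },
    },
}


def select_config(env_name=None):
    """Pick a config by name, defaulting to MANIFOLD_ENV (hypothetical) or development."""
    name = env_name or environ.get('MANIFOLD_ENV', 'development')
    if name not in CONFIGS:
        raise ValueError(f"Unknown environment {name!r}; expected one of {sorted(CONFIGS)}")
    return name, CONFIGS[name]


env_name, env_config = select_config('production')
print(env_name, env_config['extensions']['storage']['db_path'])
```

The selected dictionary could then be unpacked as in the cell above, for example `ManifoldOS(**env_config)`.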


In [11]:
# Pattern 3: Error handling
print("Pattern 3: Graceful Error Handling\n")

def safe_query(os, token):
    """Query with error handling."""
    storage = os.extensions.get('storage')
    
    if not storage:
        print("  ℹ Storage extension not registered")
        return []
    
    if not storage.is_available():
        print("  ℹ Storage extension not available")
        return []
    
    try:
        results = storage.query_by_token(n=1, token_tuple=token)
        print(f"  ✓ Found {len(results)} results for {token}")
        return results
    except Exception as e:
        print(f"  ⚠ Query failed: {e}")
        return []

# Pass a ManifoldOS instance explicitly; os_new holds the data ingested in section 4.
results = safe_query(os_new, ('customer',))
print("\n✓ Robust error handling")

## 8. Performance Monitoring

In [None]:
import time

# Benchmark with storage
os_bench = ManifoldOS(extensions={
    'storage': {'type': 'duckdb', 'db_path': ':memory:'}
})

print("Performance Benchmark\n")

# Ingestion benchmark
test_data = [f"sample text {i} data content" for i in range(50)]

# perf_counter() is a monotonic, high-resolution clock suited to timing code
start = time.perf_counter()
for data in test_data:
    os_bench.ingest(data)
elapsed = time.perf_counter() - start

print(f"Ingestion: {len(test_data)} items in {elapsed:.3f}s")
print(f"Throughput: {len(test_data)/elapsed:.1f} items/sec")

# Storage stats
storage = os_bench.extensions.get('storage')
if storage:
    stats = storage.get_stats()
    print(f"\nStorage:")
    print(f"  Token hashes: {stats['total_token_hashes']}")
    print(f"  N-groups: {stats['n_groups']}")

print("\n✓ Benchmarking complete")
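
When several stages need timing, the start/stop bookkeeping can be wrapped in a small context manager. This is a generic sketch using only the standard library; `timed` is a hypothetical helper, and the workload is a stand-in rather than a real `ingest` call.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed('ingest', timings):
    # Stand-in workload; replace with the calls being measured.
    total = sum(len(f"sample text {i} data content") for i in range(50))

items = 50
elapsed = timings['ingest']
rate = items / elapsed if elapsed > 0 else float('inf')  # guard against a zero reading
print(f"{items} items in {elapsed:.6f}s ({rate:.1f} items/sec)")
```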

## Summary

### Key Takeaways

1. ✅ **Backward Compatible**: Old code works unchanged
2. ✅ **Explicit Configuration**: New `extensions` parameter
3. ✅ **Capability Discovery**: Query what's available
4. ✅ **Graceful Degradation**: Core works without extensions
5. ✅ **Production Ready**: Proper lifecycle management

### Extension Benefits

- **Modularity**: Add capabilities without changing core
- **Testability**: Easy to mock extensions
- **Flexibility**: Choose what you need
- **Performance**: Optional features don't slow core
- **Maintainability**: Clear separation of concerns
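
As an illustration of the testability point, code that depends only on `is_available` and `query_by_token` can be exercised with a small test double instead of a real database. `FakeStorage` and `query_or_empty` below are hypothetical, and the assumption that `query_by_token` returns a list is not confirmed by this notebook.

```python
class FakeStorage:
    """Test double exposing only the two methods the caller relies on."""

    def __init__(self, available=True, rows=None, fail=False):
        self._available = available
        self._rows = rows or []
        self._fail = fail

    def is_available(self):
        return self._available

    def query_by_token(self, n, token_tuple):
        if self._fail:
            raise RuntimeError("simulated storage failure")
        return [row for row in self._rows if row == token_tuple]


def query_or_empty(storage, token):
    """Return matches for `token`, or [] when storage is missing, down, or failing."""
    if storage is None or not storage.is_available():
        return []
    try:
        return storage.query_by_token(n=1, token_tuple=token)
    except Exception:
        return []


ok = query_or_empty(FakeStorage(rows=[('customer',), ('premium',)]), ('customer',))
down = query_or_empty(FakeStorage(available=False), ('customer',))
broken = query_or_empty(FakeStorage(fail=True), ('customer',))
print(ok, down, broken)
```

Each failure path is covered without touching disk, which keeps such tests fast and deterministic.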

### Next Steps

- Explore storage extension features
- Build custom extensions
- Deploy to production
- Monitor performance