# ManifoldOS: Universal Constructor Pattern

This notebook walks through ManifoldOS, a universal constructor that implements the ICASRA pattern and manages the lifecycle of its drivers.

**Key Concepts**:
- Universal Constructor (A, B, C, D components)
- Driver lifecycle management
- Immutability and idempotence
- Content addressability

## 1. Create ManifoldOS

On initialization, ManifoldOS creates a default IngestDriver and registers it under the id `ingest_default`.

In [1]:
from core.manifold_os import ManifoldOS

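# Create the OS instance.
# Note: naming it `os` means the standard-library `os` module cannot be
# used under that name in this notebook.
os = ManifoldOS()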

print("ManifoldOS created")
print("\nICASRA Components:")
print("  A (Constructor): Validates and commits states")
print("  B (Copier): Reproduces states with structure preservation")
print("  C (Controller): Coordinates driver lifecycle")
print("  D (Interface): Manages external data ingestion")

# Check default driver
drivers = os.list_drivers()
print(f"\nDefault drivers: {list(drivers.keys())}")

✓ Extension registered: storage v1.4.4
✓ Extension registered: storage v1.4.4
ManifoldOS created

ICASRA Components:
  A (Constructor): Validates and commits states
  B (Copier): Reproduces states with structure preservation
  C (Controller): Coordinates driver lifecycle
  D (Interface): Manages external data ingestion

Default drivers: ['ingest_default']


## 2. Basic Data Ingestion

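`os.ingest()` routes text through the default IngestDriver and returns an `NTokenRepresentation` that holds the original tokens and one HLLSet per n-token size.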

In [2]:
# Ingest single document
text = "the quick brown fox jumps over the lazy dog"
result = os.ingest(text)

print(f"Ingested text: '{text}'")
print(f"\nResult type: {type(result).__name__}")
print(f"Original tokens: {result.original_tokens}")
print(f"\nHLLSets created: {len(result.hllsets)}")

for n, hllset in sorted(result.hllsets.items()):
    print(f"  {n}-token HLLSet: cardinality ≈ {hllset.cardinality():.2f}")

  ✓ LUT committed: n=1, hash=dbe1e60d59b56929..., id=8
  ✓ LUT committed: n=2, hash=fdb8b90c523da265..., id=8
  ✓ LUT committed: n=3, hash=e4545c77cb6272c7..., id=7
Ingested text: 'the quick brown fox jumps over the lazy dog'

Result type: NTokenRepresentation
Original tokens: ['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']

HLLSets created: 3
  1-token HLLSet: cardinality ≈ 9.00
  2-token HLLSet: cardinality ≈ 11.00
  3-token HLLSet: cardinality ≈ 7.00
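
The three HLLSets above correspond to 1-, 2-, and 3-token groups. As a rough illustration, such groups could be derived with a sliding window over the token list. `n_token_groups` below is a hypothetical helper, not the ManifoldOS implementation; whether ManifoldOS pads sequences with boundary markers or preprocesses tokens differently is not shown in this notebook.

```python
from typing import Dict, List, Sequence, Tuple


def n_token_groups(
    tokens: Sequence[str], sizes: Sequence[int] = (1, 2, 3)
) -> Dict[int, List[Tuple[str, ...]]]:
    """Return, for each n in `sizes`, every run of n consecutive tokens (hypothetical helper)."""
    groups: Dict[int, List[Tuple[str, ...]]] = {}
    for n in sizes:
        # A window of width n slides one token at a time; order inside each tuple is preserved.
        groups[n] = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return groups


tokens = "the quick brown fox jumps over the lazy dog".split()
groups = n_token_groups(tokens)
for n, grams in groups.items():
    print(n, len(grams), grams[0])
```

Because each n-token is a tuple, it keeps word order, which a plain set of single tokens would lose.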


## 3. Batch Ingestion

`os.ingest_batch()` accepts a list of documents and returns one result per document.

In [3]:
# Multiple documents
documents = [
    "first document about data science",
    "second document about machine learning",
    "third document about artificial intelligence"
]

results = os.ingest_batch(documents)

print(f"Processed {len(results)} documents\n")

for i, result in enumerate(results, 1):
    print(f"Document {i}:")
    print(f"  Tokens: {len(result.original_tokens)}")
    print(f"  HLLSets: {len(result.hllsets)}")
    print(f"  1-token cardinality: {result.hllsets[1].cardinality():.2f}")
    print()

Processed 3 documents

Document 1:
  Tokens: 5
  HLLSets: 3
  1-token cardinality: 7.00

Document 2:
  Tokens: 5
  HLLSets: 3
  1-token cardinality: 6.00

Document 3:
  Tokens: 5
  HLLSets: 3
  1-token cardinality: 6.00
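
The cardinalities printed above are estimates rather than exact counts. Assuming HLLSet is built on HyperLogLog, as its name suggests, the following minimal sketch shows how such an estimate can be produced from a fixed array of registers. `TinyHLL`, its SHA-1 hash, and its precision `p=10` are illustrative choices only, so its numbers will not match the notebook output.

```python
import hashlib
import math


class TinyHLL:
    """Minimal HyperLogLog sketch (illustrative; not the ManifoldOS HLLSet)."""

    def __init__(self, p: int = 10) -> None:
        self.p = p                      # number of index bits
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        # Use the first 64 bits of SHA-1 as the hash value.
        h = int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")
        width = 64 - self.p
        idx = h >> width                         # top p bits select a register
        rest = h & ((1 << width) - 1)            # remaining bits
        rank = width - rest.bit_length() + 1     # position of the leftmost 1-bit
        self.registers[idx] = max(self.registers[idx], rank)

    def cardinality(self) -> float:
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)         # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw


hll = TinyHLL()
for tok in "the quick brown fox jumps over the lazy dog".split():
    hll.add(tok)
print(f"estimated distinct tokens: {hll.cardinality():.2f}")
```

Adding an item that was already seen leaves the registers unchanged, which is why repeated tokens do not inflate the estimate.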



## 4. Driver Lifecycle

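A driver is always in one of five states: `CREATED`, `IDLE`, `ACTIVE`, `ERROR`, or `DEAD`. The normal path is CREATED → IDLE → ACTIVE → IDLE; a driver in ERROR can be restarted, and a removed driver ends up DEAD.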

In [4]:
from core.manifold_os import DriverState

# Check driver states
drivers = os.list_drivers()

print("Driver Lifecycle States:")
for state in DriverState:
    print(f"  {state.name}: {state.value}")

print("\nCurrent drivers:")
for driver_id, info in drivers.items():
    print(f"  {driver_id}:")
    print(f"    State: {info['state']}")
    print(f"    Type: {info['type']}")

Driver Lifecycle States:
  CREATED: created
  IDLE: idle
  ACTIVE: active
  ERROR: error
  DEAD: dead

Current drivers:
  ingest_default:
    State: idle
    Type: ingest


## 5. Custom Driver Configuration

Create drivers with custom tokenization settings.

In [5]:
from core.manifold_os import TokenizationConfig, IngestDriver

# Custom configuration
config = TokenizationConfig(
    min_token_length=3,
    lowercase=True,
    use_n_tokens=True,
    n_token_groups=[1, 2, 3, 4],  # Include 4-tokens
    maintain_order=True
)

# Create and register custom driver
custom_driver = IngestDriver("ingest_custom", config=config)
os.register_driver(custom_driver)
custom_driver.wake()  # CREATED → IDLE: the driver is now ready to accept work

print("Created custom driver with configuration:")
print(f"  Min token length: {config.min_token_length}")
print(f"  Lowercase: {config.lowercase}")
print(f"  N-token groups: {config.n_token_groups}")

# Use custom driver
result = os.ingest("Testing Custom Driver Configuration", driver_id="ingest_custom")
print(f"\nTokens (lowercase, min_length=3): {result.original_tokens}")
print(f"N-token groups: {list(result.n_token_groups.keys())}")

Created custom driver with configuration:
  Min token length: 3
  Lowercase: True
  N-token groups: [1, 2, 3, 4]
  ✓ LUT committed: n=1, hash=ad0166c3b81f8522..., id=4
  ✓ LUT committed: n=2, hash=80ba31d3a01605c2..., id=3
  ✓ LUT committed: n=3, hash=f330a4dde3864175..., id=2
  ✓ LUT committed: n=4, hash=396ceb858f846685..., id=1

Tokens (lowercase, min_length=3): ['testing', 'custom', 'driver', 'configuration']
N-token groups: [1, 2, 3, 4]
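
To make the effect of `lowercase` and `min_token_length` concrete, here is a small standalone sketch. `SketchConfig` and `tokenize` are hypothetical stand-ins, and the word-splitting rule (runs of word characters) is an assumption; the real `TokenizationConfig` may tokenize differently.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in for two of the TokenizationConfig fields used above."""
    min_token_length: int = 1
    lowercase: bool = True


def tokenize(text: str, config: SketchConfig) -> List[str]:
    if config.lowercase:
        text = text.lower()
    # Assumption: tokens are runs of word characters; the real splitting rule may differ.
    words = re.findall(r"\w+", text)
    # Drop tokens shorter than the configured minimum length.
    return [w for w in words if len(w) >= config.min_token_length]


cfg = SketchConfig(min_token_length=3)
print(tokenize("Testing Custom Driver Configuration", cfg))
print(tokenize("An ox is big", cfg))
```

The second call keeps only `big`, since the other words are shorter than three characters.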


## 6. Driver Statistics

Monitor driver activity and performance.

In [6]:
# Get driver
driver = os.get_driver("ingest_default")

print("Driver Statistics:\n")
print(f"  Operations count: {driver.stats.operations_count}")
print(f"  Errors count: {driver.stats.errors_count}")
print(f"  Total active time: {driver.stats.total_active_time:.4f}s")
print(f"\n  Created at: {driver.stats.created_at:.2f}")
if driver.stats.last_active:
    print(f"  Last active: {driver.stats.last_active:.2f}")
print(f"\n  Driver state: {driver.state.value}")
print(f"  Ingested count: {driver.ingested_count}")

Driver Statistics:

  Operations count: 4
  Errors count: 0
  Total active time: 0.0076s

  Created at: 1770920237.67
  Last active: 1770920238.12

  Driver state: idle
  Ingested count: 4
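
Counters like the ones printed above can be maintained by wrapping each operation in a timing context. The sketch below is one possible approach, not the ManifoldOS code: `SketchStats` and `track` are hypothetical names, and counting failed operations in `operations_count` is an assumption.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class SketchStats:
    """Hypothetical counters mirroring the field names printed above."""
    operations_count: int = 0
    errors_count: int = 0
    total_active_time: float = 0.0
    created_at: float = field(default_factory=time.time)
    last_active: Optional[float] = None

    @contextmanager
    def track(self) -> Iterator[None]:
        """Time one operation and count it, whether it succeeds or raises."""
        start = time.perf_counter()
        try:
            yield
        except Exception:
            self.errors_count += 1
            raise
        finally:
            self.operations_count += 1
            self.total_active_time += time.perf_counter() - start
            self.last_active = time.time()


stats = SketchStats()
with stats.track():
    sum(range(1000))                      # a successful operation
try:
    with stats.track():
        raise ValueError("simulated failure")
except ValueError:
    pass
print(stats.operations_count, stats.errors_count)
```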


## 7. Immutability & Idempotence

Ingested data is immutable and ingestion is idempotent: ingesting the same text again yields HLLSets with the same content-addressed name.

In [7]:
# Ingest same text multiple times
text = "immutability test data"

result1 = os.ingest(text)
result2 = os.ingest(text)
result3 = os.ingest(text)

# Check content-addressed names
name1 = result1.hllsets[1].name
name2 = result2.hllsets[1].name
name3 = result3.hllsets[1].name

print("Idempotence test:")
print(f"  Result 1 name: {name1[:16]}...")
print(f"  Result 2 name: {name2[:16]}...")
print(f"  Result 3 name: {name3[:16]}...")
print(f"\n  All identical: {name1 == name2 == name3}")
print("\n✓ Same input always produces same output (idempotent)")
print("✓ Results are content-addressed (same content → same name)")

  ✓ LUT committed: n=1, hash=be612b24a5efc059..., id=3
  ✓ LUT committed: n=2, hash=418005e177f0be05..., id=2
  ✓ LUT committed: n=3, hash=8528c97c7a192fd1..., id=1
  ✓ LUT committed: n=1, hash=be612b24a5efc059..., id=3
  ✓ LUT committed: n=2, hash=418005e177f0be05..., id=2
  ✓ LUT committed: n=3, hash=8528c97c7a192fd1..., id=1
  ✓ LUT committed: n=1, hash=be612b24a5efc059..., id=3
  ✓ LUT committed: n=2, hash=418005e177f0be05..., id=2
  ✓ LUT committed: n=3, hash=8528c97c7a192fd1..., id=1
Idempotence test:
  Result 1 name: be612b24a5efc059...
  Result 2 name: be612b24a5efc059...
  Result 3 name: be612b24a5efc059...

  All identical: True

✓ Same input always produces same output (idempotent)
✓ Results are content-addressed (same content → same name)
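
Content addressing means a name is computed from the content itself, so equal content always maps to the same name. The sketch below illustrates the idea with a hypothetical scheme (SHA-1 over sorted unique tokens); the hash function and serialization that ManifoldOS actually uses are not stated in this notebook.

```python
import hashlib
from typing import Iterable


def content_name(tokens: Iterable[str]) -> str:
    """Derive a name from content alone (hypothetical scheme, not the ManifoldOS one)."""
    # Sorting and deduplicating gives a canonical form, so insertion order
    # and repeats do not affect the name. Assumes tokens contain no newlines.
    canonical = "\n".join(sorted(set(tokens)))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


a = content_name(["immutability", "test", "data"])
b = content_name(["data", "test", "immutability", "test"])  # same set, different order
c = content_name(["immutability", "test", "datum"])          # different content
print(a == b, a == c)
```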


## 8. Driver Management Operations

Wake, activate, idle, mark as errored, restart, and remove a driver, checking its state after each step.

In [8]:
# Create and register a test driver
from core.manifold_os import IngestDriver

test_driver = IngestDriver("test_driver")
os.register_driver(test_driver)

print("Driver lifecycle operations:\n")

# Check initial state (CREATED)
driver = os.get_driver("test_driver")
print(f"1. Initial state: {driver.state.value}")

# Wake up (CREATED → IDLE)
driver.wake()
print(f"2. After wake: {driver.state.value}")

# Activate (IDLE → ACTIVE)
driver.activate()
print(f"3. After activate: {driver.state.value}")

# Back to idle (ACTIVE → IDLE)
driver.idle()
print(f"4. After idle: {driver.state.value}")

# Mark error (IDLE → ERROR)
driver.mark_error()
print(f"5. After error: {driver.state.value}")
print(f"   Needs restart: {driver.needs_restart}")

# Restart (ERROR → IDLE)
driver.restart()
print(f"6. After restart: {driver.state.value}")
print(f"   Stats reset: operations = {driver.stats.operations_count}")

# Remove
removed = os.unregister_driver("test_driver")
print(f"7. Driver removed: {removed}")
print(f"   Driver is now: {driver.state.value}")

Driver lifecycle operations:

1. Initial state: created
2. After wake: idle
3. After activate: active
4. After idle: idle
5. After error: error
   Needs restart: True
6. After restart: idle
   Stats reset: operations = 0
7. Driver removed: True
   Driver is now: dead
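
The transitions exercised above can be summarized as a small table-driven state machine. This is a sketch inferred from the notebook output only: `SketchDriver` and the `ALLOWED` table are hypothetical, the entries marked as assumed were not demonstrated, and whether ManifoldOS rejects illegal transitions is not shown.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"
    IDLE = "idle"
    ACTIVE = "active"
    ERROR = "error"
    DEAD = "dead"


# Transitions seen in the walkthrough above, plus three assumed ones.
ALLOWED = {
    State.CREATED: {State.IDLE, State.DEAD},               # CREATED → DEAD is assumed
    State.IDLE: {State.ACTIVE, State.ERROR, State.DEAD},
    State.ACTIVE: {State.IDLE, State.ERROR},               # ACTIVE → ERROR is assumed
    State.ERROR: {State.IDLE, State.DEAD},                 # ERROR → DEAD is assumed
    State.DEAD: set(),                                     # terminal state
}


class SketchDriver:
    """Hypothetical driver that only tracks its lifecycle state."""

    def __init__(self) -> None:
        self.state = State.CREATED

    def transition(self, target: State) -> None:
        if target not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state.name} -> {target.name}")
        self.state = target


driver = SketchDriver()
# Replay the sequence from the cell above: wake, activate, idle, error, restart, remove.
for target in (State.IDLE, State.ACTIVE, State.IDLE, State.ERROR, State.IDLE, State.DEAD):
    driver.transition(target)
    print(driver.state.value)
```

Keeping the legal transitions in one table makes it easy to see that `DEAD` is terminal and that `ERROR` is left only by a restart or removal.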


## 9. No Scheduling Required

Because results are immutable and content-addressed, drivers need no locks or ordering guarantees, so no complex scheduler is required.

In [9]:
print("Why No Scheduling?\n")
print("1. Immutability:")
print("   - All data structures are immutable")
print("   - Operations return new instances")
print("   - No shared mutable state")
print("\n2. Pure Functional:")
print("   - Drivers are stateless workers")
print("   - Same input → same output")
print("   - No side effects")
print("\n3. Content-Addressed:")
print("   - Everything identified by content hash")
print("   - Duplicate detection automatic")
print("   - No coordination needed")
print("\n4. Isolation:")
print("   - Processes cannot harm each other")
print("   - No resource contention")
print("   - Trivial parallelization")

Why No Scheduling?

1. Immutability:
   - All data structures are immutable
   - Operations return new instances
   - No shared mutable state

2. Pure Functional:
   - Drivers are stateless workers
   - Same input → same output
   - No side effects

3. Content-Addressed:
   - Everything identified by content hash
   - Duplicate detection automatic
   - No coordination needed

4. Isolation:
   - Processes cannot harm each other
   - No resource contention
   - Trivial parallelization
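
The argument above can be checked on a toy example: when ingestion is a pure function returning immutable values, running it in parallel gives the same results as running it sequentially. `ingest_pure` is a hypothetical stand-in and says nothing about how ManifoldOS dispatches work internally.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import FrozenSet, Tuple


def ingest_pure(text: str) -> Tuple[str, FrozenSet[str]]:
    """Hypothetical pure ingest: returns a content hash and an immutable token set."""
    tokens = frozenset(text.lower().split())
    name = hashlib.sha1("\n".join(sorted(tokens)).encode("utf-8")).hexdigest()
    return name, tokens


documents = [
    "first document about data science",
    "second document about machine learning",
    "third document about artificial intelligence",
]

# No locks are needed: each call reads only its argument and returns new values.
sequential = [ingest_pure(doc) for doc in documents]
with ThreadPoolExecutor(max_workers=3) as pool:
    parallel = list(pool.map(ingest_pure, documents))

print(sequential == parallel)
```

Any mutable bookkeeping kept alongside such functions, for example shared counters, would still need its own synchronization.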


## 10. Universal Constructor Pattern

Understanding the ICASRA components in ManifoldOS.

In [10]:
from core import Kernel

# Create a standalone kernel, which provides components A and B
kernel = Kernel()

print("Universal Constructor Components:\n")

print("A (Constructor) - Validation & Commit:")
hll = kernel.absorb(['test', 'data'])
committed = kernel.commit(hll)
print(f"  ✓ Committed: {committed.short_name}...")

print("\nB (Copier) - Reproduction:")
reproduced = kernel.reproduce(hll, mutation_rate=0.1)
print(f"  ✓ Reproduced: {reproduced.short_name}...")
print(f"  Similarity: {kernel.similarity(hll, reproduced):.2%}")

print("\nC (Controller) - Driver Management:")
print(f"  ✓ Active drivers: {len(os.list_drivers())}")
print("  ✓ Lifecycle control: wake/idle/restart/remove")

print("\nD (Interface) - Data Ingestion:")
result = os.ingest("interface test")
print(f"  ✓ Ingested: {len(result.original_tokens)} tokens")
print(f"  ✓ Created: {len(result.hllsets)} HLLSets")

Universal Constructor Components:

A (Constructor) - Validation & Commit:
  ✓ Committed: d76e9860...

B (Copier) - Reproduction:
  ✓ Reproduced: d76e9860...
  Similarity: 100.00%

C (Controller) - Driver Management:
  ✓ Active drivers: 2
  ✓ Lifecycle control: wake/idle/restart/remove

D (Interface) - Data Ingestion:
  ✓ LUT committed: n=1, hash=1b5d7fcb60b4d304..., id=2
  ✓ LUT committed: n=2, hash=a824ed80ba973599..., id=1
  ✓ LUT committed: n=3, hash=1ceaf73df40e531d..., id=0
  ✓ Ingested: 2 tokens
  ✓ Created: 3 HLLSets
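
The similarity percentage printed for component B compares two sets. How `kernel.similarity` is computed is not documented in this notebook; the sketch below uses Jaccard similarity on exact sets purely as an assumed example of such a measure.

```python
from typing import AbstractSet


def jaccard(a: AbstractSet[str], b: AbstractSet[str]) -> float:
    """|a ∩ b| / |a ∪ b|; two empty sets are treated as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


original = frozenset({"test", "data"})
print(f"{jaccard(original, frozenset(original)):.2%}")            # identical sets
print(f"{jaccard(original, frozenset({'test', 'other'})):.2%}")   # one shared token out of three
```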


## Summary

**ManifoldOS Features**:

**Driver Management**:
- Ephemeral, stateless workers
- Lifecycle: Created → Idle ⇄ Active; Error → Idle via restart; Dead after removal
- Custom configuration per driver
- Statistics tracking

**Design Principles**:
- **Immutability**: All data structures immutable
- **Idempotence**: Same input → same output
- **Content Addressability**: Everything hashed by content
- **No Scheduling**: Immutability eliminates coordination

**ICASRA Pattern**:
- **A (Constructor)**: Validation and commitment
- **B (Copier)**: Reproduction with structure preservation
- **C (Controller)**: Driver lifecycle coordination
- **D (Interface)**: External data ingestion

**Data Flow**:
1. Ingest text → Tokenize
2. Generate n-tokens → Create HLLSets
3. Build LUTs → Preserve order info
4. Return NTokenRepresentation

The data produced at each step is immutable, and repeating a step with the same input yields the same result.