Native Apple Intelligence Embeddings
# NornicDB macOS Architecture (Current State)
## System Overview
┌─────────────────────────────────────────────────────────────────────────────┐
│ USER INTERFACE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────────┐ ┌──────────────────────┐ │
│ │ Web UI │ │ Menu Bar App │ │
│ │ localhost:7474 │ │ (Swift) │ │
│ │ Auth: admin:pwd │ │ Configure & Monitor│ │
│ └──────────┬───────────┘ └──────────┬───────────┘ │
│ │ │ │
└─────────────┼─────────────────────────────────────┼──────────────────────────┘
│ │
│ ┌──────────▼──────────────────────┐
│ │ macOS Keychain (Encrypted) │
│ ├────────────────────────────────┤
│ │ • JWT Secret │
│ │ • Encryption Password │
│ │ • Apple Intelligence API Key │
│ │ (UUID: E4D8A2F1-...) │
│ └──────────┬──────────────────────┘
│ │
│ ┌──────────▼──────────────────────┐
│ │ Configuration │
│ ├────────────────────────────────┤
│ │ config.yaml + LaunchAgent │
│ │ (Secrets from Keychain) │
│ └──────────┬──────────────────────┘
│ │
┌─────────────▼─────────────────────────────────────▼──────────────────────────┐
│ NornicDB Server (LaunchAgent) │
│ Port: 7474 (HTTP), 7687 (Bolt) │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Cypher Query Executor │ │
│ │ Neo4j-compatible graph queries │ │
│ └───────────────────────┬─────────────────────────────────────────┘ │
│ │ │
│ ┌───────────────────────▼─────────────────────────────────────────┐ │
│ │ Vector Search Index │ │
│ │ Configurable: 512d (Apple) or 1024d (Local GGUF) │ │
│ └───────────────────────┬─────────────────────────────────────────┘ │
│ │ │
│ ┌───────────────────────▼─────────────────────────────────────────┐ │
│ │ Embedding Client (OpenAI-compatible) │ │
│ │ Routes to: Apple Intelligence OR Local GGUF │ │
│ └─────────┬───────────────────────────┬───────────────────────────┘ │
│ │ │ │
│ ┌─────────▼─────────────┐ ┌─────────▼─────────────┐ │
│ │ BadgerDB Storage │ │ Auto-Embed Queue │ │
│ │ 41K+ nodes │ │ Background worker │ │
│ └───────────────────────┘ └───────────────────────┘ │
│ │
└────────────┬───────────────────────────┬─────────────────────────────────────┘
│ │
┌─────────▼────────────┐ ┌──────────▼────────────┐
│ OPTION A │ │ OPTION B │
│ Apple Intelligence │ │ Local GGUF Models │
└──────────────────────┘ └───────────────────────┘
## Option A: Apple Intelligence (Secure Network Path)
┌─────────────────────────────────────────────────────────────────────────────┐
│ NornicDB Server Process Apple ML Server Process │
│ (LaunchAgent) (Menu Bar App) │
├──────────────────────────────────────┬──────────────────────────────────────┤
│ │ │
│ 1. Need embedding for node │ │
│ ↓ │ │
│ 2. Read API key from env var │ 5. Read API key from memory │
│ NORNICDB_EMBEDDING_API_KEY │ (set at startup from Keychain) │
│ ↓ │ ↓ │
│ 3. HTTP Request │ 6. Validate Bearer token │
│ ──────────────────────────────────┼─────────────────────► │
│ POST /v1/embeddings │ 7. Compare keys │
│ Authorization: Bearer E4D8A2F1-...│ providedKey == storedKey? │
│ Content-Type: application/json │ ↓ │
│ {"input": ["text"], ...} │ 8. Valid → Process request │
│ │ Invalid → 401 Unauthorized │
│ │ ↓ │
│ │ 9. Apple NLEmbedding │
│ │ • On-device ML │
│ │ • sentenceEmbedding(for:) │
│ │ • No network/cloud │
│ │ ↓ │
│ 4. Response │ 10. Return embedding │
│ ◄──────────────────────────────────┼────────────────────── │
│ {"data": [{"embedding": [512 floats]}]} │
│ ↓ │ │
│ 11. Index in Vector Search │ │
│ vectorIndex.Add(id, embedding) │ │
│ → total_embeddings++ │ │
│ │ │
└───────────────────────────────────────┴───────────────────────────────────────┘
Security Controls:
├─ Bind to 127.0.0.1 only (not 0.0.0.0)
├─ UUID API key in Keychain (not hardcoded)
├─ Bearer token validation on every request
├─ /health endpoint public (monitoring)
└─ All other endpoints require auth
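
The NornicDB side of this exchange (steps 2 to 4) could be sketched in Go roughly as follows. The helper names `newEmbeddingRequest` and `parseEmbeddingResponse`, the struct types, and the `model` field in the body are assumptions made for illustration, not NornicDB's actual client code. Only the endpoint path, the Bearer header, the environment variable name, and the `data[].embedding` response shape are taken from the diagram.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"os"
	"strings"
)

// embeddingRequest mirrors the OpenAI-style body shown in the diagram.
type embeddingRequest struct {
	Input []string `json:"input"`
	Model string   `json:"model,omitempty"`
}

// embeddingResponse keeps only the fields the diagram shows.
type embeddingResponse struct {
	Data []struct {
		Embedding []float32 `json:"embedding"`
	} `json:"data"`
}

// newEmbeddingRequest (hypothetical helper) builds the authenticated POST.
func newEmbeddingRequest(baseURL, apiKey, text string) (*http.Request, error) {
	body, err := json.Marshal(embeddingRequest{Input: []string{text}, Model: "apple-ml-embeddings"})
	if err != nil {
		return nil, err
	}
	url := strings.TrimRight(baseURL, "/") + "/v1/embeddings"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// parseEmbeddingResponse (hypothetical helper) extracts the first embedding.
func parseEmbeddingResponse(raw []byte) ([]float32, error) {
	var out embeddingResponse
	if err := json.Unmarshal(raw, &out); err != nil {
		return nil, err
	}
	if len(out.Data) == 0 {
		return nil, errors.New("embedding response contained no data")
	}
	return out.Data[0].Embedding, nil
}

func main() {
	// The key is read from the environment, as in step 2 of the diagram.
	key := os.Getenv("NORNICDB_EMBEDDING_API_KEY")
	req, err := newEmbeddingRequest("http://localhost:11435/", key, "Apple Intelligence setup")
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())

	// A canned response stands in for the server so the sketch runs offline.
	vec, err := parseEmbeddingResponse([]byte(`{"data":[{"embedding":[0.24,-0.13,0.89]}]}`))
	fmt.Println(len(vec), err)
}
```

In a real client the request would be sent with an `http.Client` that has a timeout, and a non-200 status (such as the 401 from step 8) would be turned into an error before the body is parsed.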
## Option B: Local GGUF Models (In-Process)
┌─────────────────────────────────────────────────────────────────────────────┐
│ NornicDB Server Process │
│ (Single Process - No Network) │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ 1. Need embedding for node │
│ ↓ │
│ 2. Local GGUF Engine (llama.cpp) │
│ embedder.Embed(ctx, text) │
│ ↓ │
│ 3. Load model from disk │
│ /usr/local/var/nornicdb/models/bge-m3.gguf │
│ ↓ │
│ 4. Generate embedding (in-process) │
│ • No HTTP call │
│ • No authentication needed (same process) │
│ • Direct memory access │
│ ↓ │
│ 5. Return []float32{1024 values} │
│ ↓ │
│ 6. Index in Vector Search │
│ vectorIndex.Add(id, embedding) │
│ → total_embeddings++ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Performance Benefits:
├─ Zero network latency
├─ No authentication overhead
├─ Direct memory access
└─ ~100ms per embedding
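
Because the in-process path is a plain function call, it can sit behind the same interface as the HTTP provider. The sketch below assumes an `Embedder` interface built around the `embedder.Embed(ctx, text)` call shown above. The interface itself, `stubLocalEmbedder`, and `embedChecked` are hypothetical names, and the stub returns a zero vector instead of running llama.cpp so that the example runs without a model file.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Embedder is an assumed interface; only the Embed(ctx, text) call shape
// appears in the diagram.
type Embedder interface {
	Embed(ctx context.Context, text string) ([]float32, error)
}

// stubLocalEmbedder stands in for the GGUF engine. It returns a zero vector
// of the configured size rather than performing real inference.
type stubLocalEmbedder struct{ dims int }

func (s stubLocalEmbedder) Embed(ctx context.Context, text string) ([]float32, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	if text == "" {
		return nil, errors.New("empty input text")
	}
	return make([]float32, s.dims), nil
}

// embedChecked calls the embedder directly (no HTTP, no auth) and verifies
// that the vector length matches what the vector index was configured for.
func embedChecked(e Embedder, text string, wantDims int) ([]float32, error) {
	vec, err := e.Embed(context.Background(), text)
	if err != nil {
		return nil, err
	}
	if len(vec) != wantDims {
		return nil, fmt.Errorf("got %d dimensions, index expects %d", len(vec), wantDims)
	}
	return vec, nil
}

func main() {
	vec, err := embedChecked(stubLocalEmbedder{dims: 1024}, "Apple Intelligence setup", 1024)
	fmt.Println(len(vec), err)
}
```

The dimension check matters when switching providers: a 512-dimension index cannot accept the 1024-value vectors produced by bge-m3.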
## Data Flow: Node Creation to Vector Search
1. User Creates Node
↓
CREATE (n:Memory {content: "Apple Intelligence setup"})
↓
2. Storage Layer (BadgerDB)
↓
node.ID = "node-123"
node.Embedding = nil ← Not embedded yet
↓
3. Auto-Embed Queue (Background Worker)
↓
Detects node.Embedding == nil
↓
4. Embedding Provider (Apple Intelligence)
├─ HTTP POST to localhost:11435
├─ Authorization: Bearer <UUID from Keychain>
├─ Apple NLEmbedding: sentenceEmbedding(for:), then vector(for:)
└─ Returns: [512 floats]
↓
5. Update Node
↓
node.Embedding = [0.24, -0.13, ..., 0.89] ← 512 values
node.Properties["embedding_model"] = "apple-ml-embeddings"
node.Properties["embedding_dimensions"] = 512
↓
6. Search Index Update (Callback)
↓
vectorIndex.Add("node-123", embedding)
total_embeddings++ ← UI shows count
↓
7. Ready for Semantic Search
↓
CALL db.index.vector.queryNodes('semantic', 10, 'setup guide')
↓
Returns: node-123 with similarity score
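
Steps 3 to 6 amount to a loop over nodes that have no embedding yet. The following Go sketch shows one pass of such a worker under simplifying assumptions: the `Node` struct, `embedFunc`, the map-backed `vectorIndex`, and `processPending` are illustrative stand-ins, not NornicDB's actual types, and the real queue runs in the background rather than over a slice.

```go
package main

import (
	"errors"
	"fmt"
)

// Node is a simplified stand-in for the stored node.
type Node struct {
	ID         string
	Content    string
	Embedding  []float32
	Properties map[string]any
}

// embedFunc abstracts the provider (Apple Intelligence or local GGUF).
type embedFunc func(text string) ([]float32, error)

// vectorIndex is a map-backed placeholder for the search index.
type vectorIndex struct {
	vectors map[string][]float32
}

func (v *vectorIndex) Add(id string, embedding []float32) { v.vectors[id] = embedding }

// processPending embeds every node whose Embedding is nil, records the model
// metadata on the node, and adds the vector to the index. It returns how many
// nodes were embedded; failures are collected and do not stop the pass.
func processPending(nodes []*Node, embed embedFunc, model string, idx *vectorIndex) (int, error) {
	embedded := 0
	var errs []error
	for _, n := range nodes {
		if n.Embedding != nil {
			continue // already embedded
		}
		vec, err := embed(n.Content)
		if err != nil {
			errs = append(errs, fmt.Errorf("node %s: %w", n.ID, err))
			continue
		}
		n.Embedding = vec
		if n.Properties == nil {
			n.Properties = map[string]any{}
		}
		n.Properties["embedding_model"] = model
		n.Properties["embedding_dimensions"] = len(vec)
		idx.Add(n.ID, vec)
		embedded++
	}
	return embedded, errors.Join(errs...)
}

func main() {
	nodes := []*Node{{ID: "node-123", Content: "Apple Intelligence setup"}}
	idx := &vectorIndex{vectors: map[string][]float32{}}
	fake := func(string) ([]float32, error) { return make([]float32, 512), nil }

	n, err := processPending(nodes, fake, "apple-ml-embeddings", idx)
	fmt.Println(n, err, nodes[0].Properties["embedding_dimensions"])
}
```

Leaving failed nodes with a nil embedding means the next pass picks them up again, which is one simple way to get retries without a separate failure queue.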
## Configuration Comparison
┌──────────────────────────┬────────────────────────┬──────────────────────────┐
│ Configuration Key │ Apple Intelligence │ Local GGUF Models │
├──────────────────────────┼────────────────────────┼──────────────────────────┤
│ provider │ openai │ local │
│ url │ http://localhost:11435/│ (not used) │
│ model │ apple-ml-embeddings │ bge-m3.gguf │
│ dimensions │ 512 │ 1024 │
│ api_key (env) │ <UUID from Keychain> │ (not needed) │
│ network_required │ localhost only │ in-process │
│ authentication │ Bearer token │ not needed │
│ external_dependencies │ on-device ML │ self-contained │
│ cost │ FREE │ FREE │
│ privacy │ 100% local │ 100% local │
│ latency │ ~50ms (network) │ ~100ms (compute) │
└──────────────────────────┴────────────────────────┴──────────────────────────┘
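
The per-provider requirements in this table can be enforced at startup. The sketch below is one way to do that in Go. The `EmbeddingConfig` struct and its `Validate` method are hypothetical and only mirror the table's keys; they are not NornicDB's real configuration types.

```go
package main

import (
	"errors"
	"fmt"
)

// EmbeddingConfig mirrors the keys in the comparison table (assumed shape).
type EmbeddingConfig struct {
	Provider   string // "openai" (Apple Intelligence server) or "local" (GGUF)
	URL        string // used only by the openai provider
	Model      string
	Dimensions int
	APIKey     string // supplied via environment, used only by openai
}

// Validate checks the fields each provider needs according to the table.
func (c EmbeddingConfig) Validate() error {
	if c.Model == "" {
		return errors.New("model is required")
	}
	if c.Dimensions <= 0 {
		return fmt.Errorf("dimensions must be positive, got %d", c.Dimensions)
	}
	switch c.Provider {
	case "openai":
		if c.URL == "" {
			return errors.New("openai provider requires a url")
		}
		if c.APIKey == "" {
			return errors.New("openai provider requires an api key")
		}
	case "local":
		// In-process: no URL or API key is needed.
	default:
		return fmt.Errorf("unknown provider %q", c.Provider)
	}
	return nil
}

func main() {
	apple := EmbeddingConfig{
		Provider:   "openai",
		URL:        "http://localhost:11435/",
		Model:      "apple-ml-embeddings",
		Dimensions: 512,
		APIKey:     "example-key", // placeholder, not a real key
	}
	local := EmbeddingConfig{Provider: "local", Model: "bge-m3.gguf", Dimensions: 1024}
	fmt.Println(apple.Validate(), local.Validate())
}
```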
## Security Flow Detail
┌─────────────────────────────────────────────────────────────────────────────┐
│ Security Architecture │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Keychain (Encrypted by macOS) │
│ ┌───────────────────────────────────────────────────────────┐ │
│ │ apple_intelligence_api_key: "E4D8A2F1-XXXX-XXXX-..." │ │
│ └───────────────────────┬───────────────────────────────────┘ │
│ │ │
│ ┌─────────┴─────────┐ │
│ │ │ │
│ ┌─────────▼─────────┐ ┌──────▼─────────────┐ │
│ │ Menu Bar App │ │ LaunchAgent Plist │ │
│ │ (at startup) │ │ (env var) │ │
│ └─────────┬─────────┘ └──────┬─────────────┘ │
│ │ │ │
│ │ │ │
│ ┌─────────▼─────────┐ ┌──────▼─────────────────────────────┐ │
│ │ Apple ML Server │ │ NornicDB Server │ │
│ │ (Swift) │ │ (Go) │ │
│ ├───────────────────┤ ├────────────────────────────────────┤ │
│ │ • Load at init │ │ • Read from env var │ │
│ │ • Store in mem │ │ • Include in Authorization header │ │
│ └─────────┬─────────┘ └──────┬─────────────────────────────┘ │
│ │ │ │
│ │ │ │
│ │ ┌────────────────▼────────────────────┐ │
│ │ │ POST /v1/embeddings │ │
│ │ │ Authorization: Bearer E4D8A2F1-... │ │
│ │ └────────────────┬────────────────────┘ │
│ │ │ │
│ └───────────────────┤ │
│ │ │
│ ┌───────────────────▼────────────────────┐ │
│ │ validateAuth(headers) │ │
│ │ 1. Extract: "Bearer <key>" │ │
│ │ 2. Compare: key == apiKey? │ │
│ │ 3. Match → Process │ │
│ │ No match → 401 │ │
│ └────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Network Topology:
• Apple ML Server: 127.0.0.1:11435 (localhost only, no external access)
• NornicDB Server: 127.0.0.1:7474 (accessible, but authenticated)
• Both on same machine → minimal latency (~50ms roundtrip)
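
The `validateAuth` box can be sketched as follows. The actual server is written in Swift; this Go version only illustrates the logic, and `statusFor` is a hypothetical helper that combines the check with the public `/health` rule. The diagram shows a plain `key == apiKey` comparison. The sketch swaps in `subtle.ConstantTimeCompare` as a hardening choice, so that comparison time does not depend on how many leading characters match.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"strings"
)

// validateAuth reports whether authHeader carries "Bearer <key>" with a key
// equal to apiKey. An empty apiKey never validates.
func validateAuth(authHeader, apiKey string) bool {
	const prefix = "Bearer "
	if apiKey == "" || !strings.HasPrefix(authHeader, prefix) {
		return false
	}
	provided := strings.TrimPrefix(authHeader, prefix)
	// Constant-time comparison: an assumption of this sketch, not something
	// the diagram specifies.
	return subtle.ConstantTimeCompare([]byte(provided), []byte(apiKey)) == 1
}

// statusFor (hypothetical) returns the HTTP status for a request:
// /health is public, every other path requires a valid key.
func statusFor(path, authHeader, apiKey string) int {
	if path == "/health" {
		return 200
	}
	if !validateAuth(authHeader, apiKey) {
		return 401
	}
	return 200
}

func main() {
	key := "example-key" // placeholder; the real key is a UUID from the Keychain
	fmt.Println(statusFor("/health", "", key))
	fmt.Println(statusFor("/v1/embeddings", "Bearer example-key", key))
	fmt.Println(statusFor("/v1/embeddings", "Bearer wrong", key))
}
```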
## Embedding Flow Comparison
### Option A: Apple Intelligence (Network-Based, Secured)
Node Creation → Auto-Embed Queue
↓
┌─────────────────────┐
│ Provider: openai │
│ URL: localhost:11435│
└─────────┬───────────┘
│
↓ HTTP POST
┌─────────────────────────────────┐
│ Authorization: Bearer <UUID> │
│ {"input": ["text"]} │
└─────────┬───────────────────────┘
│
↓ TCP to 127.0.0.1:11435
┌─────────────────────────────────┐
│ Apple ML Embedding Server │
│ Validate API Key │
│ Valid → Continue │
│ Invalid → 401 │
└─────────┬───────────────────────┘
│
↓
┌─────────────────────────────────┐
│ Apple NLEmbedding Framework │
│ • On-device ML │
│ • sentenceEmbedding(for: .english) │
│ • Returns: 512 floats │
└─────────┬───────────────────────┘
│
↓ JSON Response
{"data": [{"embedding": [512 values]}]}
│
↓
Update node.Embedding = [...]
│
↓
vectorIndex.Add(id, embedding)
│
↓
total_embeddings: 145
### Option B: Local GGUF (In-Process, No Auth)
Node Creation → Auto-Embed Queue
↓
┌─────────────────────┐
│ Provider: local │
│ (no URL/auth) │
└─────────┬───────────┘
│
↓ Direct function call
┌─────────────────────────────────┐
│ Local GGUF Engine (llama.cpp) │
│ • Same process (no network) │
│ • No authentication needed │
└─────────┬───────────────────────┘
│
↓
┌─────────────────────────────────┐
│ Load: bge-m3.gguf │
│ /usr/local/var/nornicdb/models/ │
└─────────┬───────────────────────┘
│
↓ llama.cpp inference
Returns: []float32{1024 values}
│
↓
Update node.Embedding = [...]
│
↓
vectorIndex.Add(id, embedding)
│
↓
total_embeddings: 41K
## Why This Architecture Enables Multimodal
| Component | Current Capability | Multimodal Ready |
|---|---|---|
| Storage | `node.Embedding []float32` | Works for text/image/audio |
| Search Index | `NewServiceWithDimensions(dims)` | Configurable per modality |
| API | OpenAI-compatible text endpoint | Extensible to image/audio |
| Provider Pattern | Apple ML + Local GGUF | Add Vision/Whisper providers |
| Metadata | `embedding_model`, `embedding_dimensions` | Add `source_modality` field |
| Security | UUID-based auth | Reusable for all providers |

Same storage, same search, same API pattern - just different providers!
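
To illustrate why a dimension-configurable index is modality-agnostic, here is a minimal Go sketch of an index that is created with a fixed dimension, rejects vectors of any other length, and ranks by cosine similarity. `Index`, `NewIndex`, `Hit`, and the brute-force `Query` are hypothetical and are not the real `NewServiceWithDimensions` API. The point is only that nothing in the index depends on whether a vector came from text, an image, or audio.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Index stores vectors of one fixed dimension (hypothetical type).
type Index struct {
	dims    int
	vectors map[string][]float32
}

func NewIndex(dims int) *Index {
	return &Index{dims: dims, vectors: map[string][]float32{}}
}

// Add rejects vectors whose length differs from the configured dimension.
func (ix *Index) Add(id string, v []float32) error {
	if len(v) != ix.dims {
		return fmt.Errorf("vector has %d dimensions, index expects %d", len(v), ix.dims)
	}
	ix.vectors[id] = v
	return nil
}

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// Hit is one search result.
type Hit struct {
	ID    string
	Score float64
}

// Query scans all vectors and returns the k most similar, best first.
func (ix *Index) Query(q []float32, k int) ([]Hit, error) {
	if len(q) != ix.dims {
		return nil, fmt.Errorf("query has %d dimensions, index expects %d", len(q), ix.dims)
	}
	hits := make([]Hit, 0, len(ix.vectors))
	for id, v := range ix.vectors {
		hits = append(hits, Hit{ID: id, Score: cosine(q, v)})
	}
	sort.Slice(hits, func(i, j int) bool {
		if hits[i].Score != hits[j].Score {
			return hits[i].Score > hits[j].Score
		}
		return hits[i].ID < hits[j].ID // stable order for ties
	})
	if k < 0 {
		k = 0
	}
	if k < len(hits) {
		hits = hits[:k]
	}
	return hits, nil
}

func main() {
	text := NewIndex(3) // tiny dimension for the demo; real indexes use 512 or 1024
	_ = text.Add("node-123", []float32{1, 0, 0})
	_ = text.Add("node-456", []float32{0, 1, 0})
	hits, err := text.Query([]float32{0.9, 0.1, 0}, 1)
	fmt.Println(hits, err)
}
```

Under this sketch, one index per modality (each with its own dimension) would keep vectors from different models from being compared against each other.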
## [1.0.4] - 2025-12-10
### Fixed
- Critical: Node/Edge Count Tracking During DETACH DELETE - Edge counts became incorrect (negative, double-counted, or stale) during `DETACH DELETE` operations
  - `deleteEdgesWithPrefix()` was deleting edges but not returning count of edges actually deleted
  - `deleteNodeInTxn()` wasn't tracking edges deleted along with the node
  - `BulkDeleteNodes()` only decremented node count, not edge count for cascade-deleted edges
  - Unit tests showed counts going negative or remaining high after deletes, resetting to zero only on restart
  - Fixed by updating `deleteEdgesWithPrefix()` signature to return `(int64, []EdgeID, error)`
  - Fixed `deleteNodeInTxn()` to aggregate and return edges deleted with node
  - Fixed `BulkDeleteNodes()` to correctly decrement `edgeCount` and notify `edgeDeleted` callbacks
  - Added comprehensive tests in `pkg/storage/async_engine_delete_stats_test.go`
  - Impact: `/admin/stats` and Cypher `count()` queries now remain accurate during bulk delete operations
- Critical: ORDER BY Ignored for Relationship Patterns - `ORDER BY`, `SKIP`, and `LIMIT` clauses were completely ignored for queries with relationship patterns
  - Queries like `MATCH (p:Person)-[:WORKS_IN]->(a:Area) RETURN p.name ORDER BY p.name` returned unordered results
  - `executeMatchWithRelationships()` was returning immediately without applying post-processing clauses
  - Fixed by capturing result, applying ORDER BY/SKIP/LIMIT, then returning
  - Affects all queries with relationship traversal: `(a)-[:TYPE]->(b)`, `(a)<-[:TYPE]-(b)`, chained patterns
  - Impact: Fixes data integrity issues where clients relied on sorted results
- Critical: Cartesian Product MATCH Returns Zero Rows - Comma-separated node patterns returned empty results instead of cartesian product
  - `MATCH (p:Person), (a:Area) RETURN p.name, a.code` returned 0 rows (should return N×M combinations)
  - `executeMatch()` only parsed first pattern, ignoring subsequent comma-separated patterns
  - Fixed by detecting multiple patterns via `splitNodePatterns()` and routing to new `executeCartesianProductMatch()`
  - Now correctly generates all combinations of matched nodes
  - Supports WHERE filtering, aggregation, ORDER BY, SKIP, LIMIT on cartesian results
  - Impact: Critical for Northwind-style bulk insert patterns like `MATCH (s), (c) CREATE (p)-[:REL]->(c)`
- Critical: Cartesian Product CREATE Only Creates One Relationship - `MATCH` with multiple patterns followed by `CREATE` only created relationships for first match
  - `MATCH (p:Person), (a:Area) CREATE (p)-[:WORKS_IN]->(a)` created 1 relationship (should create 3 for 3 persons × 1 area)
  - `executeMatchCreateBlock()` was collecting only first matching node per pattern variable
  - Fixed by collecting ALL matching nodes and iterating through cartesian product combinations
  - Each CREATE now executes once per combination in the cartesian product
  - Impact: Fixes bulk relationship creation patterns used in data import workflows
- UNWIND CREATE with RETURN Returns Variable Name Instead of Values - Return clause after `UNWIND...CREATE` returned literal variable names
  - `UNWIND ['A','B','C'] AS name CREATE (n {name: name}) RETURN n.name` returned `["name","name","name"]` (should be `["A","B","C"]`)
  - `replaceVariableInQuery()` failed to handle variables inside curly braces like `{name: name}`
  - String splitting on spaces left `name}` which didn't match variable `name`
  - Fixed by properly trimming braces `{}[]()` and preserving surrounding punctuation during replacement
  - Impact: Fixes all UNWIND+CREATE+RETURN workflows, critical for bulk data ingestion with result tracking
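
The brace-trimming fix in the last entry can be illustrated with a short Go sketch. `replaceVariable` is a hypothetical, simplified stand-in for `replaceVariableInQuery()`: it splits on spaces, trims only the bracket characters named above from each token, and splices the value back in between whatever punctuation surrounded the variable. It does not handle quoted strings or tokens followed by a comma, which the real function may well cover.

```go
package main

import (
	"fmt"
	"strings"
)

// brackets are the characters trimmed before comparing a token to the
// variable name, as described in the changelog entry.
const brackets = "{}[]()"

// replaceVariable substitutes value for every space-separated token whose
// core (the token with surrounding brackets trimmed) equals name, keeping
// the surrounding brackets in place.
func replaceVariable(query, name, value string) string {
	if name == "" {
		return query
	}
	tokens := strings.Split(query, " ")
	for i, tok := range tokens {
		core := strings.Trim(tok, brackets)
		if core != name {
			continue // e.g. "{name:" trims to "name:", so the property key is kept
		}
		start := strings.Index(tok, core)
		tokens[i] = tok[:start] + value + tok[start+len(core):]
	}
	return strings.Join(tokens, " ")
}

func main() {
	q := "CREATE (n {name: name}) RETURN n.name"
	fmt.Println(replaceVariable(q, "name", "'A'"))
}
```

Without the trim, the token `name})` never equals `name`, so the literal variable name survives into the executed query, which matches the symptom described above.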
### Changed
- Cartesian Product Performance - New `executeCartesianProductMatch()` efficiently handles multi-pattern queries
  - Builds combinations incrementally to avoid memory explosion on large datasets
  - Supports early filtering with WHERE clause before building full product
  - Properly integrates with query optimizer (ORDER BY, SKIP, LIMIT applied after filtering)
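
A minimal Go sketch of incremental product building with early filtering follows. It is not the actual `executeCartesianProductMatch()` implementation. `Row`, `cartesian`, and the `keep` predicate are hypothetical, and node matches are reduced to string IDs. The idea shown is that rows are extended one pattern variable at a time, and a row rejected by the predicate is dropped before later variables can multiply it.

```go
package main

import "fmt"

// Row binds pattern variables to matched node IDs.
type Row map[string]string

// cartesian expands rows one variable at a time. keep is called on each
// partial row and must tolerate variables that are not bound yet; a nil
// keep accepts everything.
func cartesian(vars []string, matches map[string][]string, keep func(Row) bool) []Row {
	if len(vars) == 0 {
		return nil
	}
	rows := []Row{{}}
	for _, v := range vars {
		next := make([]Row, 0, len(rows)*len(matches[v]))
		for _, r := range rows {
			for _, id := range matches[v] {
				ext := make(Row, len(r)+1)
				for k, val := range r {
					ext[k] = val
				}
				ext[v] = id
				if keep == nil || keep(ext) {
					next = append(next, ext)
				}
			}
		}
		rows = next
	}
	return rows
}

func main() {
	matches := map[string][]string{
		"p": {"alice", "bob", "carol"},
		"a": {"area-1"},
	}
	// 3 persons × 1 area gives 3 combinations.
	all := cartesian([]string{"p", "a"}, matches, nil)
	fmt.Println(len(all))

	// A WHERE-like predicate on p prunes rows before a is expanded.
	filtered := cartesian([]string{"p", "a"}, matches, func(r Row) bool {
		return r["p"] != "bob"
	})
	fmt.Println(len(filtered))
}
```

Ordering the variables so that the most selective predicate applies first keeps the intermediate row set small, which is the memory concern the entry above refers to.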
### Technical Details
- Modified `pkg/storage/badger.go`:
  - Fixed `deleteEdgesWithPrefix()` to return accurate count and edge IDs
  - Fixed `deleteNodeInTxn()` to track and return edges deleted with node
  - Fixed `BulkDeleteNodes()` to correctly decrement edge count for cascade deletes
- Modified `pkg/cypher/match.go`:
  - Added `executeCartesianProductMatch()` for comma-separated pattern handling
  - Added `executeCartesianAggregation()` for aggregation over cartesian results
  - Added `evaluateWhereForContext()` for WHERE clause evaluation on node contexts
  - Fixed `executeMatch()` to detect and route multiple patterns correctly
  - Fixed relationship pattern path to apply ORDER BY/SKIP/LIMIT before returning
- Modified `pkg/cypher/create.go`:
  - Updated `executeMatchCreateBlock()` to collect all pattern matches (not just first)
  - Added cartesian product iteration for CREATE execution
  - Now creates relationships for every combination in MATCH cartesian product
- Modified `pkg/cypher/clauses.go`:
  - Fixed `replaceVariableInQuery()` to handle variables in property maps `{key: value}`
  - Improved punctuation preservation during variable substitution
### Test Coverage
- All existing tests pass (100% backwards compatibility)
- Added `pkg/storage/async_engine_delete_stats_test.go` with comprehensive count tracking tests
- Fixed `TestWorksInRelationshipTypeAlternation` - ORDER BY now works correctly
- Fixed `TestUnwindWithCreate/UNWIND_CREATE_with_RETURN` - Returns actual values, not variable names
- Cartesian product patterns now pass all Northwind benchmark compatibility tests