
Change the management of data tables and add some interface exposure for agent #37

@Jay-ju


Architecture & Design Discussion

Summary

This document proposes a comprehensive architecture and design for lance-context, a high-performance, evolvable context management solution for AI Agents. The design establishes a clean separation between the logical information layer and the physical storage layer (LanceDB), defines a clear set of Agent-facing interfaces, and introduces a layered data model (L0-L2) to balance retrieval effectiveness and cost. It also includes specifications for data governance, multi-tenancy, and a phased implementation roadmap, aiming to provide a robust foundation for building and scaling complex Agent systems.

Motivation / Problem Statement

Current Agent development faces several challenges:

  • Fragmented Context: Information is often scattered across various systems, leading to inconsistent and incomplete context for the Agent.
  • Suboptimal Retrieval: Existing retrieval mechanisms lack the sophistication to balance precision, recall, and cost effectively.
  • Disorderly Information Growth: Without a proper strategy, an Agent's memory and knowledge base can expand indefinitely, leading to performance degradation and increased costs.
  • Lack of Extensibility: Tightly coupled business logic and storage implementations make it difficult to evolve the system, such as introducing new storage engines or adapting to new Agent capabilities.

The initial version of lance-context provides a solid starting point, but a more systematic architecture is required to address these issues and support the long-term growth of sophisticated AI Agents.

Design Overview

The proposed architecture is centered around three key concepts:

  1. Information Layer on LanceDB: A logical Information Layer is introduced on top of the physical LanceDB storage. This layer provides a stable, semantic view of the data, abstracting away the underlying implementation details. It organizes data into three distinct families: ctx_agent (core Agent business data), ctx_kb (external knowledge), and ctx_meta (internal system metadata).

  2. Layered Data Model (L0-L2): Inspired by industry best practices, we adopt a three-layer data model to optimize retrieval and processing:

    • L2 (Raw Content): The immutable source of truth.
    • L1 (Structured & Vector): The primary retrieval target, containing cleaned, chunked, and vectorized data.
    • L0 (Abstract/Summary): A high-level summary layer for efficient pre-filtering and low-cost relevance assessment.
  3. Task-Oriented Agent Interfaces: A set of high-level interfaces (Add, Search, Explain, Trace, Prune, Archive) is defined to give Agents intuitive, task-oriented capabilities for managing their context, memory, and knowledge.
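
As a sketch of what this task-oriented surface might look like in Python (the names and signatures below are illustrative assumptions, not a finalized API):

```python
from typing import Any, Dict, List, Optional, Protocol, Union, runtime_checkable


@runtime_checkable
class ContextStore(Protocol):
    """Hypothetical Agent-facing surface for lance-context (illustrative only)."""

    def add(self, content: Union[str, bytes], content_type: str,
            session_id: Optional[str] = None,
            metadata: Optional[Dict[str, Any]] = None) -> str: ...  # returns a job_id

    def search(self, query: str, filter: Optional[Dict[str, Any]] = None,
               top_k: int = 10) -> List[Dict[str, Any]]: ...

    def prune(self, policy: Dict[str, Any]) -> str: ...  # returns a job_id

    def archive(self, session_id: str) -> str: ...  # returns a job_id
```

Expressing the surface as a `Protocol` keeps the Agent decoupled from any particular backend, which matches the engine-adapter direction in the roadmap below.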

Detailed Design

Overall Architecture and Data Layers

We propose an overall architecture that includes an Information Layer, which provides a stable logical view for Agent applications on top of the physical storage (LanceDB).

```mermaid
flowchart TD
    subgraph "Agent Application"
        Agent["AI Agent"]
    end

    subgraph "Information Layer (lance-context)"
        direction LR
        subgraph "Agent Interfaces"
            direction TB
            Add["Add"]
            Search["Search"]
            Prune["Prune"]
            Archive["Archive"]
        end

        subgraph "Table Families"
            direction TB
            ctx_agent["ctx_agent (Core)"]
            ctx_kb["ctx_kb (Knowledge)"]
            ctx_meta["ctx_meta (Internal)"]
        end

        subgraph "Data Layers"
            direction TB
            L0["L0 (Summary)"]
            L1["L1 (Structured/Vector)"]
            L2["L2 (Raw Content)"]
        end
    end

    subgraph "Physical Storage"
        LanceDB["LanceDB"]
    end

    Agent --> Add
    Agent --> Search
    Agent --> Prune
    Agent --> Archive

    Add --> L2
    L2 --> L1
    L1 --> L0

    Search -- "queries" --> L0
    Search -- "queries" --> L1
    L1 -- "links to" --> L2

    ctx_agent --> LanceDB
    ctx_kb --> LanceDB
    ctx_meta --> LanceDB
```

The core of this architecture is a governance strategy based on data layering and separation.

Data Layers: L0, L1, and L2

We process and store data in three logical layers to optimize retrieval efficiency and reduce LLM token consumption.

  • L2 (Raw Content Layer)

    • Semantics: Unprocessed raw data, serving as the "Source of Truth" for all information. Examples include complete conversation logs, user-uploaded original documents, and full tool-call logs.
    • Generation Pipeline: Data enters the system via the Add interface and is directly stored in the corresponding L2 table.
    • Storage Strategy: Stored in tables like ctx_agent.agent_l2_raw or ctx_kb.kb_l2_raw_documents in binary or text format.
    • Retrieval Strategy: Not directly involved in retrieval by default. It is accessed only for "evidence traceability" or deep analysis, via links from L1/L0.
  • L1 (Structured & Vector Layer)

    • Semantics: The result of cleaning, chunking, extracting metadata from, and generating vector embeddings for L2 data. This is the primary retrieval target of the system, balancing information density and contextual granularity.
    • Generation Pipeline: Triggered by a background task or write pipeline, it processes new L2 data to generate L1 records.
    • Storage Strategy: Chunked text and metadata are stored in ctx_agent.agent_l1_chunks, and vectors are stored in ctx_agent.agent_l1_embeddings.
    • Retrieval Strategy: This is the main layer for hybrid retrieval (Scalar + FTS + Vector). An Agent's Search request first retrieves candidates from this layer using vector similarity, keywords, and metadata filtering.
  • L0 (Abstract/Summary Layer)

    • Semantics: A brief summary generated for a group of L1 Chunks or an L2 object (like a session or a document). Its core function is pre-filtering, helping the Agent or retrieval strategy quickly determine if a larger entity (like an entire session) is worth exploring in-depth.
    • Generation Pipeline: Triggered by a background task or after L1 processing is complete, it calls an LLM to summarize L1 Chunks or L2 content.
    • Storage Strategy: Stored in the ctx_agent.agent_l0_summaries table and linked to the corresponding L1/L2 entities.
    • Retrieval Strategy: Acts as the first line of defense in retrieval. For instance, in cross-session retrieval, it can quickly filter L0 summaries of all sessions to locate the most relevant ones before performing a precise search within their L1 Chunks.
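
The generation pipeline described above (L2 → L1 → L0) can be sketched with stdlib-only Python. The chunking and summarization logic here are stand-in assumptions; a real pipeline would chunk semantically, call an embedding model for L1 vectors, and call an LLM for L0 summaries:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class L2Raw:                 # immutable source of truth
    raw_id: str
    content: str


@dataclass
class L1Chunk:               # primary retrieval target
    chunk_id: str
    raw_id: str              # link back to L2 for evidence traceability
    content: str


@dataclass
class L0Summary:             # cheap pre-filtering layer
    target_id: str
    summary: str


def chunk_l2(raw: L2Raw, size: int = 40) -> List[L1Chunk]:
    """Naive fixed-size chunking; real pipelines would chunk semantically."""
    return [
        L1Chunk(f"{raw.raw_id}:{i}", raw.raw_id, raw.content[i:i + size])
        for i in range(0, len(raw.content), size)
    ]


def summarize_l1(target_id: str, chunks: List[L1Chunk]) -> L0Summary:
    """Stand-in for an LLM summarization call: keep the head of the first chunk."""
    head = chunks[0].content[:20] if chunks else ""
    return L0Summary(target_id, head)
```

Note that every L1 record carries its `raw_id`; this is the link that the L2 "evidence traceability" step relies on.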

Retrieval Chain

The standard retrieval chain follows the L0 → L1 → L2 sequence:

  1. L0 Pre-filtering: Based on the query intent, a quick, low-cost match is first performed on the L0 summary layer to identify highly relevant entities (e.g., sessions, documents).
  2. L1 Main Retrieval: Within the scope of entities filtered by L0, or directly in the global L1 Chunks, a hybrid search is executed to recall the most relevant atomic information blocks.
  3. L2 Evidence Traceability: The L1 retrieval results are presented to the LLM. If more complete context or fact-verification is needed, the LLM or user can trace back to the L2 raw data using the links saved in the L1 records.
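
The three-step chain above can be sketched as a pure function. Token overlap stands in for real vector/FTS scoring, and the table layouts are simplified assumptions:

```python
from typing import Dict, List, Tuple


def _overlap(query: str, text: str) -> int:
    """Toy relevance score: count of shared lowercase tokens."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve(query: str,
             l0: Dict[str, str],                    # entity_id -> L0 summary
             l1: Dict[str, List[Tuple[str, str]]],  # entity_id -> [(chunk_id, text)]
             top_entities: int = 2,
             top_chunks: int = 3) -> List[str]:
    # 1. L0 pre-filtering: keep only the most promising entities.
    entities = sorted(l0, key=lambda e: _overlap(query, l0[e]),
                      reverse=True)[:top_entities]
    # 2. L1 main retrieval, scoped to the entities that passed L0.
    scored = [(c_id, _overlap(query, text))
              for e in entities for c_id, text in l1.get(e, [])]
    scored.sort(key=lambda p: p[1], reverse=True)
    # 3. Return chunk_ids; each chunk's raw_id link enables L2 traceability.
    return [c_id for c_id, s in scored[:top_chunks] if s > 0]
```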

Table Families and Naming Conventions

To decouple business logic from internal management, we have designed three independent table families (which can be mapped to different databases or directories in LanceDB), each with clear naming conventions and responsibilities.

ctx_agent Family: Agent Business Core

Stores runtime data directly interacting with the Agent.

| Logical Table Name | Description | Core Field Suggestions | Index Suggestions |
| --- | --- | --- | --- |
| agent_sessions | Session metadata | session_id (PK), agent_id, user_id, status, start_time, end_time, metadata (JSON) | B-Tree: agent_id, user_id, start_time |
| agent_l2_raw | L2 raw data | raw_id (PK), source_id (e.g., session_id), content (bytes/text), content_type, created_at | B-Tree: source_id, created_at |
| agent_l1_chunks | L1 structured chunks | chunk_id (PK), raw_id, content (text), metadata (JSON), created_at, agent_id | FTS: content; B-Tree: agent_id, created_at |
| agent_l1_embeddings | L1 vector embeddings | chunk_id (FK), vector (fixed_size_list), model_name, created_at | Vector (IVF_PQ): vector |
| agent_l0_summaries | L0 summaries | target_id (PK), target_type (session/doc), summary (text), updated_at, agent_id | FTS: summary; B-Tree: agent_id, target_type |
| agent_skills | Agent skill definitions | skill_id (PK), name, schema (JSON), description, agent_id, version | B-Tree: agent_id, name |
| agent_tool_calls | Tool call records | call_id (PK), session_id, tool_name, params (JSON), result (text), status, timestamp | B-Tree: session_id, tool_name, timestamp |
| agent_relations | Entity relationship graph | source_id, target_id, relation_type (e.g., 'cites', 'triggers'), agent_id, created_at | B-Tree: source_id, target_id, agent_id |

ctx_kb Family: External Knowledge Base

Used for storing relatively static, shareable background knowledge. Its structure is similar to ctx_agent but with an independent lifecycle and management strategy.

| Logical Table Name | Description | Core Field Suggestions | Index Suggestions |
| --- | --- | --- | --- |
| kb_l2_raw_documents | L2 raw knowledge documents | doc_id (PK), source_uri, content, metadata (JSON), imported_at | B-Tree: source_uri |
| kb_l1_chunks | L1 knowledge base chunks | chunk_id (PK), doc_id, content, metadata (JSON) | FTS: content |
| kb_l1_embeddings | L1 knowledge base vectors | chunk_id (FK), vector (fixed_size_list), model_name | Vector (IVF_PQ): vector |

ctx_meta Family: Internal Metadata

The system's "Information Schema," used for self-management and internal task scheduling, transparent to the Agent.

| Logical Table Name | Description | Core Field Suggestions | Purpose |
| --- | --- | --- | --- |
| meta_tables | Registry of tables | table_name, db_name, table_type, version | Records which logical tables exist in the system and their ownership. |
| meta_columns | Column attributes | table_name, column_name, data_type, is_time, is_vector | Queries and manages table schemas, especially special attributes like time and vectors. |
| meta_jobs | Background task queue | job_id (PK), job_type (prune/index), payload, status, scheduled_at | Schedules asynchronous tasks like index rebuilding, data pruning, and archiving. |
| meta_mem_candidates | Memory candidate pool | session_id, chunk_id, score, candidate_level (promote/prune) | The core table for implementing the daily pruning mechanism. |

Temporal Attributes and Indexing Conventions

  • Temporal Attributes: All fields involving timestamps (e.g., created_at, timestamp) should use a uniform UTC Timestamp type.
  • Vector Indexing: The vector field is the primary target for vector indexing. IVF_PQ is recommended.
  • FTS Indexing: All content or summary fields should have a Full-Text Search (FTS) index.
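
These conventions can be applied mechanically from the column attributes tracked in meta_columns. A stdlib-only sketch (the attribute names follow the meta_columns fields above; the returned index labels are illustrative):

```python
from typing import Dict, List


def suggest_indexes(columns: List[Dict[str, object]]) -> Dict[str, str]:
    """Map each column to an index type, following the conventions above."""
    plan: Dict[str, str] = {}
    for col in columns:
        name = str(col["column_name"])
        if col.get("is_vector"):
            plan[name] = "IVF_PQ"    # vector fields get a vector index
        elif col.get("is_time"):
            plan[name] = "BTREE"     # UTC timestamp fields get a scalar index
        elif name in ("content", "summary"):
            plan[name] = "FTS"       # text bodies get a full-text search index
    return plan
```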

Multi-Tenancy, Concurrency, and Data Governance

  • Multi-Tenancy and Versioning:

    • Logical Isolation: All records in ctx_agent must include an agent_id.
    • Physical Isolation: Data can be written to different directories based on agent_id.
    • Versioning: Relies on LanceDB's native transaction and versioning capabilities.
  • Concurrency and Isolation:

    • Read-Write Isolation: Ensured by LanceDB's Copy-on-Write mechanism.
    • Write Concurrency: A write queue at the API layer is recommended to batch-merge high-concurrency Add operations.
  • Index Maintenance and Archiving:

    • Index Maintenance: Handled by background jobs in meta_jobs during off-peak hours.
    • Data Archiving: Triggered via the Prune interface; a background task performs a soft delete and migrates the data to cold storage.
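
The recommended write queue can be sketched as a simple batcher that coalesces concurrent Add payloads into one physical write per batch (the threshold and the flush callback are assumptions; a production version would also flush on a timer):

```python
from typing import Any, Callable, Dict, List


class WriteQueue:
    """Coalesce individual Add payloads into batched writes."""

    def __init__(self, flush: Callable[[List[Dict[str, Any]]], None],
                 max_batch: int = 64) -> None:
        self._flush = flush
        self._max_batch = max_batch
        self._pending: List[Dict[str, Any]] = []

    def add(self, record: Dict[str, Any]) -> None:
        self._pending.append(record)
        if len(self._pending) >= self._max_batch:
            self.flush()

    def flush(self) -> None:
        if self._pending:
            batch, self._pending = self._pending, []
            self._flush(batch)   # one physical write per batch
```

Batching like this reduces the number of LanceDB versions created under high-concurrency Add traffic, at the cost of slightly higher write latency.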

Agent Interface Mapping

The interfaces provided by lance-context to the Agent should be task-oriented and highly abstract.

```mermaid
flowchart TD
    subgraph Agent
        direction LR
        A[Add]
        S[Search]
        E[Explain]
        T[Trace]
        P[Prune]
        AR[Archive]
    end

    subgraph Backend
        direction TB
        subgraph ctx_agent
            agent_l2["agent_l2_raw"]
            agent_l1["agent_l1_chunks/embeddings"]
            agent_l0["agent_l0_summaries"]
            agent_relations["agent_relations"]
            agent_tool_calls["agent_tool_calls"]
        end
        subgraph ctx_meta
            meta_jobs["meta_jobs"]
            meta_mem_candidates["meta_mem_candidates"]
        end
    end

    A -- "Writes to" --> agent_l2
    A -- "Triggers async write to" --> agent_l1
    A -- "Triggers async write to" --> agent_l0

    S -- "Queries" --> agent_l1

    E -- "Traverses" --> agent_relations

    T -- "Fetches from" --> agent_l1
    T -- "Fetches from" --> agent_tool_calls

    P -- "Filters in" --> meta_mem_candidates
    P -- "Creates job in" --> meta_jobs

    AR -- "Creates job in" --> meta_jobs
```

Core Interface Prototypes (DTOs)

| Interface | Input Parameters (DTO) | Output (DTO) | Core Logic & Table Mapping | Error Code Suggestions |
| --- | --- | --- | --- | --- |
| Add | content: Union[str, bytes], content_type: str, session_id: Optional[str], metadata: Dict | job_id: str | 1. Write to agent_l2_raw (L2).<br>2. Trigger background task: write to agent_l1_chunks/embeddings (L1).<br>3. (Optional) Trigger LLM to write to agent_l0_summaries (L0). | 400, 503 |
| Search | query: str, filter: Dict, top_k: int, search_type: Literal[...] | results: List[Chunk] | 1. Concurrently query agent_l1_chunks (FTS) and agent_l1_embeddings (Vector).<br>2. Use RRF to fuse results.<br>3. Perform scalar filtering. | 400, 404 |
| Explain | entity_id: str, entity_type: str | graph: Dict | Recursively trace relationships from the agent_relations table. | 404 |
| Trace | session_id: str | events: List[...] | Fetch all records for session_id from agent_l1_chunks and agent_tool_calls and sort by timestamp. | 404 |
| Prune | policy: Dict | job_id: str | 1. Filter candidates in meta_mem_candidates.<br>2. Create a prune task in meta_jobs.<br>3. Background task performs soft delete/archiving. | 400 |
| Archive | session_id: str | job_id: str | Create an archive task in meta_jobs to migrate all session data. | 404 |
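
The RRF (Reciprocal Rank Fusion) step used by Search can be sketched as a pure function over the FTS and vector candidate lists; k = 60 is the commonly cited default from the original RRF paper:

```python
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked candidate lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda d: scores[d], reverse=True)
```

Because RRF works on ranks rather than raw scores, it needs no normalization across the FTS and vector scoring scales, which is why it is a natural fit for this hybrid pipeline.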

Daily Compaction Mechanism

This mechanism distills incremental conversation data (Episodes) into valuable long-term memory (Profiles) and cleans up low-value information.

  1. Scoring and Candidacy

    • A daily background task scans recent agent_sessions and agent_l1_chunks.
    • Scoring Dimensions: Activity, Importance, Reusability, Time Decay.
    • Based on a weighted score, each item is marked in meta_mem_candidates as PROMOTION, RETENTION, or PRUNING.
  2. Promotion to Episode/Profile

    • A background task processes records marked for PROMOTION.
    • It aggregates content, calls an LLM to generate a narrative Episode or update a structured User Profile, and stores the result in a long-term memory table.
    • The operation is transactional and can be rolled back.
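
The weighted scoring and bucketing in step 1 can be sketched as follows. The weights, thresholds, and half-life are illustrative assumptions; the actual policy would be tuned per deployment:

```python
import math


def score_item(activity: float, importance: float, reusability: float,
               age_days: float, half_life_days: float = 30.0) -> float:
    """Weighted score with exponential time decay (weights are assumptions)."""
    decay = math.exp(-math.log(2) * age_days / half_life_days)
    return (0.3 * activity + 0.4 * importance + 0.3 * reusability) * decay


def candidate_level(score: float) -> str:
    """Bucket a score into the three meta_mem_candidates levels."""
    if score >= 0.7:
        return "PROMOTION"
    if score >= 0.3:
        return "RETENTION"
    return "PRUNING"
```

Under these assumed thresholds, an item that scores perfectly on all dimensions still decays into the PRUNING bucket after roughly three half-lives of inactivity.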

Limitations / Open Questions

  • Performance at Scale: While LanceDB is highly performant, the FTS and complex scalar query capabilities might become a bottleneck under heavy load. The hybrid storage model in Phase 2 is designed to mitigate this.
  • Write Concurrency Management: The proposed write queue adds a layer of complexity. Its implementation and tuning will be critical for high-throughput scenarios.
  • Cost of L0 Generation: Generating L0 summaries via LLM calls for every L1/L2 update can be costly. A selective or batch-based strategy for L0 generation might be needed.
  • Complexity of Query Router: The Query Router in Phase 3 is a significant engineering effort and will require careful design to handle query parsing, distribution, and result fusion correctly.

Rollout Plan / Roadmap

We recommend a phased approach to implement this design.

  • Phase 1: Unified Information Layer View based on LanceDB

    • Goal: Implement the complete L0/L1/L2 data model and core APIs (Add, Search) using only LanceDB.
    • Outcome: A functionally complete but limited-performance context database.
  • Phase 2: Hybrid Storage and Query Offloading

    • Goal: Introduce external specialized engines (e.g., Elasticsearch for FTS) where LanceDB's native capabilities are insufficient.
    • Outcome: A hybrid system with better performance and stronger query capabilities.
  • Phase 3: Engine Adapter Layer and Query Router

    • Goal: Evolve lance-context into a universal "context virtualization layer" that supports any combination of backend storage engines.
    • Outcome: A highly scalable context database platform completely decoupled from the underlying storage.

Checklist

  • Finalize table schemas for all three families.
  • Implement Phase 1 Add interface and async processing pipeline.
  • Implement Phase 1 Search interface with hybrid search capabilities.
  • Set up meta_jobs and a basic daily compaction framework.
  • Develop adapters for LangChain and LlamaIndex.
  • Benchmark performance of the Phase 1 implementation.
  • Document all public APIs and data models.

Impact Assessment

  • Performance: The layered data model and hybrid retrieval strategy are expected to significantly improve query performance and reduce LLM context size. Write latency will be managed via asynchronous processing and a write queue.
  • Cost: While LLM calls for L0/L1 generation introduce costs, the overall architecture aims to reduce token consumption during retrieval, potentially leading to net savings. The phased rollout allows for cost-effective scaling.
  • Compatibility: The design is framework-agnostic. By providing clean DTOs and adapters, it ensures easy integration with existing and future Agent frameworks. The reliance on LanceDB in Phase 1 simplifies initial deployment and dependencies.
