Fetchtable

Fetchtable is an advanced local-first vector database ingestion, corroboration, and semantic retrieval engine designed to augment Large Language Models and agents like OpenClaw. It goes beyond simple static vector search thresholds by dynamically probing semantic boundaries, maintaining a persistent memory graph of local files via background watchdogs, and structuring truth corroboration for agentic workflows.

Key Features

  • Dynamic N-ary Search Retrieval: Transcends traditional static top_k searches. Fetchtable intelligently evaluates chunks and performs recursive N-ary probing to dynamically identify relevance cutoff points across vast knowledge banks, minimizing LLM API token costs while maximizing context precision.
  • Persistent Vector Postgres: Out-of-the-box support for spinning up Dockerized ankane/pgvector databases with volumes mapped directly to your local file system for durable retention.
  • OpenClaw OS Semantic Indexer (Watchdog): A background daemon that recursively scans, sanitizes, and indexes your OS directories (e.g., Documents/Downloads). It monitors changes dynamically, generates LLM-summarized embedding descriptors of undocumented files, and safely maps OS commands to open/execute local files seamlessly for your AI agents.
  • Trust Graph Pipelines: Ingestion pipelines feature optional extraction overlays:
    • Named Entity Recognition (NER) via Spacy to map semantic relationships.
    • Fact Corroboration via HuggingFace Cross-Encoders (e.g., nli-deberta-v3-base) to build trust layers against conflicting document histories.
  • Decoupled Model Registry: Supports OpenAI, Anthropic, Ollama, and HuggingFace Sentence Transformers interchangeably without hardcoding keys or architectures.
  • Glassmorphism FastAPI Control Panel: A fully decoupled, beautiful local UI built on FastAPI to execute the entire lifecycle visually without ever touching a line of Python.
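
The dynamic cutoff idea behind N-ary search can be sketched in a few lines. This is an illustrative toy, not Fetchtable's actual implementation: `scores` stands in for per-chunk relevance judgments (which Fetchtable obtains from an LLM rather than a fixed threshold), and the probe shown is the binary (2-ary) special case of N-ary probing.

```python
def find_cutoff(scores, threshold=0.5):
    """Probe for the first index whose relevance score drops below
    `threshold`, assuming `scores` is sorted in descending order.
    Returns the number of chunks to keep."""
    lo, hi = 0, len(scores)
    while lo < hi:
        mid = (lo + hi) // 2
        if scores[mid] >= threshold:
            lo = mid + 1  # cutoff lies to the right of mid
        else:
            hi = mid      # cutoff is at mid or to its left
    return lo

# Ranked relevance scores for six retrieved chunks
ranked = [0.92, 0.88, 0.71, 0.64, 0.31, 0.12]
keep = find_cutoff(ranked)
print(ranked[:keep])  # only the four chunks above the cutoff survive
```

Probing in O(log n) judgments instead of scoring every chunk is what keeps LLM token costs down as the knowledge bank grows.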

🛠️ Quickstart

Prerequisites

  1. Python 3.10+
  2. Docker Desktop (Required for automatic PGVector database provisioning).

Installation

Clone the repository and install the dependencies.

git clone https://github.com/yourusername/fetchtable.git
cd fetchtable
pip install -r requirements.txt

(Note: Ensure you include fastapi, uvicorn, watchdog, pypdf, pgvector, litellm, sentence-transformers, spacy, and python-multipart in your environment.)
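
Based on the note above, a minimal requirements.txt would look like this (unpinned; add version pins as your environment requires):

```
fastapi
uvicorn
watchdog
pypdf
pgvector
litellm
sentence-transformers
spacy
python-multipart
```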


🖥️ Running the GUI Control Panel

Fetchtable ships out of the box with a visually polished, persistent-state web GUI.

python fetchtable/gui/app.py

Navigate to http://localhost:8000 in your browser. From here you can:

  1. Model Registry Tab: Securely register your Ollama, OpenAI, or Anthropic models and your text-embedding models.
  2. Databases Tab: Click a button to launch a persistent PGVector Docker instance locally.
  3. Pipelines Tab: Wire together Chunk Sizes, Models, and Fact Corroboration logic graphically.
  4. Documents Tab: Drag and drop PDF or TXT files natively to embed them.
  5. OpenClaw Agent Tab: Type in directories (e.g., /Users/you/Documents) to daemonize the background Watchdog observer!

Python Client Usage

If you prefer building Fetchtable into your existing agent structures programmatically:

import asyncio
from fetchtable.client import FetchtableClient
from fetchtable.vector_stores.pgvector_store import PGVectorStore
from fetchtable.llms.openai_llm import OpenAILLM
from fetchtable.embeddings.huggingface_embeddings import HuggingFaceEmbeddings

async def main():
    # 1. Initialize Decoupled Components
    store = PGVectorStore(connection_string="postgresql+psycopg2://postgres:postgres@localhost:5432/postgres")
    llm = OpenAILLM(api_key="sk-...", model_name="gpt-4o")
    embedder = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

    client = FetchtableClient(store, llm, embedder)
    
    # 2. Start the OS File Watcher Daemon
    directories_to_watch = ["/Users/you/Documents/Important"]
    ignore_patterns = [".git", "node_modules", ".DS_Store"]
    
    await client.ingest_local_os(
        directories=directories_to_watch,
        ignore_patterns=ignore_patterns,
        sync_interval=3600 # Sync every hour
    )

    # 3. Dynamic N-ary Probing Search
    query = "What is the capital of France?"
    results = await client.search(
        query=query, 
        search_type="dynamic", # Uses the LLM recursively to find the cutoff point
        max_chunks=20 
    )

    print(results)
    
    # 4. Cleanup
    await client.shutdown()

if __name__ == "__main__":
    asyncio.run(main())

Architecture Design

Fetchtable explicitly relies on Dependency Injection. The base-level FetchtableClient does not know how to interact with psycopg2 or openai; it depends on Pydantic abstract base classes (BaseVectorStore, BaseLLM, BaseEmbeddingModel). You can build your own SQLiteVectorStore by subclassing BaseVectorStore and injecting it, and Fetchtable's N-ary search and OS Indexer will keep working unchanged.
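
The injection pattern can be sketched as a standalone illustration. The ABC below mirrors the described BaseVectorStore contract, but the method names (add, query) are hypothetical stand-ins, not Fetchtable's real interface:

```python
from abc import ABC, abstractmethod


class BaseVectorStore(ABC):
    """Stand-in for the abstract contract a client depends on."""

    @abstractmethod
    def add(self, ids, vectors): ...

    @abstractmethod
    def query(self, vector, top_k): ...


class InMemoryVectorStore(BaseVectorStore):
    """Toy drop-in store ranking by dot-product similarity."""

    def __init__(self):
        self._store = {}

    def add(self, ids, vectors):
        self._store.update(zip(ids, vectors))

    def query(self, vector, top_k=3):
        def score(v):
            return sum(a * b for a, b in zip(vector, v))

        ranked = sorted(self._store.items(),
                        key=lambda kv: score(kv[1]), reverse=True)
        return [doc_id for doc_id, _ in ranked[:top_k]]


# Any BaseVectorStore subclass can be injected without the
# consuming client knowing which backend it is talking to.
store = InMemoryVectorStore()
store.add(["a", "b"], [[1.0, 0.0], [0.0, 1.0]])
print(store.query([0.9, 0.1], top_k=1))  # nearest to "a"
```

Because the client only touches the abstract interface, swapping this toy store for a PGVector- or SQLite-backed one is a one-line change at construction time.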
