CEF - Context Engineering Framework

Research-Grade ORM for LLM Context Engineering - Persist Knowledge Models, Query Context Intelligently

Community Edition Notice: This framework is designed for Developers (Rapid Prototyping) and Academics (Experimentation). It is NOT currently engineered for Enterprise Production use (see KNOWN_ISSUES.md).

Overview

CEF is an ORM for LLM context engineering - just as Hibernate abstracts relational databases for transactional data, CEF abstracts knowledge stores for LLM context.

✅ Benchmarked: in the project's test suite, the Knowledge Model retrieves 60-220% more relevant content than vector-only approaches on complex queries that require relationship reasoning (see EVALUATION_SUMMARY.md).

Target Audience

👩‍💻 Developers: Rapidly prototype LLM applications with rich context without setting up complex infrastructure.
🎓 Academics: Experiment with GraphRAG algorithms and benchmark against vector-only baselines.
🧪 Researchers: Reproducible environment for testing context engineering strategies.
🏢 Enterprise Research Pods: Deploy ephemeral, self-contained analysis environments for specific datasets (e.g., "Annual GL Analysis") without requiring permanent heavy infrastructure.

Core Capabilities

🗄️ Knowledge Model ORM - Define entities (nodes) and relationships (edges) like JPA @Entity
🔄 Dual Persistence - Graph store (relationships) + Vector store (semantics)
🔍 Intelligent Context Assembly - Relationship navigation + semantic search + keyword fallback
📦 Storage Agnostic - Pluggable backends (JGraphT, Neo4j, Postgres, Qdrant)
🔌 LLM Integration - OpenAI, Ollama, vLLM with MCP tool support
📄 Parser System - PDF, YAML, CSV, JSON with ANTLR support
☁️ Storage Adapters - FileSystem, S3/MinIO
⚡ Fully Reactive - Spring WebFlux + R2DBC
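
The "Intelligent Context Assembly" capability above combines relationship navigation, semantic search, and a keyword fallback. The following is a minimal sketch of how such a fallback chain could be ordered. The class and method names (`ContextAssemblySketch`, `assemble`) and the idea of modelling each strategy as a function from query to chunk IDs are assumptions for illustration, not CEF's actual API.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch, not CEF's API: each strategy maps a query to chunk IDs.
final class ContextAssemblySketch {

    // Runs strategies in priority order and stops once topK distinct chunks are collected.
    static List<String> assemble(String query, int topK,
                                 List<Function<String, List<String>>> strategies) {
        Set<String> collected = new LinkedHashSet<>(); // keeps insertion order, drops duplicates
        for (Function<String, List<String>> strategy : strategies) {
            for (String chunkId : strategy.apply(query)) {
                if (collected.size() >= topK) {
                    break;
                }
                collected.add(chunkId);
            }
            if (collected.size() >= topK) {
                break; // earlier strategies were sufficient, so later fallbacks are skipped
            }
        }
        return new ArrayList<>(collected);
    }

    public static void main(String[] args) {
        // Stubbed strategies standing in for graph navigation, semantic search, keyword search.
        Function<String, List<String>> graph = q -> List.of("chunk-1", "chunk-2");
        Function<String, List<String>> semantic = q -> List.of("chunk-2", "chunk-3");
        Function<String, List<String>> keyword = q -> List.of("chunk-4");
        // Prints [chunk-1, chunk-2, chunk-3]; the keyword fallback is never consulted.
        System.out.println(assemble("patients with diabetes", 3, List.of(graph, semantic, keyword)));
    }
}
```

In this sketch the keyword strategy only runs when the first two strategies return fewer than `topK` distinct chunks, which is one plausible reading of "fallback".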

Author: Mahmudur R Manna (mrmanna) - Founder and Principal Architect of DDSE
Organization: DDSE Foundation (Decision-Driven Software Engineering)
Date: 2024

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Application Layer                         │
│          (Define Knowledge Models: Entities & Relations)     │
└─────────────────────────────────────────────────────────────┘
                             │
                 ┌───────────┴───────────┐
                 │    ORM Interface       │
                 │  1. KnowledgeIndexer   │  (like EntityManager)
                 │  2. KnowledgeRetriever │  (like Repository)
                 └────────────────────────┘
                             │
┌─────────────────────────────────────────────────────────────┐
│                  CEF ORM Engine                              │
│  • Knowledge Model Manager                                   │
│  • Relationship Navigator (Graph reasoning)                  │
│  • Context Assembler (Multi-strategy)                        │
│  • Parser System (Domain transformation)                     │
│  • DataSource Adapters (FileSystem, S3/MinIO)               │
│  • Dual Persistence Coordinator                              │
└─────────────────────────────────────────────────────────────┘
                             │
┌─────────────────────────────────────────────────────────────┐
│                   Storage Layer                              │
│  Graph Store: Node, Edge, RelationType (relationships)       │
│  Vector Store: Chunk with embeddings (semantic context)      │
│  Backends: DuckDB, PostgreSQL, Neo4j, Qdrant                 │
└─────────────────────────────────────────────────────────────┘
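
The storage layer above pairs a graph store with a vector store of embedded chunks. As a rough illustration of the vector half, the sketch below ranks chunks by cosine similarity to a query embedding. It assumes cosine similarity as the metric and uses hypothetical names (`VectorSearchSketch`, `Chunk`, `topK`); the actual CEF `VectorStore` interface and its backends may work differently.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of in-memory semantic search; not CEF's VectorStore.
final class VectorSearchSketch {

    // Stand-in for a stored chunk with its embedding vector.
    record Chunk(String id, double[] embedding) {}

    // Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Embedding dimensions differ");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the IDs of the k chunks most similar to the query, best first.
    static List<String> topK(double[] query, List<Chunk> chunks, int k) {
        return chunks.stream()
            .sorted(Comparator.comparingDouble((Chunk c) -> cosine(query, c.embedding())).reversed())
            .limit(k)
            .map(Chunk::id)
            .toList();
    }
}
```

A linear scan like this is only practical for small collections such as the benchmark's 177 chunks; dedicated vector backends use indexes instead.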

Quick Start

Prerequisites

Java 17+
Maven 3.8+
Docker & Docker Compose

1. Clone and Build

git clone <repository-url>
cd cef
mvn clean install

2. Start Infrastructure

# Default: Only Ollama (DuckDB embedded, no external DB needed)
docker-compose up -d

# With PostgreSQL (optional - demonstrates agnosticism)
docker-compose --profile postgres up -d

# With MinIO (optional - demonstrates blob storage)
docker-compose --profile minio up -d

# All services
docker-compose --profile postgres --profile minio up -d

3. Run Framework Tests

# Run comprehensive test suite with benchmarks
cd cef-framework
mvn test

# View benchmark results
cat target/surefire-reports/org.ddse.ml.cef.benchmark.MedicalBenchmarkTest.txt

4. Access Services

Ollama: http://localhost:11434/api/tags
MinIO Console (if enabled): http://localhost:9001
PostgreSQL (if enabled): localhost:5432
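
To confirm from Java that Ollama is reachable, a small check against the `/api/tags` URL listed above may help. This is a sketch using only the JDK's `java.net.http` client; the class name `OllamaHealthCheck` and the 5-second timeout are arbitrary choices, not part of CEF.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper, not part of CEF: pings the Ollama endpoint from "Access Services".
final class OllamaHealthCheck {

    // Builds a GET request for <baseUrl>/api/tags with a short timeout.
    static HttpRequest tagsRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/tags"))
            .timeout(Duration.ofSeconds(5))
            .GET()
            .build();
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        try {
            HttpResponse<String> response =
                client.send(tagsRequest("http://localhost:11434"), HttpResponse.BodyHandlers.ofString());
            System.out.println("Ollama responded with HTTP " + response.statusCode());
        } catch (IOException e) {
            // Connection refused or timeout: the container is probably not running yet.
            System.out.println("Ollama not reachable: " + e.getMessage());
        }
    }
}
```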

Project Structure

cef/
├── cef-framework/          # Core framework (JAR library)
│   ├── src/main/java/      # ORM implementation
│   │   └── org/ddse/ml/cef/
│   │       ├── domain/     # Node, Edge, Chunk, RelationType
│   │       ├── api/        # KnowledgeIndexer, KnowledgeRetriever
│   │       ├── storage/    # GraphStore, VectorStore interfaces
│   │       ├── retriever/  # Pattern-based retrieval
│   │       └── graph/      # JGraphT integration
│   ├── src/test/java/      # Comprehensive test suite
│   │   └── org/ddse/ml/cef/
│   │       ├── benchmark/  # Performance benchmarks
│   │       ├── integration/# Medical domain tests
│   │       └── base/       # SAP financial data tests
│   └── pom.xml
│
├── docs/
│   ├── EVALUATION_SUMMARY.md   # Benchmark analysis
│   ├── benchmark_comparison.png # Performance charts
│   ├── ARCHITECTURE.md         # Technical architecture
│   └── requirements.md         # Specifications
│
├── USER_GUIDE.md           # ORM integration guide
├── RELEASE_NOTES.md        # Version beta-0.5
├── KNOWN_ISSUES.md         # Testing status
├── docker-compose.yml      # Ollama, plus optional PostgreSQL and MinIO profiles
└── pom.xml                 # Parent POM

Configuration

Default (DuckDB + Ollama)

cef:
  database:
    type: duckdb
    duckdb:
      path: ./data/cef.duckdb
  
  llm:
    default-provider: ollama
    ollama:
      base-url: http://localhost:11434
      model: llama3.2:3b

Note: Benchmark tests use vLLM (Qwen3-Coder-30B), which requires a separate installation. See the vLLM documentation for setup.

Optional (PostgreSQL)

cef:
  database:
    type: postgresql
    postgresql:
      enabled: true
      host: localhost
      port: 5432
      database: cef_db
      username: cef_user
      password: cef_password

Optional (MinIO/S3)

cef:
  datasources:
    blob-storage:
      enabled: true
      endpoint: http://localhost:9000
      bucket: medical-documents
      access-key: minioadmin
      secret-key: minioadmin

Usage

1. Framework Dependency

Add to your pom.xml:

<dependency>
    <groupId>org.ddse.ml</groupId>
    <artifactId>cef-framework</artifactId>
    <version>beta-0.5</version>
</dependency>

Note: Beta release tested with DuckDB, vLLM (Qwen3-Coder-30B for generation), and Ollama (nomic-embed-text for embeddings). OpenAI integration is configured but untested. See KNOWN_ISSUES.md.

2. Define Domain Entities

// Your domain types; the framework has no knowledge of these
import java.util.UUID;

public record PatientDTO(UUID id, String name, int age, String condition) {}
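
Because the framework does not know your domain types, something has to translate them into graph terms. The sketch below shows one way a `PatientDTO` could be mapped to a generic node and a relationship edge. `NodeSketch`, `EdgeSketch`, the `"Patient"` label, and the `"HAS_CONDITION"` relation name are hypothetical stand-ins, not the framework's real `Node`/`Edge` classes or relation types.

```java
import java.util.Map;
import java.util.UUID;

// Hypothetical mapping sketch; CEF's actual Node/Edge types may have a different shape.
final class PatientMappingSketch {

    record PatientDTO(UUID id, String name, int age, String condition) {}

    // Stand-ins for graph inputs: a labelled node with properties, and a typed edge.
    record NodeSketch(UUID id, String label, Map<String, Object> properties) {}
    record EdgeSketch(UUID sourceId, UUID targetId, String relationType) {}

    // Scalar attributes become node properties.
    static NodeSketch toNode(PatientDTO patient) {
        return new NodeSketch(patient.id(), "Patient",
            Map.of("name", patient.name(), "age", patient.age()));
    }

    // The condition becomes a relationship to a separate condition node, not a property.
    static EdgeSketch toConditionEdge(PatientDTO patient, UUID conditionNodeId) {
        return new EdgeSketch(patient.id(), conditionNodeId, "HAS_CONDITION");
    }
}
```

Modelling the condition as an edge rather than a string property is what makes it reachable by relationship navigation later.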

3. Create Custom Parser

@Component
public class MedicalPdfParser extends AbstractParser<MedicalParsedData> {
    // Parse PDFs into Node/Edge/Chunk inputs
}

4. Persist Knowledge Models

@Autowired
private KnowledgeIndexer indexer;  // Like EntityManager

// Initialize ORM with relation types (like JPA entity mappings)
indexer.initialize(rootNodes, relationTypes);

// Bulk persist from data source (like StatelessSession)
IndexResult result = indexer.fullIndex(dataSource);

5. Query Context

@Autowired
private KnowledgeRetriever retriever;  // Like Repository

// Intelligent context assembly via relationship navigation
SearchResult result = retriever.retrieve(
    RetrievalRequest.builder()
        .query("Show patients with diabetes")
        .depth(2)  // Navigation depth through relationships
        .topK(10)
        .build()
);
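
The `depth(2)` setting above bounds how far retrieval navigates through relationships. A depth-limited breadth-first traversal is one common way to implement such a bound; the sketch below illustrates it on a plain adjacency map. The class `DepthLimitedTraversal` and the sample graph are assumptions for illustration and do not reflect how CEF's Relationship Navigator is actually implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of depth-bounded navigation; not CEF's implementation.
final class DepthLimitedTraversal {

    // Returns every node reachable from start within maxDepth hops, including start itself.
    static Set<String> reachable(Map<String, List<String>> adjacency, String start, int maxDepth) {
        Set<String> visited = new LinkedHashSet<>();
        visited.add(start);
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(start);
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            Deque<String> next = new ArrayDeque<>();
            for (String node : frontier) {
                for (String neighbor : adjacency.getOrDefault(node, List.of())) {
                    if (visited.add(neighbor)) { // add() is false for already-visited nodes
                        next.add(neighbor);
                    }
                }
            }
            frontier = next; // advance one hop
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = Map.of(
            "Diabetes", List.of("Patient-1", "Patient-2"),
            "Patient-1", List.of("Metformin"),
            "Patient-2", List.of("Dr. Lee"),
            "Metformin", List.of("Supplier-A"));
        // Prints [Diabetes, Patient-1, Patient-2, Metformin, Dr. Lee]; Supplier-A is 3 hops away.
        System.out.println(reachable(graph, "Diabetes", 2));
    }
}
```

Raising the depth widens the context at the cost of more traversal work, which is consistent with the latency overhead reported in the benchmark section.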

Benchmark Results: Knowledge Model Superiority

The test suite covers two domain scenarios and shows that the Knowledge Model (graph + vector) retrieves substantially more content than vector-only approaches:

Medical Domain Tests

177 nodes: 150 patients, 5 conditions, 7 medications, 15 doctors
455 edges: Patient-Condition, Patient-Medication, Patient-Doctor relationships
177 vectorized chunks: Clinical notes, condition profiles, medication profiles

Financial Domain Tests (SAP-Simulated)

Enterprise data: Vendors, materials, purchase orders, invoices
Complex relationships: Procurement workflows, financial transactions

Performance Comparison

Metric                 Vector-Only      Knowledge Model         Change
Chunks Retrieved       5 avg            9.75 avg                +95%
Latency                21.8 ms          26.0 ms                 +19.5% (overhead)
Multi-hop Queries      Limited          Full graph traversal    ✅
Structural Coverage    Semantic only    Entity relationships    ✅

Key Finding: Knowledge Model retrieves 60-220% more relevant content for complex queries requiring relationship reasoning.
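
The percentages in the table are relative changes against the vector-only baseline. The sketch below recomputes them from the rounded table values; the class name `RelativeChange` is arbitrary. With the rounded latencies the result is about +19.3%, so the reported +19.5% presumably comes from unrounded measurements.

```java
import java.util.Locale;

// Recomputes the table's relative changes from its rounded values.
final class RelativeChange {

    // Percentage change of candidate relative to baseline.
    static double percentChange(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Prints "Chunks retrieved: +95.0%"
        System.out.printf(Locale.ROOT, "Chunks retrieved: %+.1f%%%n", percentChange(5.0, 9.75));
        // Prints "Latency: +19.3%" (from the rounded 21.8 ms and 26.0 ms)
        System.out.printf(Locale.ROOT, "Latency: %+.1f%%%n", percentChange(21.8, 26.0));
    }
}
```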

See EVALUATION_SUMMARY.md for detailed analysis.

Documentation

USER_GUIDE.md - Complete ORM integration guide
EVALUATION_SUMMARY.md - Benchmark analysis (60-220% improvement reported)
RELEASE_NOTES.md - Version beta-0.5 release notes
KNOWN_ISSUES.md - Testing status and limitations
QUICKSTART.md - Get started in 5 minutes
ARCHITECTURE.md - Technical architecture
requirements.md - Detailed specifications

Technology Stack

Java 17 - Language
Spring Boot 3.3.5 - Application framework
Spring AI 1.0.0-M4 - LLM integration
Spring WebFlux - Reactive web
Spring Data R2DBC - Reactive database
JGraphT 1.5.2 - In-memory graph
ANTLR 4.13.1 - Parser generator
DuckDB 1.1.3 - Default embedded database
PostgreSQL 16 - Optional external database (with pgvector)
Apache PDFBox 3.0.3 - PDF processing

License

MIT License

See LICENSE file for full license text.

Contributing

Contributions are welcome! Please:

Test untested configurations (PostgreSQL, OpenAI, Neo4j)
Report issues with detailed logs and reproduction steps
Submit pull requests with test coverage
Review KNOWN_ISSUES.md for areas needing validation

For questions, contact DDSE Foundation at https://ddse-foundation.github.io/

Authors

Mahmudur R Manna (mrmanna) - Founder and Principal Architect, DDSE Foundation

About DDSE Foundation

This framework is developed by the DDSE Foundation (Decision-Driven Software Engineering), an open-source initiative advancing principled approaches to software architecture and engineering.
