v0.6.0 Full LangChain support in addition to LlamaIndex, more DBs (15 PG, 4 RDF)
v0.6.0 LlamaIndex or LangChain, 15 graph (8 + ArangoDB, AGE, Cosmos, SurrealDB, Spanner, HugeGraph, TigerGraph), 4 RDF (3 + Neptune RDF), 10 vector, 3 search
-
15 total property graph databases — 8 existing LlamaIndex stores with LangChain versions
added (Neo4j, ArcadeDB, FalkorDB, Memgraph, NebulaGraph, Neptune, Neptune Analytics,
LadybugDB); 6 new LC-only stores (ArangoDB, Apache AGE, Azure Cosmos DB Gremlin, Apache
HugeGraph, SurrealDB, TigerGraph); 1 new LI-only store (Google Cloud Spanner) -
10 LangChain vector backends (Qdrant, Elasticsearch, Milvus, Weaviate, LanceDB, Chroma,
Pinecone, pgvector, OpenSearch, Neo4j vector); 3 LangChain search backends (Elasticsearch,
OpenSearch, BM25); 4 RDF/triple-store backends (Fuseki, GraphDB, Oxigraph + new Amazon
Neptune RDF with IAM SigV4 auth) -
flexible-graphrag now runs fully on LlamaIndex, fully on LangChain, or any mix — both
frameworks are first-class peers. Each pipeline stage is independently configurable:
CHUNKER_BACKEND, KG_EXTRACTOR_BACKEND, GRAPH_BACKEND, VECTOR_BACKEND, SEARCH_BACKEND,
RETRIEVAL_FUSION, LLM_PROVIDER, EMBEDDING_KIND. Note: document readers / data sources
remain LlamaIndex-based (first pipeline stage). LangChain-only graph stores auto-select
GRAPH_BACKEND=langchain. -
Retrievers for both LI and LC, with fusion support in both frameworks
(RETRIEVAL_FUSION=llamaindex uses QueryFusionRetriever; RETRIEVAL_FUSION=langchain uses
EnsembleRetriever when all stores are LC-backed). LangChain retrievers include: Synonym
Exploder (expands query terms for vector
search), pg_vector + neighborhood traversal for Neo4j (LANGCHAIN_PG_VECTOR_SEARCH,
USE_PG_NEIGHBORHOOD), and text-to-query graph QA for all LC property graph stores (generates
Cypher for Neo4j/ArcadeDB/Memgraph/FalkorDB/Ladybug/AGE, GQL for HugeGraph, SurrealQL for
SurrealDB, AQL for ArangoDB, SPARQL for all RDF stores). -
Matrix test support — run_matrix.py / run_all_profiles.py; 24+ integration test profiles
covering all PG, vector, search, RDF, and chunker combinations -
Docling OCR — DOCLING_OCR=true + DOCLING_OCR_ENGINE (auto / rapidocr / easyocr /
tesseract_cli / tesserocr / ocrmac); optional extras for easyocr, tesserocr, ocrmac -
Incremental update (add, delete, modify) end-to-end across property graph, RDF,
vector, and search databases on both LlamaIndex and LangChain backends -
scripts/cleanup.py — all 15 property graph stores have native-client cleanup; early skip
when a store stage is disabled, to improve speed; Postgres document state / datasource
config table cleanup is skipped if incremental update is turned off. -
Observability — upgraded OpenLIT + OpenInference LangChain instrumentation; both OTLP
producers (LlamaIndex via OpenLIT, LangChain via OpenInference) work simultaneously -
Docs site (zensical 0.0.40) + major doc updates — ARCHITECTURE.md (15 PG stores, 13 LLM
providers, framework backends section), per-store setup guides (Cosmos Gremlin, Neptune,
Spanner), CONFIG-PROPERTY-GRAPH, DATABASE-CONFIGURATION, UI-TAB-SEARCH, MCP-TOOLS; all
broken links fixed, README.md updated -
Per-store config isolation — each database and LLM/embedding provider has its own typed
config env var ({TYPE}_GRAPH_DB_CONFIG, {TYPE}_VECTOR_DB_CONFIG, {TYPE}_SEARCH_DB_CONFIG,
{KIND}_EMBEDDING_MODEL, etc.); per-store config takes precedence over generic fallback; no
shared config collisions across stores -
Time logging now separates out KG extraction time from graph storage time.
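
As an illustration of the per-stage backend selection described above, here is a minimal Python sketch. It is not the project's actual code. The stage variable names, the values llamaindex / langchain, and the two documented rules (LangChain-only graph stores auto-select GRAPH_BACKEND=langchain; LangChain fusion applies when all stores are LC-backed) come from these notes. The function names, the lowercase store keys, the default of "llamaindex" for unset variables, and the fallback to LlamaIndex fusion for mixed setups are assumptions made for the example.

```python
import os

# Stage variables named in the release notes.
STAGE_VARS = (
    "CHUNKER_BACKEND",
    "KG_EXTRACTOR_BACKEND",
    "GRAPH_BACKEND",
    "VECTOR_BACKEND",
    "SEARCH_BACKEND",
    "RETRIEVAL_FUSION",
)

# LangChain-only graph stores from the notes. The lowercase keys are
# hypothetical identifiers, not confirmed configuration values.
LC_ONLY_GRAPH_STORES = {
    "arangodb", "age", "cosmos_gremlin", "hugegraph", "surrealdb", "tigergraph",
}


def resolve_stage_backends(env, graph_db=None, default="llamaindex"):
    """Return a {stage variable: backend} mapping from an environment-like dict.

    The default for unset variables is an assumption of this sketch.
    """
    backends = {name: env.get(name, default).strip().lower() for name in STAGE_VARS}
    # Documented rule: LangChain-only graph stores auto-select GRAPH_BACKEND=langchain.
    if graph_db is not None and graph_db.lower() in LC_ONLY_GRAPH_STORES:
        backends["GRAPH_BACKEND"] = "langchain"
    return backends


def effective_fusion(backends):
    """Pick the fusion framework for the resolved backends.

    LangChain fusion (EnsembleRetriever) is used only when every store stage
    is LangChain-backed. Falling back to LlamaIndex fusion otherwise is an
    assumption; the notes do not say what happens in the mixed case.
    """
    store_stages = ("GRAPH_BACKEND", "VECTOR_BACKEND", "SEARCH_BACKEND")
    all_lc = all(backends[stage] == "langchain" for stage in store_stages)
    if backends["RETRIEVAL_FUSION"] == "langchain" and all_lc:
        return "langchain"   # EnsembleRetriever
    return "llamaindex"      # QueryFusionRetriever


if __name__ == "__main__":
    resolved = resolve_stage_backends(os.environ, graph_db="surrealdb")
    print(resolved, effective_fusion(resolved))
```

Passing an environment-like dict rather than reading os.environ inside the functions keeps the sketch easy to exercise with different stage combinations.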
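
The per-store config precedence noted above (a typed {TYPE}_GRAPH_DB_CONFIG style variable wins over the generic fallback) can be sketched as follows. This is a hedged illustration, not the project's implementation. It assumes the variable values are JSON strings and that the generic fallback is named {ROLE}_DB_CONFIG (for example GRAPH_DB_CONFIG); neither detail is confirmed by these notes, and load_store_config is a hypothetical helper.

```python
import json


def load_store_config(env, store_type, role, generic_var=None):
    """Resolve the config for one store from an environment-like dict.

    role is "graph", "vector", or "search". The typed variable
    ({TYPE}_{ROLE}_DB_CONFIG) takes precedence; the generic variable is
    consulted only when the typed one is unset or empty.
    Returns (name of the variable used, parsed config dict).
    """
    typed_var = f"{store_type.upper()}_{role.upper()}_DB_CONFIG"
    # Hypothetical fallback name; override it if the real one differs.
    generic_var = generic_var or f"{role.upper()}_DB_CONFIG"
    for var in (typed_var, generic_var):
        raw = env.get(var)
        if raw:
            # Assumption: config values are JSON-encoded objects.
            return var, json.loads(raw)
    return None, {}


if __name__ == "__main__":
    env = {
        "GRAPH_DB_CONFIG": '{"url": "bolt://generic:7687"}',
        "NEO4J_GRAPH_DB_CONFIG": '{"url": "bolt://neo4j:7687"}',
    }
    # Neo4j has a typed variable, so it never sees the generic settings.
    print(load_store_config(env, "neo4j", "graph"))
    # Memgraph has no typed variable here, so the generic fallback applies.
    print(load_store_config(env, "memgraph", "graph"))
```

Because each store reads its own typed variable first, switching between stores does not require rewriting a shared config value.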