v0.3.0
Semantica v0.3.0: First Stable Release
Released: 2026-03-10 | PyPI: `pip install semantica` | Python: 3.8-3.12 | License: MIT
The first Production/Stable release of Semantica, an open-source framework for building context graphs and decision intelligence layers for AI agents. This release consolidates everything shipped across three stages: 0.3.0-alpha (2026-02-19), 0.3.0-beta (2026-03-07), and 0.3.0 stable (2026-03-10).
`pip install --upgrade semantica`
No breaking changes. All new parameters carry safe defaults. All new methods are purely additive.
Release Highlights
- Temporal Validity: `valid_from`/`valid_until` on nodes and edges; query what's active at any point in time
- Cross-Graph Navigation: link separate `ContextGraph` instances; navigate across them; survives save/load
- Weighted BFS Traversal: filter multi-hop queries by edge confidence with `min_weight`
- Decision Intelligence: full lifecycle, record → causal chain → impact analysis → precedent search → policy enforcement
- Delta Processing: SPARQL-based incremental graph diffs; only changed data flows through the pipeline
- Deduplication v2: 6.98x faster semantic dedup, 63.6% faster candidate generation
- New Export Formats: ArangoDB AQL, Apache Parquet (Spark/BigQuery/Databricks ready)
- Graph Backends: Apache AGE, PgVector, AWS Neptune, FalkorDB
- 886+ tests passing, 0 failures
Contributors
| Contributor | Areas |
|---|---|
| @KaifAhmad1 | Lead maintainer: context graph, decision intelligence, KG algorithms, semantic extraction, pipeline, provenance, bug fixes, release management |
| @ZohaibHassan16 | Deduplication v2 suite, incremental/delta processing, benchmark suite |
| @Sameer6305 | Apache AGE backend, PgVector store, Snowflake connector, Apache Arrow export |
| @tibisabau | ArangoDB AQL export, Apache Parquet export |
| @d4ndr4d3 | ResourceScheduler deadlock fix |
v0.3.0 Stable: Context Graph Feature Completeness
Shipped 2026-03-10 · All changes by @KaifAhmad1
Temporal Validity Windows
Nodes and edges now carry first-class `valid_from` / `valid_until` ISO datetime fields, stored directly on the `ContextNode` and `ContextEdge` dataclasses rather than buried in metadata.
New API:
- `add_node(valid_from=..., valid_until=...)` and `add_edge(valid_from=..., valid_until=...)`: set the validity window at creation
- `node.is_active(at_time=None)` and `edge.is_active(at_time=None)`: return `True` if live at the given time (defaults to now)
- `graph.find_active_nodes(node_type=None, at_time=None)`: filters the entire graph to active nodes only
Bug fixes:
- `is_active()` crashed with `TypeError` on tz-aware `datetime` inputs; fixed by normalising to tz-naive UTC via the new `_parse_iso_dt()` helper
- Validity fields were silently lost during serialisation; fixed across all four paths: `add_nodes()`, `add_edges()`, `to_dict()`, `from_dict()`
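The tz-aware crash comes from comparing aware and naive `datetime` objects, which Python rejects with `TypeError`. The following is a minimal sketch of the normalisation idea only, not Semantica's code: `parse_iso_dt` and the standalone `is_active` function are hypothetical stand-ins, and the inclusive bounds and "naive means UTC" convention are assumptions.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_iso_dt(value: str) -> datetime:
    """Parse an ISO 8601 string and return a tz-naive datetime in UTC."""
    dt = datetime.fromisoformat(value)
    if dt.tzinfo is not None:
        # Convert to UTC first, then drop tzinfo so later comparisons
        # never mix aware and naive datetimes.
        dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
    return dt


def is_active(valid_from: Optional[str], valid_until: Optional[str],
              at_time: Optional[datetime] = None) -> bool:
    """Return True if at_time is inside the window; None means unbounded."""
    if at_time is None:
        at_time = datetime.now(timezone.utc)
    if at_time.tzinfo is not None:
        at_time = at_time.astimezone(timezone.utc).replace(tzinfo=None)
    # Assumption: both bounds are inclusive.
    if valid_from is not None and at_time < parse_iso_dt(valid_from):
        return False
    if valid_until is not None and at_time > parse_iso_dt(valid_until):
        return False
    return True
```

Calling `is_active("2026-01-01T00:00:00+00:00", None, datetime(2026, 6, 1, tzinfo=timezone.utc))` exercises the tz-aware path that previously failed.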
Cross-Graph Navigation
Separate `ContextGraph` instances can now be linked and navigated between. Links are fully durable: they survive `save_to_file()` / `load_from_file()` and reconnect via a registry.
New API:
- `graph.graph_id`: stable UUID assigned at init; persisted to JSON
- `link_graph(other_graph, source_node_id, target_node_id)`: creates a navigable bridge; returns `link_id`
- `navigate_to(link_id)`: returns `(other_graph, target_node_id)`
- `resolve_links({graph_id: instance})`: reconnects links after load; returns count resolved
- `save_to_file()`: now writes a `links` section alongside nodes and edges
- `load_from_file()`: restores `graph_id` and populates `_unresolved_links`
Bug fix: The previous implementation auto-created marker targets as phantom `"entity"` nodes; fixed by pre-creating a `"cross_graph_link"` typed `ContextNode` before inserting the marker edge.
14 new tests in tests/context/test_cross_graph_navigation.py covering link creation, phantom-node prevention, partial registry resolution, and full save/load round-trips.
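To illustrate why a registry is needed after loading, here is a toy sketch of the link/resolve cycle. `MiniGraph` is a hypothetical class written for this example; it borrows the method names listed above but is not `ContextGraph`, and its record layout is an assumption.

```python
import json
import uuid
from typing import Dict, Optional, Tuple


class MiniGraph:
    """Toy graph holding only what is needed to illustrate durable links."""

    def __init__(self, graph_id: Optional[str] = None) -> None:
        self.graph_id = graph_id or str(uuid.uuid4())
        self.links: Dict[str, dict] = {}              # serialisable records
        self._resolved: Dict[str, "MiniGraph"] = {}   # live object references

    def link_graph(self, other: "MiniGraph", source_node_id: str,
                   target_node_id: str) -> str:
        link_id = str(uuid.uuid4())
        self.links[link_id] = {"source": source_node_id,
                               "target": target_node_id,
                               "other_graph_id": other.graph_id}
        self._resolved[link_id] = other
        return link_id

    def navigate_to(self, link_id: str) -> Tuple["MiniGraph", str]:
        if link_id not in self._resolved:
            raise KeyError(f"link {link_id} is not resolved yet")
        return self._resolved[link_id], self.links[link_id]["target"]

    def to_json(self) -> str:
        # Only ids are written; object references cannot be serialised.
        return json.dumps({"graph_id": self.graph_id, "links": self.links})

    @classmethod
    def from_json(cls, text: str) -> "MiniGraph":
        data = json.loads(text)
        graph = cls(graph_id=data["graph_id"])
        graph.links = data["links"]
        return graph

    def resolve_links(self, registry: Dict[str, "MiniGraph"]) -> int:
        """Reconnect links whose target graph is in the registry."""
        count = 0
        for link_id, record in self.links.items():
            other = registry.get(record["other_graph_id"])
            if link_id not in self._resolved and other is not None:
                self._resolved[link_id] = other
                count += 1
        return count
```

A graph restored with `from_json()` knows only the other graph's id, so `navigate_to()` raises until `resolve_links({other.graph_id: other})` has been called.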
Weighted Multi-Hop BFS Traversal
`get_neighbors()` now accepts a `min_weight` threshold to confine traversal to high-confidence causal links only. The default of `0.0` passes all edges, so the change is fully backward-compatible.
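A weight-filtered breadth-first traversal can be sketched as follows. This is an illustration under assumptions, not the `get_neighbors()` implementation: the adjacency-list shape, the function name `neighbors_within`, and the `hops` parameter are hypothetical.

```python
from collections import deque
from typing import Dict, List, Tuple

# Hypothetical adjacency list: node -> [(neighbour, edge_weight), ...]
Adjacency = Dict[str, List[Tuple[str, float]]]


def neighbors_within(adj: Adjacency, start: str, hops: int,
                     min_weight: float = 0.0) -> List[str]:
    """Return nodes reachable within `hops` steps over edges >= min_weight."""
    seen = {start}
    order: List[str] = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == hops:
            continue  # do not expand past the hop limit
        for nxt, weight in adj.get(node, []):
            # Skip low-confidence edges and already visited nodes.
            if weight < min_weight or nxt in seen:
                continue
            seen.add(nxt)
            order.append(nxt)
            queue.append((nxt, depth + 1))
    return order
```

With `min_weight=0.0` every edge passes, which mirrors the backward-compatible default described above.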
Additional Fixes in v0.3.0 Stable
- `PipelineBuilder.add_step()` return type annotation corrected from `"PipelineBuilder"` to `"PipelineStep"`
- `test_hybrid_search_performance` fixed to accumulate a true `search_times` list; threshold relaxed to `< 5.0s` for real `sentence-transformers` latency
v0.3.0-beta: Semantic Extraction, Deduplication v2, New Export Formats
Shipped 2026-03-07
Semantic Extraction Fixes by @KaifAhmad1 (PR #354, #355)
LLM Relation Extraction:
- Unmatched subjects/objects now produce a synthetic `UNKNOWN` entity instead of silently dropping the relation
- Orphaned legacy block in `_parse_relation_result` that appended every relation twice has been removed
- `extraction_method` parameter added; typed extraction paths now record `"llm_typed"` instead of `"llm"`
Reasoner Pattern Matching:
- `_match_pattern` in `reasoner.py` fully rewritten: splits patterns on `?var` placeholders, escapes only literal segments, uses backreferences for repeated variables and non-greedy `.+?` to prevent over-consumption
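The rewrite strategy described for `_match_pattern` can be sketched with the standard `re` module. This is one possible reading, not the code in `reasoner.py`: the function names, the anchoring with `^`/`$`, and the variable-name syntax are assumptions.

```python
import re
from typing import Dict, Optional

# Assumed placeholder syntax: "?" followed by an identifier.
_VAR = re.compile(r"\?([A-Za-z_]\w*)")


def compile_pattern(pattern: str) -> re.Pattern:
    """Turn 'text ?x more ?y' into an anchored regex with named groups."""
    parts = []
    seen = set()
    pos = 0
    for m in _VAR.finditer(pattern):
        # Escape only the literal text between placeholders.
        parts.append(re.escape(pattern[pos:m.start()]))
        name = m.group(1)
        if name in seen:
            # Repeated variable: it must match the same text again.
            parts.append(f"(?P={name})")
        else:
            seen.add(name)
            # Non-greedy so one variable does not swallow later literals.
            parts.append(f"(?P<{name}>.+?)")
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("^" + "".join(parts) + "$")


def match_pattern(pattern: str, fact: str) -> Optional[Dict[str, str]]:
    """Return variable bindings if the fact matches, else None."""
    m = compile_pattern(pattern).match(fact)
    return m.groupdict() if m else None
```

Because only literal segments are escaped, a pattern such as `cost(?x)` treats the parentheses as plain text while still binding `x`.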
RDF Export Aliases:
- `RDFExporter` now accepts `"ttl"`, `"nt"`, `"xml"`, `"rdf"`, and `"json-ld"` as format aliases, with zero API changes
Tests added: tests/reasoning/test_reasoner.py (4 tests), tests/semantic_extract/test_relation_extractor.py (6 tests), tests/export/test_rdf_exporter.py (8 tests)
Incremental / Delta Processing by @ZohaibHassan16, @KaifAhmad1 (PR #349)
- Native SPARQL-based diff between graph snapshots; only changed triples enter the pipeline
- `delta_mode` flag in `PipelineBuilder` for near-real-time incremental workloads
- Version snapshot management with graph URI tracking and per-snapshot metadata storage
- `prune_versions()` for automatic retention cleanup of old snapshots
Bug fixes: corrected SPARQL variable order, fixed class references, resolved duplicate dictionary keys.
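The snapshot-diff idea can be sketched in two ways. The first function uses plain Python sets as a stand-in for the store; the second builds a SPARQL 1.1 query that selects triples present in one named graph but not another. Both are illustrations under assumptions: the function names and graph URIs are hypothetical, and the queries Semantica actually issues are not shown in these notes.

```python
from typing import NamedTuple, Set, Tuple

Triple = Tuple[str, str, str]


class Delta(NamedTuple):
    added: Set[Triple]
    removed: Set[Triple]


def diff_snapshots(old: Set[Triple], new: Set[Triple]) -> Delta:
    """In-memory equivalent: only these triples need reprocessing."""
    return Delta(added=new - old, removed=old - new)


def added_triples_query(old_graph_uri: str, new_graph_uri: str) -> str:
    """SPARQL for triples in the new snapshot that the old one lacks."""
    return (
        "SELECT ?s ?p ?o WHERE {\n"
        f"  GRAPH <{new_graph_uri}> {{ ?s ?p ?o }}\n"
        f"  FILTER NOT EXISTS {{ GRAPH <{old_graph_uri}> {{ ?s ?p ?o }} }}\n"
        "}"
    )
```

Swapping the two URIs in `added_triples_query()` yields the removed triples instead.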
Deduplication v2 Suite by @ZohaibHassan16, @KaifAhmad1 (PR #338, #339, #340, #344)
Three independently opt-in tiers; legacy mode remains the default and is fully backward-compatible.
Candidate Generation v2 (PR #338):
- New `blocking_v2` and `hybrid_v2` strategies replace O(N²) pair enumeration
- Multi-key blocking with normalised token prefixes, type-aware keys, and optional phonetic (Soundex) matching
- Deterministic `max_candidates_per_entity` budgeting with stable sorting
- 63.6% faster in worst-case scenarios (0.259s → 0.094s for 100 entities)
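Blocking avoids all-pairs comparison by only pairing entities that share at least one key. The sketch below shows the general technique with type-aware token-prefix keys and a per-entity budget; it is not the `blocking_v2` implementation. The entity shape, the prefix length, and the greedy budgeting order are assumptions, and phonetic keys are omitted.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set, Tuple

Entity = Dict[str, str]  # hypothetical shape: {"id", "name", "type"}


def blocking_keys(entity: Entity, prefix_len: int = 3) -> Set[str]:
    """One key per name token: '<type>:<lowercased token prefix>'."""
    tokens = entity["name"].lower().split()
    return {f'{entity["type"]}:{tok[:prefix_len]}' for tok in tokens}


def candidate_pairs(entities: List[Entity],
                    max_candidates_per_entity: int = 5) -> List[Tuple[str, str]]:
    """Pair up entities that share a block, within a per-entity budget."""
    blocks: Dict[str, List[str]] = defaultdict(list)
    for entity in entities:
        for key in blocking_keys(entity):
            blocks[key].append(entity["id"])

    pairs: Set[Tuple[str, str]] = set()
    for ids in blocks.values():
        # Pairs are only enumerated inside a block, not across all entities.
        pairs.update(combinations(sorted(set(ids)), 2))

    used: Dict[str, int] = defaultdict(int)
    kept: List[Tuple[str, str]] = []
    for a, b in sorted(pairs):  # sorted order keeps the output deterministic
        if used[a] < max_candidates_per_entity and used[b] < max_candidates_per_entity:
            kept.append((a, b))
            used[a] += 1
            used[b] += 1
    return kept
```

Including the entity type in the key means an `ORG` named "Apple" is never paired with a `FRUIT` named "Apple".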
Two-Stage Scoring Prefilter (PR #339):
- Fast gates for type mismatch, name-length ratio, and token overlap eliminate expensive semantic scoring for obvious non-matches
- Configurable thresholds: `min_length_ratio`, `min_token_overlap_ratio`, `required_shared_token`
- 18-25% faster batch processing when enabled (`prefilter_enabled=False` by default)
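The gates can be pictured as a cheap boolean check that runs before any semantic scoring. In this sketch the parameter names `min_length_ratio` and `min_token_overlap_ratio` come from the notes above, but their default values and exact definitions are assumptions, `passes_prefilter` is a hypothetical name, and `required_shared_token` is left out.

```python
from typing import Dict

Entity = Dict[str, str]  # hypothetical shape: {"name", "type"}


def passes_prefilter(a: Entity, b: Entity,
                     min_length_ratio: float = 0.5,
                     min_token_overlap_ratio: float = 0.2) -> bool:
    """Return False for obvious non-matches so scoring can be skipped."""
    # Gate 1: different entity types never match.
    if a["type"] != b["type"]:
        return False
    # Gate 2: names of very different length are unlikely duplicates.
    len_a, len_b = len(a["name"]), len(b["name"])
    if min(len_a, len_b) / max(len_a, len_b, 1) < min_length_ratio:
        return False
    # Gate 3: require some shared tokens (ratio against the shorter name).
    tokens_a = set(a["name"].lower().split())
    tokens_b = set(b["name"].lower().split())
    if not tokens_a or not tokens_b:
        return False
    overlap = len(tokens_a & tokens_b) / min(len(tokens_a), len(tokens_b))
    return overlap >= min_token_overlap_ratio
```

Only pairs for which this returns `True` would go on to the expensive semantic comparison.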
Semantic Relationship Deduplication v2 (PR #340):
- Canonicalisation engine with predicate synonym mapping (e.g. `works_for` → `employed_by`)
- O(1) hash matching for exact canonical signatures before any semantic scoring
- Weighted scoring: 60% predicate + 40% object with explainable `semantic_match_score` in metadata
- 6.98x faster than legacy mode (83ms vs 579ms)
- `dedup_triplets()` infinite recursion bug fixed; promoted to first-class API in `methods.py`
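The two stages, exact canonical matching first and weighted scoring second, can be sketched as below. This is a simplified illustration: `difflib` string similarity stands in for semantic scoring, the synonym table has a single example entry, and the rule that subjects must match exactly is an assumption rather than documented behaviour.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical synonym table; a real mapping would be much larger.
PREDICATE_SYNONYMS: Dict[str, str] = {"works_for": "employed_by"}


def canonical(triple: Triple) -> Triple:
    """Normalise case/whitespace and map predicate synonyms to one form."""
    s, p, o = (part.strip().lower() for part in triple)
    return (s, PREDICATE_SYNONYMS.get(p, p), o)


def dedup_exact(triples: List[Triple]) -> List[Triple]:
    """Drop triples whose canonical signature was already seen (set lookup)."""
    seen = set()
    unique = []
    for t in triples:
        signature = canonical(t)
        if signature not in seen:
            seen.add(signature)
            unique.append(t)
    return unique


def match_score(t1: Triple, t2: Triple) -> float:
    """60% predicate + 40% object similarity for same-subject triples."""
    c1, c2 = canonical(t1), canonical(t2)
    if c1 == c2:
        return 1.0
    if c1[0] != c2[0]:
        return 0.0  # assumption: different subjects are never duplicates
    predicate_sim = SequenceMatcher(None, c1[1], c2[1]).ratio()
    object_sim = SequenceMatcher(None, c1[2], c2[2]).ratio()
    return 0.6 * predicate_sim + 0.4 * object_sim
```

Running `dedup_exact()` first means the scoring function is only needed for triples that differ after canonicalisation.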
Migration guide: MIGRATION_V2.md with complete examples for all v2 strategies (PR #344)
New Export Formats by @tibisabau (PR #342, #343)
ArangoDB AQL Export (PR #342):
- Full AQL `INSERT` statement generation for vertices and edges
- Configurable collection names with validation and sanitisation; batch processing (default: 1000)
- `export_arango()` convenience function; `.aql` auto-detection in the unified exporter
- 17 tests, 100% pass rate
Apache Parquet Export (PR #343):
- Columnar storage with configurable compression: snappy, gzip, brotli, zstd, lz4, none
- Explicit Apache Arrow schemas with type safety and field normalisation
- Analytics-ready: pandas, Spark, Snowflake, BigQuery, Databricks
- `export_parquet()` convenience function; `.parquet` auto-detection
- 25 tests, 100% pass rate
Beta Bug Fixes by @KaifAhmad1
Context module:
- `retrieve_decision_precedents`: entity extraction correctly gated on `use_hybrid_search=True`
- `_extract_entities_from_query`: switched to `word[0].isupper()` to capture camelCase identifiers like `CreditCard`
- Added missing `expand_context()` (BFS traversal) and `_get_decision_query()` methods
- Fixed `hybrid_retrieval`, `dynamic_context_traversal`, `multi_hop_context_assembly` for correct single-pass BFS
- Fixed `_retrieve_from_vector` fallback to prevent empty content and negative similarity scores
KG module:
- `calculate_pagerank`: added `alpha`/`max_iter` aliases; return format structured to `{"centrality": scores, "rankings": sorted_list}`
- `community_detector._to_networkx`: fixed silent edge loss when a NetworkX graph is passed directly
- Added 9 domain-specific tracking methods to `AlgorithmTrackerWithProvenance`
- Created `provenance_tracker.py` with `ProvenanceTracker`; correctly exported from `semantica.kg`
Pipeline module:
- Retry loop fixed; it now correctly iterates to `max_retries`
- Added `RecoveryAction` + `handle_failure(error, policy, retry_count)` with LINEAR, EXPONENTIAL, and FIXED backoff
- `add_step()` fixed to return the created `PipelineStep`
- `validate` added as public alias for `validate_pipeline` in `PipelineValidator`
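The three backoff policies and a retry loop that runs up to `max_retries` can be sketched as follows. This is not the pipeline module's code: the `Backoff` enum, `backoff_delay`, `run_with_retries`, and the delay formulas are hypothetical choices made for illustration.

```python
import time
from enum import Enum
from typing import Callable, TypeVar

T = TypeVar("T")


class Backoff(Enum):
    FIXED = "fixed"
    LINEAR = "linear"
    EXPONENTIAL = "exponential"


def backoff_delay(policy: Backoff, retry_count: int, base: float = 0.5) -> float:
    """Delay in seconds before retry number `retry_count` (1-based)."""
    if policy is Backoff.FIXED:
        return base
    if policy is Backoff.LINEAR:
        return base * retry_count
    return base * (2 ** (retry_count - 1))


def run_with_retries(fn: Callable[[], T], max_retries: int = 3,
                     policy: Backoff = Backoff.EXPONENTIAL, base: float = 0.5,
                     sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn once, then retry up to max_retries times on failure."""
    retry_count = 0
    while True:
        try:
            return fn()
        except Exception:
            if retry_count >= max_retries:
                raise  # budget exhausted: surface the last error
            retry_count += 1
            sleep(backoff_delay(policy, retry_count, base))
```

Passing a custom `sleep` callable (for example `delays.append`) makes the loop easy to test without real waiting.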
Other:
- Fixed `NameError`: missing `Type` import in `utils/helpers.py`
- Vector store performance threshold relaxed from `< 100ms` to `< 500ms` per decision
- Windows cp1252 encoding fixed in test files
Beta result: ~840 tests passing, 36 skipped (external services), 0 failed
v0.3.0-alpha: Foundational Features
Shipped 2026-02-19
Decision Intelligence & Agent Context by @KaifAhmad1 (PR #307, #315)
The foundational 0.3.0 feature: a complete overhaul of `semantica.context` for production-grade decision intelligence.
Full decision lifecycle:
`record_decision()` → `add_causal_relationship()` → `trace_decision_chain()` → `analyze_decision_impact()` → `analyze_decision_influence()` → `find_similar_decisions()`
`AgentContext`, a unified wrapper:
- Feature flags: `decision_tracking`, `kg_algorithms`, `graph_expansion`
- Methods: `store()`, `retrieve()`, `get_conversation_history()`, `get_statistics()`, `capture_cross_system_inputs()`
Supporting components:
- `AgentMemory`: working, conversation, and long-term memory tiers
- `PolicyEngine`: versioned policy nodes, `check_decision_rules()`, `PolicyException` model, graceful fallback without a graph store
- Hybrid precedent search: vector + structural + category similarity with configurable weights
9 critical bug fixes in PR #315: causal chain depth, None metadata, nonexistent node handling, `find_precedents()` direction, missing `from_dict()`, missing properties in `to_dict()`, UUID generation. All 71 context tests pass after the fixes.
KG Algorithms by @KaifAhmad1 (PR #292, #293)
30+ graph algorithms across 7 categories:
- Node embeddings: Node2Vec, DeepWalk, Word2Vec via `NodeEmbedder`
- Similarity: cosine, Euclidean, Manhattan, correlation via `SimilarityCalculator`
- Path finding: Dijkstra, A*, BFS, K-shortest paths via `PathFinder`
- Link prediction: preferential attachment, Jaccard, Adamic-Adar via `LinkPredictor`
- Centrality: degree, betweenness, closeness, PageRank via `CentralityAnalyzer`
- Community detection: Louvain, Leiden, label propagation via `CommunityDetector`
- Connectivity: components, bridges, density via `ConnectivityAnalyzer`
Decision embedding pipeline (PR #293):
- `DecisionEmbeddingPipeline`: semantic + structural embeddings
- `HybridSimilarityCalculator`: configurable weights (semantic: 0.7, structural: 0.3)
- Convenience API: `quick_decision()`, `find_precedents()`, `explain()`, `similar_to()`, `batch_decisions()`, `filter_decisions()`
- Performance: 0.028s per decision, 0.031s search, ~0.8KB memory per decision
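A weighted blend of semantic and structural similarity, using the 0.7 / 0.3 weights quoted above, might look like the sketch below. It assumes both similarities are cosine similarities over plain vectors, which these notes do not state; `cosine` and `hybrid_similarity` are hypothetical helper names, not the `HybridSimilarityCalculator` API.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_similarity(sem_a: Sequence[float], sem_b: Sequence[float],
                      struct_a: Sequence[float], struct_b: Sequence[float],
                      semantic_weight: float = 0.7,
                      structural_weight: float = 0.3) -> float:
    """Weighted sum of semantic and structural cosine similarity."""
    return (semantic_weight * cosine(sem_a, sem_b)
            + structural_weight * cosine(struct_a, struct_b))
```

Two decisions with identical semantic embeddings but orthogonal structural embeddings would score 0.7 under these weights.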
Graph Database Backends by @Sameer6305, @KaifAhmad1
Apache AGE (PR #311):
- `AgeStore` class with full `GraphStore` API compatibility (openCypher via SQL on PostgreSQL)
- SQL injection vulnerabilities fixed with comprehensive input validation
- `psycopg2-binary` dependency added; migration guide included
PgVector Store (PR #303):
- Native PostgreSQL vector storage using the pgvector extension
- Distance metrics: cosine, L2/Euclidean, inner product with automatic score normalisation
- HNSW and IVFFlat indexing for approximate nearest-neighbour search
- JSONB metadata with flexible filtering; connection pooling with psycopg3/psycopg2 fallback
- SQL injection protection via `psycopg_sql.SQL()`; 36+ tests with Docker integration
Infrastructure by @d4ndr4d3, @KaifAhmad1 (PR #299, #301)
ResourceScheduler Deadlock Fix:
- Root cause: nested lock acquisition in `allocate_resources()` with `threading.Lock()` deadlocked under concurrent load
- Fix: replaced with `threading.RLock()` to allow reentrant acquisition
- Added `ValidationError` when no resources can be allocated; progress tracking moved outside lock scope
- 6 regression tests for deadlock prevention
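The difference between the two lock types shows up whenever a method that holds the lock calls another method that takes the same lock. The sketch below uses a hypothetical `TinyScheduler` class, not `ResourceScheduler`, to show the nested acquisition pattern working with `threading.RLock()`; with a plain `threading.Lock()` the inner `with` would block forever.

```python
import threading


class TinyScheduler:
    """Toy scheduler whose public method re-enters its own lock."""

    def __init__(self, capacity: int) -> None:
        # RLock lets the owning thread acquire it again without blocking.
        self._lock = threading.RLock()
        self.free = capacity

    def _reserve(self, amount: int) -> None:
        with self._lock:  # second acquisition by the same thread
            if amount > self.free:
                raise ValueError("not enough resources")
            self.free -= amount

    def allocate(self, amount: int) -> int:
        with self._lock:  # first acquisition
            self._reserve(amount)
            return self.free
```

Other threads still wait their turn as usual; reentrancy only applies to the thread that already owns the lock.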
Security Configuration:
- Dependabot configured for bi-weekly security updates with manual review
- Automated security scans (Bandit, Safety, Semgrep) on schedule
- Zero auto-merge policy for security-critical packages
Test Coverage Summary
- `semantica.context`: 335 tests
- `semantica.kg`: ~430 tests
- `semantica.semantic_extract`: 70 tests (9 skipped, external LLM APIs)
- `semantica.reasoning`: 19 tests
- `semantica.pipeline`, `semantica.export`, `semantica.deduplication`: all passing
- Real-world E2E scenarios: 85 tests
- Grand total: 886+ passing, 0 failures