Skip to content

Semantica v0.2.6

Choose a tag to compare

@github-actions github-actions released this 03 Feb 05:10
· 965 commits to main since this release

Semantica v0.2.6

Release Date: February 3, 2026

We're excited to announce Semantica v0.2.6, featuring major enhancements in provenance tracking, change management, and several important bug fixes!


πŸŽ‰ Highlights

Major Features

  • W3C PROV-O Compliant Provenance Tracking - Enterprise-grade lineage tracking across all 17 modules
  • Enhanced Change Management - Version control for knowledge graphs and ontologies
  • CSV Ingestion Improvements - Auto-detection and robust error handling
  • Comprehensive Test Coverage - 80-86% coverage for ingestion modules

Bug Fixes

  • Temperature compatibility for LLM providers
  • JenaStore empty graph initialization

✨ New Features & Enhancements

W3C PROV-O Compliant Provenance Tracking

PRs: #254, #246 | Contributor: @KaifAhmad1

A comprehensive provenance tracking system with W3C PROV-O compliance across all 17 Semantica modules.

Core Module:

  • ProvenanceManager for centralized tracking
  • W3C PROV-O schemas (Activity, Entity, Agent)
  • Storage backends: InMemory and SQLite
  • SHA-256 integrity verification

Module Integrations:

  • Semantic Extract, LLMs (Groq, OpenAI, HuggingFace, LiteLLM)
  • Pipeline, Context, Ingest, Embeddings
  • Graph/Vector/Triplet stores
  • Reasoning, Conflicts, Deduplication
  • Export, Parse, Normalize, Ontology, Visualization

Features:

  • Complete lineage tracking: Document β†’ Chunk β†’ Entity β†’ Relationship β†’ Graph
  • LLM tracking: tokens, costs, latency
  • Source tracking and bridge axioms for domain transformations

Compliance:

  • W3C PROV-O, FDA 21 CFR Part 11, SOX, HIPAA, TNFD

Testing:

  • 237 tests covering core functionality, all 17 module integrations, edge cases, backward compatibility

Design:

  • Opt-in with provenance=False by default
  • Zero breaking changes
  • No new dependencies

Enhanced Change Management Module

PRs: #248, #243 | Contributor: @KaifAhmad1

Enterprise-grade version control for knowledge graphs and ontologies with persistent storage and audit trails.

Core Classes:

  • TemporalVersionManager - Knowledge graph versioning
  • OntologyVersionManager - Ontology versioning
  • ChangeLogEntry - Change metadata tracking

Storage:

  • SQLite (persistent) and in-memory backends
  • Thread-safe operations

Features:

  • SHA-256 checksums for integrity
  • Detailed entity/relationship diffs
  • Structural ontology comparison
  • Email validation

Compliance:

  • HIPAA, SOX, FDA 21 CFR Part 11
  • Immutable audit trails

Testing:

  • 104 tests (100% pass)
  • Unit, integration, compliance, performance, edge cases

Performance:

  • 17.6ms for 10k entities
  • 510+ ops/sec concurrent
  • Handles 5k+ entity graphs

Migration:

  • Backward compatible
  • Simplified class names
  • Zero external dependencies

CSV Ingestion Enhancements

PR: #244 | Contributor: @saloni0318

Robust CSV parsing with auto-detection and error handling.

Features:

  • Auto-detect CSV encoding using chardet
  • Auto-detect delimiter using csv.Sniffer
  • Tolerant decoding and malformed-row handling (on_bad_lines='warn')
  • Optional chunked reading for large files
  • Metadata tracks detected values

Testing:

  • Expanded unit tests covering:
    • Multiple delimiters
    • Quoted/multiline fields
    • Header overrides
    • Chunked reading
    • NaN preservation

Comprehensive Test Coverage

TextNormalizer Tests

PR: #242 | Contributor: @ZohaibHassan16

Added focused test coverage for TextNormalizer behavior across various inputs.

Integration Test Improvements

PR: #241 | Contributor: @KaifAhmad1

  • Introduced integration test marker
  • Reduced noisy warnings in ingest tests

Ingest Unit Tests

PRs: #239, #232 | Contributor: @Mohammed2372

Comprehensive unit tests for ingestion modules (file, web, and feed ingestors).

Coverage:

  • File scanning: local/cloud (S3/GCS/Azure)
  • Web ingestion: URL/sitemap/robots.txt
  • RSS/Atom feed parsing

Testing:

  • 998 lines of test code
  • Mocked external dependencies for fast, isolated execution

Results:

  • file_ingestor: 86% coverage
  • web_ingestor: 86% coverage
  • feed_ingestor: 80% coverage

Covers happy paths, edge cases, and error handling.


πŸ› Bug Fixes

Temperature Compatibility Fix

PRs: #256, #252 | Contributors: @F0rt1s, @IGES-Institut

Fixed hardcoded temperature=0.3 that broke compatibility with models requiring specific temperature values (e.g., gpt-5-mini).

Changes:

  • Added _add_if_set helper method to BaseProvider
  • Only passes parameters when explicitly set
  • When temperature=None, parameter is omitted allowing APIs to use model defaults
  • Updated all 5 providers: OpenAI, Groq, Gemini, Ollama, DeepSeek

Impact:

  • Reduced code by ~85 lines with cleaner parameter handling
  • Comprehensive test coverage added (10 temperature tests, all passing)
  • Backward compatible - no breaking changes

JenaStore Empty Graph Bug

PRs: #257, #258 | Contributor: @ZohaibHassan16

Fixed ProcessingError: Graph not initialized when operating on empty (but initialized) graphs.

Changes:

  • Replaced implicit if not self.graph: checks with explicit if self.graph is None: validation
  • Updated 5 methods: add_triplets, get_triplets, delete_triplet, execute_sparql, serialize
  • Properly distinguishes None (uninitialized) from empty graphs (initialized with 0 triplets)

Impact:

  • Unblocks benchmarking suite
  • Enables fresh deployments
  • Improves testing workflows

πŸ“¦ Installation

pip install semantica==0.2.6

Or upgrade from a previous version:

pip install --upgrade semantica

πŸ™ Contributors

Special thanks to all contributors who made this release possible:


πŸ“š Documentation


πŸ”— Links


πŸš€ What's Next?

Stay tuned for upcoming features in future releases. Check our GitHub Issues to see what we're working on!


Full Changelog: v0.2.5...v0.2.6