Fix SWI-Prolog 10 stack exhaustion and improve memory management #2377
Merged
…ugh frames. Now lightweight.
dfrnt-HansKochstein
approved these changes
Feb 16, 2026
This branch addresses a class of stack overflow crashes that occur under SWI-Prolog 10 when large `transaction_object` dicts are carried through recursive Prolog call chains. It also enables correct LRU cache eviction in terminus-store and replaces the system allocator with jemalloc to reduce memory fragmentation in long-running server processes.
It also ensures that the integration tests run the optimizer between each test, for well-controlled optimization, instead of relying on the auto-optimizer, which operates between transactions with 10% probability.
With these changes, there is an approximately 15% performance improvement when running a WOQL-heavy set of integration tests, and performance is sustained over long runs thanks to the auto-optimizer, stack and memory usage improvements, tabling optimization, jemalloc, and many other changes made across the codebase to support swipl 10 and improve stability.
Background
SWI-Prolog 10 introduced stricter stack segment management. The `trie_gen_compiled/2` primitive, used internally by tabled predicates, asserts `gTop+1 <= gMax && tTop+2 <= tMax` at entry. When recursive predicates carried the full `transaction_object` (a deeply nested dict containing schema, instance, and inference graphs plus all metadata), each stack frame consumed significantly more space than necessary. This left insufficient headroom for trie operations, triggering hard crashes on databases with moderately complex schemas.
The core fix is straightforward: extract the lightweight `Schema` (a simple list of read-write objects) once at each entry point, then thread it through the recursive chain instead of the full transaction object. Wrapper predicates that extract the schema internally, such as `is_subdocument/2`, `class_predicate_type/4`, and `oneof_descriptor/3`, are replaced with their direct `schema_*` equivalents where the schema is already available.
Changes
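As a minimal sketch of the schema-threading pattern described in the Background, the following SWI-Prolog fragment extracts a schema once at an entry point and passes only that value through the recursion. Every name here (`demo_transaction/1`, `demo_count_subdocs/3`, `demo_schema_is_subdoc/2`, `demo_is_subdoc/2`, the `schema_objects` key, and the `doc/2` and `subdoc/1` terms) is hypothetical and chosen for illustration; none of it is taken from the TerminusDB sources. The sketch shows only the shape of the refactoring and does not reproduce the stack behaviour itself.

```prolog
:- use_module(library(lists)).

% Hypothetical stand-in for a large transaction object: a dict whose
% schema_objects key is assumed to hold the lightweight schema.
demo_transaction(transaction{ schema_objects: [subdoc('Address')],
                              instance_objects: [],
                              metadata: large_payload }).

% Entry point: extract Schema once, then delegate to the worker.
demo_count_subdocs(Transaction, Document, Count) :-
    get_dict(schema_objects, Transaction, Schema),
    demo_count_subdocs_(Document, Schema, Count).

% Recursive worker: only Schema travels through the recursive calls.
demo_count_subdocs_(doc(Type, Children), Schema, Count) :-
    (   demo_schema_is_subdoc(Schema, Type)
    ->  Self = 1
    ;   Self = 0
    ),
    demo_count_children_(Children, Schema, 0, ChildCount),
    Count is Self + ChildCount.

demo_count_children_([], _Schema, Count, Count).
demo_count_children_([Child|Rest], Schema, Acc0, Count) :-
    demo_count_subdocs_(Child, Schema, N),
    Acc is Acc0 + N,
    demo_count_children_(Rest, Schema, Acc, Count).

% Direct schema-based lookup, analogous in spirit to a schema_* predicate.
demo_schema_is_subdoc(Schema, Type) :-
    memberchk(subdoc(Type), Schema).

% Old-style wrapper kept for callers that only hold the transaction object.
demo_is_subdoc(Transaction, Type) :-
    get_dict(schema_objects, Transaction, Schema),
    demo_schema_is_subdoc(Schema, Type).
```

A caller would query `demo_transaction(T), demo_count_subdocs(T, doc('Person', [doc('Address', [])]), N)`. The wrapper `demo_is_subdoc/2` mirrors the idea of keeping a thin wrapper for external callers while the recursive code calls the schema-based predicate directly.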
Prolog stack pressure reduction
inference.pl: The inference chain (`infer_type`, `infer_range`, `infer_object_type`, and related predicates) previously threaded `Database` through every recursive frame. It is refactored to extract `Schema` at entry points and pass it through the entire chain. Prefix merging moved to entry points to avoid repeated computation.
json.pl: Two recursive chains are fixed. `json_assign_ids` extracts `Schema` once and delegates to `json_assign_ids_`, which carries the lightweight schema through ID generation for nested subdocuments. New `json_idgen_schema` and `get_field_values_` variants take `Schema` directly. `get_document` extracts `Schema` and `Instance` once and delegates to `get_document_` to avoid carrying the full transaction object through document retrieval.
migration.pl: `strip_nonconforming_ids` extracts `Schema` once at the entry point. The recursive `strip_nonconforming_value_` uses `schema_is_subdocument` and `schema_key_descriptor` directly, eliminating the large transaction object from `convlist` lambda closures that previously captured it on every frame.
instance.pl: The `refute_instance` validation chain threads `Schema` from the entry points (`refute_instance/2`, `refute_instance_schema/2`) through `refute_subject`, `refute_subject_1`, `refute_typed_subject`, `refute_cardinality`, `refute_cardinality_new`, `refute_object_type`, and `refute_object_type_`. Each predicate now uses direct `schema_*` calls (`schema_class_predicate_type`, `schema_oneof_descriptor`, `schema_is_abstract`, `is_schema_foreign`, etc.) instead of wrappers that re-extract the schema. Old-arity predicates are kept as thin wrappers where external callers depend on them.
schema.pl: Exports `is_schema_foreign/2`, `schema_class_predicate_type/4`, `schema_class_subsumed/3`, and `is_schema_simple_class/2` to support direct schema-based lookups from other modules.
Memory allocator
terminusdb-dylib: Replaces the system malloc with jemalloc via `tikv-jemallocator`. The default glibc allocator on Linux tends to fragment memory in long-running processes with many small allocations (common in layer cache operations). Jemalloc uses thread-local caches and size-class-based arenas that significantly reduce fragmentation. Configured with `background_threads` for asynchronous purging and `disable_initial_exec_tls` for compatibility as a dynamically loaded library. Only enabled on non-MSVC targets.
Test infrastructure
test_utils.pl: Improved `spawn_server_1` to collect stderr lines during server startup and retry on a different port when the spawned server fails to start. Previously, `server_has_no_output` threw past the `between/3` retry loop, causing flaky push/pull tests when ports were temporarily unavailable.
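The retry behaviour described for the test utilities can be illustrated with a small, hypothetical sketch. It is not the actual `spawn_server_1` code: `demo_try_start/2`, `demo_start_with_retry/3`, the shape of the error term, and the port numbers are all assumptions made for illustration, and stderr collection is left out. The point is only that wrapping the start attempt in `catch/3` and failing in the recovery goal lets execution backtrack into `between/3`, whereas an uncaught exception would propagate past the loop.

```prolog
% Hypothetical starter: ports below 8082 are treated as unavailable and
% raise an error, standing in for a spawned server that fails to start.
demo_try_start(Port, server(Port)) :-
    (   Port < 8082
    ->  throw(error(server_has_no_output(Port), _))
    ;   true
    ).

% Retry over a port range. catch/3 turns the startup exception into plain
% failure, so execution backtracks into between/3 and tries the next port.
% The cut commits to the first port that starts successfully.
demo_start_with_retry(FirstPort, LastPort, Server) :-
    between(FirstPort, LastPort, Port),
    catch(demo_try_start(Port, Server),
          error(server_has_no_output(_), _),
          fail),
    !.
```

With these definitions, `demo_start_with_retry(8080, 8085, S)` skips the two failing ports and binds `S` to `server(8082)`, while `demo_start_with_retry(8080, 8081, S)` simply fails once the range is exhausted.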