PseudoForge v0.1.4

This release is a major update to deterministic cleanup quality, focused on large Windows kernel decompilation corpora. It expands layout provenance, domain-aware identity recovery, corpus quality reporting, replay planning, and fail-closed validation, while keeping IDB writes limited to explicitly selected, validator-gated renames.

Compared with v0.1.3, this release includes 239 commits across 107 files.

Highlights

Added trusted temp-base provenance reporting for decompiler temporary layout bases.
Added Windows kernel domain identity packs for common subsystem-specific structures and roles.
Added corpus quality, replay planning, quality comparison, and cleanup integrity tooling.
Expanded field layout hints, layout rewrite previews, blocker queues, and provenance comments.
Improved NTSTATUS/status literal cleanup and residue reporting.
Hardened LLM rename handling, stale fallback cleanup, and risky rename suppression.
Updated README and release validation documentation for the new corpus-quality workflow.

Trusted Temp-Base Provenance

PseudoForge now tracks decompiler temporary layout bases through source origin, lifetime stability, merge shape, guard dominance, and mutation risk.

New evidence comments include:

inferred_offset_temp_provenance_trace
inferred_offset_trusted_temp_source
inferred_offset_temp_promotion_blocked
inferred_offset_same_family_merge_provenance
inferred_offset_call_result_parameter_dominance
inferred_offset_post_access_mutation_blocker

Trust classes now distinguish stable, review-only, and blocked candidates such as:

trusted_stable_source
trusted_stable_temp
stable_review_only
same_family_merge_review
call_result_parameter_review
call_result_temporary_review
branch_merge_blocked
reassignment_blocked
mutation_blocked
opaque_source_blocked
weak_or_unknown_source_blocked

Canonical layout rewrites remain fail-closed. Opaque call results, mixed call-result/parameter branches without dominance proof, bugcheck/debug parameters without domain identity, globals/MMIO-looking bases, array cursors, post-access writes, address-taken uses, and pointer mutations stay report-only or blocked.
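
As an illustration of the fail-closed idea, the sketch below maps the trust class names listed above to one of three gates. It is not PseudoForge's implementation: the `Gate` enum and `gate_for_trust_class` helper are hypothetical, and the assumption that the name prefix or suffix determines the gate is inferred only from the names in these notes. Anything unrecognized falls through to blocked.

```python
from enum import Enum


class Gate(Enum):
    """Hypothetical gates; the names are illustrative, not PseudoForge's."""
    REWRITE_CANDIDATE = "rewrite_candidate"
    REVIEW_ONLY = "review_only"
    BLOCKED = "blocked"


def gate_for_trust_class(trust_class: str) -> Gate:
    """Map a trust class name to a gate (hypothetical helper).

    Assumption: the naming convention visible in the release notes
    (trusted_stable_*, *_review, *_review_only, *_blocked) reflects the gate.
    """
    if trust_class.endswith("_blocked"):
        return Gate.BLOCKED
    if trust_class.endswith("_review") or trust_class.endswith("_review_only"):
        return Gate.REVIEW_ONLY
    if trust_class.startswith("trusted_stable_"):
        return Gate.REWRITE_CANDIDATE
    # Fail closed: an unknown class is never promoted.
    return Gate.BLOCKED


if __name__ == "__main__":
    for name in ("trusted_stable_temp", "same_family_merge_review",
                 "mutation_blocked", "some_future_class"):
        print(name, "->", gate_for_trust_class(name).value)
```

The last branch is the point of the sketch: a class the gate does not recognize is treated as blocked rather than trusted.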

Domain Identity Packs

Added a domain identity framework and subsystem packs for Windows kernel analysis, including:

Object Manager
Process and thread lifecycle
Process/thread notify callbacks
Token and security objects
Handle tables
Memory Manager and VAD-related flows
Registry configuration
I/O Manager
PnP and power paths
File cache and section objects
ALPC ports
ETW/WMI telemetry
Executive async patterns
Dispatcher synchronization
Security descriptors and ACLs
Image code integrity
Trap and processor state contexts

These packs improve role naming and layout evidence without turning weak identity hints into unsafe rewrites.

Layout And Structural Analysis

Expanded deterministic layout analysis with:

Field layout hints and field alias previews
Stable base source evidence
Generic base evidence and trust candidates
Subfield overlay and narrow subfield evidence
Bitfield mask and bitfield alias hints
Hot field cluster evidence
Base stability and relocation-sensitive RHS samples
Merged layout base evidence
Allocation/null merge dominance
Call-result merge equivalence
Parameter merge provenance
Bugcheck merge identity
Indexed callback table evidence
Dense structural hints

Validated layout rewrite previews are now exported as reviewable artifacts, with canonical and partial rewrite paths gated behind explicit validation.

Corpus Quality And Replay Planning

Added and expanded corpus-level tooling:

tools/pseudoforge_corpus_quality.py
tools/pseudoforge_replay_plan.py
tools/pseudoforge_quality_compare.py
tools/pseudoforge_cleanup_integrity.py

The quality report now tracks residue metrics, source identity blockers, layout blocker queues, preview validation state, temp-base provenance, source origins, branch merge shapes, dominance state, and blocked/review-only candidates.

Replay planning can rank high-value functions and emit focused EA (effective address) lists for targeted no-LLM reruns.
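
A minimal sketch of what ranking and EA-list emission might look like follows. The `FunctionRecord` fields, the sort order, and the `plan_replay` helper are assumptions made for illustration; they do not describe the actual scoring or output format of `tools/pseudoforge_replay_plan.py`.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FunctionRecord:
    """Hypothetical per-function summary taken from a quality report."""
    ea: int                      # function start address
    layout_actionability: float  # assumed score, higher is more valuable
    residue_count: int           # assumed count of leftover cleanup residue


def plan_replay(records: Iterable[FunctionRecord], limit: int) -> List[str]:
    """Return the top `limit` EAs as hex strings, highest value first.

    Ties are broken by residue count and then by address, so the same
    input always produces the same plan.
    """
    ranked = sorted(
        records,
        key=lambda r: (-r.layout_actionability, -r.residue_count, r.ea),
    )
    return [f"0x{r.ea:X}" for r in ranked[:limit]]


if __name__ == "__main__":
    sample = [
        FunctionRecord(0x140001000, 0.2, 5),
        FunctionRecord(0x140002000, 0.9, 1),
        FunctionRecord(0x140003000, 0.9, 4),
    ]
    print("\n".join(plan_replay(sample, limit=2)))
```

The explicit tie-break on address keeps the plan deterministic, in keeping with the deterministic replay mode described in these notes.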

Status Cleanup Improvements

Expanded deterministic NTSTATUS and status-like cleanup, including:

Profiled status argument cleanup
Status alias comparison cleanup
Guard-dispatch status aliases
Logical OR status aliases
Status pointer store literals
Nested status pointer stores
Low-DWORD status carriers
Bitmask-guarded status comparisons
Small enum and debug-exception residue split queues
More detailed NTSTATUS residue review metadata
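
To make the status-literal idea concrete, the sketch below normalizes a decompiler literal to its low 32 bits and looks it up in a small table. The table holds a handful of well-known NTSTATUS values; the `status_name` helper and its shape are assumptions for illustration, not PseudoForge's cleanup logic, which also weighs context such as guards and profiled arguments.

```python
from typing import Optional

# A few well-known NTSTATUS codes (unsigned 32-bit form).
KNOWN_STATUS = {
    0x00000103: "STATUS_PENDING",
    0xC0000001: "STATUS_UNSUCCESSFUL",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000000D: "STATUS_INVALID_PARAMETER",
    0xC000009A: "STATUS_INSUFFICIENT_RESOURCES",
}


def status_name(literal: int) -> Optional[str]:
    """Return the symbolic name for a status-like literal, or None.

    Decompilers often print NTSTATUS values as negative signed integers
    or carry them in the low DWORD of a 64-bit value, so the literal is
    masked to 32 bits before lookup. Unknown values return None and
    would be left untouched.
    """
    return KNOWN_STATUS.get(literal & 0xFFFFFFFF)


if __name__ == "__main__":
    print(status_name(-1073741819))         # signed form of 0xC0000005
    print(status_name(0xFFFFFFFFC000000D))  # sign-extended 64-bit carrier
    print(status_name(0x12345678))          # not in the table
```

Returning `None` for unknown values mirrors the residue-reporting approach: anything the table cannot name is reported rather than guessed.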

Batch And Validation Workflow

Improved headless IDA and corpus validation workflows:

Deterministic IDA replay mode
LLM candidate cache replay mode
Better source-identity replay queues
Better replay scoring for layout actionability and residue saturation
Cleanup integrity QA gate
Release validation documentation for deterministic replay and corpus quality comparisons
Kernel corpus relocation/package workflow updates

LLM And Rename Safety

Improved rename handling with:

Evidence-backed dispatcher LLM rename salvage
Pascal underscore normalization
Risky unassigned LLM rename suppression
Stale LLM fallback artifact cleanup
Filtered warning artifact export
Better diagnostics for rename quality and weak candidate residue

LLM suggestions remain advisory and must pass deterministic validation.
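
The advisory-then-validate flow can be pictured with the sketch below. The specific checks (explicit selection, identifier syntax, reserved words, collisions) and the `accept_rename` helper are assumptions chosen for illustration; the notes do not spell out PseudoForge's actual validator rules.

```python
import re
from typing import Optional, Set

_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Small illustrative subset of C reserved words.
_C_RESERVED = {"int", "char", "void", "struct", "return", "if", "else",
               "for", "while"}


def accept_rename(candidate: str, existing: Set[str],
                  selected: bool) -> Optional[str]:
    """Return the candidate name if it passes every check, else None.

    Hypothetical validator: a suggestion is applied only when the user
    explicitly selected it AND it is a syntactically valid, non-reserved,
    non-colliding identifier. Any failed check rejects the rename.
    """
    if not selected:
        return None
    if not _IDENT.fullmatch(candidate):
        return None
    if candidate in _C_RESERVED or candidate in existing:
        return None
    return candidate


if __name__ == "__main__":
    names_in_scope = {"Irp", "DeviceObject"}
    for suggestion in ("IoStackLocation", "Irp", "2ndBuffer", "return"):
        print(suggestion, "->", accept_rename(suggestion, names_in_scope, True))
```

Every rejection path returns `None` without side effects, so a suggestion that fails validation simply never reaches the write step.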

Documentation

Updated documentation for:

Trusted temp-base provenance reporting
Corpus quality and replay planning workflow
Quality comparison commands
Release validation workflow
Kernel corpus package installation and runbook guidance

Compatibility Notes

Existing corpus artifacts can still be read, but new provenance metrics require rerunning analysis with this version.
Kernel Corpus data packages remain separate from PseudoForge plugin releases.
New quality and replay metrics are additive.
IDB writes remain limited to selected, validator-gated local and argument renames.
