PseudoForge is an IDA Pro / Hex-Rays plugin that turns noisy pseudocode into reviewable, kernel-aware cleanup artifacts.
The core direction is deterministic-first. PseudoForge does not let an LLM rewrite arbitrary code. It builds a validated CleanPlan from deterministic analysis, optional data-only rules, and optional LLM rename suggestions, then writes preview/export artifacts that can be compared against the original pseudocode. IDB writes remain limited to user-selected, validator-gated local and argument renames.
All repository documentation is written in English. Generated comments, logs, rule text, and examples should also stay ASCII-only unless a file has an explicit reason to use another character set.
The left side is raw Hex-Rays pseudocode. The right side is the PseudoForge preview.
Animated demo of the interactive IDA preview workflow:
Static preview examples:
The second preview shows no-symbol OB callback cleanup with inferred LIST_ENTRY record types and CONTAINING_RECORD-based traversal.
The third preview shows the PfKernelPattern IOCTL handler cleanup, including IRP dispatch naming, IO_STACK_LOCATION.Parameters.DeviceIoControl field rendering, SystemBuffer union alias cleanup, NTSTATUS names, and decoded CTL_CODE(...) case comments.
PseudoForge works on Hex-Rays pseudocode output. Its cleanup quality depends heavily on the quality of the initial decompilation.
Better Hex-Rays output usually produces better PseudoForge output. Type information, function prototypes, structure recovery, imported kernel symbols, PDB/type library availability, and correct calling conventions all improve deterministic matching and reduce noisy casts.
PseudoForge does not recover semantics that are completely absent from the decompiler output, and it does not treat LLM suggestions as authoritative rewrites. LLM assist remains optional and must pass deterministic validation. Preview/export artifacts are the primary output; IDB writes remain limited to explicitly selected, validator-gated rename operations.
For best results:
- Let IDA finish analysis before previewing or exporting PseudoForge output.
- Load relevant PDBs, type libraries, WDK headers, and kernel type information when available.
- Fix obviously wrong function prototypes and calling conventions before cleanup.
- Prefer symbol and type recovery over text-only cleanup.
- Review inferred structure rewrites, especially when fixed offsets are converted into semantic fields.
- Use IDA Pro 9.x or newer with Hex-Rays for the interactive plugin path.
- Copy pseudoforge.py, ida-plugin.json, and ida_pseudoforge/ into the IDA user plugin directory.
- Open a pseudocode view and run Edit/PseudoForge/Analyze current function.
- Review the generated <input>.forge section or export bundle before applying any IDB rename.
- For offline smoke testing, run:
python -B .\tools\pseudoforge_cli.py .\samples\pseudocode\NtSetSystemInformation_switch_renamed.cpp --out $env:TEMP\pseudoforge_cli_smoke
Key documentation:
- pseudoforge_implementation_status.md: current implemented scope and validation history.
- ida_pseudocode_refactor_plugin_design.md: overall product and architecture design.
- deterministic_rules_matching_engine_design.md: deterministic JSON rule engine design.
- samples/kernel_pattern_driver/README.md: WDK sample corpus for kernel-pattern analysis.
Current plugin version: 0.1.0.
The runtime version source is ida_pseudoforge/version.py. The ida-plugin.json manifest version must match it; the unit suite enforces this parity so plugin packaging and runtime reporting do not drift.
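The parity rule can be pictured with a short sketch. The `__version__` attribute and the top-level `version` manifest key are assumptions made for illustration; the actual layout of ida_pseudoforge/version.py and ida-plugin.json may differ, and the helper names are hypothetical.

```python
import json
import re


def read_runtime_version(version_py_text):
    """Extract the version string from version.py source text.

    Assumption: the file contains a line such as __version__ = "0.1.0".
    """
    match = re.search(
        r'^__version__\s*=\s*["\']([^"\']+)["\']', version_py_text, re.MULTILINE
    )
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def read_manifest_version(manifest_text):
    """Assumption: the manifest keeps the version under a top-level "version" key."""
    return json.loads(manifest_text)["version"]


def versions_match(version_py_text, manifest_text):
    """True when the runtime and manifest versions are identical strings."""
    return read_runtime_version(version_py_text) == read_manifest_version(manifest_text)
```

A unit test built on a check like this fails as soon as one file is bumped without the other.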
Ways to check the installed/current version:
python -B .\tools\pseudoforge_cli.py --version
python -B .\tools\pseudoforge_free_cli.py --version
Inside IDA, run Edit/PseudoForge/Show settings. Preview/export headers, switch outlines, aggregate .forge sections, and IDA Free CLI JSON reports also include the version.
Release packaging bumps the patch version by default, updates ida_pseudoforge/version.py, ida-plugin.json, and the current-version lines in the docs, then writes an installable zip named with the new version:
python -B .\tools\release_pseudoforge.py
The default archive path is:
release\PseudoForge-<new-version>.zip
Useful release options:
python -B .\tools\release_pseudoforge.py --dry-run
python -B .\tools\release_pseudoforge.py --bump minor
python -B .\tools\release_pseudoforge.py --version 0.2.0
python -B .\tools\release_pseudoforge.py --no-version-bump
The current implementation is an MVP+ slice. The core engine, offline CLI, deterministic rules engine, headless IDA batch path, and interactive IDA plugin load/capture/export/action workflow have been validated.
Implemented:
- Current-function Hex-Rays pseudocode capture.
- Parameter and local rename plan generation from prototypes and usage patterns.
- Rename validation for collisions, reserved words, invalid identifiers, and weak speculative names.
- Dispatcher case recovery from vX = dispatcher - constant chains.
- Chained delta temporaries can be rendered as profile-backed enum comparisons.
- Stale delta temporaries reused after large branch bodies are kept unchanged.
- Native top-level switch(dispatcher) case and single-return body extraction.
- Nested switch depth tracking so inner cases are not mixed into the top-level dispatcher.
- Cleanup label classification.
- Kernel driver semantics pass.
- NTSTATUS literal normalization in returns and status assignments.
- Profile-backed 0xC??????? NTSTATUS error literals in 4-byte local assignments and _DWORD stores.
- Deterministic LIST_ENTRY record/link/tail naming that outranks generic LLM suggestions.
- LIST_ENTRY unlink/insert-tail pattern hints.
- ERESOURCE, critical region, pool allocation, object reference, and failfast insights.
- Pool tag decoding such as 0x54465241 to ARFT.
- Profile-backed NTSTATUS, SYSTEM_INFORMATION_CLASS, and PROCESSINFOCLASS names.
- 25H2-range SYSTEM_INFORMATION_CLASS and PROCESSINFOCLASS profile coverage.
- Preview-only canonical prototypes for NtSetSystemInformation and NtSetInformationProcess.
- Recovered dispatcher output as an auxiliary switch-case outline appended after normalized original pseudocode.
- Generated pseudocode style normalization.
- Opening braces on the next line.
- Mandatory braces for if, else, for, and while.
- Standalone else.
- Guard flattening after terminating branches.
- No forced do { } while (false) conversion.
- Cleaned pseudocode preview in an IDA native custom viewer.
- Aggregate <input>.forge analysis file beside the analyzed binary.
- Export bundle for cleaned pseudocode, switch outline, rename map, flow report, and rule report.
- Action for applying selected local and argument renames to the IDB.
- IDA Output progress logging with file-backed trace logs.
- Offline CLI smoke path that does not require IDA.
- Multi-provider optional LLM rename assist inside IDA.
- Optional offline CLI LLM rename assist.
- Synchronized warning counts between .forge metadata and preview headers.
- Stable .forge path/string escaping and current-function section preview.
- Headless IDA batch analysis over .i64/.idb functions.
- WDK-based kernel driver test corpus under samples/kernel_pattern_driver.
- Deterministic rules matching engine v1.
- Data-only JSON rule pack loader.
- Builtin, project-local, and user-global rule directories.
- Regex and assignment-based rename rules.
- Semantic comment rules.
- Fail-closed validator CLI.
- Per-function rule report export.
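As an illustration of the NTSTATUS literal normalization listed above, the sketch below reinterprets a signed decompiler literal as an unsigned 32-bit value and looks it up in a name table. The table is a tiny hand-written subset, and `STATUS_NAMES`, `to_ntstatus`, and `status_name` are hypothetical names; PseudoForge itself reads its names from the status_codes.json profile and applies additional evidence gates that are not modeled here.

```python
# Tiny illustrative subset; not the real profile contents.
STATUS_NAMES = {
    0x40000000: "STATUS_OBJECT_NAME_EXISTS",
    0xC0000061: "STATUS_PRIVILEGE_NOT_HELD",
    0xC000009A: "STATUS_INSUFFICIENT_RESOURCES",
}


def to_ntstatus(value):
    """Reinterpret a possibly negative literal as an unsigned 32-bit NTSTATUS."""
    return value & 0xFFFFFFFF


def status_name(value):
    """Return the symbolic name, or None so the caller keeps the raw literal."""
    return STATUS_NAMES.get(to_ntstatus(value))


if __name__ == "__main__":
    # Hex-Rays often prints error statuses as negative decimals.
    print(hex(to_ntstatus(-1073741727)), status_name(-1073741727))
```

Returning None for unknown values mirrors the fail-closed direction of the project: an unrecognized literal stays as it was printed by the decompiler.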
Still pending:
- Full switch body reconstruction for shared and fallthrough branches.
- Ctree-identity based local variable rename tracking beyond the current session and preflight gates.
- Richer dockable side-by-side preview panel.
- Deterministic rule phase expansion for call_arg_rewrite, text_rewrite, and flow.
- Wider profile coverage from real target builds.
Detailed implementation tracking lives in pseudoforge_implementation_status.md.
Core plugin:
- Windows
- IDA Pro 9.x or newer
- Hex-Rays decompiler
- IDA-bundled Python 3
- No external Python packages for core operation
IDA Pro 7.6 or newer may be able to run PseudoForge, but that compatibility path has not been verified yet. Treat IDA 9.x as the supported requirement until older versions are tested directly.
Offline CLI:
- Validated with Python 3.11
- Standard library only
IDA Free:
- Not supported as an interactive PseudoForge plugin target.
- IDA Free does not provide the IDAPython and local Hex-Rays APIs required by the plugin actions.
- Supported only through the offline CLI workflow where copied or saved cloud-decompiled pseudocode text is processed outside IDA.
- The IDA Free CLI path does not modify an IDB and does not apply renames back into IDA.
Optional LLM rename assist:
- OpenAI-compatible /chat/completions endpoint
- OpenRouter /chat/completions endpoint
- DeepSeek OpenAI-compatible endpoint
- ChatGPT OAuth via Codex CLI, Codex CLI, Claude login via Claude CLI, or Claude CLI command bridge
- IDA configuration through Configure LLM rename assist
- Environment variables or command-line options for offline CLI
- Not required for deterministic analysis
pseudoforge.py
ida-plugin.json
ida_pseudoforge/
version.py
config.py
core/
capture.py
deterministic/
context.py
emitters.py
engine.py
loader.py
schema.py
validators.py
matchers/
regex.py
forge_store.py
kernel_api.py
kernel_rewrites.py
normalize.py
kernel_semantics.py
lvar_analysis.py
flow_recovery.py
cleanup_rewriter.py
offline_input.py
pattern_renames.py
llm_assist.py
validation.py
render.py
plan_schema.py
api_semantics.py
profiles/
loader.py
kernel_api.json
kernel_api_overrides.json
status_codes.json
process_information_class.json
system_information_class.json
rules/
builtin/
kernel_comments.json
local_renames.json
models/
base.py
cli_provider.py
model_discovery.py
openai_compatible.py
prompting.py
provider_factory.py
provider_registry.py
subprocess_utils.py
logging.py
ida/
action_registry.py
analysis_state.py
async_runner.py
plugin.py
actions.py
decompiler.py
llm_config_dialog.py
apply_changes.py
ui_preview.py
thread_helpers.py
tools/
build_kernel_api_profile.py
build_status_codes_profile.py
empty_llm_rename_provider.py
pseudoforge_cli.py
pseudoforge_free_console.py
pseudoforge_free_cli.py
pseudoforge_ida_batch.py
release_pseudoforge.py
run_pseudoforge_ida_batch.ps1
summarize_pseudoforge_ida_batch.py
validate_pseudoforge_rules.py
samples/
pseudocode/
NtSetSystemInformation_switch_renamed.cpp
kernel_pattern_driver/
tests/
test_core_engine.py
test_ida_plugin_safety.py
test_kernel_api_profile_builder.py
test_llm_cli_provider.py
test_pseudoforge_free_cli.py
test_release_pseudoforge.py
Copy the plugin entrypoint, plugin manifest, and package directory into the IDA user plugin directory:
pseudoforge.py
ida-plugin.json
ida_pseudoforge/
The common Windows user plugin directory is:
%APPDATA%\Hex-Rays\IDA Pro\plugins
PowerShell copy example:
$pluginDir = Join-Path $env:APPDATA "Hex-Rays\IDA Pro\plugins"
New-Item -ItemType Directory -Force $pluginDir | Out-Null
Copy-Item .\pseudoforge.py -Destination $pluginDir -Force
Copy-Item .\ida-plugin.json -Destination $pluginDir -Force
Copy-Item .\ida_pseudoforge -Destination $pluginDir -Recurse -Force
To confirm the IDA user directory, run this in the IDA Python console:
import ida_diskio
print(ida_diskio.get_user_idadir())
During development, a symlink or junction install is usually faster:
$pluginDir = Join-Path $env:APPDATA "Hex-Rays\IDA Pro\plugins"
New-Item -ItemType Directory -Force $pluginDir | Out-Null
New-Item -ItemType SymbolicLink -Path (Join-Path $pluginDir "pseudoforge.py") -Target (Resolve-Path .\pseudoforge.py) -Force
New-Item -ItemType SymbolicLink -Path (Join-Path $pluginDir "ida-plugin.json") -Target (Resolve-Path .\ida-plugin.json) -Force
New-Item -ItemType Junction -Path (Join-Path $pluginDir "ida_pseudoforge") -Target (Resolve-Path .\ida_pseudoforge) -Force
If symlink creation is blocked by Windows policy, use the copy method.
- Restart IDA.
- Open the target binary.
- Open the Hex-Rays pseudocode view.
- Confirm that Edit/PseudoForge is visible.
- Run an action on the target function.
Menu:
Edit/PseudoForge/
  Analyze current function
  Show current analysis result
  Analyzed functions...
  Export cleaned pseudocode
  Configure LLM rename assist
  Show settings
  Advanced/
    Apply selected renames to IDB
Pseudocode view context menu:
PseudoForge/
  Analyze current function
  Show current analysis result
  Analyzed functions...
  Export cleaned pseudocode
  Configure LLM rename assist
  Show settings
  Advanced/
    Apply selected renames to IDB
Hotkeys:
Ctrl+Alt+F Analyze current function
Ctrl+Alt+P Show current analysis result
Ctrl+Alt+Shift+P Analyzed functions...
Ctrl+Alt+Shift+F Export cleaned pseudocode
Analyze current function decompiles the current function, builds the rename plan, flow outline, cleanup classification, deterministic rule report, and warnings, then updates the function section in <input>.forge. It does not modify the IDB.
Show current analysis result opens only the cached .forge section whose function start EA matches the current pseudocode cursor. It does not decompile, invoke an LLM, run analysis, or refresh the .forge file. If the current function has not been analyzed yet, it asks the user to run Analyze current function first. Copy all and Save as... operate on that selected section.
Analyzed functions... opens a chooser built from cached .forge function-section markers. It avoids opening the full aggregate .forge as the primary UI, which keeps navigation usable after many functions have been analyzed.
Export cleaned pseudocode analyzes the current function and writes a review/audit bundle. Its main purpose is to freeze a PseudoForge result outside the IDA UI so the cleaned pseudocode, rename plan, flow report, and rule report can be shared, diffed, regression-tested, and inspected later. It writes to pseudoforge_out beside the IDB when possible and does not modify the IDB.
Advanced/Apply selected renames to IDB analyzes the function if needed, shows a rename chooser, refuses stale sessions when the current function changed, and applies only user-selected local or argument renames that pass final preflight through ida_hexrays.rename_lvar(). This path is intentionally separate from preview/export.
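The kind of preflight this action relies on can be sketched without any IDA API. The example below covers the checks named in this document (invalid identifiers, reserved words, collisions, and skipping PascalCase names that come from an LLM). The function names, the keyword subset, and the rejection strings are hypothetical and are not taken from the plugin source.

```python
import re

# Small illustrative subset of C keywords; a real list would be longer.
RESERVED_WORDS = {
    "int", "char", "return", "if", "else", "for", "while", "switch", "case", "struct",
}
IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def check_rename(new_name, existing_names, from_llm=False):
    """Return a rejection reason, or None if the rename passes these checks."""
    if not IDENTIFIER_RE.match(new_name):
        return "invalid identifier"
    if new_name in RESERVED_WORDS:
        return "reserved word"
    if new_name in existing_names:
        return "collision"
    if from_llm and new_name[0].isupper():
        return "PascalCase LLM name"
    return None


def filter_renames(candidates, existing_names, from_llm=False):
    """Keep only old->new pairs that pass; accepted names block later duplicates."""
    taken = set(existing_names)
    accepted = {}
    for old, new in candidates.items():
        if check_rename(new, taken, from_llm) is None:
            accepted[old] = new
            taken.add(new)
    return accepted
```

Only the pairs that survive a filter like this would be offered to the user, and only the ones the user selects would then be passed to the IDB rename call.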
Configure LLM rename assist stores optional LLM settings in <IDA user directory>\pseudoforge_config.json. HTTP provider API keys are stored per provider under credentials and are prompted only when an enabled provider needs a missing key.
Show settings displays the current plugin version, config path, and LLM status. API keys are masked.
- The preview first shows normalized original pseudocode.
- Functions with recovered dispatcher information append an auxiliary switch-case outline.
- The auxiliary outline summarizes nested if/else dispatcher chains as switch cases.
- Only safe single-return bodies are expanded in the outline.
- Complex shared or fallthrough bodies point back to the normalized original pseudocode instead of emitting misleading fragments.
- Native switches already present in the normalized original pseudocode are not duplicated in the auxiliary outline.
- Viewer lines use IDA color tag syntax highlighting where practical; large previews automatically fall back to plain text.
- .forge, Copy all, and Save as... output remain plain text without color tags.
- Set PSEUDOFORGE_DISABLE_PREVIEW_HIGHLIGHT=1 before launching IDA to isolate syntax-highlight issues.
- Right-click in the preview for PseudoForge/Copy all, PseudoForge/Save as..., and PseudoForge/Analyzed functions....
- PseudoForge/Analyzed functions... and the top-level Analyzed functions... action parse .forge markers and open a chooser of all analyzed sections.
- Function-section Save as... defaults to PseudoForge__<target>__<function>_0x<EA>.cpp.
- Copy all uses the Windows Clipboard API with CF_UNICODETEXT; it does not shell out or rely on a Qt clipboard.
- Clipboard status is written to %TEMP%\pseudoforge_clipboard\copy_all.log.
- Casted NTSTATUS returns such as return (unsigned int)-1073741727; can render as STATUS_PRIVILEGE_NOT_HELD.
- Status accumulator assignments such as status = 0x40000000; can render as STATUS_OBJECT_NAME_EXISTS.
- Profile-backed 0xC??????? NTSTATUS error literals in 4-byte local assignments and _DWORD stores render symbolically, for example v16 = STATUS_INSUFFICIENT_RESOURCES;.
- Wider stores keep the raw literal unless there is stronger type evidence.
- status_codes.json is generated from WDK ntstatus.h; low wait/success aliases are excluded by default except STATUS_SUCCESS and STATUS_PENDING.
- Direct return 0 becomes STATUS_SUCCESS only under strong NTSTATUS return evidence such as an explicit NTSTATUS prototype, a known signature override, or an Nt*/Zw* native API name.
- LLM-only status renames do not make status = 0 become STATUS_SUCCESS; success assignments require strong NTSTATUS context or a deterministic kernel-status accumulator.
- LIST_ENTRY walks, unlink, and insert-tail patterns can produce role-centered names such as providerRecord, providerLink, nextLink, previousLink, and tailLink.
- Deterministic kernel names outrank generic LLM suggestions.
- LLM local and argument renames prefer lowerCamel names. PascalCase LLM names are skipped because they can look like authoritative types or fields.
- DriverEntry-style setup can recover lowerCamel driverObject, registryPath, status, extension, deviceObject, deviceName, and majorIndex names without relying on LLM suggestions.
- Strong DriverEntry evidence can render the preview signature as NTSTATUS __fastcall DriverEntry(PDRIVER_OBJECT driverObject, PUNICODE_STRING registryPath) while keeping IDB writes preview-only.
- Driver dispatch table initialization can render IRP_MJ_MAXIMUM_FUNCTION, IRP_MJ_CREATE, IRP_MJ_CLOSE, and IRP_MJ_DEVICE_CONTROL.
- Driver device flags can render DO_BUFFERED_IO and DO_DEVICE_INITIALIZING, and IoCreateDevice device characteristics can render FILE_DEVICE_SECURE_OPEN.
- Unknown or vendor DEVICE_TYPE values, for example 0x8337u, stay as literals unless a trusted binary/profile source proves a standard FILE_DEVICE_* name. PseudoForge does not infer original source macro names.
- IOCTL dispatcher case constants can be annotated with exact CTL_CODE(DeviceType, Function, Method, Access) bitfield decoding, including METHOD_BUFFERED, while preserving Hex-Rays integer suffixes and without inventing original IOCTL_* macro names.
- IRP dispatch handlers can render preview signatures as NTSTATUS __fastcall Name(PDEVICE_OBJECT deviceObject, PIRP irp) once IRP completion or IoStatus evidence identifies the handler.
- No-PDB dispatch handlers can still recover deviceObject and irp when the second parameter is completed through IofCompleteRequest(...), including casted forms such as (IRP *)a2.
- IO_STACK_LOCATION index rewrites are union-arm gated. Parameters.DeviceIoControl.* is emitted only when IRP dispatch evidence and DeviceControl IoControlCode stack-index evidence are present; other IRP major-function paths keep raw indexing until their own union arm is identified.
- IRP dispatch body cleanup can render deviceObject->DeviceExtension, NTSTATUS status, and return status; without requiring DeviceControl-specific evidence.
- METHOD_BUFFERED-only DeviceControl dispatchers can render the AssociatedIrp.MasterIrp union alias as AssociatedIrp.SystemBuffer with a PVOID local type, but only when IoControlCode is proven to come from the DeviceControl stack location. Mixed methods, METHOD_NEITHER cases, or IOCTL-like switches without stack evidence keep the original union alias.
- LLM-proposed names such as ioControlCode or ioStackLocation do not force a DeviceControl union arm when the function is not an IRP dispatch path.
- Device-control dispatchers can recover deviceObject, irp, ioStackLocation, ioControlCode, outputBufferLength, and inputBufferLength from usage. The stack-location variable does not need to already be named ioStackLocation.
- IRP completion tails that set IoStatus, call IofCompleteRequest, and return status can be labeled as CompleteIrp instead of staying as unknown labels.
- Device-control display warnings suppress resolved buffered/SystemBuffer and dispatch-signature cautions once deterministic IOCTL and IRP evidence has already proved the rewrite.
- DriverEntry device-extension offset usage can produce a preview-only INFERRED_DRIVER_DEVICE_EXTENSION and field access for common initialization, cleanup, work-item, registry-path, lookaside, timer, DPC, rundown, and resource fields.
- Inferred device-extension structs do not authorize reconstructing original sizeof(...) source expressions. Allocation and whole-extension zeroing sizes remain as Hex-Rays literals unless there is direct evidence.
- DriverEntry display warnings suppress routine LLM sub-function rename guesses and redundant DeviceExtension wording once deterministic DriverEntry/device-extension evidence has been recovered.
- Function pointers resolved through MmGetSystemRoutineAddress can use WDK profile metadata when the routine string or function-pointer variable name matches a profiled API and the call arity matches. The preview keeps the indirect call form and adds a resolved indirect call comment instead of rewriting it into a direct import-style call.
- Callback registration toggles that combine process, image, thread, and object callbacks can recover deviceExtension, enable, callback status locals, OB_FLT_REGISTRATION_VERSION, and OB_OPERATION_REGISTRATION field assignments from Hex-Rays _QWORD[4] stack arrays.
- Configuration Manager registry callback probes can recover callbackContext, majorVersion, minorVersion, callbackCookie, altitudeString, registerExStatus, and registerStatus, while rendering successful CmRegisterCallback(Ex) checks with NT_SUCCESS(...).
- Memory Manager probe functions that combine MmGetSystemRoutineAddress, MmCopyMemory, MDL setup, noncached memory, and contiguous memory allocation can recover routine-name, buffer, MDL, byte-count, and physical-address locals. MmCopyMemory flags render as MM_COPY_MEMORY_PHYSICAL or MM_COPY_MEMORY_VIRTUAL.
- Zw API corpus/probe functions that exercise object, registry, token, and file calls can recover handle, status, object-attribute, timeout, IO-status, value-name, and shared info-buffer roles. Preview rendering keeps the calls intact while normalizing OBJECT_ATTRIBUTES size, OBJ_* flags, NtCurrentProcess(), NtCurrentThread(), and successful status checks.
- Confident record layout evidence can simplify offset arithmetic into preview-only CONTAINING_RECORD(...) forms.
- Known OB pre-operation callbacks simplify raw offset loads such as *(_DWORD *)(*(_QWORD *)(preOperationInfo + 32) + 4LL) and typed-array offset loads such as *(_DWORD *)(*((_QWORD *)preOperationInfo + 4) + 4LL) into typed preOperationInfo->Parameters->...OriginalDesiredAccess access.
- No-symbol OB pre-operation callbacks with a suspicious POB_PRE_OPERATION_CALLBACK second parameter can be normalized to POB_PRE_OPERATION_INFORMATION preOperationInfo when field-use evidence matches the callback information layout.
- OB pre-operation private LIST_ENTRY records and event records can receive preview-only inferred record types when allocation size, list walk shape, and field-write evidence all match. Confirmed record loops are rendered with a separate LIST_ENTRY * iterator and CONTAINING_RECORD(...).
- Identified LIST_ENTRY heads can become aliases such as providerListHead = (LIST_ENTRY *)&ExpFirmwareTableProviderListHead.
- Verified neighboring-link checks can render as RemoveEntryList(providerLink) and InsertTailList(providerListHead, newProviderLink).
- Self-linked LIST_ENTRY initialization can render as InitializeListHead(newProviderLink).
- Suspicious call targets are preserved with warning comments; uncertain targets are not replaced with different API names.
- Semantic labels such as CorruptListEntry, InvalidParameter, and Cleanup are column-zero labels.
- Duplicate semantic labels receive stable suffixes such as InvalidParameter_17.
- Safe tail-label hoisting separates error/failfast paths from normal cleanup returns.
- Flow rewrites counts dispatcher/switch recovery only. Kernel semantic substitutions are counted under Kernel semantic rewrites.
- TraceLogging and C++ template wrapper functions are not promoted to recovered switch outlines.
- Kernel rewrite patterns belong in core/kernel_rewrites.py, either in KernelRewriteRule entries or narrow helper passes. Avoid adding individual kernel patterns directly to render.py.
- Kernel rewrite rules should be gated by Kernel insights comment kind and confidence where applicable.
- WDK-backed API parameter metadata can render calls like ExAllocatePool2(0x100uLL, 0x28uLL, 0x54465241u) as ExAllocatePool2(POOL_FLAG_PAGED, 0x28uLL, POOL_TAG('A', 'R', 'F', 'T')).
- Scalar BOOLEAN arguments can render as TRUE or FALSE.
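Two of the literal decodings in the list above are plain bit arithmetic and can be sketched in a few lines. The field layout follows the WDK CTL_CODE macro, and the pool tag is read as a little-endian 32-bit value. The function names and output format are hypothetical, and the example IOCTL value is constructed for illustration from the vendor device type 0x8337 mentioned above with an assumed function number of 0x800.

```python
import struct

METHOD_NAMES = {
    0: "METHOD_BUFFERED",
    1: "METHOD_IN_DIRECT",
    2: "METHOD_OUT_DIRECT",
    3: "METHOD_NEITHER",
}
ACCESS_NAMES = {
    0: "FILE_ANY_ACCESS",
    1: "FILE_READ_ACCESS",
    2: "FILE_WRITE_ACCESS",
    3: "FILE_READ_ACCESS | FILE_WRITE_ACCESS",
}


def decode_ctl_code(code):
    """Split a 32-bit IOCTL value into (device_type, function, method, access)."""
    device_type = (code >> 16) & 0xFFFF
    access = (code >> 14) & 0x3
    function = (code >> 2) & 0xFFF
    method = code & 0x3
    return device_type, function, method, access


def format_ctl_code(code):
    """Render the decoded fields as a CTL_CODE(...) comment; device type stays numeric."""
    device_type, function, method, access = decode_ctl_code(code)
    return "CTL_CODE(0x%X, 0x%X, %s, %s)" % (
        device_type, function, METHOD_NAMES[method], ACCESS_NAMES[access]
    )


def decode_pool_tag(tag):
    """Pool tags are little-endian, so the low byte is the first character.

    Raises UnicodeDecodeError for tags that are not plain ASCII.
    """
    return struct.pack("<I", tag).decode("ascii")


if __name__ == "__main__":
    print(format_ctl_code(0x83372000))
    print(decode_pool_tag(0x54465241))
```

Keeping the device type as a number matches the rule above that unknown or vendor DEVICE_TYPE values stay as literals.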
samples/kernel_pattern_driver contains a WDM driver corpus for PseudoForge analysis regression testing. It follows the shape of the Microsoft Windows-driver-samples WDM IOCTL sample while concentrating common kernel driver call combinations into one binary.
Included patterns:
- DriverEntry, DriverUnload, IRP_MJ_CREATE, IRP_MJ_CLOSE, and IRP_MJ_DEVICE_CONTROL
- IoCreateDevice, IoCreateSymbolicLink, and METHOD_BUFFERED IOCTL validation
- ExAllocatePool2, ExFreePoolWithTag, and NPAGED_LOOKASIDE_LIST
- LIST_ENTRY event retention and variable output with FIELD_OFFSET
- FAST_MUTEX, ERESOURCE, and critical-region pairing
- PsLookupProcessByProcessId and ObDereferenceObject
- KTIMER, KDPC, and IoQueueWorkItem
- Optional process, image, and thread callback registration
- Optional ObRegisterCallbacks process object callback registration
- LIST_ENTRY-backed process whitelist/blacklist traversal with CONTAINING_RECORD
- A single-function object pre-operation callback path in PfkpObjectPreOperation, including requested-access checks and requester whitelist auto-add behavior
Build:
.\samples\kernel_pattern_driver\tools\build.ps1 -Configuration Release
Output:
samples\kernel_pattern_driver\x64\Release\PfKernelPattern.sys
samples\kernel_pattern_driver\x64\Release\PfKernelPatternTool.exe
PseudoForge reads ida_pseudoforge/profiles/kernel_api.json by default for WDK API prototypes and selected argument semantics. kernel_api_overrides.json adds private wrapper aliases and deterministic argument semantics that do not exist directly in WDK headers.
Regenerate the profile from WDK headers:
python -B .\tools\build_kernel_api_profile.py --list-versions
python -B .\tools\build_kernel_api_profile.py --version 10.0.26100.0
Inspect selected functions without writing a profile:
python -B .\tools\build_kernel_api_profile.py --version 10.0.26100.0 --header wdm.h --dry-run --function ExAllocatePool2 --function ExFreePoolWithTag
Options:
- --wdk-include-root: defaults to C:\Program Files (x86)\Windows Kits\10\Include.
- --version: WDK include version; omitted means the newest installed km include directory.
- --header: header name to parse; may be repeated.
- --directory: WDK include subdirectory; defaults to km and shared.
- --all-km-headers: parse only km\*.h for the selected WDK version.
- --out: output profile path; defaults to ida_pseudoforge/profiles/kernel_api.json.
- --function: function name to extract; may be repeated.
- --known-only: generate only functions with PseudoForge semantic overlays.
- --summary: print function/enum count summary.
- --verbose-summary: include function names in the summary.
- --dry-run: print JSON to stdout instead of writing a file.
The built-in profile is currently generated from WDK 10.0.26100.0 and includes:
- 470 headers
- 3501 function prototypes
- 1760 enums
- 8354 structures
- 19865 typedef aliases
- 58251 macros
- 93592 symbol index entries
- Semantic overlays for POOL_FLAGS, BOOLEAN, pool tag parameters, and selected resource/list/pool APIs
- Override aliases such as Obp -> Ob, Psp -> Ps, Iop -> Io, Mmp -> Mm, and Sep -> Se
- Derived argument semantics for exact Tag arguments in pool and WithTag APIs
The profile includes a symbols index for name-based lookup. Names such as NdisRegisterProtocolDriver, FltRegisterFilter, PDEVICE_OBJECT, and POOL_FLAG_PAGED can be found as functions, aliases, macros, or enum members. Private wrapper aliases are exposed as function_alias entries, so a call such as ObpReferenceObjectByHandleWithTag can use the public ObReferenceObjectByHandleWithTag prototype metadata while preserving the original call spelling.
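The alias lookup described here can be sketched as a prefix substitution followed by a membership check. This is an assumption about the mechanism: the real function_alias entries may be listed explicitly rather than derived from prefixes, and `resolve_profile_name` is a hypothetical helper.

```python
# Prefix aliases quoted in this document; the real override file may hold more.
PREFIX_ALIASES = {"Obp": "Ob", "Psp": "Ps", "Iop": "Io", "Mmp": "Mm", "Sep": "Se"}


def resolve_profile_name(call_name, profiled_functions):
    """Return the profiled public name whose metadata applies to call_name, or None.

    The caller keeps the original call spelling in the preview; only the
    prototype metadata comes from the resolved public name.
    """
    if call_name in profiled_functions:
        return call_name
    for private_prefix, public_prefix in PREFIX_ALIASES.items():
        if call_name.startswith(private_prefix):
            candidate = public_prefix + call_name[len(private_prefix):]
            if candidate in profiled_functions:
                return candidate
    return None


if __name__ == "__main__":
    profiled = {"ObReferenceObjectByHandleWithTag", "ExAllocatePool2"}
    print(resolve_profile_name("ObpReferenceObjectByHandleWithTag", profiled))
```

Requiring the substituted name to exist in the profile keeps the lookup fail-closed: a private routine with no public counterpart simply gets no metadata.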
PseudoForge stores analyzed cleaned pseudocode beside the target binary.
Example:
C:\work\a.exe
C:\work\a.forge
Rules:
- The filename keeps the input stem and changes only the extension to .forge.
- If IDA cannot provide the input file path, the IDB path is used as a fallback.
- One .forge file can contain multiple functions.
- Each function section is wrapped with // PSEUDOFORGE FUNCTION BEGIN ea=... and END markers.
- Re-analyzing the same EA replaces only that function section.
- Other function sections are preserved.
- Show current analysis result shows only the matching function section.
- Analyzed functions... lists all cached .forge sections without opening the full aggregate file first.
- The preview context-menu action PseudoForge/Analyzed functions... provides the same chooser from inside a preview window.
- Run Analyze current function to refresh the current function section.
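The path rule and the replace-one-section rule above can be sketched as follows. Only the BEGIN marker prefix is quoted in this document, so the exact marker spelling used here (hex EA formatting and the matching END line) is an assumption, and `forge_path` and `upsert_section` are hypothetical helper names.

```python
import re
from pathlib import PureWindowsPath


def forge_path(input_path):
    """Keep the input stem and change only the extension to .forge."""
    return str(PureWindowsPath(input_path).with_suffix(".forge"))


def _markers(ea):
    # Assumed marker spelling; the real format may carry more fields.
    begin = "// PSEUDOFORGE FUNCTION BEGIN ea=0x%X" % ea
    end = "// PSEUDOFORGE FUNCTION END ea=0x%X" % ea
    return begin, end


def upsert_section(forge_text, ea, body):
    """Replace the section for ea if present, otherwise append it.

    Sections that belong to other EAs are left untouched.
    """
    begin, end = _markers(ea)
    section = "%s\n%s\n%s\n" % (begin, body.rstrip("\n"), end)
    pattern = re.compile(
        re.escape(begin) + r"\n.*?" + re.escape(end) + r"\n", re.DOTALL
    )
    if pattern.search(forge_text):
        # A callable replacement avoids backslash interpretation in the body.
        return pattern.sub(lambda _match: section, forge_text, count=1)
    return forge_text + section
```

Calling `upsert_section` twice with the same EA leaves one section for that function, which is the behavior the rules describe for re-analysis.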
LLM rename assist can be configured without a separate CLI.
- Run Edit/PseudoForge/Configure LLM rename assist.
- Choose Yes for Enable PseudoForge LLM rename assist?.
- Select a provider in the read-only provider combo box.
- Enter the base URL for HTTP providers.
- Enter an API key only if the selected HTTP provider has no stored key.
- Select a model in the provider-specific read-only model combo box.
- Enter a command template for CLI providers.
- Set the timeout in seconds.
- Subsequent Analyze current function, Export cleaned pseudocode, and Apply selected renames actions use LLM rename assist when it is enabled.
API key policy:
- openai_compatible, openrouter, and deepseek_api require API keys.
- API keys are stored under provider-specific credentials, not under llm.
- Existing provider keys are reused when changing models.
- To replace a key, edit or delete the provider credential in pseudoforge_config.json, then run the configuration action again.
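A minimal sketch of the key policy above, assuming the config has already been loaded into a dictionary with the llm and credentials layout shown in the config examples in this document. `stored_api_key` and `needs_key_prompt` are hypothetical helper names.

```python
# Providers that this document says require API keys.
KEYED_PROVIDERS = {"openai_compatible", "openrouter", "deepseek_api"}


def stored_api_key(config, provider):
    """Look up credentials.<provider>.api_key; return None when missing or empty."""
    return config.get("credentials", {}).get(provider, {}).get("api_key") or None


def needs_key_prompt(config):
    """True when an enabled HTTP provider has no stored key."""
    llm = config.get("llm", {})
    provider = llm.get("provider")
    return (
        bool(llm.get("enabled"))
        and provider in KEYED_PROVIDERS
        and stored_api_key(config, provider) is None
    )


if __name__ == "__main__":
    config = {
        "llm": {"enabled": True, "provider": "openrouter"},
        "credentials": {"openai_compatible": {"api_key": "sk-..."}},
    }
    # The stored key belongs to a different provider, so a prompt is needed.
    print(needs_key_prompt(config))
```

Because keys are looked up per provider, switching models within one provider never touches the stored credential.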
Supported provider IDs:
openai_compatible
openrouter
chatgpt_oauth_via_codex_cli
codex_cli
claude_login_via_claude_cli
claude_cli
deepseek_api
Default provider settings:
| Provider | Default model | Default endpoint or command |
|---|---|---|
| openai_compatible | gpt-5-mini | https://api.openai.com/v1 |
| openrouter | openrouter/auto | https://openrouter.ai/api/v1 |
| chatgpt_oauth_via_codex_cli | gpt-5-mini | codex exec -m {model} --skip-git-repo-check --sandbox read-only --output-last-message {output_file} - |
| codex_cli | gpt-5-mini | codex exec -m {model} --skip-git-repo-check --sandbox read-only --output-last-message {output_file} - |
| claude_login_via_claude_cli | claude-sonnet-4-6 | claude -p --model {model} --permission-mode dontAsk --output-format text --no-session-persistence --tools "" |
| claude_cli | claude-sonnet-4-6 | claude -p --model {model} --permission-mode dontAsk --output-format text --no-session-persistence --tools "" |
| deepseek_api | deepseek-v4-flash | https://api.deepseek.com |
The default timeout is 60 seconds.
Model discovery:
- chatgpt_oauth_via_codex_cli and codex_cli read the Codex model catalog through codex debug models using argv-based subprocess execution, not a shell command string.
- If codex debug models fails, %USERPROFILE%\.codex\models_cache.json is used.
- Claude CLI providers use provider-specific static model lists. This is the expected path because Claude CLI does not expose a model catalog command. The static list starts with the current Claude API/Claude Code model IDs and aliases: claude-opus-4-8, claude-sonnet-4-6, and claude-haiku-4-5.
- HTTP providers query the selected base URL's /models endpoint.
- If an enabled HTTP provider has no stored key, the key prompt appears before model discovery.
- Discovery failures fall back to provider-specific static model lists.
- A custom model stored in pseudoforge_config.json is temporarily added to the combo box on the next configuration run so the current setting is not lost.
chatgpt_oauth_via_codex_cli lets IDA call Codex CLI with the ChatGPT OAuth session saved by codex login. claude_login_via_claude_cli lets IDA call Claude CLI with the Anthropic account session saved by claude auth login. PseudoForge does not implement in-IDA browser login. codex_cli and claude_cli remain generic local CLI bridges with editable command templates.
CLI command template placeholders:
{prompt_file} temporary file containing the prompt
{output_file} temporary file expected to contain the provider response
{model} selected model name
PseudoForge also sends the prompt to CLI providers over stdin. If {output_file} is present, the file is preferred; otherwise stdout is used.

CLI command templates are parsed into argv and executed with shell=False by default. On Windows, CLI provider calls and Codex model discovery request hidden child console windows so local CLI bridges such as Claude CLI do not flash a separate console during normal runs. Prefix a template with shell: or raw-shell: only when an explicitly reviewed advanced shell pipeline is required.

The default Codex, ChatGPT, and Claude templates include {model}. Old default templates that omitted {model}, used unsupported Codex CLI flags, or used the older Claude CLI template without the selected model are migrated on load; user-created custom templates are preserved.
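As an illustration of the argv-based approach, the sketch below splits a template into tokens first and substitutes placeholders afterwards. It is not the plugin's implementation: `build_argv` and `prefers_output_file` are hypothetical helpers, and the use of `shlex.split` is an assumption.

```python
import shlex
from typing import Dict, List

PLACEHOLDERS = ("prompt_file", "output_file", "model")


def build_argv(template: str, values: Dict[str, str]) -> List[str]:
    """Split the template into argv first, then substitute placeholders per token."""
    argv = []
    for token in shlex.split(template):
        for name in PLACEHOLDERS:
            token = token.replace("{" + name + "}", values.get(name, ""))
        argv.append(token)
    return argv


def prefers_output_file(template: str) -> bool:
    """The response file is preferred over stdout only when the template names it."""
    return "{output_file}" in template


template = "codex exec -m {model} --sandbox read-only --output-last-message {output_file} -"
argv = build_argv(template, {"model": "gpt-5-mini", "output_file": "out dir/last.txt"})
print(argv)
```

Because substitution happens after splitting, a value that contains spaces stays a single argv element and never passes through a shell. The resulting list could be handed to `subprocess.run(argv, input=prompt_text, text=True, shell=False)`. Note that `shlex.split` in its default POSIX mode treats backslashes inside the template itself as escapes, so Windows paths written directly into a template would need separate handling.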
Config path:
<IDA user directory>\pseudoforge_config.json
Example:
{
"llm": {
"enabled": true,
"provider": "openai_compatible",
"base_url": "https://api.openai.com/v1",
"model": "gpt-5-mini",
"timeout_seconds": 60,
"command_template": "",
"extra_headers": {}
},
"credentials": {
"openai_compatible": {
"api_key": "sk-..."
}
}
}
OpenRouter example:
{
"llm": {
"enabled": true,
"provider": "openrouter",
"base_url": "https://openrouter.ai/api/v1",
"model": "openrouter/auto",
"timeout_seconds": 60,
"command_template": "",
"extra_headers": {
"X-Title": "PseudoForge"
}
},
"credentials": {
"openrouter": {
"api_key": "sk-or-..."
}
}
}
DeepSeek API example:
{
"llm": {
"enabled": true,
"provider": "deepseek_api",
"base_url": "https://api.deepseek.com",
"model": "deepseek-v4-flash",
"timeout_seconds": 60,
"command_template": "",
"extra_headers": {}
},
"credentials": {
"deepseek_api": {
"api_key": "<deepseek-api-key>"
}
}
}
Codex CLI / ChatGPT OAuth via Codex CLI example:
{
"llm": {
"enabled": true,
"provider": "chatgpt_oauth_via_codex_cli",
"base_url": "",
"model": "gpt-5-mini",
"timeout_seconds": 120,
"command_template": "codex exec -m {model} --skip-git-repo-check --sandbox read-only --output-last-message {output_file} -",
"extra_headers": {}
},
"credentials": {}
}
Claude CLI login example:
{
"llm": {
"enabled": true,
"provider": "claude_login_via_claude_cli",
"base_url": "",
"model": "claude-sonnet-4-6",
"timeout_seconds": 120,
"command_template": "claude -p --model {model} --permission-mode dontAsk --output-format text --no-session-persistence --tools \"\"",
"extra_headers": {}
},
"credentials": {}
}
If an LLM call fails, PseudoForge falls back to the deterministic plan and records the failure in warnings. The IDB write boundary is unchanged: only user-selected, validator-gated renames can be applied.
Export is the durable artifact path for PseudoForge analysis. It is not an apply path and is not meant to rewrite the IDB. The export bundle lets reviewers compare the cleaned output against the original decompiler text, inspect why a rename or semantic cleanup appeared, archive analysis results, and build regression samples from real functions.
Export cleaned pseudocode writes:
<function>.cleaned.cpp
<function>.switch-outline.cpp
<function>.rename-map.json
<function>.flow-report.md
<function>.rule-report.json
File purposes:
- `.cleaned.cpp`: readable pseudocode with validated renames and NTSTATUS literal cleanup.
- `.switch-outline.cpp`: recovered dispatcher case values and conservative body excerpts.
- `.rename-map.json`: full `CleanPlan` JSON.
- `.flow-report.md`: dispatcher, recovered cases, cleanup labels, and warning report.
- `.rule-report.json`: deterministic rule matches, rejected emissions, load errors, and validation errors.
Caveats:
- `switch-outline.cpp` does not synthesize deep shared branches or fallthrough bodies.
- Control-flow rewrites are preview/export-only artifacts and never modify the IDB.
- The IDB receives only user-selected local or argument renames.
- Export artifacts are intended to be reviewed against the original pseudocode.
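When scripting around an export, it can help to check that the full bundle exists before review. A minimal sketch, assuming the files are named exactly `<function>` plus the suffixes listed above and sit in one directory; `expected_artifacts` and `missing_artifacts` are hypothetical helpers:

```python
from pathlib import Path
from typing import Dict, List

# Suffixes taken from the export list above.
EXPORT_SUFFIXES = (
    ".cleaned.cpp",
    ".switch-outline.cpp",
    ".rename-map.json",
    ".flow-report.md",
    ".rule-report.json",
)


def expected_artifacts(export_dir: str, function_name: str) -> Dict[str, Path]:
    """Map each export suffix to the path it should have for one function."""
    base = Path(export_dir)
    return {suffix: base / (function_name + suffix) for suffix in EXPORT_SUFFIXES}


def missing_artifacts(export_dir: str, function_name: str) -> List[str]:
    """Return the suffixes whose files are not present on disk."""
    paths = expected_artifacts(export_dir, function_name)
    return [suffix for suffix, path in paths.items() if not path.is_file()]
```

If the exporter sanitizes function names before writing files, the name passed in here would need the same treatment.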
Run the core engine outside IDA:
python -B .\tools\pseudoforge_cli.py .\samples\pseudocode\NtSetSystemInformation_switch_renamed.cpp --out $env:TEMP\pseudoforge_cli_smoke
Expected output:
PseudoForge export complete
Function: NtSetSystemInformation
Renames: <count>
Flow rewrites: <count>
cleaned_pseudocode: ...
switch_outline: ...
rename_map: ...
flow_report: ...
Use LLM rename assist with provider-specific environment variables or options:
$env:PSEUDOFORGE_OPENAI_API_KEY = "<api-key>"
$env:PSEUDOFORGE_OPENAI_MODEL = "gpt-5-mini"
python -B .\tools\pseudoforge_cli.py .\samples\pseudocode\NtSetSystemInformation_switch_renamed.cpp --llm-renames --out $env:TEMP\pseudoforge_cli_smoke
OpenRouter:
$env:PSEUDOFORGE_OPENROUTER_API_KEY = "<openrouter-api-key>"
python -B .\tools\pseudoforge_cli.py .\samples\pseudocode\NtSetSystemInformation_switch_renamed.cpp --llm-renames --llm-provider openrouter --out $env:TEMP\pseudoforge_cli_smoke
DeepSeek:
$env:PSEUDOFORGE_DEEPSEEK_API_KEY = "<deepseek-api-key>"
python -B .\tools\pseudoforge_cli.py .\samples\pseudocode\NtSetSystemInformation_switch_renamed.cpp --llm-renames --llm-provider deepseek_api --out $env:TEMP\pseudoforge_cli_smoke
Codex CLI:
codex login
python -B .\tools\pseudoforge_cli.py .\samples\pseudocode\NtSetSystemInformation_switch_renamed.cpp --llm-renames --llm-provider codex_cli --llm-timeout 120 --out $env:TEMP\pseudoforge_cli_smoke
Claude CLI login:
claude auth login
python -B .\tools\pseudoforge_cli.py .\samples\pseudocode\NtSetSystemInformation_switch_renamed.cpp --llm-renames --llm-provider claude_login_via_claude_cli --llm-timeout 120 --out $env:TEMP\pseudoforge_cli_smoke
Optional environment variables:
PSEUDOFORGE_OPENAI_API_KEY
PSEUDOFORGE_OPENAI_BASE_URL
PSEUDOFORGE_OPENAI_MODEL
PSEUDOFORGE_OPENROUTER_API_KEY
PSEUDOFORGE_OPENROUTER_BASE_URL
PSEUDOFORGE_OPENROUTER_MODEL
PSEUDOFORGE_DEEPSEEK_API_KEY
PSEUDOFORGE_DEEPSEEK_BASE_URL
PSEUDOFORGE_DEEPSEEK_MODEL
Default values:
PSEUDOFORGE_OPENAI_BASE_URL=https://api.openai.com/v1
PSEUDOFORGE_OPENAI_MODEL=gpt-5-mini
PSEUDOFORGE_OPENROUTER_BASE_URL=https://openrouter.ai/api/v1
PSEUDOFORGE_OPENROUTER_MODEL=openrouter/auto
PSEUDOFORGE_DEEPSEEK_BASE_URL=https://api.deepseek.com
PSEUDOFORGE_DEEPSEEK_MODEL=deepseek-v4-flash
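The defaulting behavior can be pictured as a lookup with fallbacks. The sketch below only illustrates that idea; `resolve_settings` is a hypothetical helper, and how the real CLI orders command-line options against environment variables is not described here.

```python
import os
from typing import Dict, Mapping, Optional

# Defaults copied from the list above; keys are the variable-name prefixes.
DEFAULTS = {
    "OPENAI": ("https://api.openai.com/v1", "gpt-5-mini"),
    "OPENROUTER": ("https://openrouter.ai/api/v1", "openrouter/auto"),
    "DEEPSEEK": ("https://api.deepseek.com", "deepseek-v4-flash"),
}


def resolve_settings(prefix: str,
                     env: Optional[Mapping[str, str]] = None) -> Dict[str, Optional[str]]:
    """Read PSEUDOFORGE_<prefix>_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    base_url, model = DEFAULTS[prefix]
    stem = "PSEUDOFORGE_" + prefix
    return {
        "api_key": env.get(stem + "_API_KEY"),  # no default: must be supplied
        "base_url": env.get(stem + "_BASE_URL", base_url),
        "model": env.get(stem + "_MODEL", model),
    }
```

Passing an explicit mapping instead of `os.environ` keeps the lookup easy to test without touching the process environment.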
LLM rename assist only adds candidate names to the deterministic rename plan. LLM output must still pass JSON parsing, confidence thresholding, and rename validation.
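The first two gates (JSON parsing and confidence thresholding) might look like the sketch below. The response shape (`renames` entries with `old`, `new`, and `confidence`), the `0.8` threshold, and the name `accept_llm_renames` are all assumptions made for illustration; the actual schema and threshold are not specified in this document.

```python
import json
from typing import Dict, List


def accept_llm_renames(raw_response: str, threshold: float = 0.8) -> List[Dict]:
    """Keep only well-formed suggestions at or above the confidence threshold."""
    try:
        payload = json.loads(raw_response)
    except json.JSONDecodeError:
        return []  # unparseable output contributes nothing to the plan
    if not isinstance(payload, dict):
        return []
    renames = payload.get("renames")
    if not isinstance(renames, list):
        return []
    accepted = []
    for item in renames:
        if not isinstance(item, dict):
            continue
        confidence = item.get("confidence")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
            continue
        if not isinstance(item.get("old"), str) or not isinstance(item.get("new"), str):
            continue
        if confidence >= threshold:
            accepted.append(item)
    return accepted
```

Anything that survives this filter is still only a candidate; it would go through rename validation next, and an empty result simply leaves the deterministic plan unchanged.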
IDA Free is not a supported interactive plugin target for PseudoForge. The interactive actions require IDAPython and local Hex-Rays pseudocode APIs, which are not available in IDA Free. Users can still copy or save a single cloud-decompiled pseudocode function and process that text outside IDA:
python -B .\tools\pseudoforge_free_cli.py .\samples\pseudocode\NtSetSystemInformation_switch_renamed.cpp --out $env:TEMP\pseudoforge_free_cli_smoke
The IDA Free CLI accepts one or more text files. Each file should contain one complete function. Leading or trailing copied text is tolerated when the function boundary is unambiguous. Multiple functions in one file fail closed with an actionable error.
Project-local deterministic rules:
python -B .\tools\pseudoforge_free_cli.py .\copied_from_ida_free.cpp --project-root . --rules .\extra_rules --out $env:TEMP\pseudoforge_free_cli_smoke
Optional offline LLM rename assist:
python -B .\tools\pseudoforge_free_cli.py .\copied_from_ida_free.cpp --llm --llm-provider claude_login_via_claude_cli --llm-timeout 120 --out $env:TEMP\pseudoforge_free_cli_llm
Project-local rules and LLM rename assist together:
New-Item -ItemType Directory -Force .\pseudoforge_rules | Out-Null
claude auth login
python -B .\tools\pseudoforge_free_cli.py .\copied_from_ida_free.cpp `
--project-root . `
--rules .\extra_rules `
--llm `
--llm-provider claude_login_via_claude_cli `
--llm-timeout 120 `
--out $env:TEMP\pseudoforge_free_rules_llm
In this mode, builtin rules, .\pseudoforge_rules\*.json, user-global rules, and --rules directories are loaded first. LLM rename suggestions are then added as optional candidates and still pass deterministic validation before they can appear in the output plan. Invalid rule packs are reported in the rule report and do not crash analysis. LLM provider failures fall back to the deterministic plan.
The default text console output prints incremental progress before long phases such as LLM-assisted plan building and artifact writing. Use --no-progress when only the final text summary is needed.
Example IDA Free CLI run with project-local rules and Claude CLI login:
The screenshot shows the current text console flow with incremental progress and a structured final status summary.
Example IDA Free result comparison. The left side is IDA Free cloud-decompiled pseudocode, and the right side is the cleaned PseudoForge offline output:
Structured console output:
python -B .\tools\pseudoforge_free_cli.py .\copied_from_ida_free.cpp --format json --out $env:TEMP\pseudoforge_free_cli_json
With --format json, stdout remains machine-readable JSON. Progress messages are written to stderr so scripts can continue parsing stdout safely.
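A script consuming this mode can capture the two streams separately. The sketch below uses a stand-in child process rather than the real CLI, and the `status` key is invented for the demonstration; the actual JSON schema is not shown in this document.

```python
import json
import subprocess
import sys
from typing import Any, List, Tuple


def run_json_tool(argv: List[str]) -> Tuple[Any, str]:
    """Run a command, parse stdout as JSON, and return stderr text separately."""
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout), completed.stderr


# Stand-in child that mimics the stdout/stderr split; it is not the PseudoForge CLI.
child = (
    "import json, sys;"
    "print('progress: writing artifacts', file=sys.stderr);"
    "print(json.dumps({'status': 'ok'}))"
)
report, progress = run_json_tool([sys.executable, "-c", child])
print(report["status"])
```

Keeping progress on stderr means `json.loads` sees only the final document, so no log filtering is needed before parsing.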
IDA Free CLI artifacts include:
- `<function>.cleaned.cpp`
- `<function>.switch-outline.cpp`
- `<function>.rename-map.json`
- `<function>.flow-report.md`
- `<function>.rule-report.json`
- `<function>.raw.cpp`
- `<function>.warnings.json`
- `<function>.raw-vs-cleaned.diff`
- `<function>.ida-free-summary.json`
- `pseudoforge-free-report.json`
IDA Free CLI limitations:
- No interactive PseudoForge menu, preview action, or apply-renames action.
- No IDB writes.
- No direct IDAPython, IDA SDK, or local Hex-Rays API access.
- Output quality depends on the copied decompiler text quality.
- Inferred structure rewrites and semantic comments still require review against the original pseudocode.
tools/pseudoforge_ida_batch.py runs inside IDA batch mode. It opens a .i64 or .idb, calls ida_hexrays.decompile() per function, analyzes through PseudoForge, appends .forge sections, and writes JSONL progress reports. The normal entrypoint is the PowerShell wrapper tools/run_pseudoforge_ida_batch.ps1.
Example:
.\tools\run_pseudoforge_ida_batch.ps1 `
-IdaPath "C:\Path\To\IDA\ida.exe" `
-IdbPath "D:\Path\To\ntoskrnl.exe.i64" `
-TargetPath "D:\Path\To\ntoskrnl.exe" `
-OutputDir "$env:TEMP\pseudoforge_ida_batch\ntoskrnl" `
-OverwriteForge
Single-function smoke:
.\tools\run_pseudoforge_ida_batch.ps1 `
-IdaPath "C:\Path\To\IDA\ida.exe" `
-IdbPath "D:\Path\To\ntoskrnl.exe.i64" `
-TargetPath "D:\Path\To\ntoskrnl.exe" `
-NameRegex "^NtSetSystemInformation$" `
-MaxFunctions 1
Wrapper options:
- `-MaxFunctions N`: analyze only the first N matching functions.
- `-NameRegex REGEX`: filter functions by name.
- `-Resume`: skip EAs already present in the existing `.forge`.
- `-OverwriteForge`: create a fresh `.forge` before append-only batch export.
- `-UpsertForge`: slower path that verifies aggregate section replacement.
- `-LlmRenames`: use saved or explicit LLM rename assist settings.
- `-NoPdb`: pass `-Opdb:off` to IDA so validation runs do not load PDB/debug symbols.
- `-NoWait`: start the IDA process and return immediately.
Summarize an existing report:
python -B .\tools\summarize_pseudoforge_ida_batch.py "$env:TEMP\pseudoforge_ida_batch\ntoskrnl\ntoskrnl.exe_<timestamp>.jsonl"
Compare raw Hex-Rays output against PseudoForge output:
.\tools\run_pseudoforge_ida_batch.ps1 `
-IdaPath "C:\Path\To\IDA\ida.exe" `
-IdbPath "D:\Path\To\ntoskrnl.exe.i64" `
-TargetPath "D:\Path\To\ntoskrnl.exe" `
-NameRegex "^NtSetSystemInformation$" `
-MaxFunctions 1 `
-CompareDir "$env:TEMP\pseudoforge_ida_batch\ntoskrnl_compare"
-CompareDir writes:
- `raw\*.cpp`: raw IDA Hex-Rays `cfunc.get_pseudocode()` text.
- `cleaned\*.cpp`: PseudoForge normalized/export pseudocode.
- `forge\*.forge`: full `.forge` section for the function.
- `diff\*.diff`: raw vs cleaned unified diff.
Each JSONL function record includes comparison paths, SHA-256 hashes, line counts, and diff line counts.
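To make the record contents concrete, the sketch below computes the same kinds of values for one raw/cleaned pair. The field names and the `comparison_record` helper are hypothetical; the batch script's actual JSONL keys are not listed in this document.

```python
import difflib
import hashlib
from typing import Dict


def comparison_record(raw_text: str, cleaned_text: str) -> Dict[str, object]:
    """Compute hashes, line counts, and unified-diff size for one function."""
    raw_lines = raw_text.splitlines()
    cleaned_lines = cleaned_text.splitlines()
    diff = list(difflib.unified_diff(raw_lines, cleaned_lines,
                                     fromfile="raw", tofile="cleaned", lineterm=""))
    return {
        "raw_sha256": hashlib.sha256(raw_text.encode("utf-8")).hexdigest(),
        "cleaned_sha256": hashlib.sha256(cleaned_text.encode("utf-8")).hexdigest(),
        "raw_lines": len(raw_lines),
        "cleaned_lines": len(cleaned_lines),
        "diff_lines": len(diff),  # zero when the two texts are identical
    }
```

Hashes make it cheap to detect that a re-run produced byte-identical output, while the diff line count gives a rough measure of how much cleanup was applied.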
To include the same LLM assist path used by interactive IDA Analyze, add -LlmRenames. Full-kernel LLM batch runs can issue many provider calls, so check cost and runtime first.
LLM wrapper overrides:
- `-LlmProvider openrouter|chatgpt_oauth_via_codex_cli|codex_cli|claude_login_via_claude_cli|claude_cli|deepseek_api|openai_compatible`
- `-LlmModel MODEL`
- `-LlmTimeout SECONDS`
- `-LlmBaseUrl URL`
- `-LlmCommand COMMAND_TEMPLATE`
- `-LlmApiKey KEY`
No-op CLI provider smoke:
$noopProvider = "python " + (Resolve-Path .\tools\empty_llm_rename_provider.py).Path
.\tools\run_pseudoforge_ida_batch.ps1 `
-IdaPath "C:\Path\To\IDA\ida.exe" `
-IdbPath "D:\Path\To\ntoskrnl.exe.i64" `
-NameRegex "^NtSetSystemInformation$" `
-MaxFunctions 1 `
-LlmRenames `
-LlmProvider codex_cli `
-LlmCommand $noopProvider
Functions that Hex-Rays cannot decompile are recorded as skipped, not as PseudoForge failures.
For unknown third-party binary validation, use -NoPdb and review the IDA log for unexpected symbol loading. The wrapper also retries once when a fresh IDA load exits with an empty report file before the batch script has produced records.
PseudoForge includes a v1 deterministic rules matching engine. The supported production scope is data-only JSON rules for rename and semantic_comment.
Rule load paths:
ida_pseudoforge/rules/builtin/*.json
.\pseudoforge_rules\*.json
%APPDATA%\PseudoForge\rules\*.json
Interactive IDA analysis resolves .\pseudoforge_rules relative to the analyzed input binary directory. Offline CLI resolves it relative to the source pseudocode file and also accepts explicit --rules-dir.
Builtin rules currently mirror low-risk deterministic hard-coded passes for report/parity visibility. They do not replace existing hard-coded rename validation, cleanup classification, flow recovery, or kernel API rewrite behavior.
Authoring workflow:
- Create `pseudoforge_rules` beside the analyzed `.idb`, binary, or pseudocode input.
- Add a rule pack JSON file, for example `project_kernel_rules.json`.
- Validate the pack before use.
- In IDA, run PseudoForge analysis normally. In the CLI, place rules beside the source input or pass `--rules-dir .\pseudoforge_rules`.
- Use `--rule-report` to inspect matched rules, rejected emissions, load errors, and validation errors.
Validation:
New-Item -ItemType Directory -Force .\pseudoforge_rules
python -B .\tools\validate_pseudoforge_rules.py .\ida_pseudoforge\rules\builtin
python -B .\tools\validate_pseudoforge_rules.py .\pseudoforge_rules
Project-local rule pack example:
{
"schema_version": 1,
"id": "project.kernel_object_rules",
"description": "Project-local PseudoForge deterministic rules for kernel object callback analysis.",
"rules": [
{
"id": "project.rename.exact_previous_mode",
"phase": "rename",
"priority": 100,
"confidence": 0.99,
"scope": {
"lvars_any": ["PreviousMode"]
},
"match": {
"text_contains": "PreviousMode"
},
"emit": {
"kind": "rename",
"rename_kind": "lvar",
"target": "PreviousMode",
"new_name": "previousMode",
"evidence": "Hex-Rays kept kernel PreviousMode casing"
}
},
{
"id": "project.rename.requester_process",
"phase": "rename",
"priority": 100,
"confidence": 0.94,
"scope": {
"calls_any": ["PsGetCurrentProcessId"]
},
"match": {
"assignment_regex": "\\b(?P<dst>[A-Za-z_][A-Za-z0-9_]*)\\s*=\\s*PsGetCurrentProcessId\\(\\)"
},
"emit": {
"kind": "rename",
"rename_kind": "lvar",
"target": "$dst",
"new_name": "requesterProcessId",
"evidence": "Local receives current process id in object callback path"
}
},
{
"id": "project.comment.object_pre_operation_callback",
"phase": "semantic_comment",
"priority": 80,
"confidence": 0.90,
"scope": {
"function_name_regex": ".*ObjectPreOperation$",
"prototype_contains": "PRE_OPERATION"
},
"match": {
"text_contains_all": ["OB_OPERATION_HANDLE_CREATE", "OriginalDesiredAccess"]
},
"emit": {
"kind": "semantic_comment",
"comment_kind": "object_pre_operation",
"text": "Object pre-operation callback checks requested process access",
"evidence": "OB create operation and OriginalDesiredAccess are present"
}
},
{
"id": "project.override.updated_status_name",
"phase": "rename",
"priority": 120,
"confidence": 0.96,
"enabled": true,
"override_of": "builtin.local.updated_status",
"scope": {
"lvars_any": ["updated"],
"text_contains": "STATUS_"
},
"match": {
"text_contains": "updated"
},
"emit": {
"kind": "rename",
"rename_kind": "lvar",
"target": "updated",
"new_name": "status",
"evidence": "Project policy treats updated as the NTSTATUS accumulator"
}
}
]
}
Authoring patterns:
- Exact local rename
  - Use `scope.lvars_any` to require the local first.
  - Use `match.text_contains` to confirm the text appears in the function.
  - Use the real Hex-Rays local name directly in `emit.target`.
- Assignment-based rename
  - Add a named capture group in `match.assignment_regex`, for example `(?P<dst>...)`.
  - Refer to the binding as `$dst` in `emit.target`.
  - Add a scope gate such as `calls_any` or `text_contains` to reduce false positives.
- Semantic comment
  - Both `phase` and `emit.kind` must be `semantic_comment`.
  - Keep `comment_kind` short and stable because later reports and rewrites can use it as a key.
  - Keep `text` and `evidence` ASCII.
- Override rule
  - Rename conflicts for the same target are resolved by `override_of`, `priority`, and `confidence`.
  - To override a builtin policy, set `override_of` to the builtin rule ID and use a higher priority.
  - The final rename still has to pass the existing validator.
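The assignment-based pattern can be tried in plain Python before it goes into a rule pack. The sketch below assumes the rule engine uses Python `re` semantics for named groups; `resolve_target` is a hypothetical helper, not the engine's code, and the pattern is modeled on the example pack's assignment rule.

```python
import re
from typing import Optional

# No trailing \b after \(\): ")" followed by ";" is not a word boundary,
# so a trailing \b would prevent the usual "x = Call();" form from matching.
ASSIGNMENT_REGEX = r"\b(?P<dst>[A-Za-z_][A-Za-z0-9_]*)\s*=\s*PsGetCurrentProcessId\(\)"


def resolve_target(pattern: str, target: str, text: str) -> Optional[str]:
    """Resolve an emit target such as "$dst" against the first regex match."""
    match = re.search(pattern, text)
    if match is None:
        return None
    if target.startswith("$"):
        # "$dst" refers to the named capture group "dst".
        return match.groupdict().get(target[1:])
    return target  # literal local name, no binding needed


line = "v12 = PsGetCurrentProcessId();"
print(resolve_target(ASSIGNMENT_REGEX, "$dst", line))
```

Testing a pattern this way against a few copied pseudocode lines is a quick check that the capture group binds the local you expect.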
Operational rules:
- Do not use `regex` and `assignment_regex` in the same rule.
- `scope` is optional, but production rules should usually include a scope gate.
- `confidence` must be a number from `0.0` to `1.0`; booleans are rejected.
- Use `enabled: false` for temporary disablement.
- JSON rule files cannot contain execution or network fields such as `python`, `shell`, `command`, `subprocess`, `url`, or `network`.
- Rules affect preview/export plans and reports only. IDB renames still use the explicit user-selected validator-gated rename path.
Supported scope operators:
calls_any
calls_all
lvars_any
function_name_regex
prototype_contains
text_contains
text_contains_all
Supported match operators:
regex
assignment_regex
text_contains
text_contains_all
Supported emissions:
rename
semantic_comment
Rule conflict policy:
- Higher `priority` and `confidence` sort rules earlier for matching.
- Rename emissions for the same target are resolved before normal rename validation.
- `override_of` is the strongest conflict signal; otherwise `priority` wins before `confidence`.
- Rule-based renames always use source `rule`; JSON cannot spoof trusted internal sources such as `kernel-status` or `semantic-rule`.
- Rule report paths are redacted to labels such as `builtin/local_renames.json`, `project/foo.json`, or `user/foo.json`.
Run with additional rules and write a report:
python -B .\tools\pseudoforge_cli.py .\samples\pseudocode\NtSetSystemInformation_switch_renamed.cpp --rules-dir .\pseudoforge_rules --rule-report $env:TEMP\pseudoforge_rules --out $env:TEMP\pseudoforge_cli_smoke
Inspect a rule report:
Get-ChildItem $env:TEMP\pseudoforge_rules
Get-Content (Get-ChildItem $env:TEMP\pseudoforge_rules -Filter *.rule-report.json | Select-Object -First 1).FullName
Report fields:
matched_rules: rules that passed scope/match and emitted data
rejected_emissions: emissions rejected by conflict, validation, or runtime guards
load_errors: JSON read or parse failures
validation_errors: schema, regex, or forbidden-key failures
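For a quick pass over a report, counting entries per field is often enough. The sketch below assumes the four fields are top-level JSON arrays, which is not confirmed by this document; `summarize_rule_report` is a hypothetical helper and the sample report is made up.

```python
import json
from typing import Dict

REPORT_FIELDS = ("matched_rules", "rejected_emissions", "load_errors", "validation_errors")


def summarize_rule_report(report_text: str) -> Dict[str, int]:
    """Count entries per report field; a missing field counts as zero."""
    report = json.loads(report_text)
    return {field: len(report.get(field, [])) for field in REPORT_FIELDS}


# Made-up sample report for demonstration only.
sample = json.dumps({
    "matched_rules": [{"id": "project.rename.requester_process"}],
    "rejected_emissions": [],
    "load_errors": [],
})
print(summarize_rule_report(sample))
```

Non-zero `load_errors` or `validation_errors` counts are the first thing to look at when an expected rule did not match.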
Safety boundaries:
- Rule files are JSON data only.
- User Python execution is not supported.
- The rule system rejects network, subprocess, and command execution fields.
- Invalid rule packs fail closed and analysis continues.
- Invalid regexes, invalid scope regexes, ambiguous primary regex matchers, empty matches, empty text gates, boolean numeric fields, and missing emit fields are rejected at load time.
- Runtime exceptions reject only the offending rule and analysis continues.
- Rule-based rename suggestions still pass through `validate_renames()`.
- Text and control-flow rewrite rules are out of v1 scope and do not modify IDB state.
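The kind of checks a rename validator performs (identifier shape, reserved keywords, name collisions) can be sketched as follows. This is not the real `validate_renames()`; `rename_problems` is a hypothetical helper and the keyword set is a small illustrative subset.

```python
import re
from typing import Iterable, List

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")
# A small subset of C/C++ keywords for illustration; a real check needs the full set.
RESERVED = {"int", "char", "return", "if", "else", "for", "while", "switch", "case", "class"}


def rename_problems(old: str, new: str, existing: Iterable[str]) -> List[str]:
    """Return the reasons a proposed rename should be rejected, if any."""
    problems = []
    if not IDENTIFIER.match(new):
        problems.append("not a valid identifier")
    if new in RESERVED:
        problems.append("reserved keyword")
    if new != old and new in set(existing):
        problems.append("collides with an existing name")
    return problems
```

For example, `rename_problems("v5", "status", ["v5", "a1"])` returns an empty list, while proposing `a1` for the same local reports a collision.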
Unit tests:
python -B -m unittest discover -s tests -v
Compile check:
python -B -m compileall .\pseudoforge.py .\ida_pseudoforge .\tests .\tools
Profile JSON checks:
python -B -m json.tool .\ida_pseudoforge\profiles\kernel_api.json
python -B -m json.tool .\ida_pseudoforge\profiles\kernel_api_overrides.json
python -B -m json.tool .\ida_pseudoforge\profiles\status_codes.json
python -B -m json.tool .\ida_pseudoforge\profiles\process_information_class.json
python -B -m json.tool .\ida_pseudoforge\profiles\system_information_class.json
WDK profile generation checks:
python -B .\tools\build_kernel_api_profile.py --list-versions
python -B .\tools\build_kernel_api_profile.py --version 10.0.26100.0 --dry-run --summary --function ExAllocatePool2 --function ExAcquireResourceExclusiveLite
python -B .\tools\build_status_codes_profile.py --version 10.0.26100.0 --dry-run --summary
Rule validation:
python -B .\tools\validate_pseudoforge_rules.py .\ida_pseudoforge\rules\builtin
Offline export smoke:
python -B .\tools\pseudoforge_cli.py .\samples\pseudocode\NtSetSystemInformation_switch_renamed.cpp --out $env:TEMP\pseudoforge_cli_smoke
python -B .\tools\pseudoforge_free_cli.py .\samples\pseudocode\NtSetSystemInformation_switch_renamed.cpp --out $env:TEMP\pseudoforge_free_cli_smoke
Patch hygiene:
git diff --check -- .
Current validation set used during development:
python -B -m unittest discover -s tests -v
python -B -m compileall .\pseudoforge.py .\ida_pseudoforge .\tests .\tools
python -B -m json.tool .\ida_pseudoforge\profiles\kernel_api.json
python -B -m json.tool .\ida_pseudoforge\profiles\kernel_api_overrides.json
python -B -m json.tool .\ida_pseudoforge\profiles\status_codes.json
python -B -m json.tool .\ida_pseudoforge\profiles\process_information_class.json
python -B -m json.tool .\ida_pseudoforge\profiles\system_information_class.json
python -B .\tools\validate_pseudoforge_rules.py .\ida_pseudoforge\rules\builtin
python -B .\tools\build_kernel_api_profile.py --version 10.0.26100.0 --dry-run --summary --function ExAllocatePool2 --function ExAcquireResourceExclusiveLite
python -B .\tools\build_status_codes_profile.py --version 10.0.26100.0 --dry-run --summary
python -B .\tools\pseudoforge_cli.py --version
python -B .\tools\release_pseudoforge.py --dry-run
python -B .\tools\pseudoforge_cli.py .\samples\pseudocode\NtSetSystemInformation_switch_renamed.cpp --out $env:TEMP\pseudoforge_cli_smoke
python -B .\tools\pseudoforge_free_cli.py --version
python -B .\tools\pseudoforge_free_cli.py .\samples\pseudocode\NtSetSystemInformation_switch_renamed.cpp --out $env:TEMP\pseudoforge_free_cli_smoke
git diff --check -- .
Edit/PseudoForge is not visible:
- Confirm that `pseudoforge.py`, `ida-plugin.json`, and `ida_pseudoforge/` are in the same plugin directory.
- Confirm that `ida_pseudoforge` was not copied as a nested directory.
  - Correct: `plugins\ida_pseudoforge\core\...`
  - Incorrect: `plugins\ida_pseudoforge\ida_pseudoforge\core\...`
- Check the IDA Output window for Python import errors.
- Confirm that the Hex-Rays decompiler is active.
IDA hangs immediately after preview:
- Fully restart the IDA process after updating plugin files.
- A running IDA process with an old logger thread can keep failing even after new files are copied.
- Check the last checkpoint in `%TEMP%\pseudoforge_preview_trace.log`.
- Check `%TEMP%\pseudoforge_trace.log` for `output.timer.started` or `output.timer.disabled`.
- If Output logging is suspected, set `PSEUDOFORGE_DISABLE_OUTPUT_LOG=1` before launching IDA and retry.
Paste is empty after Copy all:
- Check `%TEMP%\pseudoforge_clipboard\copy_all.log`.
- A log beginning with `failed ...` indicates a Windows Clipboard API or file path issue.
- No log means the preview context-menu action `PseudoForge/Copy all` was not invoked.
Export cleaned pseudocode fails:
- Confirm that the cursor is inside a function.
- Confirm that the target function can be decompiled in the pseudocode view.
- Confirm write access beside the IDB.
- Reproduce with the offline CLI using the same pseudocode text.
Rename application fails:
- Confirm that the target is a local variable or argument.
- Check for an existing name collision.
- Confirm that the cursor is still inside the same function that was analyzed. PseudoForge refuses to apply renames when the current function no longer matches the analyzed session.
- Hex-Rays can reject some lvar renames; inspect the export artifact first.
- IDB writes happen only after preview and explicit user selection.
- LLM output is never applied directly; only validated plan items are used.
- Control-flow rewrites are preview/export-only artifacts.
- Renames must pass collision, reserved keyword, and identifier validation.
- Apply-selected-renames rechecks the current analyzed function and performs a final preflight before calling Hex-Rays rename APIs.
- Every cleanup should leave artifacts that can be compared with the original pseudocode.
- Kernel-scale functions can legitimately produce partial recovery plus warnings.
- ida_pseudocode_refactor_plugin_design.md
- deterministic_rules_matching_engine_design.md
- pseudoforge_implementation_status.md
- Design deterministic rules matching engine v2 scope and migrate `call_arg_rewrite`, `text_rewrite`, and `flow`.
- Improve shared and fallthrough branch body reconstruction.
- Strengthen ctree-identity based rename tracking.
- Implement a dockable side-by-side preview panel.
- Expand profile coverage against real target builds.





