Add parameter type signatures for IL and JVM methods#358
Merged
Extract and propagate parenthesized parameter type signatures to enable overload-precise identification and matching of methods and unresolved call targets.

- Extractor: ILExtractor now emits il_method_param_signature and il_call_target_param_signature tuples.
- DB schema: Added il_method_param_signature and il_call_target_param_signature to semmlecode.binary.dbscheme.
- QL API/AST: Exposed getters for param signatures across CilInstructions, IR, InstructionSig, TranslatedElement/Function/Instruction and the transform layers, so signatures flow through translation.
- Translated implementations: TranslatedCilMethod and the relevant translated call/new-object logic return the extracted signatures; non-CIL backends return wildcards where appropriate.
- VulnerableCalls: Expanded vulnerableCallModel and related predicates to include paramSignature, and updated the matching logic to accept exact signatures or the wildcard '*'.
- Models: Updated the example YAML models to include a '*' paramSignature for existing entries.

This change improves precision when matching overloaded methods for analyses such as vulnerable-call detection.
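The matching rule described for VulnerableCalls (an exact signature or the wildcard '*') can be illustrated with a minimal Python sketch. This is not the QL implementation; the function names, the tuple layout of the model, and the `Example.Parser.Load` entries are hypothetical and only mirror the behavior described in the commit message.

```python
WILDCARD = "*"


def signature_matches(model_sig: str, extracted_sig: str) -> bool:
    """A model entry accepts a signature if it is the wildcard or an exact match."""
    return model_sig == WILDCARD or model_sig == extracted_sig


def matches_model(model, qualified_name: str, param_sig: str) -> bool:
    """Check a method (name plus parenthesized signature) against all model entries.

    `model` is a hypothetical list of (qualified_name, paramSignature) pairs,
    standing in for rows of the YAML model.
    """
    return any(
        name == qualified_name and signature_matches(sig, param_sig)
        for name, sig in model
    )


# Hypothetical model: one overload-precise entry, one legacy wildcard entry.
MODEL = [
    ("Example.Parser.Load", "(System.String)"),
    ("Example.Legacy.Parse", WILDCARD),
]
```

With this model, only the `(System.String)` overload of `Example.Parser.Load` matches, while every overload of `Example.Legacy.Parse` matches, which is why existing entries that were updated to '*' keep their previous behavior.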
Expose a getParamSignature API on InstructionSig (and the TransformInstruction implementation) to return parenthesized parameter-type signatures (e.g. "(System.String,System.Int32)"). Extend the extraction DB schema with il_method_param_signature and il_call_target_param_signature to enable overload-precise method identification, and add jvm_stack_height and jvm_stack_slot tables to record JVM stack heights and map stack slots to producer instructions to simplify stack-based dataflow analysis.
For root cause mode analysis, where the vulnerable methods being traced are defined in the same binary being analyzed (not referenced cross-assembly), getAVulnerableMethod needs a base case that matches method definitions by their fully-qualified name and parameter signature.

Previously, only cross-assembly calls via ExternalRefInstruction were matched as the base case. Intra-assembly calls are handled by the existing transitive getStaticTarget() clause, but the closure never started because the base case only found external ref call sites.

The new clause matches methods defined in the current binary against the model, respecting the paramSignature field (including wildcard '*'). For standard cross-assembly analysis this is a no-op since the model methods won't be defined in the binary being analyzed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
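Why the closure needs a base case over method definitions can be seen in a small Python sketch. It is an illustration under stated assumptions, not the QL code: the dictionaries standing in for method definitions and static call targets, the integer method ids, and the `Lib.Parser.Load` names are all hypothetical.

```python
def vulnerable_methods(defined, calls, model):
    """Compute methods that are, or transitively call, a modeled method.

    defined: hypothetical map of method id -> (qualified_name, param_signature)
             for methods defined in the binary under analysis.
    calls:   hypothetical map of caller id -> set of statically resolved callee ids.
    model:   list of (qualified_name, paramSignature) pairs; '*' is a wildcard.
    """
    # Base case: definitions in this binary that match the model.
    vulnerable = {
        method_id
        for method_id, (name, sig) in defined.items()
        if any(name == m_name and m_sig in ("*", sig) for m_name, m_sig in model)
    }
    # Transitive step: any caller of a vulnerable method is itself vulnerable.
    changed = True
    while changed:
        changed = False
        for caller, callees in calls.items():
            if caller not in vulnerable and callees & vulnerable:
                vulnerable.add(caller)
                changed = True
    return vulnerable


DEFINED = {
    1: ("Lib.Parser.Load", "(System.String)"),
    2: ("Lib.Parser.Load", "(System.Int32)"),
    3: ("App.Main", "()"),
    4: ("App.Other", "()"),
}
CALLS = {3: {1}, 4: {2}}
```

If the base case is empty, the loop has nothing to propagate and the result stays empty. That mirrors both the original problem (a base case that only found external ref call sites) and the no-op behavior in cross-assembly analysis, where the modeled methods are not defined in the binary.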
The ql lib dbscheme was updated with the il_method_param_signature, il_call_target_param_signature, jvm_stack_height, and jvm_stack_slot tables, but the JVM extractor's copy was not. This causes a schema mismatch when building a JVM database and then running the binary-ql queries against it.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The CIL extractor already emits il_method_param_signature and il_call_target_param_signature for overload-precise method matching. This commit adds the same capability to the JVM bytecode extractor.

JVM extractor changes:
- ParseParamSignature: converts JVM descriptors (e.g. '(Ljava/lang/Object;JJ)V') to human-readable signatures (e.g. '(Object,long,long)')
- ExtractMethod: emits il_method_param_signature for method definitions
- ExtractMethodRef: emits il_call_target_param_signature for call sites

QL library changes:
- JvmMethod: add getParamSignature() backed by il_method_param_signature
- JvmInvoke: add getParamSignature() backed by il_call_target_param_signature
- TranslatedJvmInvoke: wire getExternalParamSignature to instr.getParamSignature()
- TranslatedJvmFunction: use method.getParamSignature() instead of wildcard '*'

VulnerableCalls.qll:
- VulnerableMethodCall: handle case where extRef lacks param signature (backwards compat for databases built before this change)
- Root cause base case: handle functions with wildcard param signature

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
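The descriptor conversion attributed to ParseParamSignature can be sketched in Python as follows. This is a hedged illustration rather than the extractor's code: the function name is a hypothetical Python counterpart, the use of simple class names is inferred from the '(Object,long,long)' example in the commit message, and rendering array dimensions as a `[]` suffix is an assumption that the source does not confirm.

```python
# JVM primitive type codes as used in method descriptors.
_PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean",
}


def parse_param_signature(descriptor: str) -> str:
    """Convert a JVM method descriptor into a parenthesized parameter list."""
    if not descriptor.startswith("("):
        raise ValueError(f"not a method descriptor: {descriptor!r}")
    end = descriptor.index(")")  # the return type after ')' is ignored
    params = []
    i = 1
    while i < end:
        dims = 0
        while descriptor[i] == "[":  # each '[' adds one array dimension
            dims += 1
            i += 1
        code = descriptor[i]
        if code == "L":
            # Object type: 'L<binary name>;', reduced here to the simple class name.
            semi = descriptor.index(";", i)
            name = descriptor[i + 1:semi].rsplit("/", 1)[-1]
            i = semi + 1
        else:
            name = _PRIMITIVES[code]
            i += 1
        params.append(name + "[]" * dims)  # assumed array rendering
    return "(" + ",".join(params) + ")"
```

For the descriptor quoted in the commit message, `parse_param_signature("(Ljava/lang/Object;JJ)V")` yields `(Object,long,long)`.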
il_call_target_param_signature references @il_instruction which is incompatible with JVM's @jvm_instruction type. Add jvm_call_target_param_signature table for JVM call target signatures and update the extractor and QL to use it. Also sync all extractor dbschemes (JVM and CIL) with the canonical ql/lib copy.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
MathiasVP
approved these changes
May 15, 2026
This pull request adds extraction and QL API support for parameter type signatures of methods and method calls in both JVM and CIL (Common Intermediate Language) binaries. The main goal is to enable overload-precise identification and matching of methods and call targets by exporting a normalized, parenthesized, comma-separated list of parameter types (e.g., (System.String,System.Int32)). The changes span the extractor, database schema, and QL libraries for both CIL and JVM backends.