
Add parameter type signatures for IL and JVM methods #358

Merged
gfs merged 9 commits into main from gfs/CilMethodDefWithParams
May 15, 2026

Collaborator

@gfs gfs commented May 15, 2026

This pull request adds extraction and QL API support for parameter type signatures of methods and method calls in both JVM and CIL (Common Intermediate Language) binaries. The main goal is to enable overload-precise identification and matching of methods and call targets by exporting a normalized, parenthesized, comma-separated list of parameter types (e.g., (System.String,System.Int32)). The changes span the extractor, database schema, and QL libraries for both CIL and JVM backends.
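
As an illustration of the signature format described above, here is a minimal Python sketch that builds the parenthesized, comma-separated string from a list of type names. The helper name `format_param_signature` is hypothetical and not part of the extractor; rendering a parameterless method as `()` is also an assumption.

```python
def format_param_signature(param_types):
    """Join parameter type names into a parenthesized, comma-separated signature.

    Hypothetical helper for illustration only; the real extractor may
    normalize type names differently.
    """
    # Strip stray whitespace so equal types always compare equal as strings.
    return "(" + ",".join(t.strip() for t in param_types) + ")"


print(format_param_signature(["System.String", "System.Int32"]))
# (System.String,System.Int32)
```

Because the signature is a plain string, two overloads of the same method differ only in this field, which is what makes string equality sufficient for overload-precise matching.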

gfs and others added 6 commits February 15, 2026 17:38
Extract and propagate parenthesized parameter type signatures to enable overload-precise identification and matching of methods and unresolved call targets.

- Extractor: ILExtractor now emits il_method_param_signature and il_call_target_param_signature tuples.
- DB schema: Added il_method_param_signature and il_call_target_param_signature to semmlecode.binary.dbscheme.
- QL API/AST: Exposed getters for param signatures across CilInstructions, IR, InstructionSig, TranslatedElement/Function/Instruction and transform layers so signatures flow through translation.
- Translated implementations: TranslatedCilMethod and relevant translated call/new-object logic return the extracted signatures; non-CIL backends return wildcards where appropriate.
- VulnerableCalls: Expanded the vulnerableCallModel and related predicates to include paramSignature and updated matching logic to accept exact signatures or wildcard '*'.
- Models: Updated example YAML models to include a '*' paramSignature for existing entries.

This change improves precision when matching overloaded methods for analyses such as vulnerable-call detection.
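
The matching rule described in this commit (a model entry matches on an exact signature or on the wildcard '*') can be sketched in Python as follows. `ModelRow` and its field names are assumptions made for illustration; the actual vulnerableCallModel columns are not shown in this PR.

```python
from typing import NamedTuple

WILDCARD = "*"


class ModelRow(NamedTuple):
    # Hypothetical shape of a model entry; real models may carry more fields.
    qualified_name: str
    param_signature: str  # exact signature such as "(System.String)" or "*"


def signature_matches(model_sig, actual_sig):
    """A model signature matches if it is the wildcard or equals the actual one."""
    return model_sig == WILDCARD or model_sig == actual_sig


def is_vulnerable_call(model, qualified_name, param_signature):
    """True if any model row names this method and accepts its signature."""
    return any(
        row.qualified_name == qualified_name
        and signature_matches(row.param_signature, param_signature)
        for row in model
    )


model = [
    ModelRow("Lib.Parser.Parse", "(System.String)"),  # one specific overload
    ModelRow("Lib.Legacy.Load", WILDCARD),            # every overload
]
print(is_vulnerable_call(model, "Lib.Parser.Parse", "(System.Int32)"))  # False
```

Existing model entries that receive a '*' paramSignature keep their previous behavior under this rule, since the wildcard accepts every overload.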
Expose a getParamSignature API on InstructionSig (and the TransformInstruction implementation) to return parenthesized parameter-type signatures (e.g. "(System.String,System.Int32)"). Extend the extraction DB schema with il_method_param_signature and il_call_target_param_signature to enable overload-precise method identification, and add jvm_stack_height and jvm_stack_slot tables to record JVM stack heights and map stack slots to producer instructions to simplify stack-based dataflow analysis.
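
To illustrate what stack-height and stack-slot facts could look like, here is a rough Python sketch that simulates an operand stack over straight-line code. The instruction encoding `(name, pops, pushes)`, the function name, and the choice to record the state before each instruction are all assumptions; the actual jvm_stack_height and jvm_stack_slot columns are not shown in this PR.

```python
def simulate_stack(instructions):
    """Record stack height and slot producers before each instruction.

    instructions: list of (name, pops, pushes) tuples, straight-line code only.
    Returns (heights, slots) where heights[i] is the stack height before
    instruction i and slots[i] lists, bottom to top, the index of the
    instruction that produced each stack slot at that point.
    """
    stack = []  # each entry is the index of the producing instruction
    heights, slots = [], []
    for index, (name, pops, pushes) in enumerate(instructions):
        if pops > len(stack):
            raise ValueError(f"stack underflow at instruction {index} ({name})")
        heights.append(len(stack))
        slots.append(list(stack))
        if pops:  # guard: `del stack[-0:]` would clear the whole stack
            del stack[-pops:]
        stack.extend([index] * pushes)
    return heights, slots


code = [("iload_1", 0, 1), ("iload_2", 0, 1), ("iadd", 2, 1), ("ireturn", 1, 0)]
heights, slots = simulate_stack(code)
print(heights)   # [0, 1, 2, 1]
print(slots[2])  # [0, 1]
```

With such facts precomputed, a dataflow query can look up which instruction produced an operand instead of re-simulating the stack.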
For root cause mode analysis, where the vulnerable methods being traced are
defined in the same binary being analyzed (not referenced cross-assembly),
getAVulnerableMethod needs a base case that matches method definitions by
their fully-qualified name and parameter signature.

Previously, only cross-assembly calls via ExternalRefInstruction were matched
as the base case. Intra-assembly calls are handled by the existing transitive
getStaticTarget() clause, but the closure never started because the base case
only found external ref call sites.

The new clause matches methods defined in the current binary against the model,
respecting the paramSignature field (including wildcard '*'). For standard
cross-assembly analysis this is a no-op since the model methods won't be
defined in the binary being analyzed.
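
The effect of the new base case can be sketched in Python as a small fixpoint computation. This is not the QL implementation: the data shapes and names are hypothetical, the existing ExternalRefInstruction base case is omitted, and it assumes the closure collects methods that transitively call a vulnerable method.

```python
def vulnerable_methods(defined, static_calls, model):
    """Compute methods that are, or transitively call, a modeled method.

    defined:      dict method_id -> (qualified_name, param_signature)
    static_calls: dict caller_id -> set of callee_ids within the same binary
    model:        set of (qualified_name, param_signature_or_star)
    """
    def in_model(name, sig):
        return (name, sig) in model or (name, "*") in model

    # New base case: methods defined in this binary that the model names.
    result = {m for m, (name, sig) in defined.items() if in_model(name, sig)}

    # Transitive step: add callers of known-vulnerable methods until stable.
    changed = True
    while changed:
        changed = False
        for caller, callees in static_calls.items():
            if caller not in result and callees & result:
                result.add(caller)
                changed = True
    return result


defined = {
    1: ("Lib.Parser.Parse", "(System.String)"),
    2: ("Lib.Parser.Parse", "(System.Int32)"),
    3: ("App.Main", "()"),
    4: ("App.Other", "()"),
}
static_calls = {3: {1}, 4: {2}}
model = {("Lib.Parser.Parse", "(System.String)")}
print(sorted(vulnerable_methods(defined, static_calls, model)))  # [1, 3]
```

Without the base case the result set starts empty and the loop never adds anything, which mirrors the "closure never started" problem described above.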

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The ql lib dbscheme was updated with il_method_param_signature,
il_call_target_param_signature, jvm_stack_height, and jvm_stack_slot tables
but the JVM extractor's copy was not updated. This causes a schema mismatch
when building a JVM database and then running the binary-ql queries against it.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The CIL extractor already emits il_method_param_signature and
il_call_target_param_signature for overload-precise method matching.
This commit adds the same capability to the JVM bytecode extractor.

JVM extractor changes:
- ParseParamSignature: converts JVM descriptors (e.g. '(Ljava/lang/Object;JJ)V')
  to human-readable signatures (e.g. '(Object,long,long)')
- ExtractMethod: emits il_method_param_signature for method definitions
- ExtractMethodRef: emits il_call_target_param_signature for call sites

QL library changes:
- JvmMethod: add getParamSignature() backed by il_method_param_signature
- JvmInvoke: add getParamSignature() backed by il_call_target_param_signature
- TranslatedJvmInvoke: wire getExternalParamSignature to instr.getParamSignature()
- TranslatedJvmFunction: use method.getParamSignature() instead of wildcard '*'

VulnerableCalls.qll:
- VulnerableMethodCall: handle case where extRef lacks param signature
  (backwards compat for databases built before this change)
- Root cause base case: handle functions with wildcard param signature

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
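
The descriptor conversion performed by ParseParamSignature can be sketched in Python as below. This is an approximation under assumptions, not the extractor's code: it uses simple class names because the example above maps `Ljava/lang/Object;` to `Object`, and rendering arrays with a `[]` suffix is a guess.

```python
# JVM primitive type codes as defined for field and method descriptors.
_PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean",
}


def parse_param_signature(descriptor):
    """Convert a JVM method descriptor to a readable parameter signature."""
    end = descriptor.find(")")
    if not descriptor.startswith("(") or end == -1:
        raise ValueError(f"not a method descriptor: {descriptor!r}")
    params = []
    i = 1
    while i < end:
        dims = 0
        while descriptor[i] == "[":  # each '[' adds one array dimension
            dims += 1
            i += 1
        code = descriptor[i]
        if code == "L":  # object type: L<internal/class/Name>;
            semi = descriptor.index(";", i)
            name = descriptor[i + 1:semi].rsplit("/", 1)[-1]
            i = semi + 1
        elif code in _PRIMITIVES:
            name = _PRIMITIVES[code]
            i += 1
        else:
            raise ValueError(f"unexpected type code {code!r} in {descriptor!r}")
        params.append(name + "[]" * dims)
    return "(" + ",".join(params) + ")"


print(parse_param_signature("(Ljava/lang/Object;JJ)V"))  # (Object,long,long)
```

The return type after the closing parenthesis is ignored, which is consistent with the signature covering parameter types only.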
il_call_target_param_signature references @il_instruction which is incompatible
with JVM's @jvm_instruction type. Add jvm_call_target_param_signature table for
JVM call target signatures and update the extractor and QL to use it.

Also sync all extractor dbschemes (JVM and CIL) with the canonical ql/lib copy.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
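
One way to catch this kind of schema drift early is a check that every extractor copy of the dbscheme is byte-identical to the canonical one. The Python sketch below is only an illustration: the function name is hypothetical, the repository's real file locations are not given here, and no such check is claimed to exist in this PR.

```python
from pathlib import Path


def out_of_sync_copies(canonical, copies):
    """Return the copies whose bytes differ from the canonical dbscheme.

    canonical and copies are file paths supplied by the caller.
    """
    expected = Path(canonical).read_bytes()
    return [str(copy) for copy in copies if Path(copy).read_bytes() != expected]
```

A CI step could call this with the canonical ql/lib dbscheme and each extractor copy, and fail the build if the returned list is non-empty.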
@gfs gfs enabled auto-merge (squash) May 15, 2026 17:49
@gfs gfs merged commit a3cc253 into main May 15, 2026
4 checks passed


2 participants