Context
codellm-devkit/codeanalyzer-java#171 adds analysis level 3 to the Java analyzer: the full system dependency graph (control + data dependence) from WALA's slicer. At -a 3 the analyzer emits two new analysis.json sections:
- system_dependency_graph (method-level dependence edges). This already validates against the existing JApplication.system_dependency_graph: List[JGraphEdges] model with zero model changes (verified against this repo's models on the analyzer's call-graph-test fixture).
- program_graphs (statement-level graphs per the CLDK level-3 dataflow contract): per-callable cfg and pdg keyed by (signature, node_id) (ENTRY = 0, SSA instructions in order, EXIT = last), plus cross-function sdg_edges (CALL, PARAM_IN, PARAM_OUT), schema_version'd. Example:
Asks
- Default the Java backend to analysis level 3, dialing down on request: JCodeanalyzer should invoke the binary with --analysis-level=3 by default and honor an explicit lower analysis_level (symbol table / call graph) when the caller asks for less.
- Make get_system_dependency_graph() real. It currently warns "System dependency graph is not yet implemented. Returning the call graph instead." It should return the actual system_dependency_graph edges and keep the call-graph fallback only for old analysis files.
- Model program_graphs once, shared across languages per the level-3 parity clause: ProgramGraphs, FunctionGraphs, GraphNode, GraphEdge, SDGEdge, not per-language copies. (Language analyzers may add node/edge kinds additively.)
- Adapt the SCIP indexing to the new schema so statement-level nodes/edges from program_graphs participate in indexing alongside the symbol table and call graph.
- Pin the minimum codeanalyzer-java version that emits level 3 once it is released.
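
To make the shared-model ask concrete, here is a rough sketch of how the five named types could fit together. Only the class names, the (signature, node_id) keying, the ENTRY/EXIT convention and the CALL / PARAM_IN / PARAM_OUT edge kinds come from this issue. Every field name, the "INSTRUCTION" node kind, the sample signatures and the schema_version value are assumptions made for illustration, and plain dataclasses are used only to keep the sketch dependency-free (the real models would presumably follow the SDK's existing model style).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Assumption: a statement is addressed by (signature, node_id), per the issue.
NodeKey = Tuple[str, int]


@dataclass(frozen=True)
class GraphNode:
    node_id: int  # 0 is ENTRY, the last id is EXIT
    kind: str     # hypothetical field; analyzers may add kinds additively


@dataclass(frozen=True)
class GraphEdge:
    source: int   # node_id within the same callable (assumed field name)
    target: int


@dataclass(frozen=True)
class SDGEdge:
    kind: str        # "CALL", "PARAM_IN" or "PARAM_OUT"
    source: NodeKey  # cross-function edges need the full key
    target: NodeKey


@dataclass
class FunctionGraphs:
    nodes: List[GraphNode] = field(default_factory=list)
    cfg: List[GraphEdge] = field(default_factory=list)
    pdg: List[GraphEdge] = field(default_factory=list)


@dataclass
class ProgramGraphs:
    schema_version: str
    functions: Dict[str, FunctionGraphs] = field(default_factory=dict)
    sdg_edges: List[SDGEdge] = field(default_factory=list)

    def node(self, key: NodeKey) -> GraphNode:
        """Resolve a (signature, node_id) key to its node."""
        signature, node_id = key
        for candidate in self.functions[signature].nodes:
            if candidate.node_id == node_id:
                return candidate
        raise KeyError(key)


# Hand-built example with two hypothetical callables and one CALL edge.
caller = FunctionGraphs(
    nodes=[GraphNode(0, "ENTRY"), GraphNode(1, "INSTRUCTION"), GraphNode(2, "EXIT")],
    cfg=[GraphEdge(0, 1), GraphEdge(1, 2)],
)
callee = FunctionGraphs(
    nodes=[GraphNode(0, "ENTRY"), GraphNode(1, "EXIT")],
    cfg=[GraphEdge(0, 1)],
)
graphs = ProgramGraphs(
    schema_version="0",  # placeholder value, not the real schema version
    functions={"A.caller()": caller, "A.callee()": callee},
    sdg_edges=[SDGEdge("CALL", ("A.caller()", 1), ("A.callee()", 0))],
)
```

Keeping intra-procedural edges as bare node ids and cross-function edges as full keys is one possible design. It keeps the per-callable graphs compact while still letting sdg_edges point across callables.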
Notes
- Output at -a 1 and -a 2 is unchanged (verified byte-identical on the fixture), so defaulting to 3 is purely additive from the SDK's perspective. Level 3 is slower, though, because it runs the WALA slicer, which is the reason for the dial-down knob.
- SUMMARY edges and taint/slicing clients are analyzer-side follow-ups, out of scope here.
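
For discussion, a minimal sketch of the two SDK-side behaviours above: defaulting to level 3 with a dial-down, and falling back to the call graph only for old analysis files. The function names, the integer level encoding (assuming 1 and 2 map to symbol table and call graph) and the "call_graph" key are assumptions, not the SDK's actual API. Only the --analysis-level=3 flag and the system_dependency_graph section name are taken from this issue.

```python
from typing import Any, Dict, List, Optional

DEFAULT_ANALYSIS_LEVEL = 3  # full system dependency graph
VALID_LEVELS = (1, 2, 3)    # assumed: 1 = symbol table, 2 = call graph


def resolve_analysis_level(requested: Optional[int] = None) -> int:
    """Use level 3 unless the caller explicitly asks for a lower level."""
    if requested is None:
        return DEFAULT_ANALYSIS_LEVEL
    if requested not in VALID_LEVELS:
        raise ValueError(f"unsupported analysis level: {requested!r}")
    return requested


def build_analyzer_args(requested: Optional[int] = None) -> List[str]:
    """Hypothetical helper producing the level flag passed to the binary."""
    return [f"--analysis-level={resolve_analysis_level(requested)}"]


def select_dependency_edges(analysis: Dict[str, Any]) -> List[Any]:
    """Prefer real SDG edges; fall back to the call graph for old files.

    A present-but-empty system_dependency_graph is returned as-is, since
    it means level 3 ran and found no edges, not that the file is old.
    """
    if "system_dependency_graph" in analysis:
        return analysis["system_dependency_graph"]
    return analysis.get("call_graph", [])  # "call_graph" key is an assumption
```

With this shape, build_analyzer_args() yields the level-3 flag and build_analyzer_args(1) dials down to the symbol table, so the slower WALA slicer run is paid for only when the caller has not opted out.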