Shakthi Bachala | Advisor: Prof. Witawas Srisa-An
Department of Computer Science, University of Nebraska-Lincoln
ICSE 2026 Doctoral Symposium -- Recife, Brazil
"Interpretability as a first-class compilation primitive"
Modern AI systems are powerful yet opaque. Understanding, aligning, and auditing their behavior requires multiple specialized tools that operate independently -- each with its own data format, execution model, and performance characteristics. This fragmentation creates severe practical challenges:
| Current Tool | Role | Limitation |
|---|---|---|
| SAELens | Sparse autoencoder features | Eager tensor execution, no compiler visibility |
| Captum | Attribution (integrated gradients) | Separate forward passes, no fusion with inference |
| Petri | Steering / safety auditing | Runtime-only checks, no compile-time guarantees |
| LangChain | Pipeline orchestration | Opaque to the compiler, no global optimization |
| LangSmith | Observability / tracing | API-level logs, not IR-level traces |
| Gradio | Visualization | Output-level displays, no internal circuit visibility |
These tools require multiple redundant forward passes and operate independently. There is no way for a compiler to reason about the full computation graph.
IAM unifies static structure and dynamic behavior through an MLIR / Mojo / MAX-native substrate for code models and agents.
The core insight: interpretability operations -- attribution, visualization, tracing, and orchestration -- naturally compose as JAX transformations. IAM treats these as native compiler concerns, enabling the XLA optimization pipeline to automatically fuse, schedule, and lower interpretability primitives alongside model computation.
IAM is a three-tier vertically integrated framework. Each layer builds on the one below through JAX's composable primitive system and XLA's compilation pipeline.
APPLICATION LAYER
+-----------------+-----------------+-----------------+
| Orchestration | Observability | Visualization |
| Pipeline | Traces | Feature Maps |
| DAG Execute | Metrics | Attribution |
| Chain Compose | Logging | Circuit Display |
| .............. | ............... | ............... |
| >> LangChain | >> LangSmith | >> Gradio |
+-----------------+-----------------+-----------------+
|
VECTOR LAYER
+-----------------+-----------------+-----------------+
| CLT/Features | Attribution | Steering/Audit |
| Encode/Decode | IntGrad | Vector Construct|
| TopK Activate | Feature Select | Policy Apply |
| Skip Connect | Circuit Extract | Safety Verify |
| ............... | ............... | ............... |
| >> SAELens | >> Captum | >> Petri |
+-----------------+-----------------+-----------------+
C++ Core C++ Core C++ Core
|
MLIR COMPILER LAYER
+-----------------+-----------------+-------------------+
| CLT Dialect     | Attribution     | Policy Dialect    |
|                 | Dialect         |                   |
| iam.clt.encode  | iam.attr.grad   | iam.policy.steer  |
| iam.clt.decode  | iam.attr.path   | iam.policy.verify |
| CLTFusionPass   | AttrCachePass   | PolicyProjectPass |
+-----------------+-----------------+-------------------+
|
StableHLO --> XLA --> TPU / GPU / CPU
User-facing API for orchestration, observability, and visualization. Replaces LangChain, LangSmith, and Gradio with unified interfaces that are structure-aware -- the compiler sees through them.
Core interpretability primitives in high-performance C++: integrated CLT operations, attribution analysis, and policy steering. Attribution works through compiled structure, not eager tensors; steering is compiler-mediated rather than limited to runtime checks.
Custom MLIR dialects that represent interpretability as first-class compiler constructs. Optimization passes fuse operations and cache results. The compiler can reason about the full computation graph, fusing interpretability with model execution in a single optimized pipeline.
AI systems fail where static structure and runtime behavior diverge. IAM bridges this gap:
STATIC DYNAMIC
+------------------+ +------------------+
| module.compile() | static <--> dynamic | nn.Hooks |
| Compiled | <------------------------> | Observed |
| dependency path | mismatch is the failure | execution |
| -> safety check | | bypasses safety |
| -> output | | check entirely |
+------------------+ +------------------+
BRIDGE LAYER
Static / dynamic overlay
capture -> overlay -> feedback
| Component | Role |
|---|---|
| IAM Core | Application + Vector + MLIR (the three-tier stack above) |
| Bridge Layer | Static / dynamic overlay connecting compile-time and runtime analysis |
| IAM Hybrid Analysis | Unified static <--> dynamic reasoning about model behavior |
IAM builds on JAX's vertically integrated architecture:
- JAX Primitives -- Computation through composable primitives transformed by grad, jit, and vmap. IAM defines custom primitives for interpretability that integrate seamlessly; attribution becomes a natural transformation alongside autodiff.
- XLA Compilation -- JAX lowers to XLA's HLO, enabling aggressive optimization. IAM's primitives lower directly to HLO, allowing the compiler to fuse interpretability operations with model computation.
- Memory Optimization -- JAX's buffer donation enables sophisticated sharing. Activation buffers for attribution reuse memory across operations, with XLA automatically identifying sharing opportunities.
- Progressive Lowering -- JAX traces Python to Jaxpr, lowering to StableHLO with IAM's custom operations. StableHLO lowers through MHLO and Linalg to LLVM IR or GPU representations. Specialized passes fuse operations, identify buffer reuse, and exploit sparsity.
IAM extends StableHLO with three custom dialects:
iam.clt.encode / iam.clt.decode / CLTFusionPass
Activation decomposition as StableHLO operations that JAX's compiler fuses with model inference and lowers to efficient device code.
iam.attr.grad / iam.attr.path / AttrCachePass
Causal relationships as StableHLO transformations with algebraic simplification and batched computation through vmap.
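The attribution primitives can be illustrated with a plain NumPy sketch of integrated gradients over a toy analytic model (f, grad_f, and the step count below are illustrative choices, not IAM's compiled implementation; IAM would express the averaged-gradient loop as a vmap-batched StableHLO computation):

```python
import numpy as np

def f(x, w):
    # toy differentiable model: f(x) = sum_i w_i * x_i^2
    return np.sum(w * x**2)

def grad_f(x, w):
    # analytic gradient of the toy model
    return 2.0 * w * x

def integrated_gradients(x, baseline, w, steps=256):
    # average the gradient along the straight path baseline -> x,
    # then scale by the input difference (midpoint Riemann rule)
    alphas = (np.arange(steps) + 0.5) / steps
    path = baseline + alphas[:, None] * (x - baseline)[None, :]
    grads = np.stack([grad_f(p, w) for p in path])  # the loop vmap would batch
    return (x - baseline) * grads.mean(axis=0)

x = np.array([1.0, -2.0, 3.0])
w = np.array([0.5, 1.0, 2.0])
attr = integrated_gradients(x, np.zeros_like(x), w)
# completeness axiom: attributions sum to f(x) - f(baseline)
assert np.isclose(attr.sum(), f(x, w))
```

The completeness check at the end is the algebraic property an AttrCachePass-style simplification could exploit when fusing attribution with the forward pass.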
iam.policy.steer / iam.policy.verify / PolicyProjectPass
Path identification formalized as HLO graph analysis. Behavioral control as StableHLO primitives integrating with JAX's transformation system. The lowering pipeline: JAX primitives -> Jaxpr -> StableHLO -> simplification -> MHLO -> LLVM IR or NVVM IR.
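As a sketch of what a projection-style pass such as PolicyProjectPass might enforce, the NumPy fragment below removes forbidden directions from a steering vector; the linear-subspace formulation and the names project_policy / forbidden are assumptions for illustration, not IAM's definition:

```python
import numpy as np

def project_policy(steer, forbidden):
    # remove components of the steering vector lying along forbidden
    # directions (rows of `forbidden`) -- a linear stand-in for a
    # compile-time policy projection (assumed scheme, not IAM's)
    Q, _ = np.linalg.qr(forbidden.T)   # orthonormal basis of forbidden span
    return steer - Q @ (Q.T @ steer)

steer = np.array([1.0, 2.0, 3.0, 4.0])
forbidden = np.array([[1.0, 0.0, 0.0, 0.0]])  # one disallowed direction
safe = project_policy(steer, forbidden)
assert np.allclose(forbidden @ safe, 0.0)     # no forbidden component remains
```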
Beyond understanding behavior, controlling it is essential for safety-critical AI.
Sparse autoencoders extended with residual connections preserving information flow while maintaining interpretability. Skip connections provide direct paths not requiring decomposition, reserving sparse pathways for patterns benefiting from explicit representation. Implemented as JAX primitives lowering to StableHLO; XLA fuses transcoder operations with model layers.
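A minimal NumPy sketch of the transcoder idea, assuming a ReLU encoder, TopK sparsity, and an identity skip path (all illustrative choices, with random weights; not IAM's actual parameterization):

```python
import numpy as np

def topk_mask(z, k):
    # keep only the k largest activations (TopK sparsity)
    idx = np.argsort(z)[-k:]
    m = np.zeros_like(z)
    m[idx] = z[idx]
    return m

rng = np.random.default_rng(0)
d, f, k = 8, 32, 4                    # model dim, feature dim, sparsity
W_enc = rng.normal(size=(f, d)) / np.sqrt(d)
W_dec = rng.normal(size=(d, f)) / np.sqrt(f)
W_skip = np.eye(d)                    # skip path: identity for the sketch

def clt_forward(x):
    # sparse feature code plus a direct residual path around it
    z = topk_mask(np.maximum(W_enc @ x, 0.0), k)
    return W_dec @ z + W_skip @ x, z

x = rng.normal(size=d)
x_hat, z = clt_forward(x)
assert np.count_nonzero(z) <= k       # decomposition stays sparse
```

The skip term W_skip @ x is the direct path that spares the sparse pathway from representing everything; in IAM both terms would lower as fusable StableHLO operations.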
Vectors in activation space representing desired behavioral changes, added to activations during inference. Declarative policy specifications with automatic feature discovery through contrastive analysis. Statistical analysis identifies differentiating features; causal validation verifies effectiveness. Represented as JAX primitives that JIT-compile with model execution. XLA's fusion passes combine steering with model layers.
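The contrastive construction can be sketched in NumPy, assuming cached activations for behavior-positive and behavior-negative prompts (the data below is synthetic and the mean-difference construction is one standard choice, not necessarily IAM's full feature-discovery pipeline):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 16
# hypothetical cached activations for prompts with / without the behavior
acts_pos = rng.normal(loc=0.5, size=(64, d))
acts_neg = rng.normal(loc=-0.5, size=(64, d))

# contrastive analysis: steering vector = mean activation difference
steer = acts_pos.mean(axis=0) - acts_neg.mean(axis=0)

def apply_steering(h, steer, alpha=1.0):
    # add the scaled vector to the activation during inference
    return h + alpha * steer

h = rng.normal(size=d)
h2 = apply_steering(h, steer, alpha=2.0)
# steering moves the activation toward the "positive" direction
assert steer @ h2 > steer @ h
```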
Multiple vectors combine through tree operations and custom transformations. The compiler reasons about interactions through symbolic execution, resolving conflicts via optimization passes. XLA fuses composed operations. Vectorization through vmap enables parallel policy application.
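One way such composition and conflict resolution might look, sketched in NumPy with Gram-Schmidt orthogonalization as an assumed resolution scheme (the source does not specify IAM's actual pass; this only illustrates the shape of the problem):

```python
import numpy as np

def resolve_conflicts(vectors):
    # assumed scheme: orthogonalize later policy vectors against
    # earlier ones (Gram-Schmidt), giving earlier policies priority
    out = []
    for v in vectors:
        u = v.astype(float).copy()
        for p in out:
            u -= (p @ u) / (p @ p) * p
        out.append(u)
    return out

def compose_policies(vectors, weights):
    # weighted sum of the (resolved) steering vectors
    return sum(w * v for w, v in zip(weights, vectors))

a = np.array([1.0, 0.0, 0.0])
b = np.array([1.0, 1.0, 0.0])        # partially conflicts with a
ra, rb = resolve_conflicts([a, b])
assert abs(ra @ rb) < 1e-9           # conflict removed: now orthogonal
combined = compose_policies([ra, rb], [1.0, 0.5])
```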
| Capability | How IAM Does It | What's Different |
|---|---|---|
| Attribution | Custom JAX transformations extending grad | Simultaneous gradient + attribution computation; vectorized via vmap across all integration steps in parallel |
| Visualization | SAE operations as JAX primitives | Features extracted during JIT-compiled inference; XLA reuses activation buffers, zero overhead |
| Tracing | Integration with XLA's profiling infrastructure | IR-level traces at compilation level, not API-level logs; no Python overhead |
| Orchestration | Operations compose as JAX transformations | Compiler visibility for global optimization; JIT eliminates conditional overhead |
| Policy Steering | Steering vectors as JAX primitives | Compile-time fusion with model layers; multi-policy conflict resolution via symbolic execution |
- Stack: Google compilation stack (JAX, XLA, StableHLO) with MLIR / Mojo / MAX-native foundation
- Target: Code generation models exclusively -- no NLP, audio, video, or other modalities
- Plane: Compute plane (inference and training optimization); data plane is out of scope
- Future: Chain of Thought monitoring, Model Context Protocol (MCP) integration, and multi-agent systems (beyond current dissertation boundaries)
JITANA            ReHAna            IAM               Defense
*-----------------*-----------------*-----------------*
ICSE '17          MobiQuitous '21   ICSE DS '26       Dec 2026
The research lineage traces from hybrid program analysis for Android (JITANA, ReHAna) to compiler-native interpretability for AI systems (IAM) -- a consistent thread of making opaque systems transparent through compiler infrastructure.
- IAM: Interpretable AI with MLIR -- A Compiler-Integrated Framework for Trustworthy Code Generation. Shakthi Bachala and Witawas Srisa-An. In preparation for submission.
- Compiler-Native Policy Steering: MLIR Primitives for Efficient Behavioral Control in Neural Networks. Shakthi Bachala and Witawas Srisa-An. In preparation for submission.
- ReHAna: An Efficient Program Analysis Framework to Uncover Reflective Code in Android. Shakthi Bachala, Yutaka Tsutano, Witawas Srisa-An, Gregg Rothermel, and Jackson Dinh. EAI MobiQuitous 2021.
- JitAna: A Modern Hybrid Program Analysis Framework for Android Platforms. Yutaka Tsutano, Shakthi Bachala, Witawas Srisa-An, Gregg Rothermel, and Jackson Dinh. Journal of Computer Languages, Volume 52, 2019.
- GranDroid: Graph-based Detection of Malicious Network Behaviors in Android Applications. Zhiqiang Li, Jun Sun, Qiben Yan, Witawas Srisa-An, and Shakthi Bachala. SecureComm 2018.
- An Efficient, Robust, and Scalable Approach for Analyzing Interacting Android Apps. Yutaka Tsutano, Shakthi Bachala, Witawas Srisa-An, Gregg Rothermel, and Jackson Dinh. ICSE 2017.
| File | Description |
|---|---|
| shakthi_bachala_research_statement.pdf | Full PhD research statement detailing the IAM framework, architecture, and technical approach |
| poster_icse_brazil.pdf | ICSE 2026 Doctoral Symposium poster presented at Recife, Brazil |
| code_agent_spec.md | Code agent specification (placeholder -- needs re-upload) |
Shakthi Bachala -- shakthi.bachala@huskers.unl.edu
University of Nebraska-Lincoln | Department of Computer Science
ICSE 2026 Doctoral Symposium