
Interpretable AI with MLIR (IAM)

Compiler-Native Hybrid Analysis for Code Models and Agents

Shakthi Bachala | Advisor: Prof. Witawas Srisa-An

Department of Computer Science, University of Nebraska-Lincoln

ICSE 2026 Doctoral Symposium -- Recife, Brazil


"Interpretability as a first-class compilation primitive"


The Problem

Modern AI systems are powerful yet opaque. Understanding, aligning, and auditing their behavior requires multiple specialized tools that operate independently -- each with its own data format, execution model, and performance characteristics. This fragmentation creates severe practical challenges:

| Current Tool | Role | Limitation |
|---|---|---|
| SAELens | Sparse autoencoder features | Eager tensor execution, no compiler visibility |
| Captum | Attribution (integrated gradients) | Separate forward passes, no fusion with inference |
| Petri | Steering / safety auditing | Runtime-only checks, no compile-time guarantees |
| LangChain | Pipeline orchestration | Opaque to the compiler, no global optimization |
| LangSmith | Observability / tracing | API-level logs, not IR-level traces |
| Gradio | Visualization | Output-level displays, no internal circuit visibility |

These tools require multiple redundant forward passes and operate independently. There is no way for a compiler to reason about the full computation graph.


The IAM Vision

IAM unifies static structure and dynamic behavior through an MLIR / Mojo / MAX-native substrate for code models and agents.

The core insight: interpretability operations -- attribution, visualization, tracing, and orchestration -- naturally compose as JAX transformations. IAM treats these as native compiler concerns, enabling the XLA optimization pipeline to automatically fuse, schedule, and lower interpretability primitives alongside model computation.
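The composition idea can be sketched without any JAX machinery. Below is a minimal, dependency-free illustration (all function names are hypothetical, not IAM APIs): an attribution transform wraps a model function the same way `jax.grad` wraps one, so both can flow through a single compilation pipeline.

```python
# Conceptual sketch in plain Python (no JAX dependency).
# `with_attribution` and `finite_diff_grad` are illustrative names only;
# they mimic how a grad-like transformation composes with a model function.

def finite_diff_grad(f, eps=1e-6):
    """Return a function computing a forward finite-difference gradient of f."""
    def grad_f(xs):
        base = f(xs)
        grads = []
        for i in range(len(xs)):
            bumped = list(xs)
            bumped[i] += eps
            grads.append((f(bumped) - base) / eps)
        return grads
    return grad_f

def with_attribution(f):
    """Transform f into one that also reports per-input attributions
    (here: input * gradient, a crude saliency)."""
    g = finite_diff_grad(f)
    def f_and_attr(xs):
        return f(xs), [x * dx for x, dx in zip(xs, g(xs))]
    return f_and_attr

# Toy "model": a weighted sum. Output 5.0, attributions roughly [2.0, 3.0].
model = lambda xs: 2.0 * xs[0] + 3.0 * xs[1]
out, attr = with_attribution(model)([1.0, 1.0])
```

Because the transformed function is itself an ordinary function, a compiler that traces it sees model computation and attribution as one graph, which is the property IAM exploits for fusion.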


Architecture

IAM is a three-tier vertically integrated framework. Each layer builds on the one below through JAX's composable primitive system and XLA's compilation pipeline.

                         APPLICATION LAYER
            +-----------------+-----------------+-----------------+
            |  Orchestration  | Observability   | Visualization   |
            |  Pipeline       | Traces          | Feature Maps    |
            |  DAG Execute    | Metrics         | Attribution     |
            |  Chain Compose  | Logging         | Circuit Display |
            |  .............. | ............... | ............... |
            |  >> LangChain   | >> LangSmith    | >> Gradio       |
            +-----------------+-----------------+-----------------+
                                    |
                              VECTOR LAYER
            +-----------------+-----------------+-----------------+
            | CLT/Features    | Attribution     | Steering/Audit  |
            | Encode/Decode   | IntGrad         | Vector Construct|
            | TopK Activate   | Feature Select  | Policy Apply    |
            | Skip Connect    | Circuit Extract | Safety Verify   |
            | ............... | ............... | ............... |
            | >> SAELens      | >> Captum       | >> Petri        |
            +-----------------+-----------------+-----------------+
              C++ Core            C++ Core            C++ Core
                                    |
                          MLIR COMPILER LAYER
            +-----------------+-----------------+-----------------+
            | CLT Dialect     | Attribution     | Policy Dialect  |
            |                 | Dialect         |                 |
            | iam.clt.encode  | iam.attr.grad   | iam.policy.steer|
            | iam.clt.decode  | iam.attr.path   | iam.policy.     |
            | CLTFusionPass   | AttrCachePass   |       verify    |
            |                 |                 | PolicyProject   |
            |                 |                 |       Pass      |
            +-----------------+-----------------+-----------------+
                                    |
                    StableHLO --> XLA --> TPU / GPU / CPU

Application Layer

User-facing API for orchestration, observability, and visualization. Replaces LangChain, LangSmith, and Gradio with unified interfaces that are structure-aware -- the compiler sees through them.

Vector Layer

Core interpretability primitives implemented in high-performance C++: CLT operations, attribution analysis, and policy steering. Attribution works through compiled structure rather than eager tensors; steering is compiler-mediated rather than limited to runtime checks.

MLIR Compiler Layer

Custom MLIR dialects that represent interpretability as first-class compiler constructs. Optimization passes fuse operations and cache results. The compiler can reason about the full computation graph, fusing interpretability with model execution in a single optimized pipeline.


Hybrid Analysis: Static <--> Dynamic

AI systems fail where static structure and runtime behavior diverge. IAM bridges this gap:

       STATIC                                           DYNAMIC
  +------------------+                            +------------------+
  | module.compile() |    static <--> dynamic      |    nn.Hooks      |
  | Compiled         | <------------------------> | Observed         |
  |  dependency path |   mismatch is the failure   |  execution       |
  | -> safety check  |                            |  bypasses safety |
  | -> output        |                            |  check entirely  |
  +------------------+                            +------------------+

                          BRIDGE LAYER
                      Static / dynamic overlay
                  capture -> overlay -> feedback
| Component | Role |
|---|---|
| IAM Core | Application + Vector + MLIR (the three-tier stack above) |
| Bridge Layer | Static / dynamic overlay connecting compile-time and runtime analysis |
| IAM Hybrid Analysis | Unified static <--> dynamic reasoning about model behavior |
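The capture -> overlay -> feedback loop can be illustrated with a deliberately simple sketch (plain Python; all names are hypothetical). The static side declares the compiled dependency path; the dynamic side records what actually ran; the overlay flags any statically required step the execution bypassed, which is exactly the failure mode pictured above.

```python
# Illustrative sketch of the bridge layer's overlay check.
# STATIC_PATH stands in for a compiled dependency path; the trace stands
# in for hook-captured runtime events. Neither is a real IAM data structure.

STATIC_PATH = ["load_input", "safety_check", "generate_output"]

def overlay(static_path, dynamic_trace):
    """Return the statically required steps missing from the runtime trace."""
    observed = set(dynamic_trace)
    return [step for step in static_path if step not in observed]

# A run whose hooks show the safety check was skipped:
trace = ["load_input", "generate_output"]
violations = overlay(STATIC_PATH, trace)   # ["safety_check"]
```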

Technical Foundation: JAX Vertical Integration

IAM builds on JAX's vertically integrated architecture:

  • JAX Primitives -- Computation through composable primitives transformed by grad, jit, and vmap. IAM defines custom primitives for interpretability that integrate seamlessly. Attribution becomes a natural transformation alongside autodiff.
  • XLA Compilation -- JAX lowers to XLA's HLO enabling aggressive optimization. IAM's primitives lower directly to HLO, allowing the compiler to fuse interpretability operations with model computation.
  • Memory Optimization -- JAX's buffer donation enables sophisticated sharing. Activation buffers for attribution reuse memory across operations, with XLA automatically identifying sharing opportunities.
  • Progressive Lowering -- JAX traces Python to Jaxpr, lowering to StableHLO with IAM's custom operations. StableHLO lowers through MHLO and Linalg to LLVM IR or GPU representations. Specialized passes fuse operations, identify buffer reuse, and exploit sparsity.
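The tracing step in progressive lowering can be demystified with a toy tracer (plain Python; class and variable names are illustrative, not JAX internals): arithmetic operators are overloaded to record a small SSA-style IR from an ordinary Python function, the way JAX traces Python into a Jaxpr before lowering to StableHLO.

```python
# Toy tracer illustrating trace-based IR extraction, the first stage
# of progressive lowering. A sketch only; real Jaxpr tracing is richer.

class Tracer:
    _eqs = []        # recorded equations, in program order
    _counter = 0

    def __init__(self, name):
        self.name = name

    @classmethod
    def fresh(cls, op, a, b):
        """Emit one SSA equation and return a tracer for its result."""
        cls._counter += 1
        out = cls(f"v{cls._counter}")
        cls._eqs.append(f"{out.name} = {op} {a} {b}")
        return out

    def __add__(self, other):
        return Tracer.fresh("add", self.name, getattr(other, "name", other))

    def __mul__(self, other):
        return Tracer.fresh("mul", self.name, getattr(other, "name", other))

def trace(f, n_args):
    """Run f on symbolic arguments and return the recorded IR."""
    Tracer._eqs, Tracer._counter = [], 0
    f(*[Tracer(f"x{i}") for i in range(n_args)])
    return list(Tracer._eqs)

ir = trace(lambda a, b: a * b + a, 2)
# ir == ["v1 = mul x0 x1", "v2 = add v1 x0"]
```

Once the program exists as explicit equations like these, fusion passes, buffer-reuse analysis, and sparsity exploitation become graph rewrites rather than Python-level interventions.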

MLIR Dialects for Interpretability

IAM extends StableHLO with three custom dialects:

CLT Dialect (Sparse Autoencoders)

iam.clt.encode  /  iam.clt.decode  /  CLTFusionPass

Decomposes activations into sparse features as StableHLO operations that JAX's compiler fuses with model inference and lowers to efficient device code.
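A minimal numeric sketch of what encode/decode compute (plain Python with toy weights; in IAM these are StableHLO-level ops fused with inference): project activations into a wider feature space, keep only the top-k features, and reconstruct.

```python
# Toy CLT encode/decode. W_enc, W_dec, and k are invented values
# chosen only to make the sparsity effect visible.

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def topk(xs, k):
    """Zero out all but the k largest entries (TopK activation)."""
    keep = sorted(range(len(xs)), key=lambda i: xs[i], reverse=True)[:k]
    return [x if i in keep else 0.0 for i, x in enumerate(xs)]

W_enc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 2 activations -> 3 features
W_dec = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]     # 3 features -> 2 activations

def clt_encode(x, k=1):
    return topk(matvec(W_enc, x), k)

def clt_decode(f):
    return matvec(W_dec, f)

feats = clt_encode([3.0, -1.0])   # [3.0, 0.0, 0.0]: only the top feature fires
recon = clt_decode(feats)         # [3.0, 0.0]: dominant direction is preserved
```

A fusion pass in the CLT dialect (CLTFusionPass) would aim to run this encode/decode alongside the model layer that produced `x`, instead of in a separate forward pass.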

Attribution Dialect

iam.attr.grad  /  iam.attr.path  /  AttrCachePass

Expresses causal relationships as StableHLO transformations, with algebraic simplification and batched computation through vmap.
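As a concrete reference point, here is what an integrated-gradients-style path attribution computes, sketched in plain Python (the function, baseline, and step count are toy choices, not IAM defaults): the average gradient along the straight-line path from a baseline to the input, scaled by the input delta.

```python
# Illustrative integrated gradients. In IAM the per-step gradients would
# be batched via vmap and fused by XLA rather than looped in Python.

def grad(f, xs, eps=1e-6):
    """Forward finite-difference gradient of f at xs (toy stand-in for autodiff)."""
    base = f(xs)
    return [(f(xs[:i] + [xs[i] + eps] + xs[i + 1:]) - base) / eps
            for i in range(len(xs))]

def integrated_gradients(f, x, baseline, steps=32):
    attr = [0.0] * len(x)
    for s in range(1, steps + 1):
        alpha = s / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = grad(f, point)
        attr = [a + gi for a, gi in zip(attr, g)]
    # average gradient along the path, times (input - baseline)
    return [(xi - b) * a / steps for xi, b, a in zip(x, baseline, attr)]

f = lambda xs: xs[0] * xs[0] + 2.0 * xs[1]        # toy model
attrs = integrated_gradients(f, [2.0, 1.0], [0.0, 0.0])
# Exact values for this f are [4.0, 2.0]; the discretized attrs are close.
```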

Policy Dialect

iam.policy.steer  /  iam.policy.verify  /  PolicyProjectPass

Formalizes path identification as HLO graph analysis and behavioral control as StableHLO primitives that integrate with JAX's transformation system. The lowering pipeline: JAX primitives -> Jaxpr -> StableHLO -> simplification -> MHLO -> LLVM IR or NVVM IR.


Policy Steering for Behavioral Control

Beyond understanding behavior, controlling it is essential for safety-critical AI.

Skip-Transcoder Architecture

Sparse autoencoders extended with residual connections preserving information flow while maintaining interpretability. Skip connections provide direct paths not requiring decomposition, reserving sparse pathways for patterns benefiting from explicit representation. Implemented as JAX primitives lowering to StableHLO; XLA fuses transcoder operations with model layers.
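Numerically, the skip-transcoder computes a sparse encode/decode path plus a direct linear skip path. The sketch below (plain Python, invented weights) shows the shape of the computation; it is not the IAM implementation, which lowers these ops to StableHLO for fusion with model layers.

```python
# Toy skip-transcoder: y = decode(topk(encode(x))) + W_skip @ x.
# The sparse path carries interpretable features; the skip path carries
# information the sparse code drops.

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def topk(xs, k):
    keep = sorted(range(len(xs)), key=lambda i: abs(xs[i]), reverse=True)[:k]
    return [x if i in keep else 0.0 for i, x in enumerate(xs)]

W_enc  = [[1.0, 1.0], [1.0, -1.0]]   # 2 -> 2 feature projection
W_dec  = [[0.5, 0.5], [0.5, -0.5]]   # inverse of W_enc
W_skip = [[0.1, 0.0], [0.0, 0.1]]    # small direct residual path

def skip_transcode(x, k=1):
    sparse = matvec(W_dec, topk(matvec(W_enc, x), k))
    skip = matvec(W_skip, x)
    return [s + r for s, r in zip(sparse, skip)]

y = skip_transcode([1.0, 1.0])   # sparse path gives [1.0, 1.0]; skip adds [0.1, 0.1]
```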

Steering Vector Construction

Vectors in activation space representing desired behavioral changes, added to activations during inference. Declarative policy specifications with automatic feature discovery through contrastive analysis. Statistical analysis identifies differentiating features; causal validation verifies effectiveness. Represented as JAX primitives that JIT-compile with model execution. XLA's fusion passes combine steering with model layers.
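The contrastive construction reduces to a difference of means. The sketch below uses invented activation data and plain Python; in IAM the resulting vector would be a JIT-compiled primitive applied inside the model, not a post-hoc edit.

```python
# Contrastive steering vector: mean activation of examples exhibiting the
# desired behavior, minus mean activation of examples lacking it. The
# vector is then added (scaled) to activations at inference time.

def mean(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def build_steering_vector(pos_acts, neg_acts):
    mp, mn = mean(pos_acts), mean(neg_acts)
    return [p - q for p, q in zip(mp, mn)]

def steer(activation, vector, scale=1.0):
    return [a + scale * v for a, v in zip(activation, vector)]

pos = [[1.0, 0.0], [1.2, 0.2]]   # activations with the desired behavior
neg = [[0.0, 1.0], [0.2, 0.8]]   # activations without it
v = build_steering_vector(pos, neg)        # roughly [1.0, -0.8]
steered = steer([0.5, 0.5], v, scale=0.5)  # roughly [1.0, 0.1]
```

The causal-validation step mentioned above would then check that adding `v` actually shifts model behavior, not just activation statistics.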

Multi-Policy Composition

Multiple vectors combine through tree operations and custom transformations. The compiler reasons about interactions through symbolic execution, resolving conflicts via optimization passes. XLA fuses composed operations. Vectorization through vmap enables parallel policy application.
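One simple conflict-resolution scheme, sketched here as a stand-in for the compiler-mediated analysis (plain Python; this specific Gram-Schmidt policy is an illustrative assumption, not IAM's documented algorithm): orthogonalize each lower-priority steering vector against the higher-priority ones before summing, so a later policy cannot undo an earlier one along its own direction.

```python
# Priority-ordered policy composition via orthogonalization.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project_out(v, basis):
    """Remove from v its components along each (already orthogonal) basis vector."""
    out = list(v)
    for b in basis:
        denom = dot(b, b)
        if denom > 0.0:
            c = dot(out, b) / denom
            out = [o - c * bi for o, bi in zip(out, b)]
    return out

def compose_policies(vectors):
    basis, total = [], [0.0] * len(vectors[0])
    for v in vectors:                     # highest priority first
        v_orth = project_out(v, basis)    # drop overlap with earlier policies
        basis.append(v_orth)
        total = [t + x for t, x in zip(total, v_orth)]
    return total

combined = compose_policies([[1.0, 0.0], [1.0, 1.0]])
# The second vector loses its overlap with the first: combined == [1.0, 1.0]
```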


Unified Interpretability Capabilities

| Capability | How IAM Does It | What's Different |
|---|---|---|
| Attribution | Custom JAX transformations extending grad | Simultaneous gradient + attribution computation; vectorized via vmap across all integration steps in parallel |
| Visualization | SAE operations as JAX primitives | Features extracted during JIT-compiled inference; XLA reuses activation buffers, zero overhead |
| Tracing | Integration with XLA's profiling infrastructure | IR-level traces at compilation level, not API-level logs; no Python overhead |
| Orchestration | Operations compose as JAX transformations | Compiler visibility for global optimization; JIT eliminates conditional overhead |
| Policy Steering | Steering vectors as JAX primitives | Compile-time fusion with model layers; multi-policy conflict resolution via symbolic execution |

Dissertation Scope

  • Stack: Google compilation stack (JAX, XLA, StableHLO) with MLIR / Mojo / MAX-native foundation
  • Target: Code generation models exclusively -- no NLP, audio, video, or other modalities
  • Plane: Compute plane (inference and training optimization); data plane is out of scope
  • Future: Chain of Thought monitoring, Model Context Protocol (MCP) integration, and multi-agent systems (beyond current dissertation boundaries)

Dissertation Evolution

  JITANA          ReHAna              IAM             Defense
    *-----------------*-----------------*-----------------*
  ICSE '17      MobiQuitous '21    ICSE DS '26        Dec 2026

The research lineage traces from hybrid program analysis for Android (JITANA, ReHAna) to compiler-native interpretability for AI systems (IAM) -- a consistent thread of making opaque systems transparent through compiler infrastructure.


Publications

  1. IAM: Interpretable AI with MLIR - A Compiler-Integrated Framework for Trustworthy Code Generation. Shakthi Bachala and Witawas Srisa-An. In preparation for submission.

  2. Compiler-Native Policy Steering: MLIR Primitives for Efficient Behavioral Control in Neural Networks. Shakthi Bachala and Witawas Srisa-An. In preparation for submission.

  3. ReHAna: An Efficient Program Analysis Framework to Uncover Reflective Code in Android. Shakthi Bachala, Yutaka Tsutano, Witawas Srisa-An, Gregg Rothermel, and Jackson Dinh. EAI MobiQuitous 2021.

  4. JitAna: A Modern Hybrid Program Analysis Framework for Android Platforms. Yutaka Tsutano, Shakthi Bachala, Witawas Srisa-An, Gregg Rothermel, and Jackson Dinh. Journal of Computer Languages, Volume 52, 2019.

  5. GranDroid: Graph-based Detection of Malicious Network Behaviors in Android Applications. Zhiqiang Li, Jun Sun, Qiben Yan, Witawas Srisa-an, and Shakthi Bachala. SecureComm 2018.

  6. An Efficient, Robust, and Scalable Approach for Analyzing Interacting Android Apps. Yutaka Tsutano, Shakthi Bachala, Witawas Srisa-An, Gregg Rothermel, and Jackson Dinh. ICSE 2017.


Repository Contents

| File | Description |
|---|---|
| shakthi_bachala_research_statement.pdf | Full PhD research statement detailing the IAM framework, architecture, and technical approach |
| poster_icse_brazil.pdf | ICSE 2026 Doctoral Symposium poster presented at Recife, Brazil |
| code_agent_spec.md | Code agent specification (placeholder -- needs re-upload) |


Shakthi Bachala -- shakthi.bachala@huskers.unl.edu

University of Nebraska-Lincoln | Department of Computer Science

ICSE 2026 Doctoral Symposium
