
Interpretable AI with MLIR (IAM)

Compiler-Native Hybrid Analysis for Code Models and Agents

Shakthi Bachala | Advisor: Prof. Witawas Srisa-An

Department of Computer Science, University of Nebraska-Lincoln

ICSE 2026 Doctoral Symposium -- Recife, Brazil


"Interpretability as a first-class compilation primitive"


The Problem

Modern AI systems are powerful yet opaque. Understanding, aligning, and auditing their behavior requires multiple specialized tools that operate independently -- each with its own data format, execution model, and performance characteristics. This fragmentation creates severe practical challenges:

| Current Tool | Role | Limitation |
|---|---|---|
| SAELens | Sparse autoencoder features | Eager tensor execution, no compiler visibility |
| Captum | Attribution (integrated gradients) | Separate forward passes, no fusion with inference |
| Petri | Steering / safety auditing | Runtime-only checks, no compile-time guarantees |
| LangChain | Pipeline orchestration | Opaque to the compiler, no global optimization |
| LangSmith | Observability / tracing | API-level logs, not IR-level traces |
| Gradio | Visualization | Output-level displays, no internal circuit visibility |

These tools require multiple redundant forward passes and operate independently. There is no way for a compiler to reason about the full computation graph.


The IAM Vision

IAM unifies static structure and dynamic behavior through an MLIR / Mojo / MAX-native substrate for code models and agents.

The core insight: interpretability operations -- attribution, visualization, tracing, and orchestration -- naturally compose as JAX transformations. IAM treats these as native compiler concerns, enabling the XLA optimization pipeline to automatically fuse, schedule, and lower interpretability primitives alongside model computation.
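The composition idea can be sketched without any JAX machinery. Below is a minimal, dependency-free illustration (all function names are hypothetical, not IAM APIs): an attribution transform wraps a model function the same way `jax.grad` wraps one, so both can flow through a single compilation pipeline.

```python
# Conceptual sketch in plain Python (no JAX dependency).
# `with_attribution` and `finite_diff_grad` are illustrative names only;
# they mimic how a grad-like transformation composes with a model function.

def finite_diff_grad(f, eps=1e-6):
    """Return a function computing a forward finite-difference gradient of f."""
    def grad_f(xs):
        base = f(xs)
        grads = []
        for i in range(len(xs)):
            bumped = list(xs)
            bumped[i] += eps
            grads.append((f(bumped) - base) / eps)
        return grads
    return grad_f

def with_attribution(f):
    """Transform f into one that also reports per-input attributions
    (here: input * gradient, a crude saliency)."""
    g = finite_diff_grad(f)
    def f_and_attr(xs):
        return f(xs), [x * dx for x, dx in zip(xs, g(xs))]
    return f_and_attr

# Toy "model": a weighted sum. Output 5.0, attributions roughly [2.0, 3.0].
model = lambda xs: 2.0 * xs[0] + 3.0 * xs[1]
out, attr = with_attribution(model)([1.0, 1.0])
```

Because the transformed function is itself an ordinary function, a compiler that traces it sees model computation and attribution as one graph, which is the property IAM exploits for fusion.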


Architecture

IAM is a three-tier vertically integrated framework. Each layer builds on the one below through JAX's composable primitive system and XLA's compilation pipeline.

                         APPLICATION LAYER
            +-----------------+-----------------+-----------------+
            |  Orchestration  | Observability   | Visualization   |
            |  Pipeline       | Traces          | Feature Maps    |
            |  DAG Execute    | Metrics         | Attribution     |
            |  Chain Compose  | Logging         | Circuit Display |
            |  .............. | ............... | ............... |
            |  >> LangChain   | >> LangSmith    | >> Gradio       |
            +-----------------+-----------------+-----------------+
                                    |
                              VECTOR LAYER
            +-----------------+-----------------+-----------------+
            | CLT/Features    | Attribution     | Steering/Audit  |
            | Encode/Decode   | IntGrad         | Vector Construct|
            | TopK Activate   | Feature Select  | Policy Apply    |
            | Skip Connect    | Circuit Extract | Safety Verify   |
            | ............... | ............... | ............... |
            | >> SAELens      | >> Captum       | >> Petri        |
            +-----------------+-----------------+-----------------+
              C++ Core            C++ Core            C++ Core
                                    |
                          MLIR COMPILER LAYER
            +-----------------+-----------------+-----------------+
            | CLT Dialect     | Attribution     | Policy Dialect  |
            |                 | Dialect         |                 |
            | iam.clt.encode  | iam.attr.grad   | iam.policy.steer|
            | iam.clt.decode  | iam.attr.path   | iam.policy.     |
            | CLTFusionPass   | AttrCachePass   |       verify    |
            |                 |                 | PolicyProject   |
            |                 |                 |       Pass      |
            +-----------------+-----------------+-----------------+
                                    |
                    StableHLO --> XLA --> TPU / GPU / CPU

Application Layer

User-facing API for orchestration, observability, and visualization. Replaces LangChain, LangSmith, and Gradio with unified interfaces that are structure-aware -- the compiler sees through them.

Vector Layer

Core interpretability primitives implemented in high-performance C++: CLT operations, attribution analysis, and policy steering. Attribution works through compiled structure rather than eager tensors; steering is compiler-mediated rather than limited to runtime checks.

MLIR Compiler Layer

Custom MLIR dialects that represent interpretability as first-class compiler constructs. Optimization passes fuse operations and cache results. The compiler can reason about the full computation graph, fusing interpretability with model execution in a single optimized pipeline.


Hybrid Analysis: Static <--> Dynamic

AI systems fail where static structure and runtime behavior diverge. IAM bridges this gap:

       STATIC                                           DYNAMIC
  +------------------+                            +------------------+
  | module.compile() |    static <--> dynamic      |    nn.Hooks      |
  | Compiled         | <------------------------> | Observed         |
  |  dependency path |   mismatch is the failure   |  execution       |
  | -> safety check  |                            |  bypasses safety |
  | -> output        |                            |  check entirely  |
  +------------------+                            +------------------+

                          BRIDGE LAYER
                      Static / dynamic overlay
                  capture -> overlay -> feedback
| Component | Role |
|---|---|
| IAM Core | Application + Vector + MLIR (the three-tier stack above) |
| Bridge Layer | Static / dynamic overlay connecting compile-time and runtime analysis |
| IAM Hybrid Analysis | Unified static <--> dynamic reasoning about model behavior |
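The capture -> overlay -> feedback loop can be illustrated with a deliberately simple sketch (plain Python; all names are hypothetical). The static side declares the compiled dependency path; the dynamic side records what actually ran; the overlay flags any statically required step the execution bypassed, which is exactly the failure mode pictured above.

```python
# Illustrative sketch of the bridge layer's overlay check.
# STATIC_PATH stands in for a compiled dependency path; the trace stands
# in for hook-captured runtime events. Neither is a real IAM data structure.

STATIC_PATH = ["load_input", "safety_check", "generate_output"]

def overlay(static_path, dynamic_trace):
    """Return the statically required steps missing from the runtime trace."""
    observed = set(dynamic_trace)
    return [step for step in static_path if step not in observed]

# A run whose hooks show the safety check was skipped:
trace = ["load_input", "generate_output"]
violations = overlay(STATIC_PATH, trace)   # ["safety_check"]
```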

Technical Foundation: JAX Vertical Integration

IAM builds on JAX's vertically integrated architecture:

  • JAX Primitives -- Computation through composable primitives transformed by grad, jit, and vmap. IAM defines custom primitives for interpretability that integrate seamlessly. Attribution becomes a natural transformation alongside autodiff.
  • XLA Compilation -- JAX lowers to XLA's HLO enabling aggressive optimization. IAM's primitives lower directly to HLO, allowing the compiler to fuse interpretability operations with model computation.
  • Memory Optimization -- JAX's buffer donation enables sophisticated sharing. Activation buffers for attribution reuse memory across operations, with XLA automatically identifying sharing opportunities.
  • Progressive Lowering -- JAX traces Python to Jaxpr, lowering to StableHLO with IAM's custom operations. StableHLO lowers through MHLO and Linalg to LLVM IR or GPU representations. Specialized passes fuse operations, identify buffer reuse, and exploit sparsity.
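The tracing step in progressive lowering can be demystified with a toy tracer (plain Python; class and variable names are illustrative, not JAX internals): arithmetic operators are overloaded to record a small SSA-style IR from an ordinary Python function, the way JAX traces Python into a Jaxpr before lowering to StableHLO.

```python
# Toy tracer illustrating trace-based IR extraction, the first stage
# of progressive lowering. A sketch only; real Jaxpr tracing is richer.

class Tracer:
    _eqs = []        # recorded equations, in program order
    _counter = 0

    def __init__(self, name):
        self.name = name

    @classmethod
    def fresh(cls, op, a, b):
        """Emit one SSA equation and return a tracer for its result."""
        cls._counter += 1
        out = cls(f"v{cls._counter}")
        cls._eqs.append(f"{out.name} = {op} {a} {b}")
        return out

    def __add__(self, other):
        return Tracer.fresh("add", self.name, getattr(other, "name", other))

    def __mul__(self, other):
        return Tracer.fresh("mul", self.name, getattr(other, "name", other))

def trace(f, n_args):
    """Run f on symbolic arguments and return the recorded IR."""
    Tracer._eqs, Tracer._counter = [], 0
    f(*[Tracer(f"x{i}") for i in range(n_args)])
    return list(Tracer._eqs)

ir = trace(lambda a, b: a * b + a, 2)
# ir == ["v1 = mul x0 x1", "v2 = add v1 x0"]
```

Once the program exists as explicit equations like these, fusion passes, buffer-reuse analysis, and sparsity exploitation become graph rewrites rather than Python-level interventions.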

MLIR Dialects for Interpretability

IAM extends StableHLO with three custom dialects:

CLT Dialect (Sparse Autoencoders)

iam.clt.encode  /  iam.clt.decode  /  CLTFusionPass

Decomposes activations into sparse features as StableHLO operations that JAX's compiler fuses with model inference and lowers to efficient device code.
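A minimal numeric sketch of what encode/decode compute (plain Python with toy weights; in IAM these are StableHLO-level ops fused with inference): project activations into a wider feature space, keep only the top-k features, and reconstruct.

```python
# Toy CLT encode/decode. W_enc, W_dec, and k are invented values
# chosen only to make the sparsity effect visible.

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def topk(xs, k):
    """Zero out all but the k largest entries (TopK activation)."""
    keep = sorted(range(len(xs)), key=lambda i: xs[i], reverse=True)[:k]
    return [x if i in keep else 0.0 for i, x in enumerate(xs)]

W_enc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 2 activations -> 3 features
W_dec = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]     # 3 features -> 2 activations

def clt_encode(x, k=1):
    return topk(matvec(W_enc, x), k)

def clt_decode(f):
    return matvec(W_dec, f)

feats = clt_encode([3.0, -1.0])   # [3.0, 0.0, 0.0]: only the top feature fires
recon = clt_decode(feats)         # [3.0, 0.0]: dominant direction is preserved
```

A fusion pass in the CLT dialect (CLTFusionPass) would aim to run this encode/decode alongside the model layer that produced `x`, instead of in a separate forward pass.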

Attribution Dialect

iam.attr.grad  /  iam.attr.path  /  AttrCachePass

Expresses causal relationships as StableHLO transformations, with algebraic simplification and batched computation through vmap.
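As a concrete reference point, here is what an integrated-gradients-style path attribution computes, sketched in plain Python (the function, baseline, and step count are toy choices, not IAM defaults): the average gradient along the straight-line path from a baseline to the input, scaled by the input delta.

```python
# Illustrative integrated gradients. In IAM the per-step gradients would
# be batched via vmap and fused by XLA rather than looped in Python.

def grad(f, xs, eps=1e-6):
    """Forward finite-difference gradient of f at xs (toy stand-in for autodiff)."""
    base = f(xs)
    return [(f(xs[:i] + [xs[i] + eps] + xs[i + 1:]) - base) / eps
            for i in range(len(xs))]

def integrated_gradients(f, x, baseline, steps=32):
    attr = [0.0] * len(x)
    for s in range(1, steps + 1):
        alpha = s / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = grad(f, point)
        attr = [a + gi for a, gi in zip(attr, g)]
    # average gradient along the path, times (input - baseline)
    return [(xi - b) * a / steps for xi, b, a in zip(x, baseline, attr)]

f = lambda xs: xs[0] * xs[0] + 2.0 * xs[1]        # toy model
attrs = integrated_gradients(f, [2.0, 1.0], [0.0, 0.0])
# Exact values for this f are [4.0, 2.0]; the discretized attrs are close.
```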

Policy Dialect

iam.policy.steer  /  iam.policy.verify  /  PolicyProjectPass

Formalizes path identification as HLO graph analysis and behavioral control as StableHLO primitives that integrate with JAX's transformation system. The lowering pipeline: JAX primitives -> Jaxpr -> StableHLO -> simplification -> MHLO -> LLVM IR or NVVM IR.


Policy Steering for Behavioral Control

Beyond understanding behavior, controlling it is essential for safety-critical AI.

Skip-Transcoder Architecture

Sparse autoencoders extended with residual connections preserving information flow while maintaining interpretability. Skip connections provide direct paths not requiring decomposition, reserving sparse pathways for patterns benefiting from explicit representation. Implemented as JAX primitives lowering to StableHLO; XLA fuses transcoder operations with model layers.
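Numerically, the skip-transcoder computes a sparse encode/decode path plus a direct linear skip path. The sketch below (plain Python, invented weights) shows the shape of the computation; it is not the IAM implementation, which lowers these ops to StableHLO for fusion with model layers.

```python
# Toy skip-transcoder: y = decode(topk(encode(x))) + W_skip @ x.
# The sparse path carries interpretable features; the skip path carries
# information the sparse code drops.

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def topk(xs, k):
    keep = sorted(range(len(xs)), key=lambda i: abs(xs[i]), reverse=True)[:k]
    return [x if i in keep else 0.0 for i, x in enumerate(xs)]

W_enc  = [[1.0, 1.0], [1.0, -1.0]]   # 2 -> 2 feature projection
W_dec  = [[0.5, 0.5], [0.5, -0.5]]   # inverse of W_enc
W_skip = [[0.1, 0.0], [0.0, 0.1]]    # small direct residual path

def skip_transcode(x, k=1):
    sparse = matvec(W_dec, topk(matvec(W_enc, x), k))
    skip = matvec(W_skip, x)
    return [s + r for s, r in zip(sparse, skip)]

y = skip_transcode([1.0, 1.0])   # sparse path gives [1.0, 1.0]; skip adds [0.1, 0.1]
```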

Steering Vector Construction

Vectors in activation space representing desired behavioral changes, added to activations during inference. Declarative policy specifications with automatic feature discovery through contrastive analysis. Statistical analysis identifies differentiating features; causal validation verifies effectiveness. Represented as JAX primitives that JIT-compile with model execution. XLA's fusion passes combine steering with model layers.
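The contrastive construction reduces to a difference of means. The sketch below uses invented activation data and plain Python; in IAM the resulting vector would be a JIT-compiled primitive applied inside the model, not a post-hoc edit.

```python
# Contrastive steering vector: mean activation of examples exhibiting the
# desired behavior, minus mean activation of examples lacking it. The
# vector is then added (scaled) to activations at inference time.

def mean(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def build_steering_vector(pos_acts, neg_acts):
    mp, mn = mean(pos_acts), mean(neg_acts)
    return [p - q for p, q in zip(mp, mn)]

def steer(activation, vector, scale=1.0):
    return [a + scale * v for a, v in zip(activation, vector)]

pos = [[1.0, 0.0], [1.2, 0.2]]   # activations with the desired behavior
neg = [[0.0, 1.0], [0.2, 0.8]]   # activations without it
v = build_steering_vector(pos, neg)        # roughly [1.0, -0.8]
steered = steer([0.5, 0.5], v, scale=0.5)  # roughly [1.0, 0.1]
```

The causal-validation step mentioned above would then check that adding `v` actually shifts model behavior, not just activation statistics.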

Multi-Policy Composition

Multiple vectors combine through tree operations and custom transformations. The compiler reasons about interactions through symbolic execution, resolving conflicts via optimization passes. XLA fuses composed operations. Vectorization through vmap enables parallel policy application.
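One simple conflict-resolution scheme, sketched here as a stand-in for the compiler-mediated analysis (plain Python; this specific Gram-Schmidt policy is an illustrative assumption, not IAM's documented algorithm): orthogonalize each lower-priority steering vector against the higher-priority ones before summing, so a later policy cannot undo an earlier one along its own direction.

```python
# Priority-ordered policy composition via orthogonalization.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project_out(v, basis):
    """Remove from v its components along each (already orthogonal) basis vector."""
    out = list(v)
    for b in basis:
        denom = dot(b, b)
        if denom > 0.0:
            c = dot(out, b) / denom
            out = [o - c * bi for o, bi in zip(out, b)]
    return out

def compose_policies(vectors):
    basis, total = [], [0.0] * len(vectors[0])
    for v in vectors:                     # highest priority first
        v_orth = project_out(v, basis)    # drop overlap with earlier policies
        basis.append(v_orth)
        total = [t + x for t, x in zip(total, v_orth)]
    return total

combined = compose_policies([[1.0, 0.0], [1.0, 1.0]])
# The second vector loses its overlap with the first: combined == [1.0, 1.0]
```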


Unified Interpretability Capabilities

| Capability | How IAM Does It | What's Different |
|---|---|---|
| Attribution | Custom JAX transformations extending grad | Simultaneous gradient + attribution computation; vectorized via vmap across all integration steps in parallel |
| Visualization | SAE operations as JAX primitives | Features extracted during JIT-compiled inference; XLA reuses activation buffers, zero overhead |
| Tracing | Integration with XLA's profiling infrastructure | IR-level traces at compilation level, not API-level logs; no Python overhead |
| Orchestration | Operations compose as JAX transformations | Compiler visibility for global optimization; JIT eliminates conditional overhead |
| Policy Steering | Steering vectors as JAX primitives | Compile-time fusion with model layers; multi-policy conflict resolution via symbolic execution |

Dissertation Scope

  • Stack: Google compilation stack (JAX, XLA, StableHLO) with MLIR / Mojo / MAX-native foundation
  • Target: Code generation models exclusively -- no NLP, audio, video, or other modalities
  • Plane: Compute plane (inference and training optimization); data plane is out of scope
  • Future: Chain of Thought monitoring, Model Context Protocol (MCP) integration, and multi-agent systems (beyond current dissertation boundaries)

Dissertation Evolution

  JITANA          ReHAna              IAM             Defense
    *-----------------*-----------------*-----------------*
  ICSE '17      MobiQuitous '21    ICSE DS '26        Dec 2026

The research lineage traces from hybrid program analysis for Android (JITANA, ReHAna) to compiler-native interpretability for AI systems (IAM) -- a consistent thread of making opaque systems transparent through compiler infrastructure.


Publications

  1. IAM: Interpretable AI with MLIR - A Compiler-Integrated Framework for Trustworthy Code Generation. Shakthi Bachala and Witawas Srisa-An. In preparation for submission.

  2. Compiler-Native Policy Steering: MLIR Primitives for Efficient Behavioral Control in Neural Networks. Shakthi Bachala and Witawas Srisa-An. In preparation for submission.

  3. ReHAna: An Efficient Program Analysis Framework to Uncover Reflective Code in Android. Shakthi Bachala, Yutaka Tsutano, Witawas Srisa-An, Gregg Rothermel, and Jackson Dinh. EAI MobiQuitous 2021.

  4. JitAna: A Modern Hybrid Program Analysis Framework for Android Platforms. Yutaka Tsutano, Shakthi Bachala, Witawas Srisa-An, Gregg Rothermel, and Jackson Dinh. Journal of Computer Languages, Volume 52, 2019.

  5. GranDroid: Graph-based Detection of Malicious Network Behaviors in Android Applications. Zhiqiang Li, Jun Sun, Qiben Yan, Witawas Srisa-an, and Shakthi Bachala. SecureComm 2018.

  6. An Efficient, Robust, and Scalable Approach for Analyzing Interacting Android Apps. Yutaka Tsutano, Shakthi Bachala, Witawas Srisa-An, Gregg Rothermel, and Jackson Dinh. ICSE 2017.


Repository Contents

| File | Description |
|---|---|
| shakthi_bachala_research_statement.pdf | Full PhD research statement detailing the IAM framework, architecture, and technical approach |
| poster_icse_brazil.pdf | ICSE 2026 Doctoral Symposium poster presented at Recife, Brazil |
| code_agent_spec.md | Code agent specification (placeholder -- needs re-upload) |


Shakthi Bachala -- shakthi.bachala@huskers.unl.edu

University of Nebraska-Lincoln | Department of Computer Science

ICSE 2026 Doctoral Symposium
