
RFC: Support for Inference-Layer Expert Steering (Extension to ail Spec) #5

@AlexChesser

Description


Core Objective:
Develop an extension for the ail specification that allows the runner to influence Mixture-of-Experts (MoE) routing at the inference layer, as theorized in arXiv:2509.09660 (SteerMoE). This moves ail from "Top-Down" prompt orchestration to "Direct Neuromodulation" of the model’s internal reasoning experts.

Technical Context for the Issue:

  • The Problem: Prompting alone is a "leaky" abstraction for controlling model behavior. MoE models often "flicker" between experts, leading to inconsistent reasoning or "alignment faking."
  • The Solution: Implement a steering block in the ail YAML that allows the runner to apply logit-biasing to specific experts during the forward pass.
  • Safety & Compliance: Frame this as a Faithfulness & Determinism feature. Steering should be used to reinforce "Logic" and "Verification" experts, ensuring the model remains grounded in the provided context.

Requested Issue Sections:

  1. Proposed Spec Syntax:

    • Introduce a steering block nested within steps.
    • Include fields for activate, suppress, and intensity.
    • Define a requirement_level (e.g., strict, flexible) to handle non-MoE or unsupported models.
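A hypothetical sketch of what this block could look like in an ail step. Field names follow the RFC text (activate, suppress, intensity, requirement_level); the step name, prompt, intent labels, and intensity scale are all illustrative, not final syntax:

```yaml
steps:
  - name: verify_claims
    prompt: "Check each claim against the provided context."
    steering:
      activate: [reasoning, verification]   # human-readable intents, resolved via expert_map.json
      suppress: [creative_writing]
      intensity: 0.7                        # bias strength; scale (e.g. 0.0-1.0) to be defined
      requirement_level: flexible           # strict | flexible
```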
  2. Expert Mapping Strategy:

    • Propose a sidecar expert_map.json standard. Since Expert IDs are model-specific, ail needs a translation layer that maps human-readable intents (e.g., reasoning, code_generation) to specific expert indices.
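One possible shape for the sidecar file, mapping intents to per-layer expert indices. All model IDs, layer numbers, and expert indices below are made up for illustration; real values would be discovered per model:

```json
{
  "model_id": "example-moe-8x7b",
  "intents": {
    "reasoning":       { "layers": [12, 13], "experts": [3, 5] },
    "code_generation": { "layers": [20],     "experts": [1] },
    "verification":    { "layers": [24, 25], "experts": [2, 6] }
  }
}
```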
  3. Runner Requirements (Rust/C++):

    • Outline the need for hooks in the inference engine (e.g., llama.cpp or candle) to intercept the Gate/Router logits before the top-k selection.
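A minimal Rust sketch of what such a hook might do with the router logits once intercepted. This is not llama.cpp or candle API; the function names and the additive-bias scheme are assumptions for illustration:

```rust
// Hypothetical: bias MoE router logits before top-k expert selection.
// `activate`/`suppress` hold expert indices for the current layer;
// `intensity` is the bias magnitude from the steering block.
fn steer_router_logits(logits: &mut [f32], activate: &[usize], suppress: &[usize], intensity: f32) {
    for &e in activate {
        if let Some(l) = logits.get_mut(e) {
            *l += intensity; // push the gate toward this expert
        }
    }
    for &e in suppress {
        if let Some(l) = logits.get_mut(e) {
            *l -= intensity; // push the gate away from this expert
        }
    }
}

/// Plain top-k over the (possibly steered) logits, as the router would do.
fn top_k(logits: &[f32], k: usize) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..logits.len()).collect();
    idx.sort_by(|&a, &b| logits[b].partial_cmp(&logits[a]).unwrap());
    idx.truncate(k);
    idx
}

fn main() {
    let mut logits = vec![0.1, 0.9, 0.5, 0.2];
    // Boost expert 3, suppress expert 1.
    steer_router_logits(&mut logits, &[3], &[1], 2.0);
    println!("{:?}", top_k(&logits, 2)); // steered gate now prefers expert 3
}
```

The real hook would sit between the gate projection and the softmax/top-k inside the engine's MoE layer, which is why engine-side changes are unavoidable.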
  4. Error Handling & Validation:

    • What happens when a strict steering requirement targets a dense (non-MoE) or otherwise unsupported model?
    • Define the "Degraded Mode" in which the runner falls back to standard prompting when steering is unavailable.
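The decision table above could be as small as one match. A sketch under the assumption that `requirement_level` has exactly the two values named in the RFC (type and function names are illustrative):

```rust
// Hypothetical requirement_level handling when the target model
// cannot honor a steering block (e.g. a dense, non-MoE model).
#[derive(Debug, PartialEq)]
enum RequirementLevel { Strict, Flexible }

#[derive(Debug, PartialEq)]
enum SteeringMode {
    Active,         // MoE model with hooks available: apply logit biases
    DegradedPrompt, // fall back to plain prompting and log the downgrade
}

fn resolve_steering(level: RequirementLevel, model_supports_steering: bool)
    -> Result<SteeringMode, String>
{
    match (model_supports_steering, level) {
        (true, _) => Ok(SteeringMode::Active),
        (false, RequirementLevel::Strict) =>
            Err("steering is 'strict' but the model is dense/unsupported".into()),
        (false, RequirementLevel::Flexible) => Ok(SteeringMode::DegradedPrompt),
    }
}

fn main() {
    // A dense model with a flexible requirement degrades to plain prompting.
    let mode = resolve_steering(RequirementLevel::Flexible, false);
    println!("{:?}", mode); // Ok(DegradedPrompt)
}
```

Failing fast on `strict` keeps behavior deterministic: a pipeline that depends on steering never silently runs unsteered.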
  5. Safety Guardrails:

    • Explicitly mention that steering can be used to enforce safety experts in high-stakes loops, providing a verifiable "Neuro-Audit" log of the model's internal state.
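The "Neuro-Audit" log could be a per-step record of the biases requested and the experts the router actually selected. A hypothetical record shape (all field names and values illustrative):

```json
{
  "step": "verify_claims",
  "steering_mode": "active",
  "biases": { "activate": [2, 6], "suppress": [], "intensity": 0.7 },
  "selected_experts": { "layer_24": [2, 6], "layer_25": [6, 1] }
}
```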


Labels

    core: issue applies to the application core, typically around altering YAML grammar or behavior
    enhancement: new feature or request
    question: further information is requested
    spec
