VOLTA — Versatile Orchestration Layer for Tiered AI
A Swift framework that resolves at runtime which AI model to use — on-device (Apple Intelligence) or cloud — with automatic fallback, privacy disclosure, and multi-turn conversations. The app asks for a response; Volta picks the right model based on availability, preference, context window, and privacy policy.
This is not an agent framework: it owns no sessions and no conversations. Its only job is model resolution.
- The acronym says what it does: a Versatile Orchestration Layer that manages Tiered AI — the tiers being both the capability tiers it ships (iOS 26.0 base → 26.4 token-aware) and the fallback chain it walks at runtime (on-device → cloud).
- Alessandro Volta invented the battery — a stack of cells where, when one layer alone isn't enough, the stack delivers the power. That is exactly the fallback chain.
| OS | What works |
|---|---|
| iOS / macOS 26.0+ | The whole core: fallback chain, typed errors, privacy disclosure, multi-turn conversations, vendor-agnostic developer key (OpenAI / Claude / Gemini), optional UI components. Context handling is reactive (error → fallback). |
| iOS / macOS 26.4+ | The token-aware tier lights up on its own: exact on-device token counting, automatic context-window pre-flight, contextUsage to know how full the window is. |
Requirements: Swift 6.2, and Xcode 26.4 or newer to build — the 26.4
token-counting API the token-aware tier references is declared only in the
26.4 SDK, so older toolchains (e.g. a CI runner pinned to Xcode 26.0.x)
fail to compile the package even though the #available gate keeps it
running fine on iOS/macOS 26.0+ at runtime. Build requirement and
deployment target are independent: build with 26.4+, deploy from 26.0.
The on-device model needs a device with Apple Intelligence (detected at
runtime: if absent, Volta excludes it and explains why).
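The split between build requirement and deployment target can be illustrated with a runtime availability check. This is a minimal sketch, not VoltaSDK code: `ContextTier` and `currentContextTier()` are hypothetical names introduced only for this example.

```swift
// Hypothetical helper, not part of VoltaSDK: reports which
// context-handling tier the running OS supports.
enum ContextTier: String {
    case reactive    // 26.0 to 26.3: react to a context-window error
    case tokenAware  // 26.4+: count tokens before sending
}

func currentContextTier() -> ContextTier {
    // The condition is evaluated at runtime. Any 26.4-only symbol used
    // inside the `if` body must still be declared in the SDK the package
    // is compiled against, which is why the build needs Xcode 26.4+.
    if #available(iOS 26.4, macOS 26.4, *) {
        return .tokenAware
    } else {
        return .reactive
    }
}

print(currentContextTier().rawValue)
```

The same binary therefore takes the reactive branch on 26.0 and the token-aware branch on 26.4, with no rebuild.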
From a git repository:
dependencies: [
    .package(url: "https://github.com/deruloop/VoltaSDK.git", from: "0.3.4")
]

In Xcode: File → Add Package Dependencies… → paste the URL → add the
VoltaSDK product to your app target. Add VoltaSDKUI only if you
want the ready-made SwiftUI components: the core is fully usable without any
UI.
Local package (same machine): File → Add Package Dependencies… → Add Local… → select the package folder. Note: a local dependency always uses the working copy; version tags don't apply.
The current version is 0.3.4 (see CHANGELOG.md). VoltaSDK is in active development: 0.x minor versions may evolve the API; 1.0.0 will mark the complete feature set.
The developer key is vendor-agnostic: the same slot accepts an OpenAI,
Anthropic (Claude), or Google (Gemini) key, and the vendor is auto-detected
from the key format (sk-ant-… → Claude, AIza… → Gemini, sk-… → OpenAI).
import VoltaSDK
AIOrchestrator.configure {
    $0.enableOnDevice = true
    // Works with an OpenAI, Claude, or Gemini key (auto-detected):
    $0.developerKey = Bundle.main.object(forInfoDictionaryKey: "AI_API_KEY") as? String
    // Optional. The model name belongs to the key's vendor; nil = the
    // vendor's default (gpt-4o-mini / claude-opus-4-8 / gemini-2.5-flash).
    $0.developerKeyModel = nil
    $0.preference = .preferOnDevice // on-device, then developer key
    $0.privacyDisclosure = .notify { downgrade in
        print("Response generated by \(downgrade.provider) (\(downgrade.to))")
    }
}

Inject the developerKey as an Xcode secret (.xcconfig → Info.plist);
never write it in source code.
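The prefix rule for vendor detection (sk-ant-… → Claude, AIza… → Gemini, sk-… → OpenAI) can be sketched as follows. This is an illustration under assumptions, not the SDK's implementation: the `Vendor` enum and `detectVendor(fromKey:)` are hypothetical names, and only the three prefixes come from the text above.

```swift
// Hypothetical stand-in for the SDK's vendor type.
enum Vendor: String {
    case openAI, claude, gemini
}

// Maps a key to a vendor by its prefix, or nil if no prefix matches.
func detectVendor(fromKey key: String) -> Vendor? {
    // Order matters: "sk-ant-" also starts with "sk-", so the more
    // specific Anthropic prefix has to be tested before the OpenAI one.
    if key.hasPrefix("sk-ant-") { return .claude }
    if key.hasPrefix("AIza") { return .gemini }
    if key.hasPrefix("sk-") { return .openAI }
    return nil
}

// Placeholder strings, not real keys.
print(detectVendor(fromKey: "sk-ant-placeholder")?.rawValue ?? "unknown")
```

Checking the longer prefix first is the only subtle point: reversing the first and third tests would classify every Claude key as OpenAI.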
Where to find current model names for developerKeyModel (also linked from
CloudVendor.modelDocumentationURL and inside the demo):
| Vendor | Model catalog |
|---|---|
| OpenAI | https://platform.openai.com/docs/models |
| Anthropic (Claude) | https://platform.claude.com/docs/en/about-claude/models/overview |
| Google (Gemini) | https://ai.google.dev/gemini-api/docs/models |
let answer = try await AIOrchestrator.active.respond(
    to: "Plan a weekend in Rome",
    instructions: "You are a concise travel expert."
)
// Or, with provenance (to show who answered):
let response = try await AIOrchestrator.active.respondDetailed(to: "…")
print(response.text, response.provider, response.privacyLevel)

Volta is stateless: it remembers nothing between calls. The conversation belongs to the app, which passes it with every call:
var history: [ChatTurn] = []
let first = try await kit.respond(to: "Plan a weekend")
history += [.user("Plan a weekend"), .assistant(first)]
// The follow-up works because the history travels with the call —
// even if the fallback switched provider in the meantime.
let second = try await kit.respond(to: "Change day 2", history: history)

Every call is self-contained: if the preferred provider becomes unavailable mid-conversation, the next provider in the chain receives the same history and the conversation continues seamlessly (with the configured privacy disclosure). When and how to trim the history remains the app's choice.
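Why a stateless call survives a provider switch can be shown with a toy fallback chain. Everything below is an illustrative stand-in (`Turn`, `Provider`, `respondWithFallback`), not VoltaSDK's types or its resolution logic; the point is only that every provider in the chain is handed the same caller-owned history.

```swift
// Illustrative stand-ins, not VoltaSDK types.
enum Turn: Equatable {
    case user(String)
    case assistant(String)
}

struct ProviderUnavailable: Error {}

struct Provider {
    let name: String
    let isAvailable: Bool

    // Fake generation: reports how many prior turns it was given.
    func respond(to prompt: String, history: [Turn]) throws -> String {
        guard isAvailable else { throw ProviderUnavailable() }
        return "\(name) answered '\(prompt)' with \(history.count) prior turns"
    }
}

// Walks the chain in order. No provider holds state: each attempt
// receives the same prompt and the same history from the caller.
func respondWithFallback(to prompt: String, history: [Turn], chain: [Provider]) throws -> String {
    for provider in chain {
        if let text = try? provider.respond(to: prompt, history: history) {
            return text
        }
    }
    throw ProviderUnavailable()
}

let history: [Turn] = [.user("Plan a weekend"), .assistant("Day 1 ..., day 2 ...")]
let chain = [
    Provider(name: "on-device", isAvailable: false), // dropped out mid-conversation
    Provider(name: "cloud", isAvailable: true),
]
let reply = try respondWithFallback(to: "Change day 2", history: history, chain: chain)
print(reply)
```

Because the history is an argument rather than session state, the second provider sees both earlier turns even though it produced neither of them.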
On iOS / macOS 26.4+, Volta runs an automatic pre-flight: if it knows a call cannot fit a provider's context window, it skips that provider instead of paying for a generation that is bound to fail. On 26.0–26.3, on-device token counting isn't available, so only the reactive behavior (error → fallback) remains.
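The pre-flight idea can be sketched as a filter over the chain. This is a simplified model under assumptions: `Candidate`, `preflight`, and the window sizes are invented for the example, and the token count is passed in rather than computed.

```swift
// Illustrative stand-in; the window sizes used below are invented.
struct Candidate {
    let name: String
    let contextWindow: Int   // maximum tokens the provider accepts
    let canCountTokens: Bool // false models on-device on 26.0 to 26.3
}

// Returns the providers still worth trying, in chain order. A provider is
// skipped only when an exact count proves the request cannot fit. Without
// a count it stays in the chain and an overflow is handled reactively.
func preflight(_ chain: [Candidate], requestTokens: Int) -> [Candidate] {
    chain.filter { candidate in
        guard candidate.canCountTokens else { return true }
        return requestTokens <= candidate.contextWindow
    }
}

let candidates = [
    Candidate(name: "on-device", contextWindow: 4_096, canCountTokens: true),
    Candidate(name: "cloud", contextWindow: 128_000, canCountTokens: true),
]
let usable = preflight(candidates, requestTokens: 6_000)
print(usable.map(\.name)) // ["cloud"]
```

Keeping uncountable providers in the chain mirrors the reactive tier: nothing is excluded on a guess.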
if let usage = await kit.contextUsage(history: history), usage.fraction > 0.8 {
    // Up to the app: trim the oldest turns, or summarize them.
}

usage is nil when the resolved provider can't count (an estimate is never
passed off as a count).
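One way an app might implement the "trim the oldest turns" option is sketched below. The helper `trimmed(_:budget:tokens:)` is hypothetical and generic over the turn type; how tokens are counted is left to the caller, and the character count used in the usage lines is only a placeholder, not a real token count.

```swift
// Drops the oldest elements until the remaining ones fit the budget.
// `tokens` is supplied by the caller; this helper does no counting itself.
func trimmed<T>(_ history: [T], budget: Int, tokens: (T) -> Int) -> [T] {
    var total = history.reduce(0) { $0 + tokens($1) }
    var start = history.startIndex
    while total > budget && start < history.endIndex {
        total -= tokens(history[start])
        start += 1
    }
    return Array(history[start...])
}

// Placeholder cost function: one "token" per character.
let turns = ["aaaa", "bb", "c"]
let kept = trimmed(turns, budget: 3) { $0.count }
print(kept) // ["bb", "c"]
```

Trimming from the front keeps the most recent context, which is usually what a follow-up question depends on; summarizing the dropped turns is the alternative the comment above mentions.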
let provider = try await AIOrchestrator.active.resolveProvider()
// Resolves the chain's first usable provider without executing anything.

var config = AIConfiguration()
config.developerKey = key
let kit = AIOrchestrator(configuration: config)
let answer = try await kit.respond(to: "...")

import VoltaSDKUI
// Fallback-chain status (for debugging or a settings screen):
ProviderStatusList(orchestrator: kit)
// Ready-to-use conversational playground, with privacy badges
// and a context-pressure indicator:
AIPlaygroundView(orchestrator: kit, instructions: "Be concise.")

ModelSelector is a drop-in view that lets your users choose which model
to use. Collapsed, it occupies a single row showing the active choice; tapping
it expands the list of options (it stays compact however many providers
exist). Options derive from the orchestrator's real state: providers you
didn't configure never appear; unavailable ones show their reason.
Initial state — the gate invariant. Nothing is ever committed without
passing through onSelection. When the selection binding starts nil, the
selector auto-selects the on-device model if available — the only
provider with no business gate behind it (free, private, no account) — and
even that attempt runs through your handler. Cloud providers are never
preselected: your configuration preference must not look like a user
activation when a subscription (or another gate) sits behind it. A non-nil
initial binding (e.g. a persisted user choice) is never overridden.
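The initial-state rule can be modelled as a small pure function. This is a sketch of the described behavior, not the component's source: `ProviderID`, `Decision`, and `initialSelection` are hypothetical names, and the handler is synchronous here for brevity.

```swift
// Hypothetical stand-ins for the selector's types.
enum ProviderID: Equatable { case onDevice, cloud }
enum Decision: Equatable { case activate, deny, deferred }

// Computes the selection the selector would start with.
func initialSelection(
    current: ProviderID?,
    onDeviceAvailable: Bool,
    onSelection: (ProviderID) -> Decision
) -> ProviderID? {
    // A non-nil binding (e.g. a persisted choice) is never overridden.
    if let current { return current }
    // Cloud providers are never preselected.
    guard onDeviceAvailable else { return nil }
    // Even the on-device auto-selection goes through the app's handler.
    return onSelection(.onDevice) == .activate ? .onDevice : nil
}

let picked = initialSelection(current: nil, onDeviceAvailable: true, onSelection: { _ in .activate })
print(picked == .onDevice)
```

If the handler declines the on-device attempt, the function returns nil and nothing is committed, which matches the invariant that no selection bypasses onSelection.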
The default labels make no business assumptions — only you know whether
the cloud model is "included with Pro", metered, or free. Brand the rows via
labels:.
Selection is a conversation with your app through onSelection:
@State private var userChoice: ProviderIdentifier?
ModelSelector(
    orchestrator: kit,
    selection: $userChoice,
    labels: [
        .openAI: ModelSelectorLabel(
            title: "Premium cloud model",
            subtitle: "Included with your Pro plan",
            systemImage: "sparkles"
        )
    ],
    onSelection: { provider in
        guard provider != .onDevice else { return .activate } // immediate
        if await entitlements.hasActiveSubscription() {
            return .activate // commit now
        }
        showPaywall = true // your own view
        return .deferred // app takes over
    }
)

- .activate commits the selection immediately.
- .deny(message:) refuses it, with an optional message under the selector.
- .deferred hands control to your flow: a paywall, a settings screen, or any view that gates the choice. When your flow succeeds, commit the choice by setting the selection binding; the selector reflects it instantly. Nothing about the gate lives inside the component; it only reacts.
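How the three outcomes affect the selector's state can be sketched as a reducer. The types below (`ProviderID`, `SelectionOutcome`, `SelectorState`) and the `apply` function are hypothetical and only mirror the behavior described in the list above; the real component's internals may differ.

```swift
// Hypothetical stand-ins mirroring the three outcomes described above.
enum ProviderID: Equatable { case onDevice, cloud }

enum SelectionOutcome {
    case activate
    case deny(message: String?)
    case deferred
}

struct SelectorState: Equatable {
    var selection: ProviderID? = nil
    var message: String? = nil
}

// Applies the handler's outcome for a tapped candidate.
func apply(_ outcome: SelectionOutcome, for candidate: ProviderID, to state: inout SelectorState) {
    switch outcome {
    case .activate:
        state.selection = candidate // commit immediately
        state.message = nil
    case .deny(let message):
        state.message = message     // selection stays as it was
    case .deferred:
        break // nothing changes; the app commits later via the binding
    }
}

var state = SelectorState()
apply(.deferred, for: .cloud, to: &state)
// Later, once the app's own flow (e.g. a paywall) succeeds:
state.selection = .cloud
print(state.selection == .cloud)
```

The deferred case deliberately does nothing: the commit happens outside the reducer, which is what keeps the gate logic in the app rather than in the component.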
selection == nil therefore means no model committed yet. The selector
can't stop the orchestrator from answering — if your fallback chain contains
a gated provider, it will serve calls regardless of any UI. Close the loop on
your side: gate the chat until something is committed (what the demo does),
or keep the developer key out of the configuration for unentitled users.
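Both ways of closing the loop fit in a few lines. This sketch uses a hypothetical `SketchConfiguration` struct in place of the SDK's configuration type, and `makeConfiguration` and `canSend` are names invented for the example.

```swift
// Hypothetical stand-in for the SDK's configuration type.
struct SketchConfiguration {
    var enableOnDevice = true
    var developerKey: String? = nil
}

// Option 1: unentitled users never get the key, so a gated cloud
// provider cannot appear in their fallback chain at all.
func makeConfiguration(isEntitled: Bool, storedKey: String?) -> SketchConfiguration {
    var config = SketchConfiguration()
    config.developerKey = isEntitled ? storedKey : nil
    return config
}

// Option 2: the chat input stays disabled until a model is committed.
func canSend<Selection>(selection: Selection?) -> Bool {
    selection != nil
}

let config = makeConfiguration(isEntitled: false, storedKey: "placeholder-key")
print(config.developerKey == nil, canSend(selection: Optional<Int>.none))
```

The first option enforces the gate in the orchestrator's inputs; the second enforces it in the UI. An app can use either or both.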
Design customization: per-provider labels, hidesUnavailable, standard
SwiftUI modifiers — and ModelSelectorRow is public, so you can rebuild the
whole layout on top of providerStatuses() while keeping the rows.
All component building blocks (ProviderStatusRow, PrivacyLevelBadge,
ModelSelectorRow) are public: recompose them into custom layouts using the
core's providerStatuses() and respondDetailed().
macOS:
swift run VoltaSDKDemo

iPhone / iPad: open Examples/iOSDemo/iOSDemo.xcodeproj and Run on a
device or on a simulator with an iOS 26 runtime. On a device with Apple
Intelligence the on-device provider is real; otherwise the list shows the
unavailability reason and the fallback switches to the developer key.
The demo mirrors a real integration's two roles: a Developer side
(configuration, privacy policy, a simulated subscription entitlement) and a
User side (the chat with the ModelSelector underneath) — so you can see
the result of any configuration × user-preference combination, including the
activation gate rejecting the cloud model when the simulated subscription is
off. The chat stays disabled until a model is committed — the
selection == nil contract that keeps gated providers from answering before
activation.
swift test # 41 tests: fallback, privacy, conversations, tokens, parsing

Internal documentation lives in docs/:
- docs/iOS26-Implementation.md — how the iOS 26 / 26.4 base is implemented (decisions, stable API, verification).
- docs/iOS27-Design.md — the design of the iOS 27 extension (not yet implemented).
- docs/iOS27-OpenQuestions.md — the open questions gating the iOS 27 implementation.