swift-ai

A provider-agnostic Swift API for chat, embeddings, image generation, agents, and retrieval. A single protocol surface sits in front of on-device Apple Intelligence, OpenAI, Anthropic, Gemini, Mistral, local Ollama, and Cohere, plus the Voyage and Jina embedding specialists.

It's built for client- and server-side Swift alike: drop it into an iOS/macOS app (including fully on-device inference with Apple Intelligence, no API key required), or run it inside a server (Vapor, Hummingbird) or a CLI on Linux. The same protocols, agents, and retrieval stack work in both, so code you prototype in an app moves to a backend unchanged.

Platforms: macOS 15+, iOS 18+, tvOS 18+, watchOS 11+, Linux, Android, Wasm
Swift: 6.1+ toolchain, language mode 6
License: MIT

Install

Add the package to your Package.swift:

.package(url: "https://github.com/Dean151/swift-ai.git", from: "0.1.0"),

Then depend on whichever provider (or higher-level library) you need:

.target(
    name: "MyApp",
    dependencies: [
        .product(name: "OpenAI", package: "swift-ai"),
    ]
)
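
Put together, a minimal manifest might look like the sketch below. The package name MyApp, the platform list, and the tools version are assumptions chosen to match the requirements stated above (Swift 6.1+, macOS 15+, iOS 18+); adjust them to your project.

```swift
// swift-tools-version: 6.1
// Sketch of a complete Package.swift; "MyApp" and the platform list are placeholders.
import PackageDescription

let package = Package(
    name: "MyApp",
    platforms: [.macOS(.v15), .iOS(.v18)],
    dependencies: [
        // The package dependency from the Install step.
        .package(url: "https://github.com/Dean151/swift-ai.git", from: "0.1.0"),
    ],
    targets: [
        .target(
            name: "MyApp",
            dependencies: [
                // Depend only on the provider module you actually use.
                .product(name: "OpenAI", package: "swift-ai"),
            ]
        ),
    ]
)
```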

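The Retrieval package trait unlocks retrieval helpers in Agent and Conversation. SwiftPM enables traits on the package dependency rather than on an individual product, so request the trait there and depend on the product as usual:

.package(url: "https://github.com/Dean151/swift-ai.git", from: "0.1.0", traits: [.defaults, "Retrieval"]),

.product(name: "Agent", package: "swift-ai"),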

Pick a provider

Module	Use it for
`AppleIntelligence`	On-device Foundation Models (iOS/macOS/visionOS 26+). No API key. Structured output, tool calling, streaming; image input, reasoning levels, and Private Cloud Compute on 27+.
`OpenAI`	GPT family, DALL·E / `gpt-image-*`, native structured outputs.
`Anthropic`	Claude family, with prompt-caching helpers.
`Gemini`	Gemini chat, task-typed embeddings, Imagen.
`Mistral`	Mistral chat (incl. Pixtral vision) and embeddings.
`Ollama`	Local / self-hosted open models via a running Ollama server. Keyless by default.
`Cohere`	Command chat, task-typed embeddings, and document reranking.
`Voyage`	Embeddings and reranking specialist.
`Jina`	Embeddings and reranking specialist (Matryoshka dimensions, late chunking).

Every provider conforms to ModelProvider, so swapping backends is a one-line change. Use provider.capabilities(for: id) to check what a given model supports before requesting features like tool calling or structured outputs.
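
The capability check described above can be illustrated with a small, self-contained sketch. Everything in it (SketchCapabilities, SketchProvider, HostedProvider, LocalProvider, requestPlan) is a hypothetical stand-in invented for this example and is not the swift-ai API; it only shows the pattern of asking what a model supports before choosing a request shape.

```swift
// Hypothetical stand-ins for illustration only; these are NOT the swift-ai types.
struct SketchCapabilities: OptionSet {
    let rawValue: Int
    static let streaming        = SketchCapabilities(rawValue: 1 << 0)
    static let toolCalling      = SketchCapabilities(rawValue: 1 << 1)
    static let structuredOutput = SketchCapabilities(rawValue: 1 << 2)
}

protocol SketchProvider {
    // Mirrors the idea of capabilities(for:): report what a model id supports.
    func capabilities(for modelID: String) -> SketchCapabilities
}

struct HostedProvider: SketchProvider {
    func capabilities(for modelID: String) -> SketchCapabilities {
        [.streaming, .toolCalling, .structuredOutput]
    }
}

struct LocalProvider: SketchProvider {
    func capabilities(for modelID: String) -> SketchCapabilities {
        [.streaming]
    }
}

// Pick a request strategy based on what the model reports.
func requestPlan(for provider: some SketchProvider, modelID: String) -> String {
    let supported = provider.capabilities(for: modelID)
    if supported.contains(.structuredOutput) {
        return "native structured output"
    }
    return "plain text, parsed by the caller"
}

print(requestPlan(for: HostedProvider(), modelID: "hosted-model"))
print(requestPlan(for: LocalProvider(), modelID: "local-model"))
```

Because both providers sit behind one protocol, the calling code stays the same when the backend changes; only the reported capabilities differ.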

Your first request

import OpenAI

let provider = OpenAI(
    apiKey: ProcessInfo.processInfo.environment["OPENAI_API_KEY"] ?? "",
    transport: URLSessionTransport()
)

let model = provider.languageModel(.gpt4oMini)   // or "gpt-4o-mini", or any model id
let output = try await model.generate("Say hi in one word.")
print(output.message)

Each provider ships a strongly typed model identifier with discoverable constants — OpenAIModel.gpt4oMini, AnthropicModel.sonnet, GeminiModel.gemini25Flash, MistralModel.large, OllamaModel.llama33, CohereModel.commandA, plus VoyageEmbedding/JinaEmbedding (and rerank) for the embedding specialists. They're ExpressibleByStringLiteral, so any new or fine-tuned model name still works as a plain string. Apple Intelligence, whose model set is closed, uses an exhaustive enum (.onDevice / .privateCloudCompute) instead.
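
To see why a string-literal-backed identifier keeps both autocomplete-friendly constants and arbitrary model names working, here is a minimal sketch using only the standard library. SketchModelID, SketchClosedModel, and the id strings are made up for illustration; the real identifier types may be shaped differently.

```swift
// Hypothetical sketch of an open-ended, string-backed model identifier.
struct SketchModelID: Hashable, ExpressibleByStringLiteral, CustomStringConvertible {
    let rawValue: String

    init(stringLiteral value: String) { rawValue = value }

    var description: String { rawValue }

    // A discoverable constant for a well-known id (the id string is made up).
    static let small: SketchModelID = "example-small"
}

// A closed model set is better served by an exhaustive enum.
enum SketchClosedModel: CaseIterable {
    case onDevice
    case privateCloudCompute
}

func describe(_ id: SketchModelID) -> String { "model=\(id)" }

print(describe(.small))             // typed constant
print(describe("my-fine-tune-v2"))  // plain string literal, no constant needed
```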

ChatRequest accepts a string literal for the common single-user-message case, and an array literal of messages when you need system prompts or multi-turn history:

let output = try await model.generate(
    [
        .system("You answer in haiku."),
        .user("Why does the wind blow?")
    ],
    options: GenerationOptions(maxTokens: 200)
)

Streaming

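// `request` is any ChatRequest; a string literal covers the single-user-message case.
let request: ChatRequest = "Tell me a short story."

for try await event in model.stream(request) {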
    switch event {
    case .messageDelta(let chunk): print(chunk, terminator: "")
    case .toolInvocation(let call): print("\ntool:", call.name)
    case .completed(let final):    print("\ndone:", final.usage ?? "")
    }
}
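
If you want the full text rather than live printing, the same loop can accumulate deltas. The sketch below is self-contained and does not use swift-ai: SketchEvent, fakeStream, and collectText are hypothetical names, and the stream is a canned AsyncThrowingStream standing in for a real model.

```swift
// Hypothetical event type loosely modeled on the cases shown above.
enum SketchEvent {
    case messageDelta(String)
    case completed(totalTokens: Int)
}

// A canned stream that stands in for a model's streaming response.
func fakeStream() -> AsyncThrowingStream<SketchEvent, Error> {
    AsyncThrowingStream { continuation in
        for chunk in ["Hel", "lo", "!"] {
            continuation.yield(.messageDelta(chunk))
        }
        continuation.yield(.completed(totalTokens: 3))
        continuation.finish()
    }
}

// Concatenate every delta into the final message text.
func collectText(from stream: AsyncThrowingStream<SketchEvent, Error>) async throws -> String {
    var text = ""
    for try await event in stream {
        switch event {
        case .messageDelta(let chunk):
            text += chunk
        case .completed:
            break // nothing to append; the stream ends right after this event
        }
    }
    return text
}

let text = try await collectText(from: fakeStream())
print(text)
```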

Higher level with Agent and Conversation

Agent — tool-using loops

AgentRunner drives a model through a tool-calling loop with retries, observers, and parallel tool execution. Give it a ToolBox, then run for a final result or stream for live events.

import Agent
import OpenAI

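// Read the API key from the environment, as in the first-request example.
let apiKey = ProcessInfo.processInfo.environment["OPENAI_API_KEY"] ?? ""

let model = OpenAI(apiKey: apiKey, transport: URLSessionTransport())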
    .languageModel(.gpt4oMini)

let tools = try ToolBox([
    AnyTool.callback(
        name: "current_time",
        description: "Returns the current ISO-8601 timestamp.",
        handler: { _, _ in .string(ISO8601DateFormatter().string(from: Date())) }
    )
])

let result = try await AgentRunner(model: model, tools: tools)
    .run(messages: [.user("What time is it?")])
print(result.finalOutput.message)
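
For intuition, the core of a tool-calling loop can be sketched in a few lines. This is not how AgentRunner is implemented: SketchStep, runToolLoop, and the closure standing in for the model are hypothetical, and the sketch leaves out retries, observers, and parallel execution.

```swift
// Hypothetical: what a model asks for on each step of the loop.
enum SketchStep {
    case callTool(name: String)
    case final(String)
}

// Run tools until the model produces a final answer or the step budget runs out.
func runToolLoop(
    tools: [String: () -> String],
    maxSteps: Int,
    decide: ([String]) -> SketchStep
) -> String? {
    var toolResults: [String] = []
    for _ in 0..<maxSteps {
        switch decide(toolResults) {
        case .final(let answer):
            return answer
        case .callTool(let name):
            // Unknown tools become an error string the model can react to.
            let result = tools[name]?() ?? "error: unknown tool \(name)"
            toolResults.append(result)
        }
    }
    return nil // step budget exhausted without a final answer
}

// A fixed "clock" tool and a scripted stand-in for the model.
let sketchTools: [String: () -> String] = ["current_time": { "12:00" }]
let answer = runToolLoop(tools: sketchTools, maxSteps: 4) { results in
    if let time = results.first { return .final("It is \(time).") }
    return .callTool(name: "current_time")
}
print(answer ?? "no answer")
```

The step budget is the important design choice here: without it, a model that keeps requesting tools would loop forever.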

Conversation — persistent multi-turn sessions

Conversation layers a MessageStore and a chain of MemoryPolicy values on top of AgentRunner. Every turn persists user, assistant, and tool messages; policies shape what the model actually sees — system prompt, recent window, rolling summary, retrieval — without losing anything from the durable transcript.

import Conversation
import OpenAI

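// Read the API key from the environment, as in the first-request example.
let apiKey = ProcessInfo.processInfo.environment["OPENAI_API_KEY"] ?? ""

let conversation = Conversation(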
    model: OpenAI(apiKey: apiKey, transport: URLSessionTransport())
        .languageModel(.gpt4oMini),
    store: FileMessageStore(directory: URL.documentsDirectory.appending(path: "chats")),
    policies: [
        StaticSystemMemoryPolicy("You are a friendly Swift tutor."),
        RecentWindowPolicy(maxMessages: 30, maxTokens: 8_000),
    ],
    tokenBudget: 12_000
)

let session = try await conversation.newSession()
let turn = try await conversation.send(.user("Explain async/await in one sentence."), to: session.id)
print(turn.appended.last?.message ?? "")
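
The idea that a policy trims what the model sees while the stored transcript stays whole can be sketched without the library. SketchMessage, estimatedTokens, and recentWindow are hypothetical, and the four-characters-per-token estimate is a rough assumption, not how RecentWindowPolicy counts tokens.

```swift
// Hypothetical message type for the sketch.
struct SketchMessage: Equatable {
    let role: String
    let text: String
}

// Rough assumption: about four characters per token, never less than one.
func estimatedTokens(_ message: SketchMessage) -> Int {
    max(1, message.text.count / 4)
}

// Return the newest messages that fit both limits; the transcript itself is not modified.
func recentWindow(_ transcript: [SketchMessage], maxMessages: Int, maxTokens: Int) -> [SketchMessage] {
    var window: [SketchMessage] = []
    var usedTokens = 0
    for message in transcript.reversed() {
        let cost = estimatedTokens(message)
        if window.count >= maxMessages || usedTokens + cost > maxTokens { break }
        window.append(message)
        usedTokens += cost
    }
    return Array(window.reversed()) // restore chronological order
}

let transcript = (0..<5).map { SketchMessage(role: "user", text: "message\($0)") }
let visible = recentWindow(transcript, maxMessages: 3, maxTokens: 100)
print(visible.map(\.text), "of", transcript.count, "stored messages")
```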

Each higher-level module re-exports the layers underneath it (Conversation brings in Agent and AI), so a single import Conversation plus your provider import is enough.

On-device with Apple Intelligence

The AppleIntelligence provider runs Foundation Models entirely on-device, with no API key and no network access needed for the on-device model. It exposes the same ModelProvider surface as every other backend, so structured output, tool calling, and streaming all work through the shared API:

import AppleIntelligence

let provider = AppleIntelligence()

// Structured output — bridged to Foundation Models guided generation.
// Apple Intelligence has a closed model set, so you pick it with a typed
// enum (.onDevice / .privateCloudCompute), not a string.
let output = try await provider.languageModel(.onDevice).generate(
    "Extract the city and temperature from: It's 22°C in Paris.",
    options: GenerationOptions(responseFormat: .jsonSchema(Weather.outputSchema))
)
let weather = try output.decodeJSON(Weather.self)

Features that need the iOS 27 / macOS 27 SDK degrade gracefully. The package still builds and runs on 26, where those options are either ignored or fall back to on-device behavior:

// Reasoning effort and tool-calling mode (no-op before 27).
let options = GenerationOptions(appleIntelligence: .init(reasoningLevel: .deep, toolCallingMode: .required))

// Route through Private Cloud Compute (falls back to on-device before 27).
let cloud = provider.languageModel(.privateCloudCompute)

// Image input — attach an image to a multimodal prompt (27+).
let described = try await provider.languageModel(.onDevice).generate(
    [.user(content: ["Write alt text for this:", .image(.url(screenshotURL))])]
)

provider.capabilities(for:) reflects what the running OS actually supports, so you can gate features at runtime. The provider also exposes on-device introspection — contextSize, supportedLanguages, tokenCount(for:), prewarm(), a useCase/guardrails initializer, and (on 27+) privateCloudComputeStatus() for quota and availability.

Going further

The rendered DocC catalog (target AI) has guides for everything beyond hello-world:

Tool calling — define Tools, run them through Agent.AgentRunner.
Structured outputs — typed, schema-backed responses.
Conversation — persistent multi-turn sessions with pluggable memory policies.
Retrieval — VectorStore + retrieval policies for RAG, with optional RerankModel reranking.
Testing — AITesting ships mocks, recorders, and replayers.
Prompt caching, availability & fallbacks, embeddings & images.

Browse the catalog on the Swift Package Index or via swift package generate-documentation.

License

MIT — see LICENSE.
