swift-ai

A provider-agnostic Swift API for chat, embeddings, image generation, agents, and retrieval. One protocol surface sits in front of on-device Apple Intelligence, OpenAI, Anthropic, Gemini, Mistral, local Ollama, and Cohere, plus the Voyage and Jina embedding and reranking specialists.

It's built for client- and server-side Swift alike: drop it into an iOS/macOS app — including fully on-device inference with Apple Intelligence, no API key required — or run it inside a server (Vapor, Hummingbird) or CLI on Linux. The same protocols, agents, and retrieval stack work in both, so a model you prototype in an app moves to a backend unchanged.

  • Platforms: macOS 15+, iOS 18+, tvOS 18+, watchOS 11+, Linux, Android, Wasm
  • Swift: 6.1+ toolchain, language mode 6
  • License: MIT

Install

Add the package to your Package.swift:

.package(url: "https://github.com/Dean151/swift-ai.git", from: "0.1.0"),

Then depend on whichever provider (or higher-level library) you need:

.target(
    name: "MyApp",
    dependencies: [
        .product(name: "OpenAI", package: "swift-ai"),
    ]
)
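
For orientation, the two fragments above could sit in a manifest like the following sketch. The executable target, the package name MyApp, and the choice of macOS and iOS as deployment targets are assumptions made for illustration; only the package URL, the version, and the OpenAI product name come from this README.

```swift
// swift-tools-version: 6.1
import PackageDescription

// Hypothetical manifest: "MyApp" and the executable target are placeholders.
let package = Package(
    name: "MyApp",
    // Minimum OS versions taken from the platform list in this README.
    platforms: [.macOS(.v15), .iOS(.v18)],
    dependencies: [
        .package(url: "https://github.com/Dean151/swift-ai.git", from: "0.1.0"),
    ],
    targets: [
        .executableTarget(
            name: "MyApp",
            dependencies: [
                // Pull in only the provider you actually use.
                .product(name: "OpenAI", package: "swift-ai"),
            ]
        ),
    ]
)
```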

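The Retrieval package trait unlocks retrieval helpers in Agent and Conversation. SwiftPM enables traits on the package dependency, not on an individual product, so request it in the .package entry and depend on the product as usual:

.package(
    url: "https://github.com/Dean151/swift-ai.git",
    from: "0.1.0",
    traits: [.defaults, "Retrieval"]   // keep default traits, add Retrieval
),

.product(name: "Agent", package: "swift-ai"),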

Pick a provider

| Module | Use it for |
| --- | --- |
| AppleIntelligence | On-device Foundation Models (iOS/macOS/visionOS 26+). No API key. Structured output, tool calling, streaming; image input, reasoning levels, and Private Cloud Compute on 27+. |
| OpenAI | GPT family, DALL·E / gpt-image-*, native structured outputs. |
| Anthropic | Claude family, with prompt-caching helpers. |
| Gemini | Gemini chat, task-typed embeddings, Imagen. |
| Mistral | Mistral chat (incl. Pixtral vision) and embeddings. |
| Ollama | Local / self-hosted open models via a running Ollama server. Keyless by default. |
| Cohere | Command chat, task-typed embeddings, and document reranking. |
| Voyage | Embeddings and reranking specialist. |
| Jina | Embeddings and reranking specialist (Matryoshka dimensions, late chunking). |

Every provider conforms to ModelProvider, so swapping backends is a one-line change. Use provider.capabilities(for: id) to check what a given model supports before requesting features like tool calling or structured outputs.
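
To make the "one-line swap" and the capability check concrete, here is a minimal, self-contained sketch of the pattern. Every type in it (SketchProvider, Capabilities, CloudStub, LocalStub) is a hypothetical stand-in written for this example. None of them are swift-ai declarations, and the real ModelProvider surface may be shaped differently.

```swift
// Stand-in types for illustration only; these are NOT swift-ai's declarations.
struct Capabilities: OptionSet {
    let rawValue: Int
    static let toolCalling = Capabilities(rawValue: 1 << 0)
    static let structuredOutput = Capabilities(rawValue: 1 << 1)
}

protocol SketchProvider {
    func capabilities(for modelID: String) -> Capabilities
    func reply(to prompt: String, modelID: String) -> String
}

struct CloudStub: SketchProvider {
    func capabilities(for modelID: String) -> Capabilities { [.toolCalling, .structuredOutput] }
    func reply(to prompt: String, modelID: String) -> String { "cloud(\(modelID)): \(prompt)" }
}

struct LocalStub: SketchProvider {
    func capabilities(for modelID: String) -> Capabilities { [] }
    func reply(to prompt: String, modelID: String) -> String { "local(\(modelID)): \(prompt)" }
}

// Call sites depend only on the protocol, so the backend is chosen in one place.
func ask(_ provider: any SketchProvider, modelID: String, wantsTools: Bool) -> String {
    // Gate the optional feature on what the model reports it supports.
    let useTools = wantsTools && provider.capabilities(for: modelID).contains(.toolCalling)
    let suffix = useTools ? " [tools on]" : " [tools off]"
    return provider.reply(to: "hi", modelID: modelID) + suffix
}

let provider: any SketchProvider = CloudStub()   // swap to LocalStub() to change backend
print(ask(provider, modelID: "model-a", wantsTools: true))
```

Because the feature gate lives next to the request, a backend without tool calling degrades to a plain request instead of failing at the call site.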

Your first request

import Foundation   // ProcessInfo
import OpenAI

let provider = OpenAI(
    apiKey: ProcessInfo.processInfo.environment["OPENAI_API_KEY"] ?? "",
    transport: URLSessionTransport()
)

let model = provider.languageModel(.gpt4oMini)   // or "gpt-4o-mini", or any model id
let output = try await model.generate("Say hi in one word.")
print(output.message)

Each provider ships a strongly typed model identifier with discoverable constants — OpenAIModel.gpt4oMini, AnthropicModel.sonnet, GeminiModel.gemini25Flash, MistralModel.large, OllamaModel.llama33, CohereModel.commandA, plus VoyageEmbedding/JinaEmbedding (and rerank) for the embedding specialists. They're ExpressibleByStringLiteral, so any new or fine-tuned model name still works as a plain string. Apple Intelligence, whose model set is closed, uses an exhaustive enum (.onDevice / .privateCloudCompute) instead.
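
The combination of typed constants and plain strings can be pictured with a small sketch. SketchModelID below is a hypothetical type written for this example, assuming the identifiers are thin wrappers around a string. It is not the package's OpenAIModel declaration.

```swift
// Hypothetical stand-in, not the package's actual model identifier type.
struct SketchModelID: Hashable, ExpressibleByStringLiteral {
    let rawValue: String

    // Lets any string literal be used where a SketchModelID is expected.
    init(stringLiteral value: String) { self.rawValue = value }

    // Discoverable constant for a well-known model.
    static let gpt4oMini: SketchModelID = "gpt-4o-mini"
}

func describe(_ id: SketchModelID) -> String { "using \(id.rawValue)" }

print(describe(.gpt4oMini))            // typed constant, found via autocomplete
print(describe("my-fine-tune-2025"))   // arbitrary literal still type-checks
```

The constant gives discoverability, while the literal conformance keeps new or fine-tuned model names usable without a package update.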

ChatRequest accepts a string literal for the common single-user-message case, and an array literal of messages when you need system prompts or multi-turn history:

let output = try await model.generate(
    [
        .system("You answer in haiku."),
        .user("Why does the wind blow?")
    ],
    options: GenerationOptions(maxTokens: 200)
)
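
How one parameter can accept both a string and an array of messages is easiest to see in isolation. The sketch below uses hypothetical stand-in types (SketchMessage, SketchRequest) and assumes the mechanism is Swift's literal protocols. The package's ChatRequest may be implemented differently.

```swift
// Stand-in types; the package's real request and message types may differ.
enum SketchMessage: Equatable {
    case system(String)
    case user(String)
}

struct SketchRequest: ExpressibleByStringLiteral, ExpressibleByArrayLiteral {
    let messages: [SketchMessage]

    // "text" becomes a single user message.
    init(stringLiteral value: String) { messages = [.user(value)] }

    // [.system(...), .user(...)] is taken as the full message list.
    init(arrayLiteral elements: SketchMessage...) { messages = elements }
}

func messageCount(_ request: SketchRequest) -> Int { request.messages.count }

print(messageCount("Say hi in one word."))
print(messageCount([.system("You answer in haiku."), .user("Why does the wind blow?")]))
```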

Streaming

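// ChatRequest accepts a string literal, as described above.
let request: ChatRequest = "Tell me a short story."

for try await event in model.stream(request) {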
    switch event {
    case .messageDelta(let chunk): print(chunk, terminator: "")
    case .toolInvocation(let call): print("\ntool:", call.name)
    case .completed(let final):    print("\ndone:", final.usage ?? "")
    }
}
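
If you want the full text as well as live output, a common approach is to accumulate deltas while iterating. The sketch below shows that pattern against a hypothetical SketchEvent enum and a locally built AsyncThrowingStream. It does not call the package, and the real event type has more cases than shown here.

```swift
// Stand-in event type; not the package's stream event declaration.
enum SketchEvent: Sendable {
    case messageDelta(String)
    case completed
}

// Produces a few chunks the way a network stream might.
func sketchStream() -> AsyncThrowingStream<SketchEvent, Error> {
    AsyncThrowingStream { continuation in
        for chunk in ["Hel", "lo", "!"] {
            continuation.yield(.messageDelta(chunk))
        }
        continuation.yield(.completed)
        continuation.finish()
    }
}

// Accumulates deltas into the full reply text.
func collect(_ stream: AsyncThrowingStream<SketchEvent, Error>) async throws -> String {
    var text = ""
    for try await event in stream {
        switch event {
        case .messageDelta(let chunk): text += chunk
        case .completed: break   // leaves the switch; the loop ends when the stream finishes
        }
    }
    return text
}

let reply = try await collect(sketchStream())
print(reply)
```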

Higher level with Agent and Conversation

Agent — tool-using loops

AgentRunner drives a model through a tool-calling loop with retries, observers, and parallel tool execution. Give it a ToolBox, then run for a final result or stream for live events.

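import Agent
import Foundation   // ISO8601DateFormatter, Date, ProcessInfo
import OpenAI

// Read the key from the environment, as in the first example.
let apiKey = ProcessInfo.processInfo.environment["OPENAI_API_KEY"] ?? ""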

let model = OpenAI(apiKey: apiKey, transport: URLSessionTransport())
    .languageModel(.gpt4oMini)

let tools = try ToolBox([
    AnyTool.callback(
        name: "current_time",
        description: "Returns the current ISO-8601 timestamp.",
        handler: { _, _ in .string(ISO8601DateFormatter().string(from: Date())) }
    )
])

let result = try await AgentRunner(model: model, tools: tools)
    .run(messages: [.user("What time is it?")])
print(result.finalOutput.message)
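
Conceptually, a tool-calling loop alternates between asking the model for its next step and feeding tool results back until a final answer arrives. The sketch below is a deliberately simplified, self-contained illustration with a scripted stand-in for the model. Step, Tool, scriptedModel, and runLoop are hypothetical names, and the sketch omits the retries, observers, and parallel execution that AgentRunner is described as providing.

```swift
// Stand-in types; this is not AgentRunner's implementation.
enum Step {
    case callTool(name: String, argument: String)
    case final(String)
}

typealias Tool = (String) -> String

// A scripted "model": asks for the tool once, then answers with the tool's result.
func scriptedModel(_ transcript: [String]) -> Step {
    if let result = transcript.last(where: { $0.hasPrefix("tool:") }) {
        return .final("The answer is " + String(result.dropFirst("tool:".count)))
    }
    return .callTool(name: "double", argument: "21")
}

func runLoop(prompt: String, tools: [String: Tool], maxSteps: Int = 4) -> String? {
    var transcript = ["user:" + prompt]
    for _ in 0..<maxSteps {
        switch scriptedModel(transcript) {
        case .final(let text):
            return text
        case .callTool(let name, let argument):
            // Run the requested tool and feed its output back to the model.
            let output = tools[name]?(argument) ?? "unknown tool"
            transcript.append("tool:" + output)
        }
    }
    return nil   // step budget exhausted without a final answer
}

let tools: [String: Tool] = ["double": { String((Int($0) ?? 0) * 2) }]
print(runLoop(prompt: "What is 21 doubled?", tools: tools) ?? "no answer")
```

The step budget is the important design point: without it, a model that keeps requesting tools would loop forever.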

Conversation — persistent multi-turn sessions

Conversation layers a MessageStore and a chain of MemoryPolicy values on top of AgentRunner. Every turn persists user, assistant, and tool messages; policies shape what the model actually sees — system prompt, recent window, rolling summary, retrieval — without losing anything from the durable transcript.

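import Conversation
import Foundation   // URL, ProcessInfo
import OpenAI

// Read the key from the environment, as in the first example.
let apiKey = ProcessInfo.processInfo.environment["OPENAI_API_KEY"] ?? ""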

let conversation = Conversation(
    model: OpenAI(apiKey: apiKey, transport: URLSessionTransport())
        .languageModel(.gpt4oMini),
    store: FileMessageStore(directory: URL.documentsDirectory.appending(path: "chats")),
    policies: [
        StaticSystemMemoryPolicy("You are a friendly Swift tutor."),
        RecentWindowPolicy(maxMessages: 30, maxTokens: 8_000),
    ],
    tokenBudget: 12_000
)

let session = try await conversation.newSession()
let turn = try await conversation.send(.user("Explain async/await in one sentence."), to: session.id)
print(turn.appended.last?.message ?? "")
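
The split between the durable transcript and the view the model sees can be sketched as a chain of pure transformations. The types below (StoredMessage, SketchPolicy, SystemPrompt, RecentWindow) are hypothetical, and the order in which policies are applied is an assumption of this sketch, not a statement about how the package evaluates its policy chain.

```swift
// Stand-in types; the package's MemoryPolicy protocol may look different.
struct StoredMessage: Equatable {
    let role: String
    let text: String
}

protocol SketchPolicy {
    // Transforms the view sent to the model; never mutates the stored transcript.
    func apply(to view: [StoredMessage]) -> [StoredMessage]
}

struct SystemPrompt: SketchPolicy {
    let prompt: String
    func apply(to view: [StoredMessage]) -> [StoredMessage] {
        [StoredMessage(role: "system", text: prompt)] + view
    }
}

struct RecentWindow: SketchPolicy {
    let maxMessages: Int
    func apply(to view: [StoredMessage]) -> [StoredMessage] {
        Array(view.suffix(maxMessages))
    }
}

// The durable transcript keeps every turn.
let transcript = (1...5).map { StoredMessage(role: "user", text: "turn \($0)") }

// In this sketch policies run in order: trim first, then prepend the system prompt.
let policies: [any SketchPolicy] = [RecentWindow(maxMessages: 2), SystemPrompt(prompt: "Be brief.")]
let modelView = policies.reduce(transcript) { view, policy in policy.apply(to: view) }

print(transcript.count, modelView.map(\.text))
```

The transcript still holds all five turns afterwards, while the model receives only the system prompt and the last two.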

Each higher-level module re-exports the layers underneath it (Conversation brings in Agent and AI), so a single import Conversation plus your provider import is enough.

On-device with Apple Intelligence

The AppleIntelligence provider runs Foundation Models entirely on-device — no API key, no network. It speaks the same ModelProvider surface as every other backend, so structured output, tool calling, and streaming all work through the shared API:

import AppleIntelligence

let provider = AppleIntelligence()

// Structured output — bridged to Foundation Models guided generation.
// Apple Intelligence has a closed model set, so you pick it with a typed
// enum (.onDevice / .privateCloudCompute), not a string.
let output = try await provider.languageModel(.onDevice).generate(
    "Extract the city and temperature from: It's 22°C in Paris.",
    options: GenerationOptions(responseFormat: .jsonSchema(Weather.outputSchema))
)
let weather = try output.decodeJSON(Weather.self)

Features that need the iOS 27 / macOS 27 SDK degrade gracefully. The package still builds and runs on 26, where those options are ignored or fall back to their on-device equivalent:

// Reasoning effort and tool-calling mode (no-op before 27).
let options = GenerationOptions(appleIntelligence: .init(reasoningLevel: .deep, toolCallingMode: .required))

// Route through Private Cloud Compute (falls back to on-device before 27).
let cloud = provider.languageModel(.privateCloudCompute)

// Image input — attach an image to a multimodal prompt (27+).
let described = try await provider.languageModel(.onDevice).generate(
    [.user(content: ["Write alt text for this:", .image(.url(screenshotURL))])]
)

provider.capabilities(for:) reflects what the running OS actually supports, so you can gate features at runtime. The provider also exposes on-device introspection — contextSize, supportedLanguages, tokenCount(for:), prewarm(), a useCase/guardrails initializer, and (on 27+) privateCloudComputeStatus() for quota and availability.

Going further

The rendered DocC catalog (target AI) has guides for everything beyond hello-world:

  • Tool calling: define Tools, run them through Agent.AgentRunner.
  • Structured outputs: typed, schema-backed responses.
  • Conversation: persistent multi-turn sessions with pluggable memory policies.
  • Retrieval: VectorStore + retrieval policies for RAG, with optional RerankModel reranking.
  • Testing: AITesting ships mocks, recorders, and replayers.
  • Prompt caching, availability & fallbacks, embeddings & images.

Browse the catalog on the Swift Package Index or via swift package generate-documentation.
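
As background for the retrieval guide, the core of vector retrieval is a similarity ranking over embeddings. The sketch below is a generic in-memory illustration using cosine similarity with hand-written vectors. Doc, cosine, and topK are hypothetical names unrelated to the package's VectorStore or RerankModel APIs, and real embeddings would come from an embedding model.

```swift
// Minimal in-memory ranking for illustration; unrelated to the package's VectorStore API.
struct Doc {
    let text: String
    let vector: [Double]
}

// Cosine similarity; returns 0 when either vector has zero length.
func cosine(_ a: [Double], _ b: [Double]) -> Double {
    let dot = zip(a, b).reduce(0.0) { $0 + $1.0 * $1.1 }
    let normA = a.reduce(0.0) { $0 + $1 * $1 }.squareRoot()
    let normB = b.reduce(0.0) { $0 + $1 * $1 }.squareRoot()
    guard normA > 0, normB > 0 else { return 0 }
    return dot / (normA * normB)
}

// Returns the texts of the k documents most similar to the query vector.
func topK(_ query: [Double], in docs: [Doc], k: Int) -> [String] {
    docs.sorted { cosine(query, $0.vector) > cosine(query, $1.vector) }
        .prefix(k)
        .map(\.text)
}

let docs = [
    Doc(text: "a", vector: [1, 0]),
    Doc(text: "b", vector: [0, 1]),
    Doc(text: "c", vector: [0.9, 0.1]),
]
print(topK([1, 0], in: docs, k: 2))
```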

License

MIT — see LICENSE.
