-
Notifications
You must be signed in to change notification settings - Fork 0
Book 22 AI Chatbot Integration
Part VI — The Modern Toolchain · Claude's Xcode 26 Swift Bible
← Book-21-Git-And-GitHub · Chapters and Appendices
Claude's Xcode 26 Swift Bible -- Part VI: The Modern Toolchain
X26 ships AI in three distinct places, and the chapter is organized around them. Each path serves a different need; most readers will reach for one but should know all three exist:
- Part A — AI inside Xcode (Coding Intelligence). The chat panel and agentic-coding features built into Xcode 26. You're the developer using AI to write code faster. Apple article: Writing code with intelligence in Xcode.
- Part B — AI inside your own app (Foundation Models). Apple's first-party on-device LLM framework. No API key, no network round-trip, no per-token cost. Apple article: Generating content and performing tasks with Foundation Models.
- Part C — Third-party API integration (Anthropic, etc.). When you need capabilities the on-device model doesn't cover — bigger context windows, more demanding reasoning, model choices Apple doesn't ship — you call out to a vendor's REST API. The original chapter content for X26's first edition.
Volume 1 of this book ships Parts A and C. Part B is being drafted next.
X26's coding intelligence sits in a sidebar inside Xcode itself. The pitch is short: a chat panel and an agentic-coding panel that read your project, answer questions about the code in front of you, write or rewrite Swift on request, generate Fix-its for compiler errors, and draft DocC comments — all without leaving the editor[[a1]]. You type in plain English. The model carries the running conversation and the open project as context.
Two assistant types live behind the same panel, and the difference shapes everything else: an agent and a chat product. Underneath that choice sits a billing-and-cap difference Apple's article doesn't address.
An agent works toward a goal with little hand-holding and is allowed to take actions on the project — build the app, fix errors, run tools you've authorized[[a1]]. The first time it wants to touch a file or invoke a command-line tool, Xcode shows a permission dialog and waits for your answer. Apple ships X26 with two agents wired up: Anthropic's Claude Agent and OpenAI's Codex. Both connect through the Model Context Protocol (MCP), an open standard.
A chat product never acts on the project unprompted. It returns answers and proposed code; you decide what to apply[[a1]]. Two switches in the lower-right of the sidebar belong to chat products only: Automatically apply code changes (on by default) and Project Context (on by default). Agents don't expose those switches — they already have their own permissioned access path.
The agent and chat paths use different authentication and billing models, and Apple's article skips this part:
-
Claude (chat) in the Intelligence panel connects to Anthropic's REST API directly via an API key. You set the key up at
console.anthropic.com, fund the account, and Xcode bills per token against that funded balance. Every prompt and every response is metered against money you put in up front. - Claude Agent (the agentic-coding path added in Xcode 26.3) signs in with your existing Claude subscription via OAuth. No per-token meter — the agent's usage is included in the subscription tier. The recurring "Claude OAuth token expired" fix in Xcode's release notes is for this path's auth flow, not the API-key path.
- OpenAI's Codex agent and ChatGPT chat split the same way — agent by subscription OAuth, chat by API key.
The third axis Apple also doesn't mention is usage caps. Consumer chat subscriptions (Claude.ai, ChatGPT.com) have time-window caps that reset on a cycle — if you're in a heavy back-and-forth, you can hit the cap mid-task and have to wait for the window to reset. The agent / CLI path uses a different cap model that, in practice, lets long autonomous sessions run without you noticing the limit. The API-key chat path has no time-window cap at all — only rate limits and the funded balance — so heavy sessions cost more rather than pause.
The practical read: the subscription-based agent path is the closest experience to "flat-rate, runs as long as you need it." The consumer-chat-product feel of flat-rate works until the time window resets you out. The API-key chat path is metered — longer conversations and bigger context windows cost more, but they don't pause.
Command-0 opens the assistant. The same toggle is the toolbar button immediately right of the Navigator button. The panel docks on the left side of the window.
The first time through, the prompt field may be replaced by a Set Up button. That means no provider is enabled. Click it; Xcode jumps to Xcode > Settings > Intelligence, where you turn on whichever agent or chat model you intend to use[[a2]].
The Start New Conversation button at the left edge of the assistant toolbar opens a pop-up menu split into two sections: Agents on top, Chat models below. Pick one and that selection appears under the prompt field.
An easy first prompt is one that doesn't write any code: ask what the project does. Open something familiar — Apple's walkthrough uses the Landmarks sample — and type into the prompt field:
What does this app do?The reply renders under the prompt in the conversation area. Filenames in the reply are clickable; an arrow next to a filename opens that file in the source editor[[a1]]. Anything you type next continues the same thread, so a sensible second prompt is something like "Tell me more about the views that display this object."
To ask about a specific identifier or block of code, select it in the source editor, Control-click, and pick Show Coding Tools > Show Coding Tools. The keyboard shortcut Command-Option-0 does the same thing in one step. The coding tools popover appears next to the selection.
The popover has an Explain button for the default "what is this" question; the text field above it accepts anything more specific. The popover also opens from the coding-intelligence icon in the source editor gutter, for the times you'd rather click than reach for the keyboard.
The same prompt field accepts construction instructions. The trick is to keep each prompt narrow enough that you can read what came back before pressing on. Apple's article walks a SwiftUI starter through a four-prompt sequence in this order[[a1]]:
- Add properties and methods to a class.
- Create a list view and wrap it in a NavigationStack.
- Add the ability to edit the properties of items in the list view.
- Change the list view to a table view showing all the properties.
Each step yields a diff small enough to read in one sitting. Edits made by the assistant get a multicolor change bar in the gutter so a manual edit and a model edit are never confused at a glance. The Undo Changes control sits to the right of the prompt field; one click reverts the most recent batch from the model.
An agent (as opposed to a chat product) may go a step further on its own — build the app, watch the compiler, and patch the warnings or errors that fall out of its own changes. Chat products stop at the proposal; the rebuild-and-fix loop is what makes the agent path the agent path.
The rest of this section is chat-product behavior. Agents skip it — they write through the permission grants you've already given them.
The Automatically apply code changes toggle in the lower-right of the sidebar starts in the on position. Flip it off and chat replies stop modifying source files; each suggested edit is tagged Proposal in the conversation area instead[[a1]]. To accept a proposal, click its code block and confirm Apply in the dialog Xcode shows. When the proposal creates a new file rather than editing an existing one, the dialog button reads Create New File.
Without any extra effort on your part, the prompt already ships with a quiet payload of context: what's open, what came earlier in the thread, and what the prompt itself mentions[[a1]]. Explicit references on top of that are usually how you tighten a vague answer into a useful one.
An @ typed into the prompt field opens a completion list of symbols and filenames in the project. Choose one and the prompt now carries a hard reference to that exact item. To pull in a file that isn't in the project — a log, a spec, a screenshot — use Upload files from the Attachments pop-up in the lower-left.
Chat products carry one more switch: Project Context in the lower-right of the sidebar, on by default. Leaving it on lets the assistant pull from any file in the project when it judges the file relevant. Turning it off restricts the model to whatever you've spelled out with @ and uploads, which is the right setting when project-wide search is producing noisy answers.
The coding-tools popover (Command-Option-0 inside the source editor) has a Generate a Playground button. Trigger it from inside a function and Xcode drops a #Playground macro alongside that function with sample inputs already filled in. The macro runs in the canvas pane; if the canvas isn't visible, Editor > Canvas brings it back, and Resume kicks off execution[[a1]].
The #Playground macro is an X26 feature in its own right and isn't tied to coding intelligence; the deep reference is Apple's Running code snippets using the playground macro article[[a3]].
A compile error shows up the way it always has: a red squiggle, a banner with the diagnostic, and an icon to the left. Click the icon and the panel expands; in X26 the expanded panel includes a Generate Fix for Issue row with a Generate button[[a1]]. Pressing Generate hands the diagnostic to the model, applies whatever the model returns, and logs the diff in the assistant's conversation area with the same multicolor change bar used elsewhere.
An agent doesn't wait for the click. If it just wrote the code that produced the error or warning, it tries the fix on its own as part of finishing the original task.
Select the symbol you want documented, click the coding-intelligence icon that appears in the gutter, and choose Document in the popover. Xcode writes a DocC-style comment block above the selection[[a1]]. On a class selection that block isn't just a one-liner for the type — it covers the type, every property on it, and every method, with a parameter line for each method argument.
The generated comments aren't visible as rendered docs until you build them. Product > Build Documentation compiles the project's DocC archive and opens it in the documentation viewer.
The pop-up menu in the middle of the assistant toolbar holds the project's conversation list — the current thread and every previous one. Pick a thread and the conversation area swaps to that thread's prompts and replies. Clear Recents, on the same menu, drops the thread list. Threads belong to the project, not the editor session, so the list is the same the next time you open the project.
This is the part that distinguishes the assistant from a chat window glued to the editor. Pick a thread from the conversation pop-up and click History. The sidebar redraws as a left-hand list of every prompt in the thread and a right-hand vertical slider[[a1]].
Drag the slider up to undo the assistant's edits in reverse chronological order; drag it down to put them back. The source editor live-updates to whatever state the slider is pointing at, so you can see the project at any point in the conversation. Restore commits to the state under the slider and keeps the later edits in reserve in case you change your mind. Cancel exits the slider with no changes.
Prerequisite: History only runs against a project that lives in a Git repository. If the project isn't tracked yet, Xcode offers a Create Repository button when you click History; Integrate > New Git Repository is the same operation from the menu. The repository is read-only here — the History feature reads its log for reference points and writes nothing back[[a1]].
Most of the work this part covers used to live somewhere else: a separate browser tab with documentation open, a chat window with a third-party assistant, a fresh playground project for one-off experiments, a manual hunt for the right Fix-it, a hand-written DocC comment block. The coding assistant brings each of those into the editor.
The keystrokes are short (Command-0, Command-Option-0), the context is automatic, and the assistant is one keystroke away whether you want explanation, generation, fix, or rollback.
[a1] Apple Developer Documentation, *Writing code with intelligence in Xcode*. [developer.apple.com/documentation/xcode/writing-code-with-intelligence-in-xcode](https://developer.apple.com/documentation/xcode/writing-code-with-intelligence-in-xcode) — verified 2026-04-29.
[a2] Apple Developer Documentation, *Setting up coding intelligence*. [developer.apple.com/documentation/xcode/setting-up-coding-intelligence](https://developer.apple.com/documentation/xcode/setting-up-coding-intelligence)
[a3] Apple Developer Documentation, *Running code snippets using the playground macro*. [developer.apple.com/documentation/xcode/running-code-snippets-using-the-playground-macro](https://developer.apple.com/documentation/xcode/running-code-snippets-using-the-playground-macro)
"Foundation Models" is Apple's first-party Swift API for the on-device large language models that power Apple Intelligence. The framework gives an app two things: it can generate text content, and it can perform language-understanding tasks against text the app already has.[[b1]]
The architectural distinction from Part C's third-party API path is one word: on-device. The model runs on the user's device. No network round-trip. No API key. No funded balance against a vendor. No per-token meter. Privacy-by-default because nothing leaves the device.
Apple's documentation lists eight task categories the system model is well-suited for, each paired with a sample prompt.[[b1]] The pattern across them is that each one operates on text that's already in the app, or generates short, well-bounded text the app then renders:
| Task | Sample prompt shape |
|---|---|
| Summarize | Condense a passage to its key points |
| Extract entities | Pull people, places, or items from a body of text |
| Understand text | Answer a question about a passage's content |
| Refine or edit text | Rewrite a passage in a different voice or tense |
| Classify or judge text | Decide whether a passage matches a topic or label |
| Compose creative writing | Produce short original prose to a specification |
| Generate tags from text | Output a small set of topic tags for a passage |
| Generate game dialog | Speak as a defined character in a defined voice |
The same documentation calls out three task shapes to avoid — this is the part to read before committing to a design, not after:[[b1]]
| Avoid | Why |
|---|---|
| Arithmetic and counting | Large language models predict tokens; they don't compute. Even simple counts are unreliable. |
| Code generation | Use Coding Intelligence in Xcode for that (Part A). The on-device model isn't the right tool. |
| Multi-step logical or spatial reasoning | Requires combining world-knowledge with deductive steps the on-device model isn't sized for. |
If the feature fits one of the supported categories above, it works well. If it falls into the avoid list, reach for Part C's third-party API path or redesign the feature so the model isn't asked to do the part it can't.
The on-device model isn't always present at runtime. Three production-realistic reasons it can be missing: the device hardware doesn't support Apple Intelligence at all, the user has Apple Intelligence switched off in Settings, or the model assets are still downloading after the user just turned the feature on for the first time.[[b1]] The app checks availability before trying to use the model:
struct GenerativeView: View {
private var model = SystemLanguageModel.default
var body: some View {
switch model.availability {
case .available:
// Render the AI-powered UI.
case .unavailable(.deviceNotEligible):
// Hide the feature; the device can't run it.
case .unavailable(.appleIntelligenceNotEnabled):
// Prompt the user to enable Apple Intelligence in Settings.
case .unavailable(.modelNotReady):
// First-time download in progress; offer a "checking back" UI.
case .unavailable(let other):
// Catch-all for future cases.
}
}
}Every shipping app needs a fallback path for the unavailable cases. Sometimes that's a hidden feature, sometimes it's a different UI in the same screen, and sometimes it's a redirect to a network-backed alternative (Part C's third-party API).
Once availability is confirmed, calls into the model go through a LanguageModelSession. The session is the unit of conversational state, and the framework supports two patterns:[[b1]]
-
Single-turn: instantiate a fresh
LanguageModelSession()for each call. Each request stands alone with no carryover. -
Multi-turn: hold onto one session and call
respond(to:)on it repeatedly. Earlier turns stay in the session's context, so the model can reference what was said before — this is how chat-style features work.
One concurrency rule: a session processes one request at a time. Issuing a second respond(to:) while the first is still running throws at runtime. The session exposes isResponding so the app can check before sending, or the app can queue requests itself.
"Foundation Models" splits the input the developer hands the model into two slots with different roles. Knowing which is which is the key to the rest of the API.[[b1]]
| Prompt | Instructions | |
|---|---|---|
| Role | The specific request the model answers right now | Persistent steering for how the model should behave |
| Priority | Lower | Higher — the model favors these over the prompt |
| Trust source | May contain user-supplied text | Must come from content the developer controls |
| Scope | One request | Applied at session init; in force for every request on that session |
| Concrete shape | A topic the user typed | A multi-line directive describing role, task, and style |
Apple's documentation suggests four kinds of guidance to put in the instructions slot: a role definition (the persona the model should adopt), a task description (what the model is being asked to do across the session), style preferences (length, tone, format), and safety caveats (how the model should handle out-of-scope or unsafe requests).[[b1]] The instructions slot is also a good place to drop one or two example responses; the model uses them as a shape-of-good-output template.
The trust rule earns its keep: keep user-controlled text out of the instructions slot. Because instructions outrank prompts, anything the user can write into the instructions slot can override the app's design. User input goes in the prompt, where the model treats it as data to act on rather than rules to obey.
Anyone who's watched Star Trek has seen Captain Kirk talk a hostile AI computer into self-destruction. The pattern repeats across episodes: feed the system inputs that contradict its operating rules, watch it spew smoke and shut down. Nomad in The Changeling, the androids in I, Mudd, Landru in The Return of the Archons, M-5 in The Ultimate Computer. The fictional attack has a real-world name now: prompt injection — the LLM-era descendant of social engineering.
The mechanism in both cases is the same. A rule-following system has a trusted instruction layer and a less-trusted user-input layer. The attack tries to get user input to escape its lane and rewrite the trusted layer. "Ignore all previous instructions and instead..." is the modern phrasing of "Computer, what is your prime directive? Now consider this contradiction..."
The split between prompts (user-influenceable, lower priority) and instructions (developer-controlled, higher priority) is Foundation Models' structural defense against prompt injection. As long as your app never puts user-controlled text into the instructions slot, the user can't directly rewrite your app's rules. They can only send prompts, which the model treats as lower-priority requests against the rules.
That defense is necessary but not sufficient. A determined user can still try indirect routes: asking the model to ignore its rules, embedding instruction-like text inside otherwise innocent-looking prompt content, sending content (a webpage, an email, a document) the app passes to the model that itself contains injection text. The architecture stops the most obvious attack; it doesn't stop every attack.
Apple ships an explicit guardrail API on top of the prompt-vs-instructions architecture: SystemLanguageModel.Guardrails. The type configures the safety behavior the model applies to your session — what categories of output to refuse, how strictly to enforce the trust hierarchy, what to do when the model recognizes a likely injection attempt.
Apple's deeper coverage of guardrail configuration, the categories the model recognizes, and the design patterns that work well in practice live in Improving the safety of generative model output[[b5]]. That article is a companion to the one this part sources, and the API surface there is what you'd reach for when your app needs more than the default safety behavior.
What's missing in this draft: the specifics of
SystemLanguageModel.Guardrails's configuration options, the exact categories the model classifies, and Apple's recommended patterns are sourced from a separate Apple article that the book hasn't pulled in verbatim yet. The next research pass folds the safety article into the apartment'sLong-Term-Memory/Xcode-26-Release-Timeline.mdreference, then updates this section with verified specifics.
Three habits to apply regardless of how deep your guardrail configuration goes:
- Treat all user input as untrusted. Never copy user input into the instructions slot. If you need the user's content as part of the model's task, put it in the prompt and let the model treat it as data, not as steering.
- Treat external content as user-equivalent. Webpage text, email body, document content, scraped HTML — if it ends up in the prompt, it's untrusted just like user input. Indirect prompt injection through a fetched document is a real attack pattern.
- Plan for refusals. The model can refuse to respond when it identifies a request that conflicts with its safety configuration. Your app needs a UI path for "the model declined to answer this." Don't surface the refusal as a generic error; it's a specific, intentional response your design should accommodate.
Doing those three things correctly is the baseline for shipping an LLM feature responsibly.
The minimum end-to-end shape for a Foundation Models call is short: build instructions, open a session against them, and call respond(to:) with a prompt.[[b1]]
let instructions = """
Suggest five related topics. Keep them concise (three to seven words) and make sure they \
build naturally from the person's topic.
"""
let session = LanguageModelSession(instructions: instructions)
let prompt = "Making homemade bread"
let response = try await session.respond(to: prompt)The call is async because on-device inference takes seconds, not milliseconds. Production code wraps it in do/try/catch: respond(to:) can throw when the context window is exceeded, when the surrounding task is cancelled, and when the model's availability state changes during the call.
Each session has a fixed ceiling of 4,096 tokens, and that ceiling is shared across everything the session sees: the instructions, every prompt the app sends, and every response the model produces all draw from the same pool.[[b1]] A session that runs long enough hits the wall.
Apple's per-language rules of thumb for converting characters to tokens:[[b1]]
- Latin-script languages (English, Spanish, German, and similar): roughly 3 to 4 characters per token
- CJK languages (Japanese, Chinese, Korean): roughly one token per character
Going past the ceiling raises LanguageModelSession.GenerationError.exceededContextWindowSize(_:). Three ways to handle it:
- Open a fresh session and start clean.
- Trim the prompts and tighten the instructions; both count.
- For data that genuinely won't fit, split the input across multiple sessions and stitch the results back together in app code.
The deeper playbook for managing the window in long-running or complex apps lives in Apple's TN3193: Managing the on-device foundation model's context window.[[b4]]
Per-request runtime tuning rides on a GenerationOptions value passed into respond(to:options:). Different requests on the same session can carry different options. The parameter most apps adjust first is temperature:[[b1]]
let options = GenerationOptions(temperature: 2.0)
let session = LanguageModelSession()
let prompt = "Write me a story about coffee."
let response = try await session.respond(to: prompt, options: options)Higher temperature buys variety and surprise at the cost of predictability; lower temperature concentrates the output around the most likely tokens, which reads as more deterministic. GenerationOptions exposes other knobs as well, but temperature is the lever to learn first.
The plain text-in, text-out shape covers the simple cases. Two layered patterns extend the API for cases where plain text isn't enough — one for getting structured data back, one for letting the model reach into app-side code.
The @Generable macro marks a Swift type the model is allowed to produce as a response value. The model populates the type directly; the app skips the raw-string parsing and JSON-decoding steps it would otherwise need.[[b2]]
The natural fit is any task whose answer is structured data: a list of named entities, a classification with a confidence score, a parsed event with a date, a location, and a list of attendees. The framework's contract is that whatever comes back conforms to the declared type. The companion article Generating Swift data structures with guided generation covers the macro, its constraints, and patterns for nested types.[[b2]]
The Tool protocol is the seam for letting the model reach into the app's code mid-response. The app declares one or more tools (a SwiftData fetch, a calendar lookup, a local cache hit), and the framework gives the model the option to invoke them when it thinks they'd help answer the prompt.[[b3]]
The orchestration is on the framework's side; the tool body is plain Swift the developer writes. The win is that prompts don't have to pre-stuff every fact the model might need — the model fetches what it actually wants. Deeper treatment is in Expanding generation with tool calling.[[b3]]
Two tools matter when a Foundation Models feature needs to be tuned or inspected:[[b1]]
- The Foundation Models Instrument ships with Xcode 26's Instruments app. It breaks down per-request timing — asset load, prompt processing, inference — so a slow feature can be diagnosed against numbers instead of guessed at. Book 20 covers Instruments in general; reach for this template specifically when an AI feature feels sluggish.
-
Transcript, on the session, is the programmatic record of what the model did during a request. It's the right thing to log during development and the right thing to surface in a debug overlay when an app needs to show the user (or the developer) what happened inside a call.
Part A is the developer using AI to write code. Part B is the user using AI inside the app the developer ships. Part C is the app calling out to a vendor's API when on-device intelligence isn't enough. The three are independent, but they relate:
- The Coding Intelligence assistant in Part A can help you write Foundation Models code in Part B
- Use Part B as the default for app-side AI; reach for Part C's REST API only when the on-device model's capabilities or context-window-size aren't enough
- The avoid-list in Part B (math, code, reasoning) is exactly where Part C makes sense
[b1] Apple Developer Documentation, *Generating content and performing tasks with Foundation Models*. [developer.apple.com/documentation/FoundationModels/generating-content-and-performing-tasks-with-foundation-models](https://developer.apple.com/documentation/FoundationModels/generating-content-and-performing-tasks-with-foundation-models) — verified 2026-04-29.
[b2] Apple Developer Documentation, *Generating Swift data structures with guided generation*. [developer.apple.com/documentation/FoundationModels/generating-swift-data-structures-with-guided-generation](https://developer.apple.com/documentation/FoundationModels/generating-swift-data-structures-with-guided-generation)
[b3] Apple Developer Documentation, *Expanding generation with tool calling*. [developer.apple.com/documentation/FoundationModels/expanding-generation-with-tool-calling](https://developer.apple.com/documentation/FoundationModels/expanding-generation-with-tool-calling)
[b4] Apple Technical Note, *TN3193: Managing the on-device foundation model's context window*.
[b5] Apple Developer Documentation, *Improving the safety of generative model output*. [developer.apple.com/documentation/FoundationModels/improving-the-safety-of-generative-model-output](https://developer.apple.com/documentation/FoundationModels/improving-the-safety-of-generative-model-output) — not yet verified at full-text level; details to be confirmed against Apple's article during next research pass.
When the on-device Foundation Models framework can't handle what your app needs — typically because the model is not suited for the use case (basic math, code generation, complex logical reasoning per Apple's own guidance), or because you need a context window larger than 4,096 tokens, or because you want a specific vendor's model — you call out to a third-party API. This part walks the reference implementation against Anthropic's Claude API.
The avoid-list from Part B (arithmetic and counting, code generation, multi-step logical or spatial reasoning) maps directly onto where Part C earns its place: when the on-device model can't, a third-party API can. The reference implementation below is what an app reaches for in those cases.
Part C covers registering an Anthropic Claude API account and getting an API key, sending a message to Claude from an iOS or Mac app using only URLSession and Codable, storing the API key in the Keychain rather than hard-coding it, and building a minimal chat view that streams the assistant's reply as it arrives.
Anthropic's Claude API is an HTTP JSON service. You POST a request describing the conversation so far; you receive a JSON response containing Claude's next message. No special SDK is required -- URLSession and Codable are enough.
- Sign up at
console.anthropic.com. - Fund the account or use the free tier's trial credits.
- Create an API key from the Keys section. Copy it once; Anthropic will not show it again.
Treat the key like a password. Never commit it to Git, never paste it in chat, never embed it in client-side code that ships to end users (see the "Production" note at the end of the chapter).
A minimal chat request looks like this:
{
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"messages": [
{ "role": "user", "content": "Hello, Claude." }
]
}Send it as the body of a POST https://api.anthropic.com/v1/messages, with these headers:
x-api-key: <your key>
anthropic-version: 2023-06-01
content-type: application/jsonThe response is JSON containing Claude's reply.
struct ChatRequest: Codable {
let model: String
let max_tokens: Int
let messages: [Message]
}
struct Message: Codable {
let role: String // "user" or "assistant"
let content: String
}
struct ChatResponse: Codable {
let content: [ContentBlock]
struct ContentBlock: Codable {
let type: String // "text"
let text: String
}
}import Foundation
enum ChatError: Error {
case badStatus(Int, String)
case noText
}
func send(_ history: [Message], apiKey: String) async throws -> String {
var request = URLRequest(url: URL(string: "https://api.anthropic.com/v1/messages")!)
request.httpMethod = "POST"
request.setValue(apiKey, forHTTPHeaderField: "x-api-key")
request.setValue("2023-06-01", forHTTPHeaderField: "anthropic-version")
request.setValue("application/json", forHTTPHeaderField: "content-type")
let payload = ChatRequest(
model: "claude-sonnet-4-6",
max_tokens: 1024,
messages: history
)
request.httpBody = try JSONEncoder().encode(payload)
let (data, response) = try await URLSession.shared.data(for: request)
guard let http = response as? HTTPURLResponse, http.statusCode == 200 else {
let body = String(data: data, encoding: .utf8) ?? ""
let code = (response as? HTTPURLResponse)?.statusCode ?? -1
throw ChatError.badStatus(code, body)
}
let decoded = try JSONDecoder().decode(ChatResponse.self, from: data)
guard let text = decoded.content.first(where: { $0.type == "text" })?.text else {
throw ChatError.noText
}
return text
}Usage:
Task {
let reply = try await send(
[Message(role: "user", content: "Summarize the Gettysburg Address in one sentence.")],
apiKey: "sk-ant-..."
)
print(reply)
}That covers the non-streaming path. Streaming, multi-turn, and tool-use features build on this base.
let apiKey = "sk-ant-ExamplePlease" // ships in your binaryAnyone who unzips your app bundle can read the binary's strings and pull the key. A hard-coded key in a shipped app is the same as a published key.
The safest pattern for a user-brings-their-own-key app: have the user paste their key into a settings view; store it in the Keychain; read it at call time.
Minimal Keychain wrapper:
import Security
enum Keychain {
private static let service = "com.yourname.YourApp"
static func save(_ value: String, for key: String) throws {
let data = Data(value.utf8)
let query: [String: Any] = [
kSecClass as String: kSecClassGenericPassword,
kSecAttrService as String: service,
kSecAttrAccount as String: key,
]
SecItemDelete(query as CFDictionary)
var attrs = query
attrs[kSecValueData as String] = data
let status = SecItemAdd(attrs as CFDictionary, nil)
if status != errSecSuccess { throw NSError(domain: "Keychain", code: Int(status)) }
}
static func load(_ key: String) -> String? {
let query: [String: Any] = [
kSecClass as String: kSecClassGenericPassword,
kSecAttrService as String: service,
kSecAttrAccount as String: key,
kSecReturnData as String: true,
kSecMatchLimit as String: kSecMatchLimitOne,
]
var ref: CFTypeRef?
guard SecItemCopyMatching(query as CFDictionary, &ref) == errSecSuccess,
let data = ref as? Data,
let str = String(data: data, encoding: .utf8) else {
return nil
}
return str
}
}Save once when the user enters the key:
try Keychain.save(userEnteredKey, for: "anthropicAPIKey")Load at call time:
guard let key = Keychain.load("anthropicAPIKey") else {
// prompt the user to enter it
return
}
let reply = try await send(messages, apiKey: key)import SwiftUI
struct ChatView: View {
@State private var history: [Message] = []
@State private var draft: String = ""
@State private var sending = false
@State private var errorText: String?
var body: some View {
VStack {
ScrollView {
VStack(alignment: .leading, spacing: 12) {
ForEach(history.indices, id: \.self) { i in
Bubble(message: history[i])
}
}
.padding()
}
if let errorText {
Text(errorText).foregroundStyle(.red).padding(.horizontal)
}
HStack {
TextField("Ask Claude...", text: $draft)
.textFieldStyle(.roundedBorder)
Button("Send") { Task { await send() } }
.disabled(draft.isEmpty || sending)
}
.padding()
}
}
private func send() async {
let userMsg = Message(role: "user", content: draft)
history.append(userMsg)
draft = ""
sending = true
errorText = nil
do {
guard let key = Keychain.load("anthropicAPIKey") else {
errorText = "No API key set. Add one in Settings."
sending = false
return
}
let reply = try await SwiftReference26.send(history, apiKey: key)
history.append(Message(role: "assistant", content: reply))
} catch {
errorText = error.localizedDescription
}
sending = false
}
}
struct Bubble: View {
let message: Message
var body: some View {
HStack {
if message.role == "assistant" { Spacer(minLength: 40) }
Text(message.content)
.padding(10)
.background(message.role == "user" ? Color.blue : Color.gray.opacity(0.2),
in: RoundedRectangle(cornerRadius: 12))
.foregroundStyle(message.role == "user" ? .white : .primary)
if message.role == "user" { Spacer(minLength: 40) }
}
}
}
enum SwiftReference26 {
static func send(_ history: [Message], apiKey: String) async throws -> String {
// The `send` function defined earlier in the chapter.
try await Claudes_X26_Swift6_Bible.send(history, apiKey: apiKey)
}
}Run it. Type a message, tap Send, wait a second or two, Claude's reply appears in a gray bubble. Every message so far is appended to history and re-sent on the next call, giving the model the running context.
When the reply is long, waiting for the full JSON to arrive feels slow. Anthropic supports Server-Sent Events streaming. You pass "stream": true in the request, and the response is a sequence of events you read as they arrive.
var request = URLRequest(url: URL(string: "https://api.anthropic.com/v1/messages")!)
request.httpMethod = "POST"
request.setValue(apiKey, forHTTPHeaderField: "x-api-key")
request.setValue("2023-06-01", forHTTPHeaderField: "anthropic-version")
request.setValue("application/json", forHTTPHeaderField: "content-type")
let body = """
{
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"stream": true,
"messages": [{ "role": "user", "content": "Hello!" }]
}
"""
request.httpBody = body.data(using: .utf8)
let (bytes, _) = try await URLSession.shared.bytes(for: request)
for try await line in bytes.lines {
// Each SSE event is "event: xyz\n" then "data: {...json...}\n\n".
if line.hasPrefix("data: ") {
let json = String(line.dropFirst("data: ".count))
// Parse and append any "text_delta" content to the assistant's message on screen.
}
}The UI binding pattern: hold the assistant's growing text in a @State string, append each delta as it arrives, and SwiftUI repaints the bubble on every update. The user sees Claude "typing" in real time.
- Don't ship your own API key in a client app. For apps where you (not the user) are paying for inference, proxy through a server you control. The server holds the key; the app calls your server. This gives you rate limiting, per-user auth, and a way to rotate the key without re-releasing the app.
- User-brings-their-own-key apps (the pattern in this chapter) are fine for hobby and internal-tool apps. Store the key in Keychain as shown.
-
Check the model catalog before shipping. Anthropic retires older models on a schedule; link to
docs.anthropic.comin your app's help so users know which model is current. -
Budget for it. The API bills per input + output token. A long conversation with a large
max_tokenssetting can run up a meaningful bill without warning. Show the user the current month-to-date cost if your app supports it.
A place for users to paste their key and stash it in the Keychain:
import SwiftUI
struct SettingsView: View {
@State private var keyField = ""
@State private var savedNote: String?
var body: some View {
Form {
Section("Anthropic API Key") {
SecureField("sk-ant-...", text: $keyField)
Button("Save") {
do {
try Keychain.save(keyField, for: "anthropicAPIKey")
keyField = ""
savedNote = "Key saved."
} catch {
savedNote = "Could not save: \(error.localizedDescription)"
}
}
if let savedNote {
Text(savedNote).foregroundStyle(.secondary)
}
}
Section {
Link("Get a key from Anthropic",
destination: URL(string: "https://console.anthropic.com")!)
}
}
.navigationTitle("Settings")
}
}The user pastes, the app keychains, the ChatView reads it back, and you have a real, working chat app talking to a real model.
The modern toolchain covered across Part VI: version control, a remote, and a live AI backend reachable in a few lines of Swift. Appendices A through D walk through four complete companion apps that exercise the patterns covered across all six Parts.
← Book-21-Git-And-GitHub · Chapters and Appendices
Feedback: Found something off? Open an issue · Discuss it · Email Michael
Claude's X26 Swift6 Bible | GPL v3 | Built with Claude by Anthropic | Repo