An xcframework for running Microsoft's Phi-3 locally in an iOS app using candle.
Includes a sample iOS app.
Run `./build.sh` to build the xcframework. Then open `phi.engine.sample/phi.engine.sample.xcodeproj` and build the SwiftUI app.
For a detailed explanation of how this works, check out the blog post here.
 - ✅ Tested on an iPad Air M1 with 8GB RAM
 - ✅ Should also work on iPhones with 6GB RAM
 - ❌ Will not work on iPhones with 4GB RAM
However, on 4GB RAM iPhones it is possible to use the (very) low fidelity Q2_K quantized model. That model is not included in the official Phi-3 release, but I successfully tested this one from HuggingFace on an iPhone 12 mini.
This is the setup in the app:

```swift
self.engine = try! PhiEngine(
    engineOptions: EngineOptions(
        cacheDir: FileManager.default.temporaryDirectory.path(),
        systemInstruction: nil,
        tokenizerRepo: nil,
        modelRepo: "SanctumAI/Phi-3-mini-4k-instruct-GGUF",
        modelFileName: "phi-3-mini-4k-instruct.Q2_K.gguf",
        modelRevision: "main"
    ),
    eventHandler: ModelEventsHandler(parent: self)
)
```
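`ModelEventsHandler` is the app-side listener that receives events from the engine as inference runs. The sketch below is illustrative only: the `PhiEventHandler` protocol name and callback signatures are assumptions about the generated Swift bindings (they are not shown in this README), and `ChatViewModel` is a placeholder for whatever `self` is in the setup code above.

```swift
import Combine
import Foundation

// Placeholder for the SwiftUI view model that owns the engine
// (stands in for `self` in the setup snippet above).
final class ChatViewModel: ObservableObject {
    @Published var transcript = ""
    @Published var isReady = false
}

// A minimal sketch, assuming the generated bindings expose a
// PhiEventHandler protocol with these callbacks (hypothetical names;
// check the actual generated Swift interface in the xcframework).
class ModelEventsHandler: PhiEventHandler {
    private weak var parent: ChatViewModel?

    init(parent: ChatViewModel) {
        self.parent = parent
    }

    func onModelLoaded() {
        // Fires once the GGUF model has been fetched and loaded.
        DispatchQueue.main.async { self.parent?.isReady = true }
    }

    func onInferenceToken(token: String) {
        // Streams each generated token; append it to the visible transcript.
        DispatchQueue.main.async { self.parent?.transcript += token }
    }
}
```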
This variant also requires the system instruction to be set differently, since it recognizes the `<|system|>` token. In order for the model to stop inference correctly, this line in the Rust code:
```rust
let prompt_with_history = format!("<|user|>\nYour overall instructions are: {}<|end|>\n<|assistant|>Understood, I will adhere to these instructions<|end|>{}\n<|assistant|>\n", self.system_instruction, history_prompt);
```
needs to be changed to:
```rust
let prompt_with_history = format!("<|system|>\nYour overall instructions are: {}<|end|>{}\n<|assistant|>\n", self.system_instruction, history_prompt);
```
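With that change in place, the system instruction can be passed through `EngineOptions` as usual. A minimal sketch, reusing the constructor from the setup above (the instruction string is just an example):

```swift
// Same setup as above, but with a system instruction that the Q2_K
// variant will receive via the <|system|> token after the Rust change.
let options = EngineOptions(
    cacheDir: FileManager.default.temporaryDirectory.path(),
    systemInstruction: "You are a concise, helpful assistant.",
    tokenizerRepo: nil,
    modelRepo: "SanctumAI/Phi-3-mini-4k-instruct-GGUF",
    modelFileName: "phi-3-mini-4k-instruct.Q2_K.gguf",
    modelRevision: "main"
)
self.engine = try! PhiEngine(
    engineOptions: options,
    eventHandler: ModelEventsHandler(parent: self)
)
```

With the adjusted template, the prompt sent to the model then starts with `<|system|>\nYour overall instructions are: You are a concise, helpful assistant.<|end|>`, followed by the chat history and the trailing `<|assistant|>` turn marker.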