
Add -w/--webui switch for llama.cpp webui integration #15

Merged

scouzi1966 merged 3 commits into main from feature/webui-integration on Jan 23, 2026

Conversation

scouzi1966 (Owner) commented on Jan 23, 2026

Summary

  • Add -w/--webui CLI flag to enable webui and open browser automatically
  • Add llama.cpp as git submodule (sparse checkout for webui files only)
  • Add /props endpoint for llama.cpp webui compatibility
  • Add webui serving with gzip decompression and CSS injection
  • Make model field optional in chat completion requests (webui compatibility)
  • Add Makefile targets: submodules, webui, build-with-webui
  • Update distribution scripts to include webui resources
  • Add Homebrew-compatible paths for webui discovery

Usage

# Build with webui
make build-with-webui

# Run with webui
afm -w
afm --webui
afm -w --port 8080

Test plan

  • Verify -w and --webui flags work
  • Verify webui loads in browser
  • Test /health endpoint with webui enabled
  • Test /v1/models endpoint with webui enabled
  • Test /v1/chat/completions (non-streaming) with webui enabled
  • Test /v1/chat/completions (streaming) with webui enabled
  • Test chat completions without model field (webui style)
  • Verify /props endpoint returns expected format

Notes

  • Node.js required only for building webui (make webui), not at runtime
  • Most webui settings (penalties, top-k, etc.) have no effect - Apple Foundation Model only supports temperature
  • Homebrew formula needs update to install webui to share directory

🤖 Generated with Claude Code

Summary by Sourcery

Add optional llama.cpp-compatible web UI integration and metadata endpoints, and update build and distribution tooling to bundle and serve the web UI when requested.

New Features:

  • Introduce a -w/--webui CLI flag to enable a bundled web UI and open it in the default browser.
  • Expose a /props endpoint and adjust model metadata for compatibility with the llama.cpp web UI.
  • Allow chat completion requests to omit the model field by defaulting to the foundation model.

Enhancements:

  • Add gzip-based web UI serving with runtime discovery of bundled resources and minimal UI customization for unsupported features.
  • Improve server startup logs to reflect web UI status and browser launch behavior.

Build:

  • Add llama.cpp as a git submodule and Makefile targets to build the web UI assets and perform web UI-inclusive builds.
  • Update portable build and distribution scripts to copy and install web UI resources alongside the afm binary, including Homebrew-compatible share paths.

scouzi1966 and others added 2 commits January 22, 2026 12:38
- Add -w/--webui CLI flag to enable webui and open browser
- Add llama.cpp as git submodule (sparse checkout for webui only)
- Add /props endpoint for llama.cpp webui compatibility
- Add webui serving with gzip decompression and CSS injection
- Make 'model' field optional in chat completion requests
- Add Makefile targets: submodules, webui, build-with-webui
- Update build-portable.sh to include webui resources
- Update .gitignore for webui build artifacts

The webui uses OpenAI-compatible endpoints (/v1/chat/completions,
/v1/models, /health) which are already implemented. CSS injection
hides the attachment button since AFM doesn't support file uploads.

Note: Most webui settings (penalties, top-k, etc.) have no effect
as Apple Foundation Model only supports temperature parameter.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add Homebrew-style paths to webui discovery (/usr/local/share/afm/webui/,
  /opt/homebrew/share/afm/webui/)
- Update create-distribution.sh to include webui in tarball
- Update portable install script to install webui to share directory

This ensures the webui works when installed via Homebrew tap or
portable distribution package.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
sourcery-ai bot commented on Jan 23, 2026

Reviewer's Guide

Adds an optional llama.cpp-compatible web UI mode (toggled via -w/--webui) that serves a gzip-compressed SPA, exposes a /props endpoint, and relaxes chat completion model requirements, along with build/distribution plumbing for bundling the web UI and llama.cpp submodule.

Sequence diagram for webui request handling with gzip and CSS injection

sequenceDiagram
    actor User
    participant Browser
    participant Server
    participant FileSystem

    User->>Browser: Navigate to afm URL (afm -w)
    Browser->>Server: GET /
    Server->>FileSystem: Read index.html.gz
    FileSystem-->>Server: Gzipped HTML bytes
    Server->>Server: gunzip(data)
    alt Decompression succeeds
        Server->>Server: Inject customCSS before </head>
        Server-->>Browser: 200 OK
        note right of Server: contentType=text/html
    else Decompression fails
        Server-->>Browser: 200 OK (gzip payload)
        note right of Server: contentEncoding=gzip
    end

    Browser->>Server: GET /props
    Server-->>Browser: 200 OK PropsResponse JSON
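In code, the success/failure branches above might reduce to something like this Swift sketch (assuming Vapor; `gunzip(data:)` and `customCSS` are the Server helpers named in the class diagram below, and the PR's actual implementation may differ):

```swift
import Vapor
import Foundation

extension Server {
    // Sketch only: serve the gzipped SPA, injecting custom CSS when
    // decompression succeeds and falling back to the raw gzip payload.
    func serveWebuiWithCustomCSS(webuiFilePath: String, req: Request) async throws -> Response {
        let gzipped = try Data(contentsOf: URL(fileURLWithPath: webuiFilePath))
        var headers = HTTPHeaders()
        headers.replaceOrAdd(name: .contentType, value: "text/html; charset=utf-8")

        if let html = try? Self.gunzip(data: gzipped),
           var page = String(data: html, encoding: .utf8) {
            // Decompression succeeded: inject the CSS right before </head>.
            page = page.replacingOccurrences(
                of: "</head>",
                with: "<style>\(Self.customCSS)</style></head>")
            return Response(status: .ok, headers: headers, body: .init(string: page))
        }

        // Decompression failed: let the browser decode the gzip payload itself.
        headers.replaceOrAdd(name: .contentEncoding, value: "gzip")
        return Response(status: .ok, headers: headers, body: .init(data: gzipped))
    }
}
```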

Updated class diagram for server, commands, and webui-related models

classDiagram
    class Server {
        - app: Application
        - port: Int
        - hostname: String
        - verbose: Bool
        - streamingEnabled: Bool
        - instructions: String
        - adapter: String?
        - temperature: Double?
        - randomness: String?
        - permissiveGuardrails: Bool
        - webuiEnabled: Bool
        - webuiPath: String?
        + init(port: Int, hostname: String, verbose: Bool, streamingEnabled: Bool, instructions: String, adapter: String?, temperature: Double?, randomness: String?, permissiveGuardrails: Bool, webuiEnabled: Bool) async throws
        + start() async throws
        + shutdown()
        - serveWebuiWithCustomCSS(webuiFilePath: String, req: Request) async throws -> Response
        - openBrowser(url: String)
        - static findWebuiPath() -> String?
        - static gunzip(data: Data) throws -> Data
        - static customCSS: String
    }

    class ChatCompletionsController {
        - streamingEnabled: Bool
        - instructions: String
        - adapter: String?
        - temperature: Double?
        - randomness: String?
        - permissiveGuardrails: Bool
        + boot(routes: RoutesBuilder) throws
        + create(req: Request) async throws -> Response
    }

    class ChatCompletionRequest {
        + model: String?
        + messages: [Message]
        + temperature: Double?
        + maxTokens: Int?
    }

    class ModelsResponse {
        + object: String
        + data: [ModelInfo]
    }

    class ModelInfo {
        + id: String
        + object: String
        + created: Int
        + owned_by: String
    }

    class PropsResponse {
        + default_generation_settings: DefaultGenerationSettings
        + total_slots: Int
        + model_path: String
        + role: String
        + modalities: Modalities
        + chat_template: String
        + bos_token: String
        + eos_token: String
        + build_info: String
    }

    class DefaultGenerationSettings {
        + n_ctx: Int
        + params: GenerationParams
    }

    class GenerationParams {
        + n_predict: Int
        + temperature: Double
        + top_k: Int
        + top_p: Double
        + min_p: Double
        + stream: Bool
        + max_tokens: Int
    }

    class Modalities {
        + vision: Bool
        + audio: Bool
    }

    class ServeCommand {
        + port: Int
        + hostname: String
        + verbose: Bool
        + noStreaming: Bool
        + instructions: String
        + adapter: String?
        + temperature: Double?
        + randomness: String?
        + permissiveGuardrails: Bool
        + webui: Bool
        + run() throws
    }

    class RootCommand {
        + port: Int
        + hostname: String
        + verbose: Bool
        + noStreaming: Bool
        + instructions: String
        + adapter: String?
        + temperature: Double?
        + randomness: String?
        + permissiveGuardrails: Bool
        + webui: Bool
        + run() throws
    }

    Server o-- PropsResponse
    PropsResponse o-- DefaultGenerationSettings
    DefaultGenerationSettings o-- GenerationParams
    PropsResponse o-- Modalities

    ModelsResponse o-- ModelInfo

    ChatCompletionsController o-- ChatCompletionRequest

    ServeCommand --> Server
    RootCommand --> ServeCommand

File-Level Changes

Introduce web UI mode on the server/CLI and serve llama.cpp-compatible SPA with gzip decompression and CSS injection.
  • Extend Server initializer with webuiEnabled flag and resolve webui index.html.gz path at startup.
  • Add /props endpoint returning llama.cpp-style generation settings and model metadata.
  • Register root and SPA-fallback routes that serve the web UI HTML, skipping API paths like /v1/*, /health, and /props.
  • Implement gunzip helper using Compression framework to decompress gzip content and inject custom CSS hiding unsupported attachment UI elements.
  • Log web UI status in startup output and automatically open the default browser when WebUI is enabled and found.
  • Add helper to locate webui resources across portable, development, and Homebrew-style locations and an openBrowser helper to call /usr/bin/open (both sketched below).
Sources/MacLocalAPI/Server.swift
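A plausible shape for those two helpers (a sketch; the candidate locations mirror the ones named above, but the real search order may differ):

```swift
import Foundation

extension Server {
    // Sketch: probe known install locations for the bundled index.html.gz.
    static func findWebuiPath() -> String? {
        let executableDir = URL(fileURLWithPath: CommandLine.arguments[0])
            .resolvingSymlinksInPath()
            .deletingLastPathComponent()
        let candidates = [
            executableDir.appendingPathComponent("Resources/webui").path, // portable build
            "Resources/webui",                                            // development tree
            "/usr/local/share/afm/webui",                                 // Homebrew (Intel)
            "/opt/homebrew/share/afm/webui"                               // Homebrew (Apple silicon)
        ]
        return candidates
            .map { $0 + "/index.html.gz" }
            .first { FileManager.default.fileExists(atPath: $0) }
    }

    // Sketch: open the served URL in the default browser via /usr/bin/open.
    func openBrowser(url: String) {
        let process = Process()
        process.executableURL = URL(fileURLWithPath: "/usr/bin/open")
        process.arguments = [url]
        try? process.run()
    }
}
```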
Expose CLI flags to enable the web UI and propagate configuration to the server.
  • Add -w/--webui flag to ServeCommand and RootCommand (see the sketch below).
  • Pass webuiEnabled value into Server initializer when launching the server.
  • Plumb webui flag from root command into the underlying serve command.
Sources/MacLocalAPI/main.swift
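The flag itself is presumably little more than this (a sketch assuming swift-argument-parser, which the ServeCommand/RootCommand structure suggests; the surrounding options are elided):

```swift
import ArgumentParser

struct ServeCommand: ParsableCommand {
    // ... port, hostname, verbose, and the other existing options elided ...

    @Flag(name: [.customShort("w"), .customLong("webui")],
          help: "Enable the bundled web UI and open it in the default browser")
    var webui = false

    func run() throws {
        // `webui` is forwarded into Server(..., webuiEnabled: webui) at launch.
    }
}
```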
Relax chat completion API model handling for web UI compatibility.
  • Make ChatCompletionRequest.model optional in OpenAIRequest models (sketched below).
  • Default model field to "foundation" in non-streaming ChatCompletionResponse when request omits model.
  • Default model field to "foundation" for streaming responses and final usage chunk when request omits model.
Sources/MacLocalAPI/Models/OpenAIRequest.swift
Sources/MacLocalAPI/Controllers/ChatCompletionsController.swift
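Concretely, the relaxed request shape and the fallback likely amount to this (a sketch; `Message` is the existing message type and `Content` is Vapor's Codable wrapper):

```swift
import Vapor

// Sketch: `model` is now optional so the llama.cpp webui may omit it.
struct ChatCompletionRequest: Content {
    let model: String?
    let messages: [Message]
    let temperature: Double?
    let maxTokens: Int?
}

// In ChatCompletionsController, fall back when the field is missing:
let responseModel = request.model ?? "foundation"
```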
Align models and props API responses with expected llama.cpp/OpenAI shapes.
  • Rename ModelInfo.ownedBy to owned_by to match expected JSON key.
  • Add PropsResponse, DefaultGenerationSettings, GenerationParams, and Modalities types to model /props response.
  • Wire /props route to emit AFM-specific default generation settings and build info (see the sketch below).
Sources/MacLocalAPI/Server.swift
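The route wiring could look roughly like this (a sketch assuming Vapor and that PropsResponse conforms to Content; every literal value below is illustrative rather than taken from the PR):

```swift
// Sketch: /props hands the webui llama.cpp-style defaults so it can initialize.
app.get("props") { _ -> PropsResponse in
    PropsResponse(
        default_generation_settings: DefaultGenerationSettings(
            n_ctx: 4096,                                    // illustrative
            params: GenerationParams(n_predict: -1,
                                     temperature: 1.0,      // the only knob AFM honors
                                     top_k: 40, top_p: 0.95, min_p: 0.05,
                                     stream: true, max_tokens: 4096)),
        total_slots: 1,
        model_path: "apple-foundation-model",               // illustrative
        role: "assistant",                                  // illustrative
        modalities: Modalities(vision: false, audio: false),
        chat_template: "",
        bos_token: "",
        eos_token: "",
        build_info: "afm"                                   // illustrative
    )
}
```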
Add build and distribution support for bundling llama.cpp web UI assets.
  • Extend Makefile with submodules, webui, and build-with-webui targets to pull llama.cpp submodule, build the web UI via npm, and copy index.html.gz into Resources/webui.
  • Update help target text and examples to document new web UI build targets.
  • Update build-portable.sh to copy Resources/webui into .build/release/Resources/webui, track whether web UI is included, and print usage hints when present.
  • Update create-distribution.sh to include web UI resources under share/afm/webui, install them via install.sh into /usr/local/share/afm/webui, and mention them in package contents and usage notes.
Makefile
build-portable.sh
create-distribution.sh
Vendor llama.cpp as a submodule and adjust git configuration for web UI source.
  • Add .gitmodules entry for vendor/llama.cpp submodule.
  • Add vendor/llama.cpp directory from submodule into the repo and rely on sparse checkout for web UI-related paths.
  • Update .gitignore so that tracked vendor content remains allowed where needed.
.gitmodules
vendor/llama.cpp
.gitignore


sourcery-ai bot left a comment


Hey - I've found 1 issue and left some high-level feedback:

  • The custom gunzip implementation uses a fixed 10MB buffer and compression_decode_buffer once, which risks truncating larger webui bundles; consider either streaming/decompressing in chunks or dynamically growing the buffer until decompression succeeds (a growing-buffer sketch follows this list).
  • Auto-opening the browser whenever -w/--webui is enabled may be undesirable in headless/CI or remote server scenarios; consider making the browser launch optional (e.g., a separate flag or tied to verbose) while still serving the web UI.
  • In create-distribution.sh, the install script message now says Requires macOS 26+, which looks like a typo compared to the previous 15.1+; please confirm and correct the version string if needed.
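For the first point, a growing-buffer variant might look like the sketch below (it assumes a minimal 10-byte gzip header with no optional fields and an 8-byte trailer, and it re-decodes from scratch after each growth, which is acceptable for a one-shot asset load):

```swift
import Compression
import Foundation

// Sketch: decode the raw DEFLATE stream inside a gzip payload, doubling the
// output buffer until the decoded result no longer fills it completely.
func gunzipGrowing(_ data: Data) throws -> Data {
    guard data.count > 18 else { throw CocoaError(.fileReadCorruptFile) }
    let deflated = data.dropFirst(10).dropLast(8)   // strip gzip header/trailer
    var capacity = max(64 * 1024, deflated.count * 4)
    while true {
        var dst = Data(count: capacity)
        let written = dst.withUnsafeMutableBytes { (dstPtr: UnsafeMutableRawBufferPointer) -> Int in
            deflated.withUnsafeBytes { (srcPtr: UnsafeRawBufferPointer) -> Int in
                compression_decode_buffer(
                    dstPtr.bindMemory(to: UInt8.self).baseAddress!, capacity,
                    srcPtr.bindMemory(to: UInt8.self).baseAddress!, deflated.count,
                    nil, COMPRESSION_ZLIB)
            }
        }
        if written == 0 { throw CocoaError(.fileReadCorruptFile) }
        if written < capacity { return dst.prefix(written) }  // fully decoded
        capacity *= 2                                         // buffer was exhausted: retry larger
    }
}
```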

Comment thread on create-distribution.sh (line 110):

+echo "Start with webui: afm -w"
 echo ""
-echo "Note: Requires macOS 15.1+ and Apple Intelligence enabled"
+echo "Note: Requires macOS 26+ and Apple Intelligence enabled"

issue (typo): The minimum macOS version in the installer message appears to be a typo.

The script now prints "Requires macOS 26+", which looks accidental and may mislead users or packagers about supported versions. If the minimum hasn't actually changed, this should remain "macOS 15.1+" (or whatever the correct minimum is) to reflect the real requirement.

Suggested change:

-echo "Note: Requires macOS 26+ and Apple Intelligence enabled"
+echo "Note: Requires macOS 15.1+ and Apple Intelligence enabled"

scouzi1966 added 1 commit

- Inject JavaScript to rebrand webui ("Apple Foundation Models" instead of "llama.cpp")
- Change subtitle to "Type a message to get started" (removes upload reference)
- Update Makefile to document pinned llama.cpp commit
- Add submodule-status target to show pinned versions
- Remove --recursive flag (not needed for webui-only sparse checkout)

The llama.cpp submodule is pinned to commit 0e4ebeb05 for reproducible builds.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@scouzi1966 scouzi1966 merged commit b9d38ad into main Jan 23, 2026
1 check passed