
This also applies to Linux aarch64 #167

Merged
ericcurtin merged 1 commit into main from aarch64
Sep 24, 2025

@ericcurtin
Contributor

@ericcurtin ericcurtin commented Sep 24, 2025

It's a decent heuristic in general for aarch64.

Summary by Sourcery

Enhancements:

  • Extend ARM64-specific thread count tuning in the LlamaCPP backend to Linux aarch64

Copilot AI review requested due to automatic review settings September 24, 2025 11:36
@sourcery-ai
Contributor

sourcery-ai bot commented Sep 24, 2025

Reviewer's Guide

The PR updates the default LlamaCpp configuration to extend the ARM64 thread-count heuristic beyond Windows ARM64 to also cover Linux aarch64 by adjusting the platform check in NewDefaultLlamaCppConfig.

File-Level Changes

Change: Broaden ARM64 platform detection for thread-count heuristic
Details:
  • Updated comment to reflect ARM64 support on Linux (aarch64)
  • Modified conditional from checking Windows ARM64 to checking generic ARM64 (aarch64)
  • Retained existing logic to cap threads at half of core count
  • Removed the GOOS check for Windows
Files: pkg/inference/backends/llamacpp/llamacpp_config.go
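
As a rough illustration of the heuristic described in this guide, the Go sketch below caps the thread count at half the core count on any ARM64 target, regardless of OS. The helper name `defaultThreadCount`, its parameters, and the floor of one thread are assumptions made for this example; they are not taken from `llamacpp_config.go`, and the real `NewDefaultLlamaCppConfig` may build its configuration differently.

```go
package main

import (
	"fmt"
	"runtime"
)

// defaultThreadCount is a hypothetical helper, not the project's actual code.
// It returns a thread count for the given GOARCH value and core count.
func defaultThreadCount(goarch string, numCPU int) int {
	// Go reports 64-bit ARM as "arm64" on every OS, so no GOOS check is
	// needed once the heuristic is meant to be OS-independent.
	if goarch == "arm64" {
		half := numCPU / 2
		// Assumption for this sketch: never go below one thread.
		if half < 1 {
			return 1
		}
		return half
	}
	return numCPU
}

func main() {
	fmt.Println(defaultThreadCount(runtime.GOARCH, runtime.NumCPU()))
}
```

Passing the architecture and core count as parameters, rather than reading `runtime.GOARCH` inside the function, keeps the sketch testable on any host.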

@gemini-code-assist
Contributor

Summary of Changes

Hello @ericcurtin, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request aims to broaden the applicability of a performance heuristic within the llama.cpp backend configuration. The change attempts to extend a specific thread count adjustment, previously limited to Windows ARM64, to a wider range of ARM64 (aarch64) systems, including Linux. This is intended to improve performance on these architectures by optimizing thread usage, though the current implementation of the conditional check contains a logical flaw that would prevent its execution.

Highlights

  • Configuration Logic Update: The special case logic for llama.cpp configuration, which adjusts thread count heuristics, has been modified. Previously, this logic was applied specifically to Windows ARM64 systems. The change intends to extend this heuristic to aarch64 architectures, including Linux aarch64. However, the updated conditional check runtime.GOARCH == "aarch64" && runtime.GOARCH == "arm64" contains a logical error, as runtime.GOARCH cannot simultaneously be both "aarch64" and "arm64", which will prevent this special case from ever being applied.

Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `pkg/inference/backends/llamacpp/llamacpp_config.go:23-24` </location>
<code_context>

-	// Special case for Windows ARM64
-	if runtime.GOOS == "windows" && runtime.GOARCH == "arm64" {
+	// Special case for ARM64 (aarch64 on Linux)
+	if runtime.GOARCH == "aarch64" && runtime.GOARCH == "arm64" {
 		// Using a thread count equal to core count results in bad performance, and there seems to be little to no gain
 		// in going beyond core_count/2.
</code_context>

<issue_to_address>
**issue (bug_risk):** The conditional checks for both 'aarch64' and 'arm64' simultaneously, which is always false.

Update the condition to check for the correct architecture value, as only one can be true at a time.
</issue_to_address>
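
To make the issue concrete, here is a small Go sketch of why the first-revision condition can never hold. The function names are hypothetical, and the corrected check is an assumption based on the discussion in this thread, not a verbatim copy of the merged code.

```go
package main

import "fmt"

// firstRevisionCheck reproduces the logic of the condition quoted above.
// The two comparisons are split into named values to make the problem visible.
func firstRevisionCheck(goarch string) bool {
	isAarch64 := goarch == "aarch64"
	isArm64 := goarch == "arm64"
	// A string cannot equal two different literals at once,
	// so this conjunction is false for every input.
	return isAarch64 && isArm64
}

// arm64Check is one possible correction (assumed for this sketch):
// compare against the single value Go uses for 64-bit ARM.
func arm64Check(goarch string) bool {
	return goarch == "arm64"
}

func main() {
	for _, arch := range []string{"arm64", "aarch64", "amd64"} {
		fmt.Printf("%-8s first=%v corrected=%v\n",
			arch, firstRevisionCheck(arch), arm64Check(arch))
	}
}
```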

Contributor

Copilot AI left a comment

Pull Request Overview

This PR extends ARM64 performance optimization from Windows-only to all ARM64 platforms by updating the architecture check condition.

  • Modifies the condition to apply ARM64 thread optimization to both Linux aarch64 and other ARM64 platforms
  • Updates the comment to reflect the broader platform support

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request aims to generalize a performance heuristic for ARM64 architectures. However, the conditional logic introduced contains a bug where it uses a logical AND (&&) instead of a logical OR (||), which will cause the condition to always be false. I've provided a suggestion to fix this. It would also be beneficial to update the corresponding unit tests to cover this expanded logic for all ARM64 platforms.

@ericcurtin
Contributor Author

Tested on an Ampere system:

containers/ramalama#934

I'm 90% sure GOARCH arm64 is aarch64 on Linux. Testing...

Contributor

@doringeman doringeman left a comment

I see it's arm64.

$ docker run --rm -it docker.io/library/golang:1.24.7 sh -c 'arch && go env GOARCH'
aarch64
arm64

@ericcurtin
Contributor Author

ericcurtin commented Sep 24, 2025

> I see it's arm64.
>
> $ docker run --rm -it docker.io/library/golang:1.24.7 sh -c 'arch && go env GOARCH'
> aarch64
> arm64

Cool, I can delete the aarch64 check.

Copilot AI review requested due to automatic review settings September 24, 2025 11:46
Contributor

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.


@ericcurtin
Contributor Author

Should be good @doringeman, I removed the aarch64 check.

It's a decent heuristic in general for aarch64.

Signed-off-by: Eric Curtin <eric.curtin@docker.com>
doringeman approved these changes Sep 24, 2025
@ericcurtin ericcurtin merged commit e121161 into main Sep 24, 2025
5 checks passed
@ericcurtin ericcurtin deleted the aarch64 branch September 24, 2025 11:54