@hzxuzhonghu
Member

  • Please check if the PR fulfills these requirements
  • The commit message follows our guidelines
  • What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)

/kind documentation

Signed-off-by: Zhonghu Xu <xuzhonghu@huawei.com>
@volcano-sh-bot volcano-sh-bot added the kind/documentation Categorizes issue or PR as related to documentation. label Jan 6, 2026
@gemini-code-assist

Summary of Changes

Hello @hzxuzhonghu, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new English blog post that formally announces and describes Kthena, a new sub-project under Volcano. Kthena is presented as a Kubernetes-native, high-performance system designed to optimize Large Language Model (LLM) inference by providing advanced routing, orchestration, and scheduling capabilities. The post elaborates on how Kthena tackles common challenges in deploying LLMs at scale, aiming to improve GPU/NPU utilization, reduce latency, and simplify multi-model management within cloud-native environments.

Highlights

  • New Blog Post: A new English blog post has been added, introducing Kthena, a new sub-project of Volcano focused on cloud-native LLM inference.
  • Kthena's Purpose: The blog post details how Kthena addresses critical challenges in LLM serving on Kubernetes, such as low resource utilization, latency vs. throughput trade-offs, complex multi-model management, and the need for native K8s integration.
  • Core Features: Kthena's key capabilities are outlined, including production-grade inference orchestration (ModelServing), out-of-the-box deployment (ModelBooster), intelligent model-aware routing, cost-driven autoscaling, broad hardware and engine support, and built-in flow control and fairness.
  • Performance Gains: The post highlights significant performance improvements, such as a ~2.73x increase in throughput and a ~73.5% reduction in Time To First Token (TTFT) when using Kthena's KV Cache Awareness + Least Request strategy for long system prompts.
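A quick way to read the two figures quoted above: a ~73.5% reduction in TTFT leaves roughly 26.5% of the baseline latency, which is about a 3.77x speedup on that metric. The sketch below only does this arithmetic. The 1000 ms baseline is a hypothetical value chosen for illustration and is not taken from the blog post, and the throughput figure is assumed to mean "2.73 times the baseline".

```python
def remaining_fraction(reduction_pct: float) -> float:
    """Fraction of the baseline latency left after a percentage reduction."""
    return 1.0 - reduction_pct / 100.0


def speedup_from_reduction(reduction_pct: float) -> float:
    """Equivalent speedup factor for a given percentage latency reduction."""
    return 1.0 / remaining_fraction(reduction_pct)


# Figures quoted in the PR summary.
ttft_reduction_pct = 73.5
throughput_factor = 2.73  # assumed to mean 2.73 times the baseline throughput

# Hypothetical baseline, for illustration only (not from the blog post).
baseline_ttft_ms = 1000.0

new_ttft_ms = baseline_ttft_ms * remaining_fraction(ttft_reduction_pct)
ttft_speedup = speedup_from_reduction(ttft_reduction_pct)

print(f"TTFT: {baseline_ttft_ms:.0f} ms -> {new_ttft_ms:.0f} ms "
      f"(about {ttft_speedup:.2f}x faster)")
print(f"Throughput: {throughput_factor:.2f}x the baseline")
```

The TTFT reduction and the throughput factor are separate measurements, so one should not be derived from the other.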

@volcano-sh-bot volcano-sh-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 6, 2026

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds a new blog post introducing Kthena. The content looks informative. However, I've found a critical issue where the entire blog post content is duplicated within the same file, which needs to be resolved. I've also left a medium-severity comment regarding an inconsistency between the filename and the post's title, which could be improved for better maintainability and cleaner URLs.

Signed-off-by: Zhonghu Xu <xuzhonghu@huawei.com>
@volcano-sh-bot volcano-sh-bot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jan 7, 2026
Signed-off-by: Zhonghu Xu <xuzhonghu@huawei.com>
@hzxuzhonghu
Member Author

@JesseStutler

@JesseStutler
Member

/approve
/lgtm
Thanks!

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Jan 12, 2026
@volcano-sh-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JesseStutler

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 12, 2026
@volcano-sh-bot volcano-sh-bot merged commit be7dac3 into volcano-sh:master Jan 12, 2026
7 checks passed

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • kind/documentation: Categorizes issue or PR as related to documentation.
  • lgtm: Indicates that a PR is ready to be merged.
  • size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.

3 participants