Add kthena blog english version #438

hzxuzhonghu · 2026-01-06T12:46:07Z

Please check if the PR fulfills these requirements

The commit message follows our guidelines

What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)

/kind documentation

What this PR does / why we need it:
Follow-up to "Introduce kthena" #436.
Which issue(s) this PR fixes:

Signed-off-by: Zhonghu Xu <xuzhonghu@huawei.com>

gemini-code-assist · 2026-01-06T12:46:20Z

Summary of Changes

Hello @hzxuzhonghu, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new English blog post that formally announces and describes Kthena, a new sub-project under Volcano. Kthena is presented as a Kubernetes-native, high-performance system designed to optimize Large Language Model (LLM) inference by providing advanced routing, orchestration, and scheduling capabilities. The post elaborates on how Kthena tackles common challenges in deploying LLMs at scale, aiming to improve GPU/NPU utilization, reduce latency, and simplify multi-model management within cloud-native environments.

Highlights

New Blog Post: A new English blog post has been added, introducing Kthena, a new sub-project of Volcano focused on cloud-native LLM inference.
Kthena's Purpose: The blog post details how Kthena addresses critical challenges in LLM serving on Kubernetes, such as low resource utilization, latency vs. throughput trade-offs, complex multi-model management, and the need for native K8s integration.
Core Features: Kthena's key capabilities are outlined, including production-grade inference orchestration (ModelServing), out-of-the-box deployment (ModelBooster), intelligent model-aware routing, cost-driven autoscaling, broad hardware and engine support, and built-in flow control and fairness.
Performance Gains: The post highlights significant performance improvements, such as a ~2.73x increase in throughput and a ~73.5% reduction in Time To First Token (TTFT) when using Kthena's KV Cache Awareness + Least Request strategy for long system prompts.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	`@gemini-code-assist`	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request adds a new blog post introducing Kthena. The content looks informative. However, I've found a critical issue where the entire blog post content is duplicated within the same file, which needs to be resolved. I've also left a medium-severity comment regarding an inconsistency between the filename and the post's title, which could be improved for better maintainability and cleaner URLs.
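The duplication problem described above is easy to check for mechanically. The following is a minimal sketch, not something taken from this PR: the function name `find_repeated_body` and the `min_lines` threshold are hypothetical, and it only catches the case where the tail of the file repeats the head line for line.

```python
from typing import Optional


def find_repeated_body(text: str, min_lines: int = 5) -> Optional[int]:
    """Return the 0-based line index where the text starts repeating itself.

    Returns None when no repeat of at least `min_lines` lines is found.
    Hypothetical helper for illustration only.
    """
    # Ignore trailing whitespace so editor differences do not hide a repeat.
    lines = [ln.rstrip() for ln in text.splitlines()]
    n = len(lines)
    # Try each candidate start k for the second copy: the file repeats from k
    # if everything from line k onward equals the same number of lines taken
    # from the top of the file.
    for k in range(min_lines, n):
        if n - k < min_lines:
            break  # remaining overlap is too short to count as a duplicate
        if lines[k:] == lines[: n - k]:
            return k
    return None


if __name__ == "__main__":
    post = "\n".join(f"line {i}" for i in range(10))
    print(find_repeated_body(post + "\n" + post))
    print(find_repeated_body(post))
```

Running a check like this against the Markdown file before pushing would flag an accidental double paste; a partial or edited duplicate would need a fuzzier comparison.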

content/en/blog/Introducing Kthena_ Redefining LLM Inference for the Cloud-Native Era.md
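The review does not say which filename it would prefer. As an illustration only, one common convention is a lowercase, hyphen-separated slug derived from the title. The `slugify` helper below is a hypothetical sketch of that convention, not the Volcano website's actual rule.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a post title into a lowercase, hyphen-separated, URL-friendly slug."""
    # Drop accents and any other non-ASCII characters.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of characters other than a-z and 0-9 into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower())
    return slug.strip("-")


if __name__ == "__main__":
    name = "Introducing Kthena_ Redefining LLM Inference for the Cloud-Native Era"
    print(slugify(name) + ".md")
```

A slug of this kind avoids the spaces and underscore in the current filename, which otherwise end up percent-encoded in URLs.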

Signed-off-by: Zhonghu Xu <xuzhonghu@huawei.com>

hzxuzhonghu · 2026-01-12T06:54:36Z

@JesseStutler

JesseStutler · 2026-01-12T07:01:41Z

/approve
/lgtm
Thanks!

volcano-sh-bot · 2026-01-12T07:01:50Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JesseStutler

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [JesseStutler]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Add kthena blog english version

4cddafe

Signed-off-by: Zhonghu Xu <xuzhonghu@huawei.com>

volcano-sh-bot added the kind/documentation Categorizes issue or PR as related to documentation. label Jan 6, 2026

volcano-sh-bot requested review from JesseStutler and william-wang January 6, 2026 12:46

volcano-sh-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 6, 2026

gemini-code-assist bot reviewed Jan 6, 2026

content/en/blog/Introducing Kthena_ Redefining LLM Inference for the Cloud-Native Era.md (outdated, resolved)

content/en/blog/Introducing Kthena_ Redefining LLM Inference for the Cloud-Native Era.md (outdated, resolved)

update

a309572

Signed-off-by: Zhonghu Xu <xuzhonghu@huawei.com>

Add kthena release blog

2c3a303

Signed-off-by: Zhonghu Xu <xuzhonghu@huawei.com>

hzxuzhonghu force-pushed the update-kethena branch from f0671b1 to 2c3a303 Compare January 7, 2026 02:47

volcano-sh-bot assigned JesseStutler Jan 12, 2026

volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Jan 12, 2026

volcano-sh-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 12, 2026

volcano-sh-bot merged commit be7dac3 into volcano-sh:master Jan 12, 2026
7 checks passed
