fix GQA by XZman · Pull Request #5 · platformxlab/NeuSim

XZman · 2026-03-23T02:35:39Z

No description provided.

Copilot

Pull request overview

This PR aims to fix/enable GQA (grouped-query attention) modeling by threading a num_kv_heads parameter through the LLM attention op generators, updating attention block tensor shapes to reflect Q-heads vs KV-heads, and relaxing FlashAttention shape assumptions to permit Q_heads != KV_heads.

Changes:

- Add `num_kv_heads` plumbing across attention entrypoints (`create_multi_head_attention*`) and propagate it from `LLMOpsGenerator`.
- Update normal/self-attention (fwd/bwd) and flash-attention blocks to use KV-head shapes where applicable.
- Relax FlashAttention shape assertions to only require `K_heads == V_heads` (allowing GQA/MQA).
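
To make the Q-head versus KV-head distinction concrete, here is a minimal sketch of the tensor shapes a GQA attention block works with. It is not taken from NeuSim: the function name `gqa_attention_shapes`, the `AttentionShapes` container, and the `(batch, seq, heads, head_dim)` layout are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionShapes:
    """Hypothetical container for the shapes of one attention block."""
    q: tuple
    k: tuple
    v: tuple
    scores: tuple
    out: tuple


def gqa_attention_shapes(batch, q_len, kv_len, num_heads, num_kv_heads, head_dim):
    """Return illustrative tensor shapes for grouped-query attention.

    Assumed layout: (batch, seq, heads, head_dim). With num_kv_heads equal to
    num_heads this reduces to ordinary multi-head attention; with
    num_kv_heads == 1 it is multi-query attention.
    """
    if num_heads % num_kv_heads != 0:
        raise ValueError("num_heads must be a multiple of num_kv_heads")
    group_size = num_heads // num_kv_heads  # Q heads sharing one KV head

    # Q is viewed as (kv_head, group) so each group lines up with its KV head.
    q = (batch, q_len, num_kv_heads, group_size, head_dim)
    # K and V carry only num_kv_heads heads, which is the point of GQA.
    k = (batch, kv_len, num_kv_heads, head_dim)
    v = (batch, kv_len, num_kv_heads, head_dim)
    # One score matrix per Q head, indexed here as (kv_head, group).
    scores = (batch, num_kv_heads, group_size, q_len, kv_len)
    # The output is folded back to the full Q-head count.
    out = (batch, q_len, num_heads, head_dim)
    return AttentionShapes(q=q, k=k, v=v, scores=scores, out=out)
```

For example, `gqa_attention_shapes(2, 16, 32, 8, 2, 64)` gives K and V two heads each while the score tensor still covers all eight Q heads (two KV heads times a group size of four).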

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `neusim/npusim/frontend/llm_ops_lib.py` | Adds `num_kv_heads` throughout attention construction and introduces GQA-shaped einsums/softmax for attention blocks. |
| `neusim/npusim/frontend/llm_ops_generator.py` | Passes `self.num_kv_heads` into attention creation for prefill/decode and fwd/bwd generation paths. |
| `neusim/npusim/backend/npusim_lib.py` | Relaxes FlashAttention head-count assertion to allow Q-heads to differ from KV-heads. |
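
The relaxed FlashAttention check can be sketched as follows. This is an illustration only, not the code in `npusim_lib.py`: the function name `check_flash_attention_heads` and the assumption that the head count sits at index 2 of each shape are hypothetical.

```python
def check_flash_attention_heads(q_shape, k_shape, v_shape):
    """Validate head counts for a FlashAttention-style op (illustrative).

    Assumed layout: (batch, seq, heads, head_dim), so heads is index 2.
    A stricter check that also forced Q heads to equal K heads would
    reject GQA and MQA inputs.
    """
    q_heads, k_heads, v_heads = q_shape[2], k_shape[2], v_shape[2]
    # Relaxed form: only K and V must agree; Q may have more heads.
    assert k_heads == v_heads, (
        f"K and V head counts differ: {k_heads} vs {v_heads}"
    )
    return q_heads, k_heads
```

Under this check a GQA call with 8 Q heads and 2 KV heads passes, as does an MQA call with a single KV head, while mismatched K and V head counts still fail.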

fix GQA

182dcdb

XZman requested a review from Copilot March 23, 2026 02:35

Copilot started reviewing on behalf of XZman March 23, 2026 02:36

XZman merged commit 4cf9661 into main Mar 23, 2026
8 checks passed

XZman deleted the gqa_fix branch March 23, 2026 02:39

Copilot AI reviewed Mar 23, 2026

Comment thread neusim/npusim/frontend/llm_ops_lib.py

Comment thread neusim/npusim/frontend/llm_ops_lib.py

Comment thread neusim/npusim/frontend/llm_ops_lib.py

Comment thread neusim/npusim/frontend/llm_ops_lib.py

Comment thread neusim/npusim/backend/npusim_lib.py

XZman merged 1 commit into main from gqa_fix

2 participants
