Skip to content

[FIX] Inline ceil_log2 in gpu_2d_continuous_cumsum to fix MakePackedAPI error#18957

Merged
MasterJH5574 merged 1 commit intoapache:mainfrom
gnguralnick:fix/cumsum-ceil-log2-letstmt
Apr 2, 2026
Merged

[FIX] Inline ceil_log2 in gpu_2d_continuous_cumsum to fix MakePackedAPI error#18957
MasterJH5574 merged 1 commit intoapache:mainfrom
gnguralnick:fix/cumsum-ceil-log2-letstmt

Conversation

@gnguralnick
Copy link
Copy Markdown
Contributor

Summary

  • The intermediate variable ceil_log2 in gpu_2d_continuous_cumsum created a LetStmt-bound Var in the TIR function
  • When MakePackedAPI processed the function, it reported ceil_log2 as an undefined variable not passed as an API argument
  • Inline the expression directly into total_rounds to avoid the intermediate Var — the computation is identical

Test plan

  • Compile a model that uses GPU sampling (e.g. any LLM with top-p sampling on Metal) and verify compilation succeeds
  • The error this fixes: Check failed: undefined.size() == 0: In PrimFunc gpu_2d_continuous_cumsum variables [ceil_log2] are used, but are not passed in as API arguments

…PI error

The intermediate variable ceil_log2 created a LetStmt-bound Var in the
TIR function. When MakePackedAPI processed the function, it saw
ceil_log2 as an undefined variable not passed as an API argument:

  Check failed: undefined.size() == 0: In PrimFunc
  gpu_2d_continuous_cumsum variables [ceil_log2] are used, but are not
  passed in as API arguments

Inline the expression into total_rounds to avoid the intermediate Var.
The computation is identical.
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

@gnguralnick
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request simplifies the cumsum implementation in the GPU generic backend by inlining the ceil_log2 calculation directly into the total_rounds variable assignment. I have no feedback to provide.

@babusid
Copy link
Copy Markdown
Contributor

babusid commented Apr 2, 2026

Looks like a duplicate of #18905.

@gnguralnick
Copy link
Copy Markdown
Contributor Author

You're right. Sorry, didn't notice that one. Though I do also see the CI weirdness; let me know if I should close this or not.

@MasterJH5574
Copy link
Copy Markdown
Contributor

@akaashrp @gnguralnick you guys can decide which PR to merge. I guess #18905 may need a rebase before it can pass CI. But given this PR already passes CI, an option is that we can add @akaashrp as a co-author when merging this PR.

@akaashrp
Copy link
Copy Markdown
Contributor

akaashrp commented Apr 2, 2026

@MasterJH5574 let's merge this PR (I've also closed mine). If it's not too involved to add me as a co-author, we could do that. Otherwise, I'm fine just merging this directly.

Copy link
Copy Markdown
Contributor

@MasterJH5574 MasterJH5574 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the fix.

@MasterJH5574 MasterJH5574 merged commit a7bfc85 into apache:main Apr 2, 2026
10 checks passed
@MasterJH5574
Copy link
Copy Markdown
Contributor

@akaashrp No problem. Merged with you as a co-author. Again thanks for the fix. Sorry that it has been stuck for two weeks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants