Avoid OOM for large GEMM 32k & modify layernorm cutile #50

Merged: hannahli-nv merged 2 commits into main from tilegym_update on Feb 6, 2026

hannahli-nv (Collaborator) commented Feb 6, 2026

Description

Code updates: avoid OOM for large GEMM 32k and modify layernorm cutile.

This PR contains 2 new commits.

Commits included:

a3a32ef modify layernorm cutile
31db13c Avoid OOM for large GEMM 32k

CI Configuration

config:
  build: true
  # valid options are "ops" and "benchmark"
  test: ["ops", "benchmark"]

Checklist

  • Code formatted and imports sorted via repo specifications (./format.sh)
  • Documentation updated (if needed)
  • CI configuration reviewed

copy-pr-bot (Bot) commented Feb 6, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

hannahli-nv (Collaborator, Author) commented

/ok to test b6144d5

hannahli-nv requested a review from xjmxyt on February 6, 2026 04:13

xjmxyt (Collaborator) left a comment

LGTM

hannahli-nv (Collaborator, Author) commented

/ok to test a3a32ef

hannahli-nv merged commit cd69fe2 into main on Feb 6, 2026
18 checks passed
hannahli-nv deleted the tilegym_update branch on February 6, 2026 06:18
3 participants