Add LLM-driven tile search for CUTLASS SYCL kernels #22

Merged: sandlbn merged 15 commits into main from sandlbn/tile_search on May 14, 2026
@sandlbn (Contributor) commented May 12, 2026

Add tile configuration tuning for CUTLASS SYCL kernels (GEMM, Grouped GEMM, MoE GEMM, Flash Attention V2) using a propose-validate-benchmark loop driven by DSPy and an LLM.
Add validators that enforce Intel Xe DPAS constraints (atom shapes, subgroup limits, SLM capacity).
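
For readers unfamiliar with the pattern, a propose-validate-benchmark loop can be sketched roughly as below. This is a minimal illustration, not the code from this PR: `TileConfig`, `tune`, and the three `fake_*` callables are hypothetical names, the proposer is a fixed list standing in for the LLM, and the benchmark is a synthetic cost function rather than a real kernel run.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class TileConfig:
    """Hypothetical tile description: workgroup tile sizes along M, N, K."""
    tile_m: int
    tile_n: int
    tile_k: int


History = List[Tuple[TileConfig, float]]


def tune(
    propose: Callable[[History], Iterable[TileConfig]],
    validate: Callable[[TileConfig], bool],
    benchmark: Callable[[TileConfig], float],
    rounds: int = 3,
) -> Optional[TileConfig]:
    """Run propose -> validate -> benchmark and return the fastest valid config."""
    history: History = []  # (config, time_ms) pairs fed back to the proposer
    seen = set()           # configs that were already benchmarked
    for _ in range(rounds):
        for cfg in propose(history):
            if cfg in seen or not validate(cfg):
                continue  # skip repeats and configs that fail the static checks
            seen.add(cfg)
            history.append((cfg, benchmark(cfg)))
    if not history:
        return None
    return min(history, key=lambda item: item[1])[0]


# Stand-ins for the LLM proposer, the validator, and the benchmark (all assumptions).
def fake_propose(history: History) -> List[TileConfig]:
    return [TileConfig(256, 256, 32), TileConfig(128, 64, 32), TileConfig(100, 64, 32)]


def fake_validate(cfg: TileConfig) -> bool:
    return cfg.tile_m % 8 == 0 and cfg.tile_n % 16 == 0


def fake_benchmark(cfg: TileConfig) -> float:
    # Synthetic cost that pretends 256x256 tiles are the fastest.
    return abs(cfg.tile_m - 256) + abs(cfg.tile_n - 256) + 1.0


best = tune(fake_propose, fake_validate, fake_benchmark)
print(best)
```

Passing the three stages in as callables keeps the loop independent of how proposals are generated, so the same skeleton works whether the proposer is an LLM, a grid search, or a fixed list used in tests.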

@sandlbn sandlbn marked this pull request as ready for review May 12, 2026 17:35
@sandlbn sandlbn requested review from danielfleischer and gbenms May 12, 2026 17:35
@danielfleischer (Member)

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

@danielfleischer (Member) left a comment

LGTM, made some comments.

Comment thread README.md
Member

Why do we do these algebraic validations? It's not very AI.

Contributor Author

Because a number of times I saw GPT propose weird NVIDIA-style configurations that will fail on our XPU. So instead of wasting time on them, I'm discarding those.
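
To make the idea of cheap algebraic checks concrete, here is a rough sketch of a validator that rejects a proposal before any kernel is compiled or run. It is not the validator from this PR: `Proposal` and `check` are hypothetical names, and the constants (an 8x16x16 atom, 32 subgroups per workgroup, 128 KiB of SLM) are illustrative assumptions that would need to match the actual target GPU and data type.

```python
from __future__ import annotations

from dataclasses import dataclass

# Illustrative limits only (assumptions); real values depend on the Xe GPU and dtype.
ATOM_M, ATOM_N, ATOM_K = 8, 16, 16
MAX_SUBGROUPS = 32
SLM_BYTES = 128 * 1024


@dataclass(frozen=True)
class Proposal:
    """Hypothetical tile proposal as it might come back from the LLM."""
    tile_m: int
    tile_n: int
    tile_k: int
    sg_m: int           # subgroups along M
    sg_n: int           # subgroups along N
    slm_bytes: int = 0  # shared local memory the kernel would request


def check(p: Proposal) -> list[str]:
    """Return a list of human-readable violations; an empty list means valid."""
    if min(p.tile_m, p.tile_n, p.tile_k, p.sg_m, p.sg_n) <= 0:
        return ["tile sizes and subgroup counts must be positive"]
    errors = []
    if p.tile_k % ATOM_K:
        errors.append(f"tile_k={p.tile_k} is not a multiple of the atom K={ATOM_K}")
    if p.sg_m * p.sg_n > MAX_SUBGROUPS:
        errors.append(f"{p.sg_m * p.sg_n} subgroups exceed the limit of {MAX_SUBGROUPS}")
    if p.tile_m % (p.sg_m * ATOM_M):
        errors.append("tile_m does not split into whole atoms per subgroup along M")
    if p.tile_n % (p.sg_n * ATOM_N):
        errors.append("tile_n does not split into whole atoms per subgroup along N")
    if p.slm_bytes > SLM_BYTES:
        errors.append(f"slm_bytes={p.slm_bytes} exceeds capacity {SLM_BYTES}")
    return errors


good = Proposal(256, 256, 32, sg_m=8, sg_n=4)
bad = Proposal(128, 128, 8, sg_m=4, sg_n=16, slm_bytes=160 * 1024)

print(check(good))
for message in check(bad):
    print("rejected:", message)
```

Returning the violations as text rather than a bare boolean is a design choice worth considering here, since the messages could be fed back to the proposer so the next round avoids the same mistakes.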

Comment thread src/xe_forge/cli.py
logger = logging.getLogger(__name__)


def _setup_dspy(config: Config) -> None:
Member

Why not put it inside TileTuningAgent or that module?

Contributor Author

_setup_dspy is global configuration (env vars + dspy.configure()), not agent-specific logic, so I would keep it here. I wanted to have it in one place.

@sandlbn sandlbn merged commit 5ebdd68 into main May 14, 2026
2 checks passed
@sandlbn sandlbn deleted the sandlbn/tile_search branch May 14, 2026 22:26