
Fix Metal training: use f32 tensors in microgpt#1010

Merged
aaylward merged 2 commits into main from fix/microgpt-metal-f32
Feb 17, 2026

Conversation

@aaylward
Collaborator

Summary

  • Switch candle tensor creation from f64 to f32 to fix the "Metal contiguous index_select U32 F64 not implemented" error on Apple GPUs
  • Keep JSON weight serialization format as f64 for InferenceGpt interop
  • Bump microgpt CLI to 0.2.1
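
The dtype split described above can be sketched in plain Rust (the function names and types here are illustrative, not the actual microgpt code): weights round-trip through JSON as f64 for InferenceGpt compatibility, but are narrowed to f32 before any tensor is built for the Metal device.

```rust
// Illustrative sketch of the f64/f32 split (not the real microgpt API).
// JSON side stays f64; the device side is narrowed to f32 because
// Metal's index_select has no U32 x F64 kernel.
fn weights_for_device(json_weights: &[f64]) -> Vec<f32> {
    // Narrowing f64 -> f32 loses precision for values outside f32
    // range, but unblocks index_select on the Metal backend.
    json_weights.iter().map(|&w| w as f32).collect()
}

fn weights_for_json(device_weights: &[f32]) -> Vec<f64> {
    // Widening f32 -> f64 is exact, so the serialized format stays
    // compatible with readers that expect f64.
    device_weights.iter().map(|&w| w as f64).collect()
}

fn main() {
    let json = vec![0.5_f64, -1.25, 3.0];
    let dev = weights_for_device(&json);
    let back = weights_for_json(&dev);
    // Values exactly representable in f32 survive the round trip.
    assert_eq!(json, back);
    println!("{:?}", dev);
}
```

In candle terms, the equivalent change is constructing tensors with f32 data (or an f32 dtype) rather than f64 whenever the device is Metal; only the on-disk JSON keeps f64.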

Test plan

  • All 35 existing tests pass (cargo test -p microgpt)
  • Verify microgpt train --device metal runs without error

🤖 Generated with Claude Code

Metal's index_select kernel doesn't support the U32×F64 combination,
causing "Metal contiguous index_select U32 F64 not implemented" at
training time. Switch all candle tensor creation to f32 while keeping
the JSON weight format compatible (f64) for InferenceGpt interop.

Bump microgpt CLI to 0.2.1.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@aaylward aaylward enabled auto-merge (squash) February 17, 2026 03:39
Chat history was initialized empty, but training data always starts
with BOS. Without it, position embeddings are shifted and the model
produces garbage. Prepend BOS on init and after /clear.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
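
The BOS fix above can be sketched as follows; the struct, field, and token id are hypothetical stand-ins, assuming the history is a flat token buffer:

```rust
// Hypothetical sketch of the BOS fix (type names and token id are
// illustrative). Training data always starts with BOS, so the chat
// history must too, or position embeddings are shifted by one.
const BOS: u32 = 0; // assumed BOS token id

struct ChatHistory {
    tokens: Vec<u32>,
}

impl ChatHistory {
    fn new() -> Self {
        // Initialize with BOS instead of an empty buffer.
        Self { tokens: vec![BOS] }
    }

    fn clear(&mut self) {
        // `/clear` resets to the same BOS-prefixed state.
        self.tokens.clear();
        self.tokens.push(BOS);
    }
}

fn main() {
    let mut h = ChatHistory::new();
    h.tokens.extend([5, 9, 2]);
    h.clear();
    assert_eq!(h.tokens, vec![BOS]);
}
```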
@aaylward aaylward merged commit 692b904 into main Feb 17, 2026
8 checks passed
@aaylward aaylward deleted the fix/microgpt-metal-f32 branch February 17, 2026 03:58
