v0 roadmap: Metal runtime → comm benchmarks + kernels #1
We now have the initial gpucomm/core scaffold: a minimal SwiftPM Metal runtime (MetalContext + kernel loader), a CLI (gpucomm), and two first experiments (bandwidth + reduction).
Goal
Make this repo a low-level GPU compute runtime + benchmark lab for Apple Silicon that focuses on data movement, synchronization, and memory behavior (not abstractions).
Proposed next milestones
- Kernel suite v0.1
- Scan (prefix sum), matmul, and a proper memcpy-style bandwidth kernel
- Parameterized kernels (problem size, threadgroup size) + correctness checks
- Benchmark harness v0.2
- Standard runner: warmup, repetitions, percentile stats
- Output formats: human + `--json`
- Stable timing: GPU timestamps + optional CPU wall time
- CPU ↔ GPU transfer + residency v0.3
- Upload/download latency + throughput (shared vs private + blit strategies)
- Pinned/contiguous allocation experiments where applicable
- Synchronization + threadgroup experiments v0.4
- Barrier patterns, bank conflict probes, reduction/scan variants
- Threadgroup scaling sweeps and occupancy-ish heuristics
- CI + docs v0.5
- GitHub Actions on macOS: `swift build -c release`
- README: benchmark methodology + “how to interpret numbers”
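For the harness v0.2 runner (warmup, repetitions, percentile stats), the core loop could be as small as the sketch below. This is an illustrative sketch only; `BenchStats` and `runBench` are hypothetical names, not part of the existing gpucomm API, and the real runner would wrap GPU timestamps rather than host wall time.

```swift
import Foundation

/// Summary statistics for one benchmark run (all times in milliseconds).
struct BenchStats {
    let p50Ms: Double
    let p95Ms: Double
    let meanMs: Double
}

/// Runs `body` for `warmup` untimed iterations, then `reps` timed
/// repetitions, and reports p50/p95/mean over the timed samples.
func runBench(warmup: Int, reps: Int, _ body: () -> Void) -> BenchStats {
    // Warmup iterations are executed but never recorded.
    for _ in 0..<warmup { body() }

    var samplesMs: [Double] = []
    for _ in 0..<reps {
        let start = DispatchTime.now().uptimeNanoseconds
        body()
        let end = DispatchTime.now().uptimeNanoseconds
        samplesMs.append(Double(end - start) / 1_000_000.0)
    }

    // Nearest-rank percentile over the sorted samples.
    let sorted = samplesMs.sorted()
    func percentile(_ p: Double) -> Double {
        let idx = min(sorted.count - 1, Int(p * Double(sorted.count)))
        return sorted[idx]
    }
    return BenchStats(
        p50Ms: percentile(0.50),
        p95Ms: percentile(0.95),
        meanMs: samplesMs.reduce(0, +) / Double(samplesMs.count)
    )
}
```

In the real harness, `body` would be something like a command-buffer commit + wait, and the stats would feed both the human and `--json` output paths.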
Decisions needed
- Do we prioritize transfer benchmarks (CPU↔GPU) before adding more kernels?
- Should the CLI be strict subcommands only, or do we also support a config-driven batch runner (YAML/JSON)?
- What’s the canonical set of benchmark outputs we want to standardize across gpucomm repos?
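On the config-driven batch runner question: a minimal config could look like the sketch below (JSON shown for concreteness; every key here is hypothetical, not an existing gpucomm schema), which would let CI run a fixed benchmark matrix without shell loops.

```json
{
  "warmup": 200,
  "reps": 5,
  "format": "json",
  "benchmarks": [
    { "name": "bandwidth", "sizes": [1048576, 16777216] },
    { "name": "reduction", "sizes": [1048576], "threadgroup": 256 }
  ]
}
```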
Acceptance criteria for next PR
- Add 1 new benchmark end-to-end (kernel + runner + CLI) with correctness validation
- Add `--json` output for that benchmark
- CI stays green (`swift build -c release`)
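If the `--json` output is meant to become the canonical format across gpucomm repos, one candidate record shape is sketched below. Field names and the numeric values are placeholders for illustration, not an existing format.

```json
{
  "benchmark": "latency",
  "kind": "empty",
  "iters": 2000,
  "warmup": 200,
  "reps": 5,
  "unit": "us",
  "p50": 12.3,
  "p95": 15.1,
  "mean": 12.9
}
```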
Pinned comment by gpucomm-hq:
Added a latency benchmark for command submission / kernel launch.
- New: `gpucomm bench latency --kind empty|kernel --iters N --warmup N --reps N --format human|json|csv`
  - `empty`: empty command buffer commit+wait (host-side queue/sync overhead)
  - `kernel`: 1-thread noop compute kernel (end-to-end kernel launch latency)
- Repro:
  `./.build/release/gpucomm bench latency --kind empty --iters 2000 --warmup 200 --reps 5 --format json`
  `./.build/release/gpucomm bench latency --kind kernel --iters 2000 --warmup 200 --reps 5 --format json`
Next: add blit latency kind (small copy) and optionally break down timings (submit vs wait).
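For reference, the `empty` kind boils down to timing `commit()` + `waitUntilCompleted()` on empty command buffers. A standalone sketch of that measurement is below; this is not the repo's actual runner (no warmup or percentile handling), and it requires a Metal-capable device to run.

```swift
import Metal
import Foundation

// Measure host-side submission + completion overhead of an empty
// command buffer: no encoders, so no GPU work beyond scheduling.
guard let device = MTLCreateSystemDefaultDevice(),
      let queue = device.makeCommandQueue() else {
    fatalError("no Metal device available")
}

let iters = 2000
let start = DispatchTime.now().uptimeNanoseconds
for _ in 0..<iters {
    guard let cb = queue.makeCommandBuffer() else { continue }
    cb.commit()
    cb.waitUntilCompleted()
}
let end = DispatchTime.now().uptimeNanoseconds

let usPerIter = Double(end - start) / 1_000.0 / Double(iters)
print("empty commit+wait: \(usPerIter) µs/iter")
```

The `kernel` kind would additionally encode a 1-thread noop dispatch inside the loop, so the delta between the two isolates the compute-encoder and kernel-launch cost.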