v0 roadmap: metal runtime → comm benchmarks + kernels #1

@bniladridas

We now have the initial gpucomm/core scaffold: a minimal SwiftPM Metal runtime (MetalContext + kernel loader), a CLI (gpucomm), and two first experiments (bandwidth + reduction).

Goal

Make this repo a low-level GPU compute runtime + benchmark lab for Apple Silicon that focuses on data movement, synchronization, and memory behavior (not abstractions).

Proposed next milestones

  1. Kernel suite v0.1
    • Scan (prefix sum), matmul, and a proper memcpy-style bandwidth kernel
    • Parameterized kernels (problem size, threadgroup size) + correctness checks
  2. Benchmark harness v0.2
    • Standard runner: warmup, repetitions, percentile stats
    • Output formats: human-readable (default) + --json
    • Stable timing: GPU timestamps + optional CPU wall time
  3. CPU ↔ GPU transfer + residency v0.3
    • Upload/download latency + throughput (shared vs private storage modes, plus blit strategies)
    • Pinned/contiguous allocation experiments where applicable
  4. Synchronization + threadgroup experiments v0.4
    • Barrier patterns, bank conflict probes, reduction/scan variants
    • Threadgroup scaling sweeps and occupancy-ish heuristics
  5. CI + docs v0.5
    • GitHub Actions on macOS: swift build -c release
    • README: benchmark methodology + “how to interpret numbers”
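To make milestone 2 concrete, here is a minimal sketch of what the standard runner could look like: untimed warmup, timed repetitions, nearest-rank percentiles, and a JSON dump. The names `BenchStats`, `runBenchmark`, `percentile`, and `toJSON` are hypothetical and not part of the current scaffold, and this version only measures CPU wall time.

```swift
import Foundation

/// Summary statistics for one benchmark, in milliseconds.
/// Hypothetical type: the field names are placeholders, not an agreed schema.
struct BenchStats: Codable {
    let name: String
    let samples: Int
    let minMs: Double
    let p50Ms: Double
    let p95Ms: Double
    let maxMs: Double
}

/// Nearest-rank percentile over an ascending-sorted, non-empty array.
func percentile(_ sorted: [Double], _ p: Double) -> Double {
    precondition(!sorted.isEmpty, "need at least one sample")
    let rank = Int((p / 100.0 * Double(sorted.count)).rounded(.up))
    return sorted[min(max(rank, 1), sorted.count) - 1]
}

/// Runs `body` `warmup` times untimed, then `reps` times timed (CPU wall clock).
func runBenchmark(name: String, warmup: Int, reps: Int, body: () -> Void) -> BenchStats {
    precondition(reps > 0, "need at least one timed repetition")
    for _ in 0..<warmup { body() }

    var ms: [Double] = []
    ms.reserveCapacity(reps)
    for _ in 0..<reps {
        let start = DispatchTime.now().uptimeNanoseconds
        body()
        let end = DispatchTime.now().uptimeNanoseconds
        ms.append(Double(end - start) / 1_000_000.0)
    }
    ms.sort()

    return BenchStats(name: name, samples: reps,
                      minMs: ms[0],
                      p50Ms: percentile(ms, 50),
                      p95Ms: percentile(ms, 95),
                      maxMs: ms[ms.count - 1])
}

/// Encodes the stats as compact JSON with sorted keys, so output diffs stay stable.
func toJSON(_ stats: BenchStats) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys]
    return String(decoding: try encoder.encode(stats), as: UTF8.self)
}
```

In the real harness the `body` closure would presumably commit a command buffer and wait for completion, and the GPU-side duration could be taken from the command buffer's `gpuStartTime` / `gpuEndTime` instead of the wall clock. That part is left out here so the sketch stays independent of Metal.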

Decisions needed

  • Do we prioritize transfer benchmarks (CPU↔GPU) before adding more kernels?
  • Should the CLI be strict subcommands only, or do we also support a config-driven batch runner (YAML/JSON)?
  • What’s the canonical set of benchmark outputs we want to standardize across gpucomm repos?
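For the batch-runner question, a rough sketch of what a JSON config could decode into, assuming we go with `Codable`. Every name here (`BenchmarkJob`, `BatchConfig`, the field names, the sample values) is hypothetical and only meant to give the discussion something to point at.

```swift
import Foundation

/// One entry in a hypothetical batch config. None of these field names come
/// from the existing gpucomm CLI; they are placeholders for discussion.
struct BenchmarkJob: Codable, Equatable {
    let benchmark: String          // e.g. "bandwidth" or "reduction"
    let elementCounts: [Int]       // problem sizes to sweep
    let threadgroupSizes: [Int]    // threadgroup sizes to sweep
    let warmup: Int
    let repetitions: Int
}

struct BatchConfig: Codable, Equatable {
    let jobs: [BenchmarkJob]
}

/// Decodes a batch config from a JSON string.
func loadBatchConfig(from json: String) throws -> BatchConfig {
    try JSONDecoder().decode(BatchConfig.self, from: Data(json.utf8))
}

/// Expands one job into every (problem size, threadgroup size) pair to run.
func sweep(_ job: BenchmarkJob) -> [(n: Int, tg: Int)] {
    job.elementCounts.flatMap { n in
        job.threadgroupSizes.map { tg in (n: n, tg: tg) }
    }
}

// Illustrative config only; the values are made up.
let sampleConfig = """
{
  "jobs": [
    { "benchmark": "reduction",
      "elementCounts": [1024, 1048576],
      "threadgroupSizes": [64, 256],
      "warmup": 3,
      "repetitions": 20 }
  ]
}
"""
```

A strict-subcommand CLI and a batch file are not mutually exclusive: each subcommand could build a single `BenchmarkJob` from its flags and hand it to the same runner. YAML would need a third-party parser, while JSON works with Foundation alone.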

Acceptance criteria for next PR

  • Add 1 new benchmark end-to-end (kernel + runner + CLI) with correctness validation
  • Add --json output for that benchmark
  • CI stays green (swift build -c release)
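For the "correctness validation" criterion, one possible shape is a CPU reference plus a tolerance-based comparison against the buffer read back from the GPU. The sketch below assumes the new benchmark is an inclusive scan over `Float`. The function names and default tolerances are illustrative, not something already in the repo.

```swift
import Foundation

/// CPU reference: inclusive prefix sum, accumulated in Double to limit
/// rounding drift before converting each partial sum back to Float.
func referenceInclusiveScan(_ input: [Float]) -> [Float] {
    var running = 0.0
    return input.map { x in
        running += Double(x)
        return Float(running)
    }
}

/// Returns the index of the first element that differs beyond tolerance,
/// or nil if the arrays match. A length mismatch is reported at the
/// shorter array's count. Tolerance defaults are assumptions, not tuned values.
func firstMismatch(expected: [Float], actual: [Float],
                   relTol: Float = 1e-5, absTol: Float = 1e-6) -> Int? {
    guard expected.count == actual.count else {
        return min(expected.count, actual.count)
    }
    for i in expected.indices {
        let diff = abs(expected[i] - actual[i])
        let bound = absTol + relTol * abs(expected[i])
        // Written as !(diff <= bound) so a NaN in either array counts as a mismatch.
        if !(diff <= bound) { return i }
    }
    return nil
}
```

The benchmark would then fail fast (non-zero exit, no timing output) whenever `firstMismatch` returns an index, so a fast but wrong kernel never produces numbers. GPU floating-point sums can legitimately differ from a sequential CPU sum because of reordering, which is why this compares with a tolerance rather than exact equality.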
Pinned by gpucomm-hq
