Foadsf/opencl-async-multidevice

Async Multi-Device Vector Add (OpenCL, single host thread)

Fork of https://gist.github.com/Foadsf/628a046040c302f507c81fd0568d8b34

This minimal example demonstrates how to distribute a single vector-add job across multiple OpenCL devices (dGPU, iGPU, CPU) without host-side multithreading.
It uses one context per platform and one command queue per device, launches work asynchronously, and gates all kernel starts on a single user event so that the devices begin together. Per-device timings are collected from profiling events, and the overall job time is the longest per-device queue span.
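
The timing rule in the paragraph above can be sketched in a few lines of C++. This is a minimal illustration, not the repository's code: the `DeviceSpan` struct and both function names are hypothetical, and the timestamps are assumed to be the nanosecond values returned by `clGetEventProfilingInfo` for `CL_PROFILING_COMMAND_START` and `CL_PROFILING_COMMAND_END`. Each span is computed within a single queue, since timestamps from different devices are not guaranteed to share a time base.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-device record (not taken from the repository).
// Both values are device timestamps in nanoseconds.
struct DeviceSpan {
    std::uint64_t first_start_ns;  // start of the first command on this queue
    std::uint64_t last_end_ns;     // end of the last command on this queue
};

// Span of one queue in milliseconds.
double span_ms(const DeviceSpan& d) {
    assert(d.last_end_ns >= d.first_start_ns);
    return static_cast<double>(d.last_end_ns - d.first_start_ns) / 1.0e6;
}

// Overall job time: the slowest device decides when the whole job is done.
double job_time_ms(const std::vector<DeviceSpan>& devices) {
    double worst = 0.0;
    for (const DeviceSpan& d : devices) {
        worst = std::max(worst, span_ms(d));
    }
    return worst;
}
```

With two devices whose spans are 2 ms and 5 ms, `job_time_ms` returns 5 ms. If the devices overlap well, this is much less than the sum of the per-device spans.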


Build & Run (Windows / VS2019 Build Tools)

build.bat

This configures with CMake, builds async_multidevice.exe, runs it, and leaves:

  • build\Release\timings.csv – per-N timings (Total + each device)
  • build\Release\plot.gp – gnuplot script
  • build\Release\vector_add_kernel.cl

To plot:

cd build\Release
"C:\Program Files\gnuplot\bin\gnuplot.exe" ..\..\plot.gp

What this repo shows

  • Asynchronous multi-device execution with no OpenMP/threads
  • Work splitting proportional to compute_units × max_clock
  • Event profiling to verify overlap and compute wall-time
  • Log-log visualization of scale vs. latency
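
The proportional split listed above could look roughly like the following. It is a sketch under stated assumptions rather than the repository's implementation: `DeviceInfo` and `split_work` are hypothetical names, the two fields are assumed to come from `clGetDeviceInfo` (`CL_DEVICE_MAX_COMPUTE_UNITS` and `CL_DEVICE_MAX_CLOCK_FREQUENCY`), and handing the rounding remainder to the last device is an arbitrary choice made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical device descriptor (not taken from the repository).
struct DeviceInfo {
    unsigned compute_units;  // assumed: CL_DEVICE_MAX_COMPUTE_UNITS
    unsigned max_clock_mhz;  // assumed: CL_DEVICE_MAX_CLOCK_FREQUENCY
};

// Returns one chunk size per device; the chunk sizes always sum to n.
// Each device's weight is compute_units * max_clock_mhz.
std::vector<std::size_t> split_work(std::size_t n,
                                    const std::vector<DeviceInfo>& devices) {
    assert(!devices.empty());

    std::vector<unsigned long long> weights;
    unsigned long long total = 0;
    for (const DeviceInfo& d : devices) {
        unsigned long long w =
            static_cast<unsigned long long>(d.compute_units) * d.max_clock_mhz;
        weights.push_back(w);
        total += w;
    }
    assert(total > 0);

    std::vector<std::size_t> chunks(devices.size());
    std::size_t assigned = 0;
    for (std::size_t i = 0; i < devices.size(); ++i) {
        // Integer floor of n * w_i / total. Assumes n * w_i fits in 64 bits.
        chunks[i] = static_cast<std::size_t>(
            static_cast<unsigned long long>(n) * weights[i] / total);
        assigned += chunks[i];
    }
    // Elements lost to rounding go to the last device (arbitrary choice).
    chunks.back() += n - assigned;
    return chunks;
}
```

For example, a device with 10 compute units and one with 30, both at 1000 MHz, split 1000 elements into 250 and 750. The sketch ignores work-group size alignment, and the weight is only a rough proxy for throughput, so measured timings may suggest a different split.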

References & original discussion


Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published