EasyGPU is an embedded domain-specific language (eDSL) for GPU programming that allows writing compute kernels in standard C++20. No shader language knowledge required.
#include <GPU.h>
#include <vector>
int main() {
    std::vector<float> data(1024, 2.0f);
    Buffer<float> input(data);
    Buffer<float> output(1024);
    // Capture by reference so the kernel can see input and output
    Kernel1D square([&](Int i) {
        auto in = input.Bind();
        auto out = output.Bind();
        out[i] = in[i] * in[i];
    });
    square.Dispatch(16, true);
    return 0;
}
For beginners learning GPU programming:
- Write GPU kernels using familiar C++ syntax instead of learning GLSL/HLSL
- No graphics programming background required — works with arrays, not just triangles
- Full IDE support: autocomplete, type checking, compile-time error detection
- 10 lines of code for your first working GPU kernel
For experienced developers:
- Zero vendor lock-in (OpenGL 4.3+, cross-platform)
- Minimal dependencies (only GLAD, ~500KB)
- Clean C++20 interface without heavy template metaprogramming
Requirements:
- C++20-compatible compiler (GCC 11+, Clang 14+, MSVC 2022+)
- OpenGL 4.3+
- CMake 3.21+ (optional)
Traditional GPU programming requires maintaining two separate codebases:
// CPU: C++
std::vector<float> data = {1, 2, 3, 4, 5};
// GPU: GLSL (separate language)
const char* shader = R"(
#version 430 core
layout(local_size_x = 256) in;
layout(std430, binding = 0) buffer Data { float values[]; };
void main() {
uint idx = gl_GlobalInvocationID.x;
values[idx] = values[idx] * values[idx];
}
)";
Issues: language fragmentation, no IDE support for the shader string, errors detected only at runtime, string-based data passing.
EasyGPU unifies both sides in C++:
// CPU and GPU: C++
std::vector<float> data = {1, 2, 3, 4, 5};
Buffer<float> input(data);
Buffer<float> output(data.size());
// Capture by reference so the kernel can see input and output
Kernel1D square([&](Int i) {
    auto in = input.Bind();
    auto out = output.Bind();
    out[i] = in[i] * in[i];
});
square.Dispatch(16, true);
How it works:
- User writes C++ kernels using EasyGPU types
- Library constructs an Intermediate Representation (IR)
- IR is compiled to GLSL compute shaders
- OpenGL executes on GPU
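The steps above can be illustrated with a deliberately tiny model. This is a sketch only, not EasyGPU's actual IR or code generator: every name in it (`Node`, `Expr`, `Leaf`, `Emit`, `EmitKernel`) is hypothetical. Overloaded operators record an expression tree instead of computing a value, and a small emitter turns that tree into GLSL source text.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical IR node: either a leaf (a GLSL snippet) or a binary operation.
struct Node {
    std::string text;                // leaf text, or the operator symbol
    std::shared_ptr<Node> lhs, rhs;  // both null for leaves
};

// Hypothetical handle used in "kernel" code. Operators build IR; they do not compute.
struct Expr {
    std::shared_ptr<Node> node;
};

inline Expr Leaf(const std::string& text) {
    return Expr{std::make_shared<Node>(Node{text, nullptr, nullptr})};
}
inline Expr operator*(const Expr& a, const Expr& b) {
    return Expr{std::make_shared<Node>(Node{"*", a.node, b.node})};
}
inline Expr operator+(const Expr& a, const Expr& b) {
    return Expr{std::make_shared<Node>(Node{"+", a.node, b.node})};
}

// Walk the tree and print it as a parenthesized GLSL expression.
inline std::string Emit(const Node& n) {
    if (!n.lhs) return n.text;
    return "(" + Emit(*n.lhs) + " " + n.text + " " + Emit(*n.rhs) + ")";
}

// Wrap one assignment into a minimal compute shader entry point.
inline std::string EmitKernel(const std::string& target, const Expr& value) {
    return "void main() {\n"
           "    uint i = gl_GlobalInvocationID.x;\n"
           "    " + target + " = " + Emit(*value.node) + ";\n"
           "}\n";
}
```

Calling `EmitKernel("out_data[i]", Leaf("in_data[i]") * Leaf("in_data[i]"))` produces a `main()` whose body assigns `(in_data[i] * in_data[i])`, which has the same shape as the hand-written GLSL shown earlier. A real implementation would also need typing, statements, control flow, and buffer declarations.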
Standard C++ syntax for GPU code. IDE features (autocomplete, refactoring, static analysis) work out of the box.
Kernel1D sum([](Int i) {
c[i] = a[i] + b[i];
});
EasyGPU integrates with any OpenGL-based windowing framework. You control the window lifecycle; EasyGPU handles the GPU compute.
| Framework | Use Case |
|---|---|
| EasyX | Teaching / Rapid prototyping |
| GLFW | Cross-platform applications |

Original shader for the demos: Seascape on Shadertoy
Key benefits:
- Zero windowing interference — Bring your own window, EasyGPU only touches the GPU
- Native OpenGL interop — Render compute results directly to window textures
- Non-intrusive design — Adopt incrementally in existing projects
Structured control flow with C++-like semantics:
If(x > 0, [&]() {
result = Sqrt(x);
}).Else([&]() {
result = 0;
});
For(0, 100, [&](Int& i) {
    If(i % 2 == 0, [&]() { Continue(); });
    Process(i);
});
Automatic buffer alignment and struct layout:
EASYGPU_STRUCT(Particle,
(Float3, position),
(Float3, velocity),
(float, mass)
);
Buffer<Particle> particles(1000);
Important: EASYGPU_STRUCT must be defined in the global namespace. Defining it inside any namespace will cause compilation errors.
// Correct: global namespace
EASYGPU_STRUCT(Particle, ...);
// Wrong: inside a namespace
namespace MyProject {
    EASYGPU_STRUCT(Particle, ...); // ERROR
}
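To show the kind of bookkeeping that automatic struct layout removes, here is a small sketch that computes member offsets for the `Particle` struct under GLSL `std430` rules. Two assumptions are made: that struct buffers use `std430` (the hand-written GLSL example earlier does, but this README does not state it for structs), and that `Float3` maps to a GLSL `vec3`. All names in the sketch are hypothetical.

```cpp
#include <cstddef>

// Size and base alignment of a member under std430 (sketch: scalars and vec3 only).
struct Member {
    std::size_t size;
    std::size_t align;
};

constexpr Member kFloat{4, 4};
constexpr Member kVec3{12, 16};  // a vec3 occupies 12 bytes but aligns to 16

constexpr std::size_t AlignUp(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

struct Layout {
    std::size_t offsets[3];
    std::size_t size;  // also the array stride
};

// Lay out three members in declaration order.
constexpr Layout LayOut(Member a, Member b, Member c) {
    Layout out{};
    const Member members[3] = {a, b, c};
    std::size_t offset = 0;
    std::size_t max_align = 1;
    for (int k = 0; k < 3; ++k) {
        offset = AlignUp(offset, members[k].align);
        out.offsets[k] = offset;
        offset += members[k].size;
        if (members[k].align > max_align) max_align = members[k].align;
    }
    out.size = AlignUp(offset, max_align);  // round up to the struct alignment
    return out;
}

// position (vec3), velocity (vec3), mass (float)
constexpr Layout kParticle = LayOut(kVec3, kVec3, kFloat);
static_assert(kParticle.size == 32, "each Particle occupies 32 bytes in the buffer");
```

Under these assumptions `position` sits at offset 0, `velocity` at 16, and `mass` at 28. A tightly packed CPU struct of seven floats would be 28 bytes with `velocity` at offset 12, so uploading it unchanged would misplace every field after `position`.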
Callable<Float(Float)> square = [](Float x) {
Return(x * x);
};
result = square(input);
// Inspect generated GLSL
std::cout << kernel.GetGeneratedGLSL() << std::endl;
// Profile execution
KernelProfiler::PrintReport(kernel);
Pixel Buffer Objects (PBO) for non-blocking CPU/GPU transfers:
Texture2D<PixelFormat::RGBA8> video(1920, 1080);
video.InitUploadPBOPool(2); // Double buffering
// Upload without blocking - essential for real-time video
video.UploadAsync(frameData.data());
kernel.Dispatch(120, 68, true); // GPU processes while CPU continues
EasyGPU operates in exclusive mode by default, assuming it has sole ownership of the OpenGL context within the current thread. This design choice maximizes performance by:
- State caching: Programs, buffers, and textures are only rebound when actually changed
- Eliminating redundant glMakeCurrent: Context is made current once during Attach() and stays current
- No defensive glGet calls: Trusting cached state avoids expensive driver synchronization
Implications:
- Do not interleave raw OpenGL calls with EasyGPU operations in the same context
- If you must use raw OpenGL, either:
  - Use a separate OpenGL context, or
  - Call GPU::Runtime::GetStateCache().Invalidate() before returning to EasyGPU
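The reason raw OpenGL calls are a problem becomes clearer with a minimal model of a state cache. The sketch below is not EasyGPU's implementation; `FakeDriver` and `StateCache` are hypothetical names, and a counter stands in for real driver calls. The cache skips a bind when it believes the program is already current, so any bind made behind its back leaves it with a stale belief until `Invalidate()` is called.

```cpp
#include <cassert>

// Hypothetical stand-in for the driver: counts how many bind calls reach it.
struct FakeDriver {
    int bind_calls = 0;
    void BindProgram(unsigned /*id*/) { ++bind_calls; }
};

// Hypothetical cache: remembers the last bound program and skips repeats.
class StateCache {
public:
    explicit StateCache(FakeDriver& driver) : driver_(driver) {}

    void UseProgram(unsigned id) {
        if (valid_ && id == current_) return;  // believed to be bound already
        driver_.BindProgram(id);
        current_ = id;
        valid_ = true;
    }

    // Forget the cached state; the next UseProgram always reaches the driver.
    void Invalidate() { valid_ = false; }

private:
    FakeDriver& driver_;
    unsigned current_ = 0;
    bool valid_ = false;
};
```

With this model, two consecutive `UseProgram(7)` calls cost one driver call. If other code binds a different program directly, a later `UseProgram(7)` would be skipped incorrectly; calling `Invalidate()` first forces the rebind.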
FragmentKernel Lifecycle:
FragmentKernel2D kernel(...);
kernel.Attach(hwnd); // Context becomes current here
while (running) {
// No need to MakeCurrent - context stays current
kernel.Flush(); // Minimal state changes thanks to caching
}
// Context cleanup happens automatically
CMake FetchContent:
include(FetchContent)
FetchContent_Declare(
easygpu
GIT_REPOSITORY https://github.com/easygpu/EasyGPU.git
GIT_TAG v0.1.0
)
FetchContent_MakeAvailable(easygpu)
target_link_libraries(your_target EasyGPU)
Manual: Copy include/ to your project and link against OpenGL.
#include <GPU.h>
#include <iostream>
#include <vector>
int main() {
std::vector<float> numbers = {1, 2, 3, 4, 5};
Buffer<float> gpu_input(numbers);
Buffer<float> gpu_output(numbers.size());
Kernel1D double_values([&](Int i) {
auto input = gpu_input.Bind();
auto output = gpu_output.Bind();
output[i] = input[i] * 2.0f;
});
double_values.Dispatch(1, true);
gpu_output.Download(numbers);
for (float n : numbers) {
std::cout << n << " ";
}
return 0;
}
⚠️ Important: Variable Initialization
Always use Make*() functions when initializing variables from buffer elements:
// ✅ CORRECT: Creates a new independent variable
Int val = MakeInt(input[i]);
val = 5; // Only modifies val
// ❌ DANGEROUS: May create a reference to input[i]
Int val = input[i];
val = 5; // May unexpectedly modify input[i]!
See Tutorial for details.
Build:
g++ -std=c++20 hello_gpu.cpp -lEasyGPU -lGL -o hello_gpu
./hello_gpu

| Level | Example | Topics |
|---|---|---|
| Beginner | hello_world | Buffers, kernels |
| Beginner | mandelbrot | 2D kernels, math |
| Intermediate | julia_set | Complex numbers |
| Intermediate | ray_tracing | Structs, RNG, basic ray tracing |
| Advanced | sdf_renderer | Callables, SDF, path tracing |
Kernel2D mandelbrot([&](Int px, Int py) {
Float x = CENTER_X + (Float(px) / WIDTH - 0.5f) * ZOOM;
Float y = CENTER_Y + (Float(py) / HEIGHT - 0.5f) * ZOOM;
Float zx = 0, zy = 0;
Int iter = 0;
    For(0, MAX_ITER, [&](Int i) {
        If(zx*zx + zy*zy > 4.0f, [&]() {
            iter = i;
            Break();
        });
        Float new_zx = zx*zx - zy*zy + x;
        zy = 2.0f*zx*zy + y;
        zx = new_zx;
    });
    image[py * WIDTH + px] = ColorFromIteration(iter);
});
mandelbrot.Dispatch(WIDTH/16, HEIGHT/16);
ray_tracing: Basic Monte Carlo ray tracer demonstrating struct handling and random number generation.
sdf_renderer: Signed distance field path tracer with support for complex lighting and materials. Demonstrates advanced Callable usage and reusable kernel functions.
Always use Make*() functions when creating GPU variables from buffer elements or expressions:
auto buf = buffer.Bind();
// ✅ CORRECT: Explicitly create a new independent variable
Int val = MakeInt(buf[i]);
Float f = MakeFloat(buf[i] * 2.0f);
val = 5; // Only modifies val, NOT buf[i]
// ❌ DANGEROUS: Direct initialization may create a reference
Int val = buf[i]; // val may become an alias to buf[i]!
val = 5; // May unexpectedly modify buf[i] in the generated shader
Why this matters: Due to move constructor optimizations, Int val = buf[i] transfers ownership of the underlying variable name, causing val to reference buf[i] directly. Use Make*() to ensure value semantics and create truly independent variables.
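The aliasing effect can be reproduced in a toy model. This is a sketch under stated assumptions, not EasyGPU's real types: `Var` and `Recorder` are hypothetical, and a variable is modeled as nothing more than the GLSL name it stands for. Initializing a variable from the temporary returned by element access hands over the name `buf[i]`, so later assignments are emitted against the buffer slot, whereas a `Make`-style helper declares a fresh local first.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: a variable is just the GLSL name it stands for.
struct Var {
    std::string name;
    explicit Var(std::string n) : name(std::move(n)) {}
    Var(Var&& other) noexcept : name(std::move(other.name)) {}  // adopts the name
};

// Hypothetical recorder for the shader statements that get emitted.
struct Recorder {
    std::vector<std::string> lines;
    int next_id = 0;

    // Element access yields a temporary that names the buffer slot itself.
    Var Element(const std::string& buffer, const std::string& index) {
        return Var(buffer + "[" + index + "]");
    }
    // Make: declare a fresh local and copy the value into it.
    Var Make(const Var& source) {
        Var fresh("v" + std::to_string(next_id++));
        lines.push_back("int " + fresh.name + " = " + source.name + ";");
        return fresh;
    }
    // Assignment is emitted against whatever name the variable holds.
    void Assign(const Var& target, int value) {
        lines.push_back(target.name + " = " + std::to_string(value) + ";");
    }
};
```

In this model, `Var val = rec.Element("buf", "i"); rec.Assign(val, 5);` records `buf[i] = 5;`, while `Var val = rec.Make(rec.Element("buf", "i")); rec.Assign(val, 5);` records `int v0 = buf[i];` followed by `v0 = 5;`, leaving the buffer untouched.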
Always explicitly convert the right-hand side to Expr<T> when assigning one Var to another:
Int A;
Int B = MakeInt(10);
// ❌ WRONG: Direct Var-Var assignment may not generate correct IR
A = B;
// ✅ CORRECT: Explicitly convert right-hand side to Expr
A = Expr<int>(B);
Use ExprBase::NotUse() for expressions with side-effects that aren't captured by operators:
Callable<void(Int&)> A = [](Int &a) { a = 20; };
// ✅ CORRECT: Void-returning Callables automatically preserve side-effects
A(b);
Callable<Float(Float, Float&)> B = [](Float x, Float& out) {
out = x * 2;
Return(x + 1);
};
// ❌ WRONG: Non-void return with ignored result may lose side-effect on 'out'
Float y;
B(MakeFloat(5.0f), y);
// ✅ CORRECT: Explicitly mark the expression as "not used" to preserve side-effect
Float z;
ExprBase::NotUse(B(MakeFloat(5.0f), z));
Important: Only Callable<void> automatically handles side-effects. For Callable<T> where T is not void, if you ignore the return value but need the side-effects (e.g., modifications to reference parameters), you must wrap the call with ExprBase::NotUse().
| Dependency | Required | Size | Purpose |
|---|---|---|---|
| OpenGL 4.3+ | Yes | System | Compute backend |
| GLAD | Yes | ~500KB (bundled) | OpenGL loader |
| stb_image | No | ~50KB (examples only) | Image I/O |
git clone --recursive https://github.com/easygpu/EasyGPU.git
cd EasyGPU
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j
cd build && ctest

| Option | Default | Description |
|---|---|---|
| EASYGPU_BUILD_EXAMPLES | ON | Build examples |
| EASYGPU_BUILD_TESTS | ON | Build tests |
MIT License. See LICENSE.
- LuisaCompute — DSL design
- Taichi — Algorithms
- GLAD — OpenGL loader
- stb — Image utilities





