EasyGPU

Lightweight C++20 Embedded DSL for GPU Compute

Getting Started · Tutorial · Examples · API Reference

Overview

EasyGPU is an embedded domain-specific language (eDSL) for GPU programming that allows writing compute kernels in standard C++20. No shader language knowledge required.

#include <GPU.h>
#include <vector>

int main() {
    std::vector<float> data(1024, 2.0f);
    Buffer<float> input(data);
    Buffer<float> output(1024);

    Kernel1D square([&](Int i) {  // capture input/output by reference
        auto in = input.Bind();
        auto out = output.Bind();
        out[i] = in[i] * in[i];
    });

    square.Dispatch(16, true);
    return 0;
}

Who Is This For

For beginners learning GPU programming:

Write GPU kernels using familiar C++ syntax instead of learning GLSL/HLSL
No graphics programming background required — works with arrays, not just triangles
Full IDE support: autocomplete, type checking, compile-time error detection
10 lines of code for your first working GPU kernel

For experienced developers:

Zero vendor lock-in (OpenGL 4.3+, cross-platform)
Minimal dependencies (only GLAD, ~500KB)
Clean C++20 interface without heavy template metaprogramming

Requirements

C++20 compatible compiler (GCC 11+, Clang 14+, MSVC 2022+)
OpenGL 4.3+
CMake 3.21+ (optional)

Concept

The Problem

Traditional GPU programming requires maintaining two separate codebases:

// CPU: C++
std::vector<float> data = {1, 2, 3, 4, 5};

// GPU: GLSL (separate language)
const char* shader = R"(
    #version 430 core
    layout(local_size_x = 256) in;
    layout(std430, binding = 0) buffer Data { float values[]; };
    void main() {
        uint idx = gl_GlobalInvocationID.x;
        values[idx] = values[idx] * values[idx];
    }
)";

Issues: language fragmentation, no IDE support for shader strings, shader errors that surface only at runtime, and string-based data passing.

The Approach

EasyGPU unifies both sides in C++:

// CPU and GPU: C++
std::vector<float> data = {1, 2, 3, 4, 5};
Buffer<float> input(data);
Buffer<float> output(data.size());

Kernel1D square([&](Int i) {
    auto in = input.Bind();
    auto out = output.Bind();
    out[i] = in[i] * in[i];
});

square.Dispatch(16, true);

Implementation

User writes C++ kernels using EasyGPU types
Library constructs an Intermediate Representation (IR)
IR is compiled to GLSL compute shaders
OpenGL executes on GPU
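
The steps above can be illustrated with a toy model. This is only a sketch of the general eDSL technique, not EasyGPU's source: ToyFloat, EmitToyShader and SquareShader are hypothetical names, and the toy skips a real IR by recording GLSL text directly. The point it shows is that the C++ lambda runs once on the CPU, and the overloaded operators record what the GPU should do rather than computing anything.

```cpp
#include <string>

// Toy model only: these names are hypothetical and are not EasyGPU's classes.
// Each value carries the GLSL text that would compute it, instead of a number.
struct ToyFloat {
    std::string glsl;
};

// Overloaded operators do no arithmetic; they build a larger expression string.
inline ToyFloat operator*(const ToyFloat& a, const ToyFloat& b) {
    return {"(" + a.glsl + " * " + b.glsl + ")"};
}
inline ToyFloat operator+(const ToyFloat& a, const ToyFloat& b) {
    return {"(" + a.glsl + " + " + b.glsl + ")"};
}

// Calling the kernel body once on the CPU records the code it describes.
template <typename Body>
std::string EmitToyShader(Body&& body) {
    ToyFloat in{"values[idx]"};
    ToyFloat out = body(in);
    return "void main() {\n"
           "    uint idx = gl_GlobalInvocationID.x;\n"
           "    values[idx] = " + out.glsl + ";\n"
           "}\n";
}

// The "square" kernel from The Problem, written against the toy types.
inline std::string SquareShader() {
    return EmitToyShader([](const ToyFloat& x) { return x * x; });
}
```

SquareShader() returns a main() body whose store line matches the hand-written GLSL in The Problem, up to extra parentheses. A real implementation would build a tree-shaped IR first so it can type-check and optimize before emitting text.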

Features

Unified Language

Standard C++ syntax for GPU code. IDE features (autocomplete, refactoring, static analysis) work out of the box.

Kernel1D sum([](Int i) {
    c[i] = a[i] + b[i];
});

Interoperability — Works with Your Favorite Framework

EasyGPU integrates seamlessly with any OpenGL-based windowing framework. You control the window lifecycle; EasyGPU handles the GPU compute.

Framework	Use Case	Demo
EasyX	Teaching / Rapid prototyping
GLFW	Cross-platform applications

Original shader: Seascape on Shadertoy

Key benefits:

Zero windowing interference — Bring your own window, EasyGPU only touches the GPU
Native OpenGL interop — Render compute results directly to window textures
Non-intrusive design — Adopt incrementally in existing projects

Control Flow

Structured control flow with C++-like semantics:

If(x > 0, [&]() { 
    result = Sqrt(x); 
}).Else([&]() { 
    result = 0; 
});

For(0, 100, [&](Int& i) {
    If(i % 2 == 0, [&]() { Continue(); });
    Process(i);
});

Memory Management

Automatic buffer alignment and struct layout:

EASYGPU_STRUCT(Particle,
    (Float3, position),
    (Float3, velocity),
    (float, mass)
);

Buffer<Particle> particles(1000);

Important: EASYGPU_STRUCT must be defined in the global namespace. Defining it inside any namespace causes compilation errors.

// Correct: global namespace
EASYGPU_STRUCT(Particle, ...);

// Wrong: inside namespace
namespace MyProject {
    EASYGPU_STRUCT(Particle, ...);  // ERROR
}
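
To see why automatic layout matters, here is a sketch of the offsets the Particle struct would get under GLSL std430 rules. It assumes the generated shader uses std430, as the hand-written shader in The Problem does; this README does not confirm that. AlignUp, MemberLayout and ParticleLayout are hypothetical helpers, not EasyGPU API.

```cpp
#include <cstddef>

// Hypothetical helper, not part of EasyGPU: round offset up to a multiple of align.
constexpr std::size_t AlignUp(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// std430 rules for the two member types used by Particle:
//   float: size 4,  alignment 4
//   vec3:  size 12, alignment 16
struct MemberLayout { std::size_t size; std::size_t align; };
constexpr MemberLayout kFloat{4, 4};
constexpr MemberLayout kVec3{12, 16};

struct ParticleLayout {
    std::size_t position, velocity, mass, stride;
};

constexpr ParticleLayout ComputeParticleLayout() {
    ParticleLayout l{};
    std::size_t cursor = 0;
    l.position = AlignUp(cursor, kVec3.align);  cursor = l.position + kVec3.size;
    l.velocity = AlignUp(cursor, kVec3.align);  cursor = l.velocity + kVec3.size;
    l.mass     = AlignUp(cursor, kFloat.align); cursor = l.mass + kFloat.size;
    // Array stride: struct size rounded up to its largest member alignment.
    l.stride = AlignUp(cursor, kVec3.align);
    return l;
}

constexpr ParticleLayout kParticle = ComputeParticleLayout();
static_assert(kParticle.position == 0);
static_assert(kParticle.velocity == 16);  // 4 bytes of padding after position
static_assert(kParticle.mass == 28);      // the float packs right after velocity
static_assert(kParticle.stride == 32);    // elements of Particle[] are 32 bytes apart
```

A tightly packed C++ struct with seven floats is 28 bytes with velocity at offset 12, so copying it byte-for-byte into such a buffer would misplace every field after position. That mismatch is the kind of detail the automatic layout is meant to hide.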

Reusable Functions

Callable<Float(Float)> square = [](Float x) {
    Return(x * x);
};

result = square(input);

Introspection

// Inspect generated GLSL
std::cout << kernel.GetGeneratedGLSL() << std::endl;

// Profile execution
KernelProfiler::PrintReport(kernel);

Async Data Transfer

Pixel Buffer Objects (PBO) for non-blocking CPU/GPU transfers:

Texture2D<PixelFormat::RGBA8> video(1920, 1080);
video.InitUploadPBOPool(2);  // Double buffering

// Upload without blocking - essential for real-time video
video.UploadAsync(frameData.data());
kernel.Dispatch(120, 68, true);  // GPU processes while CPU continues

Performance Notes — Exclusive OpenGL Context Mode

EasyGPU operates in exclusive mode by default, assuming it has sole ownership of the OpenGL context within the current thread. This design choice maximizes performance by:

State caching: Programs, buffers, and textures are only rebound when actually changed
Eliminating redundant MakeCurrent calls (e.g. wglMakeCurrent): Context is made current once during Attach() and stays current
No defensive glGet calls: Trusting cached state avoids expensive driver synchronization

Implications:

Do not interleave raw OpenGL calls with EasyGPU operations in the same context
If you must use raw OpenGL, either:
- Use a separate OpenGL context, or
- Call GPU::Runtime::GetStateCache().Invalidate() before returning to EasyGPU
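
The reason for the Invalidate() rule can be shown with a toy state cache. This is a minimal sketch of the caching idea under the assumption that the cache tracks the last bound object per slot; ToyStateCache and its members are hypothetical and are not EasyGPU's implementation.

```cpp
#include <cstdint>

// Toy model of bind-state caching; names are hypothetical, not EasyGPU's API.
class ToyStateCache {
public:
    // Returns true if a (simulated) driver call was issued.
    bool UseProgram(std::uint32_t program) {
        if (valid_ && program == current_program_) {
            return false;  // Cache hit: skip the redundant bind.
        }
        // A real implementation would call glUseProgram(program) here.
        current_program_ = program;
        valid_ = true;
        ++driver_calls_;
        return true;
    }

    // Forget everything, forcing the next bind to reach the driver.
    void Invalidate() { valid_ = false; }

    int driver_calls() const { return driver_calls_; }

private:
    std::uint32_t current_program_ = 0;
    bool valid_ = false;
    int driver_calls_ = 0;
};

// Simulates: bind 7, bind 7 again, external code rebinds something else,
// then bind 7 once more, with or without invalidating the cache first.
inline int DriverCallsAfterExternalChange(bool invalidate) {
    ToyStateCache cache;
    cache.UseProgram(7);  // reaches the driver
    cache.UseProgram(7);  // skipped by the cache
    // ... raw OpenGL code binds a different program behind the cache's back ...
    if (invalidate) cache.Invalidate();
    cache.UseProgram(7);  // reaches the driver only if the cache was invalidated
    return cache.driver_calls();
}
```

Without invalidation the third bind is skipped even though the driver now has a different program bound, so the next dispatch would run the wrong shader. With invalidation the bind goes through, at the cost of one extra driver call.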

FragmentKernel Lifecycle:

FragmentKernel2D kernel(...);
kernel.Attach(hwnd);  // Context becomes current here

while (running) {
    // No need to MakeCurrent - context stays current
    kernel.Flush();     // Minimal state changes thanks to caching
}
// Context cleanup happens automatically

Quick Start

Installation

CMake FetchContent:

include(FetchContent)
FetchContent_Declare(
    easygpu
    GIT_REPOSITORY https://github.com/easygpu/EasyGPU.git
    GIT_TAG v0.1.0
)
FetchContent_MakeAvailable(easygpu)
target_link_libraries(your_target EasyGPU)

Manual: Copy include/ to your project and link against OpenGL.

First Program

#include <GPU.h>
#include <iostream>
#include <vector>

int main() {
    std::vector<float> numbers = {1, 2, 3, 4, 5};
    
    Buffer<float> gpu_input(numbers);
    Buffer<float> gpu_output(numbers.size());
    
    Kernel1D double_values([&](Int i) {
        auto input = gpu_input.Bind();
        auto output = gpu_output.Bind();
        output[i] = input[i] * 2.0f;
    });
    
    double_values.Dispatch(1, true);
    gpu_output.Download(numbers);
    
    for (float n : numbers) {
        std::cout << n << " ";
    }
    
    return 0;
}

⚠️ Important: Variable Initialization

Always use Make*() functions when initializing variables from buffer elements:

// ✅ CORRECT: Creates a new independent variable
Int val = MakeInt(input[i]);
val = 5;  // Only modifies val

// ❌ DANGEROUS: May create a reference to input[i]
Int val = input[i];
val = 5;  // May unexpectedly modify input[i]!

See Tutorial for details.

Build:

g++ -std=c++20 hello_gpu.cpp -lEasyGPU -lGL -o hello_gpu
./hello_gpu

Examples

Level	Example	Topics
Beginner	hello_world	Buffers, kernels
Beginner	mandelbrot	2D kernels, math
Intermediate	julia_set	Complex numbers
Intermediate	ray_tracing	Structs, RNG, basic ray tracing
Advanced	sdf_renderer	Callables, SDF, path tracing

Mandelbrot Set

Kernel2D mandelbrot([&](Int px, Int py) {
    Float x = CENTER_X + (Float(px) / WIDTH - 0.5f) * ZOOM;
    Float y = CENTER_Y + (Float(py) / HEIGHT - 0.5f) * ZOOM;
    
    Float zx = 0, zy = 0;
    Int iter = 0;
    
    For(0, MAX_ITER, [&](Int i) {
        If(zx*zx + zy*zy > 4.0f, [&]() {
            iter = i;
            Break();
        });
        Float new_zx = zx*zx - zy*zy + x;
        zy = 2.0f*zx*zy + y;
        zx = new_zx;
    });
    
    image[py * WIDTH + px] = ColorFromIteration(iter);
});

mandelbrot.Dispatch(WIDTH/16, HEIGHT/16);

View full example →
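
A note on the dispatch size: WIDTH/16 and HEIGHT/16 use integer division, so any dimension that is not a multiple of 16 loses its last partial block. The sketch below rounds up instead. It assumes Dispatch takes work-group counts and that a 2D group covers 16 pixels per axis, which is inferred from the division above and not confirmed here. GroupCount is a hypothetical helper, not EasyGPU API.

```cpp
// Hypothetical helper, not part of EasyGPU: number of work groups needed to
// cover `size` items when each group handles `group_size` of them.
constexpr int GroupCount(int size, int group_size) {
    return (size + group_size - 1) / group_size;
}

static_assert(GroupCount(1920, 16) == 120);
static_assert(GroupCount(1080, 16) == 68);  // plain 1080 / 16 truncates to 67
static_assert(GroupCount(1024, 16) == 64);
```

The 1920x1080 case matches the Dispatch(120, 68, true) call in Async Data Transfer. When rounding up, some threads fall outside the image, so the kernel body should skip indices where px >= WIDTH or py >= HEIGHT before writing.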

Ray Tracing

Basic Monte Carlo ray tracer demonstrating struct handling and random number generation.

View full example →

SDF Path Tracer

Signed distance field path tracer with support for complex lighting and materials. Demonstrates advanced Callable usage and reusable kernel functions.

View full example →

Best Practices

Variable Initialization

Always use Make*() functions when creating GPU variables from buffer elements or expressions:

auto buf = buffer.Bind();

// ✅ CORRECT: Explicitly create a new independent variable
Int val = MakeInt(buf[i]);
Float f = MakeFloat(buf[i] * 2.0f);
val = 5;  // Only modifies val, NOT buf[i]

// ❌ DANGEROUS: Direct initialization may create a reference
Int val = buf[i];  // val may become an alias to buf[i]!
val = 5;  // May unexpectedly modify buf[i] in the generated shader

Why this matters: Due to move constructor optimizations, Int val = buf[i] transfers ownership of the underlying variable name, so val refers to buf[i] directly and later assignments to val write to the buffer. Use Make*() to ensure value semantics and create truly independent variables.
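
The aliasing described above can be reproduced in a toy model. This sketch shows one way a name-stealing move constructor leads to the behaviour; it is an assumption about the mechanism for illustration only, and ToyInt, Emitted, AliasedStore and IndependentStore are hypothetical names, not EasyGPU code.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy model only; hypothetical names, not EasyGPU's implementation.
// Statements "emitted" into the shader are appended here.
inline std::vector<std::string>& Emitted() {
    static std::vector<std::string> lines;
    return lines;
}

class ToyInt {
public:
    // Fresh variable with its own name, initialised from an expression.
    static ToyInt Make(const std::string& expr, const std::string& name) {
        Emitted().push_back("int " + name + " = " + expr + ";");
        return ToyInt(name);
    }
    // Wraps an existing l-value such as a buffer element; emits nothing.
    static ToyInt Element(const std::string& lvalue) { return ToyInt(lvalue); }

    // Move construction steals the name: no new shader variable is declared.
    ToyInt(ToyInt&& other) noexcept : name_(std::move(other.name_)) {}

    // Assignment from a literal emits a store to whatever name we hold.
    ToyInt& operator=(int value) {
        Emitted().push_back(name_ + " = " + std::to_string(value) + ";");
        return *this;
    }

private:
    explicit ToyInt(std::string name) : name_(std::move(name)) {}
    std::string name_;
};

// Models `Int val = buf[i]; val = 5;` and returns the last emitted statement.
inline std::string AliasedStore() {
    Emitted().clear();
    ToyInt elem = ToyInt::Element("buf[i]");
    ToyInt val = std::move(elem);  // val now holds the name "buf[i]"
    val = 5;
    return Emitted().back();
}

// Models `Int val = MakeInt(buf[i]); val = 5;`.
inline std::string IndependentStore() {
    Emitted().clear();
    ToyInt val = ToyInt::Make("buf[i]", "v0");  // declares a separate variable v0
    val = 5;
    return Emitted().back();
}
```

In the toy, AliasedStore() ends with a store to buf[i], while IndependentStore() ends with a store to v0 and leaves the buffer untouched.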

Var-Var Assignment

Always explicitly convert the right-hand side to Expr<T> when assigning one Var to another:

Int A;
Int B = MakeInt(10);

// ❌ WRONG: Direct Var-Var assignment may not generate correct IR
A = B;

// ✅ CORRECT: Explicitly convert right-hand side to Expr
A = Expr<int>(B);

Handling Side-Effects

Use ExprBase::NotUse() for expressions with side-effects that aren't captured by operators:

Callable<void(Int&)> A = [](Int &a) { a = 20; };

// ✅ CORRECT: Void-returning Callables automatically preserve side-effects
A(b);

Callable<Float(Float, Float&)> B = [](Float x, Float& out) {
    out = x * 2;
    Return(x + 1);
};

// ❌ WRONG: Non-void return with ignored result may lose side-effect on 'out'
Float y;
B(MakeFloat(5.0f), y);

// ✅ CORRECT: Explicitly mark the expression as "not used" to preserve side-effect
Float z;
ExprBase::NotUse(B(MakeFloat(5.0f), z));

Important: Only Callables with a void return type preserve side-effects automatically. For a Callable with a non-void return type, if you ignore the return value but need the side-effects (e.g., modifications to reference parameters), you must wrap the call with ExprBase::NotUse().

Documentation

Building

Dependencies

Dependency	Required	Size	Purpose
OpenGL 4.3+	Yes	System	Compute backend
GLAD	Yes	~500KB (bundled)	OpenGL loader
stb_image	No	~50KB (examples only)	Image I/O

Build Commands

git clone --recursive https://github.com/easygpu/EasyGPU.git
cd EasyGPU

cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j

cd build && ctest

CMake Options

Option	Default	Description
`EASYGPU_BUILD_EXAMPLES`	`ON`	Build examples
`EASYGPU_BUILD_TESTS`	`ON`	Build tests

License

MIT License. See LICENSE.

Acknowledgements

LuisaCompute — DSL design
Taichi — Algorithms
GLAD — OpenGL loader
stb — Image utilities
