Deferred OpenGL rendering engine

showcase.mp4

Table of contents

  1. Feature list
  2. Building
  3. Showcase
  4. Technical overview
  5. Development log

Feature list

Rendering

  • Deferred rendering
  • PBR material system
  • Irradiance probes for diffuse GI
  • Stable CSM
  • SSR
  • SSAO
  • Bloom
  • TAA
  • HDR lighting
  • Automatic exposure
  • Tone mapping
  • Volumetric lighting
  • Motion blur
  • Directional and point lights
  • Light bounding volumes
  • Normal mapping
  • Skyboxes

Engine

  • OpenGL 4.6
  • Direct state access (DSA)
  • Amortized GPU probe baking
  • glTF model loading
  • DDS, PNG, JPEG texture loading
  • Transform hierarchy
  • GPU profiler
  • CPU profiling with Tracy
  • Editor UI
  • Mouse picking

Planned

  • Reflection probes

Building

This project uses CMake for building. Refer to the CMake documentation for installation instructions for your platform. All other dependencies are vendored directly or as submodules:

  • glad: for loading OpenGL functions at runtime.
  • GLFW: for windowing and input.
  • GLM: for linear algebra and other math.
  • GLI: for (compressed) DDS image loading.
  • cgltf: for glTF model loading.
  • stb: for PNG and JPEG image loading.
  • Dear ImGui: for user interface.
  • fmt: for hassle-free string formatting.
  • Tracy: for CPU and GPU profiling.

Linux

$ git clone --recursive https://github.com/yhamdoud/engine.git
$ mkdir engine/build
$ cd engine/build
$ cmake ..
$ make
$ ./engine

Windows

Follow the Windows equivalents of the steps shown above. Alternatively, the CMake GUI or Visual Studio's CMake integration can be used.

Showcase

Baking

baking.mp4

Reflections

ssr.mp4

Tone mapping

tone_mapping.mp4

Technical overview

Deferred rendering

The G-buffer consists of the following render targets:

  • RT0 (RGBA16F)
    • RGB: Normal
    • A: Metallic
  • RT1 (SRGB8_ALPHA8)
    • RGB: Base color
    • A: Roughness
  • RT2 (RGBA16F)
    • RG: Velocity
    • BA: Unused
  • Depth buffer
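
As an illustration, a geometry-pass fragment shader filling this layout could look like the sketch below (inputs and names are hypothetical, not taken from the engine's source):

```glsl
#version 460 core

layout(location = 0) out vec4 rt0; // RGB: normal, A: metallic
layout(location = 1) out vec4 rt1; // RGB: base color, A: roughness
layout(location = 2) out vec4 rt2; // RG: velocity, BA: unused

in vec3 normal;        // interpolated surface normal
in vec2 uv;            // texture coordinates
in vec4 clip_pos;      // clip-space position, current frame
in vec4 prev_clip_pos; // clip-space position, previous frame

uniform sampler2D base_color_tex;
uniform sampler2D metal_rough_tex; // glTF packs metallic in B, roughness in G

void main()
{
    vec2 metal_rough = texture(metal_rough_tex, uv).bg;

    rt0 = vec4(normalize(normal), metal_rough.x);
    rt1 = vec4(texture(base_color_tex, uv).rgb, metal_rough.y);

    // Screen-space velocity for TAA and motion blur: the difference
    // between this frame's and last frame's NDC positions.
    rt2 = vec4(clip_pos.xy / clip_pos.w - prev_clip_pos.xy / prev_clip_pos.w,
               0.0, 0.0);
}
```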

The directional light is rendered in a full-screen pass. Effects like SSR and SSAO are also composited during this pass. Point light rendering is optimized using bounding volumes, so that only fragments that fall within the range of a light are shaded.

Physically based shading

The Cook-Torrance model is used as the specular BRDF. The diffuse BRDF is approximated by a simplified Lambertian model. The implementation of both borrows from the work presented in Physically Based Rendering in Filament and Real Shading in Unreal Engine 4.

The material system only supports the metallic-roughness workflow as of writing.
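
For reference, a minimal GLSL sketch of such a BRDF in the Filament formulation (GGX distribution, height-correlated Smith visibility, and Schlick Fresnel); it illustrates the model rather than reproducing the engine's exact code:

```glsl
const float PI = 3.14159265359;

// GGX normal distribution function.
float d_ggx(float noh, float a)
{
    float a2 = a * a;
    float f = noh * noh * (a2 - 1.0) + 1.0;
    return a2 / (PI * f * f);
}

// Height-correlated Smith visibility term.
float v_smith_ggx(float nov, float nol, float a)
{
    float a2 = a * a;
    float v = nol * sqrt(nov * nov * (1.0 - a2) + a2);
    float l = nov * sqrt(nol * nol * (1.0 - a2) + a2);
    return 0.5 / (v + l);
}

// Schlick's Fresnel approximation.
vec3 f_schlick(float voh, vec3 f0)
{
    return f0 + (1.0 - f0) * pow(1.0 - voh, 5.0);
}

vec3 brdf(vec3 n, vec3 v, vec3 l, vec3 base_color, float metallic,
          float roughness)
{
    vec3 h = normalize(v + l);
    float nov = abs(dot(n, v)) + 1e-5;
    float nol = clamp(dot(n, l), 0.0, 1.0);
    float noh = clamp(dot(n, h), 0.0, 1.0);
    float voh = clamp(dot(v, h), 0.0, 1.0);

    float a = roughness * roughness; // perceptual roughness remapping
    vec3 f0 = mix(vec3(0.04), base_color, metallic);

    vec3 specular = d_ggx(noh, a) * v_smith_ggx(nov, nol, a)
                    * f_schlick(voh, f0);
    vec3 diffuse = (1.0 - metallic) * base_color / PI; // Lambertian

    return (diffuse + specular) * nol;
}
```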

Diffuse indirect lighting

Diffuse indirect lighting for static and dynamic objects is approximated using a grid of irradiance probes. For each probe, the scene is rasterized to a cube map and the irradiance is subsequently determined using a compute shader. This process is repeated for multiple bounces and can be amortized over several frames.

An efficient encoding for irradiance is desirable when storing many probes; the fact that we are primarily interested in capturing low-frequency detail can be put to good use here. We opted to project the radiance map onto the third-order spherical harmonics basis functions. This allows us to represent each map using just 27 floating-point numbers (9 for each channel). Furthermore, calculating the irradiance ends up being a multiplication of each coefficient by a constant factor, instead of a convolution.

In practice, the cube map projection algorithm presented in Stupid Spherical Harmonics Tricks is used. The algorithm is implemented using two compute shaders: one projects each cube map texel, and the other reduces the results of the first pass in parallel. The resulting coefficients are packed into 7 three-dimensional textures and sampled at run time using hardware-accelerated trilinear interpolation.
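
With the convolution folded into per-band constants (the ones derived by Ramamoorthi and Hanrahan), evaluating irradiance for a normal amounts to a short weighted sum. A sketch, assuming the 9 RGB coefficients have been unpacked:

```glsl
// Evaluate irradiance in direction n from 9 SH coefficients per color
// channel. The constants are the cosine-lobe convolution factors from
// Ramamoorthi and Hanrahan; the coefficient layout is illustrative.
vec3 sh_irradiance(vec3 sh[9], vec3 n)
{
    // Band 0
    vec3 e = sh[0] * 0.886227;
    // Band 1
    e += sh[1] * 1.023328 * n.y;
    e += sh[2] * 1.023328 * n.z;
    e += sh[3] * 1.023328 * n.x;
    // Band 2
    e += sh[4] * 0.858086 * n.x * n.y;
    e += sh[5] * 0.858086 * n.y * n.z;
    e += sh[6] * 0.247708 * (3.0 * n.z * n.z - 1.0);
    e += sh[7] * 0.858086 * n.x * n.z;
    e += sh[8] * 0.429043 * (n.x * n.x - n.y * n.y);
    return max(e, vec3(0.0));
}
```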

Screen-space reflections

The view-space position of the fragment is reconstructed from the depth buffer, and the corresponding surface normal is sampled from the G-buffer. A ray leaving the camera is reflected about this normal, and the reflected ray is used to march the depth buffer. Provided the reflected point is visible on screen, we end up with an approximate hit position, which can be used to sample a buffer containing the lit scene.

The implemented depth buffer marching solution relies on rasterizing the ray in screen space using a DDA algorithm, as first presented by Morgan McGuire and Mike Mara. This technique distributes the samples evenly, and thus avoids over- and undersampling issues.
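
The sketch below illustrates the general idea with a deliberately simplified march that steps uniformly in view space and projects each sample back to the screen; the engine's DDA variant instead rasterizes the ray in screen space, which is what distributes the samples evenly. It assumes a linearized depth buffer storing positive view-space depth, and all names are illustrative:

```glsl
uniform sampler2D depth_buf;
uniform mat4 proj; // view space -> clip space

bool trace_reflection(vec3 origin_vs, vec3 dir_vs, out vec2 hit_uv)
{
    const int step_count = 64;
    const float max_dist = 16.0;
    const float thickness = 0.1;

    for (int i = 1; i <= step_count; i++)
    {
        vec3 p = origin_vs + dir_vs * (max_dist * float(i) / float(step_count));

        // Project the sample back to screen space.
        vec4 clip = proj * vec4(p, 1.0);
        vec2 uv = clip.xy / clip.w * 0.5 + 0.5;
        if (any(lessThan(uv, vec2(0.0))) || any(greaterThan(uv, vec2(1.0))))
            return false; // ray left the screen

        // A hit if the ray passed just behind the stored surface.
        float scene_depth = texture(depth_buf, uv).r;
        if (-p.z > scene_depth && -p.z < scene_depth + thickness)
        {
            hit_uv = uv;
            return true;
        }
    }

    return false;
}
```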

Screen-space ambient occlusion

The SSAO pass uses a normal-oriented hemisphere for sampling. A tiling texture is used to randomly offset the sample kernel, as described in this blog post. The SSAO texture is box blurred in a full-screen post-processing pass.

The editor exposes several parameters, like sample count and radius. This allows one to tweak the SSAO for a specific scene.
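
A sketch of the sampling loop, assuming view-space position and normal inputs, a 4x4 noise texture, and a hypothetical view_space_depth() helper that returns the view-space z of the G-buffer surface at a given UV:

```glsl
uniform sampler2D noise_tex; // 4x4 random rotation vectors
uniform vec3 kernel[64];     // random samples in a z >= 0 hemisphere
uniform int sample_count;
uniform float radius;
uniform mat4 proj;

float ssao(vec3 pos_vs, vec3 n, vec2 frag_coord)
{
    // Build a TBN basis with a random tangent to rotate the kernel,
    // orienting the sample hemisphere along the surface normal.
    vec3 rand = texture(noise_tex, frag_coord / 4.0).xyz;
    vec3 tangent = normalize(rand - n * dot(rand, n));
    mat3 tbn = mat3(tangent, cross(n, tangent), n);

    float occlusion = 0.0;

    for (int i = 0; i < sample_count; i++)
    {
        vec3 sample_vs = pos_vs + tbn * kernel[i] * radius;

        vec4 clip = proj * vec4(sample_vs, 1.0);
        vec2 uv = clip.xy / clip.w * 0.5 + 0.5;

        // The sample is occluded if the scene surface is in front of it.
        float scene_z = view_space_depth(uv); // hypothetical helper
        float range = smoothstep(0.0, 1.0, radius / abs(pos_vs.z - scene_z));
        occlusion += (scene_z >= sample_vs.z + 0.025 ? 1.0 : 0.0) * range;
    }

    return 1.0 - occlusion / float(sample_count);
}
```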

Volumetric lighting

Volumetric lighting is implemented as a post-processing pass consisting of four steps. First, the depth buffer is raymarched at half resolution to determine the scattering amount for a given fragment. In the next two passes, a separable bilateral blur is performed to filter the result. Finally, a bilateral upsample is performed and the volumetrics are added to the HDR target.

Our approach is based on the work presented in the GPU Pro 5 article "Volumetric Light Effects in Killzone: Shadow Fall" by Nathan Vos. During the raymarching step, we sample the shadow map to determine whether the ray is lit at its current position. The contribution of each sample is weighted using the Henyey-Greenstein phase function, to approximate the effects described by Mie scattering.

To reduce the number of raymarching steps without introducing undersampling artifacts, the ray's initial position is jittered using a tiled Bayer matrix. The blur passes perform a 7x7 bilateral Gaussian blur and are responsible for filtering the dithered volumetric lighting buffer.
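
A sketch of the scattering estimate for a single fragment, combining the jittered raymarch with the phase function (shadow_visibility() and the other names are hypothetical):

```glsl
// Henyey-Greenstein phase function with anisotropy parameter g.
float henyey_greenstein(float cos_theta, float g)
{
    float g2 = g * g;
    return (1.0 - g2)
           / (4.0 * 3.14159265 * pow(1.0 + g2 - 2.0 * g * cos_theta, 1.5));
}

float scattering(vec3 frag_pos_ws, vec3 cam_pos_ws, vec3 light_dir_ws,
                 float jitter, int step_count, float g)
{
    vec3 ray = frag_pos_ws - cam_pos_ws;
    vec3 step_vec = ray / float(step_count);

    // Offset the start position with a value from a tiled Bayer matrix
    // to hide undersampling.
    vec3 pos = cam_pos_ws + step_vec * jitter;

    float cos_theta = dot(normalize(ray), -light_dir_ws);
    float phase = henyey_greenstein(cos_theta, g);

    float accum = 0.0;
    for (int i = 0; i < step_count; i++)
    {
        // 1 if the sample is lit, 0 if shadowed (hypothetical helper
        // that samples the shadow map).
        accum += shadow_visibility(pos) * phase;
        pos += step_vec;
    }

    return accum / float(step_count);
}
```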

Bloom

The bloom post-processing pass consists of four steps:

  1. Downsampling and filtering the HDR render target.
  2. Upsampling and filtering the downsampled image.
  3. Additively combining the upsampled images.
  4. Blending the resulting texture with the original HDR target.

Upsampling and downsampling happen progressively, resulting in a mip chain of textures. For clarification, we refer to the presentation this implementation is based on: Next Generation Post Processing in Call of Duty: Advanced Warfare.

The upsampling and downsampling passes are implemented as compute shaders. The following filters are available:

  • 13-tap downsampling filter, from the aforementioned presentation.
  • 16-texel average downsampling filter.
  • 3x3 tent upsampling filter.

Additionally, a specialized version of the 13-tap filter is implemented for the first downsample pass to reduce fireflies.
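
As an example of one of these passes, the 3x3 tent upsample could be sketched as the following compute shader (bindings and names are illustrative):

```glsl
#version 460 core

layout(local_size_x = 8, local_size_y = 8) in;

layout(binding = 0) uniform sampler2D lower_mip;          // mip n + 1
layout(binding = 0, rgba16f) uniform image2D current_mip; // mip n

uniform float filter_radius; // radius in texels of the lower mip

void main()
{
    ivec2 coord = ivec2(gl_GlobalInvocationID.xy);
    if (any(greaterThanEqual(coord, imageSize(current_mip))))
        return;

    vec2 uv = (vec2(coord) + 0.5) / vec2(imageSize(current_mip));
    vec2 r = filter_radius / vec2(textureSize(lower_mip, 0));

    // Tent weights: 1 at the corners, 2 at the edges, 4 in the center.
    vec3 c = textureLod(lower_mip, uv + vec2(-r.x,  r.y), 0.0).rgb
           + textureLod(lower_mip, uv + vec2( 0.0,  r.y), 0.0).rgb * 2.0
           + textureLod(lower_mip, uv + vec2( r.x,  r.y), 0.0).rgb
           + textureLod(lower_mip, uv + vec2(-r.x,  0.0), 0.0).rgb * 2.0
           + textureLod(lower_mip, uv, 0.0).rgb * 4.0
           + textureLod(lower_mip, uv + vec2( r.x,  0.0), 0.0).rgb * 2.0
           + textureLod(lower_mip, uv + vec2(-r.x, -r.y), 0.0).rgb
           + textureLod(lower_mip, uv + vec2( 0.0, -r.y), 0.0).rgb * 2.0
           + textureLod(lower_mip, uv + vec2( r.x, -r.y), 0.0).rgb;

    // Accumulate the filtered result onto the current mip.
    vec3 prev = imageLoad(current_mip, coord).rgb;
    imageStore(current_mip, coord, vec4(prev + c / 16.0, 1.0));
}
```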

Tone mapping

The first part of the tone mapping process is calculating the average luminance for automatic exposure adjustment. A straightforward way of calculating this average is to successively downsample the HDR target into a 1×1 luminance buffer. However, the stability of this approach can suffer from the presence of very bright and dark pixels in the HDR target, which over-contribute to the average luminance.

A more robust approach for calculating the average luminance relies on the creation of a luminance histogram. Binning samples in a histogram gives us more control over the impact of samples at both extremes. Our histogram construction algorithm is based on the work presented by Alex Tardif in Adaptive Exposure from Luminance Histograms. The implementation consists of two compute shader passes: one for creating the histogram, and one for calculating the mean. In practice, we bin log luminance values and calculate the average log luminance. This limits the effect of extremely bright spots, like the sun, on the exposure adjustment. Additionally, the averaged value is smoothed over time.
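
A sketch of the histogram-building pass, binning log2 luminance into 256 bins through shared memory before flushing to a buffer (bin count, range, and names are illustrative):

```glsl
#version 460 core

layout(local_size_x = 16, local_size_y = 16) in;

layout(binding = 0) uniform sampler2D hdr_target;
layout(std430, binding = 0) buffer Histogram { uint bins[256]; };

uniform float min_log_lum;       // e.g. -10
uniform float inv_log_lum_range; // 1 / (max_log_lum - min_log_lum)

shared uint local_bins[256];

uint bin_for(vec3 color)
{
    float lum = dot(color, vec3(0.2126, 0.7152, 0.0722));
    if (lum < 1e-4)
        return 0u; // bin 0 collects near-black pixels

    // Map log2 luminance to bins [1, 255].
    float t = clamp((log2(lum) - min_log_lum) * inv_log_lum_range, 0.0, 1.0);
    return uint(t * 254.0 + 1.0);
}

void main()
{
    // One invocation per bin within the 16x16 workgroup.
    local_bins[gl_LocalInvocationIndex] = 0u;
    barrier();

    ivec2 coord = ivec2(gl_GlobalInvocationID.xy);
    if (all(lessThan(coord, textureSize(hdr_target, 0))))
        atomicAdd(local_bins[bin_for(texelFetch(hdr_target, coord, 0).rgb)], 1u);

    barrier();
    // Flush the workgroup-local histogram to the global one.
    atomicAdd(bins[gl_LocalInvocationIndex], local_bins[gl_LocalInvocationIndex]);
}
```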