Clarify compute shader N-body tutorial (#1798)
* Use a more manageable default size for the screen

* Set resizable to false to play safe with tiling WMs

* Make the tutorial friendlier to low-end systems by using fewer particles

* Use a top-level constant to control the number of stars

* Make naming consistent, ie stars instead of balls

* s/effected/affected/ + phrasing tweak in same sentence

* Use self.position instead of .center_x, .center_y

* Change the window title to be clearer

* s/ball/star/g in the compute shader example's glsl

* Improve clarity by renaming buffer variables and updating comments

* Clean up buffer initialization by using a nested function

* Use standard conditional execution for the example

* Add units to comments

* Make phrasing consistent in program parts list

* Explain templating behavior at the top of the compute shader

* Use pathlib.Path.read_text instead of longer with statements

* Add comments explaining input variables to compute shaders

* Randomize star color to make movement visualization easier

* Incorporate feedback from discord discussion

* Use clearer names for the buffers

* Clarify a comment

* Add missing "the" to increase smoothness

* Add notes on vec3 being fake based on discussion with @einarf

* Extend comment to support the .rst file

* Rephrase label in SVG diagram

* Move initial data generation into separate function to make indent cleaner

* Clean up the top of the tutorial's __init__

* Update file docstring, window title, and class name

* Remove unused instance variable & reorganize code for easier inclusion in .rst

* Initialize both ssbo buffers with the same data to start with

* Add section on buffer allocation to compute shader tutorial

* Tweak tutorial intro phrasing

* Tweak buffer opening phrasing

* Add index to SVG diagram label to indicate parallel iteration

* Subdivide visualization section with headings + touch up Vertex & Geometry shader sections

* Make ordering of variables identical across doc, vertex, and geometry shader

* Improve clarity of Geometry Shader section & GLSL

* Clean up rst description of geometry & fragment shaders

* Finish first full rework draft

* Make star color togglable & off by default to match the embedded video

* Update example portion line numbers

* Fix an ambiguous comment in compute shader

* Fix plural in heading
pushfoo committed May 29, 2023
1 parent 7068d35 commit 809dbe3
Showing 5 changed files with 261 additions and 146 deletions.
119 changes: 92 additions & 27 deletions doc/tutorials/compute_shader/index.rst
@@ -7,78 +7,143 @@ Compute Shader Tutorial

<div style="width: 100%; height: 0px; position: relative; padding-bottom: 56.250%;"><iframe src="https://streamable.com/e/ab8d87" frameborder="0" width="100%" height="100%" allowfullscreen style="width: 100%; height: 100%; position: absolute;"></iframe></div>

Using the compute shader, you can use the GPU to perform calculations thousands
of times faster than just by using the CPU.
For certain types of calculations, compute shaders on the GPU can be
thousands of times faster than on the CPU alone.

In this example, we will simulate a star field using an 'N-Body simulation'. Each
star is effected by each other star's gravity. For 1,000 stars, this means we have
In this tutorial, we will simulate a star field using an 'N-Body simulation'. Each
star is affected by the gravity of every other star. For 1,000 stars, this means we have
1,000 x 1,000 = 1,000,000 calculations to perform for each frame.
The video has 65,000 stars, requiring 4.2 billion gravity force calculations per frame.
On high-end hardware it can still run at 60 fps!
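
The quadratic growth described above can be checked with a few lines of
Python. This is only an illustrative sketch: ``pair_interactions`` is a
hypothetical helper, not part of the tutorial's code, and it assumes every
star is compared against every star, as the text describes.

```python
def pair_interactions(star_count: int) -> int:
    """Return how many star-on-star gravity evaluations one frame needs.

    Assumes each star is compared against every star, so the cost
    grows with the square of the star count.
    """
    return star_count * star_count


print(pair_interactions(1_000))   # 1000000
print(pair_interactions(65_000))  # 4225000000
```

The second figure matches the roughly 4.2 billion calculations per frame
quoted for the video.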

How does this work?
There are three major parts to this program:

* The Python code, this glues everything together.
* The visualization shaders, which let us see the data.
* The compute shader, which moves everything.
* The Python code, which allocates buffers & glues everything together
* The visualization shaders, which let us see the data in the buffers
* The compute shader, which moves everything

Buffers
-------

We need a place to store the data we'll visualize. To do so, we'll create
two **Shader Storage Buffer Objects** (SSBOs) of floating point numbers from
within our Python code. One will hold the previous frame's star positions,
and the other will be used to store the next frame's calculated positions.

Each buffer must be able to store the following for each star:

1. The x, y, and radius of the star
2. The velocity of the star, which will be unused by the visualization
3. The floating point RGBA color of the star


Generating Aligned Data
^^^^^^^^^^^^^^^^^^^^^^^

To avoid issues with GPU memory alignment quirks, we'll use the function
below to generate well-aligned data ready to load into the SSBO. The
docstrings & comments explain why in greater detail:

.. literalinclude:: main.py
    :language: python
    :caption: Generating Well-Aligned Data to Load onto the GPU
    :lines: 25-70
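
For readers without ``main.py`` at hand, the idea of padded, well-aligned
data can be sketched as follows. This is not the tutorial's actual function:
``generate_star_data``, the value ranges, and the exact slot order are
assumptions made for illustration. The only details taken from the text are
that each star occupies 12 floats holding position and radius, velocity, and
an RGBA color.

```python
import random
from array import array

# Assumed layout: three groups of four floats per star, 12 floats total.
FLOATS_PER_STAR = 12


def generate_star_data(star_count, width, height, seed=None):
    """Build a flat array of 32-bit floats, 12 per star (hypothetical layout)."""
    rng = random.Random(seed)
    data = array("f")
    for _ in range(star_count):
        # Group 1: x, y, an unused padding slot, and the radius
        data.extend((rng.uniform(0, width), rng.uniform(0, height), 0.0,
                     rng.uniform(1.0, 3.0)))
        # Group 2: velocity, padded out to four floats
        data.extend((0.0, 0.0, 0.0, 0.0))
        # Group 3: floating point RGBA color
        data.extend((1.0, 1.0, 1.0, 1.0))
    return data


stars = generate_star_data(star_count=100, width=800, height=600, seed=1)
raw_bytes = stars.tobytes()  # bytes ready to hand to a buffer-creation call
```

Padding every group to four floats keeps each one on a 16-byte boundary,
which is the kind of alignment GPU buffer layouts commonly expect.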

Allocating the Buffers
^^^^^^^^^^^^^^^^^^^^^^

.. literalinclude:: main.py
    :language: python
    :caption: Allocating the Buffers & Loading the Data onto the GPU
    :lines: 88-116


Visualization Shaders
---------------------

There are multiple visualization shaders, which operate in this order:
Now that we have the data, we need to be able to visualize it. We'll do
it by applying vertex, geometry, and fragment shaders to convert the
data in the SSBO into pixels. For each star's 12 floats in the array, the
following flow of data will take place:

.. image:: shaders.svg

The Python program creates a **shader storage buffer object** (SSBO) of
floating point numbers. This buffer
has the x, y, z and radius of each star stored in ``in_vertex``. It also
stores the color in ``in_color``.
Vertex Shader
^^^^^^^^^^^^^

In this tutorial, the vertex shader will be run for each star's 12 float
long stretch of raw padded data in ``self.ssbo_current``. Each execution
will output clean typed data to an instance of the geometry shader.

Data is read in as follows:

The **vertex shader** doesn't do much more than separate out the radius
variable from the group of floats used to store position.
* The x, y, and radius of each star are accessed via ``in_vertex``
* The floating point RGBA color of the star, via ``in_color``

.. literalinclude:: shaders/vertex_shader.glsl
    :language: glsl
    :caption: shaders/vertex_shader.glsl
    :linenos:

The **geometry shader** converts the single point (which we can't render) to
a square, which we can render. It changes the one point, to four points of a quad.
The variables below are then passed as inputs to the geometry shader:

* ``vertex_pos``
* ``vertex_radius``
* ``vertex_color``

Geometry Shader
^^^^^^^^^^^^^^^

The **geometry shader** converts a single point into a quad, in this
case a square, which the GPU can render. It does this by emitting four
points centered on the input point.

.. literalinclude:: shaders/geometry_shader.glsl
    :language: glsl
    :caption: shaders/geometry_shader.glsl
    :linenos:
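
The point-to-quad expansion can also be pictured on the CPU. The sketch
below is plain Python rather than GLSL, and ``quad_corners`` is a
hypothetical helper. It assumes the square's half-width equals the star's
radius and that the corners are emitted in triangle-strip order; the real
shader may differ on both points.

```python
def quad_corners(center_x, center_y, radius):
    """Return four corners of a square centered on the input point.

    Corners are listed in an assumed triangle-strip order:
    upper left, lower left, upper right, lower right.
    """
    return [
        (center_x - radius, center_y + radius),  # upper left
        (center_x - radius, center_y - radius),  # lower left
        (center_x + radius, center_y + radius),  # upper right
        (center_x + radius, center_y - radius),  # lower right
    ]


corners = quad_corners(10.0, 20.0, 2.0)
```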

The **fragment shader** runs for each pixel. It produces the soft glow effect of the
star, and rounds off the quad into a circle.
Fragment Shader
^^^^^^^^^^^^^^^

A **fragment shader** runs for each pixel in a quad. It converts a UV
coordinate within the quad to a float RGBA value. In this tutorial's
case, the shader produces the soft glowing circle on the surface of each
star's quad.

.. literalinclude:: shaders/fragment_shader.glsl
    :language: glsl
    :caption: shaders/fragment_shader.glsl
    :linenos:
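
As a rough CPU-side picture of the UV-to-RGBA conversion, the following
Python sketch fades a star's color out with distance from the quad's center.
The linear falloff and the ``glow_rgba`` name are assumptions chosen for
illustration; the shader's actual glow formula may be different.

```python
import math


def glow_rgba(u, v, color):
    """Map a UV coordinate in [0, 1] x [0, 1] to an RGBA tuple.

    Alpha is scaled by an assumed linear falloff: full strength at the
    quad's center, zero at the inscribed circle's edge and beyond.
    """
    distance = math.hypot(u - 0.5, v - 0.5)
    falloff = max(0.0, 1.0 - distance / 0.5)
    r, g, b, a = color
    return (r, g, b, a * falloff)


center_pixel = glow_rgba(0.5, 0.5, (1.0, 1.0, 1.0, 1.0))
corner_pixel = glow_rgba(0.0, 0.0, (1.0, 1.0, 1.0, 1.0))
```

Because alpha reaches zero outside the inscribed circle, the square quad
appears on screen as a round, soft-edged star.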


Compute Shaders
---------------
Compute Shader
--------------

Now that we have a way to display data, we should update it.

This program runs two buffers. We have an **input buffer**, with all our current data. We perform
calculations on that data and write to the **output buffer**. We then swap those buffers for the
next frame, where we use the output of the previous frame as the input to the next frame.
We created a pair of buffers earlier. We will use one SSBO as an
**input buffer** holding the previous frame's data, and another
as our **output** buffer to write results to.

We then swap our buffers each frame after drawing, using the output
as the input of the next frame, and repeat the process until the program
stops running.

.. literalinclude:: shaders/compute_shader.glsl
    :language: glsl
    :caption: shaders/compute_shader.glsl
    :linenos:
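
The input/output swap can be illustrated without a GPU. In this minimal
sketch, ordinary Python lists stand in for the two SSBOs and a trivial
``step`` function stands in for the compute shader. All names here are
hypothetical, and the one-dimensional motion is a deliberate
simplification rather than the tutorial's gravity calculation.

```python
def step(buffer_in, buffer_out):
    """Stand-in for the compute shader.

    Reads only from buffer_in and writes only to buffer_out, so no
    star ever sees a half-updated neighbor.
    """
    for i, (position, velocity) in enumerate(buffer_in):
        buffer_out[i] = (position + velocity, velocity)


def run(frames, initial):
    """Advance the simulation, swapping the buffers after every frame."""
    # Both buffers start with the same data.
    buffer_in = list(initial)
    buffer_out = list(initial)
    for _ in range(frames):
        step(buffer_in, buffer_out)
        # The output of this frame becomes the input of the next.
        buffer_in, buffer_out = buffer_out, buffer_in
    return buffer_in


print(run(3, [(0.0, 1.0), (10.0, -2.0)]))  # [(3.0, 1.0), (4.0, -2.0)]
```

Swapping references instead of copying data is what keeps this pattern
cheap: only the roles of the two buffers change between frames.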

Python Program
--------------
The Finished Python Program
---------------------------

Read through the code here, I've tried hard to explain all the parts in the comments.
The code includes thorough docstrings and annotations explaining how it works.

.. literalinclude:: main.py
    :caption: main.py
    :linenos:

An expanded version of this, with support for 3D, is available at: https://github.com/pvcraven/n-body
An expanded version of this tutorial with support for 3D is available at:
https://github.com/pvcraven/n-body