Vk Voxel Playground

Setup

Initialize all submodules

git submodule add https://github.com/ocornut/imgui.git vendor/imgui
git submodule add https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.git vendor/VulkanMemoryAllocator
git submodule update --init --recursive
cd vendor/imgui
git checkout docking
cd ../..

Compile shaders

bash compileShaders.sh

Build project

cmake -B build -DCMAKE_BUILD_TYPE=Debug && cmake --build build

Run the project

./build/Kitagawa

Benchmark / Performance

sudo apt install heaptrack heaptrack-gui
heaptrack ./build/vxen
sudo perf record -F 9999 -g --call-graph=dwarf ./build/vxen
sudo perf record -F 9999 -g -e branches,branch-misses ./build/experiment
sudo perf stat -e branches,branch-misses ./build/experiment
sudo perf report
./build/experiment --benchmark_repetitions=5 --benchmark_report_aggregates_only=true

# Set all CPUs to the performance governor so frequency scaling does not skew profiling and benchmark results
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
sudo cpupower frequency-set -g performance
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

# Benchmark and compare
# Record a baseline, modify the same test block, rebuild, then rerun to record the contender
./build/experiment --benchmark_out=baseline.json --benchmark_out_format=json --benchmark_repetitions=20
./build/experiment --benchmark_out=contender.json --benchmark_out_format=json --benchmark_repetitions=20
python3 ~/benchmark/tools/compare.py benchmarks baseline.json contender.json

# Cache misses
perf stat -r 10 -e L1-dcache-loads,L1-dcache-load-misses,LLC-loads,LLC-load-misses ./build/experiment
perf stat -C 0-7 -r 10 -e L1-dcache-loads,L1-dcache-load-misses,LLC-loads,LLC-load-misses,l2_request.all,l2_request.miss ./build/experiment

Greedy mesh

Release mode

LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.0680099 ms (average) over 1000 iterations
LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.0712837 ms (average) over 1000 iterations
LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.0917326 ms (average) over 1000 iterations
LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.0814183 ms (average) over 1000 iterations
LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.100206 ms (average) over 1000 iterations
LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.0780194 ms (average) over 1000 iterations
LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.115702 ms (average) over 1000 iterations
LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.0519547 ms (average) over 1000 iterations
LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.151858 ms (average) over 1000 iterations

Debug mode

LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.250019 ms (average) over 1000 iterations
LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.239719 ms (average) over 1000 iterations
LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.347815 ms (average) over 1000 iterations
LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.358445 ms (average) over 1000 iterations
LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.372301 ms (average) over 1000 iterations
LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.396542 ms (average) over 1000 iterations
LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.204637 ms (average) over 1000 iterations
LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.462924 ms (average) over 1000 iterations
LOG /home/joshua/Youtube/VoxelEngine/src/Kitagawa/ChunkManager.cpp:88 (operator()): Took: 0.513749 ms (average) over 1000 iterations

TODO

Chunk should now use threads to greedy mesh
Chunk manager does not need a GreedyMesh function, rather it needs a "Prepare function". Prepare should mesh all dirty chunks
Once all chunks are prepared we need to set the CHUNK_MANAGER_FLUSH_RENDER
CreateDescriptorSets() is called in Scene when the screen resizes or a buffer resizes. I also need to call it once chunks are prepared, and only set the chunks that are available / have data.
Need to use draw indirect to draw chunks that have data, then use gl_DrawID in the shader to access the correct SVOBuffer. (Will be writing many buffers here, one for each chunk that has data. They MUST MATCH the draw indirect vertex buffer set, i.e. the vertex buffer and SVO buffer need to be from the same chunk so gl_DrawID can be used to index into the SVO storage buffer.)

layout(set = 0, binding = 0) readonly buffer SVOBuffers {
    uint data[];
} svoBuffers[MAX_CHUNKS];
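
The ordering requirement above can be sketched on the CPU side. This is a minimal illustration, not the project's code: `ChunkDrawInfo`, `DrawList` and `BuildDrawList` are hypothetical names, and `DrawIndirectCommand` only mirrors the field layout of `VkDrawIndirectCommand` so the sketch compiles without the Vulkan headers. The idea is to append the draw command and the SVO buffer entry in lockstep, so that draw `i` (the value `gl_DrawID` reports inside a multi-draw call) and `svoBuffers[i]` always come from the same chunk.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Mirrors the field layout of VkDrawIndirectCommand so the sketch compiles
// without the Vulkan headers.
struct DrawIndirectCommand {
    uint32_t vertexCount;
    uint32_t instanceCount;
    uint32_t firstVertex;
    uint32_t firstInstance;
};

// Hypothetical per-chunk record; not the project's actual Chunk type.
struct ChunkDrawInfo {
    uint32_t chunkId;      // which chunk this is
    uint32_t vertexCount;  // 0 means the chunk has no mesh data
    uint32_t firstVertex;  // offset of the chunk's vertices in the vertex buffer
};

struct DrawList {
    std::vector<DrawIndirectCommand> commands;  // would be uploaded to the indirect buffer
    std::vector<uint32_t> svoChunkIds;          // svoChunkIds[i] is the chunk behind svoBuffers[i]
};

// Emits one command per chunk that has data. Both vectors are appended in
// lockstep, so entry i of each refers to the same chunk.
inline DrawList BuildDrawList(const std::vector<ChunkDrawInfo>& chunks) {
    DrawList list;
    for (const ChunkDrawInfo& c : chunks) {
        if (c.vertexCount == 0) continue;  // skip chunks without data
        list.commands.push_back({c.vertexCount, 1u, c.firstVertex, 0u});
        list.svoChunkIds.push_back(c.chunkId);
    }
    assert(list.commands.size() == list.svoChunkIds.size());
    return list;
}
```

The descriptor array would then be written in the order given by `svoChunkIds`, and `commands.size()` would be the draw count passed to the indirect draw call.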

NOTE: Prepare() is just a quick idea. Ideally, we don't want to check a dirty flag on every chunk to find out which ones to prepare. There will mostly only ever be 1 (the edited voxel's chunk) + LOD chunk changes (in the future).
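
One way to avoid scanning every chunk is to record chunks at the moment they become dirty. The sketch below assumes chunks can be addressed by a dense index; `DirtyChunkTracker`, `MarkDirty` and `TakeDirty` are hypothetical names, and the class is single-threaded as written (it would need a lock or a per-thread list if voxel edits come from several threads).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical tracker: records chunks as they become dirty so Prepare()
// visits only those chunks instead of checking every chunk's flag.
class DirtyChunkTracker {
public:
    explicit DirtyChunkTracker(std::size_t chunkCount) : m_IsDirty(chunkCount, 0) {}

    // Called from the voxel edit path. Repeated marks of the same chunk are ignored.
    void MarkDirty(uint32_t chunkIndex) {
        assert(chunkIndex < m_IsDirty.size());
        if (m_IsDirty[chunkIndex]) return;
        m_IsDirty[chunkIndex] = 1;
        m_Pending.push_back(chunkIndex);
    }

    // Returns the chunks to re-mesh and resets the tracker.
    std::vector<uint32_t> TakeDirty() {
        for (uint32_t index : m_Pending) m_IsDirty[index] = 0;
        std::vector<uint32_t> out;
        out.swap(m_Pending);
        return out;
    }

private:
    std::vector<uint8_t> m_IsDirty;   // one flag per chunk, used for de-duplication
    std::vector<uint32_t> m_Pending;  // chunk indices in the order they were marked
};
```

With this shape, the cost of Prepare() scales with the number of edited chunks rather than with the total chunk count.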

NOTE: May want to look into Quiescent State-Based Reclamation

When TSignal::Consume(0, CHUNK_MANAGER_FLUSH_UPDATE) is called, start processing dirty chunks with m_World->Flush(). Once Flush() is called, set a std::atomic<uint64_t> to the maximum number of threads we will be using. As each thread finishes, it acquires the atomic and checks whether the value is 1. If it is 1, call TSignal::Set(0, CHUNK_MANAGER_FLUSH_RENDER). In Scene.h, if the signal is on, update the descriptor sets for all dirty chunks and add an indirect draw command for each chunk. Use the svoBuffers mentioned in point 5 above.
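
The countdown described above can be sketched with `std::atomic::fetch_sub`, which returns the value held before the decrement; the thread that observes 1 is therefore the last one to finish. `RunAndSignal` and its callback parameters are hypothetical; in the engine, `onAllDone` would presumably be the call to TSignal::Set(0, CHUNK_MANAGER_FLUSH_RENDER), and the threads would come from an existing pool rather than being created per flush.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Runs work(i) on workerCount threads. The worker that moves the counter
// from 1 to 0 is the last to finish and invokes onAllDone exactly once.
inline void RunAndSignal(uint64_t workerCount,
                         const std::function<void(uint64_t)>& work,
                         const std::function<void()>& onAllDone) {
    assert(workerCount > 0);
    std::atomic<uint64_t> remaining{workerCount};
    std::vector<std::thread> workers;
    workers.reserve(workerCount);
    for (uint64_t i = 0; i < workerCount; ++i) {
        workers.emplace_back([&, i] {
            work(i);
            // fetch_sub returns the value held before the decrement.
            // acq_rel makes the other workers' writes visible to the last one.
            if (remaining.fetch_sub(1, std::memory_order_acq_rel) == 1) {
                onAllDone();
            }
        });
    }
    for (std::thread& t : workers) t.join();
}
```

Using a single read-modify-write avoids the race that a separate "load, then compare" would have when two threads finish at nearly the same time.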

Dealing with the padding for greedy meshing is too hard. I need to query out the bit from the neighbouring engine. Solution: each chunk keeps track of its padding. When the chunk manager sets voxels, the padding should be calculated.

TODO: Need to create an allocator. Use one VkBuffer for vertices. Use indirect draw for drawing different offsets in that buffer.

TODO: There is a glitch every time we need to resize the buffer and copy its content, but only when the buffer size is small. The default buffer size in Buffer.h is 1 KB; need to increase it to something more reasonable or set it wherever I create a buffer.
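
A possible shape for the allocator is a sub-allocator that only tracks vertex ranges inside one large buffer. The sketch below is an assumption about how it could look, not the planned design: `VertexRangeAllocator` is a hypothetical name, it uses a simple first-fit free list, and it does not touch Vulkan at all. The returned offset is what would go into the `firstVertex` field of each indirect draw command.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical sub-allocator: hands out [first, first + count) vertex ranges
// inside one large vertex buffer. It tracks offsets only; the buffer itself
// would be created once elsewhere.
class VertexRangeAllocator {
public:
    explicit VertexRangeAllocator(uint32_t capacity) { m_Free[0] = capacity; }

    // First-fit search. On success writes the first vertex of the range to
    // outFirst and returns true; returns false if no free block is large enough.
    bool Allocate(uint32_t count, uint32_t& outFirst) {
        assert(count > 0);
        for (auto it = m_Free.begin(); it != m_Free.end(); ++it) {
            if (it->second < count) continue;
            const uint32_t first = it->first;
            const uint32_t remaining = it->second - count;
            m_Free.erase(it);
            if (remaining > 0) m_Free[first + count] = remaining;
            outFirst = first;
            return true;
        }
        return false;
    }

    // Returns a range to the free list and merges it with adjacent free blocks.
    void Free(uint32_t first, uint32_t count) {
        auto it = m_Free.emplace(first, count).first;
        auto next = std::next(it);
        if (next != m_Free.end() && it->first + it->second == next->first) {
            it->second += next->second;
            m_Free.erase(next);
        }
        if (it != m_Free.begin()) {
            auto prev = std::prev(it);
            if (prev->first + prev->second == it->first) {
                prev->second += it->second;
                m_Free.erase(it);
            }
        }
    }

private:
    std::map<uint32_t, uint32_t> m_Free;  // first vertex -> block length
};
```

Sizing the backing buffer generously up front means a chunk re-mesh only frees and allocates a range, which sidesteps the resize-and-copy path that causes the glitch.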

Create a Signal
- CHUNK_MANAGER_FLUSH_UPDATE
  - Signals when a chunk has changed.
  - Chunk Requirements:
    - 1 u8 | dirty flag | based on Set(), Clear() & Flatten()
    - 1 u8 | full or empty | based on whether vertices were generated for underground voxels; there may be 1 block that is hidden on all sides & tracked by Set() & Clear()
    - 6 u64 | LOD selected | based on coord 0,0,0 is where the player is
    - 5 bytes padding left unused
  - CHUNK_MANAGER_FLUSH_VERTICES
    - Signals when vertices are ready to be rendered. | the indirect draw command would have changed.
    - Greedy mesh.
      - Check if full or empty
      - Check distance from 0,0,0 to select LOD
      - Generate LOD if it was not generated already. | the chunk keeps track of this (6 bits somewhere else for lods already existing)
      - Check if voxels exist or not | may generate an empty vector if chunk is underground
      - Update the chunk's lookup table data | LOD selected, full or empty, dirty flag
    - Upload vertices to the GPU | Synchronize with the preprocessor indirect buffer
    - Set the signal
  - CHUNK_MANAGER_FLUSH_PREPROCESSOR
    - Signals when FlatNodes & other requirements are ready for the preprocessor compute shader.
    - Generate FlatNodes for Chunks
    - Generate LUT
    - Upload to GPU for processing
    - Try to cull underground voxels from flat nodes | This will enable both CPU compute and GPU compute to run in parallel. Save nanoseconds!
    - Preprocessor
      - Culls chunks hidden from the camera view | Frustum culling
      - Builds indirect draw buffer | Uses the LUT & FlatNode Id to figure out LOD offsets
    - Update? indirect draw buffer
    - Set the signal!!
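
The signals above behave like set-once, consume-once flags. A minimal sketch of that behaviour, assuming the signal is a single atomic word of bits, is shown below. `SignalBits` is a hypothetical stand-in and not the project's TSignal (whose calls in this README take an extra first argument), and the flag values are placeholders chosen for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Flag values are placeholders chosen for this sketch.
constexpr uint32_t CHUNK_MANAGER_FLUSH_UPDATE       = 1u << 0;
constexpr uint32_t CHUNK_MANAGER_FLUSH_VERTICES     = 1u << 1;
constexpr uint32_t CHUNK_MANAGER_FLUSH_PREPROCESSOR = 1u << 2;

// Hypothetical stand-in for the project's TSignal: one atomic word of flags.
class SignalBits {
public:
    // Producer side: raise a flag once its data is ready.
    void Set(uint32_t flag) {
        assert(flag != 0);
        m_Bits.fetch_or(flag, std::memory_order_release);
    }

    // Consumer side: clear the flag and report whether it was raised.
    // Intended to be called with a single flag. Because the clear is one
    // atomic operation, only one caller sees true for a given Set().
    bool Consume(uint32_t flag) {
        assert(flag != 0);
        const uint32_t before = m_Bits.fetch_and(~flag, std::memory_order_acquire);
        return (before & flag) == flag;
    }

private:
    std::atomic<uint32_t> m_Bits{0};
};
```

The release on Set() and acquire on Consume() mean that whatever the producer wrote before raising the flag is visible to the thread that consumes it.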

If it's possible to process both CPU and GPU work in parallel use this to synchronize work:

Use a memory barrier to synchronize the compute shader with the draw command
If the CPU work is not done, don't submit the draw command
Once the CPU work is done, submit the draw command; it waits on the memory barrier and then executes with both the CPU and GPU data available

There seems to be a memory issue, and it's definitely in Buffer::Upload(). Most likely it happens because we use a command buffer to copy memory, so we may update or create a new buffer before the command has executed. There are issues with calling Buffer::Upload() in a loop the way I do. The problem seems to be that reallocating more memory causes an explosion: if I change the buffer size from 1024 bytes to 64 MB, it reallocates only once and no explosion happens. The problem went away when I called Upload() only once; initially I was calling it in a for loop, which is definitely not how I want to do it. So does Buffer::Upload() itself have a problem?
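
If the cause really is that the old buffer gets replaced while a recorded copy still refers to it (this is only the hypothesis stated above, not a confirmed diagnosis), a common remedy is to postpone destruction until the GPU has finished the relevant submission. The sketch below shows the bookkeeping only; `DeferredReleaseQueue`, `Retire` and `Collect` are hypothetical names, and the submission ids are assumed to be a counter the renderer increments per queue submit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical helper: postpones destruction of a resource until the GPU has
// finished the submission that may still read it.
class DeferredReleaseQueue {
public:
    // submissionId is the id of the last submission that uses the resource.
    // Ids are expected to be non-decreasing across calls.
    void Retire(uint64_t submissionId, std::function<void()> destroy) {
        assert(m_Pending.empty() || m_Pending.back().submissionId <= submissionId);
        m_Pending.push_back({submissionId, std::move(destroy)});
    }

    // Call with the newest submission id known to have completed (for example
    // after a fence wait). Destroys everything retired at or before that id
    // and returns how many resources were released.
    std::size_t Collect(uint64_t completedId) {
        std::size_t released = 0;
        while (!m_Pending.empty() && m_Pending.front().submissionId <= completedId) {
            m_Pending.front().destroy();
            m_Pending.pop_front();
            ++released;
        }
        return released;
    }

private:
    struct Entry {
        uint64_t submissionId;
        std::function<void()> destroy;
    };
    std::deque<Entry> m_Pending;
};
```

With something like this, a resize inside Upload() would create the larger buffer, record the copy, and retire the old buffer instead of destroying it immediately.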

There are two problems

SVO RCU is causing memory to go up but not come down after Sync()
- Temporarily fixed by removing copies when changes are made. Need to add this back in the future
- May want to remove RCU entirely.
Buffer resize causes bugs if the size is too small and we call Upload() in quick succession

TODO:

Multiple Flattened SVO to the GPU
- Chunk Manager ChunkSO - Chunk VoxelSO

struct Header {
    uint id; // The id is a 1D index into a 3D grid (x + (y * 32 + (z * 32 * 32)))
    uint count; // Number of nodes
    uint offset; // The offset index from where this chunk's svo starts.
};

struct FlatNode {
    uint PackedIDC;
    uint ChildIndex;
};

layout(std430, set = 1, binding = 50) readonly buffer SparseOctreeLUT {
    Header headers[];
};

// NOTE: same set/binding as SparseOctreeLUT above; separate buffers need distinct bindings
layout(std430, set = 1, binding = 50) readonly buffer SparseOctree {
    FlatNode nodes[];
};
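
On the CPU side, the header table for these two buffers could be filled as sketched below. The index formula and the 32-wide grid come from the `id` comment in `Header`; everything else (`ToChunkId`, `FromChunkId`, `ChunkNodes`, `BuildHeaders`) is a hypothetical helper introduced for illustration. Each chunk receives a contiguous slice of the shared node array, with `offset` computed as a running sum of node counts.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint32_t kGridSize = 32;  // grid edge length, taken from the id comment

struct Header {
    uint32_t id;      // x + y * 32 + z * 32 * 32
    uint32_t count;   // number of nodes in this chunk's SVO
    uint32_t offset;  // index of the chunk's first node in the shared node array
};

inline uint32_t ToChunkId(uint32_t x, uint32_t y, uint32_t z) {
    assert(x < kGridSize && y < kGridSize && z < kGridSize);
    return x + y * kGridSize + z * kGridSize * kGridSize;
}

// Inverse of ToChunkId.
inline void FromChunkId(uint32_t id, uint32_t& x, uint32_t& y, uint32_t& z) {
    x = id % kGridSize;
    y = (id / kGridSize) % kGridSize;
    z = id / (kGridSize * kGridSize);
}

// Hypothetical input: a chunk id and how many flat nodes its SVO produced.
struct ChunkNodes {
    uint32_t id;
    uint32_t nodeCount;
};

// Assigns each chunk a contiguous slice of the node array via a running sum.
inline std::vector<Header> BuildHeaders(const std::vector<ChunkNodes>& chunks) {
    std::vector<Header> headers;
    headers.reserve(chunks.size());
    uint32_t offset = 0;
    for (const ChunkNodes& c : chunks) {
        headers.push_back({c.id, c.nodeCount, offset});
        offset += c.nodeCount;
    }
    return headers;
}
```

In the shader, a chunk's nodes would then be read as `nodes[header.offset + i]` for `i` in `[0, header.count)`.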

