TLAS and MultitypeSet by SimonDanisch · Pull Request #14 · JuliaGeometry/Raycore.jl

SimonDanisch · 2026-03-02T11:23:01Z

This adds:

- GPU two-level acceleration (TLAS/BLAS): instanced BVH with per-instance transforms, a TLAS/StaticTLAS split (mutable for construction, immutable isbits for kernel traversal), and Adapt.jl for CPU→GPU transfer
- MultiTypeSet: GPU-safe heterogeneous collection with compile-time type-stable dispatch via with_index, enabling multiple material/texture types without dynamic dispatch on the GPU
- GPU utilities: @get/@set SoA macros, for_unrolled/map_unrolled/reduce_unrolled for compile-time loop unrolling, and FastClosure for GPU-safe closures
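As a rough illustration of what compile-time unrolling means here, the sketch below expands a fixed-count loop and a fold over a heterogeneous tuple into straight-line code with `@generated`. The names `for_unrolled_sketch` and `reduce_unrolled_sketch` are hypothetical stand-ins; the signatures and implementation of Raycore's actual `for_unrolled`/`reduce_unrolled` are not shown in this PR text and may differ.

```julia
# Hypothetical names; a sketch of compile-time unrolling, not Raycore's code.

# Emits f(1); f(2); ...; f(N) as N separate statements, so the compiled
# code contains no loop.
@generated function for_unrolled_sketch(f, ::Val{N}) where {N}
    body = Expr(:block)
    for i in 1:N
        push!(body.args, :(f($i)))
    end
    push!(body.args, :nothing)
    return body
end

# Left fold over a (possibly heterogeneous) tuple, expanded into the
# nested expression op(op(op(init, items[1]), items[2]), ...).
@generated function reduce_unrolled_sketch(op, init, items::Tuple)
    ex = :init
    for i in 1:fieldcount(items)
        ex = :(op($ex, items[$i]))
    end
    return ex
end

acc = Ref(0)
for_unrolled_sketch(i -> (acc[] += i), Val(4))          # adds 1 + 2 + 3 + 4
total = reduce_unrolled_sketch(+, 0.0, (1, 2.5f0, 3.0)) # mixed element types
println(acc[], " ", total)
```

Because the iteration count and the tuple's element types are known when the method is generated, each call site in the expanded code can be resolved statically, which is the property GPU compilers need.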

…12)

SetKey.type_idx was changed from UInt8 to UInt32 for LLVM/SPIR-V compatibility, but the @generated with_index function still compared against UInt8 literals. Since Julia's === checks both value and type, UInt32(1) === UInt8(1) is always false, causing all branches to fall through to the default (first material). This made every object render with the same material.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
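The failure mode can be reproduced in a few lines. The sketch below is a simplified, hypothetical analogue of a `with_index`-style dispatcher (the names `SetKeySketch`, `with_index_sketch`, `Matte`, `Emissive`, and `shade` are invented for illustration and are not Raycore's definitions). It builds one `===` branch per tuple slot and takes care that each literal has the same integer type as the key field.

```julia
# == compares numeric value; === also requires identical types.
println(UInt32(1) == UInt8(1))   # true
println(UInt32(1) === UInt8(1))  # false: same value, different type

# Hypothetical material types for the demonstration.
struct Matte; albedo::Float32; end
struct Emissive; power::Float32; end
shade(m::Matte) = m.albedo
shade(m::Emissive) = 10f0 * m.power

# Hypothetical key: which typed vector, and which element inside it.
struct SetKeySketch
    type_idx::UInt32
    vec_idx::UInt32
end

# One branch per tuple slot; falls back to slot 1 if nothing matches.
@generated function with_index_sketch(f, data::Tuple, key::SetKeySketch)
    ex = :(f(data[1][key.vec_idx]))      # default branch
    for i in fieldcount(data):-1:1
        idx = UInt32(i)                  # literal type must match type_idx
        ex = :(key.type_idx === $idx ? f(data[$i][key.vec_idx]) : $ex)
    end
    return ex
end

data = ([Matte(0.5f0)], [Emissive(2f0), Emissive(3f0)])
result = with_index_sketch(shade, data, SetKeySketch(2, 2))
println(result)
```

If `idx` were built with `UInt8(i)` instead, every comparison would be false and each lookup would silently take the default branch, which matches the "every object renders with the first material" symptom described above.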

On Metal, device pointers (Core.LLVMPtr) stored inside GPU buffers cannot be reliably dereferenced by kernels. The inline data (root_aabb) reads correctly, but following embedded pointers to per-BLAS node/primitive arrays returns zeros.

Replace the pointer-based BLAS architecture in StaticTLAS with:
- BLASDescriptor: lightweight struct with nodes_offset, primitives_offset, root_aabb
- Flat concatenated arrays (all_blas_nodes, all_blas_prims) built from per-BLAS GPU arrays
- Offset-based indexing in closest_hit/any_hit traversal

Management kernels (update_tlas_leaf_aabbs_kernel!, etc.) still use blas_array but only read root_aabb (inline, unaffected).

Verified: CPU and Metal produce identical results (mean pixel ~0.327 on 3-sphere test scene).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
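The offset-based layout can be sketched on the CPU as follows. This is an illustration under assumptions, not the PR's code: `BLASDescriptorSketch`, `flatten_blas`, and `blas_node` are hypothetical names, node and primitive elements are plain integers, offsets are assumed to be 0-based, and `root_aabb` is left out for brevity.

```julia
# Hypothetical descriptor: offsets into shared arrays replace embedded pointers.
struct BLASDescriptorSketch
    nodes_offset::UInt32       # number of nodes stored before this BLAS
    primitives_offset::UInt32  # number of primitives stored before this BLAS
end

# Concatenate per-BLAS arrays and record where each BLAS starts.
function flatten_blas(per_blas_nodes::Vector{Vector{T}},
                      per_blas_prims::Vector{Vector{P}}) where {T,P}
    all_nodes = T[]
    all_prims = P[]
    descs = BLASDescriptorSketch[]
    for (nodes, prims) in zip(per_blas_nodes, per_blas_prims)
        push!(descs, BLASDescriptorSketch(length(all_nodes), length(all_prims)))
        append!(all_nodes, nodes)
        append!(all_prims, prims)
    end
    return descs, all_nodes, all_prims
end

# Local 1-based node index i of one BLAS resolved against the flat array.
blas_node(all_nodes, d::BLASDescriptorSketch, i) = all_nodes[d.nodes_offset + i]

descs, all_nodes, all_prims = flatten_blas([[10, 11, 12], [20, 21]], [[1, 2], [3]])
first_node_of_second_blas = blas_node(all_nodes, descs[2], 1)
println(first_node_of_second_blas)
```

Since the descriptor holds only integers, it stays isbits and a kernel needs just the two flat arrays plus the descriptor array as arguments, with no pointer chasing through buffer contents.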


Pkg.test() defaults to --check-bounds=yes which injects error paths that can't compile to SPIR-V. GPU tests now auto-skip with @test_broken when bounds checking is forced. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
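One way such a guard could look is sketched below. It is an assumption about the approach, not the test code from this PR: `bounds_checks_forced` and `run_gpu_kernel_test` are hypothetical names, and `Base.JLOptions()` is an internal, undocumented accessor whose `check_bounds` field is 0 by default, 1 for `--check-bounds=yes`, and 2 for `--check-bounds=no`.

```julia
using Test

# Assumption: read the command-line setting via the internal options struct.
bounds_checks_forced() = Base.JLOptions().check_bounds == 1

# Hypothetical stand-in for launching a real GPU kernel and checking its result.
run_gpu_kernel_test() = true

@testset "GPU kernels" begin
    if bounds_checks_forced()
        # Kernel cannot compile with forced bounds checks; record it as broken
        # instead of failing the whole suite.
        @test_broken false
    else
        @test run_gpu_kernel_test()
    end
end
```

Recording the case with `@test_broken` keeps it visible in the test summary, so the skipped coverage is not forgotten when running through `Pkg.test()`.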

- viewfactors_content.md: Build visualization mesh from TLAS primitives instead of raw merged mesh. The TLAS removes degenerate triangles, so face counts differed (202 vs 250), causing FaceView verification error.
- gpu_raytracing_tutorial.md: bvh.primitives → bvh.all_blas_prims (StaticTLAS field was renamed in TLAS/BLAS refactor)
- bvh_hit_tests_content.md: Fix swapped benchmark labels (closest_hit was showing any_hit timing and vice versa), remove empty section header, fix test numbering (3 not 4)
- instanced-bvh-architecture.md: Replace broken example using TriangleMesh/inv_translate with working high-level TLAS API
- raytracing_tutorial_content.md: Fix "Analougus" → "Analogous"
- README.md: Add MultiTypeSet and GPU TLAS to features list
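The face-count mismatch in the first item follows directly from filtering. The sketch below uses a zero-area test as the degeneracy criterion, which is an assumption; the criterion Raycore's TLAS construction actually applies is not stated here, and `is_degenerate` is a hypothetical helper.

```julia
using LinearAlgebra

# Assumed criterion: a triangle is degenerate when its area is (near) zero.
is_degenerate(a, b, c; tol = 0.0) = norm(cross(b - a, c - a)) <= tol

vertices = [[0.0, 0, 0], [1.0, 0, 0], [0.0, 1, 0], [2.0, 0, 0]]
faces = [(1, 2, 3),   # proper triangle
         (1, 2, 4),   # three collinear points
         (1, 1, 3)]   # repeated vertex

kept = [f for f in faces
        if !is_degenerate(vertices[f[1]], vertices[f[2]], vertices[f[3]])]
println(length(faces), " faces in, ", length(kept), " kept")
```

Any per-face data built from the unfiltered mesh then has more entries than the acceleration structure has primitives, so the visualization mesh has to be derived from the filtered primitives.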

Results (400×720, 4spp, 6014 triangles):
- Wavefront GPU: 2.7 ms (winner, 223x vs CPU baseline)
- Tiled (32×16): 7.5 ms
- Tiled (32×32): 8.3 ms
- Unrolled: 14.2 ms
- Baseline GPU: 16.2 ms
- Tiled (8×8): 16.7 ms
- Wavefront CPU: 97.0 ms
- Baseline CPU: 602.7 ms
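The quoted 223x figure is the CPU baseline time divided by the wavefront GPU time. A small script, using only the timings listed above, computes the same ratio for every variant:

```julia
# Timings in milliseconds, copied from the results above.
timings_ms = [
    "Wavefront GPU" => 2.7,
    "Tiled (32×16)" => 7.5,
    "Tiled (32×32)" => 8.3,
    "Unrolled"      => 14.2,
    "Baseline GPU"  => 16.2,
    "Tiled (8×8)"   => 16.7,
    "Wavefront CPU" => 97.0,
    "Baseline CPU"  => 602.7,
]

baseline_ms = last(timings_ms).second   # Baseline CPU
speedups = [name => round(baseline_ms / t; digits = 1) for (name, t) in timings_ms]

for (name, s) in speedups
    println(rpad(name, 16), s, "x")
end
```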

The example scene uses Y-up geometry (floor at y=-1.5), but the wavefront renderer defaulted to camera_up=Vec3f(0,0,1) (Z-up), producing an upside-down/rotated view. Also fix camera_lookat to look along +Z, matching the simple camera used by other kernels.
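The effect of a mismatched up vector can be seen from a generic look-at basis. The sketch below is a textbook construction with illustrative eye and target positions, not Raycore's camera code; `camera_basis` is a hypothetical helper.

```julia
using LinearAlgebra

# Generic look-at basis: forward toward the target, right = forward × up,
# and the image-space up re-orthogonalized as right × forward.
function camera_basis(eye, lookat, up)
    forward = normalize(lookat - eye)
    right = normalize(cross(forward, up))
    true_up = cross(right, forward)
    return (; forward, right, up = true_up)
end

eye = [-5.0, 0.0, 0.0]      # illustrative values only
lookat = [0.0, 0.0, 0.0]
scene_up = [0.0, 1.0, 0.0]  # the scene's vertical axis (Y-up)

y_up = camera_basis(eye, lookat, [0.0, 1.0, 0.0])
z_up = camera_basis(eye, lookat, [0.0, 0.0, 1.0])

# How much of the scene's vertical axis ends up along the image's vertical axis.
println(dot(y_up.up, scene_up), " ", dot(z_up.up, scene_up))
```

With the Y-up hint the scene's vertical axis coincides with the image's vertical axis. With the Z-up hint it is perpendicular to it, so the floor appears rotated by 90 degrees. An up hint parallel to the view direction would make the cross product vanish, which is why the look direction and the up vector have to be chosen together.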

Shows how to enable hw_accel=true with Lava backend, explains the architecture (extract-trace-shade pipeline), includes benchmark comparison between SW BVH and HW RT on materials scene (20 spheres, AMD RX 7900 XTX). Honest results: parity on simple scenes, HW RT advantage on complex geometry (3.5M+ triangles).

AMD RX 7900 XTX via AMDGPU.jl, dragon mesh (249K tris) + procedural. Key findings: Raycore 3.5-20x faster for ray tracing (single-pass closest-hit with early termination vs two-pass BV candidate list). ImplicitBVH 2-5x faster for BVH build (simpler construction).

SimonDanisch and others added 30 commits December 22, 2025 19:23

add better camera

3f2e5c5

get things working

7360498

fixes tests and docs

cb3ca12

bvh4 experiment

b3312da

bvh4

cd1701c

Merge branch 'sd/gpu-instanced-bvh' of https://github.com/JuliaGeomet…

ad827a0

…ry/Raycore.jl into sd/gpu-instanced-bvh

fixes

4d56d99

unrolling and gpu tools

3d579cd

refactor our unroll strategy

058e724

getindex unrolled

286c7a2

implement multitype vec

0203b6d

api refactor

a4a3651

refactor

7494706

fixes

09d5d76

add mapreduce

6dc4e3a

renaming and fixes

59d3498

add deref for array for more uniform handling

819d1a4

add comment

2c5cb2c

small fixes

12b2ccb

improve updating support

ab4184c

fix empty blas?

23e0465

change for per triangle meta

70b2ea3

allow submesh materials

0bdbb63

refactor and cleanup

875706a

fix setkey for OpenCL

3e6b3c8

use less depth

f132fbe

Merge branch 'sd/multitype-vec' of https://github.com/JuliaGeometry/R…

7c0123f

…aycore.jl into sd/multitype-vec

polish for release

46133d2

SimonDanisch added 2 commits March 2, 2026 11:41

Merge remote-tracking branch 'origin/sd/multitype-vec' into jk/fix-me…

b8b2be5

…tal-blas-pointers

final cleanup and removal of unused apis

1712d5e

SimonDanisch mentioned this pull request Mar 2, 2026

Sd/gpu instanced bvh #11

Closed

SimonDanisch and others added 5 commits March 2, 2026 16:31

use tlas consistently

21678d7

synchronize?

f41d746

Skip GPU kernel tests under --check-bounds=yes

f0f57ed


SimonDanisch force-pushed the sd/multitype-vec branch from 2d3d32f to dc3c159 Compare March 28, 2026 11:45

SimonDanisch added 3 commits March 28, 2026 12:51

SimonDanisch wants to merge 40 commits into master from sd/multitype-vec


2 participants
