Release v0.5.0

Latest

@swahtz swahtz released this 02 Jul 08:09
v0.5.0
4e192dd

What's Changed

  • Bump version to 0.3.1 by @swahtz in #303
  • shared memory types in IntegrateTSDF kernel by @swahtz in #307
  • Viewer now supports multiple scenes by @phapalova in #308
  • Update nanovdb_editor version. by @areidmeyer in #309
  • Implementing morton and hilbert for Grid and GridBatch ijk by @blackencino in #311
  • Google analytics in docs by @fwilliams in #312
  • minor edits by @kmuseth in #305
  • Fix radix sort prefetch for disjoint sorts by @matthewdcong in #315
  • Rename _Cpp python extension library binary to _fvdb_cpp, include pybind11 headers first by @harrism in #313
  • Morton/Hilbert fixes by @blackencino in #316
  • Active grid coords cleanup by @blackencino in #318
  • Use smaller runtime images for non-build CI steps by @matthewdcong in #320
  • Make nightly workflows not run on forks of openvdb/fvdb-core by @harrism in #319
  • Changed the _Cpp.pyi filename to _fvdb_cpp.pyi by @blackencino in #322
  • Convolution default fixed, extensive tests by @blackencino in #321
  • Morton/Hilbert module level standalone functions by @blackencino in #323
  • conda environment files: switch to conda-forge torchvision by @swahtz in #327
  • Fix documentation for setting TORCH_CUDA_ARCH_LIST by @matthewdcong in #328
  • Eliminate snake_case in GaussianTileIntersection.cu by @harrism in #330
  • Fix GHA error in (currently unused) nightly workflows by @harrism in #329
  • Disable running tests for Draft PRs by @swahtz in #339
  • build.sh debug build fixes by @swahtz in #338
  • Add viz bindings for wait and add_image by @phapalova in #332
  • Render all contributing gaussian IDs/weights by @swahtz in #340
  • Fix Gaussian rasterization shared memory alignment by @swahtz in #342
  • [Bug Fix] Improve binary search handling in JIdxForJOffsets by @iYuqinL in #325
  • Debug build gtest fix by @swahtz in #344
  • GaussianProjectionForward: fix camera data loading that exceeds blockDim by @swahtz in #345
  • Support backgrounds in Gaussian Rasterization by @harrism in #343
  • Remove tests.yml 'push' trigger by @swahtz in #347
  • Temporarily disable nightly benchmark/test job by @swahtz in #346
  • Fix potential oversubscription of nvcc * cmake parallel threads by @swahtz in #351
  • Hot fixes for Jcat errors by @blackencino in #352
  • Fix inverted logic on computing abs(gradient) in backward Gaussian rasterization by @harrism in #355
  • A pedantic pull request by @swahtz in #359
  • Fix JaggedTensor.from_*_and_list_ids ldim=2 issue by @swahtz in #357
  • Gradients added to Convolution Ground Truth Unit Tests by @blackencino in #358
  • JaggedTensor::unbind*: Reduce number of blocking GPU->CPU copies by reusing cached lsizes by @swahtz in #360
  • Use scaled_dot_product_attention operator from Torch, remove our SDPA by @swahtz in #364
  • Added backwards/gradient tests to the default convolution unit tests. by @blackencino in #361
  • Add lineinfo option to build by @harrism in #367
  • Create and use getMaxSharedMemory utility by @harrism in #368
  • Remove unnecessary stream sync in GaussianTileIntersection by @harrism in #370
  • plumb sparse rendering functions by @fwilliams in #348
  • MCMC gaussian splatting relocation kernel and unit tests by @harrism in #374
  • MCMC add noise kernel and gtests by @harrism in #377
  • Bindings and Pytests for MCMC functions by @harrism in #394
  • Expose min_opacity parameter in Gaussian MCMC relocate function. by @harrism in #396
  • Expose k and t parameters in MCMC relocation by @fwilliams in #402
  • Optimize joffsets construction via pinned memory by @matthewdcong in #403
  • NVIDIA Branding in docs by @fwilliams in #405
  • Fix hardcoded float dtype by @matthewdcong in #406
  • Prefetch fused SSIM outputs to avoid write page faults by @matthewdcong in #407
  • Fix the docstrings of fvdb.viz.Scene.camera_orbit_direction by @swahtz in #398
  • Fix NaNs in rasterizeTopContributingGaussianIdsForward weights by @swahtz in #400
  • Plumb Sparse Gaussian TileIntersection by @swahtz in #401
  • Fix/improve radix sort synchronization by @matthewdcong in #409
  • Fix the derivation of the number of cameras in the rasterization kernels that would be incorrect when in packed mode. by @swahtz in #412
  • CMake: create nanovdb_editor_BINARY_DIR before using it as a working directory by @harrism in #416
  • Fix crash loading GaussianPly files to CPU device by @swahtz in #417
  • Switch radix sort merge from host to device side synchronization by @matthewdcong in #415
  • Pin PyTorch version in CI to 2.9.1 by @matthewdcong in #424
  • computeSparseInfo small optimization by @swahtz in #428
  • Fix semantics of GaussianTileIntersection's torch::cumsum by @swahtz in #427
  • Fix version macro and PyTorch 2.10 build error by @matthewdcong in #423
  • Multi-Axis Dispatch Framework by @blackencino in #418
  • SampleGridTrilinear: Vectorized float4 loads by @swahtz in #430
  • Implement PrivateUse1 (multiGPU) support for MCMC kernels by @harrism in #421
  • Rasterize contributing Gaussian ID kernels optimizations by @swahtz in #429
  • Expose evaluate_spherical_harmonics in Python bindings by @swahtz in #431
  • Fix: GaussianSphericalHarmonicsBackwards race condition for cameras/batch-size > 1 by @swahtz in #434
  • Update CUDA 13 nightly test image by @matthewdcong in #438
  • Disable two GCC 13.3 warnings by @matthewdcong in #439
  • Improve PyTorch build configuration time by @matthewdcong in #441
  • Switch to device-centric synchronization for forEach mGPU by @matthewdcong in #440
  • Fix chain rule for log_scale gradient in projection backward pass by @harrism in #433
  • Gaussian Projection with the Unscented Transform by @fwilliams in #420
  • Build speedups, added trace and optional pip forced install by @blackencino in #443
  • Fix: GaussianProjectionBackward dLossDQuat missing warpSum by @swahtz in #435
  • Optimize Gaussian tile intersection for mGPU by @matthewdcong in #446
  • dispatch framework for_each, views, and tag canonicalization by @blackencino in #452
  • Fix: ProjectionForward initializes accessors in both the initializer list and constructor body by @swahtz in #450
  • ProjectedGaussianSplats opacities uses expand/view and accessors instead of per-camera copies by @swahtz in #451
  • Remove unused sparse convolution backends by @blackencino in #454
  • Add developer worktree tools for parallel development workflows by @harrism in #445
  • Add AGENTS.md for AI agent guidance by @harrism in #455
  • Fix fvdb-issue clipboard crash in SSH sessions by @harrism in #461
  • Fix CI checks showing 'waiting for status' when paths-ignore skips workflows by @swahtz in #462
  • Fix JaggedTensor single-element constructor unconditionally initializing CUDA via pinned_memory by @swahtz in #468
  • CI: Skip stopping runners if starting runners was skipped by @harrism in #471
  • Skip stopping runners that were skipped for CUDA 12.8 and 13.0 tests. by @harrism in #472
  • Add no-argument interactive mode to fvdb-issue by @harrism in #466
  • Sparse Conv Default full feature support by @blackencino in #473
  • Rasterization using 3d gaussians by @fwilliams in #444
  • Update nanovdb to version 32.9.0 and refine grid type checks by @swahtz in #475
  • Nightly build and publish action by @swahtz in #477
  • Nightly publish fix for non-existent nightly packages by @swahtz in #478
  • Updated nightly build install docs by @swahtz in #481
  • Downgrade nano to 32.8.0 by @swahtz in #482
  • Fix multibatch mGPU race condition in SH backwards op by @matthewdcong in #484
  • Upgrade openvdb git tag for NanoVDB 32.9.0 by @swahtz in #483
  • Fix datatype in backward projection test by @matthewdcong in #486
  • Switch from inverse to linalg_inv_ex to avoid sync by @matthewdcong in #487
  • Refactor CameraIntrinsics constructor and add missing include by @blackencino in #489
  • SampleGridTrilinear optimization: stencil + sample by @swahtz in #474
  • Refactor Gaussian rendering to use composable camera model for projection and ray generation by @fwilliams in #485
  • Add masks support to all Gaussian render methods by @swahtz in #480
  • Update NanoVDB to v32.9.1 by @swahtz in #493
  • Feature/op consolidation by @blackencino in #492
  • SaveNanoVDB: Fix voxel size/origin metadata on serialized index grids by @swahtz in #490
  • viewer fix for notebook by @zlalena in #350
  • Disable unreliable rasterization deadlock test by @fwilliams in #498
  • SimpleUnet bug fixes by @swahtz in #496
  • Added the github CLI to the conda dev environment by @blackencino in #500
  • Handle duplicate pixels in sparse pixel gaussian rendering by @harrism in #488
  • Add bfloat16 support to JaggedTensor reduce operators by @swahtz in #501
  • Add seed initialization in TestSimpleUNet class by @swahtz in #502
  • Add unit tests for fvdb.nn modules by @swahtz in #497
  • Pin CI checkout refs to immutable commit SHAs to fix build/test skew by @swahtz in #503
  • Docs, examples, notebooks updates by @swahtz in #504
  • Version class by @swahtz in #507
  • Improve/optimize mGPU scaling via batched prefetching and sorting changes by @matthewdcong in #499
  • Switch 32-bit tensor index accessors to 64-bit across all ops by @harrism in #505
  • Fix CCCL version check macro by @matthewdcong in #509
  • Add devtools script to report unanswered external issues by @harrism in #510
  • Fix flaky test_jsum_list_of_lists bfloat16 test by @fwilliams in #517
  • Update CONDA_OVERRIDE_CUDA to 13.0 by @swahtz in #519
  • Add Slack output format and daily CI workflow for unanswered issues report by @harrism in #513
  • Fix insider issues filtering for CI unanswered issues script by @harrism in #522
  • Release Process Updates by @swahtz in #525
  • Update CI workflow to include git installation in system dependencies by @swahtz in #526
  • Add release branching docs and automation scripts by @harrism in #512
  • Fix smoke test Python setup and open release PR as draft by @harrism in #528
  • Make start-release.sh idempotent for safe re-runs by @harrism in #529
  • Run unit tests only for matching test_environment.yml config by @harrism in #531
  • Restore fix for dLossDQuat missing warpSum by @matthewdcong in #533
  • Fix quaternion gradient accumulation in GaussianProjectionJaggedBackward by @swahtz in #534
  • Fix publish workflow: Rocky Linux 8 containers and single unit test job by @harrism in #536
  • Fix publish.yml python install action by @swahtz in #537
  • Update publish.yml to include additional system dependencies by @swahtz in #538
  • Align publish.yml build containers with tests.yml (Rocky Linux 8 / manylinux_2_28) by @swahtz in #540
  • publish: dual S3 + PyPI publish on release (with GPU tests) by @swahtz in #545
  • Improve mGPU partitioning for Gaussian projection operators by @matthewdcong in #547
  • Improve mGPU partitioning for SH operators by @matthewdcong in #546
  • nightly-publish fix errors for missing tools by @swahtz in #549
  • Check for zero intersection case in tile intersection prefetch by @matthewdcong in #553
  • Add automated doc version updates to release scripts by @harrism in #552
  • finish-release-process.sh updates to preserve release/* branch integrity by @swahtz in #544
  • Update NanoVDB Editor to latest. by @areidmeyer in #556
  • Add CHANGES.md by @swahtz in #539
  • Upgrade conda env files to gcc/gxx 14.3 by @swahtz in #557
  • Replace Slack issue report with event-driven issue triage labels by @harrism in #551
  • Added nanovdb-editor as an optional dependency by @swahtz in #559
  • 0.4.2 Changelog and Docs update by @swahtz in #565
  • Docs deployment workflow fix by @swahtz in #566
  • Fuse computeGradientState into projectionBackwardsKernel by @matthewdcong in #560
  • PyTorch 2.11 support for venv CI by @matthewdcong in #561
  • Reduce shared memory usage in pinhole projection by re-arranging blocks by @matthewdcong in #555
  • Centralize Github workflow and doc version configuration into shared config by @swahtz in #569
  • Add camera_fov getter/setter to fvdb.viz.Scene by @swahtz in #558
  • Remove torchsparse from all environment and CI configurations by @swahtz in #572
  • Add cache clearing step in nightly publish workflow by @swahtz in #574
  • Remove torch_scatter dependency in favor of built-in PyTorch scatter_reduce_ by @swahtz in #571
  • Remove vestigial setup.py and GitLab CI config by @swahtz in #570
  • Hotfix release process improvements by @swahtz in #563
  • Update Gaussian splatting camera API and world-space parity by @fwilliams in #518
  • Retire C++ GridBatch wrapper; add functional API and Grid class by @blackencino in #582
  • scaled_dot_product_attention support for additional Torch backends (Flash Attention) by @swahtz in #365
  • Fixes code samples in Sphinx docs that should have border by @swahtz in #577
  • Reimplement JaggedReduce ops with PyTorch scatter_reduce_ by @swahtz in #578
  • Eliminate .item() synchronization stalls in hot C++ paths by @swahtz in #586
  • Pass current CUDA stream to all kernel launches by @swahtz in #587
  • Add TEACHME interactive lesson documents for fvdb core API by @harrism in #584
  • Fix missing CUDA device guards in kernel-launching functions by @swahtz in #589
  • [CI] Get nanovdb-editor from pip instead of from the built whl by @phapalova in #581
  • Use pip package of nanovdb-editor by @phapalova in #580
  • Upgrade clang-tools to 21 to fix clangd SIGSEGV on CUDA files by @fwilliams in #491
  • Move Gaussian splatting autograd and pipeline logic from C++ to Python by @fwilliams in #595
  • Fix tutorial docs: move from wip, fix broken APIs, add CI testing by @harrism in #592
  • Fix additional sync point introduced in autograd change by @matthewdcong in #599
  • Materialize repeated opacities for compatibility with multiple splatting implementations by @matthewdcong in #600
  • Restore useful comments from autograd/pipeline refactor by @swahtz in #603
  • Fix weighted average in TSDF integration to apply pixelWeight to new samples by @jinhwanlazy in #588
  • Improve tutorial content based on review feedback by @harrism in #598
  • Environment/Docs/URL Updates by @swahtz in #605
  • Initialize gradient accumulation tensors before UT projection path by @harrism in #608
  • Versioned Documentation: Documentation integration with Read the Docs by @swahtz in #610
  • CI: GH Actions Version Updates by @swahtz in #611
  • Fixes to Sphinx Docs Build by @swahtz in #613
  • Documentation: Add pre_build job to Read the Docs configuration for version generation by @swahtz in #615
  • Documentation: Fix Read the Docs build and resolve Sphinx warnings by @swahtz in #618
  • Refactor Gaussian splatting ops and extract utility functions by @fwilliams in #596
  • Implement GitHub Actions workflow for Sphinx documentation build test by @swahtz in #622
  • Documentation: Improve version label contrast in sidebar and fix reality-capture URLs by @swahtz in #623
  • Documentation: Switch docs redirect to point to main RTD URL by @swahtz in #625
  • CI: Revert failing drop cache step by @phapalova in #629
  • Fix URLs in the README to point to the ReadtheDocs site by @swahtz in #626
  • Promote GridBatchData to public header by @swahtz in #632
  • Fix fvdb.viz.PointCloudView use of older API by @swahtz in #631
  • Reorder Gaussian2D to improve field alignment by @matthewdcong in #624
  • Remove unused SH function by @matthewdcong in #630
  • Add CMake installation support for public headers and configuration files by @swahtz in #633
  • Fix inject_from CUDA crash when source grid has 0 voxels by @harrism in #616
  • Disambiguate CI job names and fix Torch CMake header path by @swahtz in #635
  • Add sample_nearest operator for GridBatch and Grid by @swahtz in #628
  • Generalize volume_render to N channels by @swahtz in #636
  • Fix build issue with SampleNearest by @swahtz in #637
  • Add __launch_bounds__ to forEach CUDA kernels by @swahtz in #638
  • ci(nightly): anchor nightly version to upcoming release in pyproject.toml by @swahtz in #645
  • CI: fix nightly wheel build by @phapalova in #634
  • Add Vec2 and double fast paths to SampleGridTrilinear by @swahtz in #639
  • Optimizations for volume_render and move its autograd layer to Python by @swahtz in #640
  • Speed up builds with ccache, host PCH, and trimmed torch headers by @swahtz in #644
  • NanoVDB loading: fix mixed grid type loading and add read_metadata API by @swahtz in #641
  • [docs] Nightly build version numbering update for installation documentation by @swahtz in #646
  • ci(publish): set short cache-control and invalidate CloudFront on index uploads by @swahtz in #647
  • saveNVDB Optimizations by @swahtz in #650
  • Shared memory optimizations for Gaussian rasterization by @matthewdcong in #554
  • Avoid repeated delta computation by @matthewdcong in #651
  • Add warp level early exit for forward rasterization by @matthewdcong in #658
  • More optimal prefetching for mGPU Gaussian splatting by @matthewdcong in #657
  • Fix Conda build failure by @matthewdcong in #661
  • Improve parity with gsplat for dense rasterization by @matthewdcong in #659
  • CMake: Link exported Torch target by @swahtz in #662
  • fix: silence SyntaxWarning and tensor copy-construct UserWarning in tests by @mvanhorn in #654
  • Improve mGPU Gaussian tile intersection by @matthewdcong in #664
  • ray_implicit_intersection improvements by @swahtz in #663
  • Improve prefetch granularity for rasterization kernels by @matthewdcong in #665
  • Upgrade to PyTorch 2.11 by @swahtz in #573
  • Add narrow-band SDF reinitialize/retopologize ops by @swahtz in #669
  • Update dev_environment.yml to newer openusd version by @zlalena in #667
  • CI: Removed the pytorch upper-bound version in pyproject.toml by @swahtz in #671
  • CI Token Best Practices Sweep by @swahtz in #672
  • Workflow Security: scope bundled shellcheck to real issues by @swahtz in #674
  • CODEOWNERS: require NVIDIA maintainer review for governance/CI files by @harrism in #676
  • CI: correct the change-detection gate so docs-only PRs skip cleanly by @harrism in #677
  • v0.5 Release: Update CHANGES.md to reflect contributions since 0.4 by @swahtz in #675
  • fix: TensorGrid blind-data lookup uses index 0 instead of loop counter by @mvanhorn in #652
  • docs: fix marching_cubes return type (unique vertex indices, not normals) by @mvanhorn in #653
  • Add CHANGES.md entries by @swahtz in #679
  • CI: bump uv to 0.11.26 so Python 3.14 builds use stable CPython by @swahtz in #681
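
Several entries above (#311, #316, #323) add Morton ordering for ijk coordinates. For readers unfamiliar with the idea, here is a minimal pure-Python sketch of 3D Morton (Z-order) encoding. It is not the fvdb implementation: the function names, the 21-bit width, the choice of i as the least significant axis, and the restriction to non-negative coordinates are all assumptions made for illustration.

```python
def morton_encode_3d(i: int, j: int, k: int, bits: int = 21) -> int:
    """Interleave the low `bits` bits of non-negative i, j, k into one code.

    Assumed bit layout (illustrative only): bit b of i lands at position 3*b,
    bit b of j at 3*b + 1, and bit b of k at 3*b + 2.
    """
    if min(i, j, k) < 0:
        raise ValueError("this sketch only handles non-negative coordinates")
    code = 0
    for b in range(bits):
        code |= ((i >> b) & 1) << (3 * b)
        code |= ((j >> b) & 1) << (3 * b + 1)
        code |= ((k >> b) & 1) << (3 * b + 2)
    return code


def morton_decode_3d(code: int, bits: int = 21) -> tuple[int, int, int]:
    """Inverse of morton_encode_3d under the same assumed bit layout."""
    i = j = k = 0
    for b in range(bits):
        i |= ((code >> (3 * b)) & 1) << b
        j |= ((code >> (3 * b + 1)) & 1) << b
        k |= ((code >> (3 * b + 2)) & 1) << b
    return i, j, k


# Sorting coordinates by their Morton code keeps spatially nearby voxels
# close together in the sorted order.
coords = [(3, 5, 6), (0, 0, 1), (1, 0, 0), (1, 1, 1)]
ordered = sorted(coords, key=lambda c: morton_encode_3d(*c))
```

Sorting by such a key is one common way to improve memory locality for sparse voxel data; a Hilbert curve serves the same purpose with a different traversal order.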

New Contributors

Full Changelog: v0.3.0...v0.5.0