7.0
Overview
This release of Falcor provides the following significant improvements and new features:
- Falcor as a Python extension, allowing Falcor to be used directly from Python.
- CUDA interop, including sharing buffers and synchronization between Falcor and CUDA.
- PyTorch interop, allowing Falcor to be used for implementing PyTorch functions.
- Differentiable Slang and various examples including a BSDF optimizer and a differentiable path tracer.
- Support for Shader Execution Reordering (SER) in the path tracer.
Dependencies
- Update slang to version 2023.3.20.
- Update nvapi to version R535.
- Update DLSS to version 3.5.
Build System
- Enable more MSVC warnings.
- Add FALCOR_ENABLE_ASSERTS CMake option.
- Remove FALCOR_REPORT_EXCEPTION_AS_ERROR CMake option.
Assets
- Fix media files to use relative paths.
- Move volume test scenes to media package.
- Move grey_and_white_room scene to falcor_media.
- Replace Cerberus with CesiumMan.
Examples
- Add a simple example showing interop between Falcor and PyTorch, learning to represent an image from a set of Gaussians.
- Add Slang-based BC7 compressor.
DiffSlang
- Add a BSDFOptimizer that can run simple inverse rendering for solving material parameters without any path tracing.
- Add a differentiable evalAD function to IMaterialInstance and a differentiable setupDiffMaterialInstance to IMaterial.
- Implement evalAD and setupDiffMaterialInstance for PBRTDiffuseMaterial. Other types of materials have naive placeholders for now.
- Add a helper class GradientIOWrapper to set up the dataflow of scene gradients during backpropagation.
- Mark ShadingData and ShadingFrame as IDifferentiable.
- Add a unit test to check evalAD of PBRTDiffuseMaterial.
- Add helper classes for differentiable rendering:
  - SceneManager: get/set scene parameters (only applies to the albedo value of PBRTDiffuse for now).
  - SceneGradients: store scene gradients from backpropagation.
Python API
- Add interface for passing in an already created Device to Testbed.
- Add Python bindings for accessing Buffer as a PyTorch tensor.
- Add Python bindings for CUDA/Falcor sync.
- Add Python bindings for Program, ProgramDesc and ComputePass.
- Add a simple example using compute shaders from Python.
- Fix EnvMap Python constructor (constructors can't return null, which we did when the file wasn't found).
- Improve Testbed:
  - Add shader reloading on F5
  - Fix profiler toggling and render profiler even if UI is disabled
  - Fix show_ui
  - Add render_texture for setting a texture to be rendered on the window
  - Add should_close and close methods for handling shutdown
  - Cleanup + comments
- Add Python bindings for program reflection types.
- Add Python bindings for ShaderVar.
- Add Python bindings to Texture for reading/writing subresources as numpy arrays.
- Add Python bindings to Buffer class:
  - Buffer properties
  - to_numpy/from_numpy to convert to/from numpy arrays
- Add Python bindings to Device class:
  - create_buffer, create_typed_buffer and create_structured_buffer
- Add initial Python bindings to CopyContext class.
- Add Python unit tests for buffer creation, writing and reading from/to numpy arrays.
- Add pybind11::falcor_enum helper to allow binding enums that already have string infos (using FALCOR_ENUM_INFO).
- Include pybind11/functional.h to get correct typing information on std::function types.
- Add Python bindings for Sampler class.
- Add [] accessor to RenderGraph Python bindings, allowing access to a pass through g["PathTracer"], for example.
- Add basic ImGUI wrapper with Python bindings in falcor.ui submodule.
- Only render UI for render graph / scene when one is loaded.
- Add implicit conversion from Python lists to vector types (allow assigning [10, 20, 30] instead of float3(10, 20, 30)).
- Add Python bindings for creating profiler events.
Cleanup
- Cleanup use of $for.. loops in shaders.
- Add comment about usage of SlangCompilationFlags::DumpIntermediates.
- Rename computeNewRayOrigin() to computeRayOrigin().
- Rename setShaderData to bindShaderData.
- Add FALCOR_EXPORT_D3D12_AGILITY_SDK to all sample applications.
- Apply clang-format on most render passes.
- Remove PythonDictionary class.
- Rename InternalDictionary to Dictionary.
- Disable clang-format argument bin packing.
Error Handling
- Fix fstd::source_location::current() on MSVC.
- Take std::string_view as message on Exception and descendants. Simplify the exception classes as they don't need to do string formatting anymore.
- FALCOR_GFX_CALL (gfxReportError handler) now handles NVIDIA Aftermath crash dumps before calling reportFatalErrorAndTerminate.
- msgBox creates a window that is always top-most (otherwise we got hidden windows if the main window was not open / already closed).
- Fix getStackTrace (need to create the TraceResolver first).
- Assertions now throw AssertionError exceptions which translate to Python.
- Add support for string message (and formatting) with FALCOR_ASSERT.
- Add ErrorDiagnosticFlags to control how errors are reported:
  - BreakOnThrow enables breaking into attached debugger when calling FALCOR_THROW (break on call-site).
  - BreakOnAssert enables breaking into attached debugger when calling FALCOR_ASSERT (break on call-site).
  - AppendStackTrace enables appending a stack trace to the exception message when calling FALCOR_THROW and FALCOR_ASSERT.
  - ShowMessageBoxOnError enables showing a message box when calling the reportError family of functions.
- Change reportError into reportErrorAndContinue to better describe what it is doing. This function no longer terminates the application.
- Change reportErrorAndAllowRetry to return true if the user clicked Retry. This function no longer terminates the application.
- Change reportFatalError into reportFatalErrorAndTerminate to better describe what it is doing.
- Add catchAndReportAllExceptions that is now used in all sample applications to globally catch errors.
- Remove the local exception catching in SampleApp::run.
- Fix places where we previously relied on application termination when using reportErrorAndAllowRetry. We now throw an exception.
- Replace calls to reportError with one of these: exceptions, logging, or showing a message box.
- Consolidate Core/Assert.h, Core/Errors.h and Core/ErrorHandling.h into Core/Error.h.
- Simplify error handling conventions in Falcor:
  - Use static_assert and FALCOR_ASSERT for assertions.
  - Use FALCOR_THROW and FALCOR_CHECK to throw RuntimeError exception.
  - Remove ArgumentError.
  - Remove checkArgument and checkInvariant and just use FALCOR_CHECK.
  - Replace all use of the FALCOR_CHECK_ARG_XXX macros with just FALCOR_CHECK.
  - Make FALCOR_UNIMPLEMENTED and FALCOR_UNREACHABLE throw exceptions instead of being assertions.
- Adjust conventions in error-handling.md.
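The conventions above (checks that throw a runtime error, plus diagnostic flags that change how the error is reported) can be illustrated with a small Python analogy. This is only a sketch: the `ErrorDiagnosticFlags` enum and the `check` function below are hypothetical stand-ins written for this example, not Falcor's C++ macros or its Python bindings, and only the stack-trace flag is modeled.

```python
import enum
import traceback


class ErrorDiagnosticFlags(enum.Flag):
    """Hypothetical Python stand-in for the flags described in the list above."""
    NONE = 0
    BREAK_ON_THROW = 1
    BREAK_ON_ASSERT = 2
    APPEND_STACK_TRACE = 4
    SHOW_MESSAGE_BOX_ON_ERROR = 8


def check(condition, message, flags=ErrorDiagnosticFlags.NONE):
    """Raise RuntimeError when the condition is false (loosely modeled on FALCOR_CHECK)."""
    if condition:
        return
    if ErrorDiagnosticFlags.APPEND_STACK_TRACE in flags:
        # Append the current call stack to the message, as AppendStackTrace is described to do.
        message += "\n\nStack trace:\n" + "".join(traceback.format_stack())
    raise RuntimeError(message)


# Usage: a failed check surfaces as an exception that the caller can catch globally.
try:
    check(False, "buffer size must be positive", ErrorDiagnosticFlags.APPEND_STACK_TRACE)
except RuntimeError as error:
    print(str(error).splitlines()[0])
```

The point of the sketch is that reporting policy lives in the flags, while call sites only state the condition and the message.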
Core
- Rename RenderContext::flush to RenderContext::submit.
- Rename Device::flushAndSync to Device::wait.
- Move state object creation to Device and simplify description structs.
- Remove profiler calls around Swapchain::present() (these calls crash on Vulkan and we probably don't need them so let's just get rid of them).
- Remove ProgramDesc::languagePrelude and Program::setLanguagePrelude.
- Add error checks for texture resource view creation that the right bind flags are set.
- Rename TextureManager::TextureHandle to TextureManager::CpuTextureHandle to avoid name clash with GPU-side TextureHandle.
- Add convenience functions to convert between CPU and GPU texture handles.
- Update TextureManager for safe resolve of UDIMs.
- Disable logging to a file when using Falcor from Python. Without this we litter the runtime folder with python.exe.X.log files.
- Add native AdapterLUID and AdapterInfo classes in Device.h.
- Rename GpuFence to Fence.
- Introduce FenceDesc and Fence::getDesc.
- Add CopyContext::signal and CopyContext::wait to handle fence signaling and waiting on the command queue.
- Remove GpuFence::syncGpu, instead clients need to call CopyContext::wait.
- Rename GpuFence::syncCpu to Fence::wait.
- Introduce signaled value which represents the last signaled value of the fence (this replaces the CPU value, which was always the last signaled value + 1).
- Introduce Fence::kAuto which replaces the std::optional we used before to differentiate between signaling specific fence values or auto-incrementing.
- Add Fence::updateSignaledValue which any signaler (host, device, external) can use to update the signaled value and/or get the auto-incremented value to signal.
- Introduce timeout when waiting on fence on host.
- Improve error handling for setting variables in ParameterBlock and through ShaderVar.
  - Throw exception when trying to bind a resource to a variable of a different type.
  - Throw exception when trying to bind a resource to a SRV that is not created with the ShaderResource flag.
  - Throw exception when trying to bind a resource to a UAV that is not created with the UnorderedAccess flag.
  - Throw exception when trying to get a resource from a variable of a different type.
  - Throw exception when trying to set a uniform variable with a different size/type.
- Reduce reference counting overhead for type information.
  - Lifetime is tied to the ParameterBlockReflection object. The TypedShaderVarOffset has a non-owning pointer to the type.
  - ShaderVar does not own either the pointer to ParameterBlock (same as before) or the type information.
- Add FALCOR_GFX_CALL checks to GFX dispatch calls.
- Use std::string_view for shader variable and reflection lookups, avoiding a lot of heap allocations for constructing temporary std::string objects.
- Falcor uses SM6_6 by default now, so the explicit calls for 6_5 are no longer required.
- Add ShaderDesc::fromFile and ShaderDesc::fromString to reduce temporary copies (and improve readability).
- Cleanup ShaderModule constructors and usage.
- Rename downstreamCompilerArgs back to compilerArguments as this is the better name (these are command-line options passed to slang, not to the downstream compiler; we just most often use them for that).
- Remove ProgramDesc::addShaderSource.
- Remove ProgramDesc::getMaxTraceRecursionDepth.
- Use Shader Model 6.6 by default, or the most recent one supported by the device.
- Add checks in render passes that need minimum shader model.
- Throw in Program constructor if the requested shader model is not supported.
- Create Types.h for common graphics types.
- Move Device::ShaderModel to ShaderModel in Types.h.
- Move ShaderType to Types.h and remove ShaderType.h.
- Use ShaderModel in ProgramDesc.
- Remove RtProgram (merge functionality into Program).
  - Add raytracing pipeline properties to ProgramDesc: maxTraceRecursionDepth, maxPayloadSize, maxAttributeSize, rtPipelineFlags
- Move remaining code in RtProgram onto Program for now (getRtso)
  - Will be refactored later
- Replace Program::CompilerFlags with SlangCompilerFlags.
- Replace Program::Desc with ProgramDesc.
- Replace Program::ShaderModule with ProgramDesc::ShaderModule.
- Replace Program::ShaderModuleList with ProgramDesc::ShaderModuleList.
- Replace Program::TypeConformanceList with TypeConformanceList.
- Remove ComputeProgram and GraphicsProgram and use Program instead.
  - Add Program::createCompute and Program::createGraphics helpers.
- Remove ComputeVars and GraphicsVars and use ProgramVars instead.
- Refactor Program::Desc into ProgramDesc that has a more reasonable structure:
  - A ShaderModule is a list of sources and a module name that gets compiled into a separate translation unit (using same terminology as slang).
  - Get rid of the createNewTranslationUnit flag, which was just a weird way to split the global source list into multiple modules.
  - An EntryPointGroup is a list of entry points in a specific shader module. Before, Program::Desc had a flat list of entry points and various other places (sources, groups) pointing to it.
  - All fields on the ProgramDesc are public. In theory one could create a description directly, without using the builder helper functions.
  - Get rid of internal state for building (active group index, etc.).
- Move various Texture::create functions to Device::createTexture functions.
- Move various Buffer::create functions to Device::createBuffer functions.
- Remove createStructured function that takes a pProgram.
- Add Device::createStructuredBuffer that takes a ReflectionType.
- Set global defines in ProgramManager constructor.
- Rename Buffer::CpuAccess to MemoryType:
  - CpuAccess::None -> MemoryType::DeviceLocal
  - CpuAccess::Write -> MemoryType::Upload
  - CpuAccess::Read -> MemoryType::ReadBack
- Cleanup GpuMemoryHeap to also use MemoryType.
- Add a readback memory heap to Device.
- Make CpuAccess enum have different semantics:
  - CpuAccess::None means DeviceLocal type memory (no access from CPU).
  - CpuAccess::Write means Upload type memory (write access from CPU).
  - CpuAccess::Read means Readback type memory (read access from CPU).
  - These will be renamed to MemoryType::DeviceLocal, MemoryType::Upload and MemoryType::Readback in a later MR.
- The Buffer class now represents a fixed piece of memory and not some potentially transient piece of memory on a heap.
- Buffer::map now only works on buffers that have the correct memory type.
- MapType::WriteDiscard is deprecated.
- Add Buffer::getBlob to read back memory from a buffer.
- Add Buffer::getElement<T> and Buffer::getElements<T> helper functions.
- Replace lots of map/unmap code on device local buffers (which is now illegal) to use Buffer::getElement(s).
- Use rotating vertex buffers / vaos on TextRenderer and Gui to avoid stalling on writes.
- Implement manual constant buffer handling in NRD.
  - Add D3D12ConstantBufferView constructor taking a memory address + size.
- Rework BufferAccessTests to test the new semantics.
- Replace Sampler::create with Device::createSampler.
- Move Sampler::Filter to TextureFilteringMode.
- Move Sampler::AddressMode to TextureAddressingMode.
- Move Sampler::ReductionMode to TextureReductionMode.
- Remove Sampler::ComparisonMode and use ComparisonFunc instead.
- Rename Sampler::Desc::comparisonMode to Sampler::Desc::comparisonFunc.
- Remove DepthStencilState::Func alias and use ComparisonFunc instead.
- Deprecate Resource::BindFlags and use ResourceBindFlags instead.
- Refactor asset path resolution:
  - Add AssetResolver class for resolving asset paths.
  - Remove global data file directories in OS.h/OS.cpp.
  - Remove path resolution in all of the createFromFile functions.
  - Move asset path resolution to the Python bindings that are used in .pyscene files.
- Use absolute paths for loading data files in both application code and unit tests.
- Report downstream shader compilation time.
- Add getProjectDirectory() that returns the absolute path to the root of the project directory.
- Rename _PROJECT_DIR_ to FALCOR_PROJECT_DIR and use CMAKE_SOURCE_DIR.
- Cleanup getInitialShaderDirectories() and getInitialDataDirectories().
- Add FALCOR_ENABLE_PROFILER CMake option.
- Remove FalcorConfig.h.
- Remove FALCOR_ENABLE_LOGGER configuration option.
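The fence changes in this section (a stored "last signaled value" and a kAuto sentinel in place of an optional) can be pictured with a toy Python model. This is a sketch under stated assumptions, not Falcor's implementation: `ToyFence`, `K_AUTO` and the method names are hypothetical, the sentinel value is arbitrary, and no GPU, queue or timeout behavior is modeled.

```python
# Hypothetical sentinel standing in for Fence::kAuto (the real value is not taken from the source).
K_AUTO = 2**64 - 1


class ToyFence:
    """Toy model that tracks only the last signaled value of a fence."""

    def __init__(self, initial_value=0):
        # The signaled value is the last value that was signaled, not "last + 1".
        self.signaled_value = initial_value

    def update_signaled_value(self, value=K_AUTO):
        """Record a signal; K_AUTO auto-increments, any other value is stored as given."""
        if value == K_AUTO:
            self.signaled_value += 1
        else:
            self.signaled_value = value
        return self.signaled_value

    def signal(self, value=K_AUTO):
        """Any signaler (host, device, external) would go through the same update."""
        return self.update_signaled_value(value)


fence = ToyFence()
first = fence.signal()        # auto-increment from the initial value
explicit = fence.signal(10)   # signal a specific value
following = fence.signal()    # auto-increment continues from the explicit value
print(first, explicit, following)
```

A single sentinel argument keeps the call sites uniform: callers either pass a concrete value or leave the default, and the returned number is the value a waiter would need to wait for.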
CUDA
- Add CUDA shared memory holder to Buffer.
- Make cuda_utils::ExternalBuffer and cuda_utils::ExternalSemaphore keep non-owning pointer to the Falcor resource/fence. In the future we should replace that with weak_ref.
- Add CopyContext::waitForCuda and CopyContext::waitForFalcor to synchronize CUDA to Falcor and vice-versa.
- Use new CUDA synchronization in OptixDenoiser (improves perf with the denoised WavefrontPathTracer from 110fps to 130fps).
- Use new CUDA synchronization in TestPyTorchPass and make sure test_pytorch.py still succeeds.
- GFX doesn't support shared fences with Vulkan yet, so this new synchronization method currently only works with D3D12.
- Add support for importing Vulkan buffers.
- Remove cuda_utils::initCuda and cuda_utils::setCudaContext.
- Add cuda::utils::CudaDevice class for creating a CUDA device sharing the same adapter as the graphics device.
- Add Device::initCudaDevice() and Device::getCudaDevice() functions for initializing/getting a CUDA device sharing the same adapter as the graphics device.
- Switch to using initCudaDevice for all code that currently requires a CUDA device.
- Minor cleanup in CudaUtils.h/CudaUtils.cpp.
- Add CudaRuntime.h wrapper for fixing the vector type name clashes.
- Refactor CudaUtils.h and put its contents into the cuda_utils namespace.
- Move CudaBuffer helper class to OptixDenoiser which is the only client (the buffer class isn't great, so let's not encourage its use in other places).
- Consolidate the two FalcorCUDA modules with CudaUtils and use that.
Utilities
- Refactor PixelDebug:
  - Use ParameterBlock for keeping PixelDebug data.
  - Use RWByteAddressBuffer to manage buffer counters (works in Vulkan).
  - Use combined readback buffer for all data (counters + record buffer).
  - General cleanup.
Scene
- Remove old nullTracePass workaround.
- Replace getParameterBlock with setShaderData.
- Add a fallback tangent generation to ensure that valid degenerate inputs (collapsed triangles without NaNs) always produce valid outputs (no NaNs).
- Fix a bug where the base mesh was assumed to have triangleCount faces rather than its actual count.
- Parallelized UV mesh tiles creation.
- Fixed a regression in skinning normal computation.
- Change setCacheMeshes to add meshes incrementally, allowing more than one source of CachedMeshes.
- Fixed Loop subdiv safeguards incorrectly checking for all-triangle meshes.
- Remove pybind11 dependency in importers.
- Split general USD utilities from USDImporter into their own libraries so they can be reused.
- Fix enabling/disabling animations.
- Use upload heap for writing instance descs for TLAS build/update.
- Add debug prints of SceneBuilder content, used to compare whether various importers loaded the same data.
- Add small efficiencies in ingesting cached data (allowing it to be ingested incrementally).
- Changed skeletal matrices from doubles to floats.
- Fix a bug where curves of length 2 could access out of bounds memory.
- Fix a bug where the curve frame would accumulate error in normal and binormal, causing the frame to stop being orthonormal.
- Add checks to make sure the curve frame is unit length.
- Update Scene to recreate its parameter block when scene defines change.
to recreate its parameter block when scene defines change. - Update rendering code to always bind the latest scene block.
- Fix a bug where curves that taper to 0 width would break if converted to polygons. We now clamp the width to min float16 value.
- Fix a bug where all USD curves always had animation, even if they had just one keyframe
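The curve-frame fixes above concern a frame that drifts away from being orthonormal as error accumulates. One common remedy is to re-orthonormalize the frame with a Gram-Schmidt step and then verify unit length. The sketch below shows that generic technique in plain Python; it is an illustration only, and the helper names are hypothetical rather than taken from Falcor's scene code.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    length = math.sqrt(dot(v, v))
    if length == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(x / length for x in v)


def reorthonormalize(tangent, normal):
    """Rebuild an orthonormal (tangent, normal, binormal) frame from drifted inputs."""
    t = normalize(tangent)
    # Gram-Schmidt: remove the component of the normal that lies along the tangent.
    n = normalize(tuple(nc - dot(normal, t) * tc for nc, tc in zip(normal, t)))
    # The binormal is derived from the other two axes, so it cannot drift on its own.
    b = cross(t, n)
    return t, n, b


def is_orthonormal(frame, eps=1e-6):
    """Check that all axes are unit length and mutually perpendicular."""
    t, n, b = frame
    unit = all(abs(dot(v, v) - 1.0) < eps for v in (t, n, b))
    perpendicular = all(abs(dot(u, v)) < eps for u, v in ((t, n), (t, b), (n, b)))
    return unit and perpendicular


# Usage: a slightly skewed frame is repaired and then passes the check.
frame = reorthonormalize((0.0, 0.0, 2.0), (0.1, 1.0, 0.05))
print(is_orthonormal(frame))
```

Recomputing the binormal from the tangent and normal at every step, instead of propagating all three vectors, is what keeps the error from accumulating.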
Materials
- Material parameter serialization and reflection (used for inverse rendering).
- Make StandardMaterial differentiable.
- Make PBRTConductor differentiable.
- Introduce EmissiveMaterialsChanged material update flag to notify when emissive materials change.
- Recreate LightCollection if emissive materials change.
- Refactor MaterialSystem::update() to clean up how updates are tracked and handled for some types of changes.
- Add support for dynamic materials and call update() unconditionally each frame for such.
- Update path tracers to recompile shaders less often on material changes.
- Point of entry subsurface for StandardMaterial.
- Fix desc count in MaterialSystem.
- Update material system to handle replacing materials at runtime.
- Add Material::getMaterialLayout().
Render Passes
- Implement setProperties for PathTracer render pass.
- Propagate options to subsystems when calling setProperties.
- Add reset methods to PathTracer for resetting frame counter.
- Adjust image test scripts to make use of reset and set_properties.
- Disable warning 30056 when compiling NRD shaders (short-circuit ? being deprecated).
- Update render passes and modules to handle Scene::UpdateFlags::RecompileNeeded.
Pathtracer
- Extend PathTracer with SER support.
with SER support. - Fix detection of total internal reflection.
Testing
- Enable more image tests for Vulkan.
- Make run_unit_tests and run_image_tests work if called from any directory other than tests.
- Add --run-only option to run_image_tests for slang testing.
- Cleanup StructuredBufferMatrix unit test.
- Break into debugger on failure when in debug mode.
- Add EXPECT_THROW and EXPECT_THROW_AS checks.
checks. - Allow running test scripts outside of a git clone.