Releases: NVIDIA/warp

v1.0.2

22 Mar 20:42

[1.0.2] - 2024-03-22

  • Make examples runnable from any location
  • Fix the examples not running directly from their Python file
  • Add the example gallery to the documentation
  • Update README.md examples USD location
  • Update example_graph_capture.py description

v1.0.1

15 Mar 17:32

[1.0.1] - 2024-03-15

  • Document Device total_memory and free_memory
  • Documentation for allocators, streams, peer access, and generics
  • Changed example output directory to current working directory
  • Added python -m warp.examples.browse for browsing the examples folder
  • Print where the USD stage file is being saved
  • Added examples/optim/example_walker.py sample
  • Make the drone example not specific to USD
  • Reduce the time taken to run some examples
  • Optimise rendering points with a single colour
  • Clarify an error message around needing USD
  • Raise exception when module is unloaded during graph capture
  • Added wp.synchronize_event() for blocking the host thread until a recorded event completes
  • Flush C print buffers when ending stdout capture
  • Remove more unneeded CUTLASS files
  • Allow setting mempool release threshold as a fractional value
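
The last item above allows the release threshold to be given as a fraction instead of a byte count. A minimal plain-Python sketch of how such a value might be resolved, assuming a float between 0 and 1 is read as a fraction of the device's total memory. `resolve_release_threshold` is a hypothetical helper for illustration, not a Warp function.

```python
def resolve_release_threshold(threshold, total_memory):
    """Convert a release threshold to a byte count.

    Hypothetical helper, not part of Warp. An int is taken as bytes;
    a float in [0, 1] is assumed to mean a fraction of total_memory.
    """
    if isinstance(threshold, float):
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("fractional threshold must be between 0 and 1")
        return int(threshold * total_memory)
    if threshold < 0:
        raise ValueError("byte threshold must be non-negative")
    return int(threshold)
```

With this reading, `0.5` on a device reporting 1024 bytes of total memory would resolve to 512 bytes, while an integer argument passes through unchanged.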

v1.0.0

08 Mar 01:58

[1.0.0] - 2024-03-07

  • Add FeatherstoneIntegrator which provides more stable simulation of articulated rigid body dynamics in generalized coordinates (State.joint_q and State.joint_qd)
  • Introduce warp.sim.Control struct to store control inputs for simulations (optional, by default the Model control inputs are used as before); integrators now have a different simulation signature: integrator.simulate(model: Model, state_in: State, state_out: State, dt: float, control: Control)
  • joint_act can now behave in 3 modes: with joint_axis_mode set to JOINT_MODE_FORCE it behaves as a force/torque, with JOINT_MODE_VELOCITY it behaves as a velocity target, and with JOINT_MODE_POSITION it behaves as a position target; joint_target has been removed
  • Add adhesive contact to Euler integrators via Model.shape_materials.ka which controls the contact distance at which the adhesive force is applied
  • Improve handling of visual/collision shapes in URDF importer so visual shapes are not involved in contact dynamics
  • Experimental JAX kernel callback support
  • Improve module load exception message
  • Add wp.ScopedCapture
  • Remove enable_backward warning for callables
  • Copy docstrings and annotations from wrapped kernels, functions, structs
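
The last item above concerns wrappers that keep the docstring and annotations of the object they wrap. The same idea can be shown in plain Python with `functools.wraps`. This is only an analogy: the `traced` decorator is hypothetical and Warp's own mechanism may differ.

```python
import functools


def traced(func):
    """Hypothetical decorator that forwards calls to func unchanged."""

    @functools.wraps(func)  # copies __doc__, __name__, annotations, etc.
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


@traced
def scale(x: float, factor: float = 2.0) -> float:
    """Multiply x by factor."""
    return x * factor
```

Without the `functools.wraps` line, `scale.__doc__` would be `None` and help tools would show the wrapper's signature rather than the original one.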

v0.15.1

06 Mar 02:49

[0.15.1] - 2024-03-05

  • Add examples assets to the wheel packages
  • Fix broken image link in documentation
  • Fix codegen for custom grad functions calling their respective forward functions
  • Fix custom grad function handling for functions that have no outputs
  • Fix issues when wp.config.quiet = True

v0.15.0

05 Mar 05:06

[0.15.0] - 2024-03-04

  • Add thumbnails to examples gallery
  • Apply colored lighting to examples
  • Moved examples directory under warp/
  • Add example usage to python -m warp.tests --help
  • Add torch.autograd.function example + docs
  • Add error-checking to array shapes during creation
  • Add example_graph_capture
  • Add a Diffsim Example of a Drone
  • Fix verify_fp causing compiler errors and support CPU kernels
  • Fix to enable matmul to be called in CUDA graph capture
  • Enable mempools by default
  • Update wp.launch to support tuple args
  • Fix BiCGSTAB and GMRES producing NaNs when converging early
  • Fix warning about backward codegen being disabled in test_fem
  • Fix assert_np_equal when NaNs and tolerance are involved
  • Improve error message to discern between CUDA being disabled or not supported
  • Support cross-module functions with user-defined gradients
  • Suppress superfluous CUDA error when ending capture after errors
  • Make output during initialization atomic
  • Add warp.config.max_unroll, fix custom gradient unrolling
  • Support native replay snippets using @wp.func_native(snippet, replay_snippet=replay_snippet)
  • Look for the CUDA Toolkit in default locations if the CUDA_PATH environment variable or --cuda_path build option are not used
  • Added wp.ones() to efficiently create one-initialized arrays
  • Rename wp.config.graph_capture_module_load_default to wp.config.enable_graph_capture_module_load_by_default
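
The CUDA Toolkit lookup described above (fall back to default locations when neither the `CUDA_PATH` environment variable nor the `--cuda_path` build option is given) could look roughly like the sketch below. `find_cuda_home`, the precedence of the build option over the environment variable, and the `/usr/local/cuda` default are assumptions for illustration; the actual build script may search different locations.

```python
import os


def find_cuda_home(explicit_path=None, env=None,
                   default_locations=("/usr/local/cuda",),
                   exists=os.path.isdir):
    """Return a CUDA Toolkit directory, or None if none is found.

    Hypothetical helper, not Warp's build code. Order assumed here:
    explicit build option, then CUDA_PATH, then default locations.
    """
    env = os.environ if env is None else env
    if explicit_path:
        return explicit_path
    if env.get("CUDA_PATH"):
        return env["CUDA_PATH"]
    for candidate in default_locations:
        # Only the fallback candidates are checked for existence.
        if exists(candidate):
            return candidate
    return None
```

The `env` and `exists` parameters are there only so the lookup can be exercised without touching the real environment or file system.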

[0.14.0] - 2024-02-19

  • Add support for CUDA pooled (stream-ordered) allocators
    • Support memory allocation during graph capture
    • Support copying non-contiguous CUDA arrays during graph capture
    • Improved memory allocation/deallocation performance with pooled allocators
    • Use wp.config.enable_mempools_at_init to enable pooled allocators during Warp initialization (if supported)
    • wp.is_mempool_supported() - check if a device supports pooled allocators
    • wp.is_mempool_enabled(), wp.set_mempool_enabled() - enable or disable pooled allocators per device
    • wp.set_mempool_release_threshold(), wp.get_mempool_release_threshold() - configure memory pool release threshold
  • Add support for direct memory access between devices
    • Improved peer-to-peer memory transfer performance if access is enabled
    • Caveat: enabling peer access may impact memory allocation/deallocation performance and increase memory consumption
    • wp.is_peer_access_supported() - check if the memory of a device can be accessed by a peer device
    • wp.is_peer_access_enabled(), wp.set_peer_access_enabled() - manage peer access for memory allocated using default CUDA allocators
    • wp.is_mempool_access_supported() - check if the memory pool of a device can be accessed by a peer device
    • wp.is_mempool_access_enabled(), wp.set_mempool_access_enabled() - manage access for memory allocated using pooled CUDA allocators
  • Refined stream synchronization semantics
    • wp.ScopedStream can synchronize with the previous stream on entry and/or exit (only sync on entry by default)
    • Functions taking an optional stream argument do no implicit synchronization for max performance (e.g., wp.copy(), wp.launch(), wp.capture_launch())
  • Support for passing a custom deleter argument when constructing arrays
    • Deprecation of owner argument - use deleter to transfer ownership
  • Optimizations for various core API functions (e.g., wp.zeros(), wp.full(), and more)
  • Fix wp.matmul() to always use the correct CUDA context
  • Fix memory leak in BSR transpose
  • Fix stream synchronization issues when copying non-contiguous arrays

[0.13.1] - 2024-02-22

  • Ensure that the results from the Noise Deform are deterministic across different Kit sessions

v0.13.0

16 Feb 23:42

[0.13.0] - 2024-02-16

  • Update the license to NVIDIA Software License, allowing commercial use (see LICENSE.md)
  • Add CONTRIBUTING.md guidelines (for NVIDIA employees)
  • Hash CUDA snippet and adj_snippet strings to fix caching
  • Fix build_docs.py on Windows
  • Add missing .py extension to warp/tests/walkthrough_debug
  • Allow wp.bool usage in vector and matrix types
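
The caching fix above (hashing the CUDA snippet and adj_snippet strings) rests on a simple idea: anything that changes the generated code must feed into the cache key. A minimal plain-Python sketch of that idea follows; `function_cache_key` is hypothetical and does not reproduce Warp's actual hashing scheme.

```python
import hashlib


def function_cache_key(name, snippet, adj_snippet=""):
    """Hypothetical cache key covering a function name and its snippets."""
    h = hashlib.sha256()
    for part in (name, snippet, adj_snippet):
        data = part.encode("utf-8")
        # Length-prefix each part so ("ab", "c") and ("a", "bc") differ.
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    return h.hexdigest()
```

If the snippet text were left out of the key, editing only the native code would produce the same key and a stale cached binary could be reused.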

[0.12.0] - 2024-02-05

  • Add a warning when the enable_backward setting is set to False upon calling wp.Tape.backward()
  • Fix kernels not being recompiled as expected when defined using a closure
  • Change the kernel cache appauthor subdirectory to just "NVIDIA"
  • Ensure that gradients attached to PyTorch tensors have compatible strides when calling wp.from_torch()
  • Add a Noise Deform node for OmniGraph that deforms points using a perlin/curl noise

v0.11.0

23 Jan 21:39

[0.11.0] - 2024-01-23

  • Re-release 1.0.0-beta.7 as a non-pre-release 0.11.0 version so it gets selected by pip install warp-lang.
  • Introducing a new versioning and release process, detailed in PACKAGING.md and resembling that of Python itself:
    • The 0.11 release(s) can be found on the release-0.11 branch.
    • Point releases (if any) go on the same minor release branch and only contain bug fixes, not new features.
    • The public branch, previously used to merge releases into and corresponding with the GitHub main branch, is retired.

[1.0.0-beta.7] - 2024-01-23

  • Ensure captures are always enclosed in try/finally
  • Only include .py files from the warp subdirectory into wheel packages
  • Fix an extension's sample node failing at parsing some version numbers
  • Allow examples to run without USD when possible
  • Add a setting to disable the main Warp menu in Kit
  • Add iterative linear solvers, see wp.optim.linear.cg, wp.optim.linear.bicgstab, wp.optim.linear.gmres, and wp.optim.linear.LinearOperator
  • Improve error messages around global variables
  • Improve error messages around mat/vec assignments
  • Support conversion of scalars to native/ctypes, e.g.: float(wp.float32(1.23)) or ctypes.c_float(wp.float32(1.23))
  • Add a constant for infinity, see wp.inf
  • Add a FAQ entry about array assignments
  • Add a mass spring cage diff simulation example, see examples/example_diffsim_mass_spring_cage.py
  • Add -s, --suite option for only running tests belonging to the given suites
  • Fix common spelling mistakes
  • Fix indentation of generated code
  • Show deprecation warnings only once
  • Improve wp.render.OpenGLRenderer
  • Create the extension's symlink to the core library at runtime
  • Fix some built-ins failing to compile the backward pass when nested inside if/else blocks
  • Update examples with the new variants of the mesh query built-ins
  • Fix type members that weren't zero-initialized
  • Fix missing adjoint function for wp.mesh_query_ray()
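
The first item in this release (captures always enclosed in try/finally) follows a general pattern: whatever happens inside the captured region, the matching "end" call must run. A plain-Python sketch of the pattern is shown below, with hypothetical `begin` and `end` callables standing in for the real capture calls.

```python
from contextlib import contextmanager


@contextmanager
def scoped_capture(begin, end):
    """Run begin(), yield to the body, and always run end() afterwards.

    begin and end are hypothetical callables used for illustration.
    """
    begin()
    try:
        yield
    finally:
        # Runs even if the body raises, so the capture is never left open.
        end()
```

An exception raised inside the `with scoped_capture(...)` body still propagates to the caller, but only after `end()` has been called.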

v1.0.0-beta.6

10 Jan 21:44
Pre-release

[1.0.0-beta.6] - 2024-01-10

  • Do not create CPU copy of grad array when calling array.numpy()
  • Fix assert_np_equal() bug
  • Support Linux AArch64 platforms, including Jetson/Tegra devices
  • Add parallel testing runner (invoke with python -m warp.tests, use warp/tests/unittest_serial.py for serial testing)
  • Fix support for function calls in range()
  • matmul adjoints now accumulate
  • Expand available operators (e.g. vector @ matrix, scalar as dividend) and improve support for calling native built-ins
  • Fix multi-gpu synchronization issue in sparse.py
  • Add depth rendering to OpenGLRenderer, document warp.render
  • Make atomic_min, atomic_max differentiable
  • Fix error reporting using the exact source segment
  • Add user-friendly mesh query overloads, returning a struct instead of overwriting parameters
  • Address multiple differentiability issues
  • Fix backpropagation for returning array element references
  • Support passing the return value to adjoints
  • Add point basis space and explicit point-based quadrature for warp.fem
  • Support overriding the LLVM project source directory path using build_lib.py --build_llvm --llvm_source_path=
  • Fix the error message for accessing non-existing attributes
  • Flatten faces array for Mesh constructor in URDF parser
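
The note above that matmul adjoints now accumulate can be illustrated with the standard reverse-mode rule for C = A @ B: the gradient of A receives adj_C @ Bᵀ and the gradient of B receives Aᵀ @ adj_C, added to whatever is already stored. The plain-Python sketch below uses nested lists and hypothetical helper names; it illustrates the rule and is not Warp's implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def accumulate_matmul_adjoints(a, b, adj_c, adj_a, adj_b):
    """Add the gradient contributions of C = A @ B to adj_a and adj_b in place."""
    da = matmul(adj_c, transpose(b))  # contribution to dL/dA
    db = matmul(transpose(a), adj_c)  # contribution to dL/dB
    for i, row in enumerate(da):
        for j, value in enumerate(row):
            adj_a[i][j] += value  # accumulate, do not overwrite
    for i, row in enumerate(db):
        for j, value in enumerate(row):
            adj_b[i][j] += value
```

Accumulating matters when the same array takes part in several operations: each use adds its own contribution to the gradient instead of replacing the previous one.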

v1.0.0-beta.5

22 Nov 03:09
Pre-release

[1.0.0-beta.5] - 2023-11-22

  • Fix for kernel caching when function argument types change
  • Fix code-gen ordering of dependent structs
  • Fix for wp.Mesh build on MGPU systems
  • Fix for name clash bug with adjoint code: #154
  • Add wp.frac() for returning the fractional part of a floating point value
  • Add support for custom native CUDA snippets using @wp.func_native decorator
  • Add support for batched matmul with batch size > 2^16-1
  • Add support for transposed CUTLASS wp.matmul() and additional error checking
  • Add support for quad and hex meshes in wp.fem
  • Detect and warn when C++ runtime doesn't match compiler during build, e.g.: libstdc++.so.6: version `GLIBCXX_3.4.30' not found
  • Documentation update for wp.BVH
  • Documentation and simplified API for runtime kernel specialization wp.Kernel
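
For the wp.frac() item above, the fractional part of a floating point value can be sketched in plain Python with `math.modf`. The changelog does not say how negative inputs are treated, so the sign convention below (the result keeps the sign of the input) is an assumption borrowed from `math.modf`, not a statement about Warp.

```python
import math


def frac(x):
    """Return the fractional part of x (plain-Python stand-in, not wp.frac)."""
    fractional, _integral = math.modf(x)
    return fractional
```

For example, `frac(2.75)` gives `0.75` and `frac(3.0)` gives `0.0`.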

[1.0.0-beta.4] - 2023-11-01

  • Add wp.cbrt() for cube root calculation
  • Add wp.mesh_furthest_point_no_sign() to compute furthest point on a surface from a query point
  • Add support for GPU BVH builds, 10-100x faster than CPU builds for large meshes
  • Add support for chained comparisons, i.e.: 0 < x < 2
  • Add support for running warp.fem examples headless
  • Fix for unit test determinism
  • Fix for possible GC collection of array during graph capture
  • Fix for wp.utils.array_sum() output initialization when used with vector types
  • Coverage and documentation updates
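
The chained comparison mentioned above, `0 < x < 2`, has the same meaning as in ordinary Python: both comparisons must hold. The plain-Python sketch below shows the chained form next to its expanded equivalent; the function names are hypothetical.

```python
def in_open_interval(x, lo=0.0, hi=2.0):
    # Chained form: reads like the math, and x is evaluated only once.
    return lo < x < hi


def in_open_interval_expanded(x, lo=0.0, hi=2.0):
    # Equivalent expanded form using an explicit `and`.
    return lo < x and x < hi
```

Both functions agree for every input; the chained form is simply shorter and avoids repeating the middle operand.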

[1.0.0-beta.3] - 2023-10-19

  • Add support for code coverage scans (test_coverage.py), coverage at 85% in omni.warp.core
  • Add support for named component access for vector types, e.g.: a = v.x
  • Add support for lvalue expressions, e.g.: array[i] += b
  • Add casting constructors for matrix and vector types
  • Add support for type() operator that can be used to return type inside kernels
  • Add support for grid-stride kernels to support kernels with > 2^31-1 thread blocks
  • Fix for multi-process initialization warnings
  • Fix alignment issues with empty wp.struct
  • Fix for return statement warning with tuple-returning functions
  • Fix for wp.batched_matmul() registering the wrong function in the Tape
  • Fix and document wp.sim forward + inverse kinematics
  • Fix for wp.func to return a default value if function does not return on all control paths
  • Refactor wp.fem support for new basis functions, decoupled function spaces
  • Optimizations for wp.noise functions, up to 10x faster in most cases
  • Optimizations for type_size_in_bytes() used in array construction
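
The grid-stride kernels mentioned above rely on a general technique: each thread processes more than one work item, stepping through the index space by the total number of launched threads. The plain-Python simulation below shows which indices one thread would visit. `grid_stride_indices` is a hypothetical helper; real kernels do this inside a loop on the GPU.

```python
def grid_stride_indices(thread_id, total_threads, n):
    """Indices of the n work items visited by one thread in a grid-stride loop."""
    # Start at the thread's own index and step by the size of the whole grid.
    return list(range(thread_id, n, total_threads))


def all_visited(total_threads, n):
    """Collect the indices visited by every thread, in sorted order."""
    visited = []
    for thread_id in range(total_threads):
        visited.extend(grid_stride_indices(thread_id, total_threads, n))
    return sorted(visited)
```

With 4 threads and 10 items, thread 0 visits items 0, 4, and 8, and together the threads cover every item exactly once. This is why the number of items can exceed the number of threads that can be launched.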

[1.0.0-beta.2] - 2023-09-01

  • Fix for passing bool into wp.func functions
  • Fix for deprecation warnings appearing on stderr, now redirected to stdout
  • Fix for using for i in wp.hash_grid_query(..) syntax

[1.0.0-beta.1] - 2023-08-29

  • Fix for wp.float16 being passed as kernel arguments
  • Fix for compile errors with kernels using structs in backward pass
  • Fix for wp.Mesh.refit() not being CUDA graph capturable due to synchronous temp. allocs
  • Fix for dynamic texture example flickering and MGPU crashes in the Kit demo by reusing ui.DynamicImageProvider instances
  • Fix for a regression that disabled bundle change tracking in samples
  • Fix for incorrect surface velocities when meshes are deforming in OgnClothSimulate
  • Fix for incorrect lower-case when setting USD stage "up_axis" in examples
  • Fix for incompatible gradient types when wrapping PyTorch tensor as a vector or matrix type
  • Fix for adding open edges when building cloth constraints from meshes in wp.sim.ModelBuilder.add_cloth_mesh()
  • Add support for wp.fabricarray to directly access Fabric data from Warp kernels, see https://omniverse.gitlab-master-pages.nvidia.com/usdrt/docs/usdrt_prim_selection.html for examples
  • Add support for user defined gradient functions, see @wp.func_replay, and @wp.func_grad decorators
  • Add support for more OG attribute types in omni.warp.from_omni_graph()
  • Add support for creating NanoVDB wp.Volume objects from dense NumPy arrays
  • Add support for wp.volume_sample_grad_f() which returns the value + gradient efficiently from an NVDB volume
  • Add support for LLVM fp16 intrinsics for half-precision arithmetic
  • Add implementation of stochastic gradient descent, see wp.optim.SGD
  • Add warp.fem framework for solving weak-form PDE problems (see https://nvidia.github.io/warp/_build/html/modules/fem.html)
  • Optimizations for omni.warp extension load time (2.2s to 625ms cold start)
  • Make all omni.ui dependencies optional so that Warp unit tests can run headless
  • Deprecation of wp.tid() outside of kernel functions, users should pass tid() values to wp.func functions explicitly
  • Deprecation of wp.sim.Model.flatten() for returning all contained tensors from the model
  • Add support for clamping particle max velocity in wp.sim.Model.particle_max_velocity
  • Remove dependency on urdfpy package, improve MJCF parser handling of default values
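
For the stochastic gradient descent item above, the basic update rule is to move each parameter against its gradient, scaled by a learning rate. The plain-Python sketch below shows only that rule; `sgd_step` is a hypothetical helper, and wp.optim.SGD may offer options this sketch does not cover.

```python
def sgd_step(params, grads, lr):
    """Return new parameters after one plain gradient descent step."""
    if len(params) != len(grads):
        raise ValueError("params and grads must have the same length")
    # p_new = p - lr * g for every parameter/gradient pair.
    return [p - lr * g for p, g in zip(params, grads)]


def minimize_square(x, lr, steps):
    """Minimize f(x) = x**2 with repeated steps; the gradient is 2 * x."""
    for _ in range(steps):
        (x,) = sgd_step([x], [2.0 * x], lr)
    return x
```

Starting from `x = 4.0` with a learning rate of `0.25`, each step halves `x`, so the iterate approaches the minimum at zero.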

v0.10.1

01 Aug 22:09
Pre-release

[0.10.1] - 2023-07-25

  • Fix for large multidimensional kernel launches (> 2^32 threads)
  • Fix for module hashing with generics
  • Fix for unrolling loops with break or continue statements (will skip unrolling)
  • Fix for passing boolean arguments to build_lib.py (previously ignored)
  • Fix build warnings on Linux
  • Fix for creating array of structs from NumPy structured array
  • Fix for regression on kernel load times in Kit when using warp.sim
  • Update warp.array.reshape() to handle -1 dimensions
  • Update margin used for mesh queries when using wp.sim.create_soft_body_contacts()
  • Improvements to gradient handling with warp.from_torch(), warp.to_torch() plus documentation
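
The reshape() item above refers to -1 dimensions, which by the usual NumPy-style convention means "infer this dimension from the total size". The plain-Python sketch below shows that inference, assuming Warp follows the same convention; `infer_shape` is a hypothetical helper.

```python
import math


def infer_shape(size, shape):
    """Replace a single -1 in shape so the product of dimensions equals size."""
    shape = tuple(shape)
    unknown = [i for i, dim in enumerate(shape) if dim == -1]
    if len(unknown) > 1:
        raise ValueError("only one dimension can be -1")
    known = math.prod(dim for dim in shape if dim != -1)
    if unknown:
        if known == 0 or size % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        i = unknown[0]
        return shape[:i] + (size // known,) + shape[i + 1:]
    if known != size:
        raise ValueError("shape does not match the number of elements")
    return shape
```

For an array of 12 elements, `(3, -1)` resolves to `(3, 4)` and `(2, -1, 3)` resolves to `(2, 2, 3)`.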

[0.10.0] - 2023-07-05

  • Add support for macOS universal binaries (x86 + aarch64) for M1+ support
  • Add additional methods for SDF generation, see the following new methods:
    • wp.mesh_query_point_nosign() - closest point query with no sign determination
    • wp.mesh_query_point_sign_normal() - closest point query with sign from angle-weighted normal
    • wp.mesh_query_point_sign_winding_number() - closest point query with fast winding number sign determination
  • Add CSR/BSR sparse matrix support, see warp.sparse module:
    • wp.sparse.BsrMatrix
    • wp.sparse.bsr_zeros(), wp.sparse.bsr_set_from_triplets() for construction
    • wp.sparse.bsr_mm(), wp.sparse.bsr_mv() for matrix-matrix and matrix-vector products respectively
  • Add array-wide utilities:
    • wp.utils.array_scan() - prefix sum (inclusive or exclusive)
    • wp.utils.array_sum() - sum across array
    • wp.utils.radix_sort_pairs() - in-place radix sort (key,value) pairs
  • Add support for calling @wp.func functions from Python (outside of kernel scope)
  • Add support for recording kernel launches using a wp.Launch object that can be replayed with low overhead, use wp.launch(..., record_cmd=True) to generate a command object
  • Optimizations for wp.struct kernel arguments, up to 20x faster launches for kernels with large structs or number of params
  • Refresh USD samples to use bundle based workflow + change tracking
  • Add Python API for manipulating mesh and point bundle data in OmniGraph, see omni.warp.nodes module
    • See omni.warp.nodes.mesh_create_bundle(), omni.warp.nodes.mesh_get_points(), etc.
  • Improvements to wp.array:
    • Fix a number of array methods misbehaving with empty arrays
    • Fix a number of bugs and memory leaks related to gradient arrays
    • Fix array construction when creating arrays in pinned memory from a data source in pageable memory
    • wp.empty() no longer zeroes-out memory and returns an uninitialized array, as intended
    • array.zero_() and array.fill_() work with non-contiguous arrays
    • Support wrapping non-contiguous NumPy arrays without a copy
    • Support preserving the outer dimensions of NumPy arrays when wrapping them as Warp arrays of vector or matrix types
    • Improve PyTorch and DLPack interop with Warp arrays of arbitrary vectors and matrices
    • array.fill_() can now take lists or other sequences when filling arrays of vectors or matrices, e.g. arr.fill_([[1, 2], [3, 4]])
    • array.fill_() now works with arrays of structs (pass a struct instance)
    • wp.copy() gracefully handles copying between non-contiguous arrays on different devices
    • Add wp.full() and wp.full_like(), e.g., a = wp.full(shape, value)
    • Add optional device argument to wp.empty_like(), wp.zeros_like(), wp.full_like(), and wp.clone()
    • Add indexedarray methods .zero_(), .fill_(), and .assign()
    • Fix indexedarray methods .numpy() and .list()
    • Fix array.list() to work with arrays of any Warp data type
    • Fix array.list() synchronization issue with CUDA arrays
    • array.numpy() called on an array of structs returns a structured NumPy array with named fields
    • Improve the performance of creating arrays
  • Fix for Error: No module named 'omni.warp.core' when running some Kit configurations (e.g.: stubgen)
  • Fix for wp.struct instance address being included in module content hash
  • Fix codegen with overridden function names
  • Fix for kernel hashing so it occurs after code generation and before loading to fix a bug with stale kernel cache
  • Fix for wp.BVH.refit() when executed on the CPU
  • Fix adjoint of wp.struct constructor
  • Fix element accessors for wp.float16 vectors and matrices in Python
  • Fix wp.float16 members in structs
  • Remove deprecated wp.ScopedCudaGuard(), please use wp.ScopedDevice() instead
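
The array-wide utilities in this release include a prefix sum that can be inclusive or exclusive. The difference is easiest to see on a small list. The plain-Python sketch below uses `itertools.accumulate` and a hypothetical `scan` function; it illustrates the two conventions and does not call wp.utils.array_scan().

```python
from itertools import accumulate


def scan(values, inclusive=True):
    """Prefix sum of a list of numbers.

    Inclusive: element i is the sum of values[0..i].
    Exclusive: element i is the sum of values[0..i-1], so element 0 is 0.
    """
    running = list(accumulate(values))
    if inclusive:
        return running
    # Shift right by one and start from zero; an empty input stays empty.
    return [0] + running[:-1] if running else []
```

For the input `[1, 2, 3, 4]`, the inclusive scan is `[1, 3, 6, 10]` and the exclusive scan is `[0, 1, 3, 6]`. The exclusive form is the one typically used to turn per-item counts into output offsets.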