v1.2

@zeux zeux released this 30 Jun 14:42
9d9890c

This release introduces a major new feature, tangent frame generation, significantly improves vertex decoding performance on Intel/AMD CPUs and also adds several smaller features and improvements. Highlights:

A new tangent frame generator can generate tangents from either indexed or unindexed triangle meshes with positions/normals/texture coordinates. The generator implements the MikkTSpace algorithm, but is significantly faster to run (~6-10x faster than mikktspace.c depending on the input structure). Note that the generation result is a per-corner tangent, and applying the tangents to an indexed mesh may require splitting source vertices; consult documentation for details. By default, the algorithm uses a modified weighting scheme that significantly improves tangent quality around beveled regions in the mesh; the meshopt_TangentCompatible option is provided for cases where exact compatibility with mikktspace.c is important (e.g. normal map baking workflows).
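
To illustrate why per-corner tangents may force vertex splits, here is a minimal sketch of the remapping step. It does not call meshopt_generateTangents and is not the library's implementation; the `Tangent` and `SplitResult` types and the `splitByTangent` helper are hypothetical names introduced for this example, and tangents are assumed to be compared for exact equality.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical per-corner tangent: xyz direction plus w handedness sign.
struct Tangent { float x, y, z, w; };

struct SplitResult {
    std::vector<unsigned int> indices; // new index buffer, one entry per corner
    std::vector<unsigned int> source;  // for each new vertex, the original vertex to copy attributes from
    std::vector<Tangent> tangents;     // for each new vertex, its tangent
};

// Corners that share both the source vertex and an identical tangent reuse one
// output vertex; any other combination creates a new (split) vertex.
SplitResult splitByTangent(const std::vector<unsigned int>& indices,
                           const std::vector<Tangent>& corner_tangents)
{
    assert(indices.size() == corner_tangents.size());

    typedef std::tuple<unsigned int, float, float, float, float> Key;
    std::map<Key, unsigned int> remap;
    SplitResult result;

    for (size_t i = 0; i < indices.size(); ++i) {
        const Tangent& t = corner_tangents[i];
        Key key(indices[i], t.x, t.y, t.z, t.w);

        std::map<Key, unsigned int>::iterator it = remap.find(key);
        if (it == remap.end()) {
            // First time this (vertex, tangent) pair is seen: emit a new vertex.
            unsigned int id = (unsigned int)result.source.size();
            it = remap.insert(std::make_pair(key, id)).first;
            result.source.push_back(indices[i]);
            result.tangents.push_back(t);
        }
        result.indices.push_back(it->second);
    }
    return result;
}
```

The `source` array can then be used to copy positions, normals and texture coordinates into the new vertex buffer. A vertex whose corners all agree on the tangent is emitted once, so a mesh without tangent discontinuities keeps its original vertex count.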

Vertex decoding (meshopt_decodeVertexBuffer) implementation for Intel/AMD CPUs has been significantly revised. The new mostly branchless SSSE3 implementation is usually ~20-45% faster than it was in previous releases, with the gains depending on CPU and the data composition; it's typical to see gains in the upper half of this range for engine-packed data. This implementation is automatically selected on compatible CPUs (SSSE3+POPCNT), with no change in data encoding, so to get this performance boost a library update is sufficient. If the code is compiled with AVX-512/AVX10 support (which is currently only selected at compile time when opted into), the decoding is an additional ~10% faster.

Additionally, new index filtering functions are provided to remove degenerate/duplicate triangles based on positional identity, which can be especially helpful for raytracing performance. A new function for computing an optimal shared exponent for cluster positions can be used to prepare geometry for the upcoming DXR2 Compressed1 format. Finally, clusterlod.h now implements DAG BVH construction via clodBuildHierarchy.
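
As a rough illustration of what filtering by positional identity means, the sketch below drops triangles that are degenerate or duplicated once vertices are compared by position alone. It does not use meshopt_filterIndexBuffer and should not be read as the library's algorithm; `Position` and `filterTriangles` are hypothetical names, and treating triangles with opposite winding as duplicates is an assumption of this example.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <utility>
#include <vector>

struct Position { float x, y, z; };

// Returns a new index buffer without triangles that are degenerate or
// duplicated when vertices are compared by position only.
std::vector<unsigned int> filterTriangles(const std::vector<unsigned int>& indices,
                                          const std::vector<Position>& positions)
{
    assert(indices.size() % 3 == 0);

    // Give every vertex a canonical id shared by all vertices at the same position.
    std::map<std::array<float, 3>, unsigned int> ids;
    std::vector<unsigned int> canonical(positions.size());
    for (size_t i = 0; i < positions.size(); ++i) {
        std::array<float, 3> key = {{positions[i].x, positions[i].y, positions[i].z}};
        unsigned int next = (unsigned int)ids.size();
        canonical[i] = ids.insert(std::make_pair(key, next)).first->second;
    }

    std::set<std::array<unsigned int, 3> > seen;
    std::vector<unsigned int> result;

    for (size_t i = 0; i < indices.size(); i += 3) {
        unsigned int a = canonical[indices[i + 0]];
        unsigned int b = canonical[indices[i + 1]];
        unsigned int c = canonical[indices[i + 2]];

        // Degenerate: at least two corners sit at the same position.
        if (a == b || b == c || a == c)
            continue;

        // Duplicate: same three positions already emitted (winding ignored, by assumption).
        std::array<unsigned int, 3> key = {{a, b, c}};
        std::sort(key.begin(), key.end());
        if (!seen.insert(key).second)
            continue;

        // Keep the triangle with its original indices and winding.
        result.insert(result.end(), indices.begin() + i, indices.begin() + i + 3);
    }
    return result;
}
```

Note that this sketch only detects triangles with coincident corners; zero-area triangles with three distinct collinear positions pass through unchanged.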

The majority of the work on the core library in this release has been sponsored by Valve; thank you!

Library improvements

  • New experimental function for generating tangents based on the MikkTSpace algorithm, meshopt_generateTangents
  • New experimental functions for filtering out degenerate and duplicate triangles based on positional identity, meshopt_filterIndexBuffer/meshopt_filterIndexBufferMulti
  • New experimental function for computing an optimal shared exponent for cluster positions, which can be used with the upcoming DXR2 Compressed1 format
  • meshopt_encode/decodeMeshlet*, meshopt_extractMeshletIndices and meshopt_optimizeMeshletLevel functions, as well as meshopt_SimplifyVertex_Priority and meshopt_SimplifyRegularizeLight flags, are now stable
  • Significantly improve meshopt_decodeVertexBuffer performance on existing data for Intel/AMD CPUs (20-45% faster depending on CPU and data characteristics)
  • Significantly improve performance of meshopt_partitionClusters on larger partition sizes (~2x for target_size 64, ~200x for 1024)
  • Improve post-compression ratio for meshlets encoded using meshlet codec after meshopt_optimizeMeshletLevel with level 1+ (~0.5% gains)
  • Fix reduced encoding precision of small numbers when using meshopt_encodeFilterExp with a shared exponent mode if the input contains exact zeroes
  • Support special meshlet hardware configurations that require a limited triangle index span via MESHOPTIMIZER_CLUSTERIZER_INDEXLIMIT define
  • Support direct decoding of vertex data into destination buffer via MESHOPTIMIZER_VERTEXCODEC_ZEROCOPY define for slightly faster decoding (disabled by default as it does not work well with write-combined memory)

Additional improvements

  • clusterlod.h now implements DAG BVH construction via clodBuildHierarchy for efficient hierarchical cut selection
  • Add tangent generation to gltfpack when requested via the -gt argument
  • Add experimental MeshoptTangents JavaScript module with the new tangent space generator
  • Fix meshopt_decoder.js Wasm SIMD implementation corner case when using v1 format and highest compression (part of 1.1.1 patch release)
  • Fix a rare race condition in meshopt_decoder.js when using WebWorkers via useWorkers (part of 1.1.1 patch release)
  • Improve vertex decoding performance in meshopt_decoder.js by 5-10%
  • Fix several bugs in gltfpack (mesh merging with negative scales no longer flips tangent frames, more careful handling of animation tracks with zero scale)