BLI: refactor IndexMask for better performance and memory usage
Goals of this refactor:
* Reduce memory consumption of `IndexMask`. The old `IndexMask` uses an
  `int64_t` for each index which is more than necessary in pretty much all
  practical cases currently. Using `int32_t` might still become limiting
  in the future in case we use this to index e.g. byte buffers larger than
  a few gigabytes. We also don't want to template `IndexMask`, because
  that would cause a split in the "ecosystem", or everything would have to
  be implemented twice or templated.
* Allow for more multi-threading. The old `IndexMask` contains a single
  array. This is generally good but has the problem that it is hard to fill
  from multiple threads when the final size is not known from the beginning.
  This is commonly the case when e.g. converting an array of bool to an
  index mask. Currently, this kind of code only runs on a single thread.
* Allow for efficient set operations like join, intersect and difference.
  It should be possible to multi-thread those operations.
* It should be possible to iterate over an `IndexMask` very efficiently.
  The most important part of that is to avoid all memory access when iterating
  over contiguous ranges. For some core nodes (e.g. math nodes), we generate
  optimized code for the cases of irregular index masks and simple index ranges.
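
The last goal can be illustrated with a minimal sketch. It is not the actual Blender code: `foreach_index_optimized` is a hypothetical helper and a plain `std::vector<int64_t>` stands in for the mask. It assumes the indices are sorted and unique, so a run without gaps can be detected from its first and last element alone.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a Blender API): calls `fn` for every index and uses
// a fast path when the indices form one contiguous range.
// Assumes `sorted_indices` is sorted and contains no duplicates.
template<typename Fn>
void foreach_index_optimized(const std::vector<int64_t> &sorted_indices, Fn &&fn)
{
  if (sorted_indices.empty()) {
    return;
  }
  const int64_t first = sorted_indices.front();
  const int64_t last = sorted_indices.back();
  if (last - first + 1 == int64_t(sorted_indices.size())) {
    // No gaps: iterate a plain counter, so the loop body never loads from the
    // index array. Loops of this shape are easier for compilers to vectorize.
    for (int64_t i = first; i <= last; i++) {
      fn(i);
    }
    return;
  }
  // Irregular mask: every index has to be read from memory.
  for (const int64_t i : sorted_indices) {
    fn(i);
  }
}
```

Both paths visit the same indices in the same order; only the generated loop differs.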

To achieve these goals, a few compromises had to be made:
* Slicing of the mask (at specific indices) and random element access is
  `O(log #indices)` now, but with a low constant factor. It should be possible
  to split a mask into n approximately equally sized parts in `O(n)` though,
  making the time per split `O(1)`.
* Using range-based for loops does not work well when iterating over a nested
  data structure like the new `IndexMask`. Therefore, `foreach_*` functions with
  callbacks have to be used. To avoid extra code complexity at the call site,
  the `foreach_*` methods support multi-threading out of the box.
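
Both compromises can be sketched on a simplified nested structure. `Segment` and `SegmentedMask` below are hypothetical stand-ins, not the types from `BLI_index_mask.hh`, and the sketch is single-threaded. Random access uses a binary search over per-segment prefix sums, and iteration goes through a callback instead of a range-based for loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical segment: one shared offset plus small, sorted local indices.
struct Segment {
  int64_t offset;
  std::vector<int16_t> local;
};

// Hypothetical mask: ordered segments plus a prefix sum of their sizes.
struct SegmentedMask {
  std::vector<Segment> segments;
  // cumulative[s] is the number of indices stored before segment s.
  std::vector<int64_t> cumulative;
  int64_t total = 0;

  void append(Segment segment)
  {
    cumulative.push_back(total);
    total += int64_t(segment.local.size());
    segments.push_back(std::move(segment));
  }

  // Random access: binary search for the segment, then a direct lookup.
  int64_t operator[](const int64_t i) const
  {
    assert(i >= 0 && i < total);
    const auto it = std::upper_bound(cumulative.begin(), cumulative.end(), i);
    const size_t s = size_t(it - cumulative.begin()) - 1;
    return segments[s].offset + segments[s].local[size_t(i - cumulative[s])];
  }

  // Callback-based iteration hides the nested loops from the call site.
  template<typename Fn> void foreach_index(Fn &&fn) const
  {
    for (const Segment &segment : segments) {
      for (const int16_t local_index : segment.local) {
        fn(segment.offset + local_index);
      }
    }
  }
};
```

In this sketch the search is logarithmic in the number of segments. A caller writes `mask.foreach_index([&](int64_t i) { ... })` and never sees the segment boundaries.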

The new data structure splits an `IndexMask` into an arbitrary number of ordered
`IndexMaskSegment`s. Each segment can contain at most `2^14 = 16384` indices. The
indices within a segment are stored as `int16_t`. Each segment has an additional
`int64_t` offset which allows storing arbitrary `int64_t` indices. The main benefit
of this approach is that segments can be processed/constructed individually on
multiple threads without a serial bottleneck. It also reduces the memory
requirements significantly.
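
As a rough illustration of this layout, the sketch below splits sorted `int64_t` indices into segments of `int16_t` values plus one `int64_t` offset each. The names `Segment`, `build_segments` and `approx_bytes` are made up for this example. The rule that local values stay below `2^14` is an assumption of the sketch, and the real implementation in `BLI_index_mask.hh` may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr int64_t max_segment_size = int64_t(1) << 14;  // 16384

// Hypothetical segment: `offset + local[i]` gives the full 64-bit index.
struct Segment {
  int64_t offset;
  std::vector<int16_t> local;
};

// Splits sorted, unique, non-negative indices into segments. A new segment is
// started when the current one is full or when the next index is too far from
// the current offset to fit into the local range (an assumption of this sketch).
std::vector<Segment> build_segments(const std::vector<int64_t> &sorted_indices)
{
  std::vector<Segment> segments;
  for (const int64_t index : sorted_indices) {
    assert(index >= 0);
    const bool need_new = segments.empty() ||
                          int64_t(segments.back().local.size()) >= max_segment_size ||
                          index - segments.back().offset >= max_segment_size;
    if (need_new) {
      segments.push_back({index, {}});
    }
    segments.back().local.push_back(int16_t(index - segments.back().offset));
  }
  return segments;
}

// Bytes used for index data in this sketch: 2 per index plus 8 per offset.
int64_t approx_bytes(const std::vector<Segment> &segments)
{
  int64_t bytes = 0;
  for (const Segment &segment : segments) {
    bytes += int64_t(sizeof(int64_t)) + int64_t(segment.local.size() * sizeof(int16_t));
  }
  return bytes;
}
```

For the five indices `{3, 4, 10, 1000000, 1000001}` the sketch produces two segments and uses 26 bytes of index data, compared to 40 bytes for five plain `int64_t` values. The per-segment offset is why widely scattered indices can cost more than densely packed ones.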

For more details see comments in `BLI_index_mask.hh`.

I did a few tests to verify that the data structure generally improves
performance and does not cause regressions:
* Our field evaluation benchmarks take about as much time as before. This is to be
  expected because we already made sure that e.g. add node evaluation is
  vectorized. The important thing here is to check that changes to the way we
  iterate over the indices still allow for auto-vectorization.
* Memory usage by a mask is about 1/4 of what it was before in the average case.
  That's mainly caused by the switch from `int64_t` to `int16_t` for indices.
  In the worst case, the memory requirements can be larger when there are many
  indices that are far apart. However, when they are far away from each other,
  that indicates that there aren't many indices in total. In common cases, memory
  usage can be way lower than 1/4 of before, because sub-ranges use static memory.
* For some more specific numbers I benchmarked `IndexMask::from_bools` in
  `index_mask_from_selection` on 10,000,000 elements at various probabilities for
  `true` at every index:
  ```
  Probability      Old        New
  0              4.6 ms     0.8 ms
  0.001          5.1 ms     1.3 ms
  0.2            8.4 ms     1.8 ms
  0.5           15.3 ms     3.0 ms
  0.8           20.1 ms     3.0 ms
  0.999         25.1 ms     1.7 ms
  1             13.5 ms     1.1 ms
  ```
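
To show why a bool-to-mask conversion can be split across threads, here is a simplified, single-threaded sketch. It is not the code behind `IndexMask::from_bools`; `Segment`, `segment_from_chunk` and `mask_from_bools` are hypothetical names. The input is cut into chunks of `2^14` booleans and every chunk writes only to its own output slot, so the chunk loop has no dependencies between iterations.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

constexpr int64_t chunk_size = int64_t(1) << 14;  // 16384

// Hypothetical segment: `offset + local[i]` gives the full index.
struct Segment {
  int64_t offset = 0;
  std::vector<int16_t> local;
};

// Converts the booleans in [start, end) into one segment. Reads only its own
// chunk of the input and writes only to the returned segment.
Segment segment_from_chunk(const bool *bools, const int64_t start, const int64_t end)
{
  assert(end - start <= chunk_size);
  Segment segment;
  segment.offset = start;
  for (int64_t i = start; i < end; i++) {
    if (bools[i]) {
      segment.local.push_back(int16_t(i - start));
    }
  }
  return segment;
}

std::vector<Segment> mask_from_bools(const bool *bools, const int64_t size)
{
  const int64_t chunks_num = (size + chunk_size - 1) / chunk_size;
  std::vector<Segment> per_chunk{size_t(chunks_num)};
  // Iterations are independent, so this loop could be replaced by a
  // parallel-for without changing the result.
  for (int64_t c = 0; c < chunks_num; c++) {
    const int64_t start = c * chunk_size;
    const int64_t end = std::min(size, start + chunk_size);
    per_chunk[size_t(c)] = segment_from_chunk(bools, start, end);
  }
  // Cheap serial step: drop chunks that contain no selected index.
  std::vector<Segment> segments;
  for (Segment &segment : per_chunk) {
    if (!segment.local.empty()) {
      segments.push_back(std::move(segment));
    }
  }
  return segments;
}
```

The final size of the mask does not have to be known up front, which is the property a single flat array lacks.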

Pull Request: https://projects.blender.org/blender/blender/pulls/104629
JacquesLucke committed May 24, 2023
1 parent f3f2f7f commit 2cfcb8b
Showing 182 changed files with 4,061 additions and 2,954 deletions.
20 changes: 10 additions & 10 deletions source/blender/blenkernel/BKE_attribute_math.hh
@@ -294,7 +294,7 @@ template<typename T> class SimpleMixer {
/**
* \param mask: Only initialize these indices. Other indices in the buffer will be invalid.
*/
-SimpleMixer(MutableSpan<T> buffer, const IndexMask mask, T default_value = {})
+SimpleMixer(MutableSpan<T> buffer, const IndexMask &mask, T default_value = {})
: buffer_(buffer), default_value_(default_value), total_weights_(buffer.size(), 0.0f)
{
BLI_STATIC_ASSERT(std::is_trivial_v<T>, "");
@@ -327,7 +327,7 @@ template<typename T> class SimpleMixer {
this->finalize(IndexMask(buffer_.size()));
}

-void finalize(const IndexMask mask)
+void finalize(const IndexMask &mask)
{
mask.foreach_index([&](const int64_t i) {
const float weight = total_weights_[i];
@@ -365,7 +365,7 @@ class BooleanPropagationMixer {
/**
* \param mask: Only initialize these indices. Other indices in the buffer will be invalid.
*/
-BooleanPropagationMixer(MutableSpan<bool> buffer, const IndexMask mask) : buffer_(buffer)
+BooleanPropagationMixer(MutableSpan<bool> buffer, const IndexMask &mask) : buffer_(buffer)
{
mask.foreach_index([&](const int64_t i) { buffer_[i] = false; });
}
@@ -391,7 +391,7 @@ class BooleanPropagationMixer {
*/
void finalize() {}

-void finalize(const IndexMask /*mask*/) {}
+void finalize(const IndexMask & /*mask*/) {}
};

/**
@@ -421,7 +421,7 @@ class SimpleMixerWithAccumulationType {
* \param mask: Only initialize these indices. Other indices in the buffer will be invalid.
*/
SimpleMixerWithAccumulationType(MutableSpan<T> buffer,
-const IndexMask mask,
+const IndexMask &mask,
T default_value = {})
: buffer_(buffer), default_value_(default_value), accumulation_buffer_(buffer.size())
{
@@ -449,7 +449,7 @@ class SimpleMixerWithAccumulationType {
this->finalize(buffer_.index_range());
}

-void finalize(const IndexMask mask)
+void finalize(const IndexMask &mask)
{
mask.foreach_index([&](const int64_t i) {
const Item &item = accumulation_buffer_[i];
@@ -478,12 +478,12 @@ class ColorGeometry4fMixer {
* \param mask: Only initialize these indices. Other indices in the buffer will be invalid.
*/
ColorGeometry4fMixer(MutableSpan<ColorGeometry4f> buffer,
-IndexMask mask,
+const IndexMask &mask,
ColorGeometry4f default_color = ColorGeometry4f(0.0f, 0.0f, 0.0f, 1.0f));
void set(int64_t index, const ColorGeometry4f &color, float weight = 1.0f);
void mix_in(int64_t index, const ColorGeometry4f &color, float weight = 1.0f);
void finalize();
-void finalize(IndexMask mask);
+void finalize(const IndexMask &mask);
};

class ColorGeometry4bMixer {
@@ -500,12 +500,12 @@ class ColorGeometry4bMixer {
* \param mask: Only initialize these indices. Other indices in the buffer will be invalid.
*/
ColorGeometry4bMixer(MutableSpan<ColorGeometry4b> buffer,
-IndexMask mask,
+const IndexMask &mask,
ColorGeometry4b default_color = ColorGeometry4b(0, 0, 0, 255));
void set(int64_t index, const ColorGeometry4b &color, float weight = 1.0f);
void mix_in(int64_t index, const ColorGeometry4b &color, float weight = 1.0f);
void finalize();
-void finalize(IndexMask mask);
+void finalize(const IndexMask &mask);
};

template<typename T> struct DefaultMixerStruct {
14 changes: 7 additions & 7 deletions source/blender/blenkernel/BKE_curves.hh
@@ -160,7 +160,7 @@ class CurvesGeometry : public ::CurvesGeometry {
/** Set all curve types to the value and call #update_curve_types. */
void fill_curve_types(CurveType type);
/** Set the types for the curves in the selection and call #update_curve_types. */
-void fill_curve_types(IndexMask selection, CurveType type);
+void fill_curve_types(const IndexMask &selection, CurveType type);
/** Update the cached count of curves of each type, necessary after #curve_types_for_write. */
void update_curve_types();

@@ -173,10 +173,10 @@ class CurvesGeometry : public ::CurvesGeometry {
/**
* All of the curve indices for curves with a specific type.
*/
-IndexMask indices_for_curve_type(CurveType type, Vector<int64_t> &r_indices) const;
+IndexMask indices_for_curve_type(CurveType type, IndexMaskMemory &memory) const;
IndexMask indices_for_curve_type(CurveType type,
-IndexMask selection,
-Vector<int64_t> &r_indices) const;
+const IndexMask &selection,
+IndexMaskMemory &memory) const;

Array<int> point_to_curve_map() const;

@@ -361,16 +361,16 @@ class CurvesGeometry : public ::CurvesGeometry {

void calculate_bezier_auto_handles();

-void remove_points(IndexMask points_to_delete,
+void remove_points(const IndexMask &points_to_delete,
const AnonymousAttributePropagationInfo &propagation_info = {});
-void remove_curves(IndexMask curves_to_delete,
+void remove_curves(const IndexMask &curves_to_delete,
const AnonymousAttributePropagationInfo &propagation_info = {});

/**
* Change the direction of selected curves (switch the start and end) without changing their
* shape.
*/
-void reverse_curves(IndexMask curves_to_reverse);
+void reverse_curves(const IndexMask &curves_to_reverse);

/**
* Remove any attributes that are unused based on the types in the curves.
18 changes: 10 additions & 8 deletions source/blender/blenkernel/BKE_curves_utils.hh
@@ -481,14 +481,14 @@ void copy_point_data(OffsetIndices<int> src_points_by_curve,

void copy_point_data(OffsetIndices<int> src_points_by_curve,
OffsetIndices<int> dst_points_by_curve,
-IndexMask src_curve_selection,
+const IndexMask &src_curve_selection,
GSpan src,
GMutableSpan dst);

template<typename T>
void copy_point_data(OffsetIndices<int> src_points_by_curve,
OffsetIndices<int> dst_points_by_curve,
-IndexMask src_curve_selection,
+const IndexMask &src_curve_selection,
Span<T> src,
MutableSpan<T> dst)
{
@@ -500,13 +500,13 @@ void copy_point_data(OffsetIndices<int> src_points_by_curve,
}

void fill_points(OffsetIndices<int> points_by_curve,
-IndexMask curve_selection,
+const IndexMask &curve_selection,
GPointer value,
GMutableSpan dst);

template<typename T>
void fill_points(const OffsetIndices<int> points_by_curve,
-IndexMask curve_selection,
+const IndexMask &curve_selection,
const T &value,
MutableSpan<T> dst)
{
@@ -541,7 +541,9 @@ bke::CurvesGeometry copy_only_curve_domain(const bke::CurvesGeometry &src_curves
/**
* Copy the number of points in every curve in the mask to the corresponding index in #sizes.
*/
-void copy_curve_sizes(OffsetIndices<int> points_by_curve, IndexMask mask, MutableSpan<int> sizes);
+void copy_curve_sizes(OffsetIndices<int> points_by_curve,
+const IndexMask &mask,
+MutableSpan<int> sizes);

/**
* Copy the number of points in every curve in #curve_ranges to the corresponding index in
@@ -554,12 +556,12 @@ void copy_curve_sizes(OffsetIndices<int> points_by_curve,
IndexMask indices_for_type(const VArray<int8_t> &types,
const std::array<int, CURVE_TYPES_NUM> &type_counts,
const CurveType type,
-const IndexMask selection,
-Vector<int64_t> &r_indices);
+const IndexMask &selection,
+IndexMaskMemory &memory);

void foreach_curve_by_type(const VArray<int8_t> &types,
const std::array<int, CURVE_TYPES_NUM> &type_counts,
-IndexMask selection,
+const IndexMask &selection,
FunctionRef<void(IndexMask)> catmull_rom_fn,
FunctionRef<void(IndexMask)> poly_fn,
FunctionRef<void(IndexMask)> bezier_fn,
34 changes: 18 additions & 16 deletions source/blender/blenkernel/BKE_geometry_fields.hh
@@ -136,53 +136,55 @@ class GeometryFieldInput : public fn::FieldInput {
public:
using fn::FieldInput::FieldInput;
GVArray get_varray_for_context(const fn::FieldContext &context,
-IndexMask mask,
+const IndexMask &mask,
ResourceScope &scope) const override;
virtual GVArray get_varray_for_context(const GeometryFieldContext &context,
-IndexMask mask) const = 0;
+const IndexMask &mask) const = 0;
virtual std::optional<eAttrDomain> preferred_domain(const GeometryComponent &component) const;
};

class MeshFieldInput : public fn::FieldInput {
public:
using fn::FieldInput::FieldInput;
GVArray get_varray_for_context(const fn::FieldContext &context,
-IndexMask mask,
+const IndexMask &mask,
ResourceScope &scope) const override;
virtual GVArray get_varray_for_context(const Mesh &mesh,
eAttrDomain domain,
-IndexMask mask) const = 0;
+const IndexMask &mask) const = 0;
virtual std::optional<eAttrDomain> preferred_domain(const Mesh &mesh) const;
};

class CurvesFieldInput : public fn::FieldInput {
public:
using fn::FieldInput::FieldInput;
GVArray get_varray_for_context(const fn::FieldContext &context,
-IndexMask mask,
+const IndexMask &mask,
ResourceScope &scope) const override;
virtual GVArray get_varray_for_context(const CurvesGeometry &curves,
eAttrDomain domain,
-IndexMask mask) const = 0;
+const IndexMask &mask) const = 0;
virtual std::optional<eAttrDomain> preferred_domain(const CurvesGeometry &curves) const;
};

class PointCloudFieldInput : public fn::FieldInput {
public:
using fn::FieldInput::FieldInput;
GVArray get_varray_for_context(const fn::FieldContext &context,
-IndexMask mask,
+const IndexMask &mask,
ResourceScope &scope) const override;
-virtual GVArray get_varray_for_context(const PointCloud &pointcloud, IndexMask mask) const = 0;
+virtual GVArray get_varray_for_context(const PointCloud &pointcloud,
+const IndexMask &mask) const = 0;
};

class InstancesFieldInput : public fn::FieldInput {
public:
using fn::FieldInput::FieldInput;
GVArray get_varray_for_context(const fn::FieldContext &context,
-IndexMask mask,
+const IndexMask &mask,
ResourceScope &scope) const override;
-virtual GVArray get_varray_for_context(const Instances &instances, IndexMask mask) const = 0;
+virtual GVArray get_varray_for_context(const Instances &instances,
+const IndexMask &mask) const = 0;
};

class AttributeFieldInput : public GeometryFieldInput {
@@ -212,7 +214,7 @@ class AttributeFieldInput : public GeometryFieldInput {
}

GVArray get_varray_for_context(const GeometryFieldContext &context,
-IndexMask mask) const override;
+const IndexMask &mask) const override;

std::string socket_inspection_name() const override;

@@ -229,7 +231,7 @@ class IDAttributeFieldInput : public GeometryFieldInput {
}

GVArray get_varray_for_context(const GeometryFieldContext &context,
-IndexMask mask) const override;
+const IndexMask &mask) const override;

std::string socket_inspection_name() const override;

@@ -239,7 +241,7 @@ class IDAttributeFieldInput : public GeometryFieldInput {

VArray<float3> curve_normals_varray(const CurvesGeometry &curves, const eAttrDomain domain);

-VArray<float3> mesh_normals_varray(const Mesh &mesh, const IndexMask mask, eAttrDomain domain);
+VArray<float3> mesh_normals_varray(const Mesh &mesh, const IndexMask &mask, eAttrDomain domain);

class NormalFieldInput : public GeometryFieldInput {
public:
@@ -249,7 +251,7 @@ class NormalFieldInput : public GeometryFieldInput {
}

GVArray get_varray_for_context(const GeometryFieldContext &context,
-IndexMask mask) const override;
+const IndexMask &mask) const override;

std::string socket_inspection_name() const override;

@@ -288,7 +290,7 @@ class AnonymousAttributeFieldInput : public GeometryFieldInput {
}

GVArray get_varray_for_context(const GeometryFieldContext &context,
-IndexMask mask) const override;
+const IndexMask &mask) const override;

std::string socket_inspection_name() const override;

@@ -302,7 +304,7 @@ class CurveLengthFieldInput final : public CurvesFieldInput {
CurveLengthFieldInput();
GVArray get_varray_for_context(const CurvesGeometry &curves,
eAttrDomain domain,
-IndexMask mask) const final;
+const IndexMask &mask) const final;
uint64_t hash() const override;
bool is_equal_to(const fn::FieldNode &other) const override;
std::optional<eAttrDomain> preferred_domain(const bke::CurvesGeometry &curves) const final;
2 changes: 1 addition & 1 deletion source/blender/blenkernel/BKE_instances.hh
@@ -155,7 +155,7 @@ class Instances {
* Remove the indices that are not contained in the mask input, and remove unused instance
* references afterwards.
*/
-void remove(const blender::IndexMask mask,
+void remove(const blender::IndexMask &mask,
const blender::bke::AnonymousAttributePropagationInfo &propagation_info);
/**
* Get an id for every instance. These can be used for e.g. motion blur.
14 changes: 7 additions & 7 deletions source/blender/blenkernel/BKE_mesh_sample.hh
@@ -32,7 +32,7 @@ void sample_point_attribute(Span<int> corner_verts,
Span<int> looptri_indices,
Span<float3> bary_coords,
const GVArray &src,
-IndexMask mask,
+const IndexMask &mask,
GMutableSpan dst);

void sample_point_normals(Span<int> corner_verts,
@@ -47,20 +47,20 @@ void sample_corner_attribute(Span<MLoopTri> looptris,
Span<int> looptri_indices,
Span<float3> bary_coords,
const GVArray &src,
-IndexMask mask,
+const IndexMask &mask,
GMutableSpan dst);

void sample_corner_normals(Span<MLoopTri> looptris,
Span<int> looptri_indices,
Span<float3> bary_coords,
Span<float3> src,
-IndexMask mask,
+const IndexMask &mask,
MutableSpan<float3> dst);

void sample_face_attribute(Span<int> looptri_polys,
Span<int> looptri_indices,
const GVArray &src,
-IndexMask mask,
+const IndexMask &mask,
GMutableSpan dst);

/**
@@ -148,7 +148,7 @@ class BaryWeightFromPositionFn : public mf::MultiFunction {

public:
BaryWeightFromPositionFn(GeometrySet geometry);
-void call(IndexMask mask, mf::Params params, mf::Context context) const;
+void call(const IndexMask &mask, mf::Params params, mf::Context context) const;
};

/**
@@ -163,7 +163,7 @@ class CornerBaryWeightFromPositionFn : public mf::MultiFunction {

public:
CornerBaryWeightFromPositionFn(GeometrySet geometry);
-void call(IndexMask mask, mf::Params params, mf::Context context) const;
+void call(const IndexMask &mask, mf::Params params, mf::Context context) const;
};

/**
@@ -183,7 +183,7 @@ class BaryWeightSampleFn : public mf::MultiFunction {
public:
BaryWeightSampleFn(GeometrySet geometry, fn::GField src_field);

-void call(IndexMask mask, mf::Params params, mf::Context context) const;
+void call(const IndexMask &mask, mf::Params params, mf::Context context) const;

private:
void evaluate_source(fn::GField src_field);