Several minor improvements #148

Open
wants to merge 6 commits into
base: master

nmellado
Contributor

No description provided.

@nmellado force-pushed the several_improvements branch 2 times, most recently from 56de709 to 48cea2d (September 6, 2024 13:46)
Changing min_cell_size affects the construction, so it must not be set after the tree has been built.
Changes:
 - remove the setter, and add a parameter when building the tree
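A minimal sketch of the idea behind this change, assuming a simplified stand-in class: the names `TinyTree` and `build`, and the default value of 64, are hypothetical and do not reflect the library's actual API. The threshold is passed when building and is only readable afterwards, so it cannot drift away from the structure that was built.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical, simplified stand-in for a kd-tree; not the library's class.
class TinyTree {
public:
    // The leaf-size threshold is fixed at build time, so it can never
    // disagree with the structure that was actually built.
    // The default of 64 is an arbitrary value chosen for this sketch.
    void build(const std::vector<float>& points, std::size_t min_cell_size = 64) {
        if (min_cell_size == 0)
            throw std::invalid_argument("min_cell_size must be positive");
        m_min_cell_size = min_cell_size;
        m_point_count = points.size();
        // ... the recursive construction would go here ...
    }

    // Read-only access: there is deliberately no set_min_cell_size().
    std::size_t min_cell_size() const { return m_min_cell_size; }
    std::size_t point_count() const { return m_point_count; }

private:
    std::size_t m_min_cell_size = 64;
    std::size_t m_point_count = 0;
};
```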
Aim at detecting problems with
 - the structure KdTreeDefaultTraits (copies, references, ...)
 - duplicated samples
@@ -154,6 +156,8 @@ void KdTreeBase<Traits>::build_rec(NodeIndexType node_id, IndexType start, Index
node.set_is_leaf(
end-start <= m_min_cell_size ||
level >= Traits::MAX_DEPTH ||
// Stop descending if the node contains only copies of one unique point (degenerate bounding box)
aabb.diagonal().squaredNorm() <= Eigen::NumTraits<Scalar>::epsilon() ||
Contributor Author

@azaleostu I added this check to stop the recursion when a node contains only several copies of the same point. Unfortunately it does not fix the problem, and I cannot reproduce the segfault on my computer.
Could you please give it a try?
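The stopping test described above can be sketched in isolation as follows. This is a standalone illustration without Eigen: `Point`, `squared_aabb_diagonal` and `should_stop` are hypothetical names, and `std::numeric_limits<float>::epsilon()` stands in for `Eigen::NumTraits<Scalar>::epsilon()`. The depth limit from the diff is left out for brevity.

```cpp
#include <array>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::array<float, 3>;

// Squared length of the diagonal of the axis-aligned bounding box of `pts`.
// Returns 0 for an empty input.
float squared_aabb_diagonal(const std::vector<Point>& pts) {
    if (pts.empty()) return 0.0f;
    Point lo = pts.front();
    Point hi = pts.front();
    for (const Point& p : pts) {
        for (std::size_t d = 0; d < 3; ++d) {
            if (p[d] < lo[d]) lo[d] = p[d];
            if (p[d] > hi[d]) hi[d] = p[d];
        }
    }
    float sq = 0.0f;
    for (std::size_t d = 0; d < 3; ++d) {
        const float extent = hi[d] - lo[d];
        sq += extent * extent;
    }
    return sq;
}

// A node becomes a leaf when it is small enough, or when its bounding box
// has collapsed to a point (all samples are copies of the same position).
bool should_stop(const std::vector<Point>& pts, std::size_t min_cell_size) {
    return pts.size() <= min_cell_size ||
           squared_aabb_diagonal(pts) <= std::numeric_limits<float>::epsilon();
}
```

Note that the epsilon is used here as an absolute threshold on a squared length, so whether it triggers for nearly identical points depends on the scale of the data.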

Contributor

I am getting consistent segfaults when there are duplicates. I haven't been able to find the source yet, but it looks like duplicate samples can break the node structure.

For example in some runs I have been getting the same node index being processed multiple times, even at different depths:

Node 0, range: [0, 24] (depth: 1), samples: [7, 5, 5, 7, 6, 5, 6, 7, 8, 10, 10, 11, 0, 9, 4, 3, 2, 1, 0, 9, 9, 2, 0, 0]
Node 1, range: [0, 12] (depth: 2), samples: [11, 8, 6, 6, 7, 5, 5, 7, 5, 10, 10, 7]
Node 4, range: [4, 12] (depth: 3), samples: [5, 5, 5, 7, 7, 10, 10, 7]
Node 2, range: [12, 24] (depth: 2), samples: [0, 0, 0, 3, 2, 1, 0, 2, 9, 9, 4, 9]
Node 0, range: [12, 20] (depth: 3), samples: [1, 3, 0, 0, 2, 0, 0, 2]

There also seems to be some corrupted data after the build, like empty leaves, inconsistent child indices, negative child indices, etc.

KdTree:
  MaxNodes: 9223372036854775808
  MaxPoints: 8589934592
  MaxDepth: 32
  PointCount: 12
  SampleCount: 24
  NodeCount: 11
  Samples: [
    11, 8, 6, 6, 5, 5, 5, 7, 7, 10,
    10, 7, 1, 3, 0, 0, 2, 0, 0, 2,
    9, 9, 4, 9]
  Nodes:
    - Type: Inner
      SplitDim: 1
      SplitValue: -0.121189
      FirstChild: 9
    - Type: Leaf
      Start: 20
      Size: 4
    - Type: Inner
      SplitDim: 1
      SplitValue: 0.00595114
      FirstChild: 1701603584
    - Type: Leaf
      Start: 0
      Size: 4
    - Type: Inner
      SplitDim: 1
      SplitValue: 0.664357
      FirstChild: 2002541616
    - Type: Leaf
      Start: 4
      Size: 3
    - Type: Leaf
      Start: 7
      Size: 5
    - Type: Leaf
      Start: 0
      Size: 0
    - Type: Leaf
      Start: 0
      Size: 0
    - Type: Leaf
      Start: 12
      Size: 2
    - Type: Leaf
      Start: 14
      Size: 6
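The kinds of corruption listed above (empty leaves, out-of-range child indices) can be detected mechanically after a build. Below is a minimal sketch of such a check. `DumpNode` and `check_nodes` are hypothetical names that mirror the fields printed in the dump, and the code assumes that nodes are numbered from 0 in listing order and that the two children of an inner node are stored contiguously at `first_child` and `first_child + 1`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flattened node, mirroring the fields printed in the dump.
struct DumpNode {
    bool is_leaf;
    long long first_child; // meaningful for inner nodes only
    long long start;       // meaningful for leaves only
    long long size;        // meaningful for leaves only
};

// Returns one message per detected inconsistency (empty if none).
std::vector<std::string> check_nodes(const std::vector<DumpNode>& nodes,
                                     long long sample_count) {
    std::vector<std::string> errors;
    const long long n = static_cast<long long>(nodes.size());
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        const DumpNode& node = nodes[i];
        const std::string id = "node " + std::to_string(i);
        if (node.is_leaf) {
            if (node.size <= 0)
                errors.push_back(id + ": empty leaf");
            if (node.start < 0 || node.start + node.size > sample_count)
                errors.push_back(id + ": sample range out of bounds");
        } else {
            // Assumption: children live at first_child and first_child + 1,
            // and the root (index 0) is never a child.
            if (node.first_child < 1 || node.first_child + 1 >= n)
                errors.push_back(id + ": child index out of range");
        }
    }
    return errors;
}
```

Applied to the eleven nodes of the dump with a sample count of 24, and under the assumptions stated, this reports nodes 2 and 4 (child index out of range) and nodes 7 and 8 (empty leaf).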

I'm still not sure where the issue might be, but I think there could be some edge cases in the build algorithm when it comes to duplicates. I will try again when I can.

Contributor Author

OK, I ran into similar inconsistent behaviors.
Something I found: reallocating the nodeContainer (when the number of nodes grows beyond the preallocation) breaks the structure: node copies fail, nodes are inconsistently initialized as leaf or internal, intervals are lost, ...

What I don't get is: how is this related to duplicates?!
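One common way a container reallocation corrupts a recursive build is a reference to a node that is kept alive while the container grows. Whether that is what happens in this PR is not established by the discussion, so the following is only a sketch of the general pattern, with hypothetical names (`Node`, `build_rec`, `build`) and a toy split that halves index ranges without looking at coordinates. It accesses nodes by index after every growth, and deliberately under-reserves to force reallocations.

```cpp
#include <cstddef>
#include <vector>

struct Node {
    bool is_leaf = true;
    std::size_t first_child = 0; // inner nodes: children at first_child, first_child + 1
    std::size_t start = 0;       // leaves: first sample of the range
    std::size_t size = 0;        // leaves: number of samples
};

// Halves the range [start, start + size) until it holds at most max_leaf samples.
void build_rec(std::vector<Node>& nodes, std::size_t node_id,
               std::size_t start, std::size_t size, std::size_t max_leaf) {
    // Unsafe alternative: `Node& node = nodes[node_id];` kept alive across the
    // emplace_back calls below would dangle as soon as the vector reallocates.
    if (size <= max_leaf || size <= 1) {
        nodes[node_id].is_leaf = true;
        nodes[node_id].start = start;
        nodes[node_id].size = size;
        return;
    }
    const std::size_t first_child = nodes.size();
    nodes.emplace_back(); // may reallocate
    nodes.emplace_back(); // may reallocate
    nodes[node_id].is_leaf = false; // re-index after growth instead of caching a reference
    nodes[node_id].first_child = first_child;
    const std::size_t half = size / 2;
    build_rec(nodes, first_child, start, half, max_leaf);
    build_rec(nodes, first_child + 1, start + half, size - half, max_leaf);
}

std::vector<Node> build(std::size_t sample_count, std::size_t max_leaf) {
    std::vector<Node> nodes;
    nodes.reserve(1); // deliberately too small, to force reallocations
    nodes.emplace_back();
    build_rec(nodes, 0, 0, sample_count, max_leaf);
    return nodes;
}
```

Because every access goes through `nodes[node_id]`, the result does not depend on how much memory was reserved up front; the reservation then only affects performance.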

Contributor

I'm also unsure, since it seems like the algorithm should work with duplicates. I see that the node array allocation is based on point_count() instead of sample_count(); maybe changing this could at least fix the realloc issues?
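To make the difference concrete, here is a small sketch of a node-count bound computed from the number of samples. `max_node_count` is a hypothetical helper, not the library's allocation formula, and it assumes a binary tree in which every inner node has exactly two children and no leaf is empty.

```cpp
#include <cstddef>

// Upper bound on the number of nodes of a binary tree whose leaves each hold
// at least one sample: at most `sample_count` leaves, hence at most
// 2 * sample_count - 1 nodes. An empty input still needs a root node.
constexpr std::size_t max_node_count(std::size_t sample_count) {
    return sample_count == 0 ? 1 : 2 * sample_count - 1;
}
```

With the values from the dump above (12 points, 24 samples), the bound is 23 when computed from the point count and 47 when computed from the sample count, so a reservation derived from the point count can be too small when samples outnumber points.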

Contributor Author

The fix could work indeed.
But it does not solve the underlying reallocation problem; we need to fix that too.


2 participants