Warn about loss of precision in FDBSCAN-DenseBox
For some problems and eps, FDBSCAN-DenseBox may suffer from severe loss
of precision when computing cellBox(). This may result in wrong DBSCAN
results.

Initially, the problem was observed on a full NGSIM dataset:
```
$ ./ArborX_DBSCAN.exe --binary \
    --filename /data/data_and_logs/2021_02_20_dbscan_datasets/mustafa_2019/ngsim.arborx \
    --print-dbscan-timers --core-min-size 2 --eps 1.0 --max-num-points 90000 \
    --impl fdbscan-densebox --verify
<...>
Core point is marked as noise: 41888 [-1]
Core point is marked as noise: 17027 [-1]
Core point is marked as noise: 43104 [-1]
```
Tracking down the issue revealed that point 2279 (part of a dense
cell) was finding point 10369 (not part of a dense cell), but the
reverse was not true (i.e., 10369 was not finding the dense cell that
2279 was in). Studying the dense boxes, I observed that the problematic
box had bounds
  [[6452399.000000,1872182.375000,0.000000],
   [6452399.000000,1872183.000000,0.577350]]
Note the same x coordinate for both the min and max corners. The h in
this run was ~0.57, so the cell's extent in x was lost completely. Even
the y coordinate lost some precision (an extent of 0.625 instead of
~0.57), though not as much.
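
As a rough illustration of why this happens, the sketch below measures the gap between adjacent `float` values near the coordinates above. `spacingAbove` is a hypothetical helper written for this note, not part of ArborX. Near x = 6452399 the gap is 0.5, which is about the same size as h, and near y = 1872182.375 it is 0.125, which is consistent with x collapsing and y being only partly degraded.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper (not ArborX API): distance from x to the next
// representable float above it, i.e. the local resolution of float at x.
inline float spacingAbove(float x)
{
  assert(std::isfinite(x));
  return std::nextafter(x, std::numeric_limits<float>::infinity()) - x;
}
```

Calling `spacingAbove(6452399.0f)` and `spacingAbove(1872182.375f)` gives the two gaps quoted above; any cell corner computed in `float` has to land on that grid.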
aprokop committed Oct 11, 2022
1 parent c3fd091 commit 95595f7
Showing 1 changed file with 20 additions and 0 deletions.
20 changes: 20 additions & 0 deletions src/details/ArborX_DetailsCartesianGrid.hpp
@@ -71,6 +71,9 @@ struct CartesianGrid
{
auto min = _bounds.minCorner();
decltype(min) max;

// This code may suffer from loss of precision depending on the problem
// bounds and h. We try to detect this case in the constructor.
for (int d = 0; d < DIM; ++d)
{
auto i = cell_index % _n[d];
@@ -118,6 +121,23 @@ struct CartesianGrid
m /= _n[d - 1];
ARBORX_ASSERT(_n[d] < m);
}

// Catch a potential loss of precision that may happen in cellBox() and can
// lead to wrong results.
//
// The machine precision by itself is not sufficient. In some experiments
// run with a full NGSIM dataset, multiplier values below 3 could still
// produce wrong results. This may still not be conservative enough, but all
// runs passed verification when this warning was not triggered.
float constexpr eps = 5 * std::numeric_limits<float>::epsilon();
for (int d = 0; d < DIM; ++d)
{
if (std::abs(_h[d] / min_corner[d]) < eps)
throw std::runtime_error(
"ArborX exception: FDBSCAN-DenseBox algorithm will experience loss "
"of precision, undetectably producing wrong results. Please switch "
"to using FDBSCAN.");
}
}

Box _bounds;
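
For experimenting with the threshold outside of ArborX, the check added to the constructor can be sketched as a free function. This is only an illustration: `cellSizeTooSmall` and its `std::array` parameters are hypothetical stand-ins for the `min_corner` and `_h` members used in the diff, and the multiplier of 5 is copied from the commit.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical standalone version of the constructor check (not ArborX API).
// Returns true if, in some dimension, the cell size h is so small relative to
// the minimum corner that float arithmetic may not resolve individual cells.
template <std::size_t DIM>
bool cellSizeTooSmall(std::array<float, DIM> const &min_corner,
                      std::array<float, DIM> const &h)
{
  // Same threshold as in the commit: 5 times the float machine epsilon.
  constexpr float eps = 5 * std::numeric_limits<float>::epsilon();
  for (std::size_t d = 0; d < DIM; ++d)
    if (std::abs(h[d] / min_corner[d]) < eps)
      return true;
  return false;
}
```

With the corner of the problematic cell quoted in the commit message used as a stand-in for the grid's minimum corner (an assumption, since the actual global bounds are not given) and h = 0.57735, the ratio in x is roughly 9e-8, below the threshold of about 6e-7, so the function reports a problem. A zero coordinate yields an infinite ratio and therefore never triggers the check.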
