
Protecting against trouble on invalid limit values #70

Closed
springmeyer opened this issue Mar 10, 2018 · 9 comments

Comments

@springmeyer (Contributor)

I've done some poking at what happens when invalid limit values are passed. I've found that:

limit=0 segfaults the process

Replicate with:

diff --git a/bench/rules.js b/bench/rules.js
index 4587f25..080f471 100644
--- a/bench/rules.js
+++ b/bench/rules.js
@@ -9,7 +9,7 @@ module.exports = [
   {
     description: 'pip: many building polygons',
     queryPoint: [120.9667, 14.6028],
-    options: { radius: 0 },
+    options: { radius: 0 , limit: 0},
     tiles: [
       { z: 16, x: 54789, y: 30080, buffer: fs.readFileSync('./test/fixtures/manila-buildings-16-54789-30080.mvt')}
     ]

Then run node bench/vtquery.bench.js --iterations 50 --concurrency 1

limit=1000000000 hangs the process

Likely due to a large allocation.

Replicate with:

diff --git a/bench/rules.js b/bench/rules.js
index 4587f25..080f471 100644
--- a/bench/rules.js
+++ b/bench/rules.js
@@ -9,7 +9,7 @@ module.exports = [
   {
     description: 'pip: many building polygons',
     queryPoint: [120.9667, 14.6028],
-    options: { radius: 0 },
+    options: { radius: 0 , limit: 1000000000},
     tiles: [
       { z: 16, x: 54789, y: 30080, buffer: fs.readFileSync('./test/fixtures/manila-buildings-16-54789-30080.mvt')}
     ]

Then run node bench/vtquery.bench.js --iterations 50 --concurrency 1

limit=1000000000000 overflows

The error is Error: 'limit' must be a positive number, which is better than a crash, but it indicates that the number has overflowed to a negative value, which should be protected against. Instead we should set a reasonable maximum and throw if the value exceeds it (a rough sketch of such a check follows the replication steps below).

Replicate with:

diff --git a/bench/rules.js b/bench/rules.js
index 4587f25..dfa5866 100644
--- a/bench/rules.js
+++ b/bench/rules.js
@@ -9,7 +9,7 @@ module.exports = [
   {
     description: 'pip: many building polygons',
     queryPoint: [120.9667, 14.6028],
-    options: { radius: 0 },
+    options: { radius: 0 , limit: 1000000000000},
     tiles: [
       { z: 16, x: 54789, y: 30080, buffer: fs.readFileSync('./test/fixtures/manila-buildings-16-54789-30080.mvt')}
     ]

Then run node bench/vtquery.bench.js --iterations 50 --concurrency 1
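To be concrete about what I mean by a maximum: here is a rough sketch of the kind of bounds check I have in mind. The function name and the cap value are purely illustrative, not the actual vtquery code.

// Illustrative sketch only: validate a user-supplied 'limit' before casting,
// so zero, negative, and absurdly large values are all rejected up front.
#include <cstdint>
#include <stdexcept>

constexpr double max_limit = 1000.0; // whatever cap we settle on

std::uint32_t validate_limit(double limit_in) {
    if (limit_in <= 0.0) {
        throw std::invalid_argument("'limit' must be a positive number");
    }
    if (limit_in > max_limit) {
        throw std::invalid_argument("'limit' is larger than the allowed maximum");
    }
    return static_cast<std::uint32_t>(limit_in);
}

Checking the value while it is still a double, before any cast to an integer type, should also avoid the overflow-to-negative behaviour described above.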

@springmeyer (Contributor, Author)

Forgot to mention, the way I noticed this is that I saw this code:

vtquery/src/vtquery.cpp

Lines 265 to 268 in c3bad9b

results_queue_.reserve(data.num_results);
for (std::size_t i = 0; i < data.num_results; ++i) {
    results_queue_.emplace_back();
}
which pre-allocates a container of N elements based on user input, something we should be very cautious about 🙅‍♂️. @mapsam @flippmoke - can we:

  • rewrite the logic such that we do not need to pre-create results? (a rough sketch of one possible approach follows this list)
  • avoid allocating more results than are actually returned, which would allow us to remove this conditional: if (feature.distance < std::numeric_limits<double>::max())?
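To illustrate the first bullet (purely a sketch with made-up types and names, not the real vtquery code), I am imagining something along the lines of appending only genuine hits and trimming afterwards:

// Sketch only: collect real hits, then keep at most num_results of them.
#include <algorithm>
#include <cstddef>
#include <vector>

struct ResultObject {       // stand-in for the real result type
    double distance;
};

std::vector<ResultObject> collect_results(std::vector<double> const& hit_distances,
                                           std::size_t num_results) {
    std::vector<ResultObject> results;        // no reserve() driven by user input
    for (double d : hit_distances) {
        results.push_back(ResultObject{d});   // only real hits are ever stored
    }
    std::sort(results.begin(), results.end(),
              [](ResultObject const& a, ResultObject const& b) {
                  return a.distance < b.distance;
              });
    if (results.size() > num_results) {
        results.resize(num_results);          // never hold more than we return
    }
    return results;
}

Since nothing is pre-filled with sentinel distances in that version, there would be nothing for the std::numeric_limits<double>::max() comparison to filter out.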

@mapsam (Member) commented Apr 6, 2018

@springmeyer taking a look at this now. Good catch!

I think we can bump the validation to make sure the limit value is > 0 and set a maximum of 100,000 results to avoid huge results.

rewrite the logic such that we do not need to pre-create results?

I'm not sure how to go about doing this. Right now we need to be able to iterate through results in order to sort and dedupe, so the container they exist in must be created. Perhaps we can only pre-allocate up to a certain number?

@springmeyer (Contributor, Author)

I think we can bump the validation to make sure the limit value is > 0 and set a maximum of 100,000 results to avoid huge results.

100,000 feels too large. It might be totally fine and perhaps I'm feeling overly cautious. But I worry about how much memory that would take. Can you assuage my fears here by calculating the exact memory requirement for when num_results = 100,000? The way you would do this is:

100,000 x <size of ResultObject> == number of bytes in memory (roughly)

Ideally we don't allocate more than 5 MB here. Anything more could really add up (since we'd be allocating 5 MB for every thread).
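(To put a rough number on that budget: if, hypothetically, sizeof(ResultObject) came out around 100 bytes, then 100,000 x 100 bytes ≈ 10 MB per thread, double the 5 MB budget, whereas 1,000 x 100 bytes ≈ 0.1 MB would be comfortably inside it.)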

To estimate the size of ResultObject, temporarily put this somewhere in your code:

std::clog << "Size of ResultObject: " << sizeof(ResultObject) << "\n";

And capture that value. That is the size, in bytes, that each instance of that class takes in memory. This does not capture all the memory that would be needed for a class instance (since ResultObject does have members that dynamically allocate) but it does capture the contiguous memory need, and that's the important one because std::vector<ResultObject>::reserve(100000) is going to allocate a contiguous chunk of memory all at once. This is trivially fast for small allocations. For large allocations (MBs) it might cause the allocator to hang momentarily, which could hurt the performance of every thread (including the main event loop) because everything needing an allocation (in every thread) would have to wait until this allocation was made.

So, what I'm getting at is that 100,000 might have DDoS potential. And if not (and I'm being overly scared), it might still cause a lot of memory to be reserved contiguously when not actually needed, which would be wasteful. E.g. if limit=100,000 is passed but the tiles only contain 10 features, we still end up allocating the same amount of memory.

I'm not sure how to go about doing this.

Neither am I. While I don't like this pre-allocation, I'm not following the code well enough to know how to remove it. Was hoping you could find a way.

Right now we need to be able to iterate through results in order to sort and dedupe, so the container they exist in must be created. Perhaps we can only pre-allocate up to a certain number?

What would happen if you only pre-allocated 1 ResultObject? Would the logic fall apart and the results be incorrect?

@mapsam (Member) commented Apr 6, 2018

What would happen if you only pre-allocated 1 ResultObject? Would the logic fall apart and the results be incorrect?

Nope, the reason we were pre-allocating is to save ourselves from on-the-fly allocation every time we add to the vector. Perhaps it's worth benching what "no preallocation" looks like; perhaps it's trivial.

@springmeyer (Contributor, Author)

Nope, the reason we were pre-allocating is to save ourselves from on-the-fly allocation every time we add to the vector.

Ah, that is great news. I'd assumed wrongly then that the pre-allocation was fundamental to the algorithm.

Perhaps it's worth benching what "no preallocation" looks like; perhaps it's trivial.

I think it is. The rule book is to pre-allocate when:

  • a. you can accurately predict the size of a final object, and
  • b. that object actually is going to end up that size

If a is not true, don't pre-allocate. If b is not true, then pre-allocation might help you or might hurt you, so don't pre-allocate. The on-the-fly allocation scenario is designed to do a good job when either a or b are not true. Now, maybe what this means is that we:

  • don't want to pre-allocate, and
  • we need to think about how to reduce the cost of on-the-fly allocation

With a std::vector, on-the-fly allocation can be expensive when full re-allocation happens. "Full reallocation" happens when the container needs to be re-sized internally to keep the memory contiguous. So, if benchmarking shows that on-the-fly allocation is slower, maybe the solution is to change to a container that does not promise to be contiguous, i.e. something that is not std::vector. That might reduce the re-allocation overhead at the cost of slightly slower sorts.
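As a concrete (and purely illustrative) example of that last idea: std::deque grows in fixed-size chunks and never relocates existing elements, so appending avoids std::vector's occasional full reallocation, at the cost of non-contiguous storage and therefore slightly slower sorts. A sketch with a stand-in result type:

// Sketch only: a deque never copies existing elements when it grows.
#include <algorithm>
#include <deque>

struct ResultObject {   // stand-in for the real result type
    double distance;
};

std::deque<ResultObject> collect(std::deque<double> const& hit_distances) {
    std::deque<ResultObject> results;        // deque has no reserve(); memory is acquired chunk by chunk
    for (double d : hit_distances) {
        results.push_back(ResultObject{d});  // amortized O(1), no relocation of existing elements
    }
    std::sort(results.begin(), results.end(),   // std::sort works on deque's random-access iterators
              [](ResultObject const& a, ResultObject const& b) {
                  return a.distance < b.distance;
              });
    return results;
}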

@mapsam (Member) commented Apr 6, 2018

@springmeyer I lied to you, the reservation is important to the sorting algorithm, in that it pops and pushes objects into the vector and sorts against the empty result objects, which have a distance set to an unrealistically huge number to compare against.

Looking at the size of a ResultObject:

Size of ResultObject: 88 bytes

So for every 1000 items we pre-allocate, that uses 0.088 MB of memory per thread.
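For reference, this is roughly the pattern I mean (a simplified stand-in, not the actual vtquery code): the queue is pre-filled with num_results placeholder entries whose distance is std::numeric_limits<double>::max(), real hits are sorted in against those sentinels, and the leftover sentinels are what the feature.distance < max() check filters out later.

// Simplified stand-in for the sentinel pattern described above.
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

struct ResultObject {
    double distance = std::numeric_limits<double>::max(); // sentinel by default
};

void insert_hit(std::vector<ResultObject>& results_queue, double hit_distance) {
    // replace the current worst (largest-distance) entry if this hit is closer
    auto worst = std::max_element(results_queue.begin(), results_queue.end(),
                                  [](ResultObject const& a, ResultObject const& b) {
                                      return a.distance < b.distance;
                                  });
    if (worst != results_queue.end() && hit_distance < worst->distance) {
        worst->distance = hit_distance;
    }
}

int main() {
    std::size_t num_results = 5;                 // comes from the user's 'limit'
    std::vector<ResultObject> results_queue;
    results_queue.reserve(num_results);          // the reservation in question
    for (std::size_t i = 0; i < num_results; ++i) {
        results_queue.emplace_back();            // sentinel placeholders
    }
    insert_hit(results_queue, 12.5);
    // untouched sentinels are skipped later via
    // if (feature.distance < std::numeric_limits<double>::max()) { ... }
}

(The 88 bytes above is for the real ResultObject, which carries more members than this stand-in.)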

@mapsam modified the milestone: v0.2.0 Apr 13, 2018
@springmeyer (Contributor, Author)

@mapsam, okay how about we clamp the max at 1000 then? 0.088 MB per thread is totally okay. But larger (100,000) could get us into memory trouble.

@mapsam (Member) commented Jun 6, 2018

Setting the maximum results limit to 1000 - ready to rock over in #74

@mapsam (Member) commented Jun 6, 2018

Thanks for opening this @springmeyer, closing now, and we can reopen if 1000 appears too tight of a limitation.

@mapsam closed this as completed Jun 6, 2018