Fix builtin _sum reduce function #1574
Merged
Trivially reproduced with the Python script below. I wanted to get the PR out, but I'm also going to try to work out how to test this a bit more directly.

#!/usr/bin/env python

import base64
import json
import os
import random

import requests

S = requests.session()
S.auth = ("adm", "pwd")
S.headers["Content-Type"] = "application/json"

BASE = "http://127.0.0.1:15986/"
DB = BASE + "foo"
DDOC = DB + "/_design/bar"
VIEW = DDOC + "/_view/bam"
SIG = "1d2f27b7abd18837599f078927c7bc1f"


def gen_key():
    # Random 50-byte key, so every document contributes unique keys.
    data = os.urandom(50)
    return base64.b64encode(data)


def gen_device():
    return {
        "tx_bytes": random.randint(1, 8192),
        "rx_bytes": random.randint(1, 8192)
    }


def gen_data():
    ret = {}
    for i in range(1024):
        ret[gen_key()] = gen_device()
    return ret


def load_docs():
    # 25 batches of 100 documents each.
    for i in range(25):
        print "Generating: %d - %d" % (i*100, (i+1)*100)
        docs = []
        for j in range(100):
            docs.append({
                "_id": "%06d" % ((i*100) + j),
                "data": gen_data()
            })
        body = json.dumps({"docs": docs})
        r = S.post(DB + "/_bulk_docs", data=body)
        r.raise_for_status()


def load_ddoc():
    ddoc = {
        "language": "javascript",
        "views": {
            "bam": {
                "map": """
                    function(doc) {
                        emit(doc._id, doc.data);
                    }""",
                #"reduce": "_sum"
                "reduce": """
                    function(keys, values) {
                        var obj = {};
                        for(var v in values) {
                            for(var k in values[v]) {
                                log(k);
                                obj[k] = values[v][k];
                            }
                        }
                        return obj;
                    }
                """
            }
        }
    }
    body = json.dumps(ddoc)
    r = S.put(DB + "/_design/bar", data=body)
    r.raise_for_status()


def get_size():
    # Size on disk of the view index file for this design document.
    fname = "dev/lib/node1/data/.foo_design/mrview/%s.view" % SIG
    return os.stat(fname).st_size


def get_task():
    # Return the first active task, if any.
    r = S.get(BASE + "/_active_tasks")
    for t in r.json():
        return t


def main():
    S.delete(DB)
    r = S.put(DB)
    r.raise_for_status()
    load_docs()
    load_ddoc()
    r = S.get(VIEW, params={"limit": 0})
    r.raise_for_status()
    print "Initial build: %d" % get_size()
    r = S.post(DDOC + "/_compact")
    r.raise_for_status()
    print get_task()


if __name__ == "__main__":
    main()
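To illustrate why documents like the ones generated above are a problem, here is a rough Python 3 sketch of summing objects key by key. It is a simplified model, not the Erlang implementation: `sum_values` and `gen_sample` are hypothetical names, and the key generator is a seeded stand-in for `gen_key()` so the run is repeatable.

```python
import json
import random

rng = random.Random(0)  # seeded so the sketch is repeatable


def sum_values(a, b):
    """Simplified model of an object-aware sum: numbers add, objects merge."""
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, value in b.items():
            out[key] = sum_values(out[key], value) if key in out else value
        return out
    return a + b


def gen_sample(n_keys):
    # Hypothetical stand-in for gen_data(): unique random keys per document.
    return {
        "%0100x" % rng.getrandbits(400): {
            "tx_bytes": rng.randint(1, 8192),
            "rx_bytes": rng.randint(1, 8192),
        }
        for _ in range(n_keys)
    }


values = [gen_sample(16) for _ in range(8)]
total = values[0]
for value in values[1:]:
    total = sum_values(total, value)

# Compare the serialized size of the inputs with that of the reduced value.
in_size = sum(len(json.dumps(v)) for v in values)
out_size = len(json.dumps(total))
print(len(total), in_size, out_size)
```

Because no two documents share a key, the merged object keeps every key from every input, so the "reduced" value is about as large as all of its inputs combined.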
@nickva pointed me to the module tests which I'll add tonight or tomorrow before this is merged.
            {<<"error">>, <<"builtin_reduce_error">>},
            {<<"reason">>, Msg}
        ]};
    "log" when OutSize > 4096 andalso OutSize > 2 * InSize ->
nickva
Aug 23, 2018
Contributor
Use the Overflowed variable here as well; it also has the updated, correct logic for overflow.
To double check: Line 41 in bf72d61
davisp
Aug 23, 2018
Author
Member
Ah, good catch. I refactored that out intending to have it both places.
c97121d to 84505f0
With With
as expected and also see log error lines like:
EUnit tests pass:
The builtin _sum reduce function has no protection against overflowing reduce values. Users can emit objects with enough unique keys to cause the builtin _sum to create objects that are exceedingly large in the inner nodes of the view B+Tree. This change adds the same logic that applies to JavaScript reduce functions to check if a reduce function is properly reducing its input.
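A minimal Python 3 sketch of that kind of check follows, under stated assumptions. The function name, its parameters, and the message text are hypothetical. The comparison (output above 4096 bytes and more than half the input size) is modeled on the JavaScript reduce limit as I understand it; the exact expression behind the Overflowed variable mentioned in the review is not shown in this conversation. The error object keys mirror the quoted diff fragment.

```python
import json


def check_sum_overflow(in_size, out_size, result, reduce_limit="true", log=print):
    """Return result, or an error object if the output did not shrink enough."""
    # Assumed condition: large output that is more than half the input size.
    overflowed = out_size > 4096 and out_size * 2 > in_size
    if not overflowed:
        return result
    # Hypothetical message text, for illustration only.
    msg = "reduce output too large: input %d bytes, output %d bytes" % (
        in_size, out_size)
    if reduce_limit == "true":
        return {"error": "builtin_reduce_error", "reason": msg}
    if reduce_limit == "log":
        log(msg)  # warn, but still return the oversized value
    return result


# 1000 single-key objects with distinct keys: merging them cannot shrink.
values = [{"key%d" % i: 1} for i in range(1000)]
merged = {}
for value in values:
    merged.update(value)

in_size = sum(len(json.dumps(v)) for v in values)
out_size = len(json.dumps(merged))
outcome = check_sum_overflow(in_size, out_size, merged)
print(sorted(outcome))
```

With the limit set to "log" the oversized value is still returned and only a message is emitted, which matches the "log" clause visible in the diff fragment; any other setting passes the value through unchanged in this sketch.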
Testing recommendations
make check
Checklist