segmentation fault #26
Hi Hamper, Thanks. |
This test script reproduces the error: https://dl.dropboxusercontent.com/u/12199274/github/test.tar.gz Client machine: Intel® Core™ i7-4770 Quadcore Haswell, 32 GB DDR3 RAM, Ubuntu 12.04 |
Hamper, Thanks. |
I can still reproduce it with this script (Ubuntu 14.10, Node.js 0.10.35, aerospike client 1.0.26, Aerospike server 3.4.1 with default settings). Running node main.js -P 1 -I 10 -O 100000 from the benchmark also reproduces the error. |
We're also getting sporadic segmentation faults under high loads. |
We set up totally fresh servers on EC2 (r3.xlarge instances) running Ubuntu, with only the bare minimum installed to get the Node.js client and the bare minimum to get Aerospike installed, to verify there wasn't anything odd conflicting.

Both are: Ubuntu 14.04.1 LTS and Node.js version: v0.10.36

In case it impacts anything, the Aerospike server was initialized with 10m keys. It doesn't seem like the Aerospike server would have any impact here, as it's a client error. The Aerospike server is remote to the client as well.

We're considering moving some of our higher-load data over, but this would be accessed through the Node.js client, so it really needs to be rock solid under heavy, highly concurrent load.

Using the script below we can get the faults pretty consistently. Sometimes it'll nearly finish, other times the fault happens right away. Obviously this is hugely stripped down; I just wanted the bare minimum to reproduce it.

var async = require('async');
var aerospike = require('aerospike');
var status = aerospike.status;
var client = aerospike.client({
  hosts: [ { addr: '10.10.10.220', port: 3000 } ]
});
var key = {
  ns: 'store_disk_ebs',
  set: 'benchmark',
  key: 'keytest'
};
var iteration_count = 100000;
var concurrency_count = 10;
//connect to server
client.connect(function(err, client) {
  //verify connected ok
  if (err.code != status.AEROSPIKE_OK) throw 'failed connection';
  //loop iteration_count number of times in a series
  async.timesSeries(iteration_count, function(n, iteration) {
    //print iteration count every 1000 iterations
    if (n % 1000 === 0) console.log('iteration: ' + n);
    //for each iteration run this command concurrency_count times in parallel
    async.times(concurrency_count, function(n, concurrent) {
      //get the key
      client.get(key, function() {
        //we don't need to do anything with the result. just complete this command
        concurrent();
      });
    }, iteration); //after the concurrency_count commands are complete, start a new iteration
  });
}); |
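To rule out the async library as a factor, the same load pattern (rounds in series, a fixed number of parallel calls per round) can be sketched in plain JavaScript. This is only a sketch: `runLoad` and its parameters are hypothetical names, not part of the aerospike client, and `op` is assumed to invoke its callback exactly once, asynchronously, as a real client call would.

```javascript
// Hypothetical helper: runs `iterations` rounds in series. Each round starts
// `concurrency` calls of `op` in parallel and waits for all of them to call
// back before the next round begins. `concurrency` is assumed to be >= 1.
function runLoad(iterations, concurrency, op, done) {
  var completed = 0; // total number of callbacks received

  function round(i) {
    if (i === iterations) return done(completed);
    var pending = concurrency; // calls still outstanding in this round
    for (var c = 0; c < concurrency; c++) {
      op(function () {
        completed++;
        // the last callback of the round starts the next round
        if (--pending === 0) round(i + 1);
      });
    }
  }

  round(0);
}
```

With the script above, `op` would be something like `function (cb) { client.get(key, cb); }`, with `runLoad(iteration_count, concurrency_count, op, done)` called from inside the connect callback.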
Thanks for your patience. We have identified the issue: there was a problem with the way we were handling buffers in the V8 layer. We fixed that and made an official release to npm. The latest version is 1.0.28. Please use the latest version and give us your feedback. Thanks. |
Hey Gayathri, version 1.0.28 still seems to contain this error. I am seeing the following in the Amazon Linux log messages: [ 1514.330637] node[21522]: segfault at 4 ip 00007fea72009833 sp 00007fea6bffea88 error 6 in libc-2.17.so[7fea71eda000+19b000] Could you please check? |
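For readers triaging a log line like the one quoted above, here is a minimal sketch that pulls its fields apart. It assumes the line follows the layout shown in that comment; `decodeSegfault` is a hypothetical helper name. On x86 the `error` value is the page-fault error code: bit 0 is set when the page was present (a protection fault), bit 1 when the access was a write, and bit 2 when the fault happened in user mode. Subtracting the mapping's base address from `ip` gives the offset of the faulting instruction inside the library, which can then be looked up with a symbol tool.

```javascript
// Hypothetical helper: decode a Linux x86 "segfault at ..." log line.
// The line layout is assumed from the single example quoted in this thread.
function decodeSegfault(line) {
  var re = /segfault at ([0-9a-f]+) ip ([0-9a-f]+) sp ([0-9a-f]+) error ([0-9a-f]+) in (\S+)\[([0-9a-f]+)\+([0-9a-f]+)\]/;
  var m = re.exec(line);
  if (!m) return null; // not a segfault line in the assumed layout

  var err = parseInt(m[4], 16);   // page-fault error code
  var ip = BigInt('0x' + m[2]);   // instruction pointer at the fault
  var base = BigInt('0x' + m[6]); // base address of the mapped module

  return {
    faultAddress: '0x' + m[1],
    module: m[5],
    offsetInModule: '0x' + (ip - base).toString(16),
    pagePresent: (err & 1) !== 0, // false: the page was not mapped
    write: (err & 2) !== 0,       // true: the access was a write
    userMode: (err & 4) !== 0     // true: the fault happened in user mode
  };
}
```

This only describes what the log line says; it does not by itself identify the root cause in the client.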
@vivekkrbajpai are you using batchGet()? I'm getting segfaults in this function on 1.0.28 and not with get() |
@ryanwitt Thanks for the input regarding batch_get, I'll try reproducing the segfault with batch_get and root cause the issue. Thanks |
@GayathriKaliyamoorthy @ryanwitt Yeah, I am using batchGet(). Under highly concurrent load it produces a segfault. |
@vivekkrbajpai @ryanwitt I have identified a loophole in the batchGet logic: if one of the keys in batchKeys is corrupted or not constructed properly, the driver segfaults. Can you confirm that your application always sends well-constructed batchKeys to the Aerospike Node.js driver? I also tried to reproduce a segfault under heavy load, but could not. Could you share a sample code snippet so that I can work on reproducing the segfault under high load? Thanks |
Yups my keys are well formed and not corrupted or empty.
|
They're all of the form:
var keys = [
  { ns: 'namespace', set: 'some_set', key: 'some, possibly very long key' },
  ...
];
Is there any limit to the length of the key itself? |
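Since a malformed key in a batch was reported to crash the driver, one defensive option is to check the key array on the application side before calling batchGet(). The sketch below is an assumption-laden illustration, not the driver's own validation: `isWellFormedKey` and `findMalformedKeys` are hypothetical names, and the rules (non-empty string `ns`, string `set`, and a `key` that is a non-empty string or an integer) are inferred from the key shape shown in this thread. Other key types the client may accept are not covered.

```javascript
// Hypothetical check, inferred from the key shape used in this thread:
// { ns: <non-empty string>, set: <string>, key: <non-empty string | integer> }
function isWellFormedKey(k) {
  if (k === null || typeof k !== 'object') return false;
  if (typeof k.ns !== 'string' || k.ns.length === 0) return false;
  if (typeof k.set !== 'string') return false;
  if (typeof k.key === 'string') return k.key.length > 0;
  if (typeof k.key === 'number') {
    return isFinite(k.key) && Math.floor(k.key) === k.key;
  }
  return false;
}

// Returns the indexes of entries that fail the check, so the caller can log
// or drop them instead of handing them to the native layer.
function findMalformedKeys(keys) {
  var bad = [];
  for (var i = 0; i < keys.length; i++) {
    if (!isWellFormedKey(keys[i])) bad.push(i);
  }
  return bad;
}
```

An application could call `findMalformedKeys(keys)` and only issue the batch request when the result is empty.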
Could you give the approximate size of each batch request? Thanks |
It's about 5 to 10 keys in each batch.
|
@Hamper @courtneycouch One of our customers reported that, after the fix, the driver ran without any segfault for 19 continuous hours. Could you also confirm this, please? Thanks |
@vivekkrbajpai @ryanwitt Could you please open another issue for the segfault in the batchGet API? That will make it easier for us to track the issues. Thanks |
Sure |
It works, thanks. |
I have segfaults in the client under high load (the bin content is a map object like {key1: timestamp, key2: timestamp, ...}):
#0 0x00007ffff6c5ead0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff6c60146 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007ffff6c62c95 in malloc () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007ffff7783ded in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007ffff6994343 in prepare (args=...) at ../src/main/client/put.cc:88
#5 0x00007ffff699905b in async_invoke (args=..., prepare=0x7ffff69942d0 <prepare(v8::Arguments const&)>, execute=0x7ffff6994140 <execute(uv_work_t*)>, respond=0x7ffff6993e70 <respond(uv_work_t*, int)>) at ../src/main/util/async.cc:37
#6 0x00007ffff6994c1e in AerospikeClient::Put (args=...) at ../src/main/client/put.cc:290
#7 0x00003f02aad1ad99 in ?? ()
#8 0x00007fffffffaad8 in ?? ()
#9 0x00007fffffffaaf0 in ?? ()
#10 0x0000000000000003 in ?? ()
#11 0x0000000000000000 in ?? ()
#0 0x00007ffff6c5ead0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff6c60146 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007ffff6c62c95 in malloc () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007ffff7783ded in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007ffff6995a98 in prepare (args=...) at ../src/main/client/select.cc:90
#5 0x00007ffff699905b in async_invoke (args=..., prepare=0x7ffff6995a30 <prepare(v8::Arguments const&)>, execute=0x7ffff69958e0 <execute(uv_work_t*)>, respond=0x7ffff6995620 <respond(uv_work_t*, int)>) at ../src/main/util/async.cc:37
#6 0x00007ffff699630e in AerospikeClient::Select (args=...) at ../src/main/client/select.cc:294
#7 0x0000392b8fe53339 in ?? ()
#8 0x00007fffffffde98 in ?? ()
#9 0x00007fffffffdeb0 in ?? ()
#10 0x0000000000000003 in ?? ()
#11 0x0000000000000000 in ?? ()