Thread safety problems? #45
Description
I have created a heavily threaded test rig to try and point out what I think are thread-safety problems in aws-sdk-core. Before diving too far into investigation I wanted to throw this a bit wider to see if I'm missing something.
These tests were carried out in the following environment:
$ uname -a
Linux myitcv-virtual-machine 3.11.0-17-generic #31-Ubuntu SMP Mon Feb 3 21:52:43 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
$ ruby -v
rubinius 2.2.5 (2.1.0 e543ba32 2014-02-08 JI) [x86_64-linux-gnu]
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 4
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 70
Stepping: 1
CPU MHz: 2594.193
BogoMIPS: 5188.38
Hypervisor vendor: VMware
Virtualisation type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 6144K
L4 cache: 131072K
NUMA node0 CPU(s): 0-3
Source code for v1 of the test rig and the accompanying Gemfile are behind those links. Run bundle install to get up and running.
- The reason for using celluloid is that we are building a process atop celluloid, hence the test is more fair (but admittedly not fully stripped back to bare Ruby/Rubinius)
- access_key_id etc will need to be populated before using test.rb
- The code creates a pool of 50 threads, then makes 100 async calls into that pool
- Each call makes a call to DynamoDB to list tables
- After bundle install, ruby test.rb (assuming you have the right Ruby interpreter set via rbenv etc) should be enough
- Yes, this line could be made more efficient, but leaving it as such makes the thread safety problem more apparent (see the later discussion about the revised versions v2 and v3)
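For readers without the linked source, the shape of the rig described above can be approximated in plain Ruby. This is only a sketch under stated assumptions: it uses bare threads and a Queue rather than celluloid, `StubClient` is a hypothetical stand-in for the DynamoDB client (so no credentials or network are involved), and it assumes v1 builds a new client on every call, which is what the contrast with v2 and v3 suggests.

```ruby
require "thread"

# Hypothetical stand-in for the DynamoDB client. The real rig calls
# list_tables through aws-sdk-core; this stub only mimics the call shape.
class StubClient
  def list_tables
    { table_names: [] }
  end
end

POOL_SIZE = 50   # worker threads, as in the rig described above
CALLS     = 100  # async calls pushed into the pool

def run_rig(pool_size: POOL_SIZE, calls: CALLS)
  jobs    = Queue.new
  results = Queue.new

  calls.times { |i| jobs << i }
  pool_size.times { jobs << :stop } # one stop marker per worker

  workers = Array.new(pool_size) do
    Thread.new do
      while (job = jobs.pop) != :stop
        begin
          # Assumption: a new client per call, mirroring the deliberately
          # inefficient line in v1.
          StubClient.new.list_tables
          results << [job, :ok]
        rescue => e
          # Record the exception class so intermittent failures can be counted.
          results << [job, e.class]
        end
      end
    end
  end

  workers.each(&:join)
  Array.new(results.size) { results.pop }
end

outcomes = run_rig
summary = outcomes.group_by { |_, status| status }.map { |k, v| [k, v.size] }.to_h
puts summary.inspect
```

With the stub every call succeeds. Swapping in a real client is what would surface any intermittent exceptions, and tallying them by class makes it easier to see which failures recur.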
This should be as vanilla as it gets, yet I've been hitting three types of exception, though not consistently, which is what leads me to believe there's a thread-safety issue. They are:
Looking at the top of the call stack of exception 1, we are taken to this code. There are lots of class instance variables here which I don't believe are thread safe, unless I'm missing something about how this gets called?
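To illustrate the general concern (this is not the aws-sdk-core code, just a generic sketch with a hypothetical `Registry` class): a lazily initialised class-level instance variable such as `@cache ||= build` is a check-then-set, so on an interpreter with truly parallel threads, two threads can both see nil and both run the initialiser. Guarding the initialisation with a Mutex is one common way to make it safe.

```ruby
# Hypothetical class, used only to illustrate the pattern.
class Registry
  @lock   = Mutex.new
  @cache  = nil
  @builds = 0

  class << self
    attr_reader :builds

    # Without the lock, `@cache ||= build` could run `build` more than once
    # when several threads arrive before the first assignment completes.
    def cache
      @lock.synchronize { @cache ||= build }
    end

    private

    # Counts invocations so we can check the initialiser ran exactly once.
    def build
      @builds += 1
      { loaded: true }
    end
  end
end

threads = Array.new(50) { Thread.new { Registry.cache } }
threads.each(&:join)
puts "build ran #{Registry.builds} time(s)"
```

Because every access goes through the same lock, the initialiser runs once regardless of how the 50 threads interleave. Whether the code at the top of the stack trace needs this depends on whether it is only ever reached during single-threaded load time.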
v2 and v3 present alternatives which create one Aws::DynamoDB instance per thread and one globally, respectively. Both suffer similar issues to varying degrees. The one regularly occurring error common to all three versions is point 3 above, the invalid signature error.
Before we look any further, is there an assumed usage pattern here? i.e. should one create a single, global Aws::DynamoDB instance, or one per thread, or per call?
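To make the three candidate patterns concrete, here is a minimal sketch using a hypothetical `FakeClient` in place of Aws::DynamoDB. The helper names are made up for illustration. The per-thread variant uses `thread_variable_get`/`thread_variable_set` rather than `Thread.current[]`, since the latter is fiber-local, which may matter under a fiber-based library.

```ruby
# Hypothetical stand-in for the real client; counts how many get built.
class FakeClient
  @count = 0
  @lock  = Mutex.new

  class << self
    attr_reader :count

    def track
      @lock.synchronize { @count += 1 }
    end
  end

  def initialize
    self.class.track
  end

  def list_tables
    []
  end
end

# Pattern 1: a single global instance shared by every thread.
GLOBAL_CLIENT = FakeClient.new

# Pattern 2: one instance per thread, created lazily on first use.
def per_thread_client
  thread = Thread.current
  client = thread.thread_variable_get(:fake_client)
  unless client
    client = FakeClient.new
    thread.thread_variable_set(:fake_client, client)
  end
  client
end

# Pattern 3: a new instance for every call.
def per_call_client
  FakeClient.new
end

threads = Array.new(4) do
  Thread.new { 3.times { per_thread_client.list_tables } }
end
threads.each(&:join)
after_per_thread = FakeClient.count

3.times { per_call_client.list_tables }
puts "instances built: #{FakeClient.count}"
```

The global pattern is only safe if the client itself is thread safe; the per-thread and per-call patterns avoid sharing an instance but still share any class-level state, which is why all three could plausibly show similar errors.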
Any thoughts on what the issue is here?
Are any of these issues potentially related to #43?