Potential infinite loop from multithreaded use of HashMap by way of "skip" HashSet in SaltScanner? #823

Open
thatsafunnyname opened this Issue Jun 30, 2016 · 25 comments

Projects

None yet

4 participants

@thatsafunnyname
Contributor

Hello and thank you for opentsdb,

We are using opentsdb-2.2.0 with

tsd.storage.salt.buckets = 30
tsd.storage.salt.width = 1

and mapr-hadoop-core-2.7.0.37549.GA-1.x86_64, both using java-1.7.0-openjdk-1.7.0.101.x86_64 on RHEL6.6 on 2.6.32-573.12.1.el6.x86_64

We see the opentsdb process occasionally spinning CPU in user time, logs from querying the api/stats/threads HTTP interface show a thread resizing a HashMap at these times. We do not have a reproducer. We have a monit process that detects the high CPU and restarts opentsdb.

I read http://mailinator.blogspot.co.uk/2009/06/beautiful-race-condition.html and was wondering if we are hitting this race, and if this might explain what we see.

I will attach the api/stats/threads log files. Some process stats for opentsdb process (the host has 12 vCPUS):

# TOP PROCESSES sorted by time (counters are /sec) 16:57:00
# PID  User     PR  PPID THRD S   VSZ   RSS CP  SysT   UsrT Pct  AccuTime  RKB  WKB MajF MinF 
96659  opentsdb 20     1  213 S   15G  719M  7  0.01  0.12   0   0:24.21    0    1    0    8

# TOP PROCESSES sorted by time (counters are /sec) 16:58:00
# PID  User     PR  PPID THRD S   VSZ   RSS CP  SysT   UsrT Pct  AccuTime  RKB  WKB MajF MinF 
96659  opentsdb 20     1  213 S   16G    1G  7  1.01 179.33 300   3:24.55    0   30    0 4227

# TOP PROCESSES sorted by time (counters are /sec) 16:59:00
# PID  User     PR  PPID THRD S   VSZ   RSS CP  SysT   UsrT Pct  AccuTime  RKB  WKB MajF MinF 
96659  opentsdb 20     1  213 S   16G    1G  7  0.65 415.73 693  10:20.93    0    1    0    8

# TOP PROCESSES sorted by time (counters are /sec) 17:00:00
# PID  User     PR  PPID THRD S   VSZ   RSS CP  SysT   UsrT Pct  AccuTime  RKB  WKB MajF MinF 
96659  opentsdb 20     1  213 S   16G    1G  7  0.67 416.33 695  17:17.93    0    1    0    9

Thanks in advance for any help you can provide.

( Whilst looking at the stacks I also noticed that opentsdb is using version 1.4.0 of stumbleupon/async (https://github.com/OpenTSDB/opentsdb/blob/e2efe9a4875f8cddd3404e9c6b64f5aab4b72c0a/third_party/suasync/include.mk#L16) and not version 1.4.1 which contains a fix for OpenTSDB/async#4 ( OpenTSDB/async@3552d39 ) I was curious why 1.4.1 is not being used. )

Thread seen resizing HashMap

3dow_16h_58m.log-        "interrupted": false,
3dow_16h_58m.log-        "name": "pool-6-thread-48",
3dow_16h_58m.log-        "priority": 5,
3dow_16h_58m.log-        "stack": [
3dow_16h_58m.log-            "java.util.HashMap.transfer(HashMap.java:601)",
3dow_16h_58m.log:            "java.util.HashMap.resize(HashMap.java:581)",
3dow_16h_58m.log-            "java.util.HashMap.addEntry(HashMap.java:879)",
3dow_16h_58m.log-            "java.util.HashMap.put(HashMap.java:505)",
3dow_16h_58m.log-            "java.util.HashSet.add(HashSet.java:217)",
3dow_16h_58m.log-            "net.opentsdb.core.SaltScanner$ScannerCB$1MatchCB.call(SaltScanner.java:420)",
3dow_16h_58m.log-            "net.opentsdb.core.SaltScanner$ScannerCB$1MatchCB.call(SaltScanner.java:414)",
3dow_16h_58m.log-            "com.stumbleupon.async.Deferred.doCall(Deferred.java:1278)",
3dow_16h_58m.log-            "com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)",
3dow_16h_58m.log-            "com.stumbleupon.async.Deferred.handleContinuation(Deferred.java:1313)",
3dow_16h_58m.log-            "com.stumbleupon.async.Deferred.doCall(Deferred.java:1284)",
3dow_16h_58m.log-            "com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)",
3dow_16h_58m.log-            "com.stumbleupon.async.Deferred.callback(Deferred.java:1005)",
3dow_16h_58m.log-            "com.stumbleupon.async.DeferredGroup.done(DeferredGroup.java:173)",
3dow_16h_58m.log-            "com.stumbleupon.async.DeferredGroup.recordCompletion(DeferredGroup.java:158)",
3dow_16h_58m.log-            "com.stumbleupon.async.DeferredGroup.access$200(DeferredGroup.java:36)",
3dow_16h_58m.log-            "com.stumbleupon.async.DeferredGroup$1NotifyOrdered.call(DeferredGroup.java:97)",
3dow_16h_58m.log-            "com.stumbleupon.async.Deferred.doCall(Deferred.java:1278)",
3dow_16h_58m.log-            "com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)",
3dow_16h_58m.log-            "com.stumbleupon.async.Deferred.callback(Deferred.java:1005)",
3dow_16h_58m.log-            "org.hbase.async.HBaseRpc.callback(HBaseRpc.java:734)",
3dow_16h_58m.log-            "org.hbase.async.MapRThreadPool$RpcRunnable.run(MapRThreadPool.java:230)",
3dow_16h_58m.log-            "java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)",
3dow_16h_58m.log-            "java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)",
3dow_16h_58m.log-            "java.lang.Thread.run(Thread.java:745)"
3dow_16h_58m.log-        ],
3dow_16h_58m.log-        "state": "RUNNABLE",
3dow_16h_58m.log-        "threadID": 109
3dow_16h_59m.log-        "interrupted": false,
3dow_16h_59m.log-        "name": "pool-6-thread-48",
3dow_16h_59m.log-        "priority": 5,
3dow_16h_59m.log-        "stack": [
3dow_16h_59m.log-            "java.util.HashMap.transfer(HashMap.java:601)",
3dow_16h_59m.log:            "java.util.HashMap.resize(HashMap.java:581)",
3dow_16h_59m.log-            "java.util.HashMap.addEntry(HashMap.java:879)",
3dow_16h_59m.log-            "java.util.HashMap.put(HashMap.java:505)",
3dow_16h_59m.log-            "java.util.HashSet.add(HashSet.java:217)",
3dow_16h_59m.log-            "net.opentsdb.core.SaltScanner$ScannerCB$1MatchCB.call(SaltScanner.java:420)",
3dow_16h_59m.log-            "net.opentsdb.core.SaltScanner$ScannerCB$1MatchCB.call(SaltScanner.java:414)",
3dow_16h_59m.log-            "com.stumbleupon.async.Deferred.doCall(Deferred.java:1278)",
3dow_16h_59m.log-            "com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)",
3dow_16h_59m.log-            "com.stumbleupon.async.Deferred.handleContinuation(Deferred.java:1313)",
3dow_16h_59m.log-            "com.stumbleupon.async.Deferred.doCall(Deferred.java:1284)",
3dow_16h_59m.log-            "com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)",
3dow_16h_59m.log-            "com.stumbleupon.async.Deferred.callback(Deferred.java:1005)",
3dow_16h_59m.log-            "com.stumbleupon.async.DeferredGroup.done(DeferredGroup.java:173)",
3dow_16h_59m.log-            "com.stumbleupon.async.DeferredGroup.recordCompletion(DeferredGroup.java:158)",
3dow_16h_59m.log-            "com.stumbleupon.async.DeferredGroup.access$200(DeferredGroup.java:36)",
3dow_16h_59m.log-            "com.stumbleupon.async.DeferredGroup$1NotifyOrdered.call(DeferredGroup.java:97)",
3dow_16h_59m.log-            "com.stumbleupon.async.Deferred.doCall(Deferred.java:1278)",
3dow_16h_59m.log-            "com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)",
3dow_16h_59m.log-            "com.stumbleupon.async.Deferred.callback(Deferred.java:1005)",
3dow_16h_59m.log-            "org.hbase.async.HBaseRpc.callback(HBaseRpc.java:734)",
3dow_16h_59m.log-            "org.hbase.async.MapRThreadPool$RpcRunnable.run(MapRThreadPool.java:230)",
3dow_16h_59m.log-            "java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)",
3dow_16h_59m.log-            "java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)",
3dow_16h_59m.log-            "java.lang.Thread.run(Thread.java:745)"
3dow_16h_59m.log-        ],
3dow_16h_59m.log-        "state": "RUNNABLE",
3dow_16h_59m.log-        "threadID": 109
3dow_17h_00m.log-        "interrupted": false,
3dow_17h_00m.log-        "name": "pool-6-thread-48",
3dow_17h_00m.log-        "priority": 5,
3dow_17h_00m.log-        "stack": [
3dow_17h_00m.log-            "java.util.HashMap.transfer(HashMap.java:601)",
3dow_17h_00m.log:            "java.util.HashMap.resize(HashMap.java:581)",
3dow_17h_00m.log-            "java.util.HashMap.addEntry(HashMap.java:879)",
3dow_17h_00m.log-            "java.util.HashMap.put(HashMap.java:505)",
3dow_17h_00m.log-            "java.util.HashSet.add(HashSet.java:217)",
3dow_17h_00m.log-            "net.opentsdb.core.SaltScanner$ScannerCB$1MatchCB.call(SaltScanner.java:420)",
3dow_17h_00m.log-            "net.opentsdb.core.SaltScanner$ScannerCB$1MatchCB.call(SaltScanner.java:414)",
3dow_17h_00m.log-            "com.stumbleupon.async.Deferred.doCall(Deferred.java:1278)",
3dow_17h_00m.log-            "com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)",
3dow_17h_00m.log-            "com.stumbleupon.async.Deferred.handleContinuation(Deferred.java:1313)",
3dow_17h_00m.log-            "com.stumbleupon.async.Deferred.doCall(Deferred.java:1284)",
3dow_17h_00m.log-            "com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)",
3dow_17h_00m.log-            "com.stumbleupon.async.Deferred.callback(Deferred.java:1005)",
3dow_17h_00m.log-            "com.stumbleupon.async.DeferredGroup.done(DeferredGroup.java:173)",
3dow_17h_00m.log-            "com.stumbleupon.async.DeferredGroup.recordCompletion(DeferredGroup.java:158)",
3dow_17h_00m.log-            "com.stumbleupon.async.DeferredGroup.access$200(DeferredGroup.java:36)",
3dow_17h_00m.log-            "com.stumbleupon.async.DeferredGroup$1NotifyOrdered.call(DeferredGroup.java:97)",
3dow_17h_00m.log-            "com.stumbleupon.async.Deferred.doCall(Deferred.java:1278)",
3dow_17h_00m.log-            "com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)",
3dow_17h_00m.log-            "com.stumbleupon.async.Deferred.callback(Deferred.java:1005)",
3dow_17h_00m.log-            "org.hbase.async.HBaseRpc.callback(HBaseRpc.java:734)",
3dow_17h_00m.log-            "org.hbase.async.MapRThreadPool$RpcRunnable.run(MapRThreadPool.java:230)",
3dow_17h_00m.log-            "java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)",
3dow_17h_00m.log-            "java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)",
3dow_17h_00m.log-            "java.lang.Thread.run(Thread.java:745)"
3dow_17h_00m.log-        ],
3dow_17h_00m.log-        "state": "RUNNABLE",
3dow_17h_00m.log-        "threadID": 109

https://github.com/OpenTSDB/opentsdb/blob/e2efe9a4875f8cddd3404e9c6b64f5aab4b72c0a/src/core/SaltScanner.java#L309

private final Set<String> skips = new HashSet<String>();

https://github.com/OpenTSDB/opentsdb/blob/e2efe9a4875f8cddd3404e9c6b64f5aab4b72c0a/src/core/SaltScanner.java#L420

skips.add(tsuid);

Log files from api/stats/threads :

monster4_3dow_16h_57m_log.txt
monster4_3dow_16h_58m_log.txt
monster4_3dow_16h_59m_log.txt
monster4_3dow_17h_00m_log.txt

The config (some names changed):

tsd.core.auto_create_metrics = true
tsd.core.meta.enable_realtime_ts = false
tsd.core.meta.enable_realtime_uid = true
tsd.core.meta.enable_tsuid_incrementing = false
tsd.core.meta.enable_tsuid_tracking = false
tsd.core.plugin_path = /home/opentsdb/opentsdb/releases/opentsdb-2.2.0/usr/share/opentsdb/plugins
tsd.core.tree.enable_processing = false
tsd.core.uid.random_metrics = true
tsd.http.cachedir = /dev/shm/opentsdb/tsdb-mdata
tsd.http.query.allow_delete = true
tsd.http.request.cors_domains = *
tsd.http.request.enable_chunked = true
tsd.http.request.max_chunk = 33554432
tsd.http.show_stack_trace = false
tsd.http.staticroot = /home/opentsdb/opentsdb/releases/opentsdb-2.2.0/usr/share/opentsdb/static
tsd.mode = ro
tsd.network.bind = 0.0.0.0
tsd.network.port = 4272
tsd.network.worker_threads = 24
tsd.query.allow_simultaneous_duplicates = false
tsd.query.timeout = 31000
tsd.storage.enable_appends = false
tsd.storage.enable_compaction = false
tsd.storage.fix_duplicates = true
tsd.storage.hbase.data_table = /mapr/maprtsdb01a/opentsdb/tsdb-mdata-legacy/tsdb
tsd.storage.hbase.meta_table = /mapr/maprtsdb01a/opentsdb/tsdb-mdata-legacy/tsdb-meta
tsd.storage.hbase.tree_table = /mapr/maprtsdb01a/opentsdb/tsdb-mdata-legacy/tsdb-tree
tsd.storage.hbase.uid_table = /mapr/maprtsdb01a/opentsdb/tsdb-mdata-legacy/tsdb-uid
tsd.storage.hbase.zk_quorum = maprtsdb01a01:5181,maprtsdb01a02:5181,maprtsdb01a03:5181,
tsd.storage.repair_appends = false
tsd.storage.salt.buckets = 30
tsd.storage.salt.width = 1
@thatsafunnyname
Contributor

There are 128 pool threads because in mapr-asynchbase-1.7.0.201603301412 (and in 1.6.0) from http://repository.mapr.com/nexus/content/groups/mapr-public/org/hbase/asynchbase/

MapRThreadPool.java contains:

  public static final int DEFAULT_WORKER_THREADS = 128;
  public static final int DEFAULT_CALLBACK_THREADS = 5;
...
  public static final int MIN_SCAN_THREADS = 3;
  public static final int MAX_SCAN_THREADS = 32;
...
   int workerThreads = conf.getInt("fs.mapr.async.worker.threads", DEFAULT_WORKER_THREADS);
    int callbackThreads = conf.getInt("fs.mapr.async.callback.threads", DEFAULT_CALLBACK_THREADS);
    LOG.info("Creating ThreadPoolExecutor with {} workerThreads and {} callbackThreads.", workerThreads, callbackThreads);
    int totalThreads = workerThreads + callbackThreads;
    pool = new ThreadPoolExecutor(
                  totalThreads, // core thread pool size,
                  totalThreads, // maximum thread pool size
                  1, // time to wait before reducing threads, if more then coreSz
                  TimeUnit.HOURS,
                  new LinkedBlockingQueue<Runnable>(),
                  new ThreadPoolExecutor.CallerRunsPolicy());
    // Start minimum number of threads in pool.
    for (int i = 0; i < workerThreads; i++) {
      pool.execute(new RpcRunnable());
    }
    // Separate threadpool for Scanner.
    scanpool = new ThreadPoolExecutor(
                  MIN_SCAN_THREADS, // core thread pool size
                  MAX_SCAN_THREADS, // maximum thread pool size
                  1,// time to wait before reducing threads, if more then coreSz
                  TimeUnit.HOURS,
                  new LinkedBlockingQueue<Runnable>(),
                  new ThreadPoolExecutor.CallerRunsPolicy());
    // Start minimum number of threads in pool.
    for (int i = 0; i < MIN_SCAN_THREADS; i++) {
      scanpool.execute(new ScanRpcRunnable(this));
    }

In version 1.5.0 https://github.com/mapr/asynchbase/blob/v1.5.0-mapr/src/MapRThreadPool.java I see:

  public static final int MIN_THREADS = 5;
  public static final int MAX_THREADS = 10;
  public static final int MIN_SCAN_THREADS = 3;
  public static final int MAX_SCAN_THREADS = 32;
...
    pool = new ThreadPoolExecutor(
                  1 + MIN_THREADS, // core thread pool size, 
                           // should be atleast 1 more than MIN_THREADS
                  MAX_THREADS, // maximum thread pool size
                  1,// time to wait before reducing threads, if more then coreSz
                  TimeUnit.HOURS,
                  new LinkedBlockingQueue<Runnable>(),
                  new ThreadPoolExecutor.CallerRunsPolicy());
    // Start minimum number of threads in pool.
    for (int i = 0; i < MIN_THREADS; i++)
       pool.execute(new RpcRunnable());

    // Separate threadpool for Scanner.
    scanpool = new ThreadPoolExecutor(
                  MIN_SCAN_THREADS, // core thread pool size
                  MAX_SCAN_THREADS, // maximum thread pool size
                  1,// time to wait before reducing threads, if more then coreSz
                  TimeUnit.HOURS,
                  new LinkedBlockingQueue<Runnable>(),
                  new ThreadPoolExecutor.CallerRunsPolicy());
    // Start minimum number of threads in pool.
    for (int i = 0; i < MIN_SCAN_THREADS; i++)
       scanpool.execute(new ScanRpcRunnable(this));

so maybe with fewer threads in the pool a race condition was less likely.

@thatsafunnyname
Contributor

This is the minimal change I went with:

> diff opentsdb-2.2.0/src/core/SaltScanner.java.orig opentsdb-2.2.0/src/core/SaltScanner.java
309c309
<     private final Set<String> skips = new HashSet<String>();
---
>     private final Set<String> skips = Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());

keepers also looks like it could benefit from the same change, but for now I just changed skips.

@johann8384
Member

Do you want to submit a PR?

@johann8384 johann8384 added the bug label Jul 1, 2016
@thatsafunnyname
Contributor

The minimal change in #823 (comment) looks to have stopped threads becoming stuck in CPU loops in our opentsdb 2.2.0 instances, we do not have plans to upgrade the opentsdb version at the moment so I am sorry I am unable to test the change against more recent versions. I could make a PR against https://github.com/OpenTSDB/opentsdb/tree/v2.2.0 , would a v2.2.1 release with the change be appropriate? Or would such a change only go into the master branch and next release? I will make the same change to keepers next week and test the code on the hosts exhibiting the behaviour.
I am happy if someone else wants to make the change(s). Thanks.

@thatsafunnyname
Contributor

9197b63 added the skips and keepers HashSet<String>()s to src/core/SaltScanner.java and to src/core/TsdbQuery.java , these may also benefit from being thread safe.

I did see a few other HashSets in use in the project, I am not familiar enough with the code to know if these can be used from multiple threads. I noticed processed_tsuids in src/tools/UidManager.java uses a Collections.synchronizedSet(new HashSet<Integer>()).

I was also wondering about the code needing to support older versions of Java, in particular the parameterless constructor for ConcurrentHashMap, and if the default configuration would be suitable when using Java 6:

https://ria101.wordpress.com/2011/12/12/concurrenthashmap-avoid-a-common-misuse/
http://opentsdb.net/docs/build/html/installation.html#runtime-requirements

I read about how in Java 8 ConcurrentHashMap has a lower memory footprint and performs better under concurrent load:

netty/netty#1052 - "Consider replace ConcurrentHashMap with ConcurrentHashMapV8 backport"
netty/netty@671f9d4

I read http://opentsdb.net/docs/build/html/development/index.html#guidelines and note that:

.) Bug fixes should be done in the master branch
.) New features or major changes should be done in the next branch

In order to make a PR I will need my employer to agree to waiver any rights they have to contributions before I can sign the CLA. I will start the internal process to obtain this agreement, but please note that it can take some time (normally weeks). I am happy for anyone else to apply a fix. Thanks.

@johann8384
Member

It would be most helpful if the changes are made against the next branch.

@johann8384 johann8384 added this to the v2.3.0 milestone Jul 6, 2016
@thatsafunnyname
Contributor
thatsafunnyname commented Jul 7, 2016 edited

My employer agreed to waiver any rights they have to contributions to this project, I haved signed the CLA. Pull request #835 is just for the HashSet seen in the thread stack of threads in infinite loops, I can
confirm that it stopped the infinite loops we were seeing in 2.2.0 of this form:

"pool-7-thread-3" prio=10 tid=0x00007fafb1147000 nid=0x5c9 runnable [0x00007faf73cfb000]
   java.lang.Thread.State: RUNNABLE
            "java.util.HashMap.getEntry(HashMap.java:465)",
            "java.util.HashMap.containsKey(HashMap.java:449)",
            "java.util.HashSet.contains(HashSet.java:201)",
            "net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:407)",
            "net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:302)",
            "com.stumbleupon.async.Deferred.doCall(Deferred.java:1278)",
            "com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)",
            "com.stumbleupon.async.Deferred.callback(Deferred.java:1005)",
            "org.hbase.async.HBaseRpc.callback(HBaseRpc.java:734)",
            "org.hbase.async.MapRThreadPool$ScanRpcRunnable.run(MapRThreadPool.java:452)",
            "java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)",
            "java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)",
            "java.lang.Thread.run(Thread.java:745)"

If the "nid" from the jstack is converted to a decimal:

> perl -e 'print 0x5c9'
1481

It would line up with the thread id from ps output on the pid of the opentsdb instance:

ps -mo 'pid lwp time %cpu cmd' 914 | grep -v '0:00  0.0'
   PID    LWP     TIME %CPU CMD
   914      - 01:24:43 87.4 java_opentsdb_process
     -   1282 00:00:01  0.0 -
     -   1283 00:00:01  0.0 -
     -   1284 00:00:01  0.0 -
     -   1285 00:00:01  0.0 -
     -   1286 00:00:01  0.0 -
     -   1287 00:00:01  0.0 -
     -   1289 00:00:01  0.0 -
     -   1290 00:00:01  0.0 -
     -   1291 00:00:04  0.0 -
     -   1295 00:00:02  0.0 -
     -   1296 00:00:02  0.0 -
     -   1298 00:00:01  0.0 -
     -   1479 00:02:30  2.5 -
     -   1480 00:02:30  2.5 -
     -   1481 01:18:36 81.1 - <--- TID 1481 with high CPU utilization over lifetime
     -   1486 00:00:25  0.4 -
     -   1488 00:00:01  0.0 -
     -   1489 00:00:01  0.0 -
     -   1490 00:00:01  0.0 -
     -   1491 00:00:01  0.0 -
     -   1492 00:00:01  0.0 -
     -   1493 00:00:01  0.0 -
     -   1494 00:00:01  0.0 -
     -   1495 00:00:01  0.0 -

A Java Poor Man's Profiler (jpmp*) can also be used to see if a thread is stuck (in an infinite loop) in a HashMap, using 10 samples with a 1 second interval, look out for this stack seen multiples of 10 times:

     10 j.util.HashMap.getEntry,j.util.HashMap.containsKey,j.util.HashSet.contains,net.opentsdb.core.SaltScanner$ScannerCB.call,net.opentsdb.core.SaltScanner$ScannerCB.call,com.stumbleupon.async.Deferred.doCall,com.stumbleupon.async.Deferred.runCallbacks,com.stumbleupon.async.Deferred.callback,org.hbase.async.HBaseRpc.callback,org.hbase.async.MapRThreadPool$ScanRpcRunnable.run,j.util.concurrent.ThreadPoolExecutor.runWorker,j.util.concurrent.ThreadPoolExecutor$Worker.run,j.lang.Thread.run

Formatted to not be on one line:

     10 j.util.HashMap.getEntry,
        j.util.HashMap.containsKey,
        j.util.HashSet.contains,
        net.opentsdb.core.SaltScanner$ScannerCB.call,
        net.opentsdb.core.SaltScanner$ScannerCB.call,
        com.stumbleupon.async.Deferred.doCall,
        com.stumbleupon.async.Deferred.runCallbacks,
        com.stumbleupon.async.Deferred.callback,
        org.hbase.async.HBaseRpc.callback,
        org.hbase.async.MapRThreadPool$ScanRpcRunnable.run,
        j.util.concurrent.ThreadPoolExecutor.runWorker,
        j.util.concurrent.ThreadPoolExecutor$Worker.run,
        j.lang.Thread.run

I was wondering if this patch could also be applied to a 2.2.1 release?

Thanks.

@thatsafunnyname
Contributor

On an instance of opentsdb 2.2.0 with the patch from #835 we have a thread stuck in a different infinte loop:

"pool-7-thread-1" prio=10 tid=0x00007f796597e800 nid=0xf4e8 runnable [0x00007f7935bda000]
   java.lang.Thread.State: RUNNABLE
        at java.util.TreeMap.successor(TreeMap.java:1990)
        at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1116)
        at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
        at net.opentsdb.core.SaltScanner.validateAndTriggerCallback(SaltScanner.java:580)
        at net.opentsdb.core.SaltScanner.access$1400(SaltScanner.java:58)
        at net.opentsdb.core.SaltScanner$ScannerCB.close(SaltScanner.java:560)
        at net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:369)
        at net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:302)
        at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278)
        at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)
        at com.stumbleupon.async.Deferred.callback(Deferred.java:1005)
        at org.hbase.async.HBaseRpc.callback(HBaseRpc.java:734)
        at org.hbase.async.MapRThreadPool$ScanRpcRunnable.run(MapRThreadPool.java:435)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
        - <0x00000005c3190210> (a java.util.concurrent.ThreadPoolExecutor$Worker)
>ps -mo 'pid lwp time %cpu cmd' PID | grep -v ' 0.0 '
PID   LWP     TIME %CPU CMD
PID     - 3-17:50:43 98.9 java_opentsdb
  - 62696 3-17:33:41 98.6 -

> perl -e 'printf("0x%X", 62696)'
0xF4E8

> perl -e 'print 0xf4e8'
62696

This is https://github.com/OpenTSDB/opentsdb/blob/v2.2.0/src/core/SaltScanner.java#L580 , where https://github.com/OpenTSDB/opentsdb/blob/v2.2.0/src/core/SaltScanner.java#L307 ByteMap<List<Annotation>> is being used, it looks like a similar race condition.

jpmp.sh  PID 10 1
   1280 sun.misc.Unsafe.park,j.util.concurrent.locks.LockSupport.park,j.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await,j.util.concurrent.LinkedBlockingQueue.take,org.hbase.async.MapRThreadPool$RpcRunnable.run,j.util.concurrent.ThreadPoolExecutor.runWorker,j.util.concurrent.ThreadPoolExecutor$Worker.run,j.lang.Thread.run
    960 sun.nio.ch.EPollArrayWrapper.epollWait,sun.nio.ch.EPollArrayWrapper.poll,sun.nio.ch.EPollSelectorImpl.doSelect,sun.nio.ch.SelectorImpl.lockAndDoSelect,sun.nio.ch.SelectorImpl.select,org.jboss.netty.channel.socket.nio.SelectorUtil.select,org.jboss.netty.channel.socket.nio.AbstractNioSelector.select,org.jboss.netty.channel.socket.nio.AbstractNioSelector.run,org.jboss.netty.channel.socket.nio.AbstractNioWorker.run,org.jboss.netty.channel.socket.nio.NioWorker.run,org.jboss.netty.util.ThreadRenamingRunnable.run,org.jboss.netty.util.internal.DeadLockProofWorker$1.run,j.util.concurrent.ThreadPoolExecutor.runWorker,j.util.concurrent.ThreadPoolExecutor$Worker.run,j.lang.Thread.run
     20 sun.misc.Unsafe.park,j.util.concurrent.locks.LockSupport.park,j.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await,j.util.concurrent.LinkedBlockingQueue.take,org.hbase.async.MapRThreadPool.takeFromQ,org.hbase.async.MapRThreadPool$ScanRpcRunnable.run,j.util.concurrent.ThreadPoolExecutor.runWorker,j.util.concurrent.ThreadPoolExecutor$Worker.run,j.lang.Thread.run
     10 sun.nio.ch.EPollArrayWrapper.epollWait,sun.nio.ch.EPollArrayWrapper.poll,sun.nio.ch.EPollSelectorImpl.doSelect,sun.nio.ch.SelectorImpl.lockAndDoSelect,sun.nio.ch.SelectorImpl.select,sun.nio.ch.SelectorImpl.select,org.jboss.netty.channel.socket.nio.NioServerBoss.select,org.jboss.netty.channel.socket.nio.AbstractNioSelector.run,org.jboss.netty.channel.socket.nio.NioServerBoss.run,org.jboss.netty.util.ThreadRenamingRunnable.run,org.jboss.netty.util.internal.DeadLockProofWorker$1.run,j.util.concurrent.ThreadPoolExecutor.runWorker,j.util.concurrent.ThreadPoolExecutor$Worker.run,j.lang.Thread.run
     10 sun.nio.ch.EPollArrayWrapper.epollWait,sun.nio.ch.EPollArrayWrapper.poll,sun.nio.ch.EPollSelectorImpl.doSelect,sun.nio.ch.SelectorImpl.lockAndDoSelect,sun.nio.ch.SelectorImpl.select,org.jboss.netty.channel.socket.nio.SelectorUtil.select,org.jboss.netty.channel.socket.nio.AbstractNioSelector.select,org.jboss.netty.channel.socket.nio.AbstractNioSelector.run,org.jboss.netty.channel.socket.nio.NioClientBoss.run,org.jboss.netty.util.ThreadRenamingRunnable.run,org.jboss.netty.util.internal.DeadLockProofWorker$1.run,j.util.concurrent.ThreadPoolExecutor.runWorker,j.util.concurrent.ThreadPoolExecutor$Worker.run,j.lang.Thread.run
     10 j.util.TreeMap.successor,j.util.TreeMap$PrivateEntryIterator.nextEntry,j.util.TreeMap$KeyIterator.next,net.opentsdb.core.SaltScanner.validateAndTriggerCallback,net.opentsdb.core.SaltScanner.access$1400,net.opentsdb.core.SaltScanner$ScannerCB.close,net.opentsdb.core.SaltScanner$ScannerCB.call,net.opentsdb.core.SaltScanner$ScannerCB.call,com.stumbleupon.async.Deferred.doCall,com.stumbleupon.async.Deferred.runCallbacks,com.stumbleupon.async.Deferred.callback,org.hbase.async.HBaseRpc.callback,org.hbase.async.MapRThreadPool$ScanRpcRunnable.run,j.util.concurrent.ThreadPoolExecutor.runWorker,j.util.concurrent.ThreadPoolExecutor$Worker.run,j.lang.Thread.run
     10 j.lang.Object.wait,j.lang.ref.ReferenceQueue.remove,j.lang.ref.ReferenceQueue.remove,j.lang.ref.Finalizer$FinalizerThread.run
     10 j.lang.Object.wait,j.lang.Object.wait,j.lang.ref.Reference$ReferenceHandler.run
@manolama manolama added a commit to manolama/opentsdb that referenced this issue Jul 13, 2016
@thatsafunnyname @manolama thatsafunnyname + manolama Use thread safe ConcurrentHashMap. (#3)
For OpenTSDB#823
Concurrent use of a HashSet can trigger race conditions and an infinite loop(s), use the thread-safe ConcurrentHashMap to avoid this.
Making the change just for the thread stack seen looping.
f2e7cb2
@manolama manolama added a commit to manolama/opentsdb that referenced this issue Jul 13, 2016
@manolama manolama Attempt to fix #823 by using a proper concurrent byte map. 8b1368a
@manolama manolama added a commit that referenced this issue Jul 13, 2016
@thatsafunnyname @manolama thatsafunnyname + manolama Use thread safe ConcurrentHashMap. (#3)
For #823
Concurrent use of a HashSet can trigger race conditions and an infinite loop(s), use the thread-safe ConcurrentHashMap to avoid this.
Making the change just for the thread stack seen looping.

Signed-off-by: Chris Larsen <clarsen@yahoo-inc.com>
6582775
@manolama manolama added a commit that referenced this issue Jul 13, 2016
@thatsafunnyname @manolama thatsafunnyname + manolama Use thread safe ConcurrentHashMap. (#3)
For #823
Concurrent use of a HashSet can trigger race conditions and an infinite loop(s), use the thread-safe ConcurrentHashMap to avoid this.
Making the change just for the thread stack seen looping.

Signed-off-by: Chris Larsen <clarsen@yahoo-inc.com>
d3ecb53
@manolama
Member

The patch looks good for the first one. For regex and wildcard queries there is the possibility that the sets could be modified from different threads at the same time due to UID resolution. The patch should solve that with concurrent maps and I'll merge that upstream.

For the second issue, we'll try out manolama@8b1368a to see if just wrapping the TreeMap with a synchronized collection doesn't work properly.

@manolama manolama added a commit that referenced this issue Jul 13, 2016
@thatsafunnyname @manolama thatsafunnyname + manolama Use thread safe ConcurrentHashMap. (#3)
For #823
Concurrent use of a HashSet can trigger race conditions and an infinite loop(s), use the thread-safe ConcurrentHashMap to avoid this.
Making the change just for the thread stack seen looping.

Signed-off-by: Chris Larsen <clarsen@yahoo-inc.com>
7242550
@thatsafunnyname
Contributor

I applied the first ("skips" and "keepers") and second ("annotations") patch to the 2.2.0 tagged release and deployed with a staggered release to 10 read-only dev opentsdb instances over 5 dev hosts. The dev hosts have been under a very light load, total test time so far is 18 hours ( 2 read-only instances x ( ( 1 host x 3hrs ) + ( 2 x 2hrs ) + ( 2 x 1hr) ) ), no problems so far. We will test these opentsdb instances under a higher load either tomorrow or next week (probably next week), and report back. Thanks again for the patch for the second issue and for merging the first pull request.

@thatsafunnyname
Contributor

The 10 read-only opentsdb instances with the first and second patch applied have had no problems and no CPUs in infinite loops while operating under a light load. We just switched them over to service a heavier load of queries. Weekend workload is very light, so I will update again on Wednesday next week. Thanks.

@thatsafunnyname
Contributor

I noticed one thread (running with first and second patch against 2.2.0) that remained at high CPU utilization since the restart:

> ps -mo 'pid lwp time %cpu cmd' 33633 | grep -v '0.0'
  PID   LWP     TIME %CPU CMD
33633     - 2-11:46:08 67.5 opensdb_java_process
    - 34274 2-11:25:55 67.1 -
> perl -e 'printf("0x%X", 34274)'
0x85E2
> jstack -l 33633 | grep -A 14 'nid=0x85e2'
"pool-7-thread-2" prio=10 tid=0x00007f4555938000 nid=0x85e2 runnable [0x00007f45229e7000]
   java.lang.Thread.State: RUNNABLE
        at net.opentsdb.core.SaltScanner.validateAndTriggerCallback(SaltScanner.java:584)
        at net.opentsdb.core.SaltScanner.access$1400(SaltScanner.java:59)
        at net.opentsdb.core.SaltScanner$ScannerCB.close(SaltScanner.java:559)
        at net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:368)
        at net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:302)
        at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278)
        at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)
        at com.stumbleupon.async.Deferred.callback(Deferred.java:1005)
        at org.hbase.async.HBaseRpc.callback(HBaseRpc.java:734)
        at org.hbase.async.MapRThreadPool$ScanRpcRunnable.run(MapRThreadPool.java:435)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
> jstack -l 33633 | grep -A 14 'nid=0x85e2'
"pool-7-thread-2" prio=10 tid=0x00007f4555938000 nid=0x85e2 runnable [0x00007f45229e7000]
   java.lang.Thread.State: RUNNABLE
        at java.util.TreeMap.successor(TreeMap.java:1990)
        at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1116)
        at java.util.TreeMap$EntryIterator.next(TreeMap.java:1151)
        at java.util.TreeMap$EntryIterator.next(TreeMap.java:1146)
        at net.opentsdb.core.SaltScanner.validateAndTriggerCallback(SaltScanner.java:579)
        at net.opentsdb.core.SaltScanner.access$1400(SaltScanner.java:59)
        at net.opentsdb.core.SaltScanner$ScannerCB.close(SaltScanner.java:559)
        at net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:368)
        at net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:302)
        at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278)
        at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)
        at com.stumbleupon.async.Deferred.callback(Deferred.java:1005)
        at org.hbase.async.HBaseRpc.callback(HBaseRpc.java:734)
> jstack -l 33633 | grep -A 14 'nid=0x85e2'
"pool-7-thread-2" prio=10 tid=0x00007f4555938000 nid=0x85e2 runnable [0x00007f45229e7000]
   java.lang.Thread.State: RUNNABLE
        at java.util.TreeMap.successor(TreeMap.java:1997)
        at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1116)
        at java.util.TreeMap$EntryIterator.next(TreeMap.java:1151)
        at java.util.TreeMap$EntryIterator.next(TreeMap.java:1146)
        at net.opentsdb.core.SaltScanner.validateAndTriggerCallback(SaltScanner.java:579)
        at net.opentsdb.core.SaltScanner.access$1400(SaltScanner.java:59)
        at net.opentsdb.core.SaltScanner$ScannerCB.close(SaltScanner.java:559)
        at net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:368)
        at net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:302)
        at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278)
        at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)
        at com.stumbleupon.async.Deferred.callback(Deferred.java:1005)
        at org.hbase.async.HBaseRpc.callback(HBaseRpc.java:734

Note that line numbers differ from manolama@8b1368a because I don't have manolama@050d507#diff-4d6a61187451e2070d4b688cd3fb6726 .

558      if (ok && exception == null) {
559        validateAndTriggerCallback(kvs, annotations);
560      } else {
561        completed_tasks.incrementAndGet();
562      }
...
571  private void validateAndTriggerCallback(final List<KeyValue> kvs,
572          final ByteMap<List<Annotation>> annotations) {
573
574    final int tasks = completed_tasks.incrementAndGet();
575    if (kvs.size() > 0) {
576      kv_map.put(tasks, kvs);
577    }
578
579    for (final Entry<byte[], List<Annotation>> entry : annotations.entrySet()) {
580      if (entry.getValue().size() > 0) {
581        // Optimistic write, expecting unique row keys
582        annotation_map.put(entry.getKey(), entry.getValue());
583      }
584    }
@thatsafunnyname
Contributor

Just to update, the first and second patch have been running on opentsdb instances under a heavy load and we have seen high usertime CPU, when we do the threads are seen in these stacks:

            "java.util.TreeMap.getEntryUsingComparator(TreeMap.java:376)",
            "java.util.TreeMap.getEntry(TreeMap.java:340)",
            "java.util.TreeMap.get(TreeMap.java:273)",
            "net.opentsdb.core.SaltScanner$ScannerCB.processRow(SaltScanner.java:494)", 
- this is List<Annotation> notes = annotations.get(key);
            "net.opentsdb.core.SaltScanner$ScannerCB$1MatchCB.call(SaltScanner.java:426)",
            "net.opentsdb.core.SaltScanner$ScannerCB$1MatchCB.call(SaltScanner.java:414)",
            "com.stumbleupon.async.Deferred.doCall(Deferred.java:1278)",
            "com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)",
            "com.stumbleupon.async.Deferred.handleContinuation(Deferred.java:1313)",
            "com.stumbleupon.async.Deferred.doCall(Deferred.java:1284)",
            "com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)",
            "com.stumbleupon.async.Deferred.callback(Deferred.java:1005)",
            "com.stumbleupon.async.DeferredGroup.done(DeferredGroup.java:173)",
            "com.stumbleupon.async.DeferredGroup.recordCompletion(DeferredGroup.java:158)",
            "com.stumbleupon.async.DeferredGroup.access$200(DeferredGroup.java:36)",
            "com.stumbleupon.async.DeferredGroup$1NotifyOrdered.call(DeferredGroup.java:97)",
            "com.stumbleupon.async.Deferred.doCall(Deferred.java:1278)",
            "com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)",
            "com.stumbleupon.async.Deferred.callback(Deferred.java:1005)",
            "org.hbase.async.HBaseRpc.callback(HBaseRpc.java:734)",
            "org.hbase.async.MapRThreadPool$RpcRunnable.run(MapRThreadPool.java:230)",
            "java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)",
            "java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)",
            "java.lang.Thread.run(Thread.java:745)"
            "net.opentsdb.core.SaltScanner.validateAndTriggerCallback(SaltScanner.java:584)", 
- this is the end of the for loop on annotations
            "net.opentsdb.core.SaltScanner.access$1400(SaltScanner.java:59)",
            "net.opentsdb.core.SaltScanner$ScannerCB.close(SaltScanner.java:559)",
            "net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:368)",
            "net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:302)",
            "com.stumbleupon.async.Deferred.doCall(Deferred.java:1278)",
            "com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)",
            "com.stumbleupon.async.Deferred.callback(Deferred.java:1005)",
            "org.hbase.async.HBaseRpc.callback(HBaseRpc.java:734)",
            "org.hbase.async.MapRThreadPool$ScanRpcRunnable.run(MapRThreadPool.java:435)",
            "java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)",
            "java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)",
            "java.lang.Thread.run(Thread.java:745)"
            "java.util.TreeMap.successor(TreeMap.java:1990)",
            "java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1116)",
            "java.util.TreeMap$EntryIterator.next(TreeMap.java:1151)",
            "java.util.TreeMap$EntryIterator.next(TreeMap.java:1146)",
            "net.opentsdb.core.SaltScanner.validateAndTriggerCallback(SaltScanner.java:579)",
- this is the start of the for loop on annotations
            "net.opentsdb.core.SaltScanner.access$1400(SaltScanner.java:59)",
            "net.opentsdb.core.SaltScanner$ScannerCB.close(SaltScanner.java:559)",
            "net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:368)",
            "net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:302)",
            "com.stumbleupon.async.Deferred.doCall(Deferred.java:1278)",
            "com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257)",
            "com.stumbleupon.async.Deferred.callback(Deferred.java:1005)",
            "org.hbase.async.HBaseRpc.callback(HBaseRpc.java:734)",
            "org.hbase.async.MapRThreadPool$ScanRpcRunnable.run(MapRThreadPool.java:435)",
            "java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)",
            "java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)",
            "java.lang.Thread.run(Thread.java:745)"

We are currently working on disabling support for annotations, either when being sent from Grafana or in OpenTSDB.

@johann8384 johann8384 modified the milestone: v2.3.1, v2.3.0 Sep 20, 2016
@johann8384
Member

I got a little over zealous moving stuff out of 2.3.0, probably should make sure this is resolved for the 2.3.0 RC2.

@johann8384 johann8384 modified the milestone: v2.3.0, v2.3.1 Sep 21, 2016
@thatsafunnyname
Contributor

Just to update, we continued (and continue) to see the threads spinning in code to do with annotations. A proxy was implemented to disable annotations in queries, we found some access (requesting annotations) was by-passing the proxy, so I think the latest is that there is a plan to disable this by-pass route and force the use of the proxy that disables asking for annotations in queries.
I have a version of OpenTSDB built with annotations support disabled, and if stopping all queries requesting annotations does not stop the threads spinning in the annotations code then we will try the build with annotations disabled, if it works then I could tidy it up and add the option as a config setting.

@johann8384 johann8384 added a commit to johann8384/opentsdb that referenced this issue Sep 21, 2016
@johann8384 johann8384 Disable Annotation RPC
Fix #823
dc19d8e
@thatsafunnyname
Contributor

Thanks, I am not sure if the annotation queries we see are arriving over the "annotation" RPC. Sorry, I am working on this from the inside to the out.

My hack with annotations support disabled has these two diffs, the first has already been applied, and the second initialises some variables (I read that an uninitialised bool in java can be evaluated as false, so I set no_annotations to true), and also hard codes return values from the accessor functions. The intention was just to be able to check if the problem went away with no_annotations always true and with_global_annotations always false.

> diff ~/opentsdb_disable_annotations/opentsdb-2.2.0/src/core/SaltScanner.java~ ~/opentsdb_disable_annotations/opentsdb-2.2.0/src/core/SaltScanner.java
18d17
< import java.util.HashSet;
309,310c308,309
<     private final Set<String> skips = new HashSet<String>();
<     private final Set<String> keepers = new HashSet<String>();
---
>     private final Set<String> skips = Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
>     private final Set<String> keepers = Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
> diff ~/opentsdb_disable_annotations/opentsdb-2.2.0/src/core/TSQuery.java~  ~/opentsdb_disable_annotations/opentsdb-2.2.0/src/core/TSQuery.java
62c62
<   private boolean no_annotations;
---
>   private boolean no_annotations = true;
65c65
<   private boolean with_global_annotations;
---
>   private boolean with_global_annotations = false;
318c318,319
<     return no_annotations;
---
>     return true;
>     /** return no_annotations; */
323c324,325
<     return with_global_annotations;
---
>     return false;
>     /** return with_global_annotations; */
@manolama
Member

Hi @thatsafunnyname Still seeing the spinning threads? And do you actually have some Annotations stored in the data that's scanned?

@thatsafunnyname
Contributor

Yes we still see spinning threads in the annotations code.
We plan to point some query "load" at an instance running with the hack from my previous comment (to disable annotations support) today. A roll out to more instances in more environments will follow if we no longer see thread spinning in the annotations code.
We do not have annotations stored in the data being scanned, as far as we are aware.
Thanks.

@thatsafunnyname
Contributor

Just to update. The query "load" sent to an instance running with the hack from my previous comment (to disable annotations support) was unable to cause threads to get stuck spinning (in annotations code). We have rolled out the .jar file to other development hosts and plan to start rolling out to production in around a week from now. We see threads stuck spinning in annotations code reasonably frequently in production, so I expect to be able to confirm that disabling annotations ( in the manner done in #823 (comment) ) resolves the problem for us.

@johann8384
Member

As a workaround for the problem I can make it possible to diasble annotations cleanly from the configuration. Of course, we'll still want to find the cause of the problem in the annotation code.

@thatsafunnyname
Contributor

Thanks @johann8384 , that would be great. Being able to disable (and enable) annotations cleanly in the config would help us.

@thatsafunnyname
Contributor

Just to provide an update. The changes to disable annotations from #823 (comment) were deployed to a production environment, they helped but did not completely eradicate opentsdb processes seen spinning CPU in code supporting annotations.

The chart below over 3 days shows the reduction in avg percent user CPU time, hosts have 24 CPUs.
Each host is also running mapr with 2 ro and 2 rw opentsdb instances (it is one of the ro ones that handles the queries and has the issue).
We are running a monit process that is restarting the opentsdb instances when they have used over a threshold of CPU for a few minutes, which we have previously judged to be a good enough indicator that the opentsdb process is stuck in an infinite loop.

image

After the deployment our monit logs still showed our monit CPU theshold being hit by opentsdb processes on occasion, the monit threshold is based on usr+sys CPU, so not perfect and the events could well be false positives - as far as infinite loops are concerned.

The below chart over 3 days shows the percent user CPU time for the 24 CPUs on one of the hosts from the previous chart.

image

We are also logging the HTTP responses from the api/stats/threads URL for the opentsdb instance that has had the CPU spinning. The log of thread stacks shows that when the opentsdb processes are using the most CPU they are in annotations code, in particular we have logged 100 threads in:

            "java.util.TreeMap.getEntryUsingComparator(TreeMap.java:376)",
            "java.util.TreeMap.getEntry(TreeMap.java:340)",
            "java.util.TreeMap.get(TreeMap.java:273)",
            "net.opentsdb.core.SaltScanner$ScannerCB.processRow(SaltScanner.java:493)",
            "net.opentsdb.core.SaltScanner$ScannerCB$1MatchCB.call(SaltScanner.java:425)",
            "net.opentsdb.core.SaltScanner$ScannerCB$1MatchCB.call(SaltScanner.java:413)",
            "com.stumbleupon.async.Deferred.doCall(Deferred.java:1278)",

these are the following lines in processRow in SaltScanner.java

493      List<Annotation> notes = annotations.get(key);
494      if (notes == null) {
495        notes = new ArrayList<Annotation>();
496        annotations.put(key, notes);
497      }

I removed this code using the diff below. An if guard around the above code block (driven from the global annotations settings) might have been enough to avoid the block of code, but I opted for a slightly more invasive code removal. The thought being that if we no longer see the

( opentsdb processes spinning CPU ) AND ( the CPU threads in annotations code )

then it will have helped. A jar file with these changes has been deployed to our development environment. If testing is without incident then a production deployment will follow. I will report back.

> diff -dur opentsdb-2.2.0 opentsdb-2.2.0_fix4
diff -dur opentsdb-2.2.0/src/core/SaltScanner.java opentsdb-2.2.0_fix4/src/core/SaltScanner.java
--- opentsdb-2.2.0/src/core/SaltScanner.java    2016-02-14 20:43:12.000000000 -0500
+++ opentsdb-2.2.0_fix4/src/core/SaltScanner.java       2016-10-06 11:22:37.012060985 -0400
@@ -15,7 +15,6 @@
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
-import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
@@ -306,8 +305,8 @@
     private final List<KeyValue> kvs = new ArrayList<KeyValue>();
     private final ByteMap<List<Annotation>> annotations =
             new ByteMap<List<Annotation>>();
-    private final Set<String> skips = new HashSet<String>();
-    private final Set<String> keepers = new HashSet<String>();
+    private final Set<String> skips = Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
+    private final Set<String> keepers = Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());

     private long scanner_start = -1;
     /** nanosecond timestamps */
@@ -491,18 +490,12 @@
         tsdb.getClient().delete(del);
       }

-      List<Annotation> notes = annotations.get(key);
-      if (notes == null) {
-        notes = new ArrayList<Annotation>();
-        annotations.put(key, notes);
-      }
-
       final KeyValue compacted;
       // let IllegalDataExceptions bubble up so the handler above can close
       // the scanner
       final long compaction_start = DateTime.nanoTime();
       try {
-        compacted = tsdb.compact(row, notes);
+        compacted = tsdb.compact(row);
       } catch (IllegalDataException idex) {
         compaction_time += (DateTime.nanoTime() - compaction_start);
         close(false);
@@ -556,7 +549,7 @@
       }

       if (ok && exception == null) {
-        validateAndTriggerCallback(kvs, annotations);
+        validateAndTriggerCallback(kvs);
       } else {
         completed_tasks.incrementAndGet();
       }
@@ -568,22 +561,13 @@
    * @param kvs The compacted columns fetched by the scanner
    * @param annotations The annotations fetched by the scanners
    */
-  private void validateAndTriggerCallback(final List<KeyValue> kvs,
-          final Map<byte[], List<Annotation>> annotations) {
+  private void validateAndTriggerCallback(final List<KeyValue> kvs) {

     final int tasks = completed_tasks.incrementAndGet();
     if (kvs.size() > 0) {
       kv_map.put(tasks, kvs);
     }
-
-    for (final byte[] key : annotations.keySet()) {
-      final List<Annotation> notes = annotations.get(key);
-      if (notes.size() > 0) {
-        // Optimistic write, expecting unique row keys
-        annotation_map.put(key, notes);
-      }
-    }
-
+
     if (tasks >= Const.SALT_BUCKETS()) {
       try {
         mergeAndReturnResults();
diff -dur opentsdb-2.2.0/src/core/TSDB.java opentsdb-2.2.0_fix4/src/core/TSDB.java
--- opentsdb-2.2.0/src/core/TSDB.java   2016-02-14 20:43:12.000000000 -0500
+++ opentsdb-2.2.0_fix4/src/core/TSDB.java      2016-10-06 11:29:22.325039247 -0400
@@ -1283,9 +1283,9 @@
   // Compaction helpers //
   // ------------------ //

-  final KeyValue compact(final ArrayList<KeyValue> row,
-      List<Annotation> annotations) {
-    return compactionq.compact(row, annotations);
+  final KeyValue compact(final ArrayList<KeyValue> row){
+    List<Annotation> dummy_annotations = new ArrayList<Annotation>();
+    return compactionq.compact(row, dummy_annotations);
   }

   /**
diff -dur opentsdb-2.2.0/src/core/TSQuery.java opentsdb-2.2.0_fix4/src/core/TSQuery.java
--- opentsdb-2.2.0/src/core/TSQuery.java        2016-02-14 20:43:12.000000000 -0500
+++ opentsdb-2.2.0_fix4/src/core/TSQuery.java   2016-09-05 11:21:18.000000000 -0400
@@ -59,10 +59,10 @@
   private boolean padding;

   /** Whether or not to suppress annotation output */
-  private boolean no_annotations;
+  private boolean no_annotations = true;

   /** Whether or not to scan for global annotations in the same time range */
-  private boolean with_global_annotations;
+  private boolean with_global_annotations = false;

   /** Whether or not to show TSUIDs when returning data */
   private boolean show_tsuids;
@@ -315,12 +315,14 @@

   /** @return whether or not to supress annotatino output */
   public boolean getNoAnnotations() {
-    return no_annotations;
+    return true;
+    /** return no_annotations; */
   }

   /** @return whether or not to load global annotations for the time range */
   public boolean getGlobalAnnotations() {
-    return with_global_annotations;
+    return false;
+    /** return with_global_annotations; */
   }

   /** @return whether or not to display TSUIDs with the results */
diff -dur opentsdb-2.2.0/src/core/TsdbQuery.java opentsdb-2.2.0_fix4/src/core/TsdbQuery.java
--- opentsdb-2.2.0/src/core/TsdbQuery.java      2016-02-14 20:43:12.000000000 -0500
+++ opentsdb-2.2.0_fix4/src/core/TsdbQuery.java 2016-10-06 11:57:27.974019868 -0400
@@ -747,7 +747,7 @@
          }
          final long compaction_start = DateTime.nanoTime();
          final KeyValue compacted =
-           tsdb.compact(row, datapoints.getAnnotations());
+           tsdb.compact(row);
          compaction_time += (DateTime.nanoTime() - compaction_start);
          seenAnnotation |= !datapoints.getAnnotations().isEmpty();
          if (compacted != null) { // Can be null if we ignored all KVs.

The tests needed these changes to pass:

diff -dur opentsdb-2.2.0/test/core/TestSaltScanner.java opentsdb-2.2.0_fix4/test/core/TestSaltScanner.java
--- opentsdb-2.2.0/test/core/TestSaltScanner.java       2016-02-14 20:43:12.000000000 -0500
+++ opentsdb-2.2.0_fix4/test/core/TestSaltScanner.java  2016-10-06 11:58:54.516623000 -0400
@@ -165,7 +165,8 @@
     assertEquals(1356998400000L, span.timestamp(0));
     assertEquals(VALUE_LONG, span.longValue(1));
     assertEquals(1356998401000L, span.timestamp(1));
-    assertEquals(1, span.getAnnotations().size());
+    //assertEquals(1, span.getAnnotations().size());
+    assertEquals(0, span.getAnnotations().size());

     span = spans.get(KEY_B);
     assertEquals(1, span.size());
@@ -201,7 +202,8 @@
     assertEquals(1356998400000L, span.timestamp(0));
     assertEquals(VALUE_LONG, span.longValue(1));
     assertEquals(1356998401000L, span.timestamp(1));
-    assertEquals(1, span.getAnnotations().size());
+    //assertEquals(1, span.getAnnotations().size());
+    assertEquals(0, span.getAnnotations().size());

     span = spans.get(KEY_B);
     assertEquals(1, span.size());
@@ -239,7 +241,8 @@
     assertEquals(1356998400000L, span.timestamp(0));
     assertEquals(VALUE_LONG, span.longValue(1));
     assertEquals(1356998401000L, span.timestamp(1));
-    assertEquals(1, span.getAnnotations().size());
+    //assertEquals(1, span.getAnnotations().size());
+    assertEquals(0, span.getAnnotations().size());

     span = spans.get(KEY_B);
     assertEquals(1, span.size());
@@ -400,7 +403,7 @@
     setupMockScanners(false);

     doThrow(new IllegalDataException("Boo!")).when(
-        tsdb).compact(any(ArrayList.class), any(List.class));
+        tsdb).compact(any(ArrayList.class));

     final SaltScanner scanner = new SaltScanner(tsdb, METRIC_BYTES, scanners,
         spans, filters);
@@ -413,7 +416,7 @@
     setupMockScanners(false);

     doThrow(new RuntimeException("Boo!")).when(
-        tsdb).compact(any(ArrayList.class), any(List.class));
+        tsdb).compact(any(ArrayList.class));

     final SaltScanner scanner = new SaltScanner(tsdb, METRIC_BYTES, scanners,
         spans, filters);
diff -dur opentsdb-2.2.0/test/core/TestTSQuery.java opentsdb-2.2.0_fix4/test/core/TestTSQuery.java
--- opentsdb-2.2.0/test/core/TestTSQuery.java   2016-02-14 20:43:12.000000000 -0500
+++ opentsdb-2.2.0_fix4/test/core/TestTSQuery.java      2016-10-06 16:38:58.692250807 -0400
@@ -424,7 +424,7 @@
     final int hash_a = sub1.hashCode();
     sub1.setNoAnnotations(true);
     final int hash_b = sub1.hashCode();
-    assertTrue(hash_a != hash_b);
+    //assertTrue(hash_a != hash_b);
     sub1.validateAndSetQuery();
     assertEquals(hash_b, sub1.hashCode());

diff -dur opentsdb-2.2.0/test/core/TestTsdbQueryQueries.java opentsdb-2.2.0_fix4/test/core/TestTsdbQueryQueries.java
--- opentsdb-2.2.0/test/core/TestTsdbQueryQueries.java  2016-02-14 20:43:12.000000000 -0500
+++ opentsdb-2.2.0_fix4/test/core/TestTsdbQueryQueries.java     2016-10-06 16:09:51.896627900 -0400
@@ -714,7 +714,7 @@
     query.setTimeSeries(METRIC_STRING, tags, Aggregators.SUM, false);

     final DataPoints[] dps = query.run();
-    assertMeta(dps, 0, false, true);
+    assertMeta(dps, 0, false, false);

     int value = 1;
     long timestamp = 1356998430000L;
@@ -740,7 +740,7 @@
     query.setEndTime(1357041600);
     query.setTimeSeries(METRIC_STRING, tags, Aggregators.SUM, false);
     DataPoints[] dps = query.run();
-    assertMeta(dps, 0, false, true);
+    assertMeta(dps, 0, false, false);

     // this should only compact the rows for the time series that we fetched and
     // leave the others alone
@@ -790,7 +790,7 @@
     }

     dps = query.run();
-    assertMeta(dps, 0, false, true);
+    assertMeta(dps, 0, false, false);

     int value = 1;
     long timestamp = 1356998430000L;
@@ -823,7 +823,7 @@
     query.setTimeSeries(METRIC_STRING, tags, Aggregators.SUM, false);

     final DataPoints[] dps = query.run();
-    assertMeta(dps, 0, false, true);
+    assertMeta(dps, 0, false, false);

     int value = 1;
     long timestamp = 1356998430000L;
@@ -867,11 +867,11 @@
     // TODO - apparently if you only fetch annotations, the metric and tags
     // may not be set. Check this
     //assertMeta(dps, 0, false, true);
-    assertEquals(1, dps[0].getAnnotations().size());
-    assertEquals(NOTE_DESCRIPTION, dps[0].getAnnotations().get(0)
-        .getDescription());
-    assertEquals(NOTE_NOTES, dps[0].getAnnotations().get(0).getNotes());
-    assertEquals(0, dps[0].size());
+    //assertEquals(1, dps[0].getAnnotations().size());
+    //assertEquals(NOTE_DESCRIPTION, dps[0].getAnnotations().get(0)
+    //    .getDescription());
+    //assertEquals(NOTE_NOTES, dps[0].getAnnotations().get(0).getNotes());
+    //assertEquals(0, dps[0].size());
   }

   @Test
@@ -912,7 +912,7 @@
     query.setTimeSeries(METRIC_STRING, tags, Aggregators.SUM, false);

     final DataPoints[] dps = query.run();
-    assertMeta(dps, 0, false, true);
+    assertMeta(dps, 0, false, false);

     assertEquals(1, dps[0].size());
     assertEquals(42, dps[0].longValue(0));
diff -dur opentsdb-2.2.0/test/storage/MockBase.java opentsdb-2.2.0_fix4/test/storage/MockBase.java
--- opentsdb-2.2.0/test/storage/MockBase.java   2016-02-14 20:43:12.000000000 -0500
+++ opentsdb-2.2.0_fix4/test/storage/MockBase.java      2016-10-06 15:34:58.251151647 -0400
@@ -678,7 +678,7 @@
           row.remove(k);
         }
         final KeyValue compacted =
-            Whitebox.invokeMethod(tsdb, "compact", kvs, Collections.EMPTY_LIST);
+            Whitebox.invokeMethod(tsdb, "compact", kvs);
         final TreeMap<Long, byte[]> compacted_value = new TreeMap<Long, byte[]>();
         compacted_value.put(current_timestamp++, compacted.value());
         row.put(compacted.qualifier(), compacted_value);
diff -dur opentsdb-2.2.0/test/tsd/TestHttpJsonSerializer.java opentsdb-2.2.0_fix4/test/tsd/TestHttpJsonSerializer.java
--- opentsdb-2.2.0/test/tsd/TestHttpJsonSerializer.java 2016-02-14 20:43:12.000000000 -0500
+++ opentsdb-2.2.0_fix4/test/tsd/TestHttpJsonSerializer.java    2016-10-06 15:39:54.504012772 -0400
@@ -222,7 +222,7 @@
     assertTrue(json.contains("\"1356998700\":1,"));
     assertTrue(json.contains("\"1357058700\":201"));
     assertFalse(json.contains("\"timeTotal\""));
-    assertTrue(json.contains("\"tsuid\":\"000001000001000001\""));
+    //assertTrue(json.contains("\"tsuid\":\"000001000001000001\""));
     assertFalse(json.contains("\"query\":"));
   }

@@ -245,7 +245,7 @@
     assertTrue(json.contains("\"1356998700\":1,"));
     assertTrue(json.contains("\"1357058700\":201"));
     assertFalse(json.contains("\"timeTotal\""));
-    assertTrue(json.contains("\"tsuid\":\"000001000001000001\""));
+    //assertTrue(json.contains("\"tsuid\":\"000001000001000001\""));
     assertTrue(json.contains("\"query\":"));
   }
@thatsafunnyname
Contributor

The jar file with the diff from #823 (comment) (which removes some code supporting annotations) is in the process of being rolled out to a production environment where user CPU time spikes had been seen at times when many threads were seen in the annotations code.

So far, the user CPU time spikes are not seen once the jar file has been deployed, and the monit actions for the opentsdb processes exceeding CPU thresholds have not fired.

This chart is for 3 host (24 CPUs each), it is the sum of 1min averages of user CPU time.
Over 14 days.

image

So being able to cleanly disable annotations in the opentsdb config ( #873 ) will help.

I just wanted to also mention that we are in the process of making some changes to our systems that could change the number of threads, and may mean that we will not hit the race condition as frequently. So our ability to easily test a 'proper' fix may be reduced.

@chrisbertsch

We are eagerly awaiting this fix, when is it expected for this change to be merged in? Do we have a new expected release date for v2.3.0?

@johann8384 johann8384 added a commit to johann8384/opentsdb that referenced this issue Dec 2, 2016
@johann8384 johann8384 Disable Annotation RPC
Fix #823
ca4f75a
@johann8384 johann8384 added a commit to johann8384/opentsdb that referenced this issue Dec 5, 2016
@thatsafunnyname @johann8384 thatsafunnyname + johann8384 Use thread safe ConcurrentHashMap. (#3)
For OpenTSDB#823
Concurrent use of a HashSet can trigger race conditions and an infinite loop(s), use the thread-safe ConcurrentHashMap to avoid this.
Making the change just for the thread stack seen looping.

Signed-off-by: Chris Larsen <clarsen@yahoo-inc.com>
0368914
@manolama manolama added a commit to manolama/opentsdb that referenced this issue Dec 17, 2016
@manolama manolama Attempt a fix for #823 wherein writes to the annotations byte map
were not synchronized, potentially leading to a race condition and
stuck threads. Now we'll send a fresh list to the compaction code
if notes were found, synchronize on the local byte map before making
any modifications. Thanks to @thatsafunnyname.
ee1bf44
@manolama
Member

manolama@ee1bf44 should handle this by synchronizing on the map before making modifications. Will merge before cutting 2.3.0

@manolama manolama added a commit that referenced this issue Dec 17, 2016
@manolama manolama Attempt a fix for #823 wherein writes to the annotations byte map
were not synchronized, potentially leading to a race condition and
stuck threads. Now we'll send a fresh list to the compaction code
if notes were found, synchronize on the local byte map before making
any modifications. Thanks to @thatsafunnyname.

Signed-off-by: Chris Larsen <clarsen@yahoo-inc.com>
8721acd
@manolama manolama added a commit that referenced this issue Dec 17, 2016
@manolama manolama Attempt a fix for #823 wherein writes to the annotations byte map
were not synchronized, potentially leading to a race condition and
stuck threads. Now we'll send a fresh list to the compaction code
if notes were found, synchronize on the local byte map before making
any modifications. Thanks to @thatsafunnyname.

Signed-off-by: Chris Larsen <clarsen@yahoo-inc.com>
76eb740
@manolama manolama added a commit that referenced this issue Dec 17, 2016
@manolama manolama Attempt a fix for #823 wherein writes to the annotations byte map
were not synchronized, potentially leading to a race condition and
stuck threads. Now we'll send a fresh list to the compaction code
if notes were found, synchronize on the local byte map before making
any modifications. Thanks to @thatsafunnyname.

Signed-off-by: Chris Larsen <clarsen@yahoo-inc.com>
6c005ce
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment