CallbackOverflowError with many tag values #839

Open
bstoll opened this issue Jul 17, 2016 · 11 comments · May be fixed by #2204
bstoll commented Jul 17, 2016

I have a metric with 2 tags and 300 * 300 tag values. An API query that returns all of these tag values runs into a callback overflow exception. I've seen this with 2.2, 2.3.0-RC1, and next.

00:04:32.993 INFO  [QueryStats.<init>] - Executing new query={"query":{"start":"1468642762","end":"1468729162","timezone":null,"options":null,"padding":false,"queries":[{"aggregator":"avg","metric":"test.bug","tsuids":null,"downsample":null,"rate":true,"filters":[{"tagk":"tag1","filter":"*","group_by":true,"type":"wildcard"},{"tagk":"tag2","filter":"*","group_by":true,"type":"wildcard"}],"index":0,"rateOptions":{"counter":false,"dropResets":false,"counterMax":9223372036854775807,"resetValue":0},"filterTagKs":[],"explicitTags":false,"tags":{"tag1":"wildcard(*)","tag2":"wildcard(*)"}}],"delete":false,"noAnnotations":false,"globalAnnotations":false,"showTSUIDs":false,"msResolution":false,"showQuery":false,"showStats":false,"showSummary":false,"useCalendar":false},"exception":"null","executed":1,"user":null,"requestHeaders":{"Origin":"file://","Cache-Control":"no-cache","Accept":"*/*","Connection":"keep-alive","User-Agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Postman/4.4.2 Chrome/51.0.2704.103 Electron/1.2.5 Safari/537.36","Host":"1.1.1.1:4242","Postman-Token":"d70a9c15-c63d-c6fb-7474-75d6c05e4ba7","Accept-Encoding":"gzip, deflate","Accept-Language":"en-US","Content-Length":"722","Content-Type":"text/plain;charset=UTF-8"},"numRunningQueries":6,"httpResponse":null,"queryStartTimestamp":1468731872992,"queryCompletedTimestamp":0,"sentToClient":false,"stats":{}}

00:04:33.688 ERROR [RegionClient.exceptionCaught] - Unexpected exception from downstream on [id: 0x3139ea25, /10.0.0.2:48400 => /10.0.0.3:60020]
com.stumbleupon.async.CallbackOverflowError: Too many callbacks in Deferred@811087221(state=PENDING, result=null, callback=net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@3467b7a6 -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@71653f4e -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@315c24d8 -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@405e4c1b -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@498d0931 -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@114ac001 -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@41e04c14 -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@73eceea7 -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@5289a56a -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@15043df2 -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@56c753aa -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@424a1603 -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@5d7749d5 -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@69a81424 ....(size=16383) when attempting to add cb=net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@31ad1009@833425417, eb=passthrough@922150289
    at com.stumbleupon.async.Deferred.addCallbacks(Deferred.java:669) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.addCallback(Deferred.java:724) ~[async-1.4.0.jar:na]
    at net.opentsdb.tsd.HttpJsonSerializer.formatQueryAsyncV1(HttpJsonSerializer.java:874) ~[tsdb-2.3.0-RC1.jar:]
    at net.opentsdb.tsd.QueryRpc$1QueriesCB.call(QueryRpc.java:260) ~[tsdb-2.3.0-RC1.jar:]
    at net.opentsdb.tsd.QueryRpc$1QueriesCB.call(QueryRpc.java:231) ~[tsdb-2.3.0-RC1.jar:]
    at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.callback(Deferred.java:1005) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.DeferredGroup.done(DeferredGroup.java:173) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.DeferredGroup.recordCompletion(DeferredGroup.java:158) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.DeferredGroup.access$200(DeferredGroup.java:36) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.DeferredGroup$1NotifyOrdered.call(DeferredGroup.java:97) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.callback(Deferred.java:1005) ~[async-1.4.0.jar:na]
    at net.opentsdb.core.TsdbQuery$1ScannerCB.close(TsdbQuery.java:873) ~[tsdb-2.3.0-RC1.jar:]
    at net.opentsdb.core.TsdbQuery$1ScannerCB.call(TsdbQuery.java:633) ~[tsdb-2.3.0-RC1.jar:]
    at net.opentsdb.core.TsdbQuery$1ScannerCB.call(TsdbQuery.java:575) ~[tsdb-2.3.0-RC1.jar:]
    at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.access$300(Deferred.java:430) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred$Continue.call(Deferred.java:1366) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.callback(Deferred.java:1005) ~[async-1.4.0.jar:na]
    at org.hbase.async.HBaseRpc.callback(HBaseRpc.java:698) ~[asynchbase-1.7.1.jar:na]
    at org.hbase.async.RegionClient.decode(RegionClient.java:1516) ~[asynchbase-1.7.1.jar:na]
    at org.hbase.async.RegionClient.decode(RegionClient.java:88) ~[asynchbase-1.7.1.jar:na]
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:500) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) [netty-3.9.4.Final.jar:na]
    at org.hbase.async.RegionClient.handleUpstream(RegionClient.java:1206) ~[asynchbase-1.7.1.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelHandler.messageReceived(SimpleChannelHandler.java:142) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.handler.timeout.IdleStateAwareChannelHandler.handleUpstream(IdleStateAwareChannelHandler.java:36) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.handler.timeout.IdleStateHandler.messageReceived(IdleStateHandler.java:294) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) [netty-3.9.4.Final.jar:na]
    at org.hbase.async.HBaseClient$RegionClientPipeline.sendUpstream(HBaseClient.java:3108) [asynchbase-1.7.1.jar:na]
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [netty-3.9.4.Final.jar:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.8.0_91]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [na:1.8.0_91]
    at java.lang.Thread.run(Unknown Source) [na:1.8.0_91]
johann8384 added the bug label Sep 1, 2016
travisby commented Dec 13, 2016

We're hitting a similar issue (with MapR instead of HBase, though) with ~18k tagvs. Let me know if we can be helpful in reproducing :)

15:49:38.722 DEBUG [MapRThreadPool.run] - ScanRpcRunnable::StackOverflowError: {}, ThreadId: {}
com.stumbleupon.async.CallbackOverflowError: Too many callbacks in Deferred@864693141(state=PENDING, result=null, callback=net.opentsdb.tsd.HttpJsonSerializer......
        at com.stumbleupon.async.Deferred.addCallbacks(Deferred.java:669) ~[async-1.4.0.jar:na]
        at com.stumbleupon.async.Deferred.addCallback(Deferred.java:724) ~[async-1.4.0.jar:na]
        at net.opentsdb.tsd.HttpJsonSerializer.formatQueryAsyncV1(HttpJsonSerializer.java:845) ~[tsdb-2.2.1.jar:]
        at net.opentsdb.tsd.QueryRpc$1QueriesCB.call(QueryRpc.java:234) ~[tsdb-2.2.1.jar:]
        at net.opentsdb.tsd.QueryRpc$1QueriesCB.call(QueryRpc.java:217) ~[tsdb-2.2.1.jar:]
        at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278) ~[async-1.4.0.jar:na]
        at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257) ~[async-1.4.0.jar:na]
        at com.stumbleupon.async.Deferred.callback(Deferred.java:1005) ~[async-1.4.0.jar:na]
        at com.stumbleupon.async.DeferredGroup.done(DeferredGroup.java:173) ~[async-1.4.0.jar:na]
        at com.stumbleupon.async.DeferredGroup.recordCompletion(DeferredGroup.java:158) ~[async-1.4.0.jar:na]
        at com.stumbleupon.async.DeferredGroup.access$200(DeferredGroup.java:36) ~[async-1.4.0.jar:na]
        at com.stumbleupon.async.DeferredGroup$1NotifyOrdered.call(DeferredGroup.java:97) ~[async-1.4.0.jar:na]
        at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278) ~[async-1.4.0.jar:na]
        at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257) ~[async-1.4.0.jar:na]
        at com.stumbleupon.async.Deferred.callback(Deferred.java:1005) ~[async-1.4.0.jar:na]
        at net.opentsdb.core.TsdbQuery$1ScannerCB.close(TsdbQuery.java:868) ~[tsdb-2.2.1.jar:]
        at net.opentsdb.core.TsdbQuery$1ScannerCB.call(TsdbQuery.java:628) ~[tsdb-2.2.1.jar:]
        at net.opentsdb.core.TsdbQuery$1ScannerCB.call(TsdbQuery.java:570) ~[tsdb-2.2.1.jar:]
        at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278) ~[async-1.4.0.jar:na]
        at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257) ~[async-1.4.0.jar:na]
        at com.stumbleupon.async.Deferred.callback(Deferred.java:1005) ~[async-1.4.0.jar:na]
        at org.hbase.async.HBaseRpc.callback(HBaseRpc.java:734) ~[asynchbase-1.7.0-mapr-1607.jar:na]
        at org.hbase.async.MapRThreadPool$ScanRpcRunnable.run(MapRThreadPool.java:442) ~[asynchbase-1.7.0-mapr-1607.jar:na]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_102]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_102]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_102]

manolama (Member) commented Dec 17, 2016

I can reproduce this one and it is really annoying. Three things in 2.4 will help with it: query limits, multi-gets (we'll be batching these calls for UID resolution, which will definitely help), and the ability to persist the TSD cache so it's reloaded on restarts.

madjack101 commented Jan 3, 2017

Is there any temporary way to fetch 2 tags and 300 * 300 tag values, as bstoll described? I also encountered this issue. Maybe use the 'tsdb scan' command?

yangzj commented Mar 2, 2018

I have a similar issue on version opentsdb-2.4.0RC2-1.noarch. @manolama

passthrough -> passthrough) (size=16383) when attempting to add cb=net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@31b07b1e@833649438, eb=passthrough@410005023
    at com.stumbleupon.async.Deferred.addCallbacks(Deferred.java:669) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.addCallback(Deferred.java:724) ~[async-1.4.0.jar:na]
    at net.opentsdb.tsd.HttpJsonSerializer.formatQueryAsyncV1(HttpJsonSerializer.java:935) ~[tsdb-2.4.0RC2.jar:]
    at net.opentsdb.tsd.QueryRpc$1QueriesCB.call(QueryRpc.java:288) ~[tsdb-2.4.0RC2.jar:]
    at net.opentsdb.tsd.QueryRpc$1QueriesCB.call(QueryRpc.java:259) ~[tsdb-2.4.0RC2.jar:]
    at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.callback(Deferred.java:1005) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.DeferredGroup.done(DeferredGroup.java:173) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.DeferredGroup.recordCompletion(DeferredGroup.java:158) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.DeferredGroup.access$200(DeferredGroup.java:36) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.DeferredGroup$1NotifyOrdered.call(DeferredGroup.java:97) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.callback(Deferred.java:1005) ~[async-1.4.0.jar:na]
    at net.opentsdb.core.SaltScanner.mergeAndReturnResults(SaltScanner.java:324) ~[tsdb-2.4.0RC2.jar:]
    at net.opentsdb.core.SaltScanner.validateAndTriggerCallback(SaltScanner.java:947) ~[tsdb-2.4.0RC2.jar:]
    at net.opentsdb.core.SaltScanner.access$2300(SaltScanner.java:68) ~[tsdb-2.4.0RC2.jar:]
    at net.opentsdb.core.SaltScanner$ScannerCB.close(SaltScanner.java:910) ~[tsdb-2.4.0RC2.jar:]
    at net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:539) ~[tsdb-2.4.0RC2.jar:]
    at net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:461) ~[tsdb-2.4.0RC2.jar:]
    at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.addCallbacks(Deferred.java:688) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.addCallback(Deferred.java:724) ~[async-1.4.0.jar:na]
    at net.opentsdb.core.SaltScanner$ScannerCB.scan(SaltScanner.java:525) ~[tsdb-2.4.0RC2.jar:]
    at net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:716) ~[tsdb-2.4.0RC2.jar:]
    at net.opentsdb.core.SaltScanner$ScannerCB.call(SaltScanner.java:461) ~[tsdb-2.4.0RC2.jar:]
    at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257) ~[async-1.4.0.jar:na]
    at com.stumbleupon.async.Deferred.callback(Deferred.java:1005) ~[async-1.4.0.jar:na]
    at org.hbase.async.HBaseRpc.callback(HBaseRpc.java:712) ~[asynchbase-1.8.0.jar:na]
    at org.hbase.async.RegionClient.decode(RegionClient.java:1536) ~[asynchbase-1.8.0.jar:na]
    at org.hbase.async.RegionClient.decode(RegionClient.java:88) ~[asynchbase-1.8.0.jar:na]
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:500) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) [netty-3.9.4.Final.jar:na]
    at org.hbase.async.RegionClient.handleUpstream(RegionClient.java:1226) ~[asynchbase-1.8.0.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelHandler.messageReceived(SimpleChannelHandler.java:142) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.handler.timeout.IdleStateAwareChannelHandler.handleUpstream(IdleStateAwareChannelHandler.java:36) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.handler.timeout.IdleStateHandler.messageReceived(IdleStateHandler.java:294) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) [netty-3.9.4.Final.jar:na]
    at org.hbase.async.HBaseClient$RegionClientPipeline.sendUpstream(HBaseClient.java:3678) [asynchbase-1.8.0.jar:na]
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [netty-3.9.4.Final.jar:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_51]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_51]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]

raunz commented Jun 29, 2018

+1 to the issue. opentsdb 2.4.0RC2.
api/query?start=1h-ago&m=avg:rate:10m-avg:interface.octets.in{host=,interface=}

unique "host"+"interface" sets: 165120
unique tagv "host": 3943
unique tagv "interface": 1705

Looks like having multiple tags with a lot of unique tag values isn't a scalable metric design.

11:02:55.907 ERROR [RegionClient.exceptionCaught] - Unexpected exception from downstream on [id: 0x7b0c3ae3, /127.0.0.1:38832 => /127.0.0.1:60020]
com.stumbleupon.async.CallbackOverflowError: Too many callbacks in Deferred@469514390(state=PENDING, result=null, callback=net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@5efeb58d -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@6dd646d7 -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@6fea6ad4 -> .... and so on....

asdf2014 (Contributor) commented Apr 12, 2019

@manolama +1

com.stumbleupon.async.CallbackOverflowError: Too many callbacks in 
Deferred@1416846313(state=PENDING, result=null, 
callback=net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@724af230 -> 
net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@bd85fb4 -> 
net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@225fa408 -> 
net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@6cbd1fb9 -> 
net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@4943bb6c -> 
net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@2f680f -> 
net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@5f8ff3e6 -> 
// .........

patelh commented May 2, 2019

@asdf2014 Did you try these settings?

tsd.query.multi_get.enable=true
tsd.query.multi_get.batch_size=1024
tsd.query.multi_get.concurrent=20

suishaojian commented Jun 14, 2019

@patelh

Hi, I just tried setting these params, but now I get this error when I query from the web UI:

Request failed: Bad Request: Tags list cannot be null or empty.

I cannot query anything. Am I missing something? Could you please help me?

Betula-L commented Sep 26, 2019

@manolama is there any progress on this issue? I encountered it on 2.4.0.

laudukang commented Nov 13, 2019

+1

same issue in version: 2.4.0

error log:

16:14:13.466 ERROR [RegionClient.exceptionCaught] - Unexpected exception from downstream on [id: 0xbbc6970f, /172.31.250.10:50790 => /172.31.120.131:16020]
com.stumbleupon.async.CallbackOverflowError: Too many callbacks in Deferred@1273979129(state=PENDING, result=null, callback=net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@682fd571 -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@3061522a -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@27fac471 -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@17a290df -> net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@1304b07 -> 
//....
passthrough -> passthrough) (size=16383) when attempting to add cb=net.opentsdb.tsd.HttpJsonSerializer$1DPsResolver@29bbd6fc@700176124, eb=passthrough@1260931085

ugtar commented Jun 11, 2021

@manolama A coworker and I have been looking at this issue. The maximum callback chain depth imposed by the async library is 16383, which does not seem restrictively small so I don't think the correct fix is to try to increase this limit, especially since it seems it's possible to reach the limit with a relatively modest number of tags and cardinalities in the hundreds or thousands. We are thinking of a couple of alternatives but would like your guidance/opinion before we embark on anything or submit a patch.

For easy reference, I'll quote the critical sections here (this code was introduced in commit 256c142):

    // We want the serializer to execute serially so we need to create a callback
    // chain so that when one DPsResolver is finished, it triggers the next to
    // start serializing.
    final Deferred<Object> cb_chain = new Deferred<Object>();

    for (DataPoints[] separate_dps : results) {
      for (DataPoints dps : separate_dps) {
        try {
          cb_chain.addCallback(new DPsResolver(dps));
        } catch (Exception e) {
          throw new RuntimeException("Unexpected error durring resolution", e);
        }
      }
    }
    ...
    // trigger the callback chain here
    cb_chain.callback(null);
    return cb_chain.addCallback(new FinalCB());

where each DPsResolver works by building an internal DeferredGroup and finally appending its resultant JSON data to the global JSON result array in its WriteToBuffer callback:

    class DPsResolver implements Callback<Deferred<Object>, Object> {
    ...
    public Deferred<Object> call(final Object obj) throws Exception {
        this.uid_start = DateTime.nanoTime();
        
        resolve_deferreds.add(dps.metricNameAsync()
            .addCallback(new MetricResolver()));
        resolve_deferreds.add(dps.getTagsAsync()
            .addCallback(new TagResolver()));
        resolve_deferreds.add(dps.getAggregatedTagsAsync()
            .addCallback(new AggTagResolver()));
        return Deferred.group(resolve_deferreds)
            .addCallback(new WriteToBuffer(dps));
      }
    }

My understanding is the callback chain is built in order to serialize the building of the global json object since each DPsResolver is responsible for appending its own results.

So the idea is: rather than having each DPsResolver return nothing and write its data to the buffer, remove the WriteToBuffer callback from DPsResolver and have it return its resolved dps. Then, rather than building a callback chain of all the DPsResolvers, create a DeferredGroup of them all and attach a modified WriteToBuffer to the FinalCB, with the task of gathering all the results and writing the JSON output to the buffer.

The most obvious issue I see with this method is that we would need to wait for the results of all the DPsResolvers before being able to begin streaming the result back to the requestor (if that is indeed what is currently happening).
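To make the grouped idea concrete, here's a minimal JDK-only sketch. I'm substituting CompletableFuture for the async library's Deferred (so this only illustrates the shape of the change, not the actual OpenTSDB code), and the names GroupedResolvers and resolveAll are made up for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class GroupedResolvers {
    /** Each resolver yields its own serialized fragment instead of appending
     *  to a shared buffer (analogous to dropping WriteToBuffer from
     *  DPsResolver and returning the resolved dps instead). */
    static String resolveAll(List<CompletableFuture<String>> resolvers) {
        // Analogous to Deferred.group(...): wait for every resolver at once,
        // so no long serial callback chain is ever built.
        CompletableFuture.allOf(resolvers.toArray(new CompletableFuture[0])).join();
        // A single final callback gathers results in submission order
        // and performs one write to the output.
        return resolvers.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        List<CompletableFuture<String>> rs = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            final int n = i;
            rs.add(CompletableFuture.supplyAsync(() -> "dp" + n));
        }
        System.out.println(resolveAll(rs));  // dp0,dp1,dp2,dp3,dp4
    }
}
```

The trade-off above is exactly the one mentioned: nothing is written until every resolver has finished.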

An alternative approach would be to add an inner loop to the cb_chain generation loop that batches the DPsResolver chains into chunks of no more than async.Deferred.MAX_CALLBACK_CHAIN_LENGTH (currently 16383), although admittedly I'm not quite sure exactly how to do that.
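One way the batching could look, again as a JDK-only sketch with CompletableFuture standing in for Deferred (the class BatchedChains, the Resolver interface, and runSerially are all hypothetical names, not OpenTSDB APIs): chain resolvers serially within a chunk, and whenever the chunk hits the limit, bridge to a fresh future so no single chain exceeds the maximum length while execution order is preserved.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class BatchedChains {
    // Hypothetical stand-in for the per-DPs serialization step.
    interface Resolver { void resolve(StringBuilder out); }

    /** Run resolvers strictly in order, starting a fresh future
     *  (analogous to a fresh Deferred) every `limit` callbacks so
     *  no single chain exceeds the library's maximum length. */
    static CompletableFuture<Void> runSerially(List<Resolver> resolvers,
                                               int limit,
                                               StringBuilder out) {
        CompletableFuture<Void> chain = CompletableFuture.completedFuture(null);
        int inChunk = 0;
        for (Resolver r : resolvers) {
            if (inChunk == limit) {
                // Bridge to a fresh future: the old chunk's completion
                // triggers the new chunk, resetting the chain length to 0.
                CompletableFuture<Void> next = new CompletableFuture<>();
                chain.thenRun(() -> next.complete(null));
                chain = next;
                inChunk = 0;
            }
            chain = chain.thenRun(() -> r.resolve(out));
            inChunk++;
        }
        return chain;
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        List<Resolver> rs = new java.util.ArrayList<>();
        for (int i = 0; i < 10; i++) {
            final int n = i;
            rs.add(b -> b.append(n));
        }
        // With a chunk limit of 3, the 10 resolvers span 4 bridged chunks
        // yet still execute in submission order.
        runSerially(rs, 3, out).join();
        System.out.println(out);  // 0123456789
    }
}
```

In the real code the bridging Deferred would carry the partial serialization state forward rather than Void, but the chunking logic would be the same.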

thanks

bshakur8 pushed a commit to bshakur8/opentsdb that referenced this issue Oct 25, 2021
bshakur8 pushed a commit to bshakur8/opentsdb that referenced this issue Oct 28, 2021
bshakur8 pushed a commit to bshakur8/opentsdb that referenced this issue Nov 3, 2021
ugtar linked a pull request Nov 3, 2021 that will close this issue