Unable to query large data set with scan-query via broker #4865

Closed
stevenchen3 opened this issue Sep 27, 2017 · 13 comments


stevenchen3 commented Sep 27, 2017

Recently, I ran into an issue where I failed to query an interval covering about 4 merged and compressed segments, each about 500 MB (about 1.9 million rows; extracted from the compressed segment, the size is about 7 GB), using scan-query (version 0.10.1). The same query works for intervals covering 2 such segments.

I ran the query using curl, writing the result directly to a local file, and observed that the memory consumption of the broker and historical processes grew quickly as the query began running, eventually reaching the allocated memory limit (memory pressure seems to occur). I eventually got the following exception from the historical (with similar errors on the broker), and the query failed:

2017-09-27T06:05:18,309 ERROR [qtp1907228381-17[scan_[testDatasource]_67af8aad-6feb-4226-bd94-e9f9602b654a]] io.druid.server.QueryResource - Unable to send query response.
com.fasterxml.jackson.databind.JsonMappingException: [no message for io.druid.query.QueryInterruptedException]
	at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:139) ~[jackson-databind-2.4.6.jar:2.4.6]
	at com.fasterxml.jackson.databind.ObjectWriter._configAndWriteValue(ObjectWriter.java:800) ~[jackson-databind-2.4.6.jar:2.4.6]
	at com.fasterxml.jackson.databind.ObjectWriter.writeValue(ObjectWriter.java:642) ~[jackson-databind-2.4.6.jar:2.4.6]
	at io.druid.server.QueryResource$1.write(QueryResource.java:210) [druid-server-0.10.1-iap3.jar:0.10.1-iap3]
	at com.sun.jersey.core.impl.provider.entity.StreamingOutputProvider.writeTo(StreamingOutputProvider.java:71) [jersey-core-1.19.3.jar:1.19.3]
	at com.sun.jersey.core.impl.provider.entity.StreamingOutputProvider.writeTo(StreamingOutputProvider.java:57) [jersey-core-1.19.3.jar:1.19.3]
	at com.sun.jersey.spi.container.ContainerResponse.write(ContainerResponse.java:302) [jersey-server-1.19.3.jar:1.19.3]
	at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1510) [jersey-server-1.19.3.jar:1.19.3]
	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419) [jersey-server-1.19.3.jar:1.19.3]
	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409) [jersey-server-1.19.3.jar:1.19.3]
	at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409) [jersey-servlet-1.19.3.jar:1.19.3]
	at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558) [jersey-servlet-1.19.3.jar:1.19.3]
	at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733) [jersey-servlet-1.19.3.jar:1.19.3]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) [javax.servlet-api-3.1.0.jar:3.1.0]
	at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:286) [guice-servlet-4.1.0.jar:?]
	at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:276) [guice-servlet-4.1.0.jar:?]
	at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:181) [guice-servlet-4.1.0.jar:?]
	at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) [guice-servlet-4.1.0.jar:?]
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85) [guice-servlet-4.1.0.jar:?]
	at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120) [guice-servlet-4.1.0.jar:?]
	at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:135) [guice-servlet-4.1.0.jar:?]
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:224) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.Server.handle(Server.java:534) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283) [jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108) [jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) [jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: io.druid.query.QueryInterruptedException
	at io.druid.query.scan.ScanQueryEngine$1$1$1.next(ScanQueryEngine.java:163) ~[?:?]
	at io.druid.query.scan.ScanQueryEngine$1$1$1.next(ScanQueryEngine.java:150) ~[?:?]
	at io.druid.java.util.common.guava.BaseSequence.makeYielder(BaseSequence.java:89) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.BaseSequence.toYielder(BaseSequence.java:68) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.ConcatSequence.makeYielder(ConcatSequence.java:95) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.ConcatSequence.toYielder(ConcatSequence.java:75) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingSequence$2.get(WrappingSequence.java:87) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingSequence$2.get(WrappingSequence.java:83) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.SequenceWrapper.wrap(SequenceWrapper.java:55) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingSequence.toYielder(WrappingSequence.java:82) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.LazySequence.toYielder(LazySequence.java:46) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingSequence$2.get(WrappingSequence.java:87) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingSequence$2.get(WrappingSequence.java:83) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.SequenceWrapper.wrap(SequenceWrapper.java:55) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingSequence.toYielder(WrappingSequence.java:82) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.LazySequence.toYielder(LazySequence.java:46) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingSequence$2.get(WrappingSequence.java:87) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingSequence$2.get(WrappingSequence.java:83) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.SequenceWrapper.wrap(SequenceWrapper.java:55) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingSequence.toYielder(WrappingSequence.java:82) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.query.spec.SpecificSegmentQueryRunner$2.toYielder(SpecificSegmentQueryRunner.java:100) ~[druid-processing-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingSequence$2.get(WrappingSequence.java:87) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingSequence$2.get(WrappingSequence.java:83) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.query.spec.SpecificSegmentQueryRunner.doNamed(SpecificSegmentQueryRunner.java:171) ~[druid-processing-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.query.spec.SpecificSegmentQueryRunner.access$200(SpecificSegmentQueryRunner.java:44) ~[druid-processing-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.query.spec.SpecificSegmentQueryRunner$3.wrap(SpecificSegmentQueryRunner.java:151) ~[druid-processing-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingSequence.toYielder(WrappingSequence.java:82) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingSequence$2.get(WrappingSequence.java:87) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingSequence$2.get(WrappingSequence.java:83) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.query.CPUTimeMetricQueryRunner$1.wrap(CPUTimeMetricQueryRunner.java:74) ~[druid-processing-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingSequence.toYielder(WrappingSequence.java:82) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.ConcatSequence.makeYielder(ConcatSequence.java:95) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.ConcatSequence.wrapYielder(ConcatSequence.java:129) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.ConcatSequence.access$000(ConcatSequence.java:29) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.ConcatSequence$3.next(ConcatSequence.java:143) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingYielder$1.get(WrappingYielder.java:53) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingYielder$1.get(WrappingYielder.java:49) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.query.CPUTimeMetricQueryRunner$1.wrap(CPUTimeMetricQueryRunner.java:74) ~[druid-processing-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingYielder.next(WrappingYielder.java:48) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.jackson.DruidDefaultSerializersModule$4.serialize(DruidDefaultSerializersModule.java:128) ~[druid-processing-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.jackson.DruidDefaultSerializersModule$4.serialize(DruidDefaultSerializersModule.java:118) ~[druid-processing-0.10.1-iap3.jar:0.10.1-iap3]
	at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:128) ~[jackson-databind-2.4.6.jar:2.4.6]

If I understand correctly, scan-query is supposed to solve the memory pressure issue that select-query has. Any advice on what might be going wrong?

Thanks.
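For reference, my query is essentially of this shape (the datasource name and interval shown here are illustrative, not my actual values; batchSize is the default):

```json
{
  "queryType": "scan",
  "dataSource": "testDatasource",
  "intervals": ["2017-01-01/2017-09-01"],
  "batchSize": 20480,
  "resultFormat": "list"
}
```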

@stevenchen3

Oops... looks like I must specify a very long timeout, which seems weird to me in a streaming case ;)


stevenchen3 commented Sep 27, 2017

The error described above can be fixed by explicitly specifying a timeout in the query:

{
  ...

  "context": {
    "timeout": 36000000
  }
}

However, when I query a larger data set (e.g., tens of GB of merged and compressed segments, larger than the memory allocated to the broker and historicals), I encountered java.lang.OutOfMemoryError on the broker; see the exception log snippet below:

java.lang.RuntimeException: com.fasterxml.jackson.databind.JsonMappingException: Query[426a21c5-5420-46de-9fda-7f080fbf7e3b] url[http://master-c457d875.node.local:8083/druid/v2/] failed with exception msg [java.lang.OutOfMemoryError: GC overhead limit exceeded]
	at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
	at io.druid.server.QueryResource$1.write(QueryResource.java:218) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]
	at com.sun.jersey.core.impl.provider.entity.StreamingOutputProvider.writeTo(StreamingOutputProvider.java:71) ~[jersey-core-1.19.3.jar:1.19.3]
	at com.sun.jersey.core.impl.provider.entity.StreamingOutputProvider.writeTo(StreamingOutputProvider.java:57) ~[jersey-core-1.19.3.jar:1.19.3]
	at com.sun.jersey.spi.container.ContainerResponse.write(ContainerResponse.java:302) ~[jersey-server-1.19.3.jar:1.19.3]
	at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1510) ~[jersey-server-1.19.3.jar:1.19.3]
	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419) ~[jersey-server-1.19.3.jar:1.19.3]
	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409) ~[jersey-server-1.19.3.jar:1.19.3]
	at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409) ~[jersey-servlet-1.19.3.jar:1.19.3]
	at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558) ~[jersey-servlet-1.19.3.jar:1.19.3]
	at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733) ~[jersey-servlet-1.19.3.jar:1.19.3]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) ~[javax.servlet-api-3.1.0.jar:3.1.0]
	at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:286) ~[guice-servlet-4.1.0.jar:?]
	at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:276) ~[guice-servlet-4.1.0.jar:?]
	at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:181) ~[guice-servlet-4.1.0.jar:?]
	at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) ~[guice-servlet-4.1.0.jar:?]
	at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120) ~[guice-servlet-4.1.0.jar:?]
	at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:135) ~[guice-servlet-4.1.0.jar:?]
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) ~[jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:224) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.Server.handle(Server.java:534) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283) [jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108) [jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) [jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Query[426a21c5-5420-46de-9fda-7f080fbf7e3b] url[http://master-c457d875.node.local:8083/druid/v2/] failed with exception msg [java.lang.OutOfMemoryError: GC overhead limit exceeded]
	at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:139) ~[jackson-databind-2.4.6.jar:2.4.6]
	at com.fasterxml.jackson.databind.ObjectWriter._configAndWriteValue(ObjectWriter.java:800) ~[jackson-databind-2.4.6.jar:2.4.6]
	at com.fasterxml.jackson.databind.ObjectWriter.writeValue(ObjectWriter.java:642) ~[jackson-databind-2.4.6.jar:2.4.6]
	at io.druid.server.QueryResource$1.write(QueryResource.java:210) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]
	... 39 more
Caused by: io.druid.java.util.common.RE: Query[426a21c5-5420-46de-9fda-7f080fbf7e3b] url[http://master-c457d875.node.local:8083/druid/v2/] failed with exception msg [java.lang.OutOfMemoryError: GC overhead limit exceeded]
	at io.druid.client.DirectDruidClient$1$3.hasMoreElements(DirectDruidClient.java:286) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]
	at java.io.SequenceInputStream.nextStream(SequenceInputStream.java:109) ~[?:1.8.0_144]
	at java.io.SequenceInputStream.close(SequenceInputStream.java:232) ~[?:1.8.0_144]
	at com.fasterxml.jackson.dataformat.smile.SmileParser._closeInput(SmileParser.java:452) ~[jackson-dataformat-smile-2.4.6.jar:2.4.6]
	at com.fasterxml.jackson.core.base.ParserBase.close(ParserBase.java:334) ~[jackson-core-2.4.6.jar:2.4.6]
	at com.fasterxml.jackson.dataformat.smile.SmileParser.close(SmileParser.java:472) ~[jackson-dataformat-smile-2.4.6.jar:2.4.6]
	at io.druid.client.DirectDruidClient$JsonParserIterator.close(DirectDruidClient.java:653) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.CloseQuietly.close(CloseQuietly.java:39) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.client.DirectDruidClient$3.cleanup(DirectDruidClient.java:531) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.client.DirectDruidClient$3.cleanup(DirectDruidClient.java:521) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.BaseSequence$2.close(BaseSequence.java:142) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.io.Closer.close(Closer.java:206) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.MergeSequence$3.close(MergeSequence.java:158) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.java.util.common.guava.WrappingYielder.close(WrappingYielder.java:81) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.jackson.DruidDefaultSerializersModule$4.serialize(DruidDefaultSerializersModule.java:132) ~[druid-processing-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.jackson.DruidDefaultSerializersModule$4.serialize(DruidDefaultSerializersModule.java:118) ~[druid-processing-0.10.1-iap3.jar:0.10.1-iap3]
	at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:128) ~[jackson-databind-2.4.6.jar:2.4.6]
	at com.fasterxml.jackson.databind.ObjectWriter._configAndWriteValue(ObjectWriter.java:800) ~[jackson-databind-2.4.6.jar:2.4.6]
	at com.fasterxml.jackson.databind.ObjectWriter.writeValue(ObjectWriter.java:642) ~[jackson-databind-2.4.6.jar:2.4.6]
	at io.druid.server.QueryResource$1.write(QueryResource.java:210) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]
	... 39 more

batchSize in my query is set to the default 20480; limit is omitted, as I need to get all the data for a specific period. I'm wondering whether the memory pressure issue still exists in scan-query?

@stevenchen3

Looks like there's no back-pressure protocol implemented between the broker and historicals. The broker consumes data (query results) from historicals and adds it to its buffer (a LinkedBlockingQueue), while user client code consumes data from the broker via the REST API. If the channel between the user and the broker is fast enough to consume the data, the buffer at the broker stays relatively small. However, if this channel is slow and multiple historicals are feeding the broker, the buffer grows significantly, to the point where it exceeds the heap size.

If I query all historicals directly to get all raw data for a specific interval, I cannot guarantee (1) segment ordering and (2) the absence of duplicate segments.

It would be nice to have back-pressure between the broker and historicals and/or between the client and the broker (e.g., via another streaming API), like Akka Streams. Just some thoughts :P

stevenchen3 changed the title from "Unable to query large data set with scan-query" to "Unable to query large data set with scan-query via broker" on Sep 29, 2017

gianm commented Sep 29, 2017

IIRC there is no real backpressure between broker and historicals. This was discussed in #4229 where maxScatterGatherBytes was introduced as a workaround to at least keep clusters stable (although it artificially limits response sizes).
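IIRC maxScatterGatherBytes can also be set per query through the query context, along these lines (the byte limit below is illustrative, not a recommendation):

```json
{
  "context": {
    "maxScatterGatherBytes": 1000000000
  }
}
```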

One thing you could do is query the historicals directly if you are really pulling a huge amount of data. That should work well since there is some backpressure there (the historicals will pause scanning segments if you aren't reading the results fast enough).

I think it would also be good to investigate ways to implement backpressure in the broker. If you are interested in helping there, @stevenchen3, that would be great :)

@stevenchen3

@gianm thanks for the explanation. If I query historicals directly, for example to get all raw data, will the results include duplicate entries (segments)? If I understand correctly, segments are replicated among historicals.


gianm commented Jul 19, 2018

@stevenchen3 Sorry for the late response, but if you query directly from historicals, you'd want to specify a specific set of segments to each historical (like the broker does) in order to prevent duplicates.
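Roughly, that means sending each historical a query whose querySegmentSpec pins specific segments, something like this sketch (the segment interval, version, and partition number are made up; IIRC the SegmentDescriptor fields serialize as itvl/ver/part):

```json
{
  "queryType": "scan",
  "dataSource": "testDatasource",
  "intervals": {
    "type": "segments",
    "segments": [
      {
        "itvl": "2017-01-01T00:00:00.000Z/2017-01-02T00:00:00.000Z",
        "ver": "2017-01-15T02:35:27.000Z",
        "part": 0
      }
    ]
  }
}
```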

@himanshug

There are multiple PRs trying to solve this by limiting the memory used in DirectDruidClient, which is great in general. However, a thought I had: this could benefit from the scan query not going through the regular merging process at the broker, and instead simply streaming results from historicals to the client as they show up on the broker.


gianm commented Aug 1, 2018

However, a thought I had: this could benefit from the scan query not going through the regular merging process at the broker, and instead simply streaming results from historicals to the client as they show up on the broker.

@himanshug But what happens if the results from historicals show up faster than they can be written to the client… where are they buffered?

@himanshug

@himanshug But what happens if the results from historicals show up faster than they can be written to the client… where are they buffered?

@gianm I think in that case they would be buffered at the broker using the same code the other PRs are adding; the short-circuit path would mostly be about avoiding everything else related to merging. But maybe no-op merging is not much overhead for scan queries, in which case we wouldn't need it.


gianm commented Sep 7, 2018

Hope to fix this via #6313.


stale bot commented Jun 20, 2019

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

stale bot added the stale label Jun 20, 2019

stale bot commented Jul 4, 2019

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.

stale bot closed this as completed Jul 4, 2019

gianm commented Jul 4, 2019

By the way, this should be fixed now by #6313. You would need to enable it by setting druid.broker.http.maxQueuedBytes.
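For example, in the broker's runtime.properties (the value below is illustrative, not a tuning recommendation):

```properties
# Approximate max bytes queued per query, per server, before the broker
# stops reading from data servers (enabling backpressure).
druid.broker.http.maxQueuedBytes=25000000
```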
