Major bug: GroupBy aggregates incorrect ranges #717

Closed
pdeva opened this issue Aug 30, 2014 · 12 comments

Comments

@pdeva
Contributor

pdeva commented Aug 30, 2014

Assume your data begins at 17:48 and is stored every minute.

You give GroupBy an interval beginning at 17:46, with a granularity of 3 minutes.

You should get data with timestamps:

  • 17:46
  • 17:49
  • 17:52
    etc.

However, you get:

  • 17:48
  • 17:51
  • 17:54

This is not the data that was asked for. The timestamps returned should be the ones in the former list.

There is currently no workaround for this bug either. If you want data aggregated into the buckets shown in the former list, there is no way to ask Druid for it.

Update

While there is an origin option in granularity, it doesn't actually work, since it always makes the realtime node throw an exception (so the broker returns a 500):

2014-08-30 23:16:06,507 WARN [qtp2092234533-34] org.eclipse.jetty.servlet.ServletHandler - /druid/v2/
java.util.NoSuchElementException
    at io.druid.granularity.BaseQueryGranularity$1$1.next(BaseQueryGranularity.java:63)
    at io.druid.granularity.BaseQueryGranularity$1$1.next(BaseQueryGranularity.java:49)
    at io.druid.query.groupby.GroupByQueryHelper.createIndexAccumulatorPair(GroupByQueryHelper.java:49)
    at io.druid.query.GroupByParallelQueryRunner.run(GroupByParallelQueryRunner.java:82)
    at io.druid.query.groupby.GroupByQueryQueryToolChest$2.run(GroupByQueryQueryToolChest.java:87)
    at io.druid.query.FinalizeResultsQueryRunner.run(FinalizeResultsQueryRunner.java:96)
    at io.druid.query.BaseQuery.run(BaseQuery.java:80)
    at io.druid.query.BaseQuery.run(BaseQuery.java:75)
    at io.druid.server.QueryResource.doPost(QueryResource.java:119)
    at sun.reflect.GeneratedMethodAccessor48.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
    at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$VoidOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:167)
    at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
    at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
    at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
    at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1511)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1442)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1391)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1381)
    at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:538)
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:716)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:278)
    at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:268)
    at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:180)
    at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:93)
    at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120)
    at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:132)
    at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:129)
    at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:206)
    at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:129)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1622)
    at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
    at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:298)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1622)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:549)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:219)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1111)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:478)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:183)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1045)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.eclipse.jetty.server.Server.handle(Server.java:462)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:279)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:232)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:534)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:607)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:536)
    at java.lang.Thread.run(Thread.java:745)

I am using Druid 0.6.121.

@xvrl
Member

xvrl commented Aug 31, 2014

@pdeva I believe this is the same bug we just fixed with #703. Can you try again using Druid 0.6.150?

@xvrl
Member

xvrl commented Aug 31, 2014

A possible workaround is to specify a granularity of type "period" with a period of "PT3M", with the same origin. PeriodGranularity should not be affected by this bug, since it uses a different algorithm to truncate timestamps.

@xvrl
Member

xvrl commented Aug 31, 2014

As far as the timestamps are concerned, the granularity timestamp truncation is always relative to time zero (a.k.a. Jan 1 1970 UTC), and not relative to the query interval.
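To make the point above concrete, here is a minimal sketch of flooring timestamps to a fixed 3-minute granularity counted from the epoch. It is only an illustration of the arithmetic, not Druid's implementation; the function name `truncate_to_epoch` and the date 2014-08-30 are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def truncate_to_epoch(ts: datetime, granularity: timedelta) -> datetime:
    """Floor ts to a multiple of granularity counted from the epoch (hypothetical helper)."""
    elapsed = ts - EPOCH
    # timedelta // timedelta yields an int: the number of whole buckets since the epoch.
    return EPOCH + (elapsed // granularity) * granularity


# Assumed date; rows arrive every minute starting at 17:48 UTC, as in the first post.
first_row = datetime(2014, 8, 30, 17, 48, tzinfo=timezone.utc)
rows = [first_row + timedelta(minutes=i) for i in range(9)]

buckets = sorted({truncate_to_epoch(r, timedelta(minutes=3)) for r in rows})
bucket_labels = [b.strftime("%H:%M") for b in buckets]
print(bucket_labels)  # ['17:48', '17:51', '17:54']
```

Because 17:48 already falls on a 3-minute boundary counted from the epoch, the buckets start at 17:48 no matter where the query interval begins, which matches the second list in the first post.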

@pdeva
Contributor Author

pdeva commented Aug 31, 2014

So this is indeed a bug then.


Prashant

@xvrl
Member

xvrl commented Aug 31, 2014

Yes, there is a bug if you specify an origin. If you do not specify an origin, the behavior is correct.
Can you try using Druid 0.6.150 or use PeriodGranularity as a workaround?

@pdeva
Contributor Author

pdeva commented Aug 31, 2014

OK, so using period granularity on 0.6.121, I still get a 500. Details:

Query sent:

{
  "queryType": "groupBy",
  "dataSource": "dripstat",
  "intervals": "2014-08-30T21:47:00.000Z/2014-08-31T03:47:01.000Z",
  "granularity": {
    "type": "period",
    "duration": "PT6M",
    "origin": "2014-08-30T21:47:00.000Z"
  },
  "aggregations": [
    {
      "type": "doubleSum",
      "fieldName": "totalTime",
      "name": "totalTime"
    },
    {
      "type": "longSum",
      "fieldName": "callCount",
      "name": "callCount"
    }
  ],
  "postAggregations": null,
  "filter": {
    "type": "and",
    "fields": [
      {
        "type": "selector",
        "dimension": "uppercategory",
        "value": "Txn"
      },
      {
        "type": "selector",
        "dimension": "appid",
        "value": "54020d463004c2cafcb8eab6"
      },
      {
        "type": "selector",
        "dimension": "metricid",
        "value": 22364
      }
    ]
  },
  "dimensions": [
    "name"
  ]
}

This time I see an exception only on the broker, not the realtime node. There are multiple:

2014-08-31 03:47:41,690 WARN [qtp474493839-32] org.eclipse.jetty.servlet.ServletHandler - 
javax.servlet.ServletException: com.fasterxml.jackson.databind.JsonMappingException: Instantiation of [simple type, class io.druid.granularity.PeriodGranularity] value failed: null
    at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:420)
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:538)
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:716)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:278)
    at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:268)
    at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:180)
    at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:93)
    at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120)
    at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:132)
    at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:129)
    at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:206)
    at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:129)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1622)
    at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
    at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:298)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1622)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:549)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:219)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1111)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:478)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:183)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1045)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.eclipse.jetty.server.Server.handle(Server.java:462)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:279)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:232)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:534)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:607)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:536)
    at java.lang.Thread.run(Thread.java:745)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Instantiation of [simple type, class io.druid.granularity.PeriodGranularity] value failed: null
    at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.wrapException(StdValueInstantiator.java:440)
    at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:244)
    at com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build(PropertyBasedCreator.java:158)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:401)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:977)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:276)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:157)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:123)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:113)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:82)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeWithType(BeanDeserializerBase.java:894)
    at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:462)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:347)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:977)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:276)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:157)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:123)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:113)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:82)
    at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:106)
    at com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:36)
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:2888)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2103)
    at io.druid.server.QueryResource.doPost(QueryResource.java:108)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
    at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$VoidOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:167)
    at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
    at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
    at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
    at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1511)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1442)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1391)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1381)
    at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
    ... 32 more
Caused by: java.lang.NullPointerException
    at io.druid.granularity.PeriodGranularity.isCompoundPeriod(PeriodGranularity.java:270)
    at io.druid.granularity.PeriodGranularity.<init>(PeriodGranularity.java:59)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
    at com.fasterxml.jackson.databind.introspect.AnnotatedConstructor.call(AnnotatedConstructor.java:125)
    at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:240)
    ... 70 more

I guess using an origin at all breaks things, though in a completely different way when using period granularity.

@pdeva
Contributor Author

pdeva commented Aug 31, 2014

Also, while I am doing this, I just want to know if I am on the right track.
Using the example in the first post, if I specify 17:46 as the origin, I should see the timestamps in the first list that I was expecting, right?

@xvrl
Member

xvrl commented Aug 31, 2014

You are specifying your granularity period incorrectly. You need to replace the duration field with period, i.e.

"granularity": {
    "type": "period",
    "period": "PT6M",
    "origin": "2014-08-30T21:47:00.000Z"
}
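A mistake like this could be caught on the client before the query is sent, rather than surfacing as a 500 from the broker. The sketch below is one hypothetical way to do that; `check_period_granularity` is an invented helper, not part of any Druid client, and it assumes only what this thread states: a granularity of type "period" takes a "period" field and not "duration".

```python
from typing import List


def check_period_granularity(spec: dict) -> List[str]:
    """Return a list of problems found in a granularity spec (hypothetical helper)."""
    problems = []
    if spec.get("type") != "period":
        # Only the "period" type is checked in this sketch.
        return problems
    if "period" not in spec:
        problems.append('missing "period" field')
    if "duration" in spec:
        problems.append('"duration" found; type "period" expects "period" instead')
    return problems


# The spec from the failing query, and the corrected one from the reply above.
bad = {"type": "period", "duration": "PT6M", "origin": "2014-08-30T21:47:00.000Z"}
good = {"type": "period", "period": "PT6M", "origin": "2014-08-30T21:47:00.000Z"}

bad_problems = check_period_granularity(bad)
good_problems = check_period_granularity(good)
print(bad_problems)
print(good_problems)  # []
```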

To answer your second question: yes, if you specify 17:46 as the origin, your timestamps will go in increments of 3 minutes starting at 17:46.
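The effect of an origin can be sketched with the same kind of arithmetic: measure elapsed time from the origin instead of from the epoch, floor it to whole periods, and add the origin back. This is a simplified illustration for fixed-length periods only, not Druid's PeriodGranularity code; `truncate_to_origin` and the date 2014-08-30 are assumptions for the example.

```python
from datetime import datetime, timedelta, timezone


def truncate_to_origin(ts: datetime, period: timedelta, origin: datetime) -> datetime:
    """Floor ts to a multiple of period counted from origin (hypothetical helper)."""
    steps = (ts - origin) // period  # whole periods elapsed since the origin
    return origin + steps * period


# Assumed date; origin at 17:46 UTC, rows every minute starting at 17:48 UTC.
origin = datetime(2014, 8, 30, 17, 46, tzinfo=timezone.utc)
first_row = datetime(2014, 8, 30, 17, 48, tzinfo=timezone.utc)
rows = [first_row + timedelta(minutes=i) for i in range(6)]

buckets = sorted({truncate_to_origin(r, timedelta(minutes=3), origin) for r in rows})
origin_labels = [b.strftime("%H:%M") for b in buckets]
print(origin_labels)  # ['17:46', '17:49', '17:52']
```

The first bucket is labelled 17:46 even though the first row is at 17:48, because the row falls inside the bucket that starts at the origin.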

@pdeva
Contributor Author

pdeva commented Aug 31, 2014

OK, yes, with the correction you mentioned, using period granularity indeed works as needed.
Let me try with 0.6.150 and I will update this.

@xvrl
Member

xvrl commented Sep 8, 2014

Hi @pdeva, were you able to check with a recent version of Druid?

@pdeva
Contributor Author

pdeva commented Sep 8, 2014

not yet

Prashant


@fjy
Contributor

fjy commented Sep 16, 2014

I am closing this. Please reopen if the error is reproducible.

@fjy fjy closed this as completed Sep 16, 2014
paul-rogers pushed a commit to paul-rogers/druid that referenced this issue Jan 17, 2022