Skip to content

fix time_floor expression performance issue when origin is null#11380

Closed
kaijianding wants to merge 3 commits intoapache:masterfrom
kaijianding:floor
Closed

fix time_floor expression performance issue when origin is null#11380
kaijianding wants to merge 3 commits intoapache:masterfrom
kaijianding:floor

Conversation

@kaijianding
Copy link
Contributor

Description

I noticed that timestamp_floor("__time",'P1D',null,'Asia/Shanghai') is extremely slower(more than 3x in my case) than timestamp_floor("__time",'P1D','','Asia/Shanghai') when I test this sql

select
 sum("count") as c,
 Floor(__time to DAY) as __time
from
 "sms_normal"
where
 __time >= TIMESTAMP '2021-05-19 00:00:00'
 and __time < TIMESTAMP '2021-05-20 23:59:59'
group by
 Floor(__time to DAY)
having
 c>0

This is due to

    if (args.stream().skip(1).allMatch(Expr::isLiteral)) {
      return new TimestampFloorExpr(args);
    } else {
      return new TimestampFloorDynamicExpr(args);
    }

in TimestampFloorExprMacro.java

The "null" expr doesn't match Expr::isLiteral(), then TimestampFloorDynamicExpr is used, and TimestampFloorDynamicExpr.eval() create PeriodGranularity every time not like that only one time in TimestampFloorExpr's constructor.

    public ExprEval eval(final ObjectBinding bindings)
    {
      final PeriodGranularity granularity = computeGranularity(args, bindings);
      return ExprEval.of(granularity.bucketStart(args.get(0).eval(bindings).asLong()));
    }

This patch modifies the translation to timestamp_floor("__time",'P1D','','Asia/Shanghai') rather than timestamp_floor("__time",'P1D',null,'Asia/Shanghai')


Key changed/added classes in this PR
  • TimeFloorOperatorConversion
  • DruidExpression

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant