Skip to content

TimeBoundary: Use cursor when datasource is not a regular table.#14151

Merged
clintropolis merged 2 commits intoapache:masterfrom
gianm:timeboundary-more-conservative
Apr 27, 2023
Merged

TimeBoundary: Use cursor when datasource is not a regular table.#14151
clintropolis merged 2 commits intoapache:masterfrom
gianm:timeboundary-more-conservative

Conversation

@gianm
Copy link
Contributor

@gianm gianm commented Apr 24, 2023

Fixes a bug where TimeBoundary could return incorrect results with INNER Join or inline data.

Fixes a bug where TimeBoundary could return incorrect results with
INNER Join or inline data.
@gianm gianm marked this pull request as ready for review April 24, 2023 21:48
.build()
),
"j0.",
equalsCondition(makeExpression("CAST(\"m1\", 'LONG')"), makeColumnExpression("j0.cnt")),

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [CalciteTestBase.makeExpression](1) should be avoided because it has been deprecated.
.build()
),
"j0.",
equalsCondition(makeExpression("CAST(\"m1\", 'LONG')"), makeColumnExpression("j0.cnt")),

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [CalciteTestBase.makeColumnExpression](1) should be avoided because it has been deprecated.

if (canUseAdapterMinMaxTime(query, adapter)) {
if (!query.isMaxTime()) {
minTime = adapter.getMinTime();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we update the docs or even the method name for StorageAdapter#getMinTime and StorageAdapter#getMaxTime to say that

  • getMinTime can return a value that is lower than the actual min time of the data
  • getMaxTime can return a value that is higher than the actual max time of the data

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Javadoc sounds like a good idea to me. It's @PublicApi so we should keep the method name as-is.

I added this to StorageAdapter#getMinTime, and similar text to getMaxTime as well:

  /**
   * Metadata-only operation that returns a lower bound on
   * {@link org.apache.druid.segment.column.ColumnHolder#TIME_COLUMN_NAME} values for this adapter. May be earlier than
   * the actual minimum data timestamp.
   *
   * For {@link QueryableIndexStorageAdapter} and {@link org.apache.druid.segment.incremental.IncrementalIndexStorageAdapter}
   * specifically, which back regular tables (i.e. {@link org.apache.druid.query.TableDataSource}), this method
   * contract is tighter: it does return the actual minimum data timestamp. This fact is leveraged by
   * {@link org.apache.druid.query.timeboundary.TimeBoundaryQuery} to return results using metadata only.
   */

Copy link
Member

@clintropolis clintropolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@clintropolis clintropolis merged commit 42c8c84 into apache:master Apr 27, 2023
@abhishekagarwal87 abhishekagarwal87 added this to the 26.0 milestone May 1, 2023
clintropolis pushed a commit to clintropolis/druid that referenced this pull request May 1, 2023
…che#14151)

* TimeBoundary: Use cursor when datasource is not a regular table.

Fixes a bug where TimeBoundary could return incorrect results with
INNER Join or inline data.

* Addl Javadocs.
clintropolis added a commit that referenced this pull request May 2, 2023
) (#14191)

* TimeBoundary: Use cursor when datasource is not a regular table.

Fixes a bug where TimeBoundary could return incorrect results with
INNER Join or inline data.

* Addl Javadocs.

Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants