Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broker: Add ability to inline subqueries. #9533

Merged
merged 3 commits into from
Mar 18, 2020

Conversation

gianm
Copy link
Contributor

@gianm gianm commented Mar 18, 2020

Changes:

  • ClientQuerySegmentWalker: Add ability to inline queries. (This is the main change.)
  • Query: Add "getSubQueryId" and "withSubQueryId" methods.
  • QueryMetrics: Add "subQueryId" dimension.
  • ServerConfig: Add new "maxSubqueryRows" parameter, which is used b y ClientQuerySegmentWalker to limit how many rows can be inlined per query.
  • IndexedTableJoinMatcher: Allow creating keys on top of unknown types, by assuming they are strings. This is useful because not all types are known for fields in query results.
  • InlineDataSource: Store RowSignature rather than component parts. Add more zealous "equals" and "hashCode" methods to ease testing.
  • Moved QuerySegmentWalker test code from CalciteTests and SpecificSegmentsQueryWalker in druid-sql to QueryStackTests in druid-server. Use this to spin up a new ClientQuerySegmentWalkerTest.

The main changes:

- ClientQuerySegmentWalker: Add ability to inline queries.
- Query: Add "getSubQueryId" and "withSubQueryId" methods.
- QueryMetrics: Add "subQueryId" dimension.
- ServerConfig: Add new "maxSubqueryRows" parameter, which is used by
  ClientQuerySegmentWalker to limit how many rows can be inlined per
  query.
- IndexedTableJoinMatcher: Allow creating keys on top of unknown types,
  by assuming they are strings. This is useful because not all types are
  known for fields in query results.
- InlineDataSource: Store RowSignature rather than component parts. Add
  more zealous "equals" and "hashCode" methods to ease testing.
- Moved QuerySegmentWalker test code from CalciteTests and
  SpecificSegmentsQueryWalker in druid-sql to QueryStackTests in
  druid-server. Use this to spin up a new ClientQuerySegmentWalkerTest.
@lgtm-com
Copy link

lgtm-com bot commented Mar 18, 2020

This pull request introduces 1 alert when merging 91ea2e9 into 4c620b8 - view on LGTM.com

new alerts:

  • 1 for Contradictory type checks

@lgtm-com
Copy link

lgtm-com bot commented Mar 18, 2020

This pull request introduces 1 alert when merging 6f73c73 into 4c620b8 - view on LGTM.com

new alerts:

  • 1 for Contradictory type checks

@lgtm-com
Copy link

lgtm-com bot commented Mar 18, 2020

This pull request introduces 1 alert when merging e6461d9 into 4c620b8 - view on LGTM.com

new alerts:

  • 1 for Contradictory type checks

@gianm
Copy link
Contributor Author

gianm commented Mar 18, 2020

1 for Contradictory type checks

It's complaining that a condition in an assert is always false. OK, fair enough, but isn't that the point of an assert? I say yes, so, I think we can ignore this.

Copy link
Member

@clintropolis clintropolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this lgtm 👍

I've realized during review that the laning stuff from #9407 is only going to apply to queries that run on the cluster walker and not the local walker (since ClientQuerySegmentWalker was just a wrapper around the cluster walker previously so I didn't consider this really). This doesn't need to be changed now, and I'll look into doing in a follow-up to make these changes.

@@ -183,20 +183,19 @@ public void setup()
log.info("Starting benchmark setup using cacheDir[%s], rows[%,d].", segmentGenerator.getCacheDir(), rowsPerSegment);
final QueryableIndex index = segmentGenerator.generate(dataSegment, schemaInfo, Granularities.NONE, rowsPerSegment);

final Pair<QueryRunnerFactoryConglomerate, Closer> conglomerate = CalciteTests.createQueryRunnerFactoryConglomerate();
closer.register(conglomerate.rhs);
final QueryRunnerFactoryConglomerate conglomerate = QueryStackTests.createQueryRunnerFactoryConglomerate(closer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. My comments are mostly about Javadoc and annotation. I'm fine with adding them in a follow-up PR.

}

@JsonProperty
@JsonInclude(JsonInclude.Include.NON_NULL)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add @Nullable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll add this in the next patch.

@@ -131,6 +131,11 @@
@Nullable
String getId();

Query<T> withSubQueryId(String subQueryId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth to add a javadoc explaining what is subQueryId and why we need it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I'll add this in the next patch.

* Historicals, which use {@link org.apache.druid.server.coordination.ServerManager}). Used by {@link QueryStackTests}.
*
* This class's logic is like a mashup of those two classes. With the right abstractions, it may be possible to get rid
* of this class and replace it with the production classes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this class be mentioned in ServerManager and CachingClusteredClient so that these classes are synced up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add this in the next patch.

@@ -100,6 +103,10 @@ public ServerConfig()
@Min(1)
private long maxScatterGatherBytes = Long.MAX_VALUE;

@JsonProperty
@Min(1)
private int maxSubqueryRows = 100000;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should probably be added to the documentation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, good point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hoping to update all the docs at once in a future patch

@jon-wei jon-wei merged commit 1ef25a4 into apache:master Mar 18, 2020
@gianm gianm deleted the subqueries-broker branch March 18, 2020 22:08
@jihoonson jihoonson added this to the 0.18.0 milestone Mar 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants