Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[multistage] support database in v2 #12591

Merged
merged 29 commits into from
Mar 22, 2024

Conversation

shounakmk219
Copy link
Collaborator

@shounakmk219 shounakmk219 commented Mar 7, 2024

Description

This PR allows multi stage engine to support the queries with database context
The database context can be passed using

  1. database http header
  2. database query option

database http header

The query request needs to have the http header Database: db1.
User can query the tables under database db1 with

select * from tableName limit 10
-- or
select * from db1.tableName limit 10

database query option

The query option database=db1 must either be passed as part of the request payload or through query as given below

SET database='db1';
select * from tableName limit 10;
-- or
SET database='db1';
select * from db1.tableName limit 10;

default database handling

User can skip passing the database context in this case or can even pass default database context using anyone of the above mentioned options, both will have equivalent behaviour

select * from tableName limit 10
-- or
select * from default.tableName limit 10

@codecov-commenter
Copy link

codecov-commenter commented Mar 7, 2024

Codecov Report

Attention: Patch coverage is 51.85185% with 39 lines in your changes missing coverage. Please review.

Project coverage is 61.55%. Comparing base (59551e4) to head (3294fcf).
Report is 958 commits behind head on master.

Files with missing lines Patch % Lines
...t/controller/api/resources/PinotQueryResource.java 0.00% 10 Missing ⚠️
...a/org/apache/pinot/common/utils/DatabaseUtils.java 0.00% 9 Missing ⚠️
...a/org/apache/pinot/query/catalog/PinotCatalog.java 38.46% 6 Missing and 2 partials ⚠️
...ntroller/helix/core/PinotHelixResourceManager.java 45.45% 1 Missing and 5 partials ⚠️
.../query/planner/logical/RelToPlanNodeConverter.java 50.00% 2 Missing and 2 partials ⚠️
...requesthandler/MultiStageBrokerRequestHandler.java 66.66% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #12591      +/-   ##
============================================
- Coverage     61.75%   61.55%   -0.20%     
+ Complexity      207      198       -9     
============================================
  Files          2436     2456      +20     
  Lines        133233   134388    +1155     
  Branches      20636    20800     +164     
============================================
+ Hits          82274    82721     +447     
- Misses        44911    45495     +584     
- Partials       6048     6172     +124     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration <0.01% <0.00%> (-0.01%) ⬇️
integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration2 0.00% <0.00%> (ø)
java-11 61.48% <51.85%> (-0.23%) ⬇️
java-21 61.44% <51.85%> (-0.19%) ⬇️
skip-bytebuffers-false 61.52% <51.85%> (-0.23%) ⬇️
skip-bytebuffers-true 61.41% <51.85%> (+33.68%) ⬆️
temurin 61.55% <51.85%> (-0.20%) ⬇️
unittests 61.54% <51.85%> (-0.20%) ⬇️
unittests1 46.16% <61.11%> (-0.73%) ⬇️
unittests2 27.94% <11.11%> (+0.20%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Jackie-Jiang Jackie-Jiang added feature multi-stage Related to the multi-stage query engine labels Mar 7, 2024
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we need to support a query option of database? I kind of prefer always keeping it explicit in the query by using <database>.<tableName> if it is not the default database

@gortiz
Copy link
Contributor

gortiz commented Mar 12, 2024

Using the postgres terminology we are not defining databases but schemas. Databases are independent of each other and for example you cannot join tables in two different databases. AFAIK in MySQL database and schema is the same thing. As we can see in PinotCatalog (which extends Calcite Schema) the term in Calcite is also schema.

I would strongly recommend to change the nomenclature to do not use database in that case. The correct SQL term would be schema but given we already use that term we may use namespace instead. BTW, in any schema may contain other schemas. I don't know if we plan to support that as well right now, but even if we don't support that in a first implementation, we should let open that possibility for the future.

@gortiz
Copy link
Contributor

gortiz commented Mar 12, 2024

Do you think we need to support a query option of database? I kind of prefer always keeping it explicit in the query by using . if it is not the default database

I don't think that is mandatory right now. We can have a first version without that. AFAIR in Postgres this is done with an option (in their case, SEARCH_PATH).

Copy link
Contributor

@gortiz gortiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Jackie-Jiang Jackie-Jiang merged commit 00fdc5f into apache:master Mar 22, 2024
19 checks passed
@shounakmk219 shounakmk219 changed the title [multistage] support database/namespace in v2 [multistage] support database in v2 Mar 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature multi-stage Related to the multi-stage query engine
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants