Considering the DISTINCT SQL statements above, we can group them by how they need to be handled.
Simple SELECT
SQL Parse
The parser needs to recognize the DISTINCT token.
SQL Route
Determine which shards the SQL should be sent to for execution.
SQL Rewrite
Rewrite the logic SQL into actual SQL. Suppose the logic SQL is rewritten into two actual SQL statements:
SELECT DISTINCT price FROM item_0;
SELECT DISTINCT price FROM item_1;
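The rewrite step can be pictured with a minimal sketch. The logic SQL `SELECT DISTINCT price FROM item;`, the helper name `rewrite`, and the plain table-name substitution are all assumptions made for illustration; a real rewriter works on the parsed statement rather than on raw text.

```python
import re

def rewrite(logic_sql, logic_table, actual_tables):
    """Hypothetical helper: produce one actual SQL per actual table
    by substituting the logic table name as a whole word."""
    pattern = re.compile(r"\b" + re.escape(logic_table) + r"\b")
    return [pattern.sub(actual, logic_sql) for actual in actual_tables]

# Assumed logic SQL and shard table names, matching the example above.
actual_sqls = rewrite("SELECT DISTINCT price FROM item;", "item", ["item_0", "item_1"])
for sql in actual_sqls:
    print(sql)
```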
SQL Execute
Send the actual SQL statements to the routed shards for execution. Suppose we get two result sets, r1 and r2, from two shards.
SQL Merge
Deduplicate the rows of r1 and r2 together, i.e. DISTINCT(r1, r2).
In this way, we get the final distinct results across all of the shards.
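The merge step for a simple SELECT DISTINCT could look roughly like the sketch below. The function name `merge_distinct` and the sample values are hypothetical, and rows are assumed to be hashable values held in memory; a production merger would more likely stream over the result sets.

```python
def merge_distinct(*result_sets):
    """Combine per-shard DISTINCT results, dropping values that repeat
    across shards while keeping first-seen order."""
    seen = set()
    merged = []
    for rows in result_sets:
        for row in rows:
            if row not in seen:
                seen.add(row)
                merged.append(row)
    return merged

# Hypothetical result sets returned by item_0 and item_1.
r1 = [10, 20, 30]
r2 = [20, 30, 40]
final = merge_distinct(r1, r2)
print(final)
```

Each shard already removed its own duplicates, so the second pass only has to catch values that occur on more than one shard.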
COUNT() & SUM()
COUNT() and SUM() are processed in the same way, so we take COUNT() as the example.
SQL Parse
The parser needs to recognize the DISTINCT token.
SQL Route
Determine which shards the SQL should be sent to for execution.
SQL Rewrite
Rewrite the logic SQL into actual SQL. Suppose the logic SQL is rewritten into two actual SQL statements:
SELECT DISTINCT price FROM item_0;
SELECT DISTINCT price FROM item_1;
SQL Execute
Send the actual SQL statements to the routed shards for execution. Suppose we get two result sets, r1 and r2, from two shards.
SQL Merge
Deduplicate r1 and r2 into r3 by DISTINCT(r1, r2); the final, correct COUNT(DISTINCT price) is then obtained by counting r3.
Note that we do not fetch COUNT(DISTINCT price) from each shard and add up those counts, because a price that exists on more than one shard would then be counted once per shard. Instead, we fetch DISTINCT price from each shard and deduplicate the combined results again in our own process. Finally, we compute COUNT() over the twice-deduplicated results. SUM() is processed in the same way.
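A small sketch may make the difference concrete. The helper names and the sample data are hypothetical; NULLs are modelled as `None` and skipped, and an empty input yields `None` for SUM, following standard SQL aggregate behaviour.

```python
def _distinct_values(*result_sets):
    # Second deduplication pass across shards; NULL (None) is ignored,
    # as COUNT(DISTINCT ...) and SUM(DISTINCT ...) ignore NULLs.
    return {v for rows in result_sets for v in rows if v is not None}

def count_distinct(*result_sets):
    return len(_distinct_values(*result_sets))

def sum_distinct(*result_sets):
    values = _distinct_values(*result_sets)
    return sum(values) if values else None

# Hypothetical per-shard results of SELECT DISTINCT price.
r1 = [10, 20, 30]
r2 = [20, 30, 40]

# Adding per-shard COUNT(DISTINCT price) double-counts 20 and 30.
naive_count = len(set(r1)) + len(set(r2))
correct_count = count_distinct(r1, r2)
correct_sum = sum_distinct(r1, r2)
print(naive_count, correct_count, correct_sum)
```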
AVG
Handling AVG is more complex. We first need SUM(DISTINCT price) and COUNT(DISTINCT price), both obtained with the process above. We can then compute AVG(DISTINCT price) as SUM(DISTINCT price)/COUNT(DISTINCT price).
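As a rough illustration, AVG(DISTINCT price) can be derived from the same deduplicated set. The function name and the data are hypothetical, and the sketch assumes numeric values with NULLs modelled as `None`.

```python
def avg_distinct(*result_sets):
    """Compute AVG(DISTINCT ...) as SUM(DISTINCT ...) / COUNT(DISTINCT ...)
    over the values deduplicated across all shards."""
    values = {v for rows in result_sets for v in rows if v is not None}
    if not values:
        return None  # AVG over no rows is NULL in SQL
    return sum(values) / len(values)

# Hypothetical per-shard results of SELECT DISTINCT price.
r1 = [10, 20]
r2 = [20, 60]

merged_avg = avg_distinct(r1, r2)  # (10 + 20 + 60) / 3
# Averaging the per-shard averages gives a different, wrong answer.
naive_avg = (sum(r1) / len(r1) + sum(r2) / len(r2)) / 2
print(merged_avg, naive_avg)
```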
MIN() & MAX()
These two functions are easier to process. Here, we take MIN() as the example.
SQL Parse
The parser needs to recognize the DISTINCT token.
SQL Route
Determine which shards the SQL should be sent to for execution.
SQL Rewrite
Rewrite the logic SQL into actual SQL. Suppose the logic SQL is rewritten into two actual SQL statements:
SELECT MIN(DISTINCT price) FROM item_0;
SELECT MIN(DISTINCT price) FROM item_1;
SQL Execute
Send the actual SQL statements to the routed shards for execution. Suppose we get two result sets, r1 and r2, from two shards.
SQL Merge
Get the final MIN(DISTINCT price) result from r1 and r2 by MIN(r1, r2).
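Because the minimum of the per-shard minimums equals the global minimum, the merge reduces to one comparison per shard. The sketch below uses hypothetical helper names and treats a shard with no rows as returning `None` (SQL NULL).

```python
def merge_min(*shard_values):
    """Each argument is one shard's MIN(DISTINCT price); None means the
    shard had no rows."""
    present = [v for v in shard_values if v is not None]
    return min(present) if present else None

def merge_max(*shard_values):
    """Same idea for MAX(DISTINCT price)."""
    present = [v for v in shard_values if v is not None]
    return max(present) if present else None

# Hypothetical single-value results from item_0 and item_1.
r1_min, r2_min = 10, 20
final_min = merge_min(r1_min, r2_min)
print(final_min)
```

No second deduplication is needed here, since duplicates do not change a minimum or maximum.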
DISTINCT + GROUP BY
You are welcome to join us in discussing how to process this case.
Based on the analysis above, we have the following tasks:
Parsing the SQL syntax.
Rewriting logic SQL that contains DISTINCT into the corresponding actual SQL correctly.
Executing those actual SQL statements.
Merging the result sets using the corresponding algorithms.
Support simple DISTINCT syntax
Support DISTINCT + Function
Support Aggregation + DISTINCT
Supporting DISTINCT SQL syntax is on the way.
We plan to support DISTINCT SQL syntax. The following are our thoughts about DISTINCT; you are welcome to discuss this solution.
SQL Usage
Processing
The basic processing procedure is as follows: