New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CARBONDATA-3273] [CARBONDATA-3274] Fix for SORT_SCOPE in CarbonLoadDataCommand #3103
Conversation
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2441/ |
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10699/ |
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2669/ |
// If there are Sort Columns given for the table and Sort Scope is not specified, | ||
// we will take it as whichever sort scope given or LOCAL_SORT as default | ||
if (tableProperties.get(CarbonCommonConstants.SORT_COLUMNS) == null || | ||
tableProperties.get(CarbonCommonConstants.SORT_COLUMNS).equals("")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tableProperties.get(CarbonCommonConstants.SORT_COLUMNS).equals("")) { | |
tableProperties.get(CarbonCommonConstants.SORT_COLUMNS).trim.equals("")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
String with spaces will be caught before reaching this step. So no need to add trim.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use StringUtils.isBlank()
to avoid retrieve themember twice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
optionsFinal.put(CarbonCommonConstants.SORT_SCOPE, "NO_SORT") | ||
} | ||
else if (tableProperties.get(CarbonCommonConstants.SORT_SCOPE) == null || | ||
tableProperties.get(CarbonCommonConstants.SORT_SCOPE).equals("")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tableProperties.get(CarbonCommonConstants.SORT_SCOPE).equals("")) { | |
tableProperties.get(CarbonCommonConstants.SORT_SCOPE).trim.equals("")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
String with spaces will be caught before reaching this step. So no need to add trim.
37488c0
to
74ff915
Compare
a36a71c
to
8b73763
Compare
table.getDatabaseName + "." + table.getTableName, | ||
carbonProperty.getProperty(CarbonLoadOptionConstants.CARBON_OPTIONS_SORT_SCOPE, | ||
carbonProperty.getProperty(CarbonCommonConstants.LOAD_SORT_SCOPE, | ||
"LOCAL_SORT")))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is LOAD_SORT_SCOPE_DEFAULT changed into "NO_SORT" in CarbonCommonConstants and this default value is also "LOCAL_SORT"
the ddl doc also describe that "NO_SORT" is the default value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the SORT_COLUMNS are set and any SORT_SCOPE is not found, then LOCAL_SORT is used.
If no SORT_COLUMNS are set, then take the SORT_SCOPE as NO_SORT.
The default behaviour is NO_SORT if SORT_COLUMNS are not provided, and LOCAL_SORT if SORT_COLUMNS are provided explicetely in CREATE TABLE command. If nothing is specified in CREATE TABLE command, it takes SORT_SCOPE as NO_SORT and SORT_COLUMNS as none.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please avoid using the plain string here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the behavior you mentioned above, is it exactly the doc trying to describe? If not, please optimize the doc in this PR to ensure the modifications' integrity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
besides, please check the comments of this method and ensure its correctness after your modification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xuchuanyin
Changed plain string to enum.
Updated the doc as well. It now includes the priority of Sort Scope checking.
Comments above are correct.
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2445/ |
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2448/ |
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2675/ |
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10705/ |
// If there are Sort Columns given for the table and Sort Scope is not specified, | ||
// we will take it as whichever sort scope given or LOCAL_SORT as default | ||
if (tableProperties.get(CarbonCommonConstants.SORT_COLUMNS) == null || | ||
tableProperties.get(CarbonCommonConstants.SORT_COLUMNS).equals("")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use StringUtils.isBlank()
to avoid retrieve themember twice
if (tableProperties.get(CarbonCommonConstants.SORT_COLUMNS) == null || | ||
tableProperties.get(CarbonCommonConstants.SORT_COLUMNS).equals("")) { | ||
// If tableProperties.SORT_COLUMNS is null | ||
optionsFinal.put(CarbonCommonConstants.SORT_SCOPE, "NO_SORT") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of plain string 'NO_SORT', do we have Constant Variable or enum for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Used SortScopeOptions.SortScope.LOCAL_SORT.name
for string "LOCAL_SORT"
// If tableProperties.SORT_COLUMNS is null | ||
optionsFinal.put(CarbonCommonConstants.SORT_SCOPE, "NO_SORT") | ||
} | ||
else if (tableProperties.get(CarbonCommonConstants.SORT_SCOPE) == null || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to the previous line.
Please be careful with the code stype
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
} | ||
else if (tableProperties.get(CarbonCommonConstants.SORT_SCOPE) == null || | ||
tableProperties.get(CarbonCommonConstants.SORT_SCOPE).equals("")) { | ||
// If tableProperties.SORT_COLUMNS is not null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please optimize the comment here:
in case SORT_COLUMNS is not set and SORT_SCOPE is set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment
else if (StringUtils.isBlank(tableProperties.get(CarbonCommonConstants.SORT_SCOPE))) {
// If tableProperties.SORT_COLUMNS is not null
// and tableProperties.SORT_SCOPE is null
optionsFinal.put(CarbonCommonConstants.SORT_SCOPE,
is already there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By saying ‘set’ or 'not set', it refers to the business logic while ‘null’ or 'not null' does not refer to that logic directly.
table.getDatabaseName + "." + table.getTableName, | ||
carbonProperty.getProperty(CarbonLoadOptionConstants.CARBON_OPTIONS_SORT_SCOPE, | ||
carbonProperty.getProperty(CarbonCommonConstants.LOAD_SORT_SCOPE, | ||
"LOCAL_SORT")))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please avoid using the plain string here
table.getDatabaseName + "." + table.getTableName, | ||
carbonProperty.getProperty(CarbonLoadOptionConstants.CARBON_OPTIONS_SORT_SCOPE, | ||
carbonProperty.getProperty(CarbonCommonConstants.LOAD_SORT_SCOPE, | ||
"LOCAL_SORT")))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the behavior you mentioned above, is it exactly the doc trying to describe? If not, please optimize the doc in this PR to ensure the modifications' integrity.
table.getDatabaseName + "." + table.getTableName, | ||
carbonProperty.getProperty(CarbonLoadOptionConstants.CARBON_OPTIONS_SORT_SCOPE, | ||
carbonProperty.getProperty(CarbonCommonConstants.LOAD_SORT_SCOPE, | ||
"LOCAL_SORT")))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
besides, please check the comments of this method and ensure its correctness after your modification.
8b73763
to
a33121e
Compare
@xuchuanyin I have updated the Doc as well. Added Priority for Sort_Scope checking. |
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2472/ |
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10730/ |
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2700/ |
retest this please |
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2479/ |
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10737/ |
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2708/ |
// If tableProperties.SORT_COLUMNS is not null | ||
// and tableProperties.SORT_SCOPE is null | ||
optionsFinal.put(CarbonCommonConstants.SORT_SCOPE, | ||
carbonProperty.getProperty(CarbonLoadOptionConstants.CARBON_TABLE_LOAD_SORT_SCOPE + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the first priority should be load options. Please check it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
1. With SORT_COLUMNS=null, loading data was resulting in SORT_SCOPE=LOCAL_SORT. 2. With SORT_COLUMNS!=null and SORT_SCOPE not provided, loading data was not checking CARBON.OPTIONS.SORT.SCOPE.
a33121e
to
eaa42f7
Compare
LGTM |
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10746/ |
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2718/ |
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2490/ |
retest this please |
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2494/ |
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10752/ |
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2723/ |
…ataCommand Problem1: With no SORT_COLUMNS, loading data was taking SORT_SCOPE=LOCAL_SORT instead of NO_SORT. Solution: Added a check for SORT_COLUMNS in CarbonLoadDataCommand Problem2: On table with some SORT_COLUMNS and SORT_SCOPE not specified, SORT_SCOPE was not considering CARBON.OPTIONS.SORT.SCOPE for SORT_SCOPE. Solution: Added checking of CARBON.OPTIONS.SORT.SCOPE while loading. This closes #3103
…ataCommand Problem1: With no SORT_COLUMNS, loading data was taking SORT_SCOPE=LOCAL_SORT instead of NO_SORT. Solution: Added a check for SORT_COLUMNS in CarbonLoadDataCommand Problem2: On table with some SORT_COLUMNS and SORT_SCOPE not specified, SORT_SCOPE was not considering CARBON.OPTIONS.SORT.SCOPE for SORT_SCOPE. Solution: Added checking of CARBON.OPTIONS.SORT.SCOPE while loading. This closes #3103
…ataCommand Problem1: With no SORT_COLUMNS, loading data was taking SORT_SCOPE=LOCAL_SORT instead of NO_SORT. Solution: Added a check for SORT_COLUMNS in CarbonLoadDataCommand Problem2: On table with some SORT_COLUMNS and SORT_SCOPE not specified, SORT_SCOPE was not considering CARBON.OPTIONS.SORT.SCOPE for SORT_SCOPE. Solution: Added checking of CARBON.OPTIONS.SORT.SCOPE while loading. This closes apache#3103
Problem1: With no SORT_COLUMNS, loading data was taking SORT_SCOPE=LOCAL_SORT instead of NO_SORT.
Solution: Added a check for SORT_COLUMNS in CarbonLoadDataCommand
Problem2: On table with some SORT_COLUMNS and SORT_SCOPE not specified, SORT_SCOPE was not considering CARBON.OPTIONS.SORT.SCOPE for SORT_SCOPE.
Solution: Added checking of CARBON.OPTIONS.SORT.SCOPE while loading.