Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CARBONDATA-2624] Added validations for complex dataType columns in create table command for Local Dictionary Support #2390

Closed

Conversation

praveenmeenakshi56
Copy link
Contributor

@praveenmeenakshi56 praveenmeenakshi56 commented Jun 20, 2018

What is proposed in this PR

Added Validations for Complex DataType command in create table command.
Added Unit Test cases for the same

What changes were made

  • If Duplicate columns exist, the column names were not displayed in the error message
  • Considered to check for duplicates if extra space was the difference between column names for LOCAL_DICTIONARY_INCLUDE and LOCAL_DICTIONARY_EXCLUDE
  • Added validation to check if no child column is of string dataType in create table command when the complex dataType column was specified in LOCA_DICTIONARY_INCLUDE
  • If Duplicate columns were specified in LOCAL_DICTONARY_INCLUDE/EXCLUDE and in DICTONARY_INCLUDE, the column names were not displayed in the error message

Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:

  • Any interfaces changed?
    NA
  • Any backward compatibility impacted?
    NA
  • Document update required?
    will be updated in another PR
  • Testing done
    Please provide details on
    - Whether new unit test cases have been added or why no new tests are required?
    - How it is tested? Please attach test report.
    - Is it a performance related change? Please attach the performance test report.
    - Any additional information to help reviewers in testing this change.
    Unit Test cases tested and added in this PR
  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6429/

@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5260/

@kumarvishal09
Copy link
Contributor

retest this please

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6432/

@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5263/

@ravipesala
Copy link
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5366/

@ravipesala
Copy link
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5367/

sql(
"""
| CREATE TABLE local1(id int, name string, city string, age int)
| STORED BY 'org.apache.carbondata.format'
| tblproperties('local_dictionary_enable'='true','local_dictionary_include'='name,city',
| 'local_dictionary_exclude'='name')
| 'local_dictionary_exclude'='city')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better to change exiting test case only for case sensitive check

case Some(row) => assert(row.get(1).toString.contains("name,st"))
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add one more test case, where two complex columns are there and one is only with int datatype child columns

}
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove empty line here and add a proper comment

.exists(x => x.column.equalsIgnoreCase(dictColm) && x.children.isDefined &&
null != x.children.get &&
!validateChildColumns(x))) {
val errMsg = "No child column is string dataType column in local_dictionary_include."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error message is not correct, give a proper error message

// check if the same column is present in both dictionary include and local dictionary columns
// configuration
if (tableProperties.get(CarbonCommonConstants.DICTIONARY_INCLUDE).isDefined) {
val dictExcludeCoumns = tableProperties.get(CarbonCommonConstants.LOCAL_DICTIONARY_EXCLUDE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this check, because if same column is contained in dictioanry include, local dictionary include, local dictionary exclude, any two will be validated and error will be thrown, better not to check all the three properties

@CarbonDataQA
Copy link

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6435/

@praveenmeenakshi56
Copy link
Contributor Author

retest this please

* this method validates whether any of the child columns of complex dataType column
* is of type string.
*/
def validateChildColumns(field: Field): Boolean = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you change the logic to return true as soon as you find any column with string datatype. No need to check other columns.

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5266/

"Child columns do not exist for the specified complex dataType column(s) in local_dictionary_include."))
}

test(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think local_dictionary_include _001 has already covered this scenario. Please remove all duplicate test cases.

.exists(x => x.column.equalsIgnoreCase(dictColm) && x.children.isDefined &&
null != x.children.get &&
!validateChildColumns(x))) {
val errMsg = "Child columns do not exist for the specified complex dataType column(s) in local_dictionary_include."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Throw proper error message.

@CarbonDataQA
Copy link

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6439/

@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5270/

@praveenmeenakshi56 praveenmeenakshi56 force-pushed the local_dict1 branch 2 times, most recently from 72e3731 to 5ad6e87 Compare June 21, 2018 08:38
@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6443/

@ravipesala
Copy link
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5370/

@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5277/

@akashrn5
Copy link
Contributor

retest this please

@akashrn5
Copy link
Contributor

retest sdv please

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6446/

@ravipesala
Copy link
Contributor

retest this please

@akashrn5
Copy link
Contributor

LGTM

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6449/

@ravipesala
Copy link
Contributor

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5373/

@praveenmeenakshi56
Copy link
Contributor Author

retest sdv please

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5280/

@ravipesala
Copy link
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5376/

@kunal642
Copy link
Contributor

LGTM

@asfgit asfgit closed this in d5e86db Jun 22, 2018
asfgit pushed a commit that referenced this pull request Jul 30, 2018
…reate table command for Local Dictionary Support

1. If Duplicate columns exist, the column names were not displayed in the error message
2. Considered to check for duplicates if extra space was the difference between
column names for LOCAL_DICTIONARY_INCLUDE and LOCAL_DICTIONARY_EXCLUDE
3. Added validation to check if no child column is of string dataType in create
table command when the complex dataType column was specified in LOCA_DICTIONARY_INCLUDE
4. If Duplicate columns were specified in LOCAL_DICTONARY_INCLUDE/EXCLUDE and in
DICTONARY_INCLUDE, the column names were not displayed in the error message

This closes #2390
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants