Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CARBONDATA-2136] Fixed bug related to data load for bad_record_action as REDIRECT or IGNORE and sort scope as NO_SORT #1942

Closed
wants to merge 2 commits into from

Conversation

geetikagupta16
Copy link
Contributor

@geetikagupta16 geetikagupta16 commented Feb 7, 2018

Problem: When data loading is performed with bad_record_action as REDIRECT or IGNORE and with sort_scope option as NO_SORT, it was throwing an error as our row batch was getting filled with null.

Solution: Refactored code for creating carbon row batch for bad_record_action as REDIRECT or IGNORE and sort scope as NO_SORT

Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:

  • Any interfaces changed? No

  • Any backward compatibility impacted? No

  • Document update required? No

  • Testing done
    Please provide details on
    - Whether new unit test cases have been added or why no new tests are required?
    - How it is tested? Please attach test report.
    - Is it a performance related change? Please attach the performance test report.
    - Any additional information to help reviewers in testing this change.

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2320/

@CarbonDataQA
Copy link

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3557/

@CarbonDataQA
Copy link

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3558/

@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2321/

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3563/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2326/

@ravipesala
Copy link
Contributor

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3399/

@ravipesala
Copy link
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3400/

@ravipesala
Copy link
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3406/

@sraghunandan
Copy link
Contributor

@geetikagupta16 pls update the PR template

@geetikagupta16
Copy link
Contributor Author

@sraghunandan I have updated the PR template please check.

while (rowBatch.hasNext()) {
CarbonRow convertRow = localConverter.convert(rowBatch.next());
rowBatch.setPreviousRow(convertRow);
if (convertRow == null && (badRecordsAction.equals(LoggerAction.IGNORE.toString()) ||
badRecordsAction.equals(LoggerAction.REDIRECT.toString()))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove check (badRecordsAction.equals(LoggerAction.IGNORE.toString()) ||
badRecordsAction.equals(LoggerAction.REDIRECT.toString())
BadRecordsAction check is not required, only null check over convertRow is enough.
The convert method return's null only in case of bad_records_action is either IGNORE or REDIRECT.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@geetikagupta16 and @QiangCai
In case of bucketing and Stream Record writer also the same exist.
Please correct there as well.
For your reference:

  1. For bucketing.
    org/apache/carbondata/processing/loading/steps/DataConverterProcessorWithBucketingStepImpl.java:128
  2. For Stream Record Writer
    org/apache/carbondata/hadoop/streaming/CarbonStreamRecordWriter.java:191

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3748/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2508/

@geetikagupta16
Copy link
Contributor Author

@mohammadshahidkhan I have made the changes for bucketing. Please review

@mohammadshahidkhan
Copy link
Contributor

@geetikagupta16
Please handle the same issue in case of Stream Record writer as well.
Currently there no test case for the bucketing flow to validate the issue handled as part of this PR.
Please add the test cases to verify the bucketing and also streaming flow.

@CarbonDataQA
Copy link

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3842/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2597/

@geetikagupta16
Copy link
Contributor Author

@mohammadshahidkhan I have added test cases for bucketing flow. Will be creating a new JIRA to handle streaming flow as it is throwing ClassCastException when data loading is performed with bad_record_action as IGNORE.

@geetikagupta16
Copy link
Contributor Author

retest this please

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2601/

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3846/

@ravipesala
Copy link
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3626/

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3894/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2649/

@ravipesala
Copy link
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3671/

@mohammadshahidkhan
Copy link
Contributor

+1 The streaming issue is handled in PR #2014

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3038/

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4282/

…n as REDIRECT or IGNORE and sort scope as NO_SORT

2. Added related test cases
@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3513/

@CarbonDataQA
Copy link

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4740/

@geetikagupta16
Copy link
Contributor Author

retest this please

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4746/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3519/

@ravipesala
Copy link
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4246/

@ravipesala
Copy link
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4247/

@CarbonDataQA
Copy link

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5175/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3965/

@ravipesala
Copy link
Contributor

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4459/

@SangeetaGulia
Copy link
Contributor

retest this please

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4610/

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5769/

@sraghunandan
Copy link
Contributor

retest this please

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4839/

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5998/

@mohammadshahidkhan
Copy link
Contributor

LGTM

1 similar comment
@kunal642
Copy link
Contributor

LGTM

@asfgit asfgit closed this in 2ebfab1 May 22, 2018
anubhav100 pushed a commit to anubhav100/incubator-carbondata that referenced this pull request Jun 22, 2018
…n as REDIRECT or IGNORE and sort scope as NO_SORT

Problem: When data loading is performed with bad_record_action as REDIRECT or IGNORE and
with sort_scope option as NO_SORT, it was throwing an error as our row batch was getting filled with null.

Solution: Refactored code for creating carbon row batch for bad_record_action as REDIRECT or IGNORE and sort scope as NO_SORT

This closes apache#1942
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants