[CARBONDATA-3070] Fix partition load issue when custom location is added. #2873
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1333/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1121/
Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9385/
Force-pushed from 2df12b7 to 1a2acdf
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1155/
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1372/
Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9423/
LGTM
retest this please
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1181/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1394/
Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9445/
…ded.
Problem: Loading files in the carbonfile format fails when a custom partition location is added.
Reason: Carbon uses its own file name for each carbondata file rather than the file name proposed by Spark, and it also needs to create an extra index file. With a custom partition location, Spark keeps track of the names of the files it creates and moves them. Because Carbon creates and maintains different files, this results in a file-not-found exception.
Solution: Use a custom protocol to manage the commit and the folder location for custom partition locations.
This closes #2873
Problem:
Loading files in the carbonfile format fails when a custom partition location is added.
Reason:
Carbon uses its own file name for each carbondata file rather than the file name proposed by Spark, and it also needs to create an extra index file. With a custom partition location, Spark keeps track of the names of the files it creates and moves them. Because Carbon creates and maintains different files, this results in a file-not-found exception.
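
To make the failure mode concrete, here is a toy model in plain Python. It is not Spark or CarbonData code: the function names, directory names, and file names are all hypothetical, and it only assumes the behaviour described above, namely that the committer moves the file names it proposed while the writer produces differently named data and index files.

```python
import os
import shutil
import tempfile


def commit_tracked_files(tracked, staging_dir, final_dir):
    """Toy committer: moves only the file names it was told about."""
    os.makedirs(final_dir, exist_ok=True)
    for name in tracked:
        src = os.path.join(staging_dir, name)
        if not os.path.exists(src):
            raise FileNotFoundError(src)
        shutil.move(src, os.path.join(final_dir, name))


def simulate_failure():
    """Return 'committed' or 'file not found' for the mismatched-name case."""
    root = tempfile.mkdtemp()
    try:
        staging = os.path.join(root, "staging")
        final = os.path.join(root, "custom_location")
        os.makedirs(staging)

        # The engine proposes a file name and remembers it (hypothetical name).
        tracked = ["part-00000.data"]

        # The writer ignores the proposal and writes its own data file
        # plus an extra index file (hypothetical names).
        for name in ["writer-0.data", "writer-0.index"]:
            with open(os.path.join(staging, name), "w") as handle:
                handle.write("x")

        try:
            commit_tracked_files(tracked, staging, final)
            return "committed"
        except FileNotFoundError:
            return "file not found"
    finally:
        shutil.rmtree(root)
```

Calling `simulate_failure()` in this model ends in the file-not-found branch, because the tracked name never exists in the staging directory.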
Solution:
Use a custom protocol to manage the commit and the folder location for custom partition locations.
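
The general idea can be sketched with a toy model in plain Python. This is only an illustration under stated assumptions, not the code in this PR: it assumes the custom commit step takes ownership of a staging folder and moves whatever the writer produced there, instead of relying on file names proposed in advance. All function, directory, and file names are hypothetical.

```python
import os
import shutil
import tempfile


def commit_staging_dir(staging_dir, final_dir):
    """Toy committer: moves every file found in the staging folder."""
    os.makedirs(final_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(staging_dir)):
        shutil.move(os.path.join(staging_dir, name),
                    os.path.join(final_dir, name))
        moved.append(name)
    return moved


def demo():
    """Write self-named files to staging, commit, and list the results."""
    root = tempfile.mkdtemp()
    try:
        staging = os.path.join(root, "staging")
        final = os.path.join(root, "custom_location")
        os.makedirs(staging)

        # The writer picks its own names for the data file and index file
        # (hypothetical names).
        for name in ["writer-0.data", "writer-0.index"]:
            with open(os.path.join(staging, name), "w") as handle:
                handle.write("x")

        moved = commit_staging_dir(staging, final)
        return moved, sorted(os.listdir(final))
    finally:
        shutil.rmtree(root)
```

In this model the commit no longer depends on file names agreed in advance, so the writer is free to name its data and index files however it likes.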
Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:
Any interfaces changed?
Any backward compatibility impacted?
Document update required?
Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance test report.
- Any additional information to help reviewers in testing this change.
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.