[CARBONDATA-2745] Added atomic file operations for S3 #2511
Conversation
Force-pushed from 8e84566 to 5c632a4
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7211/
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5987/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7218/
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5879/
public class AtomicFileOperationFactory {

  public static AtomicFileOperations getAtomicFileOperations(String filePath) {
    if (filePath.startsWith("s3")) {
Check based on the file type instead.
import org.apache.carbondata.core.datastore.impl.FileFactory;
import org.apache.carbondata.core.util.CarbonUtil;

class AtomicFileOperationS3Impl implements AtomicFileOperations {
Please add a comment explaining why a separate implementation is required.
Force-pushed from 5c632a4 to a7ab11a
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7283/
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6050/
Force-pushed from a7ab11a to a043795
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7286/
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6053/
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5912/
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5916/
retest this please
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7391/
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6152/
Force-pushed from a043795 to bce49d9
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7413/
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6172/
@gvramana Build passed. Please review
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5974/
LGTM |
1 similar comment
LGTM |
S3 supports atomic file overwrite. HDFS rename is atomic, while HDFS overwrite is not and can temporarily result in an empty file being read. So separate implementations are used for HDFS and S3 to ensure consistency of overwrite and read. This closes #2511
Problem: AtomicFileOperationImpl creates a temporary file and then renames it to the actual file name. This is risky on S3 storage, as the file has to be deleted and then recreated.
Solution: Create a separate implementation for S3 that writes to the same file name directly in overwrite mode.
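The two strategies described above can be sketched roughly as follows. This is a minimal illustration using local `java.nio` calls, not the actual CarbonData code; the class and method names here are hypothetical, and `Files.write` merely stands in for a direct S3 PUT:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch contrasting the two write strategies discussed in this PR.
public class AtomicWriteSketch {

  // HDFS-style: write to a temp file, then atomically rename it over the target.
  // Readers never observe a half-written or empty target file.
  static void writeViaRename(Path target, byte[] data) throws IOException {
    Path temp = target.resolveSibling(target.getFileName() + ".tmp");
    Files.write(temp, data);
    Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE,
        StandardCopyOption.REPLACE_EXISTING);
  }

  // S3-style: overwrite the object directly under the same key. An S3 PUT is
  // atomic, so no temp file is needed; an S3 "rename" would actually be a
  // copy followed by a delete, which is not atomic and is why the
  // temp-file-plus-rename approach is risky there.
  static void writeWithOverwrite(Path target, byte[] data) throws IOException {
    Files.write(target, data); // stands in for a direct S3 PUT in overwrite mode
  }
}
```

Either way, a reader sees only the complete old contents or the complete new contents, which is the consistency property the PR is after.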
Be sure to complete the following checklist to help us incorporate
your contribution quickly and easily:
Any interfaces changed?
Any backward compatibility impacted?
Document update required?
Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance test report.
- Any additional information to help reviewers in testing this change.
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.