[CARBONDATA-4027] Fix the wrong modifiedtime of loading files in inse… #3977
008e235
to
37ca6d5
Compare
retest this please |
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2610/ |
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4360/ |
retest this please |
@marchpure If the setLastModifiedTime operation does not take effect on S3, we need to check and update the other places that use it as well.
// Try to recreate loading files if the loading file exists
// or create loading files directly if the loading file doesn't exist
// set isFailed to be false when (delete and) createfile success
var isFailed = if (stageLoadingFile.exists()) {
isFailed will always be false. It can be removed, and the comment should be updated accordingly.
If an IOException happens in stageLoadingFile.createNewFile(), createNewFile() returns false, which makes isFailed true.
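To make the point above concrete, here is a minimal sketch of a delete-then-create step whose result can be true. It uses plain `java.io.File` on a local file system as a stand-in for CarbonData's file abstraction. The class `StageLoadingFiles` and the methods `createQuietly` and `recreateFailed` are hypothetical names for illustration, not the project's API.

```java
import java.io.File;
import java.io.IOException;

final class StageLoadingFiles {
  // Hypothetical helper: mirrors the behaviour described in the reply, where an
  // IOException during creation is reported as `false` rather than thrown.
  static boolean createQuietly(File file) {
    try {
      return file.createNewFile();
    } catch (IOException e) {
      return false;
    }
  }

  // Returns true when the loading file could NOT be (re)created.
  static boolean recreateFailed(File loadingFile) {
    // An existing file must be deleted first; a failed delete counts as failure.
    if (loadingFile.exists() && !loadingFile.delete()) {
      return true;
    }
    return !createQuietly(loadingFile);
  }
}
```

Under these assumptions the flag is not constant: it is true whenever the delete fails or the create fails, for example when the parent directory does not exist.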
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4394/ |
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2642/ |
37ca6d5
to
3a43207
Compare
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4398/ |
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2647/ |
retest this please |
// Try to recreate loading files if the loading file exists
// or create loading files directly if the loading file doesn't exist
// set isFailed to be false when (delete and) createfile success
var isFailed = if (stageLoadingFile.exists()) {
- var isFailed = if (stageLoadingFile.exists()) {
+ val isFailed = if (stageLoadingFile.exists()) {
I have modified the code according to your suggestion.
FileFactory.createNewFile(
    this.schemaIndexFilePath,
    new FsPermission(FsAction.ALL, FsAction.ALL, FsAction.ALL));
if (FileFactory.isFileExist(this.schemaIndexFilePath)) {
- if (FileFactory.isFileExist(this.schemaIndexFilePath)) {
+ CarbonFile schemaIndexFile = FileFactory.getCarbonFile(this.schemaIndexFilePath);
+ if (schemaIndexFile.exists()) {
+   schemaIndexFile.delete();
+ }
+ schemaIndexFile.createNewFile(new FsPermission(FsAction.ALL, FsAction.ALL, FsAction.ALL));
+ this.lastModifiedTime = schemaIndexFile.getLastModifiedTime();
I have modified the code according to your suggestion.
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4405/ |
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2652/ |
3a43207
to
232fff4
Compare
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4409/ |
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2655/ |
232fff4
to
96f4074
Compare
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4413/ |
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2659/ |
96f4074
to
8995066
Compare
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4417/ |
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2663/ |
retest this please |
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4419/ |
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2665/ |
retest this please |
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2791/ |
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4545/ |
7fe6cb9
to
1dee759
Compare
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4553/ |
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2802/ |
1dee759
to
182a8cc
Compare
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2812/ |
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4563/ |
retest this please |
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4575/ |
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2826/ |
retest this please |
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2894/ |
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4650/ |
@marchpure Please also update the PR description to cover the MergeIndex changes.
@@ -43,11 +43,6 @@ class MergeIndexEventListener extends OperationEventListener with Logging {
  override def onEvent(event: Event, operationContext: OperationContext): Unit = {
    event match {
      case preStatusUpdateEvent: LoadTablePreStatusUpdateEvent =>
        // skip merge index in case of insert stage flow
        if (null != operationContext.getProperty(CarbonCommonConstants.IS_INSERT_STAGE) &&
This property should also be removed from CarbonCommonConstants, since it will no longer have any usages.
I have modified the code according to your suggestion.
…rt stage
Why is this PR needed?
In the insert stage flow, an empty file with the suffix '.loading' marks a stage as 'in processing'. We update the modified time of the '.loading' file to record the insert stage start time, which is used to calculate the timeout and helps with retry and recovery.
Previously, we used the setModifiedTime function to update the modified time, which has a serious bug.
For S3 files, the setModifiedTime operation does not take effect, leading to an incorrect insert stage start time for the '.loading' file.
What changes were proposed in this PR?
Update the modified time of loading files by recreating the files.
Does this PR introduce any user interface change?
No
Is any new testcase added?
No
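The approach described in the commit message (refresh the modified time by recreating the file, then use it for the timeout calculation) can be sketched roughly as follows. This is only an illustration on a local file system using `java.nio.file`. The class `LoadingMarker` and the methods `refresh` and `isTimedOut` are hypothetical, and the real change goes through CarbonData's own file abstraction so that it also works on S3.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class LoadingMarker {
  // Recreate the marker so the file system itself assigns a fresh modification
  // time, instead of relying on an explicit "set modified time" call.
  static long refresh(Path marker) throws IOException {
    Files.deleteIfExists(marker);
    Files.createFile(marker);
    return Files.getLastModifiedTime(marker).toMillis();
  }

  // A stage is treated as timed out once its marker is older than timeoutMillis.
  static boolean isTimedOut(long markerModifiedMillis, long nowMillis, long timeoutMillis) {
    return nowMillis - markerModifiedMillis > timeoutMillis;
  }
}
```

A caller would invoke `refresh` when a stage starts processing and later pass the marker's modified time, the current time, and the configured timeout to `isTimedOut` to decide whether the stage should be retried.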
182a8cc
to
e8706db
Compare
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4670/ |
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2914/ |
retest this please |
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4676/ |
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2919/ |
I have updated the PR description for the MergeIndex changes.
retest this please |
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4692/ |
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2935/ |
LGTM |
…rt stage
Why is this PR needed?
ISSUE1: In the insert stage flow, an empty file with the suffix '.loading' marks a stage as 'in processing'. We update the modified time of the '.loading' file to record the insert stage start time, which is used to calculate the timeout and helps with retry and recovery.
Previously, we used the setModifiedTime function to update the modified time, which has a serious bug.
For S3 files, the setModifiedTime operation does not take effect, leading to an incorrect insert stage start time for the '.loading' file.
ISSUE2: Currently, insert stage on a non-partition table does not merge index files, which heavily degrades query performance.
What changes were proposed in this PR?
Does this PR introduce any user interface change?
Is any new testcase added?