Skip to content

Conversation

@wg1026688210
Copy link
Contributor

@wg1026688210 wg1026688210 commented Aug 4, 2021

This pr is to solve the problem of deleting files by mistake in rewriting table if we can’t judge whether the submission is successful when commiting.
The solution is that we only print some logs and do not clean up the files if throw CommitStateUnknownException when commiting ,leave the cleanup work to RemoveOrphanFiles

@github-actions github-actions bot added the core label Aug 4, 2021
Copy link
Contributor

@stevenzwu stevenzwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left one minor comment

@github-actions github-actions bot added the spark label Aug 5, 2021
@wg1026688210 wg1026688210 changed the title delete file by mistake when rewriting iceberg table when replaceDataFiles throw CommitStateUnknownException delete file by mistake when rewriting iceberg table throw CommitStateUnknownException Aug 5, 2021
@wg1026688210
Copy link
Contributor Author

cc @stevenzwu @openinx @rdblue

left one minor comment

thanks for your concern, it's ready for review

return new RewriteDataFilesActionResult(currentDataFiles, addedDataFiles);
}


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove unnecessary changes like this and reformatting lines that don't need to change. We avoid changes that aren't required to avoid git commit conflicts.

@rdblue
Copy link
Contributor

rdblue commented Aug 6, 2021

This looks valid to me. The action will clean up data files for any exception, but we should not do that when the commit state is unknown. FYI @RussellSpitzer.

@wg1026688210, can you remove any unnecessary formatting and whitespace changes? It would also be great to update the description with the case that this fixes. Thanks for working on this!

@RussellSpitzer
Copy link
Member

Yeah this is definitely the right thing to do, we are handling this in the new code (

public void commitOrClean(Set<RewriteFileGroup> rewriteGroups) {
try {
commitFileGroups(rewriteGroups);
} catch (CommitStateUnknownException e) {
LOG.error("Commit state unknown for {}, cannot clean up files because they may have been committed successfully.",
rewriteGroups, e);
throw e;
} catch (Exception e) {
LOG.error("Cannot commit groups {}, attempting to clean up written files", rewriteGroups, e);
rewriteGroups.forEach(this::abortFileGroup);
throw e;
}
}
) as well and while this path is deprecated now I think we should definitely fix it to avoid causing issues while the path is still available.

@wg1026688210
Copy link
Contributor Author

This looks valid to me. The action will clean up data files for any exception, but we should not do that when the commit state is unknown. FYI @RussellSpitzer.

@wg1026688210, can you remove any unnecessary formatting and whitespace changes? It would also be great to update the description with the case that this fixes. Thanks for working on this!

ok

@wg1026688210
Copy link
Contributor Author

Yeah this is definitely the right thing to do, we are handling this in the new code (

public void commitOrClean(Set<RewriteFileGroup> rewriteGroups) {
try {
commitFileGroups(rewriteGroups);
} catch (CommitStateUnknownException e) {
LOG.error("Commit state unknown for {}, cannot clean up files because they may have been committed successfully.",
rewriteGroups, e);
throw e;
} catch (Exception e) {
LOG.error("Cannot commit groups {}, attempting to clean up written files", rewriteGroups, e);
rewriteGroups.forEach(this::abortFileGroup);
throw e;
}
}

) as well and while this path is deprecated now I think we should definitely fix it to avoid causing issues while the path is still available.

looking forward to using the new version rewrite

long startingSnapshotId) {
private void replaceDataFiles(
Iterable<DataFile> deletedDataFiles, Iterable<DataFile> addedDataFiles,
long startingSnapshotId) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wg1026688210, please remove changes like this that are unnecessary.

.onFailure((location, exc) -> LOG.warn("Failed to delete: {}", location, exc))
.run(fileIO::deleteFile);
if (e instanceof CommitStateUnknownException) {
LOG.warn("for unknown commiting state ,we should not delete file ");
Copy link
Contributor

@rdblue rdblue Aug 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please copy the other warning messages here. This needs to conform more to the error message expectations that we use:

  • Messages should use sentence case
  • Messages should not end in whitespace
  • Do not use personal pronouns
  • Be clear about what happened and what action is being taken

This should be something more like "Skipping commit cleanup: commit state is unknown". But check the other places where we catch CommitStateUnknownException and use a similar message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, learned a lot for me

.suppressFailureWhenFinished()
.onFailure((location, exc) -> LOG.warn("Failed to delete: {}", location, exc))
.run(fileIO::deleteFile);
if (e instanceof CommitStateUnknownException) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not a separate catch block? I think it should be a catch instead of using instanceof.

2.Remove redundant code format
doReplace(deletedDataFiles, addedDataFiles, startingSnapshotId);
} catch (CommitStateUnknownException e) {
LOG.warn("Commit state unknown for files: {}, cannot clean up files because they may have been committed " +
"successfully.", Lists.newArrayList(addedDataFiles), e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove the list of data files from logs. That is going to be too long. Instead, let's use "Commit state unknown, cannot clean up files that may have been committed"

"successfully.", Lists.newArrayList(addedDataFiles), e);
throw e;
} catch (Exception e) {
LOG.warn("Failed to rewrite, files {} will be delete", Lists.newArrayList(addedDataFiles));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove the file list here as well. Instead, use "Failed to commit rewrite, cleaning up rewritten files"

@wg1026688210
Copy link
Contributor Author

Hi,@rdblue @RussellSpitzer .the pr appears conflicts. Do we need to maintain the rewrite api of old version next release. If necessary, I will resolve this conflict.

@rdblue
Copy link
Contributor

rdblue commented Nov 1, 2021

I'll defer to @RussellSpitzer on whether we should fix the old API.

@RussellSpitzer
Copy link
Member

Sorry for taking so long to get to this, I think this would be worth keeping just because of how difficult it is to deal with this situation. Please update the PR and I will merge it in.

…elete_file_by_mistake

� Conflicts:
�	spark/src/test/java/org/apache/iceberg/actions/TestRewriteDataFilesAction.java
@wg1026688210
Copy link
Contributor Author

Ok, I will resolve this conflict today

@wg1026688210
Copy link
Contributor Author

Sorry for taking so long to get to this, I think this would be worth keeping just because of how difficult it is to deal with this situation. Please update the PR and I will merge it in.

hi~ @RussellSpitzer the conflict has been resolved

Copy link
Member

@RussellSpitzer RussellSpitzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix!

@RussellSpitzer RussellSpitzer merged commit c6e68d8 into apache:master Dec 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants