Concurrency safety issue when renaming the metadata.json file during commit #14857

@jams-xin

Description

Apache Iceberg version

1.4.2

Query engine

Spark

Please describe the bug 🐞

Hi.
Many Spark jobs concurrently write partition data to the same table. During the final metadata commit phase (HadoopTableOperations::commit), if multiple processes execute the file rename at the same time, there appears to be a race condition: the metadata.json file can be overwritten, so the partition data written by one job is lost even though the Spark job reports success.
Related source code:
```java
private void renameToFinal(FileSystem fs, Path src, Path dst, int nextVersion) {
  try {
    lockManager.acquire(dst.toString(), src.toString());
    if (fs.exists(dst)) {
      throw new CommitFailedException("Version %d already exists: %s", nextVersion, dst);
    }

    if (!fs.rename(src, dst)) {
      CommitFailedException cfe =
          new CommitFailedException("Failed to commit changes using rename: %s", dst);
      RuntimeException re = tryDelete(src);
      if (re != null) {
        cfe.addSuppressed(re);
      }
      throw cfe;
    }
  } catch (IOException e) {
    CommitFailedException cfe =
        new CommitFailedException(e, "Failed to commit changes using rename: %s", dst);
    RuntimeException re = tryDelete(src);
    if (re != null) {
      cfe.addSuppressed(re);
    }
    throw cfe;
  } finally {
    lockManager.release(dst.toString(), src.toString());
  }
}
```
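To illustrate the concern: `fs.exists(dst)` followed by `fs.rename(src, dst)` is a check-then-act sequence. It is safe across processes only if the lock is shared by every writer or if the file system's rename refuses to replace an existing destination. If the configured lock manager only coordinates threads inside one JVM (an assumption here), two separate Spark drivers can both pass the existence check. The sketch below is not Iceberg code. It uses the local file system and `java.nio.file` to show the kind of atomic create-if-absent primitive a safe commit would need. The class `AtomicCommitSketch` and the helpers `tryCommit` and `raceCommits` are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AtomicCommitSketch {

  // Hypothetical helper: returns true only for the writer that creates dst.
  // CREATE_NEW makes the existence check and the creation a single atomic step,
  // so there is no window between "check" and "act" for another writer to use.
  // Note: only the creation is atomic; the content is written afterwards, which
  // differs from a rename of an already complete file.
  static boolean tryCommit(Path dst, String metadataJson) throws IOException {
    try {
      Files.write(
          dst,
          metadataJson.getBytes(StandardCharsets.UTF_8),
          StandardOpenOption.CREATE_NEW,
          StandardOpenOption.WRITE);
      return true;
    } catch (FileAlreadyExistsException e) {
      // Another writer already committed this version; the caller should retry
      // against the next version instead of overwriting.
      return false;
    }
  }

  // Starts `writers` concurrent commit attempts for the same version file
  // and returns how many of them succeeded.
  static int raceCommits(int writers) throws Exception {
    Path dir = Files.createTempDirectory("metadata");
    Path dst = dir.resolve("v2.metadata.json");
    ExecutorService pool = Executors.newFixedThreadPool(writers);
    try {
      List<Callable<Boolean>> tasks = new ArrayList<>();
      for (int i = 0; i < writers; i++) {
        final int id = i;
        tasks.add(() -> tryCommit(dst, "{\"writer\": " + id + "}"));
      }
      int winners = 0;
      for (Future<Boolean> result : pool.invokeAll(tasks)) {
        if (result.get()) {
          winners++;
        }
      }
      return winners;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("winners = " + raceCommits(8));
  }
}
```

With an atomic primitive like this, at most one of the competing writers can win a given version and the rest fail cleanly. Whether `fs.rename` gives the same guarantee depends on the underlying Hadoop `FileSystem` implementation, so it would help to know which storage (HDFS, an object store, local disk) the table lives on.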

Has anyone encountered a similar issue, and if so, how was it resolved? Thanks!

Willingness to contribute

  • I can contribute a fix for this bug independently
  • I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • I cannot contribute a fix for this bug at this time

Labels: bug