Skip to content

Conversation

@WencongLiu
Copy link
Contributor

What is the purpose of the change

Add retry to read hints in SnapshotManager

Brief change log

  • Introduce the retry logic in SnapshotManager

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no )
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (JavaDocs)

return Long.parseLong(FileUtils.readFileUtf8(path));
while (retryNumber++ < READ_HINT_RETRY_NUM) {
if (path.getFileSystem().exists(path)) {
return Long.parseLong(FileUtils.readFileUtf8(path));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to catch exception and ignore exception here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@JingsongLi JingsongLi Jan 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int retryNumber = 0;
while (retryNumber++ < READ_HINT_RETRY_NUM) {
        try {
             return Long.parseLong(FileUtils.readFileUtf8(path));
        } catch (Exception ignored) {
        }
       TimeUnit.MILLISECONDS.sleep(READ_HINT_RETRY_INTERVAL);
}

We should skip inner exception, then we can retry to read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I've added a fix-up commit. Currently only exception of "Long.parseLong(FileUtils.readFileUtf8(path))" is catched, and others are thrown to invoker.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can just remove if (path.getFileSystem().exists(path)) {, we already have catch the exception for FileUtils.readFileUtf8(path).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

public static final String EARLIEST = "EARLIEST";
public static final String LATEST = "LATEST";
private static final int READ_HINT_RETRY_NUM = 10;
private static final int READ_HINT_RETRY_INTERVAL = 200;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the better choice is:

  • READ_HINT_RETRY_NUM is 3.
  • READ_HINT_RETRY_INTERVAL is 1ms.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

} catch (Exception e) {
LOG.info(
"Failed to read hint file " + fileName + ". Falling back to listing files.", e);
TimeUnit.MILLISECONDS.sleep(READ_HINT_RETRY_INTERVAL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Catch and ignore the InterruptedException

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

while (retryNumber++ < READ_HINT_RETRY_NUM) {
try {
return Long.parseLong(FileUtils.readFileUtf8(path));
} catch (IOException ignored) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exception is more safe, after all, read hint is just an optimization

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@JingsongLi JingsongLi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @WencongLiu thanks for the update, just one comment.

Copy link
Contributor

@JingsongLi JingsongLi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Thanks @WencongLiu

@JingsongLi JingsongLi merged commit 2586516 into apache:master Jan 4, 2023
JingsongLi pushed a commit that referenced this pull request Jan 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants