Skip to content

RATIS-1479.Make the min gap to take snapshot configurable#572

Merged
szetszwo merged 3 commits intoapache:masterfrom
codings-dan:snapshotgap
Dec 29, 2021
Merged

RATIS-1479.Make the min gap to take snapshot configurable#572
szetszwo merged 3 commits intoapache:masterfrom
codings-dan:snapshotgap

Conversation

@codings-dan
Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

subtask of Support snapshot command: Make the min gap to take snapshot configurable

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/RATIS-1479

How was this patch tested?

UT

@codings-dan
Copy link
Copy Markdown
Contributor Author

The CI test error seems to have nothing to do with this code change.

Error:  testMultipleStreamsMultipleServersStepDownLeader(org.apache.ratis.datastream.TestNettyDataStreamChainTopologyWithGrpcCluster)  Time elapsed: 100.012 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 100 seconds
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
	at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:426)
	at java.util.concurrent.FutureTask.get(FutureTask.java:204)
	at org.junit.internal.runners.statements.FailOnTimeout.getResult(FailOnTimeout.java:141)
	at org.junit.internal.runners.statements.FailOnTimeout.evaluate(FailOnTimeout.java:127)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.lang.Thread.run(Thread.java:748)

Copy link
Copy Markdown
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@codings-dan , thanks for working on this. Just have some comment on the conf.

    /** The log index gap between to two snapshot creations. */
    String CREATION_GAP_KEY = PREFIX + ".creation.gap";
    long CREATION_GAP_DEFAULT = 1024;
    static long creationGap(RaftProperties properties) {
      return getLong(properties::getLong,
          CREATION_GAP_KEY, CREATION_GAP_DEFAULT, getDefaultLog(), requireMin(1L));
    }
    static void setCreationGap(RaftProperties properties, long minGapTakeSnapshot) {
      setLong(properties::setLong, CREATION_GAP_KEY, minGapTakeSnapshot);
    }

BTW, we should replace the term "takeSanpshot" with "createSnapshot" since file systems (e.g. HDFS) use the term "create". Also, we may have "creation gap", "creation time", etc.

See if you want to do the rename in this pull request.

}

/** min gap to take snapshot async */
String MIN_GAP_TAKE_SNAPSHOT_KEY = PREFIX + ".min.gap.takeSnapshot";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's call it "creation.gap".


/** min gap to take snapshot async */
String MIN_GAP_TAKE_SNAPSHOT_KEY = PREFIX + ".min.gap.takeSnapshot";
long MIN_GAP_TAKE_SNAPSHOT_DEFAULT = 5;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default should be larger. How about 1024?

static long minGapTakeSnapshot(RaftProperties properties) {
return getLong(
properties::getLong, MIN_GAP_TAKE_SNAPSHOT_KEY, MIN_GAP_TAKE_SNAPSHOT_DEFAULT,
getDefaultLog(), requireMin(5L));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The min should be 1L.

@codings-dan
Copy link
Copy Markdown
Contributor Author

@szetszwo Thanks for reviewing the code, I don’t know much about the term on the file system. In Alluxio, it’s called takesnapshot/checkpoint. I have changed the code according to your comment. In addition, 1024 is really a good number, haha

Copy link
Copy Markdown
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 the change looks good.

@szetszwo
Copy link
Copy Markdown
Contributor

szetszwo commented Dec 29, 2021

... In Alluxio, it’s called takesnapshot/checkpoint. ...

@codings-dan , Good to know. Thanks.

@szetszwo szetszwo merged commit 94db58b into apache:master Dec 29, 2021
@codings-dan codings-dan deleted the snapshotgap branch December 29, 2021 10:42
symious pushed a commit to symious/ratis that referenced this pull request Feb 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants