Optimistic opening of a Realm file #3013
    // Returns the time to sleep before retrying opening the SharedGroup.
    private long getSleepTime(int tries) {
It looks like this could be static.
    // Returns the time to sleep before retrying opening the SharedGroup.
    private long getSleepTime(int tries) {
        if (INCREMENTAL_BACKOFF_MS == null) {
            return 0;
I think returning INCREMENTAL_BACKOFF_LIMIT_MS is better. Otherwise, with a sleep time of 0, you will get many back-to-back calls to Thread.sleep().
Awesome job! 👍 (It would be x100 awesome if there were a multi-process test, but oh well... 😢)
    // Keep these public so we can ask users to experiment with these values if needed.
    // Should be locked down as soon as possible.
    public static long[] INCREMENTAL_BACKOFF_MS = new long[] {1, 3, 10, 20}; // Will keep re-using last value until LIMIT is hit
    public static long INCREMENTAL_BACKOFF_LIMIT_MS = 1000;
I would like to make the default timeout longer, e.g. 3 seconds. The ANR timeout is 5 seconds, so as long as we don't trigger an ANR, we will always get accurate results.
Yes, I could probably live with that. Retrying every 20 ms for 3 seconds is kinda massive though, so I'll probably cap at 50 ms instead. So {1, 3, 10, 20, 50}?
{1, 10, 100, 200, 400} makes more sense IMO. The retry intervals should differ enough in magnitude. We should also leave some free CPU time for the kernel to do scheduling.
Adjusted to {1, 10, 20, 50, 100, 200, 400} and a 3 second timeout. This is a good compromise between being responsive and allowing the scheduler to work.
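To make the agreed schedule concrete, here is a minimal sketch of how the sleep-time lookup could behave with the adjusted values. It is not the code merged in the PR: the class name `BackoffSketch`, the 1-based `tries` convention, and the clamping logic are assumptions for illustration. The null case follows the review suggestion of returning the limit instead of 0.

```java
public class BackoffSketch {
    // Values taken from the final review comment; field names mirror the PR's.
    static long[] INCREMENTAL_BACKOFF_MS = new long[] {1, 10, 20, 50, 100, 200, 400};
    static long INCREMENTAL_BACKOFF_LIMIT_MS = 3000;

    // Returns the time to sleep before the given retry.
    // Assumption: tries is 1-based (first retry == 1).
    static long getSleepTime(int tries) {
        long[] schedule = INCREMENTAL_BACKOFF_MS;
        if (schedule == null || schedule.length == 0) {
            // Assumption, per the review suggestion: sleep for the whole limit
            // rather than returning 0 and spinning on Thread.sleep(0).
            return INCREMENTAL_BACKOFF_LIMIT_MS;
        }
        // Clamp the index so the last value is re-used once the schedule runs out.
        int index = Math.min(Math.max(tries, 1), schedule.length) - 1;
        return schedule[index];
    }

    public static void main(String[] args) {
        for (int tries = 1; tries <= 9; tries++) {
            System.out.println("retry " + tries + " -> " + getSleepTime(tries) + " ms");
        }
    }
}
```

Because the method only reads static fields, it can itself be static, as suggested in the first review comment.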
You made it!! 👍
    assertTrue(lockFile.exists());
    FileOutputStream fooStream = new FileOutputStream(lockFile, false);
    fooStream.write("Boom".getBytes());
    fooStream.close();
Maybe add a flush to ensure the file is modified before proceeding with the rest of the test.
flush is a no-op for FileOutputStream:
/**
* Flushes this stream. Implementations of this method should ensure that
* any buffered data is written out. This implementation does nothing.
*
* @throws IOException
* if an error occurs while flushing this stream.
*/
public void flush() throws IOException {
/* empty */
}
LGTM 👍
This PR is the result of investigating #2459.
We have anecdotal evidence that multiple processes might be running for the same app. We have not been able to verify this, but the result matches the "IncompatibleLockFile" errors reported in #2459. This happens if the two processes use two different versions of Realm (e.g. during app upgrades).
This PR is based on our assumption that such an overlap is not intentional and only happens because one process is started before the previous was completely shut down.
So this PR introduces an optimistic opening scheme where we retry for 1 second before crashing as before.
I also added logging so users can see in their logs if they run into this issue.
The settings for this are currently public in SharedGroup, which will allow users to modify the timeout and retry-policy without us needing to release a new version of Realm Java.
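The optimistic opening scheme described above could be sketched roughly as follows. This is an illustration under stated assumptions, not Realm's implementation: `OptimisticOpenSketch`, `openWithRetry`, and the nested `IncompatibleLockFileException` are hypothetical names, the opener is passed in as a plain `Supplier`, and the limit is measured as accumulated sleep time rather than wall-clock time.

```java
import java.util.function.Supplier;

public class OptimisticOpenSketch {
    // Hypothetical stand-in for the error raised on an incompatible lock file.
    static class IncompatibleLockFileException extends RuntimeException {
        IncompatibleLockFileException(String message) {
            super(message);
        }
    }

    // Retries opener until it succeeds or the accumulated sleep time reaches limitMs.
    // Assumption: backoffMs is non-null and non-empty; its last value is re-used.
    static <T> T openWithRetry(Supplier<T> opener, long[] backoffMs, long limitMs) {
        long sleptMs = 0;
        int tries = 0;
        while (true) {
            try {
                return opener.get();
            } catch (IncompatibleLockFileException e) {
                if (sleptMs >= limitMs) {
                    throw e; // Out of time: fail exactly as before the retry scheme.
                }
                long sleepMs = backoffMs[Math.min(tries, backoffMs.length - 1)];
                tries++;
                try {
                    Thread.sleep(sleepMs);
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt(); // Preserve the interrupt flag.
                    throw e;
                }
                sleptMs += sleepMs;
                System.out.println("Waiting for the other process to release the file: "
                        + sleptMs + " ms");
            }
        }
    }
}
```

A caller would pass the real open call as the supplier, for example `openWithRetry(() -> open(path), new long[] {1, 3, 10, 20}, 1000)`, where `open` and `path` are placeholders for whatever actually opens the file.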
@realm/java