OUT_OF_SYNC #1045
Comments
As you mentioned, the simplest and recommended way to avoid OUT_OF_SYNC is to increase the binlog capacity, which can be configured in ssdb.conf (it is a hidden configuration item, not included in the ssdb.conf template):
Hello, I'm already using this: replication: But I still get OUT_OF_SYNC. Would it help if I further increase the capacity, let's say to 200000000?
@saveriocastellano Heavy writes may cause the OUT_OF_SYNC problem, especially when the network bandwidth is not large enough. To solve this problem, one way is to increase binlog.capacity; another is to improve the network conditions (increase the bandwidth for your server).
OK, I'm pretty sure it is not because of a lack of bandwidth.
But 100000000 is normally a large number. You should try it and tell us the result later.
By the way, in my case doubling the binlog capacity solved the issue, and now the two ssdb instances manage to sync.
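The trade-off discussed above (write load versus binlog capacity) can be sketched with some rough arithmetic. This is only a simplified model, not SSDB's actual implementation: it assumes the replica is lost once its backlog exceeds the binlog capacity, and the function name and all rates below are hypothetical.

```python
import math


def seconds_until_out_of_sync(capacity, write_rate, apply_rate):
    """Estimate how long a replica can lag before it falls out of the binlog.

    Simplified model (an assumption, not SSDB internals):
    - the master keeps at most `capacity` binlog entries,
    - the master appends `write_rate` entries per second,
    - the replica receives and applies `apply_rate` entries per second.
    The backlog grows by (write_rate - apply_rate) entries per second and the
    replica is lost once the backlog exceeds `capacity`.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    backlog_growth = write_rate - apply_rate
    if backlog_growth <= 0:
        # The replica keeps up, so the backlog never grows.
        return math.inf
    return capacity / backlog_growth


# Hypothetical numbers: 60k writes/s on the master, replica applies 10k/s.
print(seconds_until_out_of_sync(100_000_000, 60_000, 10_000))
print(seconds_until_out_of_sync(200_000_000, 60_000, 10_000))
```

Under this model, doubling the capacity doubles the length of the write burst a replica can survive, which would be consistent with the report above that doubling the capacity was enough in one case. It does not help if the replica is permanently slower than the master.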
Hello
v 1.9.4
We have already experienced the OUT_OF_SYNC error 3 times due to heavy writes to one of the ssdb master-master nodes.
The first-to see problem:
The biggest problem:
It is becoming unacceptable to stop the database for a few hours to recreate the synchronization. Could you consider some re-sync mechanism that would recover the faulty sync? It is almost certain that the sync lag will happen again, and unless it can fix itself, the whole concept becomes useless in production :(
It would be great if we didn't have to stop all ssdb instances in case of OUT_OF_SYNC. But even assuming we still have to replicate the whole data set to the instance that is out of sync, if this could be done semi-automatically, with some 'ssd-rebuild' command, this would be acceptable.
Would you consider introducing logic that would re-sync the faulty data sets?
Technical problems that we see:
The "reasonable fast bulk upload" requirement. Not sure how much SSDB contributes to this problem, but since SSDB locks the LevelDB we can't insert directly (to LevelDB) anyway without shutting down SSDB (and this would also break the replication).
multi_hsets are replicated as single hsets. Not sure how much work it would be to change this (increasing the binlog.capacity wouldn't really solve the problem).
The bugs list (mostly Chinese - I do not get the context):
https://github.com/ideawu/ssdb/issues?utf8=%E2%9C%93&q=is%3Aissue%20OUT_OF_SYNC
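To illustrate the point above about multi_hset being replicated as single hset operations, here is a minimal sketch of a bounded binlog. It is a toy model written for this thread, not SSDB code: the class `BoundedBinlog`, its methods, and the tiny capacity are all hypothetical, and the per-field replication behaviour is taken from the report above rather than verified against the source.

```python
from collections import deque


class BoundedBinlog:
    """Toy binlog that keeps only the newest `capacity` entries."""

    def __init__(self, capacity):
        self.entries = deque(maxlen=capacity)  # oldest entries are dropped
        self.next_seq = 1

    def append(self, op):
        self.entries.append((self.next_seq, op))
        self.next_seq += 1

    def multi_hset(self, name, fields):
        # Assumption from the issue text: one binlog entry per field,
        # so a bulk write of k fields consumes k entries.
        for key, value in fields.items():
            self.append(("hset", name, key, value))

    def min_seq(self):
        # Oldest sequence number still retained (or the next one if empty).
        return self.entries[0][0] if self.entries else self.next_seq

    def can_resume(self, replica_last_seq):
        # The replica needs entry replica_last_seq + 1 to still be retained.
        return replica_last_seq + 1 >= self.min_seq()


log = BoundedBinlog(capacity=5)
log.multi_hset("h", {"a": 1, "b": 2, "c": 3})        # entries 1..3
print(log.can_resume(0))   # True: entry 1 is still in the log
log.multi_hset("h", {"d": 4, "e": 5, "f": 6, "g": 7})  # entries 4..7
print(log.can_resume(0))   # False: entries 1 and 2 were evicted
```

In this model a single bulk upload can evict more entries than a lagging replica has consumed, which is why a larger capacity only postpones the problem for a big enough bulk write.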