
OUT_OF_SYNC #1045

Open
sirkubax opened this issue Jan 23, 2017 · 6 comments

Comments

@sirkubax
Contributor

sirkubax commented Jan 23, 2017

Hello

v 1.9.4

We have already experienced the OUT_OF_SYNC error 3 times, caused by a high write load on one of the ssdb master-master nodes.
The first problem we noticed:

  • replication sync can take up to 2 minutes - it is not real-time any more.

The biggest problem:

  • OUT_OF_SYNC does not fix itself - you need to perform the whole procedure of deleting the meta folders :/ SSDB new node in OUT_OF_SYNC / INIT state #975

It is becoming unacceptable to have to stop your database for a few hours to recreate the synchronization. Could you consider some re-sync mechanism that would recover from a faulty sync? It is almost certain that the sync lag will happen again, and unless it can fix itself, the whole concept becomes useless for production :(

It would be great if we did not have to stop all ssdb instances in case of OUT_OF_SYNC.
Even if we still have to replicate the whole data set to the instance that is out of sync, it would be acceptable if this could be done semi-automatically, with some 'ssd-rebuild' command.
Would you consider introducing logic that would re-sync the faulty data sets?

Technical problems that we see:

  • LevelDB is optimized for latency rather than throughput, which seems to be a problem for the requirement of reasonably fast bulk uploads. We are not sure how much SSDB contributes to this problem, but since SSDB locks the LevelDB files we can't insert directly (into LevelDB) anyway without shutting down SSDB (and that would also break the replication).
  • The replication model of SSDB basically breaks the whole bulk-upload idea (multi_hsets are replicated as single hsets). We are not sure how much work it would be to change this (increasing binlog.capacity wouldn't really solve the problem); for now the only workaround we see is to pace the bulk load on the client side (see the sketch after this list).
  • increasing binlog.capacity would give us more time before going into the OUT_OF_SYNC state, which is hard to recover from, so we would consider increasing it anyway
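
A minimal sketch of that client-side pacing, assuming SSDB listens on 127.0.0.1:8888 and speaks its standard length-prefixed wire protocol; the hash name, batch size and rate below are only examples, not values from our setup:

import socket
import time

def ssdb_request(sock, *args):
    # Each token is sent as a length line followed by the data line;
    # a blank line ends the packet.  The short status reply is read back
    # until the terminating blank line (adequate for simple replies).
    tokens = [str(a).encode() for a in args]
    packet = b''.join(b'%d\n%s\n' % (len(t), t) for t in tokens) + b'\n'
    sock.sendall(packet)
    reply = b''
    while not reply.endswith(b'\n\n'):
        reply += sock.recv(4096)
    return reply

def bulk_hset(items, batch_size=500, max_ops_per_sec=5000):
    # Batch writes with multi_hset and sleep between batches so the
    # replication stream (which replays them as single hsets) can keep up.
    sock = socket.create_connection(('127.0.0.1', 8888))
    batch = []
    for key, value in items:
        batch += [key, value]
        if len(batch) >= 2 * batch_size:
            ssdb_request(sock, 'multi_hset', 'bulk', *batch)
            batch = []
            time.sleep(batch_size / max_ops_per_sec)   # crude rate limit
    if batch:
        ssdb_request(sock, 'multi_hset', 'bulk', *batch)
    sock.close()

if __name__ == '__main__':
    bulk_hset(('k%d' % i, 'v%d' % i) for i in range(10000))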

The list of related bug reports (mostly in Chinese - I do not get the context):
https://github.com/ideawu/ssdb/issues?utf8=%E2%9C%93&q=is%3Aissue%20OUT_OF_SYNC

@ideawu
Owner

ideawu commented Feb 3, 2017

As you mentioned, the simplest and recommended way to avoid OUT_OF_SYNC is to increase the binlog capacity, which can be configured in ssdb.conf (it is a hidden configuration item, not included in the ssdb.conf template):

replication:
	binlog: yes
		capacity: 100000000
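
To see whether the slave is keeping up after you change this, check the binlogs (capacity, min_seq, max_seq) and replication sections reported by the `info` command. A minimal sketch, assuming SSDB on 127.0.0.1:8888 and its length-prefixed wire protocol; it only sends `info` and prints the raw reply:

import socket

def read_reply(sock):
    # Read one SSDB reply: blocks of a length line followed by a data
    # line, terminated by a single blank line.
    buf, blocks = b'', []
    while True:
        while b'\n' not in buf:
            buf += sock.recv(4096)
        line, buf = buf.split(b'\n', 1)
        if line.strip() == b'':            # blank line: end of reply
            return blocks
        size = int(line)
        while len(buf) < size + 1:         # data plus its trailing newline
            buf += sock.recv(4096)
        blocks.append(buf[:size])
        buf = buf[size + 1:]

def ssdb_command(sock, *args):
    # Send one command using the same length-prefixed framing.
    tokens = [a.encode() for a in args]
    sock.sendall(b''.join(b'%d\n%s\n' % (len(t), t) for t in tokens) + b'\n')
    return read_reply(sock)

if __name__ == '__main__':
    sock = socket.create_connection(('127.0.0.1', 8888))
    reply = ssdb_command(sock, 'info')     # reply[0] is the status ('ok')
    sock.close()
    print(b'\n'.join(reply).decode())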

@saveriocastellano

Hello,
I'm having the same problem... I cannot get two masters with sync=mirror to fully sync.
Every time the sync gets up to 70%-80% and then they go OUT_OF_SYNC. This happens all the time.
The size of the database is 80GB, and the database is also receiving heavy writes during the sync...
Is this the problem?

I'm already using this:

replication:
	binlog: yes
		capacity: 100000000

But I still get OUT_OF_SYNC.

Would it help if I further increase the capacity, let's say to 200000000?

@ideawu
Owner

ideawu commented Jan 17, 2020

@saveriocastellano Heavy writes may cause the OUT_OF_SYNC problem, especially when the network bandwidth is not large enough. To solve this problem, one way is to increase binlog.capacity; another way is to improve the network conditions (increase the bandwidth for your server).
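
Roughly speaking, the binlog keeps the last `capacity` write operations, so the sustained write rate determines how long a slave may lag behind before the entries it still needs are dropped and it goes OUT_OF_SYNC. A back-of-the-envelope sketch (the write rate below is an assumed example, not a number from your setup):

# Illustrative only: how long a slave can lag before the binlog no longer
# covers its position.  capacity mirrors replication.binlog.capacity;
# the write rate is an assumption, not a measured value.
capacity = 100_000_000     # binlog entries kept by the master
write_rate = 50_000        # sustained writes per second (assumed)

window = capacity / write_rate
print("slave may lag up to ~%d s (~%d min) before OUT_OF_SYNC" % (window, window / 60))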

@saveriocastellano

OK, I'm pretty sure it is not because of a lack of bandwidth.
Do you think I could try raising the binlog capacity from 100000000 to 200000000?
How does that affect memory?

@ideawu
Owner

ideawu commented Jan 17, 2020

binlog.capacity does not impact memory, it only impacts disk usage, so you can increase it without worrying about memory.

But 100000000 is normally already a large number.

You should try it and tell us the result later.

@saveriocastellano

By the way, in my case doubling the binlog capacity solved the issue, and now the two ssdb instances manage to sync.
