replication stops with excessive clock skew #853

Closed
389-ds-bot opened this issue Sep 12, 2020 · 10 comments
Comments

@389-ds-bot

Cloned from Pagure issue: https://pagure.io/389-ds-base/issue/47516


If the CSN generator clock skew exceeds 1 day, replication stops. Users need to be able to keep replicating despite the high clock skew, so there should be a configuration attribute that allows replication to continue when the skew is excessive.

This is becoming a much bigger problem now that many users are using VMs, which are notorious for having system clock/time/ntp issues.
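
The requested switch could be sketched roughly as follows. This is only an illustration, not the code from the actual fix: check_clock_skew, skew_result, MAX_CLOCK_SKEW_SECS and the ignore_skew flag are hypothetical names, and the sketch assumes that only a remote clock running ahead of the local one counts as skew.

```c
#include <stdbool.h>
#include <time.h>

/* Built-in limit from the report: one day, in seconds. */
#define MAX_CLOCK_SKEW_SECS (24 * 60 * 60)

/* Hypothetical result codes, for illustration only. */
typedef enum {
    SKEW_OK,      /* skew is within the limit */
    SKEW_IGNORED, /* over the limit, but the switch says carry on */
    SKEW_FATAL    /* over the limit, replication must stop */
} skew_result;

/*
 * Decide what to do about the difference between the local clock and
 * the newest timestamp seen from a remote replica.
 * ignore_skew stands in for the requested configuration attribute.
 */
skew_result
check_clock_skew(time_t local_now, time_t remote_time, bool ignore_skew)
{
    /* Assumption: only a remote clock ahead of ours counts as skew. */
    double skew = difftime(remote_time, local_now);

    if (skew <= (double)MAX_CLOCK_SKEW_SECS) {
        return SKEW_OK;
    }
    return ignore_skew ? SKEW_IGNORED : SKEW_FATAL;
}
```

With the flag off, a remote timestamp two days ahead yields SKEW_FATAL, which matches the behaviour reported here; with the flag on, the same input yields SKEW_IGNORED so the caller can log the condition and keep replicating.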

@389-ds-bot 389-ds-bot added the closed: fixed Migration flag - Issue label Sep 12, 2020
@389-ds-bot 389-ds-bot added this to the 1.2.11.23 milestone Sep 12, 2020
@389-ds-bot

Comment from rmeggins (@richm) at 2013-09-18 00:04:59

Ticket has been cloned to Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1009122

@389-ds-bot

Comment from nhosoi (@nhosoi) at 2013-09-19 00:44:24

It's okay, since the default setting is OFF (#define LDAP_OFF 0) and, per the comment, "If there's no default value, the value will be NULL if it's not set in dse.ldif". Still, it would be nice to explicitly initialize the value in FrontendConfig_init, the same way as "cfg->ldapi_map_entries = LDAP_OFF;".

@389-ds-bot

Comment from rmeggins (@richm) at 2013-09-19 03:19:49

To ssh://git.fedorahosted.org/git/389/ds.git
f513bc3..9dc7a46 389-ds-base-1.2.11 -> 389-ds-base-1.2.11
commit 9dc7a46
Author: Rich Megginson richm@redhat.com
Date: Wed Sep 18 12:32:23 2013 -0600
bdca415..e61009e 389-ds-base-1.3.0 -> 389-ds-base-1.3.0
commit e61009e
Author: Rich Megginson richm@redhat.com
Date: Wed Sep 18 12:32:23 2013 -0600
6829200..1f2c151 389-ds-base-1.3.1 -> 389-ds-base-1.3.1
commit 1f2c151
Author: Rich Megginson richm@redhat.com
Date: Wed Sep 18 12:32:23 2013 -0600
c410b87..bde4372 master -> master
commit bde4372
Author: Rich Megginson richm@redhat.com
Date: Wed Sep 18 12:32:23 2013 -0600

@389-ds-bot

Comment from lkrispen (@elkris) at 2013-09-19 13:13:04

It's OK, but if replication continues, the clock skew will be logged again and again and flood the error log.

Another option would be to make the maximum clock skew configurable, with the current 1 day as the default and, e.g., -1 meaning no limit.
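
A minimal sketch of that alternative, assuming the limit is held as a number of seconds. The names skew_exceeds_limit, DEFAULT_MAX_SKEW_SECS and MAX_SKEW_UNLIMITED are made up for illustration and do not come from the server code; falling back to the default for other negative values is also an assumption.

```c
#include <stdbool.h>

/* Default suggested above: one day, in seconds. */
#define DEFAULT_MAX_SKEW_SECS (24L * 60L * 60L)
/* Sentinel suggested above: no limit at all. */
#define MAX_SKEW_UNLIMITED (-1L)

/*
 * Return true when skew_secs exceeds the configured limit.
 * max_skew_secs is a hypothetical configuration value:
 *   -1 means never treat skew as excessive;
 *   any other negative value is treated as invalid, so the
 *   default applies instead.
 */
bool
skew_exceeds_limit(long skew_secs, long max_skew_secs)
{
    if (max_skew_secs == MAX_SKEW_UNLIMITED) {
        return false;
    }
    if (max_skew_secs < 0) {
        max_skew_secs = DEFAULT_MAX_SKEW_SECS;
    }
    return skew_secs > max_skew_secs;
}
```

Compared with a plain on/off switch, a numeric limit lets an administrator raise the threshold without disabling the check entirely, and with -1 the check never fires, so nothing is logged repeatedly.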

@389-ds-bot

Comment from rmeggins (@richm) at 2014-01-17 00:47:51

The previous fix makes replication ignore time skew errors, but does not ensure that the CSN generator will continue to issue CSNs that exceed its built-in time skew limit. We need to make sure that the CSN generator will never issue duplicate CSNs or regress CSNs.
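
To illustrate the "never duplicate, never regress" requirement, here is a simplified sketch. It is not the server's CSN generator: sketch_csn, sketch_csngen and the two functions are hypothetical, and the CSN is reduced to a timestamp plus a 16-bit sequence number.

```c
#include <stdint.h>
#include <time.h>

/* Simplified CSN: a timestamp plus a per-second sequence number. */
typedef struct {
    time_t   tstamp;
    uint16_t seqnum;
} sketch_csn;

/* Generator state: the last CSN handed out. */
typedef struct {
    sketch_csn last;
} sketch_csngen;

/* Three-way compare: negative, zero, or positive, like strcmp(). */
int
sketch_csn_compare(const sketch_csn *a, const sketch_csn *b)
{
    if (a->tstamp != b->tstamp) {
        return (a->tstamp < b->tstamp) ? -1 : 1;
    }
    return (int)a->seqnum - (int)b->seqnum;
}

/*
 * Issue the next CSN given the current clock reading.
 * If the clock has not advanced past the last issued timestamp
 * (stalled or stepped backwards), keep the old timestamp and bump the
 * sequence number; if the sequence number would wrap, move the
 * timestamp forward by one second instead.
 */
sketch_csn
sketch_csngen_next(sketch_csngen *gen, time_t now)
{
    sketch_csn next;

    if (now > gen->last.tstamp) {
        next.tstamp = now;
        next.seqnum = 0;
    } else if (gen->last.seqnum < UINT16_MAX) {
        next.tstamp = gen->last.tstamp;
        next.seqnum = (uint16_t)(gen->last.seqnum + 1);
    } else {
        next.tstamp = gen->last.tstamp + 1;
        next.seqnum = 0;
    }
    gen->last = next;
    return next;
}
```

Each CSN returned compares strictly greater than the one before it, whatever the clock does. The trade-off is that after a backwards clock step the issued timestamps run ahead of the local clock, which is exactly the situation where a skew limit would otherwise kick in.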

@389-ds-bot

Comment from rmeggins (@richm) at 2014-01-17 01:00:16

Another problem with the fix - it only handles the case where the supplier time skew is too great - it does not take into consideration the case where the consumer time skew is too great:
repl5_inc_protocol.c:

              case EXAMINE_RUV_OK:
                  /* update our csn generator state with the consumer's ruv data */
                  dev_debug("repl5_inc_run(STATE_SENDING_UPDATES) -> examine_update_vector OK");
                  object_acquire(prp->replica_object);
                  replica = object_get_data(prp->replica_object);
                  rc = replica_update_csngen_state (replica, ruv);
                  object_release (prp->replica_object);
                  replica = NULL;
                  if (rc == CSN_LIMIT_EXCEEDED) /* too much skew */ {
                      slapi_log_error(SLAPI_LOG_FATAL, repl_plugin_name,
                          "%s: Incremental protocol: fatal error - too much time skew between replicas!\n",
                          agmt_get_long_name(prp->agmt));
                      next_state = STATE_STOP_FATAL_ERROR;

@389-ds-bot

Comment from rmeggins (@richm) at 2014-01-18 05:18:00

0001-Ticket-47516-replication-stops-with-excessive-clock-.patch

@389-ds-bot

Comment from rmeggins (@richm) at 2014-01-21 00:01:09

To ssh://git.fedorahosted.org/git/389/ds.git
962de25..d128dbd 389-ds-base-1.2.11 -> 389-ds-base-1.2.11
commit d128dbd
Author: Rich Megginson richm@redhat.com
Date: Thu Jan 16 12:57:22 2014 -0700
075a54e..b51a57b 389-ds-base-1.3.0 -> 389-ds-base-1.3.0
commit b51a57b20386e506a7eb484b62d39bf249ef995f
Author: Rich Megginson richm@redhat.com
Date: Thu Jan 16 12:57:22 2014 -0700
7738016..51c1b2a 389-ds-base-1.3.1 -> 389-ds-base-1.3.1
commit 51c1b2a
Author: Rich Megginson richm@redhat.com
Date: Thu Jan 16 12:57:22 2014 -0700
668903c..a6ec074 389-ds-base-1.3.2 -> 389-ds-base-1.3.2
commit a6ec074
Author: Rich Megginson richm@redhat.com
Date: Thu Jan 16 12:57:22 2014 -0700
9c41a36..9f2b104 master -> master
commit 9f2b104
Author: Rich Megginson richm@redhat.com
Date: Thu Jan 16 12:57:22 2014 -0700

@389-ds-bot

Comment from rmeggins (@richm) at 2017-02-11 22:49:25

Metadata Update from @richm:

  • Issue assigned to richm
  • Issue set to the milestone: 1.2.11.23
