Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Jepsen] Snapshot Isolation (SI) violation under huge clock skew even if it is bounded by max_clock_skew_usec flag #975

Closed
mbautin opened this Issue Mar 12, 2019 · 6 comments

Comments

2 participants
@mbautin
Copy link
Collaborator

commented Mar 12, 2019

@ttyusupov caught an issue with bank account test - total sum diverges under huge clock skew nemesis even if clock skew is bounded by max_clock_skew_usec flag.
How to reproduce:

  • Use Jepsen test suite from jepsen-io/jepsen@2d7e042
  • Update yugabyte/src/auto.clj: add setting max clock skew to 1000000 secounds:
--max_clock_skew_usec 1000000000000
  • Run test in a loop:
lein run test --os debian --nodes-file /home/centos/code/jepsen/nodes-debian --username admin --ssh-private-key '<private_key_path>' --url https://download_url/yugabyte-ce-1.1.15.0-b16-centos-x86_64.tar.gz --trace-cql -w bank --time-limit 300 --concurrency 4n --nemesis clock-skew --nemesis-interval 10

on a GCP cluster - 5 separate debian jessie instances.

This test introduces random clock bumps up to 262 seconds in both directions and random clock strobes from 4 ms to 262 seconds, with a period of 1 ms to 1 second, for a duration of 0-32 seconds (every period ms, adjusts the clock forward by delta ms, or, alternatively, back by delta ms. Does this for duration seconds).

If we reduce delta for both bumps and strobes to 131 seconds - test passes.

@ttyusupov

This comment has been minimized.

Copy link
Contributor

commented Mar 12, 2019

I was able to reproduce bank account test failure under high clock skew and tracked down to a transaction which broke the total sum along with its read_time and other info, both conflicting transactions were applied.

HT:  ('1551809242992604', 0, 0)
Changed:  0 ('1551809242992604', 0, 0) -> 70
Changed:  1 ('1551809242992604', 0, 0) -> -7

HT:  ('1551809243479084', 0, 0)
Changed:  1 ('1551809243479084', 0, 0) -> -10
Changed:  2 ('1551809243479084', 0, 0) -> -10

HT:  ('1551809243893568', '54', 0)
Changed:  1 ('1551809243893568', '54', 0) -> -13
Changed:  3 ('1551809243893568', '54', 0) -> 17

HT:  ('1551809244478109', '15', 0)
Changed:  1 ('1551809244478109', '15', 0) -> -12
Changed:  3 ('1551809244478109', '15', 0) -> 16

1551809382271574:
0 -> 70
1 -> -12

HT:  ('1551809391856309', '807', 0)
Changed:  0 ('1551809391856309', '807', 0) -> 75
Changed:  3 ('1551809391856309', '807', 0) -> 11

TXN: 1aa7ccd8-fd49-4803-8b9b-9d6fc1a68d43, read_time = 1551809382271574
CQL: UPDATE jepsen.accounts SET balance = balance - 1 WHERE id = 1;UPDATE jepsen.accounts SET balance = balance + 1 WHERE id = 0
HT:  ('1551809393777148', '107', 0)
Changed:  0 ('1551809393777148', '107', 0) -> 71
Changed:  1 ('1551809393777148', '107', 0) -> -13
Sum diverged:  95

Transaction 1aa7ccd8-fd49-4803-8b9b-9d6fc1a68d43 is using read_time=1551809382271574 at which we have values: 0 -> 70, 1 -> - 12, it updates them to 71 and -13 respectively and committed succesfully.

But between read_time=1551809382271574 and transaction 1aa7ccd8-fd49-4803-8b9b-9d6fc1a68d43 commit time another transaction is committed with commit time 1551809391856309 which also have modified key 0.

Some related logs:

yb-tserver-n2.INFO:I0305 18:08:17.032371  4495 tablet_rpc.cc:333] Operation failed. Try again. (yb/docdb/conflict_resolution.cc:451): Failed Write(tablet: a64519f536e24ccb86130626261017da, num_ops: 1, num_attempts: 1, txn: fab280b9-3bc1-49d1-af8d-be38f3960ecb) to tablet a64519f536e24ccb86130626261017da on tablet server { uuid: f9ca006445024cd591b1ab10edc983e0 private: [host: "10.150.0.71" port: 9100] cloud_info: placement_cloud: "cloud1" placement_region: "datacenter1" placement_zone: "rack1" } after 1 attempt(s): Value write after transaction start: { physical: 1551809393777148 logical: 107 } >= { physical: 1551809388985167 logical: 819 }
yb-tserver-n2.INFO:0305 18:08:44.674108 (+    70us) async_rpc.cc:543] ProcessResponseFromTserver(Operation failed. Try again.: Value write after transaction start: { physical: 1551809393777148 logical: 107 } >= { physical: 1551809386097503 logical: 341 })
yb-tserver-n3.INFO:I0305 18:08:31.978543  4263 tablet_rpc.cc:333] Operation failed. Try again. (yb/docdb/conflict_resolution.cc:451): Failed Write(tablet: a64519f536e24ccb86130626261017da, num_ops: 1, num_attempts: 1, txn: 34c621ca-63fe-4f33-9016-a5eecfae52a2) to tablet a64519f536e24ccb86130626261017da on tablet server { uuid: f9ca006445024cd591b1ab10edc983e0 private: [host: "10.150.0.71" port: 9100] cloud_info: placement_cloud: "cloud1" placement_region: "datacenter1" placement_zone: "rack1" } after 1 attempt(s): Value write after transaction start: { physical: 1551809393777148 logical: 107 } >= { physical: 1551809389717342 logical: 13 }
yb-tserver-n3.INFO:W0305 18:08:37.963451  4517 transaction_participant.cc:890] 838fb0a6286447cc9c0ac3d99d2c3531: Apply of unknown transaction: { transaction_id: 1aa7ccd8-fd49-4803-8b9b-9d6fc1a68d43 op_id: term: 2 index: 546 commit_ht: { physical: 1551809393777148 logical: 107 } log_ht: { physical: 1551809400140791 } status_tablet: 827908a812a248bd8b1dccfd1bc8afa2 }
yb-tserver-n4.INFO:I0305 18:08:31.571141 11758 tablet_rpc.cc:333] Operation failed. Try again. (yb/docdb/conflict_resolution.cc:451): Failed Write(tablet: a64519f536e24ccb86130626261017da, num_ops: 1, num_attempts: 1, txn: c79f983d-f018-49d2-acc5-740e1a5b9277) to tablet a64519f536e24ccb86130626261017da on tablet server { uuid: f9ca006445024cd591b1ab10edc983e0 private: [host: "10.150.0.71" port: 9100] cloud_info: placement_cloud: "cloud1" placement_region: "datacenter1" placement_zone: "rack1" } after 1 attempt(s): Value write after transaction start: { physical: 1551809393777148 logical: 107 } >= { physical: 1551809391697354 logical: 22 }
@ttyusupov

This comment has been minimized.

Copy link
Contributor

commented Mar 12, 2019

Finally found a bug in LockBatch which can cause doing operations without lock in case lock was failed. The reason is that LockBatch is moveable-only, but MoveFrom function doesn't move status:

void LockBatch::MoveFrom(LockBatch* other) {
  Reset();
  key_to_type_ = std::move(other->key_to_type_);
  shared_lock_manager_ = other->shared_lock_manager_;
  other->key_to_type_.clear();
  other->shared_lock_manager_ = nullptr;
}

So, in case LockBatch constructor failed to do a lock within deadline here:

LockBatch::LockBatch(SharedLockManager* lock_manager, LockBatchEntries&& key_to_intent_type,
                     CoarseTimePoint deadline)
    : key_to_type_(std::move(key_to_intent_type)),
      shared_lock_manager_(lock_manager) {
  if (!empty() && !lock_manager->Lock(&key_to_type_, deadline)) {
    shared_lock_manager_ = nullptr;
    key_to_type_.clear();
    status_ = STATUS_FORMAT(TimedOut, "Failed to obtain locks until deadline: $0", deadline);
  }
}

failed status won't be saved into result.lock_batch and function PrepareDocWriteOperation won't return with failed status at RETURN_NOT_OK_PREPEND, it will continue it execution beyond these lines and return success finally:

  result.lock_batch = LockBatch(
      lock_manager, std::move(determine_keys_to_lock_result.lock_batch), deadline);
  RETURN_NOT_OK_PREPEND(
      result.lock_batch.status(), Format("Timeout: $0", deadline - ToCoarse(start_time)));
@ttyusupov

This comment has been minimized.

Copy link
Contributor

commented Mar 12, 2019

Commit: db46ef9

@ttyusupov

This comment has been minimized.

Copy link
Contributor

commented Mar 12, 2019

After the fix (db46ef9) the bug is no longer reproducible even with clock skew higher than max_clock_skew_usec.

@ttyusupov ttyusupov self-assigned this Mar 12, 2019

@ttyusupov ttyusupov added this to To do in Jepsen Testing via automation Mar 12, 2019

@ttyusupov ttyusupov added this to To Do in YBase features via automation Mar 12, 2019

@ttyusupov ttyusupov added the bug label Mar 12, 2019

@ttyusupov ttyusupov closed this Mar 12, 2019

YBase features automation moved this from To Do to Done Mar 12, 2019

Jepsen Testing automation moved this from To do to Done Mar 12, 2019

@ttyusupov

This comment has been minimized.

Copy link
Contributor

commented Mar 13, 2019

Tested with lein run test --os debian --url https://downloads.yugabyte.com/new/yugabyte-ce-1.2.0.0-b7-centos-x86_64.tar.gz -w bank --time-limit 300 --concurrency 4n --nemesis clock-skew,kill-tserver,pause-tserver --nemesis-interval 10 - test passed 110 times in a row.

@ttyusupov

This comment has been minimized.

Copy link
Contributor

commented Mar 13, 2019

In this test max_clock_skew_usec is not being set, but this test introduces clock skew/strobe up to +-262 seconds per each nemesis operation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.