Skip to content

[Jepsen] Snapshot Isolation (SI) violation under huge clock skew even if it is bounded by max_clock_skew_usec flag #975

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mbautin opened this issue Mar 12, 2019 · 6 comments
Assignees
Labels
kind/bug This issue is a bug

Comments

@mbautin
Copy link
Contributor

mbautin commented Mar 12, 2019

@ttyusupov caught an issue with bank account test - total sum diverges under huge clock skew nemesis even if clock skew is bounded by max_clock_skew_usec flag.
How to reproduce:

  • Use Jepsen test suite from jepsen-io/jepsen@2d7e042
  • Update yugabyte/src/auto.clj: add setting max clock skew to 1000000 secounds:
--max_clock_skew_usec 1000000000000
  • Run test in a loop:
lein run test --os debian --nodes-file /home/centos/code/jepsen/nodes-debian --username admin --ssh-private-key '<private_key_path>' --url https://download_url/yugabyte-ce-1.1.15.0-b16-centos-x86_64.tar.gz --trace-cql -w bank --time-limit 300 --concurrency 4n --nemesis clock-skew --nemesis-interval 10

on a GCP cluster - 5 separate debian jessie instances.

This test introduces random clock bumps up to 262 seconds in both directions and random clock strobes from 4 ms to 262 seconds, with a period of 1 ms to 1 second, for a duration of 0-32 seconds (every period ms, adjusts the clock forward by delta ms, or, alternatively, back by delta ms. Does this for duration seconds).

If we reduce delta for both bumps and strobes to 131 seconds - test passes.

@ttyusupov
Copy link
Contributor

I was able to reproduce bank account test failure under high clock skew and tracked down to a transaction which broke the total sum along with its read_time and other info, both conflicting transactions were applied.

HT:  ('1551809242992604', 0, 0)
Changed:  0 ('1551809242992604', 0, 0) -> 70
Changed:  1 ('1551809242992604', 0, 0) -> -7

HT:  ('1551809243479084', 0, 0)
Changed:  1 ('1551809243479084', 0, 0) -> -10
Changed:  2 ('1551809243479084', 0, 0) -> -10

HT:  ('1551809243893568', '54', 0)
Changed:  1 ('1551809243893568', '54', 0) -> -13
Changed:  3 ('1551809243893568', '54', 0) -> 17

HT:  ('1551809244478109', '15', 0)
Changed:  1 ('1551809244478109', '15', 0) -> -12
Changed:  3 ('1551809244478109', '15', 0) -> 16

1551809382271574:
0 -> 70
1 -> -12

HT:  ('1551809391856309', '807', 0)
Changed:  0 ('1551809391856309', '807', 0) -> 75
Changed:  3 ('1551809391856309', '807', 0) -> 11

TXN: 1aa7ccd8-fd49-4803-8b9b-9d6fc1a68d43, read_time = 1551809382271574
CQL: UPDATE jepsen.accounts SET balance = balance - 1 WHERE id = 1;UPDATE jepsen.accounts SET balance = balance + 1 WHERE id = 0
HT:  ('1551809393777148', '107', 0)
Changed:  0 ('1551809393777148', '107', 0) -> 71
Changed:  1 ('1551809393777148', '107', 0) -> -13
Sum diverged:  95

Transaction 1aa7ccd8-fd49-4803-8b9b-9d6fc1a68d43 is using read_time=1551809382271574 at which we have values: 0 -> 70, 1 -> - 12, it updates them to 71 and -13 respectively and committed succesfully.

But between read_time=1551809382271574 and transaction 1aa7ccd8-fd49-4803-8b9b-9d6fc1a68d43 commit time another transaction is committed with commit time 1551809391856309 which also have modified key 0.

Some related logs:

yb-tserver-n2.INFO:I0305 18:08:17.032371  4495 tablet_rpc.cc:333] Operation failed. Try again. (yb/docdb/conflict_resolution.cc:451): Failed Write(tablet: a64519f536e24ccb86130626261017da, num_ops: 1, num_attempts: 1, txn: fab280b9-3bc1-49d1-af8d-be38f3960ecb) to tablet a64519f536e24ccb86130626261017da on tablet server { uuid: f9ca006445024cd591b1ab10edc983e0 private: [host: "10.150.0.71" port: 9100] cloud_info: placement_cloud: "cloud1" placement_region: "datacenter1" placement_zone: "rack1" } after 1 attempt(s): Value write after transaction start: { physical: 1551809393777148 logical: 107 } >= { physical: 1551809388985167 logical: 819 }
yb-tserver-n2.INFO:0305 18:08:44.674108 (+    70us) async_rpc.cc:543] ProcessResponseFromTserver(Operation failed. Try again.: Value write after transaction start: { physical: 1551809393777148 logical: 107 } >= { physical: 1551809386097503 logical: 341 })
yb-tserver-n3.INFO:I0305 18:08:31.978543  4263 tablet_rpc.cc:333] Operation failed. Try again. (yb/docdb/conflict_resolution.cc:451): Failed Write(tablet: a64519f536e24ccb86130626261017da, num_ops: 1, num_attempts: 1, txn: 34c621ca-63fe-4f33-9016-a5eecfae52a2) to tablet a64519f536e24ccb86130626261017da on tablet server { uuid: f9ca006445024cd591b1ab10edc983e0 private: [host: "10.150.0.71" port: 9100] cloud_info: placement_cloud: "cloud1" placement_region: "datacenter1" placement_zone: "rack1" } after 1 attempt(s): Value write after transaction start: { physical: 1551809393777148 logical: 107 } >= { physical: 1551809389717342 logical: 13 }
yb-tserver-n3.INFO:W0305 18:08:37.963451  4517 transaction_participant.cc:890] 838fb0a6286447cc9c0ac3d99d2c3531: Apply of unknown transaction: { transaction_id: 1aa7ccd8-fd49-4803-8b9b-9d6fc1a68d43 op_id: term: 2 index: 546 commit_ht: { physical: 1551809393777148 logical: 107 } log_ht: { physical: 1551809400140791 } status_tablet: 827908a812a248bd8b1dccfd1bc8afa2 }
yb-tserver-n4.INFO:I0305 18:08:31.571141 11758 tablet_rpc.cc:333] Operation failed. Try again. (yb/docdb/conflict_resolution.cc:451): Failed Write(tablet: a64519f536e24ccb86130626261017da, num_ops: 1, num_attempts: 1, txn: c79f983d-f018-49d2-acc5-740e1a5b9277) to tablet a64519f536e24ccb86130626261017da on tablet server { uuid: f9ca006445024cd591b1ab10edc983e0 private: [host: "10.150.0.71" port: 9100] cloud_info: placement_cloud: "cloud1" placement_region: "datacenter1" placement_zone: "rack1" } after 1 attempt(s): Value write after transaction start: { physical: 1551809393777148 logical: 107 } >= { physical: 1551809391697354 logical: 22 }

@ttyusupov
Copy link
Contributor

Finally found a bug in LockBatch which can cause doing operations without lock in case lock was failed. The reason is that LockBatch is moveable-only, but MoveFrom function doesn't move status:

void LockBatch::MoveFrom(LockBatch* other) {
  Reset();
  key_to_type_ = std::move(other->key_to_type_);
  shared_lock_manager_ = other->shared_lock_manager_;
  other->key_to_type_.clear();
  other->shared_lock_manager_ = nullptr;
}

So, in case LockBatch constructor failed to do a lock within deadline here:

LockBatch::LockBatch(SharedLockManager* lock_manager, LockBatchEntries&& key_to_intent_type,
                     CoarseTimePoint deadline)
    : key_to_type_(std::move(key_to_intent_type)),
      shared_lock_manager_(lock_manager) {
  if (!empty() && !lock_manager->Lock(&key_to_type_, deadline)) {
    shared_lock_manager_ = nullptr;
    key_to_type_.clear();
    status_ = STATUS_FORMAT(TimedOut, "Failed to obtain locks until deadline: $0", deadline);
  }
}

failed status won't be saved into result.lock_batch and function PrepareDocWriteOperation won't return with failed status at RETURN_NOT_OK_PREPEND, it will continue it execution beyond these lines and return success finally:

  result.lock_batch = LockBatch(
      lock_manager, std::move(determine_keys_to_lock_result.lock_batch), deadline);
  RETURN_NOT_OK_PREPEND(
      result.lock_batch.status(), Format("Timeout: $0", deadline - ToCoarse(start_time)));

@ttyusupov
Copy link
Contributor

Commit: db46ef9

@ttyusupov
Copy link
Contributor

After the fix (db46ef9) the bug is no longer reproducible even with clock skew higher than max_clock_skew_usec.

@ttyusupov ttyusupov self-assigned this Mar 12, 2019
@ttyusupov ttyusupov added the kind/bug This issue is a bug label Mar 12, 2019
@ttyusupov
Copy link
Contributor

Tested with lein run test --os debian --url https://downloads.yugabyte.com/new/yugabyte-ce-1.2.0.0-b7-centos-x86_64.tar.gz -w bank --time-limit 300 --concurrency 4n --nemesis clock-skew,kill-tserver,pause-tserver --nemesis-interval 10 - test passed 110 times in a row.

@ttyusupov
Copy link
Contributor

ttyusupov commented Mar 13, 2019

In this test max_clock_skew_usec is not being set, but this test introduces clock skew/strobe up to +-262 seconds per each nemesis operation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug This issue is a bug
Projects
None yet
Development

No branches or pull requests

2 participants