Introduce Snapshot Isolation OCC to DBTransaction #19

CMCDragonkai · 2022-05-24T10:54:51Z

Derived from #18

Description

This implements the snapshot isolation DB transaction.

This means DBTransaction will be automatically snapshot isolated, which means most locking will be unnecessary.

Instead, when performing a transaction, there is a chance of an ErrorDBTransactionConflict exception, which means there was a write conflict with another transaction.

Users can then decide at their discretion to retry the operation if they need to (assuming any non-DB side-effects are idempotent, noops or can be compensated). This should reduce the amount of locking overhead we need to do in Polykey. We may bubble up the conflict exception to the user, so the user can re-run their command, or in some cases, in-code we will automatically perform a retry. The user in this case can be the PK client, another PK agent or the PK GUI.
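A minimal sketch of the in-code retry described above. The `ErrorDBTransactionConflict` class here is a local stand-in so the snippet is self-contained, and `retryOnConflict` and `maxAttempts` are hypothetical names, not part of the library. It assumes the whole operation (including creating the transaction) is safe to run again.

```typescript
// Local stand-in for the library's conflict exception (assumption for this sketch)
class ErrorDBTransactionConflict extends Error {}

/**
 * Hypothetical helper: re-runs `operation` when it fails with a conflict,
 * for at most `maxAttempts` attempts. Any other error is rethrown at once.
 */
async function retryOnConflict<T>(
  operation: () => Promise<T>,
  maxAttempts: number = 3,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (e) {
      const isConflict = e instanceof ErrorDBTransactionConflict;
      if (!isConflict || attempt >= maxAttempts) {
        throw e;
      }
      // Conflict with attempts remaining: loop and run the operation again
    }
  }
}
```

The caller would wrap the full transaction body in `operation`, so each attempt starts from a fresh snapshot. Non-DB side-effects inside the body must be idempotent or compensatable, as noted above.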

There is still one situation where user/application locks are needed, and that's where there may be a write-skew. See snapshot isolation https://en.wikipedia.org/wiki/Snapshot_isolation for more details and also https://www.cockroachlabs.com/blog/what-write-skew-looks-like/.

In the future we may upgrade to SSI (serializable snapshot isolation) which will eliminate this write-skew possibility.
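To illustrate write-skew, here is a toy in-memory model of snapshot isolation. It is a sketch for explanation only and not how DBTransaction is implemented: `SnapshotTransaction`, `goOffCall` and the on-call doctors scenario are all hypothetical. Each transaction reads from its own snapshot, and commit only checks the keys it wrote, so two transactions with disjoint write sets both commit even though together they break an application rule.

```typescript
type Versioned = { value: boolean; version: number };
type Store = Map<string, Versioned>;

class SnapshotTransaction {
  private readonly store: Store;
  private readonly snapshot: Store = new Map();
  private readonly writes = new Map<string, boolean>();

  constructor(store: Store) {
    this.store = store;
    // Copy committed state so that later commits by others stay invisible
    store.forEach((v, k) => this.snapshot.set(k, { ...v }));
  }

  get(key: string): boolean | undefined {
    return this.writes.has(key)
      ? this.writes.get(key)
      : this.snapshot.get(key)?.value;
  }

  put(key: string, value: boolean): void {
    this.writes.set(key, value);
  }

  /** Returns false on a write-write conflict. Keys that were only read are not checked. */
  commit(): boolean {
    for (const key of this.writes.keys()) {
      const current = this.store.get(key)?.version ?? 0;
      const seen = this.snapshot.get(key)?.version ?? 0;
      if (current !== seen) return false;
    }
    for (const [key, value] of this.writes) {
      const version = (this.store.get(key)?.version ?? 0) + 1;
      this.store.set(key, { value, version });
    }
    return true;
  }
}

// Application rule: a doctor may only go off call if another one stays on call
function goOffCall(tran: SnapshotTransaction, doctor: string, all: string[]): void {
  const onCall = all.filter((d) => tran.get(d) === true).length;
  if (onCall >= 2) tran.put(doctor, false);
}

function simulateWriteSkew(): { committed: [boolean, boolean]; onCall: number } {
  const store: Store = new Map([
    ['alice', { value: true, version: 1 }],
    ['bob', { value: true, version: 1 }],
  ]);
  const doctors = ['alice', 'bob'];
  const t1 = new SnapshotTransaction(store);
  const t2 = new SnapshotTransaction(store);
  goOffCall(t1, 'alice', doctors); // reads both keys, writes only alice
  goOffCall(t2, 'bob', doctors); // reads both keys, writes only bob
  const committed: [boolean, boolean] = [t1.commit(), t2.commit()];
  const onCall = doctors.filter((d) => store.get(d)?.value === true).length;
  return { committed, onCall };
}
```

In this model both commits succeed and nobody is left on call. Taking an application lock around the read-then-write, or treating the read keys as part of the conflict check, would force one of the two to fail or wait.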

Additionally this PR will also enable the keyAsBuffer and valueAsBuffer options on the iterators, so that decoding can be configured ahead of time instead of having to call dbUtils.deserialize<T>(value) on each value. - already merged

See this https://www.fluentcpp.com/2019/08/30/how-to-disable-a-warning-in-cpp/ as to how to disable warnings in C++ cross platform.

Also see: https://nodejs.github.io/node-addon-examples/special-topics/context-awareness/

Issues Fixed

Fixes DBTransaction with Pessimistic Locking & Optimistic Locking and Deadlock Detection #17
Fixes Upgrade to abstract-level interface #11 - we won't bother with abstract-level anymore since we are forking classic-level
Related Updating keyPathToKey to escape key parts #22
Related [Bug]: toStrictEqual doesn't work on simple arrays returned by C++ addon with node-api jestjs/jest#12814

Tasks

Final checklist

CMCDragonkai · 2022-05-24T10:56:40Z

Bringing in the new changes from TypeScript-Demo-Lib-Native. But without the application builds because this is a pure library. And also removing deployment jobs.

CMCDragonkai · 2022-05-24T11:32:38Z

Ok it's time to finally bring in the leveldb source code and start hacking C++.

CMCDragonkai · 2022-05-25T08:07:25Z

I'm also going to solve the problem with key path here and probably prepare it for merging by cherry picking into staging, this can go along with a number of other CI/CD changes too.

@emmacasolin

CMCDragonkai · 2022-05-25T10:11:23Z

@emmacasolin I'm changing the DBIterator type to be like this and introducing DBIteratorOptions:

/**
 * Iterator options
 * The `keyAsBuffer` property controls
 * whether DBIterator returns KeyPath as buffers or as strings
 * It should be considered to default to true
 * The `valueAsBuffer` property controls value type
 * It should be considered to default to true
 */
type DBIteratorOptions = {
  gt?: KeyPath | Buffer | string;
  gte?: KeyPath | Buffer | string;
  lt?: KeyPath | Buffer | string;
  lte?: KeyPath | Buffer | string;
  limit?: number;
  keys?: boolean;
  values?: boolean;
  keyAsBuffer?: boolean;
  valueAsBuffer?: boolean;
};

/**
 * Iterator
 */
type DBIterator<K extends KeyPath | undefined, V> = {
  seek: (k: KeyPath | string | Buffer) => void;
  end: () => Promise<void>;
  next: () => Promise<[K, V] | undefined>;
  [Symbol.asyncIterator]: () => AsyncGenerator<[K, V]>;
};

This means KeyPath now becomes pre-eminent, and anywhere I have KeyPath | Buffer | string, the Buffer or string is interpreted as a singleton KeyPath.

CMCDragonkai · 2022-05-25T10:12:57Z

This also allows the key returned by the iterator to be later used by seek or the range options.

This will impact downstream EFS and PK usage though. But find and replace should be sufficient.

CMCDragonkai · 2022-05-25T14:59:37Z

I've updated the DB._iterator to use the new DBIteratorOptions and DBIterator. I haven't tested yet, only type checked.

CMCDragonkai · 2022-05-25T15:00:52Z

I've also created a utils.toKeyPath function that can be used to easily convert possible keypaths into keypaths. This can be used in our get, put, del functions to convert Buffer and string into KeyPath.
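A minimal sketch of what such a conversion could look like. This is an assumption about the shape of the function, not the code in this PR. The types are local stand-ins, and `Uint8Array` is used in place of `Buffer` (which subclasses it in Node.js) so the snippet has no dependencies.

```typescript
// Local stand-ins for the library's types (assumption for this sketch)
type KeyPart = string | Uint8Array;
type KeyPath = ReadonlyArray<KeyPart>;

/**
 * Wraps a lone string or byte array into a singleton key path,
 * and passes an existing key path through unchanged.
 */
function toKeyPath(keyPath: KeyPath | string | Uint8Array): KeyPath {
  if (typeof keyPath === 'string' || keyPath instanceof Uint8Array) {
    return [keyPath];
  }
  return keyPath;
}
```

With this, `toKeyPath('foo')` gives `['foo']`, while `toKeyPath(['foo', 'bar'])` returns the same array it was given.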

The keyAsBuffer option now means that, when it is set to false, the returned KeyPath is converted to an array of strings rather than the array of buffers it would normally be.

tegefaulkes · 2022-05-26T01:29:38Z

If the problem is double encoding then couldn't this be solved by having clear barriers to where and when the encoding is applied? We need the encoded form internally but the user needs the un-encoded form. Doesn't this mean encoding conversion only needs to happen when we pass to and from the user?

CMCDragonkai · 2022-05-26T01:31:08Z

If the problem is double encoding then couldn't this be solved by having clear barriers to where and when the encoding is applied? We need the encoded form internally but the user needs the un-encoded form. Doesn't this mean encoding conversion only needs to happen when we pass to and from the user?

There are clear barriers. It's only applied when the user passes input into the system. The problem is only for the iterator, because the iterator returns you the "key" that you're supposed to use later for other operations. The solution is to not encode the key again while inside the iterator, which solves the double escaping problem.
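The double escaping problem can be shown with a simplified string-based scheme. This is only an illustration: `/` and `\` are placeholder separator and escape characters, and `escapePart` and `unescapePart` are hypothetical helpers, not the library's real byte-level encoding.

```typescript
const sep = '/';
const esc = '\\';

// Prefix every separator or escape character with the escape character
function escapePart(part: string): string {
  let out = '';
  for (const ch of part) {
    if (ch === esc || ch === sep) out += esc;
    out += ch;
  }
  return out;
}

// Reverse of escapePart: drop each escape character and keep what follows it
function unescapePart(part: string): string {
  let out = '';
  for (let i = 0; i < part.length; i++) {
    if (part[i] === esc && i + 1 < part.length) i++;
    out += part[i];
  }
  return out;
}

const userKey = 'a/b';
const stored = escapePart(userKey); // encoded once, on input

// Correct: the iterator decodes before returning, so re-use encodes once again
const returnedDecoded = unescapePart(stored);
const reusedCorrectly = escapePart(returnedDecoded) === stored;

// Incorrect: the iterator hands back the encoded form, which then gets encoded twice
const doubleEncoded = escapePart(stored);
const reusedIncorrectly = doubleEncoded === stored;
```

Here `reusedCorrectly` is true and `reusedIncorrectly` is false: the doubly encoded key no longer addresses the stored entry.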

CMCDragonkai · 2022-05-26T03:01:42Z

Something I recently discovered is the effect of the empty key.

That is if you have a KeyPath of [] this becomes [''].

Therefore if you use the empty key, in master I believe you would not be able to see this key when iterating.

To solve this, I had to change the default iterator option to use gte instead of gt.

For example:

    if (options_.gt == null && options_.gte == null) {
      options_.gte = utils.levelPathToKey(levelPath);
    }

This now ensures that the empty key shows up within the level during iteration.
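The effect of gt versus gte can be sketched with plain strings. The encoding here is hypothetical (a level `users` stored as the prefix `/users/`), and `iterate` is a stand-in for a range scan. The point it shows is that the empty key encodes to exactly the level prefix, so an exclusive lower bound skips it.

```typescript
// Hypothetical encoding: key = level prefix followed by the key part
const levelPrefix = '/users/';
const storedKeys = [
  levelPrefix + '', // the empty key: identical to the prefix itself
  levelPrefix + 'alice',
  levelPrefix + 'bob',
];

type LowerBound = { gt?: string; gte?: string };

// Stand-in for a range scan: keep keys that satisfy the lower bound, in sorted order
function iterate(keys: string[], bound: LowerBound): string[] {
  return keys
    .filter((k) => bound.gt === undefined || k > bound.gt)
    .filter((k) => bound.gte === undefined || k >= bound.gte)
    .sort();
}

const withGt = iterate(storedKeys, { gt: levelPrefix }); // misses the empty key
const withGte = iterate(storedKeys, { gte: levelPrefix }); // includes the empty key
```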

CMCDragonkai · 2022-05-26T04:52:50Z

Extra test cases were added for empty keys now, and this actually resolves another bug involving empty keys.

describe('utils', () => {
  const keyPaths: Array<KeyPath> = [
    // Normal keys
    ['foo'],
    ['foo', 'bar'],
    // Empty keys are possible
    [''],
    ['', ''],
    ['foo', ''],
    ['foo', '', ''],
    ['', 'foo', ''],
    ['', '', ''],
    ['', '', 'foo'],
    // Separator can be used in key part
    ['foo', 'bar', Buffer.concat([utils.sep, Buffer.from('key'), utils.sep])],
    [utils.sep],
    [Buffer.concat([utils.sep, Buffer.from('foobar')])],
    [Buffer.concat([Buffer.from('foobar'), utils.sep])],
    [Buffer.concat([utils.sep, Buffer.from('foobar'), utils.sep])],
    [Buffer.concat([utils.sep, Buffer.from('foobar'), utils.sep, Buffer.from('foobar')])],
    [Buffer.concat([Buffer.from('foobar'), utils.sep, Buffer.from('foobar'), utils.sep])],
    // Escape can be used in key part
    [utils.esc],
    [Buffer.concat([utils.esc, Buffer.from('foobar')])],
    [Buffer.concat([Buffer.from('foobar'), utils.esc])],
    [Buffer.concat([utils.esc, Buffer.from('foobar'), utils.esc])],
    [Buffer.concat([utils.esc, Buffer.from('foobar'), utils.esc, Buffer.from('foobar')])],
    [Buffer.concat([Buffer.from('foobar'), utils.esc, Buffer.from('foobar'), utils.esc])],
    // Separator can be used in level parts
    [Buffer.concat([utils.sep, Buffer.from('foobar')]), 'key'],
    [Buffer.concat([Buffer.from('foobar'), utils.sep]), 'key'],
    [Buffer.concat([utils.sep, Buffer.from('foobar'), utils.sep]), 'key'],
    [Buffer.concat([utils.sep, Buffer.from('foobar'), utils.sep, Buffer.from('foobar')]), 'key'],
    [Buffer.concat([Buffer.from('foobar'), utils.sep, Buffer.from('foobar'), utils.sep]), 'key'],
    // Escape can be used in level parts
    [Buffer.concat([utils.sep, utils.esc, utils.sep]), 'key'],
    [Buffer.concat([utils.esc, utils.esc, utils.esc]), 'key'],
  ];
  test.each(keyPaths.map(kP => [kP]))(
    'parse key paths %s',
    (keyPath: KeyPath) => {
      const key = utils.keyPathToKey(keyPath);
      const keyPath_ = utils.parseKey(key);
      expect(keyPath.map((b) => b.toString())).toStrictEqual(
        keyPath_.map((b) => b.toString()),
      );
    }
  );
});

CMCDragonkai · 2022-05-26T07:18:29Z

Needs rebase on top of staging now.

CMCDragonkai · 2022-05-26T07:58:31Z

Time to rebase.

CMCDragonkai · 2022-05-30T05:38:14Z

I was looking at 2 codebases to understand how to integrate leveldb.

It appears that classic-level is still using leveldb 1.20, which is 5 years old. The rocksdb is a bit more recent.

The leveldb codebase has a bit more of a complicated build process.

The top level binding.gyp includes a lower level gyp file:

    "dependencies": [
      "<(module_root_dir)/deps/leveldb/leveldb.gyp:leveldb"
    ],

The deps/leveldb/leveldb.gyp file contains all the settings to actually compile the leveldb as a shared object.

Note that the binding.cc does import leveldb headers like:

#include <leveldb/db.h>

These headers are not specified by the binding.gyp. It's possible that by specifying the dependencies, the include headers are made available to the top-level target.

The leveldb.gyp also specifies a dependency on snappy:

    "dependencies": [
      "../snappy/snappy.gyp:snappy"
    ],

These are all organised under deps. These are not git submodules, except for snappy.

I think for us, we should just copy the structure of deps, as well as the submodule configuration. We can preserve the leveldb.gyp and snappy.gyp, and then just write our own binding.gyp that uses it. Then things should proceed as normal.

So it seems that leveldb has a lot of legacy aspects, and rocksdb compilation is a lot cleaner. In fact lots of tooling has stopped using gyp files. We can explore that later.

CMCDragonkai · 2022-05-30T05:50:00Z

The presence of the snappy submodule means cloning has to be done now with git clone --recursive. If you have already cloned, setup the git submodule with git submodule update --init --recursive. It should bring in data into deps/snappy/snappy.

CMCDragonkai · 2022-05-30T06:22:27Z

While porting over the binding.gyp from classic-level, we have to be aware of: MatrixAI/TypeScript-Demo-Lib#38 (comment)

The cflags and cflags_cc both apply to g++ and g++ is used when the file is cpp.

However both c and cpp files may be used at the same time, so we should be setting relevant flags for both cflags and cflags_cc.

I'm not sure if this is true for non-linux platforms. The upstream classic-level does not bother with cflags_cc. However we will set it just so that we can get the proper standard checks.

CMCDragonkai · 2022-05-30T06:33:48Z

I found out what cflags+ means. It is based on:

If the key ends with a plus sign (+), the policy is for the source list contents to be prepended to the destination list. Mnemonic: + for addition or concatenation.

So cflags+ will prepend rather than appending as normal. This sets the visibility=hidden to be true.

This is required due to: https://github.com/nodejs/node-addon-api/blob/main/doc/setup.md. The reason this is required is documented here: nodejs/node-addon-api#460 (comment). This should go into TS-Demo-Lib-Native too.

CMCDragonkai · 2022-06-26T11:23:00Z

I've added a test, "PCC locking to prevent thrashing for racing counters", that demonstrates how to address racing counters at the moment. This is particularly relevant to @tegefaulkes.

Note that since the DBTransaction doesn't have any native locking yet as per task 26, locking has to be done outside of transaction construction. Let me know if this will cause problems. If it does, we have to address task 26, otherwise I can push that to be done later.

The main thing for implementing task 26 is to integrate LockBox into DBTransaction. It would need to ensure that locks are only released when the transaction is committed or rolled back. Deadlock detection is an optional feature on top of that. Any implementation of this should also look into whether we can optimise the ability to retry transactions.

Retrying transactions is inefficient. If a transaction conflict occurs, it is currently necessary to recreate the entire transaction object. This is because the C++ code itself destroys the properties early if committing was not allowed. If we loosen this, and then allow resetting the transaction snapshot, it should be possible to "retry" a transaction just by calling commit again. (Actually I might try this now to see if it can be easily implemented after getting all the tests passing).

CMCDragonkai · 2022-06-26T11:27:16Z

Task 17 can only be done after merging to staging.

CMCDragonkai · 2022-06-27T07:48:59Z

Ok to solve task 26, we need to introduce the LockBox.

It has to be shared between all the transactions, which means it's going to be stored on DB.

Each transaction will then expose a Transaction.lock method.

So it may look like:

const t1 = withF([db.transaction()], async ([tran]) => {
  await tran.lock(['counter', Lock], ['someotherkey', Lock]);
});

const t2 = withF([db.transaction()], async ([tran]) => {
  await tran.lock(['counter', Lock]);
});

await Promise.allSettled([t1, t2]);

Note that tran.lock would be the LockBox.lock.

However LockBox.lock takes Array<LockRequest>.

I'm wondering if it's worth simplifying this. By convention we should be locking based on key paths, but really anything could be used. If we simplify things, the method has to be able to take just a string, and then the default lock constructor can be Lock to ensure mutual exclusion.

Furthermore if RWLock is used, then you get a lot of flexibility with how you want to lock things.

Most important is that the locks are only released after you commit or rollback.

No deadlock detection, this can be addressed later.

As for re-entrant locking, that's something that should also be done, but can also be done later.

So future work:

Deadlock detection
Allow re-entrant locking, by tracking what has been locked within a transaction. Use a Set<ToString> to check for membership.
Allow lock upgrading - this is more for js-async-locks, to allow one to upgrade locks from read to write locks or downgrading

But I do want to add a default lock constructor: if it is not specified, it should just be Lock. This will need an update to js-async-locks. This means await tran.lock('abc', 'foo') takes an automatic lock on abc and foo with just Lock.
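A rough sketch of how bare keys could be normalised to lock requests with a default constructor. Everything here is hypothetical: `Lock` and `RWLock` are empty placeholder classes standing in for the js-async-locks ones, and `normaliseLockRequests` is an illustrative name, not an existing function.

```typescript
// Placeholder lock classes (assumption; the real ones live in js-async-locks)
class Lock {}
class RWLock {}

type LockConstructor = new () => object;
type LockRequest = [string, LockConstructor];

/**
 * Accepts a mix of bare keys and explicit [key, constructor] pairs,
 * filling in `defaultCtor` wherever only a key was given.
 */
function normaliseLockRequests(
  requests: Array<string | LockRequest>,
  defaultCtor: LockConstructor = Lock,
): LockRequest[] {
  return requests.map(
    (r): LockRequest => (typeof r === 'string' ? [r, defaultCtor] : r),
  );
}
```

Under this sketch, `normaliseLockRequests(['abc', 'foo'])` yields a `Lock` request for each key, while `['counter', RWLock]` is left as given.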

CMCDragonkai · 2022-06-27T08:19:29Z

Yea so LockBox.lock returns a ResourceAcquire. This means it's a bit clunky to use, requiring you to use it with the withF internally.

I reckon this is a bit incorrect, since locks are not a separate resource within the transaction, instead transaction locks are properties of the transaction itself. Therefore Transaction.lock will instead just return Promise<void>.

This would allow one to do things like:

await tran.lock('abc');
tran.unlock('abc');

In addition to this, the tran.lock('abc') may return a Promise<[ResourceRelease]>, allowing you to keep a function reference to the release.

To be able to do tran.unlock('abc') would require us to keep a reference to all releasers in a Map. If we need a Map to do re-entrant locking, we might as well make this possible.

CMCDragonkai · 2022-06-28T14:43:18Z

Ok turns out I have some problems with the design of LockBox, and I'm making changes to MatrixAI/js-async-locks#14.

In our DBTransaction, we want to be able to imperatively unlock keys or a subset of keys that are locked by the transaction.

I originally programmed this by adding the ability to LockBox.unlock.

However it turns out that this doesn't work if the LockBox is used with RWLockWriter or RWLockReader. The reason is that there could be multiple possible releases for multiple reader locks on the same key. LockBox.unlock is only given a key, so it doesn't know which of the reader locks to unlock.

This means the tracking of imperative unlocking has to be done by the user/owner of the LockBox, but not the LockBox itself.

Since locked keys are isolated to each transaction, then we can have each DBTransaction instance keep track of the releasers for each locked key. With lock re-entrancy, it doesn't make sense for the Transaction to be able to acquire 2 read locks on the same key. Therefore a transaction can maintain a map of lockReleasers: Map<string, ResourceRelease> rather than the LockBox.

However the issue is that LockBox.lock does not expose this information. Each call to the method returns a single ResourceRelease that releases all the locks held. We could eliminate the ability to lock multiple keys in one go in DBTransaction.lock, and then store one releaser for each key locked. But then we lose some of the niceties that LockBox.lock provides, like sorting, uniqueness, and key.toString(). However I also noticed that we have to redo these functions anyway in DBTransaction.lock, because we have to keep track of all the locks locked so that we can unlock them in reverse order during DBTransaction.destroy. If this is the case, then DBTransaction.lock replicates the API of LockBox.lock, but internally it has to call LockBox.lock one key at a time.
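The bookkeeping described above can be sketched as follows. This is a simplified model and not the DBTransaction code: `KeyedMutex` stands in for a shared LockBox with plain mutexes only, releasers are synchronous here, and `TransactionLocks`, `releaseAll` and `demo` are hypothetical names. It shows one releaser stored per key, sorted and de-duplicated acquisition, re-entrant no-ops, and release in reverse order.

```typescript
/** Minimal per-key mutex built on promise chaining (stand-in for a shared LockBox) */
class KeyedMutex {
  private tails = new Map<string, Promise<void>>();

  async acquire(key: string): Promise<() => void> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    let release!: () => void;
    const current = new Promise<void>((resolve) => {
      release = resolve;
    });
    // The next acquirer waits until this holder calls release
    this.tails.set(key, previous.then(() => current));
    await previous;
    return release;
  }
}

class TransactionLocks {
  private readonly lockBox: KeyedMutex;
  // Map preserves insertion order, so key order equals acquisition order
  private readonly releasers = new Map<string, () => void>();

  constructor(lockBox: KeyedMutex) {
    this.lockBox = lockBox;
  }

  async lock(...keys: string[]): Promise<void> {
    // Sort and de-duplicate so every transaction acquires in one global order
    const sorted = Array.from(new Set(keys)).sort();
    for (const key of sorted) {
      if (this.releasers.has(key)) continue; // already held: re-entrant no-op
      this.releasers.set(key, await this.lockBox.acquire(key));
    }
  }

  unlock(...keys: string[]): void {
    for (const key of keys) {
      const release = this.releasers.get(key);
      if (release === undefined) continue;
      this.releasers.delete(key);
      release();
    }
  }

  /** Intended for commit or rollback: release in reverse acquisition order */
  releaseAll(): string[] {
    const order = Array.from(this.releasers.keys()).reverse();
    this.unlock(...order);
    return order;
  }
}

async function demo(): Promise<string[]> {
  const lockBox = new KeyedMutex();
  const log: string[] = [];
  const t1 = new TransactionLocks(lockBox);
  const t2 = new TransactionLocks(lockBox);
  await t1.lock('b', 'a'); // acquired as a, then b
  const waiting = t2.lock('a').then(() => {
    log.push('t2 locked a');
  });
  await t1.lock('a'); // no-op, t1 already holds a
  log.push('t1 releasing ' + t1.releaseAll().join(','));
  await waiting;
  t2.releaseAll();
  return log;
}
```

One limitation of this sketch: two concurrent `lock` calls from the same transaction on the same key would deadlock, since the membership check happens before the awaited acquire completes.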

Furthermore I found that it's not a good idea to use setupSnapshot() in the DBTransaction.lock method. Definitely not before locking, as that defeats the lock's ability to solve the counter racing problem: a snapshot set before locking results in a conflict. I'm not sure about setting snapshots after locking, so I just removed lazy snapshot setting from DBTransaction.lock entirely.

CMCDragonkai · 2022-06-30T10:19:23Z

Ok js-async-locks is queued for 3.0.0 update.

It's all done, PCC locking is now available in DBTransaction.

CMCDragonkai · 2022-06-30T10:44:56Z

Subsequent work must take place on the staging branch in order to run the builds and test all the release jobs.

CMCDragonkai · 2022-06-30T10:48:40Z

This should trigger v5 of js-db.

CMCDragonkai mentioned this pull request May 24, 2022

Introduce Snapshot Isolation OCC to DBTransaction #18

Closed

15 tasks

CMCDragonkai mentioned this pull request May 25, 2022

Updating keyPathToKey to escape key parts #22

Closed

9 tasks

CMCDragonkai self-assigned this May 25, 2022

CMCDragonkai requested a review from emmacasolin May 25, 2022 08:07

CMCDragonkai force-pushed the feature-control branch 2 times, most recently from c6731ec to fbf8ab3 Compare May 25, 2022 15:08

CMCDragonkai mentioned this pull request May 25, 2022

Upgrading lib dependencies and node.js version MatrixAI/Polykey#374

Merged

40 tasks

CMCDragonkai mentioned this pull request May 26, 2022

Fix Iterator KeyPath and Enable Iterator keyAsBuffer and valueAsBuffer #23

Merged

15 tasks

CMCDragonkai force-pushed the feature-control branch from 0bfd7db to a8ba7b0 Compare May 26, 2022 08:01

CMCDragonkai force-pushed the feature-control branch 3 times, most recently from bd9ca76 to 7b69ba9 Compare May 26, 2022 08:20

CMCDragonkai force-pushed the feature-control branch from 731d1e0 to af0e92a Compare June 26, 2022 10:45

CMCDragonkai added 2 commits June 26, 2022 21:19

Re-enabled DBTransaction and integrated it with rocksdb's optimistic …

b8a6083

…transactions

Added DBTransaction.getForUpdate to address write skew, and added tes…

58a62f2

…t demonstrating race condition thrashing

CMCDragonkai force-pushed the feature-control branch from 5bf710c to 58a62f2 Compare June 26, 2022 11:19

CMCDragonkai mentioned this pull request Jun 27, 2022

Integrating LockBox to DBTransaction - introducing lockMulti MatrixAI/js-async-locks#14

Merged

10 tasks

CMCDragonkai marked this pull request as draft June 27, 2022 08:37

CMCDragonkai requested a review from tegefaulkes June 27, 2022 09:03

CMCDragonkai force-pushed the feature-control branch from e4471c7 to a12afb4 Compare June 30, 2022 10:24

CMCDragonkai added 3 commits June 30, 2022 20:42

Introducing PCC locking for DBTransaction

a06ccdf

Export RocksDB and RocksDBP interfaces

cb25e1f

Updated docs

567526b

CMCDragonkai force-pushed the feature-control branch from a1a1fa6 to 567526b Compare June 30, 2022 10:42

CMCDragonkai marked this pull request as ready for review June 30, 2022 10:45

Updated benchmarks

879c5e2

CMCDragonkai changed the title ~~WIP: Introduce Snapshot Isolation OCC to DBTransaction~~ Introduce Snapshot Isolation OCC to DBTransaction Jun 30, 2022

CMCDragonkai merged commit 8fd0c9a into staging Jun 30, 2022

This was referenced Jun 30, 2022

Deadlock Detection #39

Closed

Serializable Snapshot Isolation #40

Open

CMCDragonkai mentioned this pull request Nov 10, 2022

Using QUIC/HTTP3 to replace utp-native for the Data Transfer Layer in the networking domain MatrixAI/Polykey#234

Closed

19 tasks
