Upgrade to abstract-level interface #11

CMCDragonkai · 2022-03-17T06:50:46Z

Specification

The abstract-level https://github.com/Level/abstract-level is a new foundation of LevelDB. Upgrading to this will help us resolve all other issues in DB.

Use bufferWrap to support ArrayBuffer, Uint8Array and Buffer #3 - abstract-level supports it directly
Integrate DB into the level db interface so db levels are the same as db #8 - Our DB abstraction should be rebuilt as an abstract-level layer, that is a class extending AbstractLevel
```
class DB extends AbstractLevel { ... }
```
It will still need to use classic-level (which is just leveldown atm) in NodeJS because NodeJS doesn't support indexed DB atm. But this should then basically make DB just an implementation of the abstract-level interface.
DB would then be a specialised classic-level that sets up root sublevels of data, transactions, and index and performs semi-transparent encryption & decryption (since it doesn't have a block level yet).
DB would then return sublevels as wrapped DBs as well...
Implement True Snapshot Isolation for LevelDB #4 Should already be resolved, since we end up using iterator() for read-committed level transactions, and true snapshots aren't really needed.
Integrate Automatic Indexing into DB #1 becomes possible now that we have a better idea of what kind of indexing needs we need, and we can put all the index data under the index root level

#5 still cannot be solved until IDB becomes available, then we can use browser-level or level-js and have the transparent encryption be done at the block level. Or done at the C++ level inside leveldb, or rocksdb.

Wait until level has released version 8, although this can be started now, since we may not need to use the level package at all. The only dependencies of js-db may be abstract-level and classic-level.

Additional context

https://github.com/Level/abstract-level/blob/main/UPGRADING.md
https://github.com/Level/community#what-is-abstract-level
Add support for IndexedDB denoland/deno#1699 - Deno may have IDB one day
Fine Grained Concurrency Control Standardisation Polykey#294 - Expected downstream usage of DB
A way to get a deeply nested sublevel in one-go Level/subleveldown#112 - Getting deeply nested sublevel in one go

Tasks

...
...
...

The text was updated successfully, but these errors were encountered:

CMCDragonkai · 2022-03-22T05:47:06Z

I want to change the API interface to:

get(keys: string | Buffer | Array<string | Buffer>)

So that way get('key') or get(['level1', 'key']) can work.

And a way to create deeply nested levels more easily: Level/subleveldown#112

CMCDragonkai · 2022-03-23T04:03:53Z

We should remove any usage and management of sublevels. Unless the sublevels themselves are just DB objects. But mostly it should be possible to just directly use the DB object instance and use key paths.

The APIs between DB and DBTransaction is still different because we want to eventually move to where you can just specify a key path instead of directly maintaining particular sublevel objects. Once all methods can just take a full keypath, there's no need to manage a bunch of sublevel objects, and that should reduce the object maintenance overhead.

// root iterator at dataDb
DB.iterator(options);
// same
DB.iterator(options, []);
// down level 1
DB.iterator(options, ['level1']);
DB.iterator(options, ['level1', 'level2]);
DBTransaction.iterator(options, ['level1', 'level2']);

DB.count();
DB.count(['level1', 'level2']);
DBTransaction.count();
DBTransaction.count(['level1']);

DB.clear();
DB.clear(['level1']);
DBTransaction.clear(['level1']); // this should be a transactional clear, to do this, we must iterator and then do `db.del`

DB.get('key');
DB.get(['level1', 'key']);
// these are not allowed, or they are equivalent to asking for undefined
DB.get([]);
DB.get();

DB.put('key', 'value');
DB.put(['level1', 'key'], 'value');

DB.del('key');
DB.del(['level1', 'key']);
DBTransaction.del('key');
DBTransaction.del(['level1', 'key']);

CMCDragonkai · 2022-03-23T04:12:43Z

We will need to test and benchmark sublevel creation multiple times to see how fast we can do this though, because somethings like iterator and count will require creating those sublevels.

CMCDragonkai · 2022-03-23T04:36:10Z

Some benchmarking results show that naively creating sublevels in order to support the key paths API or the domain APIs is quite slow:

  create 1 sublevels:
    87 320 ops/s, ±0.79%   | fastest

  create 2 sublevels:
    45 325 ops/s, ±0.55%   | 48.09% slower

  create 3 sublevels:
    30 294 ops/s, ±0.39%   | 65.31% slower

  create 4 sublevels:
    22 764 ops/s, ±2.34%   | slowest, 73.93% slower

  get via sublevel:
    25 055 ops/s, ±1.36%   | 71.31% slower

  get via key path concatenation:
    54 992 ops/s, ±2.68%   | 37.02% slower

Therefore it appears that if we want to have an efficient implementation, instead of just creating intermediate sublevel objects, we instead should be doing key path/prefix concatenation.

Now this is what we do when we use get and put and del, however the methods like level and count and dump and iterator are not, and are just creating sublevel objects.

For level, count, dump and iterator, we fundamentally need subleveldown to support taking the concatenated prefix, or an array type as the prefix in Level/subleveldown#112 (comment). But this is prevented by subleveldown's RangeError that is thrown Level/subleveldown#111 (comment).

Basically it seems like db.sublevel(['foo', bar']) would be implemented efficiently as db.sublevel('foo!bar') so that the key path is just concatenated rather than creating intermediate sublevel objects.

Our in our case:

subleveldown(dbLevel, `foo${utils.prefix}bar`)
// or
subleveldown(dbLevel, ['foo', 'bar'])

CMCDragonkai · 2022-03-26T03:58:33Z

I suggest that we remove subleveldown entirely, and just allow key paths for both iterators and get, put, del and clear... etc. Most custom operations like dump, clear, count all rely on the iterator. And we don't need subleveldown if we run our iterator based on a specific key path for gt and lt`.

I reckon it would be something like this:

iterator({
  // gt means greater than this key
  // since !A!B! would be the domain path, any key added would always be 1 character more
  gt: '!A!B!',
  // this would have to be the very next byte number above A!B, but that which doesn't produce an extra character, we could create a buffer and just increment by 1 the last byte
  // if `!` is the separator which is 0x21, then plus 1 would be `"` which is 0x22
  lt: '!A!B"'
})

Then we probably solve a number of problems.

CMCDragonkai · 2022-03-26T04:19:10Z

This script proves we can just use gt and lt to get a proper iteration:

import DB from './src/DB';

async function main () {
  const db = await DB.createDB({
    dbPath: './tmp/db',
    fresh: true
  });

  const l1 = await db.level('level1');
  await db.put(['level1'], 'k', 'v');
  await db.put(['level1'], 'a', 'v');
  await db.put(['level1'], 'c', 'v');

  await db.level('level2');
  await db.put(['level2'], 'k', 'v');
  await db.put(['level2'], 'a', 'v');
  await db.put(['level2'], 'c', 'v');

  await db.put([], '!level1"', 'v');
  await db.put([], '!level1 ', 'v');

  console.log('USING SUBLEVELDOWN');
  for await (const k of l1.createKeyStream()) {
    console.log(k.toString());
  }

  console.log('ITERATOR');
  const i = db.dataDb.iterator({
    gt: '!level1!',
    lt: '!level1"'
  });
  // @ts-ignore
  for await (const [k, v] of i) {
    console.log(k.toString());
  }

  console.log('KEY STREAM');
  for await (const k of db.dataDb.createKeyStream({
    gt: '!level1!',
    lt: '!level1"'
  })) {
    console.log(k.toString());
  }

  console.log('DUMP');
  console.log(await db.dump());

  await db.stop();
}

main();

I just realised that dump doesn't preserve order, and that it assumes keys can just be strings. We can change this to an array of tuples, where the first tuple is always a buffer, or a binary string, and the value is decrypted as usual. If I want to use binary string I can use Buffer.toString('binary'), not this is the same as if I used String.fromCharCode(...key), but I used that in js-id due to Uint8Array.

The raw === true can be used to return buffers for both.

CMCDragonkai · 2022-03-26T04:19:52Z

With this prototype, I might just go ahead with getting rid of subleveldown entirely, and then that would make this 3.0.0 as the APIs are again different. Then that would affect PK quite significantly.

CMCDragonkai · 2022-03-26T04:26:12Z

Even without using subleveldown, the DB can still use abstract-level, and we sort of rely less on what the underlying system can do.

CMCDragonkai · 2022-03-26T05:07:53Z

I missed one thing, sublevels are actually indexed like this:

!A!!B!k

So between sublevels we have !! not just a single !.

CMCDragonkai · 2022-03-27T13:24:13Z

Due to #12, the 3.0.0 DB has been released and the interface shouldn't change all that much even with upgrade to abstract-level.

CMCDragonkai · 2022-03-31T06:19:53Z

#13 has enabled the ability to arbitrarily use level parts without worrying about reserved bytes.

CMCDragonkai added development Standard development epic Big issue with multiple subissues labels Mar 17, 2022

CMCDragonkai changed the title ~~Upgrade to https://github.com/Level/abstract-level~~ Upgrade to abstract-level interface Mar 17, 2022

CMCDragonkai mentioned this issue Mar 17, 2022

Enable the usage of withF and withG context for DB transactions and provide transactional iterators #10

Merged

14 tasks

CMCDragonkai mentioned this issue Mar 26, 2022

Subleveldown is now unnecessary, and it can be replaced with keypaths #12

Merged

9 tasks

CMCDragonkai mentioned this issue May 2, 2022

DBTransaction with Pessimistic Locking & Optimistic Locking and Deadlock Detection #17

Closed

4 tasks

CMCDragonkai mentioned this issue May 26, 2022

Introduce Snapshot Isolation OCC to DBTransaction #19

Merged

38 tasks

CMCDragonkai self-assigned this Jun 7, 2022

CMCDragonkai closed this as completed in #19 Jun 30, 2022

teebirdy added the r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management label Jul 24, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Upgrade to abstract-level interface #11

Upgrade to abstract-level interface #11

CMCDragonkai commented Mar 17, 2022 •

edited

Loading

CMCDragonkai commented Mar 22, 2022

CMCDragonkai commented Mar 23, 2022

CMCDragonkai commented Mar 23, 2022

CMCDragonkai commented Mar 23, 2022 •

edited

Loading

CMCDragonkai commented Mar 26, 2022

CMCDragonkai commented Mar 26, 2022

CMCDragonkai commented Mar 26, 2022

CMCDragonkai commented Mar 26, 2022

CMCDragonkai commented Mar 26, 2022

CMCDragonkai commented Mar 27, 2022

CMCDragonkai commented Mar 31, 2022

Upgrade to abstract-level interface #11

Upgrade to abstract-level interface #11

Comments

CMCDragonkai commented Mar 17, 2022 • edited Loading

Specification

Additional context

Tasks

CMCDragonkai commented Mar 22, 2022

CMCDragonkai commented Mar 23, 2022

CMCDragonkai commented Mar 23, 2022

CMCDragonkai commented Mar 23, 2022 • edited Loading

CMCDragonkai commented Mar 26, 2022

CMCDragonkai commented Mar 26, 2022

CMCDragonkai commented Mar 26, 2022

CMCDragonkai commented Mar 26, 2022

CMCDragonkai commented Mar 26, 2022

CMCDragonkai commented Mar 27, 2022

CMCDragonkai commented Mar 31, 2022

CMCDragonkai commented Mar 17, 2022 •

edited

Loading

CMCDragonkai commented Mar 23, 2022 •

edited

Loading