Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use bufferWrap to support ArrayBuffer, Uint8Array and Buffer #3

Open
2 of 11 tasks
CMCDragonkai opened this issue Oct 7, 2021 · 17 comments
Open
2 of 11 tasks
Labels
development Standard development r&d:polykey:core activity 2 Cross Platform Cryptography for JavaScript Platforms

Comments

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Oct 7, 2021

Specification

The ArrayBuffer type is more generalised in the JS ecosystem. The DB right now focuses on taking Node Buffers.

Node buffers can be converted to ArrayBuffer easily, and ArrayBuffer can be wrapped in Node buffers.

According to this comment: Level/level-js#34 (comment) it is possible to use ArrayBuffers directly in leveldb, it just not documented.

It says that the asBuffer has to be turned false.

I'm not sure how this would work with keyEncoding and valueEncoding that we have set to 'binary'.

Our raw is also specifying that we would get Buffer, would that mean we instead say that we return ArrayBuffer instead, and with keys it would be string | ArrayBuffer?

Furthermore this would impact js-encryptedfs, so it's worthwhile to explore the implications of this.

The primary reason to do this would be cross-platform compatibility for mobile platforms that may not have the Node buffer.

The alternative would be use https://github.com/feross/buffer everywhere as a dependency so that way everything just uses the buffer library. This will mean that any usage of import { Buffer } from 'buffer'; will be resolved by feross/buffer first though, so one should beware of that.

Note that all Node Buffer is Uint8Array which is ArrayBuffer.

If something supports ArrayBuffer, they would support Uint8Array and Buffer at the same time.

One benefit would be integration of js-id MatrixAI/js-id#1 can be simplified since Id as Uint8Array can be stored directly in the DB without first wrapping them as Buffer.

Additional context

Tasks

  1. - Investigate ArrayBuffer compatibility in leveldb
  2. - Investigate to what extent can we enable compatibility with ArrayBuffer in DB without losing Buffer support
  3. - Replace all uses of string | Buffer with string | ArrayBuffer for keys, but retain Buffer on returned output.
  4. - Change this.crypto.key to be assumed to be ArrayBuffer
  5. - Change toArrayBuffer to an explicit slice copy of all possible instances of ArrayBuffer type
  6. - The fromArrayBuffer is still useful since we return Buffer.
  7. - Change serialize and deserialize to use ArrayBuffer and TextEncoder and TextDecoder which can do the utf-8 encoding/decoding.
  8. - Check if PK is using any of the utility functions here, some uses of deserialize and serialize might need to change.
  9. - Ensure that most signatures are still compatible
  10. - Update all tests to include Uint8Array and ArrayBuffer usage, and ensure that slice-copying is working.
  11. - Check that there's no performance regression on the benchmarks
@CMCDragonkai CMCDragonkai added the development Standard development label Oct 7, 2021
@CMCDragonkai CMCDragonkai changed the title Support ArrayBuffer instead of Buffer Support ArrayBuffer instead of (or in addition to) Buffer Oct 7, 2021
@CMCDragonkai
Copy link
Member Author

Note that node module resolution would load locally first before loading from anywhere else.

I'm not sure how this applies to non-node runtimes like in NativeScript. Recommend not doing anything here until we have understood the platform properly.

The usage of ArrayBuffer would be a standardisation across other JS libraries and anything claiming to support ecmascript remember.

@CMCDragonkai
Copy link
Member Author

Note that all Node Buffer is Uint8Array which is ArrayBuffer.

If something supports ArrayBuffer, they would support Uint8Array and Buffer at the same time.

We are also using alot of Uint8Array where Buffer was normally used. But ArrayBuffer is always useful from a "storage" POV. And returning Uint8Array means you always returning something that can be ArrayBuffer too either by using it directly or with bytes.buffer.

Therefore we can be satisfied with supporting Uint8Array as well. But maximum compatibility is going to be the ArrayBuffer.

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented Oct 19, 2021

The type we use in js-db is LevelDB<string | Buffer, Buffer>. This means the values returned satisfy the interface of Buffer, Uint8Array and ArrayBuffer all together.

As for string | Buffer in src/DB.ts, if it works just by changing Buffer to ArrayBuffer, then this would solve this issue, and one could pass Buffer, ArrayBuffer or Uint8Array as the key.

@CMCDragonkai
Copy link
Member Author

The only issue would be the domainPath that uses prefixer. Does that work with just ArrayBuffer, or would that need Uint8Array?

@CMCDragonkai
Copy link
Member Author

No it doesn't because it uses Buffer.isBuffer to check. It throws this error Error: key should be a string or a buffer if it is not Buffer.

However one could wrap it in our domainPath function. We should be able to zero-copy it into a Node Buffer.

@CMCDragonkai
Copy link
Member Author

The domainPath can be resolved by this change:

function domainPath(levels: DBDomain, key: string | ArrayBuffer): string | Buffer {
  // prefixer only takes string or Node Buffer
  // convert key into key_ which wraps ArrayBuffer as Buffer
  let key_: string | Buffer;
  if (typeof key === 'string') {
    key_ = key;
  } else if (ArrayBuffer.isView(key)) {
    key_ = Buffer.from(key.buffer, key.byteOffset, key.byteLength);
  } else {
    key_ = Buffer.from(key);
  }
  if (!levels.length) {
    return key_;
  }
  let prefix = key_;
  for (let i = levels.length - 1; i >= 0; i--) {
    prefix = prefixer(levels[i], prefix);
  }
  return prefix;
}

However this doesn't fix everything. We also use Buffer for this.crypto.key and there are complications with when using worker manager.

When using worker manager, we want to copy the ArrayBuffer, but if the actual instance is Node Buffer, the slice call creates a view, not a copy. So we have to distinguish by checking with Buffer.is and ArrayBuffer.isView and doing a set or copy or slice call. Ensuring that we are in fact copying the ArrayBuffer.

But yes this issue looks like it is totally possible.

@CMCDragonkai
Copy link
Member Author

Note that this would only present an ArrayBuffer interface to the outside world. Internally Node Buffer would still be used by the domainPath. And of course LevelDB still returns Buffer. So primarily this change would be about increasing the compatibility of DB as we progressively move towards more usage of Uint8Array and ArrayBuffer such as js-id and webcrypto.

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented Oct 19, 2021

The TextDecoder works on ArrayBuffer and Uint8Array.

[nix-shell:~/Projects/js-db]$ node
Welcome to Node.js v14.17.3.
Type ".help" for more information.
> u = new Uint8Array([0x61, 0x62, 0x63])
Uint8Array(3) [ 97, 98, 99 ]
> u.buffer
ArrayBuffer { [Uint8Contents]: <61 62 63>, byteLength: 3 }
> new TextDecoder().decode(u.buffer)
'abc'
> new TextDecoder().decode(u)
'abc'

@CMCDragonkai
Copy link
Member Author

Need to ensure that this doesn't result in a performance regression with respect to benchmarks.

@CMCDragonkai
Copy link
Member Author

Level/subleveldown#111 - seems like we should be able to use Buffer, further checking required.

@CMCDragonkai
Copy link
Member Author

This is somewhat more complicated now that we have alot of functions relying on the Buffer interface.

Applications that use this library right now has to use the feross buffer package if their runtime environment doesn't have the Buffer class.

Having an ArrayBuffer compatible system should make this alot more compatible, however unless this package is made generic over the underlying DB, that is browser indexeddb or node leveldb, then doing this doesn't really change the portability alot. However it does enable the usage of our system to be bit easier, so that when downstream libraries like PK uses ArrayBuffer, they don't need to convert it to a Buffer just so that it can be work inside the DB.

Although the new abstract-level has support for Uint8Array, so that may be a better way to support it https://github.com/Level/browser-level/blob/98c13d4356fdb6ca5668ddd35859e6c9b326b5cb/UPGRADING.md#uint8array-support rather than directly supporting ArrayBuffer.

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented Jul 22, 2022

We are no longer attached to abstract-level anymore. We are directly using rocksdb. As it forces the NAPI API, we might as well stick with buffer. Any need to use ArrayBuffer should just use the feross buffer.

The js-db will likely have native mobile bindings later, in which case even there I'd say we should continue using feross buffer. The ArrayBufferis just quite low level.

One way to quickly solve this issue is to just take ArrayBuffer type, and if it is an array buffer, wrap it up as a Buffer automatically. That way it avoids users having to wrap it themselves. And opens us up to directly using ArrayBuffer in the future.

@CMCDragonkai
Copy link
Member Author

If we receive an ArrayBuffer. All we need to do is Buffer.from(arrayBuffer). That's it. That will be zero-copy and just wrap the buffer.

If we use bufferWrap as in Polykey, then we can also support Uint8Array.

So let's just do that. Here's the bufferWrap function.

/**
 * Zero-copy wraps ArrayBuffer-like objects into Buffer
 * This supports ArrayBuffer, TypedArrays and the NodeJS Buffer
 */
function bufferWrap(
  array: BufferSource,
  offset?: number,
  length?: number,
): Buffer {
  if (Buffer.isBuffer(array)) {
    return array;
  } else if (ArrayBuffer.isView(array)) {
    return Buffer.from(
      array.buffer,
      offset ?? array.byteOffset,
      length ?? array.byteLength,
    );
  } else {
    return Buffer.from(array, offset, length);
  }
}

@CMCDragonkai CMCDragonkai changed the title Support ArrayBuffer instead of (or in addition to) Buffer Use bufferWrap to support ArrayBuffer, Uint8Array and Buffer Oct 17, 2022
@CMCDragonkai
Copy link
Member Author

This means when we are passing Ids into the DB, we no longer have to do const idBuffer = id.toBuffer();.

@CMCDragonkai CMCDragonkai self-assigned this Jul 10, 2023
@CMCDragonkai CMCDragonkai added the r&d:polykey:core activity 2 Cross Platform Cryptography for JavaScript Platforms label Jul 10, 2023
@CMCDragonkai
Copy link
Member Author

@amydevs Since in js-db we are planning to continue using Buffer, you can use it in js-ws as well, and just wrap your ArrayBuffer as a regular buffer.

Do this: #3 (comment) when you check instanceof ArrayBuffer.

@CMCDragonkai
Copy link
Member Author

Use the bufferWrap:

function bufferWrap(
  array: BufferSource,
  offset?: number,
  length?: number,
): Buffer {
  if (Buffer.isBuffer(array)) {
    return array;
  } else if (ArrayBuffer.isView(array)) {
    return Buffer.from(
      array.buffer,
      offset ?? array.byteOffset,
      length ?? array.byteLength,
    );
  } else {
    return Buffer.from(array, offset, length);
  }
}

Now the issue is that Buffer is not a regular type in browsers. But we deal with this by introducing feross/buffer as the polyfill.

However this should only be done in the application context. So either the application remembers to bundle up with Buffer when using such libraries. OR... the package.json could specify feross/buffer as some sort of polyfill-dependency? Not sure, could use peerDependencies.

@CMCDragonkai
Copy link
Member Author

Actually for js-ws it makes more sense to always convert to Uint8Array and not deal with node buffers.

So that means if it is an ArrayBuffer, you can do new Uint8Array(ab). If it is node buffer... it is already a Uint8Array.

@CMCDragonkai CMCDragonkai removed their assignment Sep 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
development Standard development r&d:polykey:core activity 2 Cross Platform Cryptography for JavaScript Platforms
Development

No branches or pull requests

1 participant