Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: fast ArrayBuffer encoding #566

Merged
merged 10 commits into from
May 11, 2022

Conversation

ninegua
Copy link
Member

@ninegua ninegua commented Apr 22, 2022

Description

At the moment trying to encode large binary data is very inefficient because one has to convert them to Array first before passing to the IDL encode function.
In fact trying to convert 100MB ArrayBuffer to Array will result in OOM in Node.js.

This patch adds a special case to type check and allows directly passing ArrayBuffer (e.g. Uint8Array) to the encode function and solves the problem.

How Has This Been Tested?

A test case is added.

Checklist:

  • My changes follow the guidelines in CONTRIBUTING.md.
  • The title of this PR complies with Conventional Commits.
  • I have edited the CHANGELOG accordingly.
  • I have made corresponding changes to the documentation.

@ninegua ninegua changed the title feature: fast ArrayBuffer encoding feat: fast ArrayBuffer encoding Apr 22, 2022
@@ -152,6 +152,20 @@ test('IDL encoding (tuple)', () => {
);
});

test('IDL encoding (arraybuffer)', () => {
// ArrayBuffer, encode only.
testEncode(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't change the decoding behavior because it will be a breaking change.

But I will suggest decoding should be changed too. What do people think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit a1e97d6 adds a fast decoding path too. This changes the return type from Array to ArrayBuffers, and will likely break user code when they upgrade.

@ninegua ninegua marked this pull request as ready for review April 22, 2022 17:44
packages/candid/src/idl.test.ts Outdated Show resolved Hide resolved
);
test_(
IDL.Vec(IDL.Nat64),
new BigUint64Array([BigInt(0), BigInt(1), BigInt(1) << BigInt(60), BigInt(13)]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I didn't know BigUint64Array exists.

// Special case for ArrayBuffer
const bits = this._type instanceof FixedNatClass ? this._type.bits : (this._type instanceof FixedIntClass ? this._type._bits : 0);
return (ArrayBuffer.isView(x) && bits == (x as any).BYTES_PER_ELEMENT * 8)
|| (Array.isArray(x) && x.every(v => this._type.covariant(v)));
}

public encodeValue(x: T[]) {
const len = lebEncode(x.length);
if (this._blobOptimization) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems _blobOptimization is not needed anymore?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is still needed if people pass plain array in.

If we remove this, then old code will probably get worse performance unless they also switch from Array to Uint8Array.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a comment in the code. I think eventually, people want to use typedArray for large vector anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


if (this._type instanceof FixedNatClass) {
if (this._type.bits == 8) {
return new Uint8Array(b.read(len)) as unknown as T[];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a breaking change? If so, we need to document it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a doc string to VecClass.

Not sure how to add to CHANGELOG.md since there is none in this repo.

ninegua and others added 2 commits April 22, 2022 11:25
Co-authored-by: Yan Chen <48968912+chenyan-dfinity@users.noreply.github.com>
Copy link
Contributor

@chenyan-dfinity chenyan-dfinity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@krpeacock krpeacock merged commit 974c06a into dfinity:main May 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants