Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

buffer: improve btoa performance #52427

Merged
merged 2 commits into from
Apr 11, 2024
Merged

Conversation

anonrig
Copy link
Member

@anonrig anonrig commented Apr 8, 2024

Improves the performance of the legacy btoa function. It's not recommended to be used in production.

                                            confidence improvement accuracy (*)    (**)   (***)
buffers/buffer-btoa.js n=1000000 size=1024        ***   1089.00 %      ±15.27% ±20.57% ±27.29%
buffers/buffer-btoa.js n=1000000 size=128         ***    571.99 %       ±6.85%  ±9.22% ±12.22%
buffers/buffer-btoa.js n=1000000 size=16          ***    306.04 %       ±4.24%  ±5.70%  ±7.53%
buffers/buffer-btoa.js n=1000000 size=256         ***    870.77 %      ±13.85% ±18.67% ±24.78%
buffers/buffer-btoa.js n=1000000 size=32          ***    350.58 %       ±4.61%  ±6.18%  ±8.15%
buffers/buffer-btoa.js n=1000000 size=64          ***    462.57 %       ±5.60%  ±7.51%  ±9.91%

Disclaimer

"btoa" is a legacy functionality, and not recommended to be used in production applications.

fast and deprecated

Thanks Matthieu Riegler for the poster

@nodejs-github-bot nodejs-github-bot added buffer Issues and PRs related to the buffer subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. labels Apr 8, 2024
src/node_buffer.cc Outdated Show resolved Hide resolved
src/node_buffer.cc Outdated Show resolved Hide resolved
src/node_buffer.cc Outdated Show resolved Hide resolved
src/node_buffer.cc Outdated Show resolved Hide resolved
src/node_buffer.cc Outdated Show resolved Hide resolved
@lemire
Copy link
Member

lemire commented Apr 9, 2024

@anonrig I recommend trying something like this... We have three cases: one external one-byte string (easy), one non-external one byte string (we use WriteOneByte) and the general case where we go to 16-bit words, we try converting to latin1 and then we see what happens. In this scenario, there might be an error condition, currently in the code I suggest, it returns the empty string.

static void Btoa(const FunctionCallbackInfo<Value>& args) {
  CHECK_EQ(args.Length(), 1);
  Environment* env = Environment::GetCurrent(args);
  THROW_AND_RETURN_IF_NOT_STRING(env, args[0], "argument");

  Local<String> input = args[0].As<String>();
  MaybeStackBuffer<char> buffer;
  size_t written;

  if (input->IsExternalOneByte()) {  // 8-bit case
    auto ext = input->GetExternalOneByteStringResource();
    size_t expected_length = simdutf::base64_length_from_binary(ext->length());
    buffer.AllocateSufficientStorage(expected_length + 1);
    buffer.SetLengthAndZeroTerminate(expected_length);
    written = simdutf::binary_to_base64(
        ext->data(), ext->length(), buffer.out(), simdutf::base64_default);
  } if(input->IsOneByte()) {
    MaybeStackBuffer<uint8_t> stack_buf(input->Length());
        input->WriteOneByte(env->isolate(),
                           stack_buf.out(),
                           0,
                           input->Length(),
                           String::NO_NULL_TERMINATION);

    size_t expected_length = simdutf::base64_length_from_binary(input->Length());
    buffer.AllocateSufficientStorage(expected_length + 1);
    buffer.SetLengthAndZeroTerminate(expected_length);
    written = simdutf::binary_to_base64(
        reinterpret_cast<const char*>(*stack_buf), input->Length(), buffer.out());
  } else {
    String::Value value(env->isolate(), input);
    MaybeStackBuffer<char> stack_buf(value.length());
    size_t out_len = simdutf::convert_utf16_to_latin1(reinterpret_cast<const char16_t *>(*value), value.length(), stack_buf.out());
    if(out_len == 0) { // error
      return args.GetReturnValue().SetEmptyString();
    }
    size_t expected_length = simdutf::base64_length_from_binary(out_len);
    buffer.AllocateSufficientStorage(expected_length + 1);
    buffer.SetLengthAndZeroTerminate(expected_length);
    written = simdutf::binary_to_base64(
        *stack_buf, out_len, buffer.out());
  }

  auto value =
      String::NewFromOneByte(env->isolate(),
                             reinterpret_cast<const uint8_t*>(buffer.out()),
                             NewStringType::kNormal,
                             written)
          .ToLocalChecked();
  return args.GetReturnValue().Set(value);
}

This should be considered ChatGTP-quality code: I did not even try to test it... It is only as an idea.

@lemire
Copy link
Member

lemire commented Apr 9, 2024

@anonrig Lint. Lint. Lint.

@anonrig
Copy link
Member Author

anonrig commented Apr 9, 2024

@lemire I've updated the benchmark results as well :-)

@lemire
Copy link
Member

lemire commented Apr 10, 2024

@anonrig My proposed code would return the empty string on error, but we need to throw, I pushed a commit that does that. (Very minor change.)

@lemire lemire requested a review from joyeecheung April 10, 2024 17:42
@anonrig anonrig added the request-ci Add this label to start a Jenkins CI on a PR. label Apr 10, 2024
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Apr 10, 2024
@nodejs-github-bot
Copy link
Collaborator

@anonrig anonrig requested a review from benjamingr April 11, 2024 01:08
@nodejs-github-bot
Copy link
Collaborator

@anonrig anonrig added the commit-queue Add this label to land a pull request using GitHub Actions. label Apr 11, 2024
@nodejs-github-bot nodejs-github-bot added commit-queue-failed An error occurred while landing this pull request using GitHub Actions. and removed commit-queue Add this label to land a pull request using GitHub Actions. labels Apr 11, 2024
@nodejs-github-bot
Copy link
Collaborator

Commit Queue failed
- Loading data for nodejs/node/pull/52427
✔  Done loading data for nodejs/node/pull/52427
----------------------------------- PR info ------------------------------------
Title      buffer: improve `btoa` performance (#52427)
Author     Yagiz Nizipli  (@anonrig)
Branch     anonrig:improve-btoa -> nodejs:main
Labels     buffer, c++, needs-ci
Commits    2
 - buffer: improve `btoa` performance
 - fix: return exception
Committers 2
 - Yagiz Nizipli 
 - Daniel Lemire 
PR-URL: https://github.com/nodejs/node/pull/52427
Reviewed-By: Daniel Lemire 
Reviewed-By: James M Snell 
------------------------------ Generated metadata ------------------------------
PR-URL: https://github.com/nodejs/node/pull/52427
Reviewed-By: Daniel Lemire 
Reviewed-By: James M Snell 
--------------------------------------------------------------------------------
   ℹ  This PR was created on Mon, 08 Apr 2024 17:24:56 GMT
   ✔  Approvals: 2
   ✔  - Daniel Lemire (@lemire): https://github.com/nodejs/node/pull/52427#pullrequestreview-1992265694
   ✔  - James M Snell (@jasnell) (TSC): https://github.com/nodejs/node/pull/52427#pullrequestreview-1993089038
   ⚠  GitHub cannot link the author of 'fix: return exception' to their GitHub account.
   ⚠  Please suggest them to take a look at https://github.com/nodejs/node/blob/99b1ada/doc/guides/contributing/pull-requests.md#step-1-fork
   ✔  Last GitHub CI successful
   ℹ  Last Full PR CI on 2024-04-11T02:25:23Z: https://ci.nodejs.org/job/node-test-pull-request/58258/
- Querying data for job/node-test-pull-request/58258/
   ✔  Last Jenkins CI successful
--------------------------------------------------------------------------------
   ✔  No git cherry-pick in progress
   ✔  No git am in progress
   ✔  No git rebase in progress
--------------------------------------------------------------------------------
- Bringing origin/main up to date...
From https://github.com/nodejs/node
 * branch                  main       -> FETCH_HEAD
✔  origin/main is now up-to-date
- Downloading patch for 52427
From https://github.com/nodejs/node
 * branch                  refs/pull/52427/merge -> FETCH_HEAD
✔  Fetched commits as ee4fa77624f1..6900b0dce7cf
--------------------------------------------------------------------------------
[main 50c914c521] buffer: improve `btoa` performance
 Author: Yagiz Nizipli 
 Date: Mon Apr 8 13:23:56 2024 -0400
 3 files changed, 81 insertions(+), 7 deletions(-)
 create mode 100644 benchmark/buffers/buffer-btoa.js
[main de7c8dcb6a] fix: return exception
 Author: Daniel Lemire 
 Date: Wed Apr 10 09:44:11 2024 -0400
 2 files changed, 6 insertions(+), 2 deletions(-)
   ✔  Patches applied
There are 2 commits in the PR. Attempting autorebase.
Rebasing (2/4)

Executing: git node land --amend --yes
--------------------------------- New Message ----------------------------------
buffer: improve btoa performance

PR-URL: #52427
Reviewed-By: Daniel Lemire daniel@lemire.me
Reviewed-By: James M Snell jasnell@gmail.com

[detached HEAD f1e7db2933] buffer: improve btoa performance
Author: Yagiz Nizipli yagiz@nizipli.com
Date: Mon Apr 8 13:23:56 2024 -0400
3 files changed, 81 insertions(+), 7 deletions(-)
create mode 100644 benchmark/buffers/buffer-btoa.js
Rebasing (3/4)
Rebasing (4/4)

Executing: git node land --amend --yes
--------------------------------- New Message ----------------------------------
fix: return exception

PR-URL: #52427
Reviewed-By: Daniel Lemire daniel@lemire.me
Reviewed-By: James M Snell jasnell@gmail.com

[detached HEAD a285a09324] fix: return exception
Author: Daniel Lemire dlemire@lemire.me
Date: Wed Apr 10 09:44:11 2024 -0400
2 files changed, 6 insertions(+), 2 deletions(-)

Successfully rebased and updated refs/heads/main.

ℹ Add commit-queue-squash label to land the PR as one commit, or commit-queue-rebase to land as separate commits.

https://github.com/nodejs/node/actions/runs/8645651485

@panva panva removed the commit-queue-failed An error occurred while landing this pull request using GitHub Actions. label Apr 11, 2024
@panva panva added commit-queue Add this label to land a pull request using GitHub Actions. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. labels Apr 11, 2024
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Apr 11, 2024
@nodejs-github-bot nodejs-github-bot merged commit 21211a3 into nodejs:main Apr 11, 2024
66 checks passed
@nodejs-github-bot
Copy link
Collaborator

Landed in 21211a3

marco-ippolito pushed a commit that referenced this pull request May 2, 2024
PR-URL: #52427
Reviewed-By: Daniel Lemire <daniel@lemire.me>
Reviewed-By: James M Snell <jasnell@gmail.com>
marco-ippolito pushed a commit that referenced this pull request May 3, 2024
PR-URL: #52427
Reviewed-By: Daniel Lemire <daniel@lemire.me>
Reviewed-By: James M Snell <jasnell@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. needs-ci PRs that need a full CI run.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants