use simd to optimize utf8 validation. #437

Closed
Liyixin95 opened this issue Oct 31, 2023 · 3 comments

Comments

@Liyixin95
Contributor

benchmark env

benchmark suite

id: 10, 12, 14

my computer

Windows 11
13th Gen Intel(R) Core(TM) i7-13700H
32.0 GB RAM, 3200 MHz

benchmark result

original

Running tests nonverbosely...

Running Flat BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Flat BSON Decoding -- Score: 367.726 MB/s, Median Iteration Time: 0.205s

Running Deep BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Deep BSON Decoding -- Score: 67.747 MB/s, Median Iteration Time: 0.290s

Running Full BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Full BSON Decoding -- Score: 274.048 MB/s, Median Iteration Time: 0.209s

BSONBench Score = 236.507 MB/s

bstr

Running tests nonverbosely...

Running Flat BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Flat BSON Decoding -- Score: 394.848 MB/s, Median Iteration Time: 0.191s

Running Deep BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Deep BSON Decoding -- Score: 73.264 MB/s, Median Iteration Time: 0.268s

Running Full BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Full BSON Decoding -- Score: 277.837 MB/s, Median Iteration Time: 0.206s

BSONBench Score = 248.650 MB/s

bstr needs to be pinned to <1.7.0 to satisfy the MSRV requirement.

simdutf8

Running tests nonverbosely...

Running Flat BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Flat BSON Decoding -- Score: 404.186 MB/s, Median Iteration Time: 0.186s

Running Deep BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Deep BSON Decoding -- Score: 74.206 MB/s, Median Iteration Time: 0.265s

Running Full BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Full BSON Decoding -- Score: 281.251 MB/s, Median Iteration Time: 0.204s

BSONBench Score = 253.214 MB/s

simdutf8 does not provide an equivalent of from_utf8_lossy, so only the strict (non-lossy) conversion was optimized.
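To illustrate that split, here is a minimal sketch of how the two paths could be separated. The names `Utf8Mode` and `decode_utf8` are hypothetical and not taken from the bson crate. The strict validator is passed in as a parameter, with `std::str::from_utf8` as a stand-in so the snippet builds without extra dependencies. `simdutf8::basic::from_utf8` has the same `&[u8] -> Result<&str, _>` shape and could be passed instead, assuming the crate is added as a dependency.

```rust
use std::borrow::Cow;

/// Hypothetical switch between the two decoding behaviours discussed above.
#[derive(Clone, Copy, Debug)]
pub enum Utf8Mode {
    /// Reject invalid UTF-8 with an error.
    Strict,
    /// Replace invalid sequences with U+FFFD.
    Lossy,
}

/// Decode `bytes` as UTF-8.
///
/// `validate` is the pluggable strict validator. Pass `std::str::from_utf8`
/// for the standard behaviour, or (assumption: simdutf8 is a dependency)
/// `simdutf8::basic::from_utf8` for the SIMD-accelerated one.
pub fn decode_utf8<'a, E>(
    bytes: &'a [u8],
    mode: Utf8Mode,
    validate: impl Fn(&'a [u8]) -> Result<&'a str, E>,
) -> Result<Cow<'a, str>, E> {
    match mode {
        // Only this branch goes through the pluggable (possibly SIMD) validator.
        Utf8Mode::Strict => validate(bytes).map(Cow::Borrowed),
        // The lossy path stays on the standard library and never fails.
        Utf8Mode::Lossy => Ok(String::from_utf8_lossy(bytes)),
    }
}
```

A call such as `decode_utf8(bytes, Utf8Mode::Strict, std::str::from_utf8)` borrows the input when it is valid and returns the validator's error otherwise. The lossy mode ignores the validator entirely, which is why it sees no speedup in this arrangement.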

conclusion

bstr appears to be slower than simdutf8 on my computer, possibly because it does not support AVX2. So I lean toward using simdutf8 to optimize UTF-8 validation, but the final decision is up to your team.
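For reference, the relative gains implied by the BSONBench totals can be computed with a small helper. This is only a sketch: `gain_percent` and the constant names are made up for illustration, and the values are copied from the three runs reported above.

```rust
/// Relative throughput gain of `new` over `baseline`, in percent.
pub fn gain_percent(baseline: f64, new: f64) -> f64 {
    (new - baseline) / baseline * 100.0
}

/// BSONBench totals (MB/s) copied from the runs reported above.
pub const ORIGINAL: f64 = 236.507;
pub const BSTR: f64 = 248.650;
pub const SIMDUTF8: f64 = 253.214;
```

With these totals, `gain_percent(ORIGINAL, BSTR)` comes out at roughly 5% and `gain_percent(ORIGINAL, SIMDUTF8)` at roughly 7%. Both are single-machine numbers, so they would need to be reproduced elsewhere before drawing firm conclusions.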

@isabelatkinson
Contributor

Hi @Liyixin95, thanks for running these benchmarks! Would you be interested in making a PR to switch over to one of the libraries you profiled? We don't currently have the bandwidth to do this but would be happy to review it. We'd also want to run these benchmarks on our machines to ensure that these performance improvements can be reproduced. Otherwise I can file a ticket to consider doing this in the future.


There has not been any recent activity on this ticket, so we are marking it as stale. If we do not hear anything further from you, this issue will be automatically closed in one week.

@github-actions github-actions bot added the Stale label Nov 15, 2023

There has not been any recent activity on this ticket, so we are closing it. Thanks for reaching out and please feel free to file a new issue if you have further questions.
