Skip to content

RUST-1473 Add GridFS Benchmarks #772

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 28, 2022

Conversation

isabelatkinson
Copy link
Contributor

@isabelatkinson isabelatkinson commented Nov 8, 2022

@isabelatkinson isabelatkinson requested review from a team and patrickfreed and removed request for a team November 9, 2022 00:31
@isabelatkinson isabelatkinson marked this pull request as ready for review November 9, 2022 17:36
Copy link
Contributor

@patrickfreed patrickfreed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes all look good to me, but from looking at the patch link, it doesn't seem like the GridFS benchmarks are actually being run. Is this patch out of date? e.g. https://evergreen.mongodb.com/task_log_raw/mongo_rust_driver_benchmarking_driver_benchmarks__async_runtime~tokio_auth_and_tls~auth_and_tls_os~ubuntu_18.04_benchmark_4.2_replica_set_patch_1037712e4fb060c406e3f0cdeeaa0fdbcb762b84_636ae9b9306615468cacbff2_22_11_08_23_44_00/0?type=T#L1130

Also, it looks like GridFsDownloadStream needs to be publicly exported at the gridfs module level, though that's not strictly related to this PR.

@isabelatkinson
Copy link
Contributor Author

new patch: https://spruce.mongodb.com/version/6373fe0ed6d80a307003bef9/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC

Also, it looks like GridFsDownloadStream needs to be publicly exported at the gridfs module level, though that's not strictly related to this PR.

this is fixed in the upload stream PR

Copy link
Contributor

@patrickfreed patrickfreed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes LGTM!

I compared the results of our benchmark to the Node driver's, and it does seem like we're a bit slower in GridFS:

  • Download: 304 MB/s (Rust) vs 564 MB/s (Node)
  • Upload: 162 MB/s (Rust) vs 371 MB/s (Node)

Unfortunately, the Node driver doesn't seem to have results for the parallel GridFS benchmarks, so not sure how we compare there.

For the Download one, it looks like the difference might be down to us buffering the whole 50MB file in memory. It's possible we could improve this by pre-allocating the vec that we're downloading to. We could also have a dummy AsyncWrite implementation that just discards the bytes, but that seemed not in the spirit of the benchmark (I tried this and did see much faster numbers, btw).

I honestly can't tell why the Upload shows such a stark difference, especially given that Rust way outperforms Node in the LargeDocInsertOne benchmark. I think we should file a ticket and look into it in the future if GridFS performance is ever deemed an issue by users.

Here's the Node driver patch I used as a reference: https://evergreen.mongodb.com/task/mongo_node_driver_next_performance_tests_run_spec_benchmark_tests_8254575d7a33539c7e7e5cb0e3b621f49a7336e3_22_11_16_20_41_05

Copy link
Contributor

@kmahar kmahar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one nit but LGTM besides that, don't need to re-review

@@ -820,6 +820,7 @@ pub enum GridFsErrorKind {
WrongSizeChunk {
actual_size: usize,
expected_size: u32,
n: u32,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could we mention what n is in the docstring?

Copy link
Contributor

@abr-egn abr-egn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@isabelatkinson isabelatkinson merged commit f4b855b into mongodb:main Nov 28, 2022
@isabelatkinson isabelatkinson deleted the gridfs-benchmarks branch March 6, 2023 18:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants