Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimised extend #127

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from
Draft

Optimised extend #127

wants to merge 8 commits into from

Conversation

jdonszelmann
Copy link
Collaborator

@jdonszelmann jdonszelmann commented Sep 15, 2023

@github-actions github-actions bot temporarily deployed to commit September 15, 2023 22:51 Inactive
@github-actions
Copy link

github-actions bot commented Sep 15, 2023

@github-actions github-actions bot temporarily deployed to pull request September 15, 2023 22:51 Inactive
///
/// Depending on the RingBuffer implementation, may be faster than inserting items in a loop
///
/// Note that this function is same as [`extend`] except that it is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this actually specialized now or will it be?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in constgeneric yes

where
T: Default,
where
T: Default,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rustfmt?

{
#[allow(clippy::let_unit_value)]
let _ = Self::ERROR_CAPACITY_IS_NOT_ALLOWED_TO_BE_ZERO;
let _ = Self::ERROR_CAPACITY_IS_NOT_ALLOWED_TO_BE_ZERO;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems weird?

@@ -418,6 +634,94 @@ mod tests {
}
}

struct WeirdIterator<T: IntoIterator>(<T as IntoIterator>::IntoIter, SizeHint);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weirderator

@jdonszelmann jdonszelmann marked this pull request as draft September 15, 2023 23:48
@github-actions github-actions bot temporarily deployed to commit September 15, 2023 23:49 Inactive
@github-actions github-actions bot temporarily deployed to pull request September 15, 2023 23:49 Inactive
@github-actions github-actions bot temporarily deployed to commit September 16, 2023 09:11 Inactive
@github-actions github-actions bot temporarily deployed to pull request September 16, 2023 09:11 Inactive
@jdonszelmann
Copy link
Collaborator Author

jdonszelmann commented Sep 16, 2023

All improvements are over a baseline of extend-by-push

Batch Size = 1

extend too many         time:   [7.2581 µs 7.3719 µs 7.4832 µs]
                        change: [-50.503% -49.899% -49.252%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  6 (6.00%) high mild
  1 (1.00%) high severe

extend many too many    time:   [7.2662 µs 7.3626 µs 7.4526 µs]
                        change: [-65.328% -64.824% -64.132%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

extend exact cap        time:   [7.2107 µs 7.3238 µs 7.4474 µs]
                        change: [-45.954% -45.304% -44.613%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

extend too few          time:   [4.4391 µs 4.4725 µs 4.5156 µs]
                        change: [-43.002% -41.961% -40.831%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

extend after one        time:   [10.913 µs 10.960 µs 11.013 µs]
                        change: [+47.418% +50.702% +54.065%] (p = 0.00 < 0.05)
                        Performance has regressed.

Batch size = 32

extend too many         time:   [8.7412 µs 8.8189 µs 8.8939 µs]
                        change: [-40.068% -39.467% -38.953%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

extend many too many    time:   [8.7698 µs 8.8540 µs 8.9391 µs]
                        change: [-57.313% -56.996% -56.660%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

extend exact cap        time:   [8.7375 µs 8.8073 µs 8.8730 µs]
                        change: [-33.538% -32.973% -32.406%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

extend too few          time:   [5.1076 µs 5.1550 µs 5.2014 µs]
                        change: [-39.999% -37.896% -35.822%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 37 outliers among 100 measurements (37.00%)
  23 (23.00%) low severe
  12 (12.00%) high mild
  2 (2.00%) high severe

extend after one        time:   [5.9613 µs 5.9809 µs 5.9962 µs]
                        change: [-26.778% -24.424% -22.282%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 25 outliers among 100 measurements (25.00%)
  24 (24.00%) low severe
  1 (1.00%) high mild

Batch size = 256

extend too many         time:   [7.0522 µs 7.1662 µs 7.2915 µs]
                        change: [-51.959% -51.049% -50.207%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) high mild
  4 (4.00%) high severe

extend many too many    time:   [6.8534 µs 6.9608 µs 7.0709 µs]
                        change: [-66.835% -66.522% -66.186%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  6 (6.00%) high mild
  1 (1.00%) high severe

extend exact cap        time:   [6.8443 µs 6.9401 µs 7.0342 µs]
                        change: [-48.258% -47.694% -47.152%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  6 (6.00%) high mild
  1 (1.00%) high severe

extend too few          time:   [4.3663 µs 4.3792 µs 4.3902 µs]
                        change: [-46.243% -44.938% -43.633%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 23 outliers among 100 measurements (23.00%)
  23 (23.00%) low severe

extend after one        time:   [7.0197 µs 7.0315 µs 7.0445 µs]
                        change: [-7.1696% -5.3259% -3.3635%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Batch size = 1024

extend too many         time:   [6.8468 µs 7.0021 µs 7.1560 µs]
                        change: [-54.049% -53.341% -52.581%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

extend many too many    time:   [6.8626 µs 6.9784 µs 7.0967 µs]
                        change: [-66.826% -66.451% -66.018%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

extend exact cap        time:   [6.9517 µs 7.0748 µs 7.1877 µs]
                        change: [-47.808% -47.035% -46.368%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

extend too few          time:   [4.3263 µs 4.3450 µs 4.3627 µs]
                        change: [-47.223% -45.877% -44.443%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 25 outliers among 100 measurements (25.00%)
  23 (23.00%) low severe
  2 (2.00%) high mild

extend after one        time:   [8.1760 µs 8.2432 µs 8.3070 µs]
                        change: [+2.1848% +4.7733% +7.4532%] (p = 0.00 < 0.05)
                        Performance has regressed.

Binary search to a good batch Size; Batch size = 64

extend too many         time:   [7.4389 µs 7.6824 µs 7.9425 µs]
                        change: [-50.277% -49.191% -47.914%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) high mild
  6 (6.00%) high severe

extend many too many    time:   [7.3194 µs 7.4105 µs 7.5069 µs]
                        change: [-64.339% -64.007% -63.710%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

extend exact cap        time:   [7.1240 µs 7.2078 µs 7.2905 µs]
                        change: [-45.487% -44.958% -44.426%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

extend too few          time:   [4.3125 µs 4.3217 µs 4.3302 µs]
                        change: [-45.560% -44.525% -43.404%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 24 outliers among 100 measurements (24.00%)
  23 (23.00%) low severe
  1 (1.00%) high mild

extend after one        time:   [6.9133 µs 6.9211 µs 6.9297 µs]
                        change: [-7.6948% -5.7499% -3.7005%] (p = 0.00 < 0.05)
                        Performance has improved.

Batch size = 16

extend too many         time:   [8.3775 µs 8.4665 µs 8.5494 µs]
                        change: [-42.998% -42.399% -41.812%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

extend many too many    time:   [8.4321 µs 8.5265 µs 8.6122 µs]
                        change: [-59.466% -59.067% -58.669%] (p = 0.00 < 0.05)
                        Performance has improved.

extend exact cap        time:   [8.2966 µs 8.4105 µs 8.5225 µs]
                        change: [-36.950% -36.340% -35.688%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

extend too few          time:   [4.8563 µs 4.8749 µs 4.8912 µs]
                        change: [-40.031% -38.532% -37.025%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 24 outliers among 100 measurements (24.00%)
  23 (23.00%) low severe
  1 (1.00%) high severe

extend after one        time:   [6.6999 µs 6.7177 µs 6.7321 µs]
                        change: [-15.276% -13.279% -11.167%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 25 outliers among 100 measurements (25.00%)
  25 (25.00%) low mild

Batch size = 30

extend too many         time:   [7.8955 µs 7.9927 µs 8.0828 µs]
                        change: [-46.726% -46.068% -45.483%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

extend many too many    time:   [7.8121 µs 7.9254 µs 8.0271 µs]
                        change: [-62.714% -62.287% -61.863%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

extend exact cap        time:   [7.7416 µs 7.8717 µs 7.9940 µs]
                        change: [-41.736% -41.071% -40.390%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

extend too few          time:   [4.3390 µs 4.3541 µs 4.3671 µs]
                        change: [-47.165% -45.753% -44.411%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 22 outliers among 100 measurements (22.00%)
  22 (22.00%) low severe

extend after one        time:   [5.4803 µs 5.4959 µs 5.5123 µs]
                        change: [-27.750% -26.327% -24.778%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

@github-actions github-actions bot temporarily deployed to pull request September 16, 2023 11:40 Inactive
@github-actions github-actions bot temporarily deployed to commit September 16, 2023 11:40 Inactive
@jdonszelmann
Copy link
Collaborator Author

Although surprising to me, a batch size of 30 seems to work very well for large extends when the buffer does not start out empty. This feature is now behind a feature flag, since technically it's not 100% tested but it seems to perform very well in all tests I could throw at it.

@github-actions github-actions bot temporarily deployed to commit September 16, 2023 11:43 Inactive
@github-actions github-actions bot temporarily deployed to pull request September 16, 2023 11:43 Inactive
@github-actions github-actions bot temporarily deployed to pull request September 16, 2023 11:50 Inactive
@github-actions github-actions bot temporarily deployed to commit September 16, 2023 11:50 Inactive
// Safety: clear above
unsafe { self.finish_iter::<BATCH_SIZE>(iter) };
} else if self.is_empty() {
self.clear();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can move this into finish_iter and make it safe. Maybe rename it to new_from_iter or something.

self.clear();

// Safety: clear above
unsafe { self.finish_iter::<BATCH_SIZE>(iter) };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not extend_batched_internal?

// repeat until we find an empty batch
loop {
// fill up a batch
let how_full = Self::fill_batch(&mut batch, &mut other);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code might be simpler if you fill the batch with min(BATCH_SIZE, CAP - len) instead. Then you never have to split the batch up and stuff.

If I understand correctly, finish_iter is necessary because you need the writeptr not to be equal to readptr or something? Maybe if you do the min thing, that might not be necessary? In any case, might be nice to document why and in what cases you need finish_iter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No wait, I mean min(BATCH_SIZE, CAP - mod(writeptr, CAP)) I think. Whatever the number is of elements that fit until the end of the allocated space. So you don't have to split.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Better bulk insert APIs
2 participants