Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Float.Array.blit: fix case when both arrays coincide #10070

Merged
merged 3 commits into from Dec 6, 2020

Conversation

nojb
Copy link
Contributor

@nojb nojb commented Dec 3, 2020

In spite of what the docstring says, Float.Array.blit does not correctly handle the case when src and dst are the same array.

@nojb nojb changed the title Float.Array.blit: fix case when both array coincide Float.Array.blit: fix case when both arrays coincide Dec 3, 2020
@alainfrisch
Copy link
Contributor

The caml_array_blit C primitives relies on memmove. Perhaps it would be more efficient? (The current implementation cannot be used with no-flat-float-array mode, one would need to restore the tag check, or create another primitive.)

@nojb
Copy link
Contributor Author

nojb commented Dec 3, 2020

The caml_array_blit C primitives relies on memmove. Perhaps it would be more efficient? (The current implementation cannot be used with no-flat-float-array mode, one would need to restore the tag check, or create another primitive.)

I also wondered about this. The following microbenchmark

let n = 1000000

let () =
  let a = Float.Array.create (2 * n) in
  let b = Float.Array.copy a in
  for i = 1 to 1000 do
    Float.Array.blit a 0 b n n
  done

gives

real    0m1.070s
user    0m1.064s
sys     0m0.008s

with memmove, and

real    0m1.273s
user    0m1.274s
sys     0m0.001s

with the OCaml implementation (ie a ~1.2x speedup for memmove).

In the case of overlapping arrays:

let n = 1000000

let () =
  let a = Float.Array.create (2 * n) in
  let b = a in
  for i = 1 to 1000 do
    Float.Array.blit a 0 b n n
  done

I got:

real    0m0.828s
user    0m0.822s
sys     0m0.008s

with memmove and

real    0m1.273s
user    0m1.274s
sys     0m0.001s

with the OCaml implementation (ie ~1.5x speedup for memmove).

Given this, I pushed one more commit changing the implementation to use memmove.

@nojb
Copy link
Contributor Author

nojb commented Dec 4, 2020

This bug is present since 4.08. It would be nice to have the bugfix in 4.12 cc @Octachron

@xavierleroy
Copy link
Contributor

Only a 1.2 - 1.5 speedup by switching to memmove()? Not so bad for the naive OCaml code...

Copy link
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Please bow to check-typo and add a Changes entry.

I'm in favor of having this PR in 4.12, since it fixes a bug.

@xavierleroy xavierleroy added the bug label Dec 5, 2020
@xavierleroy xavierleroy added this to the 4.12 milestone Dec 5, 2020
@nojb
Copy link
Contributor Author

nojb commented Dec 6, 2020

Thanks for the review.

Looks good to me. Please bow to check-typo and add a Changes entry.

Done.

I'm in favor of having this PR in 4.12, since it fixes a bug.

I will merge to trunk as soon as the CI passes, but I await the green light from @Octachron before cherry-picking to 4.12. Also, should I backport to 4.08, 4.09, 4.10 and 4.11?

@nojb nojb merged commit dbb8dca into ocaml:trunk Dec 6, 2020
@nojb nojb deleted the nojebar_floatarray_blit branch December 6, 2020 11:44
@Octachron
Copy link
Member

I also think that this is a fine bug fix for 4.12 .

nojb added a commit that referenced this pull request Dec 7, 2020
* Add test
* Float.Array.blit: handle the case of overlapping arrays
* Changes
@nojb
Copy link
Contributor Author

nojb commented Dec 7, 2020

Cherry-picked to 4.12 e307d1e

dbuenzli pushed a commit to dbuenzli/ocaml that referenced this pull request Mar 25, 2021
* Add test
* Float.Array.blit: handle the case of overlapping arrays
* Changes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants