Better autovectorization of delta_in_planes
By taking a subslice of each row of the plane before calculating the
delta, the compiler is able to use SIMD instructions; for some reason it
cannot when the iterator is limited with `.take()`. When compiling with
AVX2 enabled, this results in about a 5x speedup for `delta_in_planes`.
redzic committed Sep 16, 2021
1 parent 2ec4e67 commit 73c6bcc
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion src/scenechange/mod.rs
@@ -426,10 +426,11 @@ impl<T: Pixel> SceneChangeDetector<T> {
 let lines = plane1.rows_iter().zip(plane2.rows_iter());
 
 for (l1, l2) in lines {
+  let l1 = l1.get(..plane1.cfg.width).unwrap_or(l1);
+  let l2 = l2.get(..plane1.cfg.width).unwrap_or(l2);
   let delta_line = l1
     .iter()
     .zip(l2.iter())
-    .take(plane1.cfg.width)
     .map(|(&p1, &p2)| {
       (i16::cast_from(p1) - i16::cast_from(p2)).abs() as u32
     })
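
The change can be illustrated in isolation with a minimal sketch. It makes several assumptions: pixels are plain `u8` instead of the generic `Pixel` type, the function names `delta_line_take` and `delta_line_subslice` are hypothetical, and the per-pixel differences are summed with `.sum()` because the hunk is cut off before the end of the iterator chain. The sketch only shows that both forms compute the same value; whether the second form is actually vectorized depends on the compiler version and target features.

```rust
/// Old form: walk both rows in lockstep and stop after `width` pixels
/// by limiting the zipped iterator with `.take()`.
fn delta_line_take(l1: &[u8], l2: &[u8], width: usize) -> u32 {
    l1.iter()
        .zip(l2.iter())
        .take(width)
        .map(|(&p1, &p2)| (i16::from(p1) - i16::from(p2)).abs() as u32)
        .sum() // assumption: the real code's reduction is not shown in the hunk
}

/// New form: shrink each row to `width` pixels first, then iterate
/// over the subslices without `.take()`.
fn delta_line_subslice(l1: &[u8], l2: &[u8], width: usize) -> u32 {
    // `get(..width)` returns `None` if the row is shorter than `width`;
    // in that case fall back to the whole row, as the commit does.
    let l1 = l1.get(..width).unwrap_or(l1);
    let l2 = l2.get(..width).unwrap_or(l2);
    l1.iter()
        .zip(l2.iter())
        .map(|(&p1, &p2)| (i16::from(p1) - i16::from(p2)).abs() as u32)
        .sum()
}

fn main() {
    // Rows are 8 pixels wide in memory, but only the first 5 are visible;
    // the trailing padding must not contribute to the delta.
    let width = 5;
    let row1: [u8; 8] = [10, 20, 30, 40, 50, 255, 255, 255];
    let row2: [u8; 8] = [12, 18, 30, 45, 40, 0, 0, 0];

    let old = delta_line_take(&row1, &row2, width);
    let new = delta_line_subslice(&row1, &row2, width);
    assert_eq!(old, new);
    println!("delta = {}", new);
}
```

To compare the generated code for the two forms, one option is to build with optimizations and AVX2 enabled, for example with `RUSTFLAGS="-C target-feature=+avx2"`, and inspect the assembly.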