Hoist index computations
astropy/convolution/boundary_none.pyx::convolve2d_boundary_none()
Is (at least) more performant as the following (though this is probably (hopefully) optimized as such by the compiler anyhow)
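The snippets this item originally compared are not preserved here, so the following is only a hypothetical C sketch of the general idea, not the code from `boundary_none.pyx`. The function names, the row-major layout, and the unflipped kernel are all assumptions made for illustration. The first version recomputes every index expression in the innermost loop; the second hoists the parts that depend only on `i`, `j`, or `ii` out to the loop level where they are invariant.

```c
#include <stddef.h>

/* Hypothetical sketch (names and layout are assumptions): kernel-weighted
 * sum over the interior of a row-major image, with no boundary handling.
 * Kernel dimensions nkx and nky are assumed to be odd. */

/* Baseline: all index arithmetic is redone in the innermost loop. */
void conv2d_naive(const double *f, size_t nx, size_t ny,
                  const double *g, size_t nkx, size_t nky, double *out)
{
    const size_t wkx = nkx / 2, wky = nky / 2;
    for (size_t i = wkx; i + wkx < nx; i++)
        for (size_t j = wky; j + wky < ny; j++) {
            double top = 0.0;
            for (size_t ii = 0; ii < nkx; ii++)
                for (size_t jj = 0; jj < nky; jj++)
                    top += f[(i - wkx + ii) * ny + (j - wky + jj)]
                         * g[ii * nky + jj];
            out[i * ny + j] = top;
        }
}

/* Hoisted: each offset is computed once, at the outermost loop level
 * where it is invariant. */
void conv2d_hoisted(const double *f, size_t nx, size_t ny,
                    const double *g, size_t nkx, size_t nky, double *out)
{
    const size_t wkx = nkx / 2, wky = nky / 2;
    for (size_t i = wkx; i + wkx < nx; i++) {
        const size_t iimin = i - wkx;          /* depends on i only */
        for (size_t j = wky; j + wky < ny; j++) {
            const size_t jjmin = j - wky;      /* depends on j only */
            double top = 0.0;
            for (size_t ii = 0; ii < nkx; ii++) {
                /* Row base pointers depend on ii, not on jj. */
                const double *frow = f + (iimin + ii) * ny + jjmin;
                const double *grow = g + ii * nky;
                for (size_t jj = 0; jj < nky; jj++)
                    top += frow[jj] * grow[jj];
            }
            out[i * ny + j] = top;
        }
    }
}
```

Both functions perform the same additions in the same order, so they should give identical results; an optimizing compiler may well turn the first form into the second on its own, which is the caveat raised above.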
Hoist kernel.sum()
c.f. #4
`bot += ker` (sum of the kernel) is being re-computed for every element in the image. This is unnecessary. Since `bot` is being reset to `0` for each element, it is highly unlikely that the C compiler is clever enough to optimize this out by hoisting it above the outermost loop.
Hmmm, maybe not. The interpolation scheme seems to prevent this hoist and is the reason that the kernel summation exists in the Cython code. If the image element is `NaN` then the kernel element is not incorporated in the sum. (Only an issue for NaN interpolation.)
Shortcircuit `normalize_by_kernel`
c.f. #4
If this was straight C, I would make this function `inline` and then wrap it with two funcs, or with a single one that branches on `normalize_by_kernel`, so as to shortcircuit the internal conditional pre-call. Since the function is now inline, the C compiler would (should) remove the conditional from within the loop altogether. This way the performance increase can be obtained whilst keeping the code in a single place, which removes duplication and aids readability and maintenance.
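A minimal C sketch of the inline-plus-wrappers idea, reduced to one dimension to keep it short. Every name other than `normalize_by_kernel`, `bot`, and `ker` is hypothetical, and the fallback to the input value when `bot` is zero is an assumption, not a statement about what the Cython code does. The sketch also shows why `bot` cannot simply be hoisted when NaN interpolation is in play: it is accumulated only over non-NaN image elements, so it depends on the data.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical shared core. Each wrapper passes a literal for
 * normalize_by_kernel, so once this is inlined the compiler can (should)
 * fold the branch away instead of testing it on every iteration. */
static inline void convolve1d_core(const double *f, size_t n,
                                   const double *g, size_t nk,
                                   double *out, const int normalize_by_kernel)
{
    const size_t wk = nk / 2;                  /* nk assumed odd */
    for (size_t i = wk; i + wk < n; i++) {
        double top = 0.0, bot = 0.0;
        for (size_t ii = 0; ii < nk; ii++) {
            const double val = f[i - wk + ii];
            const double ker = g[ii];
            if (!isnan(val)) {                 /* NaN elements are skipped */
                top += val * ker;
                bot += ker;                    /* hence bot is data dependent */
            }
        }
        if (normalize_by_kernel)
            out[i] = (bot != 0.0) ? top / bot : f[i];  /* fallback: assumption */
        else
            out[i] = top;
    }
}

/* Thin wrappers: the loop body lives in one place only. */
void convolve1d_normalized(const double *f, size_t n,
                           const double *g, size_t nk, double *out)
{
    convolve1d_core(f, n, g, nk, out, 1);
}

void convolve1d_unnormalized(const double *f, size_t n,
                             const double *g, size_t nk, double *out)
{
    convolve1d_core(f, n, g, nk, out, 0);
}
```

Whether the branch is actually eliminated depends on the compiler and optimization level, so it is worth checking the generated code rather than assuming it.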