Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gaussian Blur implementation is very slow #4

Open
EhsanKia opened this issue Apr 29, 2019 · 2 comments
Open

Gaussian Blur implementation is very slow #4

EhsanKia opened this issue Apr 29, 2019 · 2 comments

Comments

@EhsanKia
Copy link

On a simple 1080x800px image, with kernel of size 10, it takes well over 15s. Not sure if there's more efficient ways to implement it, is it the same method used as the Java version?

I wonder if it's possible to use the browser's CSS filter for guassian blur, then transfer that back to a canvas.

@M-Valentino
Copy link

M-Valentino commented Sep 24, 2023

I looked through the code that does Gaussian blur, and the efficiency is at least O(n)^4 or larger. There is two for loops to iterate through the image, and there are two for loops to iterate through the kernel. It is possible to completely unroll the for loops, as in make them disappear, for iterating through the image kernel. That would make the efficiency at O(n)^2.

Example in Python:

gausKernel = [[1/16, 2/16, 1/16], [2/16, 4/16, 2/16], [1/16, 2/16, 1/16]]
accum = 0
data2 = copy.deepcopy(data)
for i in range(1, data.shape[0] - 1):
    for j in range(1, data.shape[1] - 1):
        accum += gausKernel[0][0] * data2[i - 1][j - 1][0]
        accum += gausKernel[0][1] * data2[i - 1][j][0]
        accum += gausKernel[0][2] * data2[i - 1][j + 1][0]
        accum += gausKernel[1][0] * data2[i][j - 1][0]
        accum += gausKernel[1][1] * data2[i][j][0]
        accum += gausKernel[1][2] * data2[i][j + 1][0]
        accum += gausKernel[2][0] * data2[i + 1][j - 1][0]
        accum += gausKernel[2][1] * data2[i + 1][j][0]
        accum += gausKernel[2][2] * data2[i + 1][j + 1][0]
        accum = round(accum)
        data2[i][j][0] = accum
        data2[i][j][1] = accum
        data2[i][j][2] = accum
        accum = 0
        
        
image3 = Image.fromarray(data2)
display(image3)

For variable kernel sizes, it would obviously make the code larger to maintain since you would need multiple functions.

@EhsanKia
Copy link
Author

You're not magically reducing the time complexity by unrolling the inner loops. It wasn't really O(n^4) before, it was O(w*h*k^2) and it still is after your change. But yeah this is a very slow operation you'd probably want to do on the gpu in parallel.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants