Use SSE/MMX/Altivec for faster CTMF #432

Merged Feb 22, 2013 (8 commits)

@thouis
Contributor

thouis commented Feb 12, 2013

This speeds up CTMF by about a factor of 2 on my machine.

I don't have access to a non-SSE machine, so some testing is in order.

* Modified from code
* Copyright (C) 2006 Simon Perreault
*
* Reference: S. Perreault and P. Hébert, "Median Filtering in Constant Time",
Member

UTF 8

@stefanv
Member

stefanv commented Feb 15, 2013

Is anyone able to test this on Windows?

/cc @blink1073

@blink1073
Contributor

Will do, tonight

@@ -17,6 +17,11 @@ cimport cython
from libc.stdlib cimport malloc, free
from libc.string cimport memset

cdef extern from "_histogram.h":
    ctypedef unsigned short int uint16_t
Member

Use np.uint16_t here and below?

@blink1073
Contributor

Good to go, but two unrelated tests failed: test_rank.test_otsu and test_freeimage.test_metadata. Tested on 32-bit WinXP with Python 2.7.3, numpy 1.6.2, scipy 0.11.0, compiled with mingw32.

@stefanv
Member

stefanv commented Feb 19, 2013

ping @thouis

@thouis
Contributor Author

thouis commented Feb 20, 2013

I think I got everything above (plus cleaned up compilation a bit with some casts).

@@ -323,33 +327,17 @@ cdef void set_stride(Histograms *ph, SCoord *psc):
 #
 ############################################################################
 cdef inline np.int32_t tl_br_colidx(Histograms *ph, np.int32_t colidx):
-    return (colidx + 3*ph.radius + ph.current_row) % ph.stripe_length
+    return <np.int32_t> (colidx + 3*ph.radius + ph.current_row) % ph.stripe_length
Member

PEP8
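
For readers following the hunk above: a minimal Python sketch of the index arithmetic in `tl_br_colidx`, assuming `stripe_length` is the size of a circular buffer of columns that the shifted index wraps around in. The field names come from the diff; the numeric values are hypothetical and chosen only to show the wrap-around. This is an illustration, not the code from the PR.

```python
def tl_br_colidx(colidx, radius, current_row, stripe_length):
    """Shift a column index by 3*radius + current_row and wrap it
    into the range [0, stripe_length), mirroring the expression in the diff."""
    return (colidx + 3 * radius + current_row) % stripe_length


# Hypothetical values (not taken from the PR), used only to show wrapping.
radius = 2
current_row = 3
stripe_length = 10

indices = [
    tl_br_colidx(c, radius, current_row, stripe_length) for c in range(4)
]
print(indices)  # [9, 0, 1, 2]: the index wraps back to 0 after reaching 9
```

As I understand Cython's grammar, a `<type>` cast binds like a unary operator, so in the new line the cast applies to the parenthesized sum and the modulo is taken afterwards; the result is the same as long as the sum fits in `np.int32_t`.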

@stefanv
Member

stefanv commented Feb 20, 2013

This looks good to me! Thanks, @thouis. If you can, please run it through the PEP8 checker; otherwise I can do that when I merge.

@thouis thouis closed this Feb 20, 2013
@thouis thouis reopened this Feb 20, 2013
@thouis
Contributor Author

thouis commented Feb 20, 2013

Bah. I accidentally deleted the branch instead of pushing a new version to it. Anyway, fixed the PEP8 issues. Some long lines remain, but I thought they were more readable left unbroken.

@ahojnnes
Member

Merged.

ahojnnes added a commit that referenced this pull request Feb 22, 2013
Use SSE/MMX/Altivec for faster CTMF
@ahojnnes ahojnnes merged commit 9f14eda into scikit-image:master Feb 22, 2013