Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: ndimage: log(n) implementation for 1D rank filter #20543

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
25 changes: 23 additions & 2 deletions scipy/ndimage/_filters.py
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,7 @@
from . import _ni_support
from . import _nd_image
from . import _ni_docstrings
from . import _rank_filter_1d

__all__ = ['correlate1d', 'convolve1d', 'gaussian_filter1d', 'gaussian_filter',
'prewitt', 'sobel', 'generic_laplace', 'laplace',
Expand Down Expand Up @@ -1486,8 +1487,28 @@ def _rank_filter(input, rank, size=None, footprint=None, output=None,
"A sequence of modes is not supported by non-separable rank "
"filters")
mode = _ni_support._extend_mode_to_code(mode)
_nd_image.rank_filter(input, rank, footprint, output, mode, cval,
origins)
if input.ndim == 1:
rank = int(rank)
origin = int(origin)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should already be integers. Why was this needed?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My assumption was similar to yours but I have added it after failing some unit-test. This was back when I used Cython as a glue and it marked that my input is np.int64 rather than int. I did not want to look for the origin of the problem as I did not want to modify anything unrelated to this pull request. I hoped someone would pay attention and it would be sufficiently visible to be corrected in future.
Anyway, I do not see it now (I assume that Numpy-C API is not sensitive to it) and therefore I removed the casting.

# legacy mode handling
if mode == 6:
mode = 4
if mode == 5:
mode = 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct indeed, as explained in the user guide: http://scipy.github.io/devdocs/tutorial/ndimage.html. It looks a bit out of place to do this here though. Not sure why it could not be done inside _extend_mode_to_code if the modes are actually identical.

casting_cond = input.dtype.name not in ['int64', 'float64', 'float32']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer using the actual dtypes: input.dtype not in (np.int64, np.float64, np.float32)

if casting_cond:
x = input.astype('int64')
x_out = np.empty_like(x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks to me like this should be using np.result_type to determine what the correct output type is. E.g., for other floating-point dtypes, float32 or float64 will be expected, not int64.

else:
x = input
x_out = output
cval = x.dtype.type(cval)
_rank_filter_1d.rank_filter(x, rank, footprint.size, x_out, mode, cval,
origin)
if casting_cond:
np.copyto(output, x_out, casting='unsafe')
else:
_nd_image.rank_filter(input, rank, footprint, output, mode, cval, origins)
if temp_needed:
temp[...] = output
output = temp
Expand Down
7 changes: 7 additions & 0 deletions scipy/ndimage/meson.build
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,13 @@ py3.extension_module('_ni_label',
subdir: 'scipy/ndimage'
)

py3.extension_module('_rank_filter_1d',
'src/_rank_filter_1d.cpp',
install: true,
dependencies: np_dep,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing a line here that is present for all other extension modules:

    link_args: version_link_args,

could you please add that?

subdir: 'scipy/ndimage'
)

py3.extension_module('_ctest',
'src/_ctest.c',
dependencies: np_dep,
Expand Down
304 changes: 304 additions & 0 deletions scipy/ndimage/src/_rank_filter_1d.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,304 @@
/*
Started working on https://ideone.com/8VVEa (Copyright (c) 2011 ashelly.myopenid.com
under <http://www.opensource.org/licenses/mit-license>),
I optimized by restriction of cases and proper initialization,
also adapted for rank filter rather than the original median filter.
Allowed different boundary conditions.
Moved to C++ for polymorphism and added C-Numpy API.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remove the comments on evolution of the code, and copy exactly the copyright comment that was in the original file?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it ok with you? I would not insist on the second line but some major changes have been made in did.

//Copyright (c) 2011 ashelly.myopenid.com under http://www.opensource.org/licenses/mit-license
//Modified in 2024 by Gideon Genadi Kogan

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is fine with me.

*/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style comment: could you please run clang-format over this file?

#include "Python.h"
#include "numpy/arrayobject.h"

#include <stdlib.h>
#include <stdio.h>


struct Mediator//this is used for rank keeping
{
int* pos; //index into `heap` for each value
int* heap; //max/rank/min heap holding indexes into `data`.
int N; //allocated size.
int idx; //position in circular queue
int minCt; //count of items in min heap
int maxCt; //count of items in max heap
};

typedef enum {
NEAREST = 0,
WRAP = 1,
REFLECT = 2,
MIRROR = 3,
CONSTANT = 4,
} Mode;

/*--- Helper Functions ---*/

//returns 1 if heap[i] < heap[j]
template <typename T>
inline int mmless(T* data, Mediator* m, int i, int j)
{
return (data[m->heap[i]] < data[m->heap[j]]);
}

//swaps items i&j in heap, maintains indexes
int mmexchange(Mediator* m, int i, int j)
{
int t = m->heap[i];
m->heap[i] = m->heap[j];
m->heap[j] = t;
m->pos[m->heap[i]] = i;
m->pos[m->heap[j]] = j;
return 1;
}

//swaps items i & j if i < j; returns true if swapped
template <typename T>
inline int mmCmpExch(T* data, Mediator* m, int i, int j)
{
return (mmless(data, m,i,j) && mmexchange(m,i,j));
}

//maintains minheap property for all items below i.
template <typename T>
void minSortDown(T* data, Mediator* m, int i)
{
for (i*=2; i <= m->minCt; i*=2)
{ if (i < m->minCt && mmless(data, m, i+1, i)) { ++i; }
if (!mmCmpExch(data, m, i, i/2)) { break; }
}
}

//maintains maxheap property for all items below i. (negative indexes)
template <typename T>
void maxSortDown(T* data, Mediator* m, int i)
{
for (i*=2; i >= -m->maxCt; i*=2)
{ if (i > -m->maxCt && mmless(data, m, i, i-1)) { --i;}
if (!mmCmpExch(data, m, i/2, i)) { break; }
}
}

//maintains minheap property for all items above i, including the rank
//returns true if rank changed
template <typename T>
inline int minSortUp(T* data, Mediator* m, int i)
{
while (i>0 && mmCmpExch(data, m, i, i/2)) i/=2;
return (i==0);
}

//maintains maxheap property for all items above i, including rank
//returns true if rank changed
template <typename T>
inline int maxSortUp(T* data, Mediator* m, int i)
{
while (i<0 && mmCmpExch(data, m, i/2, i)) i/=2;
return (i==0);
}

/*--- Public Interface ---*/


//creates new Mediator: to calculate `nItems` running rank.
Mediator* MediatorNew(int nItems, int rank)
{
Mediator* m = (Mediator*)malloc(sizeof(Mediator));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to use new instead of malloc, since this is C++ code now.

m->pos = (int*)malloc(sizeof(int) * nItems);
m->heap = (int*)malloc(sizeof(int) * nItems);
if ((m == nullptr)||(m->pos == nullptr)||(m->heap == nullptr)){printf("out of memory\n"); exit(1);}
m->heap += rank; //points to rank
m->N = nItems;
m->idx = 0;
m->minCt = nItems - rank - 1;
m->maxCt = rank;
while (nItems--)
{
m->pos[nItems]= nItems - rank;
m->heap[m->pos[nItems]]=nItems;
}
return m;
}

//Inserts item, maintains rank in O(lg nItems)
template <typename T>
void MediatorInsert(T* data, Mediator* m, T v)
{
int p = m->pos[m->idx];
T old = data[m->idx];
data[m->idx] = v;
m->idx++;
if(m->idx == m->N){m->idx = 0; }

if (p > 0) //new item is in minHeap
{ if (v > old) { minSortDown(data, m, p); return; }
if (minSortUp(data, m, p) && mmCmpExch(data, m, 0, -1)) { maxSortDown(data, m,-1); }
}
else if (p < 0) //new item is in maxheap
{ if ( v < old) {maxSortDown(data, m, p); return; }
if (maxSortUp(data, m, p) && mmCmpExch(data, m, 1, 0)) { minSortDown(data, m, 1); }
}
else //new item is at rank
{ if (maxSortUp(data, m, -1)) { maxSortDown(data, m, -1); }
if (minSortUp(data, m, 1)) { minSortDown(data, m, 1); }
}
}

template <typename T>
void _rank_filter(T* in_arr, int rank, int arr_len, int win_len, T* out_arr, int mode, T cval, int origin)
{
int i, arr_len_thresh, lim = (win_len - 1) / 2 - origin;
int lim2 = arr_len - lim;
Mediator* m = MediatorNew(win_len, rank);
T* data = (T*)malloc(sizeof(T) * win_len);

switch (mode)
{
case REFLECT:
for (i=win_len - lim - 1; i > - 1; i--){MediatorInsert(data, m, in_arr[i]);}
break;
case CONSTANT:
for (i=win_len - lim; i > 0; i--){MediatorInsert(data, m, cval);}
break;
case NEAREST:
for (i=win_len - lim; i > 0; i--){MediatorInsert(data, m, in_arr[0]);}
break;
case MIRROR:
for (i=win_len - lim; i > 0; i--){MediatorInsert(data, m, in_arr[i]);}
break;
case WRAP:
for (i=arr_len - lim - 1 - 2 * origin; i < arr_len; i++){MediatorInsert(data, m, in_arr[i]);}
break;
}

for (i=0; i < lim; i++){MediatorInsert(data, m, in_arr[i]);}
for (i=lim; i < arr_len; i++)
{
MediatorInsert(data, m, in_arr[i]);
out_arr[i - lim] = data[m->heap[0]];
}
switch (mode)
{
case REFLECT:
arr_len_thresh = arr_len - 1;
for (i=0; i < lim; i++)
{
MediatorInsert(data, m, in_arr[arr_len_thresh - i]);
out_arr[lim2 + i] = data[m->heap[0]];
}
break;
case CONSTANT:
for (i=0; i < lim; i++)
{
MediatorInsert(data, m, cval);
out_arr[lim2 + i] = data[m->heap[0]];
}
break;
case NEAREST:
arr_len_thresh = arr_len - 1;
for (i=0; i < lim; i++)
{
MediatorInsert(data, m, in_arr[arr_len_thresh]);
out_arr[lim2 + i] = data[m->heap[0]];
}
break;
case MIRROR:
arr_len_thresh = arr_len - 2;
for (i=0; i < lim + 1; i++)
{
MediatorInsert(data, m, in_arr[arr_len_thresh - i]);
out_arr[lim2 + i] = data[m->heap[0]];
}
break;
case WRAP:
for (i=0; i < win_len; i++){
MediatorInsert(data, m, in_arr[i]);
out_arr[lim2 + i] = data[m->heap[0]];
}
break;
}

m->heap -= rank;
free(m->heap);
m->heap = nullptr;
free(m->pos);
m->pos = nullptr;
free(m);
m = nullptr;
free(data);
data = nullptr;
}

// Python wrapper for rank_filter
static PyObject* rank_filter(PyObject* self, PyObject* args)
{
PyObject *in_arr_obj, *out_arr_obj, *cval_obj;
int32_t rank, arr_len, win_len, mode, origin, type;
if (!PyArg_ParseTuple(args, "OiiOiOi", &in_arr_obj, &rank, &win_len, &out_arr_obj, &mode, &cval_obj, &origin))
{
return NULL;
}
PyArrayObject *in_arr = (PyArrayObject *)PyArray_FROM_OTF(in_arr_obj, NPY_NOTYPE, NPY_ARRAY_INOUT_ARRAY2);
PyArrayObject *out_arr = (PyArrayObject *)PyArray_FROM_OTF(out_arr_obj, NPY_NOTYPE, NPY_ARRAY_INOUT_ARRAY2);

if (in_arr == NULL || out_arr == NULL)
{
return NULL;
}
arr_len = PyArray_SIZE(in_arr);
type = PyArray_TYPE(in_arr);
switch (type) { // the considered types are float, double, int64
case NPY_FLOAT:
{
float *c_in_arr = (float *)PyArray_DATA(in_arr);
float *c_out_arr = (float *)PyArray_DATA(out_arr);
float cval = (float)PyFloat_AsDouble(cval_obj);
_rank_filter(c_in_arr, rank, arr_len, win_len, c_out_arr, mode, cval, origin);
break;
}
case NPY_DOUBLE:
{
double *c_in_arr = (double *)PyArray_DATA(in_arr);
double *c_out_arr = (double *)PyArray_DATA(out_arr);
double cval = PyFloat_AsDouble(cval_obj);
_rank_filter(c_in_arr, rank, arr_len, win_len, c_out_arr, mode, cval, origin);
break;
}
case NPY_INT64:
{
int64_t *c_in_arr = (int64_t *)PyArray_DATA(in_arr);
int64_t *c_out_arr = (int64_t *)PyArray_DATA(out_arr);
int64_t cval = PyLong_AsLongLong(cval_obj);
_rank_filter(c_in_arr, rank, arr_len, win_len, c_out_arr, mode, cval, origin);
break;
}
default:
PyErr_SetString(PyExc_TypeError, "Unsupported array type");
break;
}
Py_DECREF(in_arr);
Py_DECREF(out_arr);
Py_RETURN_NONE;
}

//define the module methods
static PyMethodDef myMethods[] = {
{"rank_filter", rank_filter, METH_VARARGS, "1D rank filter"},
{NULL, NULL, 0, NULL}};


//define the module
static struct PyModuleDef _rank_filter_1d = {
PyModuleDef_HEAD_INIT,
"_rank_filter_1d",
"1D rank filter module",
-1,
myMethods};

//init the module
PyMODINIT_FUNC PyInit__rank_filter_1d(void)
{
import_array();
return PyModule_Create(&_rank_filter_1d);
}