Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] ulab array corruption with slices #387

Closed
jepler opened this issue May 13, 2021 · 1 comment · Fixed by #388
Closed

[BUG] ulab array corruption with slices #387

jepler opened this issue May 13, 2021 · 1 comment · Fixed by #388
Labels
bug Something isn't working

Comments

@jepler
Copy link
Collaborator

jepler commented May 13, 2021

As reported by @kevinjwaters at adafruit/circuitpython#4753, there seems to be a problem with memory corruption when ulab arrays are sliced.

I am in a fresh, up to date clone of ulab master branch (90824d2), ran ./build.py.

I put the following content in gcproblem.py:

from ulab import numpy as np
import gc
data1 = np.ones(1000)[6:-6]
print(sum(data1)) # should print 988, consistently does
print(data1)      # should print an array full of ones, consistently does
gc.collect()
print(sum(data1)) # should print 988, usually does NOT
print(data1)      # should print an array full of ones, usually does NOT

then I run ./micropython/ports/unix/micropython gcproblem.py:

$ ./micropython/ports/unix/micropython  ~/gcproblem.py 
988.0000000000001
array([1.0, 1.0, 1.0, ..., 1.0, 1.0, 1.0], dtype=float64)
14240.0
array([0.0, 0.0, 4.670711089722114e-310, ..., 0.0, 4.670711089722114e-310, 256.0], dtype=float64)

The first two lines of output are correct, the second two are wrong.

I notice the following detail about how memoryview works (py/objarray.c in the micropython source code):

memoryview must keep a pointer to the base of the buffer so that the buffer is not GC'd if the original parent object is no longer around (we are assuming that all memoryview'able objects return a pointer which points to the start of a GC chunk).

It may be the case that a form of this trick needs to be followed in ulab.

@v923z
Copy link
Owner

v923z commented May 13, 2021

memoryview must keep a pointer to the base of the buffer so that the buffer is not GC'd if the original parent object is no longer around (we are assuming that all memoryview'able objects return a pointer which points to the start of a GC chunk).

It may be the case that a form of this trick needs to be followed in ulab.

So, would it be enough, if, in

typedef struct _ndarray_obj_t {
mp_obj_base_t base;
uint8_t dtype;
uint8_t itemsize;
uint8_t boolean;
uint8_t ndim;
size_t len;
size_t shape[ULAB_MAX_DIMS];
int32_t strides[ULAB_MAX_DIMS];
void *array;
} ndarray_obj_t;
, we added a new member to the struct, void *origin, and then in
ndarray_obj_t *ndarray_new_view(ndarray_obj_t *source, uint8_t ndim, size_t *shape, int32_t *strides, int32_t offset) {
// creates a new view from the input arguments
ndarray_obj_t *ndarray = m_new_obj(ndarray_obj_t);
ndarray->base.type = &ulab_ndarray_type;
ndarray->boolean = source->boolean;
ndarray->dtype = source->dtype;
ndarray->ndim = ndim;
ndarray->itemsize = source->itemsize;
ndarray->len = ndim == 0 ? 0 : 1;
for(uint8_t i=ULAB_MAX_DIMS; i > ULAB_MAX_DIMS-ndim; i--) {
ndarray->shape[i-1] = shape[i-1];
ndarray->strides[i-1] = strides[i-1];
ndarray->len *= shape[i-1];
}
uint8_t *pointer = (uint8_t *)source->array;
pointer += offset;
ndarray->array = pointer;
return ndarray;
}

simply had

ndarray->origin = source->origin;

Of course, we also have to set origin in

ndarray->array = array;
,

ndarray->origin = array;

Since ndarrays are always created either via ndarray_new_ndarray, or ndarray_new_view, those should be the only changes we have to implement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants