# tutorials NumpyPointerToC

Gabriel de Marmiesse edited this page Jun 30, 2018 · 2 revisions

## Passing a numpy pointer to C/C++

A more up-to-date tutorial can be found in the Cython documentation.

One of the strengths of numpy arrays is that they are essentially wrappers around a regular C pointer (C array). This means that you can easily use Cython code to pass the data from a numpy array into C or C++ code, and manipulate it there, without any data copying.

In this case, the goal is to manipulate the data in a numpy array, such that there is no data copying, and the changes are seen in the numpy array on the Python side. This can be very useful, as you can then let Python/numpy handle all the memory management, while still leveraging C code that takes pointers to "standard" C arrays.
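The no-copy behaviour can be sketched from pure Python before any C or Cython is involved. This is only an illustration: `multiply_via_pointer` is a hypothetical stand-in for a C function, and the pointer is obtained through numpy's `ndarray.ctypes` interface rather than through Cython.

```python
import ctypes

import numpy as np


def multiply_via_pointer(ptr, multiplier, m, n):
    """Hypothetical stand-in for a C function: scale m*n doubles in place."""
    for index in range(m * n):
        ptr[index] = ptr[index] * multiplier


a = np.arange(12, dtype=np.float64).reshape((3, 4))
expected = a * 3  # ordinary numpy result, computed into a new array

# Raw pointer to the first element of a's buffer; no data is copied.
ptr = a.ctypes.data_as(ctypes.POINTER(ctypes.c_double))
multiply_via_pointer(ptr, 3.0, a.shape[0], a.shape[1])

# The change made through the pointer shows up in the numpy array itself.
print(np.array_equal(a, expected))
```

numpy still owns the memory here; the pointer is only valid while `a` is alive.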

There are a number of ways to get the pointer from a numpy array. The approach shown here seems to be the consensus "best" way, at least as of June 2012 (see the Cython-users thread).

## C function

A trivial C function (for example's sake) that multiplies all the elements of a 2-d array of doubles by a passed-in value:

```c
/*
c_multiply.c

simple C function that alters data passed in via a pointer

used to see how we can do this with Cython/numpy

*/

void c_multiply (double* array, double multiplier, int m, int n) {

    int i, j ;
    int index = 0 ;

    /* walk the m x n array in row-major (C) order */
    for (i = 0; i < m; i++) {
        for (j = 0; j < n; j++) {
            array[index] = array[index] * multiplier ;
            index ++ ;
        }
    }
    return ;
}
```
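The single running `index` in `c_multiply` works because a C-contiguous 2-d array stores its rows one after another, so element `(i, j)` sits at flat offset `i * n + j`. A small numpy sketch of that layout (the variable names are illustrative only):

```python
import numpy as np

m, n = 3, 4
a = np.arange(m * n, dtype=np.float64).reshape((m, n))

# For a C-contiguous array, ravel() returns a 1-d view of the same buffer.
flat = a.ravel()

# Walk the array the way the C loops do and compare against 2-d indexing.
index = 0
offsets_match = True
for i in range(m):
    for j in range(n):
        if index != i * n + j or flat[index] != a[i, j]:
            offsets_match = False
        index += 1
```

This is why the Cython wrapper has to insist on a C-contiguous array: with any other layout the flat walk would visit the wrong memory.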

## Cython Code

This code takes a numpy array, and passes its data pointer to the C function to do the real work.

```cython
"""
multiply.pyx

simple cython test of accessing a numpy array's data

the C function: c_multiply multiplies all the values in a 2-d array by a scalar, in place.

"""

import cython

# import both numpy and the Cython declarations for numpy
import numpy as np
cimport numpy as np

# declare the interface to the C code
cdef extern void c_multiply (double* array, double value, int m, int n)

@cython.boundscheck(False)
@cython.wraparound(False)
def multiply(np.ndarray[double, ndim=2, mode="c"] input not None, double value):
    """
    multiply (input, value)

    Takes a numpy array as input, and multiplies each element by value, in place

    param: input -- a 2-d numpy array of np.float64
    param: value -- a number that will be multiplied by each element in the array

    """
    cdef int m, n

    m, n = input.shape[0], input.shape[1]

    # pass the address of the first element; no data is copied
    c_multiply (&input[0,0], value, m, n)

    return None

def multiply2(np.ndarray[double, ndim=2, mode="c"] input not None, double value):
    """
    this method works fine, but is not as future-proof: the numpy API might change, etc.
    """
    cdef int m, n

    m, n = input.shape[0], input.shape[1]

    c_multiply (<double*> input.data, value, m, n)

    return None
```

The `np.ndarray[double, ndim=2, mode="c"]` declaration ensures that you get a C-contiguous numpy array of doubles -- this is key, as the data pointer must point to a standard C array of doubles.
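On the Python side, callers can check or fix up an array before handing it to such a function. A minimal sketch, assuming a hypothetical helper named `as_c_contiguous_doubles` built on `np.ascontiguousarray`:

```python
import numpy as np


def as_c_contiguous_doubles(arr):
    """Hypothetical helper: return a C-contiguous float64 version of arr.

    Also returns a flag saying whether the result shares memory with arr.
    """
    out = np.ascontiguousarray(arr, dtype=np.float64)
    return out, np.shares_memory(out, arr)


a = np.arange(12, dtype=np.float64).reshape((3, 4))
every_other_column = a[:, ::2]  # a strided view, not C-contiguous

# Already in the right layout: no copy is needed.
same, same_shared = as_c_contiguous_doubles(a)

# Wrong layout: numpy has to make a fresh, contiguous copy.
fixed, fixed_shared = as_c_contiguous_doubles(every_other_column)
```

When a copy is made, in-place changes done by the C code land in the copy, not in the original array, so the caller has to copy the result back if it wants it there.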

The `&input[0,0]` expression passes in the address of the beginning of the data array.

Note that if you wanted to, for example, iterate through the rows in Cython, but process each row as a 1-d C array, you could pass in the address of a row with: `&input[i,0]`, and similarly for other sub-parts of the array.
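The row-address idea can be sketched in plain Python with `ctypes`. This only mimics what `&input[i,0]` gives you in Cython, and the names are illustrative:

```python
import ctypes

import numpy as np

a = np.arange(12, dtype=np.float64).reshape((3, 4))
n_rows, n_cols = a.shape

# Row i starts i * strides[0] bytes past the base address; in a C-contiguous
# array each row is itself a contiguous run of n_cols doubles.
row_addresses = [a.ctypes.data + i * a.strides[0] for i in range(n_rows)]

# Treat row 1 as a 1-d C array of doubles and scale it in place.
row_ptr = ctypes.cast(row_addresses[1], ctypes.POINTER(ctypes.c_double))
for j in range(n_cols):
    row_ptr[j] = row_ptr[j] * 10.0
```

Only row 1 changes; the rows before and after it are untouched.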

The decorators:

`@cython.boundscheck(False)`

`@cython.wraparound(False)`

tell Cython not to insert code that checks whether the indexes are out of the bounds of the array, and that you won't be using negative indexes (the Python syntax for indexing from the end). This keeps a bunch of error-checking code out of the generated C. The saving is not a huge deal if there is a fair bit of work being done in the function you're passing the pointer to, but the checks still cost a lot more than a simple pointer pass.

This is totally safe if you are hard-coding the indexes to zero, like in this case -- you may want to be a bit more careful if you are using variables for the indexes.

## Other options

Another way to do this is to pass in the 'data' member of the numpy array:

`c_multiply (<double*> input.data, value, m, n)`

However, this relies on the current numpy data structure, so it will break if numpy changes. From the mailing list:

I think `&input[0, 0]` should still be preferred, as accessing the `.data` attribute is deprecated in numpy and the rewrite to `PyArray_DATA()` not yet merged. Taking the pointer to the first element is also more consistent with memoryviews.
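The memoryviews mentioned in the quote are Cython typed memoryviews. As a rough Python-level analogy only, the built-in `memoryview` exposes the same buffer through the buffer protocol, without touching the `.data` attribute:

```python
import numpy as np

a = np.arange(12, dtype=np.float64).reshape((3, 4))

# Acquire a's buffer through the buffer protocol; no data is copied.
view = memoryview(a)
layout = (view.format, view.ndim, view.shape, view.strides, view.c_contiguous)

# Wrap the same buffer in a second array object and write through it.
alias = np.asarray(view)
alias[0, 0] = 99.0
```

The write through `alias` lands in `a`, because both objects sit on top of the same block of memory.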

## Build script

Here is the setup.py to build the Cython extension.

```python
#!/usr/bin/env python

# setuptools is used here because distutils was removed in Python 3.12
from setuptools import setup, Extension
from Cython.Distutils import build_ext

import numpy

setup(
    cmdclass = {'build_ext': build_ext},
    ext_modules = [Extension("multiply",
                             sources=["multiply.pyx", "c_multiply.c"],
                             include_dirs=[numpy.get_include()])],
)
```

It can be built with `python setup.py build_ext --inplace`

## Test Code

Some trivial test code (not a real unit test):

```python
#!/usr/bin/env python

"""
simple test of the multiply.pyx and c_multiply.c test code
"""

import numpy as np

import multiply

a = np.arange(12, dtype=np.float64).reshape((3,4))

print(a)

# multiplies every element of a by 3, in place
multiply.multiply(a, 3)

print(a)
```

## Unit Test Code

A py.test compliant unit test (might work with nose, with minor modification, too)

```python
#!/usr/bin/env python

"""
multiply.pyx and c_multiply.c test code

designed to be run-able with py.test
"""
import pytest
import numpy as np
import multiply

def test_basic():
    a = np.arange(12, dtype=np.float64).reshape((3,4))
    b = a * 3
    multiply.multiply(a, 3)
    assert np.array_equal(a, b)

def test_wrong_dims():
    # a 3-d array does not match the declared ndim=2
    a = np.arange(12, dtype=np.float64).reshape((3,2,2))
    with pytest.raises(ValueError):
        multiply.multiply(a, 3)

def test_wrong_type():
    # float32 does not match the declared double (float64) dtype
    a = np.arange(12, dtype=np.float32).reshape((3,4))
    with pytest.raises(ValueError):
        multiply.multiply(a, 3)

def test_zero_dims():
    """
    this shouldn't crash!
    """
    a = np.ones( (3, 0), dtype=np.float64)
    b = a.copy()
    multiply.multiply(a, 3) # zero size, shouldn't do anything
    assert np.array_equal(a, b)
```
