# Example 4: NumPy Arrays (ndarrays)
This module gives examples of how to accept, handle, and create NumPy arrays.

In [1]:
import sys, numpy as np
import ex4_ndarrays as ex
help(ex)

Help on module ex4_ndarrays:

NAME
    ex4_ndarrays - Pass a numpy ndarray to some functions, and/or create a numpy array with some functions.

FUNCTIONS
    accept_array(x)
        Accept a numpy array as an input and do nothing (besides print
        on success).  This simply tests if the right #include worked and
        the 'import_array()' line is in the PyMODINIT_FUNC definition.
    
    accept_array_wrong(x)
        Same as the function 'accept_array', but this one does not handle
        the reference counting correctly.
    
    add_scalar_to_array(x, n)
        x is a numpy array of dtype numpy.float64 / python float
        n is a python scalar of dtype float.
        Return x+n
    
    add_two_arrays(x, y)
        Input two numpy arrays, (x and y) of dtype float.
        Return a new array that is the element-wise sum of the two.
    
    copy_1d_int8_array_get(a)
        Provide an array of type int8 and return a copy of that
        array.  This is also a 'right' way to

---
### Pass an ndarray to a function
The functions `accept_array` and `accept_array_wrong` demonstrate how to write a module function that accepts a NumPy array as input.  The only difference between the two is in the reference counting: `accept_array_wrong` never releases the reference it takes, which leads to a memory leak.

In [2]:
a = np.random.rand(10)
a

array([0.59425711, 0.93295992, 0.62418784, 0.49408697, 0.02893419,
       0.56268605, 0.47362242, 0.59279444, 0.52098547, 0.08787759])

We can use `sys.getrefcount` to see how many references there are to a given object.

In [4]:
sys.getrefcount(a) # run this cell twice for some reason

8

In [5]:
ex.accept_array(a)

Successful passing (and releasing) of an ndarray to function!


In [6]:
sys.getrefcount(a)

8

Note that the reference count before and after passing the array to `accept_array` is the same.

In [7]:
ex.accept_array_wrong(a)

Successful passing (but not releasing) of an ndarray to function!


In [10]:
sys.getrefcount(a)

10

In [9]:
ex.accept_array_wrong(a)
ex.accept_array_wrong(a)
ex.accept_array_wrong(a)
ex.accept_array_wrong(a)

Successful passing (but not releasing) of an ndarray to function!
Successful passing (but not releasing) of an ndarray to function!
Successful passing (but not releasing) of an ndarray to function!
Successful passing (but not releasing) of an ndarray to function!


In [11]:
sys.getrefcount(a)

10

Note that the reference count of the array has increased.  Because the count can now never drop back to zero, `a` will never be deallocated, even when it's not being used anymore.  Another important thing to note is that the refcount reported in Jupyter or IPython may differ from what a plain Python interpreter reports, for example because the interactive shell keeps its own references to objects such as cached cell outputs.  The point to emphasize is simply that the refcount increases because `accept_array_wrong` neglects to decrement the refcount when it is finished.
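The effect of a missing decrement can be imitated in pure Python. The sketch below is only an analogy, not the code of `accept_array_wrong`: a plain list stands in for the ndarray, and a hypothetical `holders` list plays the role of the references the C function forgets to release.

```python
import sys

data = [0.1, 0.2, 0.3]            # stand-in for the ndarray
before = sys.getrefcount(data)

holders = []                      # hypothetical: simulates references that are never released
for _ in range(4):
    holders.append(data)          # each append keeps one extra reference alive

after_leak = sys.getrefcount(data)

holders.clear()                   # releasing the references restores the count
after_release = sys.getrefcount(data)

print(after_leak - before)        # 4
print(after_release - before)     # 0
```

Comparing differences rather than absolute counts sidesteps the question of how many references the interpreter or shell holds on its own.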

---
### ndarray data types
The C API keeps track of the data type of elements in an ndarray with an enum.  One need never worry about the actual number, and can instead refer to the enum name.  For example, to create an array with dtype `np.int16`, one would simply write `NPY_INT16` instead of `3`.  The output below shows that multiple names can map to the same number and are therefore equivalent data types (`NPY_DOUBLE` and `NPY_FLOAT64` are identical, for example).  Which of the sized integer names coincide with which C types can depend on the platform.

In [12]:
ex.print_dtypes()

NPY_NTYPES = 24
NPY_BOOL = 0
NPY_BYTE = 1
NPY_INT8 = 1
NPY_INT16 = 3
NPY_SHORT = 3
NPY_INT32 = 5
NPY_INT64 = 7
NPY_INT = 5
NPY_INTP = 7
NPY_LONG = 7
NPY_LONGLONG = 9
NPY_UBYTE = 2
NPY_UINT8 = 2
NPY_UINT16 = 4
NPY_USHORT = 4
NPY_UINT32 = 6
NPY_UINT = 6
NPY_UINT64 = 8
NPY_UINTP = 8
NPY_ULONG = 8
NPY_ULONGLONG = 10
NPY_FLOAT16 = 23
NPY_HALF = 23
NPY_FLOAT32 = 11
NPY_FLOAT = 11
NPY_FLOAT64 = 12
NPY_DOUBLE = 12
NPY_LONGDOUBLE = 13
NPY_COMPLEX64 = 14
NPY_CFLOAT = 14
NPY_COMPLEX128 = 15
NPY_CDOUBLE = 15
NPY_CLONGDOUBLE = 16
NPY_DATETIME = 21
NPY_TIMEDELTA = 22
NPY_STRING = 18
NPY_UNICODE = 19
NPY_OBJECT = 17
NPY_VOID = 20
NPY_DEFAULT_TYPE = 12
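The same type numbers can be inspected from Python through the `num` attribute of a dtype, which gives a quick way to check the aliasing without compiling anything. This is a small sketch that uses only NumPy itself, not `ex4_ndarrays`.

```python
import numpy as np

# dtype.num is the type number that the C API exposes as NPY_<NAME>.
int16_num = np.dtype(np.int16).num
short_num = np.dtype(np.short).num
float64_num = np.dtype(np.float64).num
double_num = np.dtype(np.double).num

print(int16_num, short_num)       # 3 3   (NPY_INT16 and NPY_SHORT)
print(float64_num, double_num)    # 12 12 (NPY_FLOAT64 and NPY_DOUBLE)

# Aliased names describe one and the same dtype.
same_dtype = np.dtype(np.double) == np.dtype(np.float64)
print(same_dtype)                 # True
```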


---
### Getting the dtype of an ndarray
The `get_dtype` function demonstrates how to query an ndarray for its dtype.  It prints the type number followed by every enum name that maps to that number.

In [13]:
a = np.zeros(5, dtype=np.int8)
b = np.zeros(5, dtype=np.float64)
c = np.zeros(5, dtype=bool)

In [14]:
ex.get_dtype(a)

type integer is 1
NPY_BYTE
NPY_INT8


In [15]:
ex.get_dtype(b)

type integer is 12
NPY_FLOAT64
NPY_DOUBLE
NPY_DEFAULT_TYPE


In [16]:
ex.get_dtype(c)

type integer is 0
NPY_BOOL
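For comparison, the type numbers printed above can be read from the Python side as well. The sketch below recreates the three arrays and prints each dtype's name, type number, and element size in bytes; it does not call `ex.get_dtype`.

```python
import numpy as np

a = np.zeros(5, dtype=np.int8)
b = np.zeros(5, dtype=np.float64)
c = np.zeros(5, dtype=bool)

for arr in (a, b, c):
    # name, C-API type number, bytes per element
    print(arr.dtype.name, arr.dtype.num, arr.dtype.itemsize)

type_numbers = [arr.dtype.num for arr in (a, b, c)]
print(type_numbers)               # [1, 12, 0]
```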


---
### Describe an array
Here we access some metadata of the array, and also describe some modified arrays.

In [17]:
a = np.array([[1,2,3,4,5,6,7,8,9,10],
              [11,12,13,14,15,16,17,18,19,20]], dtype=np.int8)

In [18]:
ex.describe_array(a)

    dtype: NPY_BYTE
    C-contiguous
    Number of dimensions: 2
    Shape = (2, 10)
    Strides: (10, 1)
    Data pointer:   0x7fb3aa05bdf0
    Object pointer: 0x12e73a310


Note the data pointer of `a`.  If I slice the first row of the array, the object pointer is different, but the data pointer is the same.  This is because an array object is really just a `struct` that tells the interpreter _how_ to look at the data.  The data itself lives in a separate buffer, and several array objects can point into the same buffer.  The data pointer points to the first element in the array.

In [19]:
ex.describe_array(a[0,:])

    dtype: NPY_BYTE
    C-contiguous
    F-contiguous
    Number of dimensions: 1
    Shape = (10)
    Strides: (1)
    Data pointer:   0x7fb3aa05bdf0
    Object pointer: 0x12e73a970
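The shared data pointer can also be confirmed from Python. In this sketch, `ndarray.ctypes.data` gives the address of the first element, and `np.shares_memory` checks whether two arrays overlap; the array is rebuilt with `np.arange` for brevity and holds the same values as `a`.

```python
import numpy as np

a = np.arange(1, 21, dtype=np.int8).reshape(2, 10)
row = a[0, :]                     # a view, not a copy

same_pointer = a.ctypes.data == row.ctypes.data
print(row is a)                   # False: a different Python object
print(same_pointer)               # True: same first element in memory
print(np.shares_memory(a, row))   # True

# Writing through the view changes the original array.
row[0] = 99
print(a[0, 0])                    # 99
```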


If I slice the first row of the array but take every other element, the new object again points to the same first element, but now the stride is 2 bytes instead of 1, and the result is no longer reported as contiguous.

In [20]:
ex.describe_array(a[0,::2])

    dtype: NPY_BYTE
    Number of dimensions: 1
    Shape = (5)
    Strides: (2)
    Data pointer:   0x7fb3aa05bdf0
    Object pointer: 0x12e73a970
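Strides are simply byte offsets: element `[i, j]` sits `i * strides[0] + j * strides[1]` bytes past the data pointer. The sketch below checks this arithmetic against a raw byte copy of the buffer obtained with `tobytes()`. It is meant as an illustration of the addressing rule, not of how `describe_array` is implemented.

```python
import numpy as np

a = np.arange(1, 21, dtype=np.int8).reshape(2, 10)
buf = a.tobytes()                 # copy of the raw bytes, in C order

s0, s1 = a.strides                # (10, 1) for this int8 array
i, j = 1, 3
via_strides = buf[i * s0 + j * s1]
print(via_strides, a[i, j])       # 14 14

view = a[0, ::2]                  # every other element of the first row
(step,) = view.strides            # 2 bytes between consecutive elements
picked = [buf[k * step] for k in range(view.shape[0])]
print(picked)                     # [1, 3, 5, 7, 9]
print(view.tolist())              # [1, 3, 5, 7, 9]

# A stride larger than the item size means the view is not contiguous.
print(view.flags['C_CONTIGUOUS'], view.flags['F_CONTIGUOUS'])  # False False
```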


Taking the transpose creates a new object that also points to the same data.  No data is moved: the shape and strides are simply reversed, so the same buffer is read as F-contiguous instead of the original C-contiguous.

In [21]:
print("a:", flush=True)
ex.describe_array(a)
print("", flush=True)
print("-------------------------")
print("a.T")
ex.describe_array(a.T)

a:
    dtype: NPY_BYTE
    C-contiguous
    Number of dimensions: 2
    Shape = (2, 10)
    Strides: (10, 1)
    Data pointer:   0x7fb3aa05bdf0
    Object pointer: 0x12e73a310

-------------------------
a.T
    dtype: NPY_BYTE
    F-contiguous
    Number of dimensions: 2
    Shape = (10, 2)
    Strides: (1, 10)
    Data pointer:   0x7fb3aa05bdf0
    Object pointer: 0x12e73a970
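The same observation can be reproduced with NumPy alone. This sketch also shows one way to obtain a C-contiguous transpose when a C routine requires it: `np.ascontiguousarray` makes a copy here, so the result no longer shares memory with `a`.

```python
import numpy as np

a = np.arange(1, 21, dtype=np.int8).reshape(2, 10)
t = a.T                           # view with reversed shape and strides

print(t.shape, t.strides)         # (10, 2) (1, 10)
print(t.flags['C_CONTIGUOUS'], t.flags['F_CONTIGUOUS'])  # False True
print(np.shares_memory(a, t))     # True

# Force a C-contiguous layout; this has to copy the data.
t_c = np.ascontiguousarray(t)
print(t_c.strides)                # (2, 1)
print(np.shares_memory(a, t_c))   # False
```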
