# ndarray

> The `NDArray` class in minima is a container for numeric elements (for now, only `float32`).  
> It handles data with any number of dimensions: 1-dimensional rows or columns, 2-dimensional  
> matrices, or higher-dimensional arrays. The same `NDArray` interface is designed to work with different backends,  
> so you could use it with NumPy, a CPU backend, or a GPU backend.  

In [None]:
#| default_exp ndarray

In [None]:
#| export
import math
import numpy as np
from minima import ndarray_backend_numpy
from typing import Optional, Sequence, Tuple, Union, Callable, Any
from minima.utility import *
# from . import ndarray_backend_cpu

In [None]:
#| export
class BackendDevice:
    """A backend device, wraps the implementation module."""

    def __init__(self, name, mod):
        self.name = name
        self.mod = mod

    def __eq__(self, other):
        return self.name == other.name

    def __repr__(self):
        return f"device(type='{self.name}')"

    def __getattr__(self, name):
        return getattr(self.mod, name)

    def enabled(self):
        return self.mod is not None

    def randn(self, *shape, dtype="float32"):
        return NDArray(np.random.randn(*shape).astype(dtype), device=self)

    def rand(self, *shape, dtype="float32"):
        return NDArray(np.random.rand(*shape).astype(dtype), device=self)

    def one_hot(self, n, i, dtype="float32"):
        return NDArray(np.eye(n, dtype=dtype)[i], device=self)

    def empty(self, shape, dtype="float32"):
        dtype = "float32" if dtype is None else dtype
        assert dtype == "float32"
        return NDArray.make(shape, device=self)

    def full(self, shape, fill_value, dtype="float32"):
        dtype = "float32" if dtype is None else dtype
        assert dtype == "float32"
        arr = self.empty(shape, dtype)
        arr.fill(fill_value)
        return arr

def cpu_numpy():
    """Return numpy device"""
    return BackendDevice('cpu_numpy', ndarray_backend_numpy)

def default_device():
    return cpu_numpy()

In [None]:
#| export
class NDArray:

    """
    NDArray represents an n-dimensional array with operations that can be performed on multiple devices.
    This class is an abstraction over numpy and other backend devices, providing a unified interface to interact with arrays.

    Use cases of this class include numerical operations, scientific computing, and machine learning.

    Parameters
    ----------
    value : NDArray or np.ndarray or other array-like structures
        The array-like structure to be transformed into NDArray.
    device : Optional[BackendDevice]
        The device on which the array computations should be performed. 
        If None, the default device is used.

    Attributes
    ----------
    _shape : tuple
        The shape of the array.
    _strides : tuple
        The strides of the array.
    _offset : int
        The offset in the underlying buffer.
    _device : BackendDevice
        The device on which the array computations are performed.
    _handle : Buffer
        The underlying buffer that holds the data.
    """
    
    def __init__(
        self,
        value: Union['NDArray', np.ndarray, Sequence], # The value to create the NDArray from
        device: Optional[BackendDevice] = None # The device on which the array computations are performed.
    ) -> None:
        """
        Constructs a new NDArray instance from an existing `NDArray`, numpy array, or a Python sequence. 
        This array can be used to perform high-performance computations on the specified device.

        Parameters
        ----------
        value : Union[NDArray, np.ndarray, Sequence]
            The value to create the NDArray from. If it's an NDArray, it is deep-copied to the new NDArray. 
            If it's a numpy array, it's copied to a new NDArray. If it's a Python sequence, it's converted to 
            a numpy array and then copied to a new NDArray.

        device : Optional[BackendDevice]
            The device on which the array computations are performed. Defaults to the device of the input value 
            if it's an NDArray, or to the default device otherwise.
        """
        
        if isinstance(value, NDArray): # copy of existing NDArray
            if device is None: device = value.device
            self._init(value.to(device) + 0.0)
        elif isinstance(value, np.ndarray): # copy of existing np array
            device = device if device is not None else default_device()
            array = self.make(value.shape, device=device)
            array.device.from_numpy(np.ascontiguousarray(value), array._handle)
            self._init(array)
        else:
            array = NDArray(np.array(value), device=device)
            self._init(array)

    def _init(self, other) -> None:
        """
        A private method that initializes the new NDArray with the values, shape, strides, offset, device, 
        and handle of another NDArray.
    
        Parameters
        ----------
        other : NDArray
            The NDArray to initialize from.
        """
        self._shape = other._shape
        self._strides = other._strides
        self._offset = other._offset
        self._device = other._device
        self._handle = other._handle

    @staticmethod
    def make(
        shape: Sequence[int], # The shape of the new array.
        strides: Optional[Sequence[int]] = None, # The strides of the new array. If None, compact strides are computed.
        device: Optional[BackendDevice] = None, # The device on which the new array computations should be performed. If None, the default device is used.
        offset: Optional[int] = None, # The offset in the underlying buffer of the new array. If None, it defaults to 0.
        handle: Optional[Any] = None # The underlying buffer that should hold the data. If None, a new buffer is allocated.
    ) -> 'NDArray':
        """
        Constructs a new NDArray with the specified shape, strides, device, offset, and handle.

        Parameters
        ----------
        shape : Sequence[int]
            The shape of the new array.
        strides : Optional[Sequence[int]]
            The strides of the new array. If None, compact strides are computed.
        device : Optional[BackendDevice]
            The device on which the new array computations should be performed. If None, the default device is used.
        offset : Optional[int]
            The offset in the underlying buffer of the new array. If None, it defaults to 0.
        handle : Optional[Buffer]
            The underlying buffer that should hold the data. If None, a new buffer is allocated.

        Returns
        -------
        NDArray
            A new NDArray instance.
        """
        array = NDArray.__new__(NDArray)
        array._shape = tuple(shape)
        array._strides = NDArray.compact_strides(shape) if strides is None else tuple(strides)
        array._device = default_device() if device is None else device
        array._offset = 0 if offset is None else offset  # the docstring promises 0 when no offset is given
        array._handle = array.device.Array(prod(shape)) if handle is None else handle
        return array

    @staticmethod
    def compact_strides(shape) -> Tuple:
        res = [1] + [prod(shape[-i:]) for i in range(1, len(shape))]
        return tuple(res[::-1])

    def _is_compact(self) -> bool:
        return self.strides == self.compact_strides(self.shape) and prod(self.shape) == self._handle.size

    def compact(self) -> 'NDArray':
        """
        Returns a compact version of this array. If the array is already compact, it returns itself.

        Returns
        -------
        NDArray
            The compact version of this array.
        """
        if self._is_compact():
            return self
        out = NDArray.make(shape=self.shape, device=self.device)
        self.device.compact(self._handle, out._handle, self.shape, self.strides, self._offset)
        return out
        
    def as_strided(self, shape, strides) -> 'NDArray':
        assert len(shape) == len(strides)
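        # keep the view on the same device and at the same offset as the original array
        return NDArray.make(shape=shape, strides=strides, device=self.device, handle=self._handle, offset=self._offset)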

    def flat(self) -> 'NDArray':
        return self.reshape((self.size, ))

    def to(self, device: BackendDevice) -> 'NDArray':
        """
        Transfers this array to the specified device.

        Parameters
        ----------
        device : BackendDevice
            The device to which this array should be transferred.

        Returns
        -------
        NDArray
            This array after it has been transferred to `device`.
        """
        return self if device == self.device else NDArray(self.numpy(), device=device)
        
    def numpy(self) -> np.ndarray:
        """
        Returns a numpy representation of this array.

        Returns
        -------
        np.ndarray
            A numpy array that has the same data as this array.
        """
        return self.device.to_numpy(self._handle, self._shape, self._strides, self._offset)

    @property
    def shape(self) -> Tuple[int, ...]:
        return self._shape

    @property
    def strides(self) -> Tuple[int, ...]:
        return self._strides

    @property
    def device(self) -> BackendDevice:
        return self._device

    @property
    def dtype(self) -> str:
        # only support float32 for now
        return "float32"

    @property
    def ndim(self) -> int:
        """ Return number of dimensions. """
        return len(self._shape)

    @property
    def size(self) -> int:
        return prod(self._shape)

    def __repr__(self) -> str:
        return "NDArray(" + self.numpy().__str__() + f", device={self.device})"

    def __str__(self) -> str:
        return self.numpy().__str__()

    def fill(self, val) -> 'NDArray':
        return self.device.fill(self._handle, val)

    ### Elementwise functions

    def log(self):
        """
        Computes the natural logarithm element-wise for the NDArray. 
    
        Returns
        -------
        NDArray
            A new NDArray with the natural logarithm applied element-wise. The shape of the returned array matches
            the original NDArray.
        """
        out = NDArray.make(self.shape, device=self.device)
        self.device.ewise_log(self.compact()._handle, out._handle)
        return out

    def exp(self):
        """
        Computes the exponential function element-wise for the NDArray.
    
        Returns
        -------
        NDArray
            A new NDArray with the exponential function applied element-wise. The shape of the returned array matches
            the original NDArray.
        """
        
        out = NDArray.make(self.shape, device=self.device)
        self.device.ewise_exp(self.compact()._handle, out._handle)
        return out

    def tanh(self):
        """
        Computes the hyperbolic tangent element-wise for the NDArray.
    
        Returns
        -------
        NDArray
            A new NDArray with the hyperbolic tangent applied element-wise. The shape of the returned array matches
            the original NDArray.
        """
        
        out = NDArray.make(self.shape, device=self.device)
        self.device.ewise_tanh(self.compact()._handle, out._handle)
        return out

    def reshape(self, new_shape):
        """
        Reshape the matrix without copying memory.  This will return a matrix
        that corresponds to a reshaped array but points to the same memory as
        the original array.
        Raises:
            ValueError if product of current shape is not equal to the product
            of the new shape, or if the matrix is not compact.
        Args:
            new_shape (tuple): new shape of the array
        Returns:
            NDArray : reshaped array; this will point to the same memory as the original NDArray.
        """
        
        if prod(new_shape) != prod(self.shape) or not self._is_compact():
            raise ValueError("Invalid reshape")
        return self.as_strided(shape=new_shape, strides=NDArray.compact_strides(new_shape))

    def permute(self, new_axes):
        """
        Permute order of the dimensions.  new_axes describes a permutation of the
        existing axes, so e.g.:
          - If we have an array with dimension "BHWC" then .permute((0,3,1,2))
            would convert this to "BCHW" order.
          - For a 2D array, .permute((1,0)) would transpose the array.
        Like reshape, this operation should not copy memory, but achieves the
        permuting by just adjusting the shape/strides of the array.  That is,
        it returns a new array that has the dimensions permuted as desired, but
        which points to the same memory as the original array.
        Args:
            new_axes (tuple): permutation order of the dimensions
        Returns:
            NDArray : new NDArray object with permuted dimensions, pointing
            to the same memory as the original NDArray (i.e., just shape and
            strides changed).
        """
        
        new_shape = tuple(self.shape[i] for i in new_axes)
        new_strides = tuple(self.strides[i] for i in new_axes)
        return self.as_strided(shape=new_shape, strides=new_strides)

    def broadcast_to(self, new_shape):
        """
        Broadcast an array to a new shape.  new_shape's elements must be the
        same as the original shape, except for dimensions of this array whose
        size is 1 (which can then be broadcast to any size).  As with the
        previous calls, this will not copy memory, and just achieves
        broadcasting by manipulating the strides.
        Raises:
            AssertionError if new_shape[i] != shape[i] for any i where
            shape[i] != 1
        Args:
            new_shape (tuple): shape to broadcast to
        Returns:
            NDArray: the new NDArray object with the new broadcast shape; should
            point to the same memory as the original array.
        """
        
        for old_shape_i, new_shape_i in zip(self.shape, new_shape):
            if old_shape_i != 1:
                assert new_shape_i == old_shape_i
        new_strides = tuple(0 if old_shape_i == 1 else stride_i for old_shape_i, stride_i in zip(self.shape, self.strides))
        return self.as_strided(shape=new_shape, strides=new_strides)

    def _ewise_or_scalar(self, other: Union['NDArray', float], ewise_fn: Callable, scalr_fn: Callable) -> 'NDArray':
        """
        This private method applies an element-wise function (`ewise_fn`) to two `NDArray` instances, or a scalar function (`scalr_fn`) 
        to this `NDArray` and a scalar value. It returns a new `NDArray` instance with the results.
    
        Parameters
        ----------
        other : Union[NDArray, float]
            The second operand for the operation. It can be either another `NDArray` (for element-wise operations) or a scalar 
            (for scalar operations).
    
        ewise_fn : Callable
            A function to apply element-wise if `other` is an `NDArray`. This function should take two `NDArray` handles and 
            output a handle.
    
        scalr_fn : Callable
            A function to apply if `other` is a scalar. This function should take an `NDArray` handle and a scalar, and 
            output a handle.
    
        Returns
        -------
        NDArray
            A new `NDArray` instance with the results of the operation.
    
        Raises
        ------
        AssertionError
            If `other` is an `NDArray` but does not have the same shape as `self`.
        """
        out = NDArray.make(shape=self.shape, device=self.device)
        if isinstance(other, NDArray):
            assert self.shape == other.shape, f'operands could not be combined element-wise with shapes {self.shape} {other.shape}'
            ewise_fn(self.compact()._handle, other.compact()._handle, out._handle)
        else:
            scalr_fn(self.compact()._handle, other, out._handle)
        return out

    def __add__(self, other: Union['NDArray', float]) -> 'NDArray':
        """
        Performs element-wise addition between this array and `other`. If `other` is not an NDArray, it is treated as a scalar.

        Parameters
        ----------
        other : NDArray or scalar
            The other operand in the addition.

        Returns
        -------
        NDArray
            The result of the addition.

        Raises
        ------
        AssertionError
            If `other` is an NDArray and does not have the same shape as this array.
        """
        return self._ewise_or_scalar(other, ewise_fn=self.device.ewise_add, scalr_fn=self.device.scalar_add)

    def __sub__(self, other) -> 'NDArray':
        """
        Implements the subtract operation. This method performs element-wise subtraction between two NDArrays
        or an NDArray and a scalar.
    
        Parameters
        ----------
        other : NDArray or scalar
            The array or scalar to subtract from the current NDArray.
    
        Returns
        -------
        NDArray
            The resultant NDArray after performing subtraction.
        """
        return self + (-other)

    def __rsub__(self, other) -> 'NDArray':
        """
        Implements the reverse subtract operation. This is used when the NDArray is on the right side of a subtraction.
    
        Parameters
        ----------
        other : scalar
            The scalar to subtract the NDArray from.
    
        Returns
        -------
        NDArray
            The resultant NDArray after performing subtraction.
        """
        
        return (-self) + other

    def __mul__(self, other) -> 'NDArray':
        """
        Implements the multiply operation. This method performs element-wise multiplication between two NDArrays
        or an NDArray and a scalar.
    
        Parameters
        ----------
        other : NDArray or scalar
            The array or scalar to multiply with the current NDArray.
    
        Returns
        -------
        NDArray
            The resultant NDArray after performing multiplication.
        """
        return self._ewise_or_scalar(other, ewise_fn=self.device.ewise_mul, scalr_fn=self.device.scalar_mul)

    def __truediv__(self,  other) -> 'NDArray':
        """
        Implements the true divide operation. This method performs element-wise division between two NDArrays
        or an NDArray and a scalar.
    
        Parameters
        ----------
        other : NDArray or scalar
            The array or scalar to divide the current NDArray by.
    
        Returns
        -------
        NDArray
            The resultant NDArray after performing division.
        """
        return self._ewise_or_scalar(other, ewise_fn=self.device.ewise_div, scalr_fn=self.device.scalar_div)

    def __neg__(self):
        """
        Implements the negation operation. This method performs element-wise negation for self(NDArray).

        Parameters
        ----------
        self : NDArray
            The array to negate.
    
        Returns
        -------
        NDArray
            The resultant NDArray after performing negation.
        """
        
        return self * (-1)

    def __pow__(self, scalar) -> 'NDArray':
        out = NDArray.make(self.shape, device=self.device)  # device must be a keyword: the second positional argument is strides
        self.device.scalar_power(self.compact()._handle, scalar, out._handle)
        return out

    __radd__ = __add__
    __rmul__ = __mul__

    def maximum(self, other):
        return self._ewise_or_scalar(other, self.device.ewise_maximum, self.device.scalar_maximum)

    def __eq__(self, other):
        return self._ewise_or_scalar(other, self.device.ewise_eq, self.device.scalar_eq)

    def __ge__(self, other):
        return self._ewise_or_scalar(other, self.device.ewise_ge, self.device.scalar_ge)

    def __ne__(self, other):
        return 1 - (self == other)

    def __gt__(self, other):
        return (self >= other) * (self != other)

    def __lt__(self, other):
        return 1 - (self >= other)

    def __le__(self, other):
        return 1 - (self > other)
        

    def process_slice(self, sl, dim):
        """ Convert a slice to an explicit start/stop/step """
        start, stop, step = sl.start, sl.stop, sl.step
        if start is None:
            start = 0
        if start < 0:
            start = self.shape[dim] + start  # a negative start counts back from the end
        if stop is None:
            stop = self.shape[dim]
        if stop < 0:
            stop = self.shape[dim] + stop
        if step is None:
            step = 1

        # we're not gonna handle negative strides and that kind of thing
        assert stop > start, "Start must be less than stop"
        assert step > 0, "No support for negative increments"
        return slice(start, stop, step)

    def __getitem__(self, idxs):
        """
        Implements the get item operation to access elements or sub-arrays of our NDArray instance. 
        This method supports slicing and integer-based access similar to NumPy. It returns a new NDArray
        object that represents a view into the original array without copying memory.
    
        Raises
        ------
        AssertionError
            If a slice has negative size or step, or if the number of slices is not equal to the number of dimensions.
    
        Parameters
        ----------
        idxs : tuple
            A tuple of slice or integer elements corresponding to the subset of the matrix to get.
    
        Returns
        -------
        NDArray
            A new NDArray object corresponding to the selected subset of elements. This should not copy memory but 
            just manipulate the shape/strides/offset of the new array, referencing the same array as the original one.
        """

        # handle singleton as tuple, everything as slices
        if not isinstance(idxs, tuple):
            idxs = (idxs,)
        idxs = tuple(
            [
                self.process_slice(s, i) if isinstance(s, slice) else slice(s, s + 1, 1)
                for i, s in enumerate(idxs)
            ]
        )
        assert len(idxs) == self.ndim, "Need indexes equal to number of dimensions"
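        # ceil division so a partial final step still counts, e.g. 0:5:2 selects 3 elements
        shape = tuple((idx.stop - idx.start + idx.step - 1) // idx.step for idx in idxs)
        # start from this array's own offset so that slicing a view stays correct
        offset = (self._offset or 0) + sum(idx.start * stride for idx, stride in zip(idxs, self.strides))
        # each step along a sliced axis advances `step` elements of the original axis
        strides = tuple(idx.step * stride for idx, stride in zip(idxs, self.strides))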
        return NDArray.make(shape, strides=strides, device=self.device, handle=self._handle, offset=offset)

    def __setitem__(self, idxs, other):
        """
        Implements the set item operation to modify elements or sub-arrays of our NDArray instance. 
        This method supports slicing and integer-based access similar to NumPy. It modifies the original NDArray
        in place.
        -> uses same semantics as __getitem__().
    
        Parameters
        ----------
        idxs : tuple
            A tuple of slice or integer elements corresponding to the subset of the matrix to set.
    
        other : NDArray or scalar
            The array or scalar value to set into the specified subset of the matrix. If `other` is an NDArray,
            its shape should match the shape of the subset defined by `idxs`.
    
        Raises
        ------
        AssertionError
            If `other` is an NDArray and its shape does not match the shape of the subset defined by `idxs`.
        """
        view = self.__getitem__(idxs)
        if isinstance(other, NDArray):
            assert prod(view.shape) == prod(other.shape)
            self.device.ewise_setitem(
                other.compact()._handle,
                view._handle,
                view.shape,
                view.strides,
                view._offset,
            )
        else:
            self.device.scalar_setitem(
                other,
                view._handle,
                view.shape,
                view.strides,
                view._offset,
            )

The concept of "strides" is crucial to understanding how multi-dimensional arrays are stored and accessed in memory. The term "stride" refers to the number of elements (or steps) you need to move in memory to go from one element to the next along a particular axis of an array.

In PyTorch and many other libraries that work with multi-dimensional arrays, the strides of a tensor give the number of elements to skip in memory to move one step along each dimension. (NumPy is an exception: it reports strides in bytes rather than in elements.)

The strides of a tensor are defined as a tuple of integers, where each integer represents the step size for the corresponding dimension. 

Consider this example:

```python
x = torch.arange(24).reshape(2,3,4)
print(x.stride()) # outputs: (12, 4, 1)
```

Here, the tensor `x` has a shape of (2,3,4), and the stride is (12,4,1).

- The first element of the stride tuple, 12, tells you that you need to step over 12 elements in memory to get from one element to the next along the first axis (axis=0, the one that has size 2). This makes sense, because there are 12 elements in each "block" of this dimension (3*4 = 12).

- The second element, 4, says that you need to step over 4 elements in memory to move from one element to the next along the second axis (axis=1, the one that has size 3). This is because there are 4 elements in each "row" of this dimension.

- The last element, 1, shows that you only need to step over 1 element in memory to move from one element to the next along the last axis (axis=2, the one that has size 4). This is because elements along this axis are contiguous in memory.
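
The arithmetic in the list above can be checked without PyTorch. The following is a minimal pure-Python sketch: `compact_strides` here is a standalone helper written for illustration (it follows the same row-major rule as `NDArray.compact_strides`), and `flat_index` is a hypothetical name for the usual "index times stride" sum.

```python
from math import prod

def compact_strides(shape):
    """Row-major strides, counted in elements, for a contiguous array."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)   # the last axis moves one element at a time
        step *= dim            # each earlier axis skips a whole block
    return tuple(reversed(strides))

def flat_index(index, strides, offset=0):
    """Position in the flat buffer of the element at a multi-dimensional index."""
    return offset + sum(i * s for i, s in zip(index, strides))

shape = (2, 3, 4)
strides = compact_strides(shape)
buffer = list(range(prod(shape)))   # stands in for torch.arange(24)

print(strides)                                  # (12, 4, 1)
print(buffer[flat_index((1, 2, 3), strides)])   # 23
print(buffer[flat_index((1, 0, 0), strides)])   # 12
```

Moving one step along axis 0 adds 12 to the flat position, one step along axis 1 adds 4, and one step along axis 2 adds 1, which matches the stride tuple.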

In conclusion, the concept of strides is critical for efficient storage and computations on multi-dimensional arrays, as it allows libraries like PyTorch to perform complex operations without needing to actually rearrange or copy any data in memory.
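
Two operations that work purely on strides are transposing and broadcasting, which is the idea behind `permute` and `broadcast_to` in the class above. The sketch below is only an illustration on plain Python lists; `element` is a hypothetical helper, not part of minima.

```python
def element(buffer, index, strides, offset=0):
    """Read one element of a strided view from the flat buffer."""
    return buffer[offset + sum(i * s for i, s in zip(index, strides))]

buffer = [0, 1, 2, 3, 4, 5]          # a 2x3 array stored row by row
shape, strides = (2, 3), (3, 1)

# Transpose: swap the shape and stride entries, leave the buffer alone.
t_shape, t_strides = (shape[1], shape[0]), (strides[1], strides[0])
transposed = [[element(buffer, (i, j), t_strides) for j in range(t_shape[1])]
              for i in range(t_shape[0])]
print(transposed)   # [[0, 3], [1, 4], [2, 5]]

# Broadcast: a (1, 3) row viewed as (4, 3) by giving the size-1 axis stride 0.
row = [10, 20, 30]
b_shape, b_strides = (4, 3), (0, 1)
broadcast = [[element(row, (i, j), b_strides) for j in range(b_shape[1])]
             for i in range(b_shape[0])]
print(broadcast)    # four rows, each [10, 20, 30]
```

A stride of 0 means that stepping along that axis never moves in memory, so every row of the broadcast view reads the same three elements.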

Let's start with a simple 1D case. Imagine we have an array of size 10 and we want to select every second element. Instead of physically copying every second element to a new array, we could just create a new "view" of the array with a stride of 2. This means that to move to the next element in our sliced array, we jump over 2 elements in the original data. 
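
As a rough sketch of this 1D case, the "view" below is nothing more than an offset, a stride, and a length kept next to the original list. The names `view_get`, `offset`, `stride`, and `length` are illustrative only.

```python
data = list(range(10))      # the original buffer: 0, 1, ..., 9

# View metadata only; no element of `data` is copied.
offset, stride = 0, 2
length = (len(data) - offset + stride - 1) // stride   # ceil division

def view_get(i):
    """Read element i of the view straight from the original buffer."""
    return data[offset + i * stride]

print([view_get(i) for i in range(length)])   # [0, 2, 4, 6, 8]

data[4] = -1                # a write to the buffer is visible through the view
print(view_get(2))          # -1
```

Because the view shares the buffer, changing `data[4]` changes what the view returns at position 2.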

For multi-dimensional arrays, the principle is the same but each dimension has its own stride. When you slice a tensor, you're essentially creating a new tensor (a view) that starts from a different offset and possibly uses different strides. 

Consider a 2D case: if you slice the first dimension (e.g., `array[1:, :]`), you're changing the offset to start from the second element along that dimension. Essentially, you're jumping over a number of elements equal to the stride of that dimension. 

If you slice the second dimension (e.g., `array[:, ::2]` to select every second column), you're not changing the offset, but you're doubling the stride for the second dimension. This tells the tensor to skip one element in memory for every step in that dimension, giving you every second column.
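
Both 2D cases can be reproduced by computing only the view's metadata. This is a simplified sketch of the bookkeeping that `__getitem__` performs, assuming one slice per dimension and a positive step; `slice_view` is a hypothetical helper, and it relies on Python's built-in `slice.indices` to fill in missing bounds.

```python
def slice_view(shape, strides, offset, slices):
    """Return (shape, strides, offset) of the view selected by `slices`.

    Assumes one slice per dimension and step > 0.
    """
    new_shape, new_strides, new_offset = [], [], offset
    for dim, stride, sl in zip(shape, strides, slices):
        start, stop, step = sl.indices(dim)   # resolves None bounds against dim
        new_shape.append(max(0, (stop - start + step - 1) // step))
        new_strides.append(stride * step)     # a larger step means a larger stride
        new_offset += start * stride          # skipped elements move the start
    return tuple(new_shape), tuple(new_strides), new_offset

shape, strides = (4, 6), (6, 1)

# array[1:, :] skips one row: the offset moves by the row stride.
print(slice_view(shape, strides, 0, (slice(1, None), slice(None))))
# ((3, 6), (6, 1), 6)

# array[:, ::2] keeps the offset and doubles the column stride.
print(slice_view(shape, strides, 0, (slice(None), slice(None, None, 2))))
# ((4, 3), (6, 2), 0)
```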

In summary, slicing doesn't involve any data copying. Instead, it changes the starting point (offset) and how you move along each dimension (stride) of the tensor. This makes slicing operations very efficient, even on large tensors.
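
The one place where data does get copied is when a view has to be made contiguous again, which is what `compact()` is for: the element-wise methods above call it before handing a buffer to the backend. The sketch below shows the idea on a plain list; `compact_copy` is a hypothetical stand-in and not the backend's actual `compact` routine.

```python
from itertools import product

def compact_copy(buffer, shape, strides, offset):
    """Gather a strided view into a new contiguous (row-major) list."""
    out = []
    # product() varies the last index fastest, which is row-major order
    for index in product(*(range(n) for n in shape)):
        out.append(buffer[offset + sum(i * s for i, s in zip(index, strides))])
    return out

buffer = list(range(24))            # a 4x6 array stored row by row

# Metadata of the view array[1:, ::2]: rows 1..3, every second column.
view_shape, view_strides, view_offset = (3, 3), (6, 2), 6
print(compact_copy(buffer, view_shape, view_strides, view_offset))
# [6, 8, 10, 12, 14, 16, 18, 20, 22]
```

After the copy, the result has compact strides `(3, 1)` and offset 0, so a kernel that assumes contiguous memory can read it directly.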

In [None]:
import nbdev; nbdev.nbdev_export()