## References

- [Indexing - Scipy/Numpy Manual](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html)
- [Slicing Lecture](http://www.scipy-lectures.org/intro/numpy/numpy.html)
- [Viewing vs Copying](http://scipy-cookbook.readthedocs.io/items/ViewsVsCopies.html)
- [Broadcasting](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)
- [ufunc](https://docs.scipy.org/doc/numpy/reference/ufuncs.html)
- [Structured arrays](https://docs.scipy.org/doc/numpy/user/basics.rec.html)

# Notes

- Three types of indexing: field access, basic slicing, advanced indexing.

-------

## Basic Slicing and Indexing

- `arr[obj]` where obj is a slice (`start:stop:step`), an integer, or a tuple of slice objects and integers.
- Basic slicing is also initiated if the selection object is any non-`ndarray` sequence (such as a `list`) containing `slice` objects, the `Ellipsis` object, or the `newaxis` object, but not for integer arrays or other embedded sequences.
- All arrays generated by basic slicing are always [views](https://docs.scipy.org/doc/numpy/glossary.html#term-view) of the original array.
  - A `view` is just another way of looking at the original `data` (the same chunk of memory).
  - That is to say, a new view differs from the original only in metadata such as its `strides` (along with shape and start offset).
  - In other words, any selection that can be expressed by modifying the `strides` is basic slicing.
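
The view behaviour described above can be checked directly. The following is a minimal sketch; the names `view` and `independent` are illustrative only, and `np.shares_memory` is used to test whether two arrays overlap in memory.

```python
import numpy as np

x = np.arange(10)
view = x[1:7:2]                      # basic slicing: same buffer, different strides

# The step of 2 shows up as a stride twice as large as the original one.
print(view.strides[0] // x.strides[0])   # 2
print(np.shares_memory(x, view))         # True

# Writing through the view changes the original array.
view[0] = 100
print(x[1])                              # 100

# An explicit copy breaks the link to x.
independent = x[1:7:2].copy()
independent[0] = -1
print(x[1])                              # 100 (unchanged)
```

Calling `.copy()` is the usual way to detach a slice when later writes must not leak back into the source array.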

## <font color="#006600">Summary</font>
Basic slicing is triggered by:
1. `x[i:j:k]` syntax (equivalent to `x[slice(i, j, k)]`)
2. `x[i]` where `i` is an integer.
3. `x[(a,b,c,d,...)]` where `a`, `b`, `c`, `d` and ... can be integers or slice objects **only**.
  - including `...` and `np.newaxis`
4. `x[seqobj]` where `seqobj` is a non-`ndarray` sequence object containing `slice` objects, `Ellipsis` or `newaxis` but not integer arrays.
  - ex: `x[[1,2,3]] --> advanced, x[[1, 2, newaxis]] --> basic`
  - ex: `x[(1,2,3),] --> advanced` (the index is a tuple containing a sequence object, i.e. `((1,2,3),:,:,...)`), `x[(1, 2, 3)] --> basic`
  - Note: rule 4 reflects older NumPy releases. Newer releases deprecated the non-tuple form and treat a list index as an index array, so prefer an explicit tuple such as `x[1, 2, np.newaxis]`.
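
The tuple-versus-list distinction in the summary can be sketched as follows. This is only an illustration on a hypothetical 3x4 array `m`; it sticks to forms that behave the same way across NumPy versions.

```python
import numpy as np

m = np.arange(12).reshape(3, 4)

# A tuple of integers is the same as writing m[0, 1]: basic indexing.
a = m[(0, 1)]
print(a)                             # 1

# A list of integers is an index array: advanced indexing picks rows 0 and 1.
b = m[[0, 1]]
print(b.shape)                       # (2, 4)
print(np.shares_memory(m, b))        # False (a copy)

# A tuple of slices and integers is basic indexing, so the result is a view.
c = m[(slice(0, 2), 1)]              # same as m[0:2, 1]
print(c.tolist())                    # [1, 5]
print(np.shares_memory(m, c))        # True
```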

In [1]:
from __future__ import print_function
import numpy as np

In [2]:
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
c = x[1:7:2]

In [3]:
x.dtype

dtype('int64')

In [4]:
str(x.data)

'\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x05\x00\x00\x00\x00\x00\x00\x00\x06\x00\x00\x00\x00\x00\x00\x00\x07\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\t\x00\x00\x00\x00\x00\x00\x00'

In [5]:
c.base is x # x[1:7:2] is a view of x

True

In [6]:
x.strides # strides of x; there is only one number since x is one-dimensional.

(8,)

In [7]:
c.strides

(16,)

In [8]:
c

array([1, 3, 5])

In [9]:
x_ = x.reshape((2, 5)) 

In [10]:
# reshape returns a view of the original array with different strides whenever possible (as it is here), not a new array
x_.base is x

True

In [11]:
x_.strides

(40, 8)

In [12]:
x[()].base is None # a new view of original array

False

In [13]:
xx = x           # a reference to original array (not copy)
x.base is None

True

- Negative i and j are interpreted as n + i and n + j where n is the number of elements in the corresponding dimension. Negative k makes stepping go **towards smaller indices**.

In [14]:
x[-2:10] # == x[10-2:10]

array([8, 9])

In [15]:
x[-3:3:-1]

array([7, 6, 5, 4])

- For slicing on one axis (`i:j:k`), the following rules hold:
  - If i is not given it defaults to 0 for k > 0 and n - 1 for k < 0.
  - If j is not given it defaults to n for k > 0 and -n - 1 for k < 0.
  - If k is not given it defaults to 1.

In [16]:
x[::-1]

array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
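
Python's built-in `slice.indices(n)` offers one way to see how omitted or negative `i`, `j`, `k` get resolved for an axis of length `n`. The sketch below assumes `n = 10` to match `x`. The stop value it returns for a negative step is a literal bound suitable for `range`, not an index to be wrapped again.

```python
import numpy as np

n = 10
x = np.arange(n)

print(slice(None, None, None).indices(n))   # (0, 10, 1)
print(slice(None, None, -1).indices(n))     # (9, -1, -1)
print(slice(-3, 3, -1).indices(n))          # (7, 3, -1)
print(slice(-2, 10).indices(n))             # (8, 10, 1)

# The resolved triple can be fed to range() to list the selected positions.
start, stop, step = slice(-3, 3, -1).indices(n)
positions = list(range(start, stop, step))
print(positions)                            # [7, 6, 5, 4]
print(np.array_equal(x[-3:3:-1], x[positions]))  # True
```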

- If the number of objects in the selection tuple is less than `N` (the number of dimensions, `x.ndim`), then `:` is assumed for any subsequent dimensions.

In [17]:
x2 = np.array([[[1],[2],[3]], [[4],[5],[6]]])
x2[1:2]

array([[[4],
        [5],
        [6]]])

In [18]:
x2[1:2,:,:]

array([[[4],
        [5],
        [6]]])

- Ellipsis (`...`) expands to the number of `:` objects needed to make a selection tuple of the same length as `x.ndim`. There may only be a single ellipsis present.

In [19]:
x2[..., 0]

array([[1, 2, 3],
       [4, 5, 6]])

In [20]:
x3 = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 1, 1], [2, 2, 3]]])

In [21]:
x3.shape

(2, 2, 3)

In [22]:
x3[0,...]

array([[1, 2, 3],
       [4, 5, 6]])

In [23]:
x3[...,1]

array([[2, 5],
       [1, 2]])

- Each `newaxis` object in the selection tuple serves to expand the dimensions of the resulting selection by one unit-length dimension.

In [24]:
np.newaxis is None

True

In [25]:
x3[:,np.newaxis,...].shape

(2, 1, 2, 3)
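
A common use of `newaxis` is to line arrays up for broadcasting. The sketch below builds a small multiplication table from a hypothetical 1-D array `v`; `col`, `row` and `table` are names chosen for this example.

```python
import numpy as np

v = np.array([1, 2, 3])

col = v[:, np.newaxis]     # shape (3, 1)
row = v[np.newaxis, :]     # shape (1, 3)
print(col.shape, row.shape)          # (3, 1) (1, 3)

# (3, 1) * (1, 3) broadcasts to (3, 3).
table = col * row
print(table.tolist())                # [[1, 2, 3], [2, 4, 6], [3, 6, 9]]

# Adding an axis is still basic indexing, so col is a view of v.
print(np.shares_memory(v, col))      # True
```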

- An integer, `i`, returns the same values as `i:i+1` except the dimensionality of the returned object is reduced by 1.

In [26]:
x3[0].shape

(2, 3)

In [27]:
x3[0:1].shape

(1, 2, 3)

In [28]:
x3[0,:,:].shape

(2, 3)

----

## Advanced Indexing

- Advanced indexing always returns a *copy* of the selected data, whereas basic slicing returns a view.
- When the selection `obj` in `arr[obj]` falls into one of the following cases, advanced indexing is triggered:
  1. non-tuple sequence object (e.g. `list`)
  2. `numpy.ndarray` with `dtype` of `int` or `bool`
  3. tuple with at least one `ndarray` or sequence object

ex:
```
x[[1,2,3]] # <-- case 1
x[np.array([1, 2, 3])] # <- case 2
x[(1, 2, [1, 2])] # <-- case 3
x[(1,2,3),] # <-- case 3: a 1-tuple whose only element is the sequence (1, 2, 3)
```
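
The copy semantics and the `bool` case listed above can be sketched as follows. The names `picked`, `mask` and `evens` are illustrative. Note that reading through an advanced index gives a copy, while assigning through one still writes into the original array.

```python
import numpy as np

x = np.arange(10)

picked = x[[1, 2, 3]]          # list of ints: advanced indexing
mask = x % 2 == 0              # boolean ndarray with the same shape as x
evens = x[mask]                # boolean array: advanced indexing

print(evens)                           # [0 2 4 6 8]
print(np.shares_memory(x, picked))     # False

picked[0] = 100                # changes the copy only
print(x[1])                            # 1

# Assignment with an advanced index on the left-hand side modifies x in place.
x[mask] = -1
print(x.tolist())                      # [-1, 1, -1, 3, -1, 5, -1, 7, -1, 9]
```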

### Integer Arrays Indexing

- Index consists of as many integer arrays as the array being indexed has dimensions.
- Each integer array represents a number of indexes into that dimension.

In [29]:
arr = np.array([[1, 2], [3, 4], [5, 6]]) # shape (3, 2)
c = arr[[0, 1, 2], [0, 1, 0]]
c.shape
# according to broadcasting rule, index1 ([0, 1, 2]) is of shape (3,)
# and index2 ([0, 1, 0]) is also of shape (3,), the resulting slice
# is of shape (3,)

(3,)

In [30]:
c.base is None # advanced slice

True

In [31]:
c

array([1, 4, 5])

In [32]:
# in this case, index1 is of shape (2, 1) and index2 is 
# of shape (3,), the broadcasting rule applies and the 
# resulting array is of shape (2, 3).
arr[np.array([0, 2])[:, None], [0, 1, 0]]

array([[1, 2, 1],
       [5, 6, 5]])

In [33]:
arr.base is None # arr itself owns its data, so this is True regardless of the advanced indexing above

True
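
The shape of an integer-array indexing result follows from broadcasting the index arrays against each other. A minimal sketch, using `np.broadcast` to compute that shape and a plain nested loop as a reference; `rows`, `cols` and `expected` are names introduced for this example:

```python
import numpy as np

arr = np.array([[1, 2], [3, 4], [5, 6]])     # shape (3, 2)
rows = np.array([0, 2])[:, None]             # shape (2, 1)
cols = np.array([0, 1, 0])                   # shape (3,)

# The index arrays broadcast to a common shape, which is the result shape.
print(np.broadcast(rows, cols).shape)        # (2, 3)

result = arr[rows, cols]
print(result.shape)                          # (2, 3)

# Element [i, j] of the result is arr[rows[i, 0], cols[j]].
expected = np.array([[arr[r, c] for c in cols] for r in rows[:, 0]])
print(np.array_equal(result, expected))      # True

# The result is a copy and does not overlap arr in memory.
print(np.shares_memory(arr, result))         # False
```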

In [34]:
# ex: select corner
arr = np.array([[0,  1,  2],
                [3,  4,  5],
                [6,  7,  8],
                [9, 10, 11]])
index1 = np.array([0,3])[:,None] # [:, None] makes index1 a (2, 1) column so it broadcasts against index2
index2 = np.array([0, 2])
arr[index1, index2]

array([[ 0,  2],
       [ 9, 11]])

In [36]:
print(np.ix_.__doc__)


    Construct an open mesh from multiple sequences.

    This function takes N 1-D sequences and returns N outputs with N
    dimensions each, such that the shape is 1 in all but one dimension
    and the dimension with the non-unit shape value cycles through all
    N dimensions.

    Using `ix_` one can quickly construct index arrays that will index
    the cross product. ``a[np.ix_([1,3],[2,5])]`` returns the array
    ``[[a[1,2] a[1,5]], [a[3,2] a[3,5]]]``.

    Parameters
    ----------
    args : 1-D sequences

    Returns
    -------
    out : tuple of ndarrays
        N arrays with N dimensions each, with N the number of input
        sequences. Together these arrays form an open mesh.

    See Also
    --------
    ogrid, mgrid, meshgrid

    Examples
    --------
    >>> a = np.arange(10).reshape(2, 5)
    >>> a
    array([[0, 1, 2, 3, 4],
           [5, 6, 7, 8, 9]])
    >>> ixgrid = np.ix_([0,1], [2,4])
    >>> ixgrid
    (array([[0],
           [1]]), array([[2, 4]]))
   

In [41]:
arr_2d = np.array([[0, 1, 2], [3, 4, 5]])

In [42]:
grid = np.ix_([0], [0, 2])
arr_2d[grid]

array([[0, 2]])

In [46]:
arr_3d = np.array([[[0, 1, 2], [3, 4, 5]], [[6, 7, 8], [9, 10, 11]]])

In [47]:
arr_3d.shape

(2, 2, 3)

In [48]:
grid = np.ix_([0], [1], [0, 2])
arr_3d[grid]

array([[[3, 5]]])
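
`np.ix_` builds the same broadcastable index arrays that the corner-selection example wrote out by hand with `[:, None]`. The sketch below compares the two forms on the same 4x3 array; `manual` and `via_ix` are illustrative names.

```python
import numpy as np

arr = np.arange(12).reshape(4, 3)

# Hand-written open mesh: a (2, 1) column of row indices and a (2,) row of column indices.
manual = arr[np.array([0, 3])[:, None], np.array([0, 2])]

# np.ix_ produces index arrays of shapes (2, 1) and (1, 2).
grid = np.ix_([0, 3], [0, 2])
print([g.shape for g in grid])           # [(2, 1), (1, 2)]

via_ix = arr[grid]
print(via_ix.tolist())                   # [[0, 2], [9, 11]]
print(np.array_equal(manual, via_ix))    # True
```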

## Universal Function

In [50]:
def my_abs(x):
    return x if x > 0 else -x

In [51]:
my_abs(-3)

3

In [52]:
print(np.frompyfunc.__doc__)

frompyfunc(func, nin, nout)

    Takes an arbitrary Python function and returns a NumPy ufunc.

    Can be used, for example, to add broadcasting to a built-in Python
    function (see Examples section).

    Parameters
    ----------
    func : Python function object
        An arbitrary Python function.
    nin : int
        The number of input arguments.
    nout : int
        The number of objects returned by `func`.

    Returns
    -------
    out : ufunc
        Returns a NumPy universal function (``ufunc``) object.

    See Also
    --------
    vectorize : evaluates pyfunc over input arrays using broadcasting rules of numpy

    Notes
    -----
    The returned ufunc always returns PyObject arrays.

    Examples
    --------
    Use frompyfunc to add broadcasting to the Python function ``oct``:

    >>> oct_array = np.frompyfunc(oct, 1, 1)
    >>> oct_array(np.array((10, 30, 100)))
    array([012, 036, 0144], dtype=object)
    >>> np.array((oct(10), oct(30), oct(100))) # for c

In [53]:
umy_abs = np.frompyfunc(my_abs, 1, 1)

In [54]:
umy_abs([-1, 2, -3])

array([1, 2, 3], dtype=object)
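
As the docstring notes, a ufunc made with `frompyfunc` always returns object arrays. One way to get a numeric array back is to convert the result with `astype`. This is a minimal sketch that redefines `my_abs` so it runs on its own; `out` and `as_int` are illustrative names.

```python
import numpy as np

def my_abs(x):
    # Same behaviour as the my_abs defined in the notes.
    return x if x > 0 else -x

umy_abs = np.frompyfunc(my_abs, 1, 1)

out = umy_abs(np.array([-1, 2, -3]))
print(out.dtype)                     # object

# Convert the object array to a numeric dtype for further numeric work.
as_int = out.astype(np.int64)
print(as_int.dtype)                  # int64
print(as_int.tolist())               # [1, 2, 3]
```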

In [55]:
print(np.vectorize.__doc__)


    vectorize(pyfunc, otypes=None, doc=None, excluded=None, cache=False,
              signature=None)

    Generalized function class.

    Define a vectorized function which takes a nested sequence of objects or
    numpy arrays as inputs and returns an single or tuple of numpy array as
    output. The vectorized function evaluates `pyfunc` over successive tuples
    of the input arrays like the python map function, except it uses the
    broadcasting rules of numpy.

    The data type of the output of `vectorized` is determined by calling
    the function with the first element of the input.  This can be avoided
    by specifying the `otypes` argument.

    Parameters
    ----------
    pyfunc : callable
        A python function or method.
    otypes : str or list of dtypes, optional
        The output data type. It must be specified as either a string of
        typecode characters or a list of data type specifiers. There should
        be one data type specifier for each output.

In [79]:
vmy_abs = np.vectorize(my_abs, otypes=[float])  # np.float was only an alias for the builtin float and has been removed from NumPy

In [80]:
vmy_abs([1.0, 2.0, 3.0])

array([ 1.,  2.,  3.])

In [81]:
l = [2*np.random.random() - 1 for _ in range(1000000)]

In [82]:
len(l)

1000000

------

### Simple Comparison

In [84]:
%timeit np.abs(l)

10 loops, best of 3: 49.3 ms per loop


In [85]:
%timeit list(map(abs, l))

10 loops, best of 3: 84.8 ms per loop


In [83]:
%timeit vmy_abs(l)

1 loop, best of 3: 314 ms per loop
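
The timings above pass a Python list to every function, so the NumPy calls presumably also pay for converting the list to an array on each call. The sketch below repeats the comparison outside IPython with the standard `timeit` module and adds a pre-converted `ndarray` case. The names `values`, `arr`, `vabs` and `candidates` are illustrative. Actual numbers depend on the machine and NumPy version, so none are quoted here.

```python
import timeit
import numpy as np

values = [2 * np.random.random() - 1 for _ in range(100000)]
arr = np.asarray(values)                     # convert once, outside the timed code

vabs = np.vectorize(abs, otypes=[float])     # Python-level loop wrapped by NumPy

candidates = {
    "np.abs(list)": lambda: np.abs(values),
    "np.abs(ndarray)": lambda: np.abs(arr),
    "list(map(abs, list))": lambda: list(map(abs, values)),
    "np.vectorize(abs)(ndarray)": lambda: vabs(arr),
}

# Best of 3 repeats, 5 calls per repeat, reported as milliseconds per call.
timings = {}
for name, fn in candidates.items():
    best = min(timeit.repeat(fn, number=5, repeat=3))
    timings[name] = best / 5 * 1e3

for name, ms in timings.items():
    print("{:28s} {:8.3f} ms".format(name, ms))
```

All four candidates compute the same values; only the cost of the loop and of the list-to-array conversion differs.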
