Today we'll cover:

1. [Manipulating arrays](#Manipulating-arrays)
2. [Universal functions (ufuncs)](#Universal-functions)
3. [Sorting and searching](#Sorting-and-searching)

# Manipulating arrays

We have already seen the `reshape` method for arrays.

In [1]:
import numpy as np

In [2]:
print np.arange(12).reshape((3, 4))

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


Equivalently, one can use NumPy's `reshape` function.

In [3]:
row_major = np.reshape(np.arange(12), (3, 4)); print row_major

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


The reshaping happens in [row-major order](http://en.wikipedia.org/wiki/Row-major_order), which is the order the C language uses. You can choose column-major order instead by passing `order='F'`. The `'F'` stands for Fortran, a language that uses column-major order.

In [4]:
col_major = np.reshape(np.arange(12), (3, 4), order='F'); print col_major

[[ 0  3  6  9]
 [ 1  4  7 10]
 [ 2  5  8 11]]
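
To make the two orders concrete, here is a minimal sketch that checks which flat position each element `(i, j)` comes from. The names `flat`, `n_rows` and `n_cols` are introduced here for illustration, and the `print(...)` calls take a single argument so the block also runs under Python 3.

```python
import numpy as np

flat = np.arange(12)
n_rows, n_cols = 3, 4

row_major = flat.reshape((n_rows, n_cols))             # order='C' is the default
col_major = flat.reshape((n_rows, n_cols), order='F')  # Fortran (column-major) order

# Row-major:    element (i, j) is taken from flat position i * n_cols + j.
# Column-major: element (i, j) is taken from flat position i + j * n_rows.
for i in range(n_rows):
    for j in range(n_cols):
        assert row_major[i, j] == flat[i * n_cols + j]
        assert col_major[i, j] == flat[i + j * n_rows]

# A column-major reshape to (3, 4) equals the transpose of a row-major reshape to (4, 3).
same = np.array_equal(col_major, flat.reshape((n_cols, n_rows)).T)
print(same)  # True
```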


The function `ravel` does the opposite of `reshape`. It returns a flattened (1d) array.

In [5]:
print np.ravel(row_major)

[ 0  1  2  3  4  5  6  7  8  9 10 11]


In [6]:
print np.ravel(col_major)  # this will not give range(12) since we didn't specify the correct order

[ 0  3  6  9  1  4  7 10  2  5  8 11]


In [7]:
print np.ravel(col_major, order='F')  # flatten using column major (Fortran) order

[ 0  1  2  3  4  5  6  7  8  9 10 11]
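
One detail worth knowing is whether the flattened result shares memory with the original. The sketch below compares `ravel` with the `flatten` method (not used elsewhere in this section), assuming a NumPy version that provides `np.shares_memory`.

```python
import numpy as np

row_major = np.arange(12).reshape((3, 4))

raveled = row_major.ravel()            # a view here, because the data is already in C order
flattened = row_major.flatten()        # flatten always returns a copy
f_order = row_major.ravel(order='F')   # a different element order forces a copy

print(np.shares_memory(row_major, raveled))    # True
print(np.shares_memory(row_major, flattened))  # False
print(np.shares_memory(row_major, f_order))    # False
```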


The `T` attribute of ndarrays gives the transpose.

In [8]:
print row_major.T

[[ 0  4  8]
 [ 1  5  9]
 [ 2  6 10]
 [ 3  7 11]]


The `transpose()` method does the same.

In [9]:
transposed = row_major.transpose()

In [10]:
transposed[0, 0] = 100; print row_major  # transpose() returns a view, not a copy

[[100   1   2   3]
 [  4   5   6   7]
 [  8   9  10  11]]


In [11]:
transposed[0, 0] = 0; print row_major  # convert the 100 back to 0

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
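
If you need a transpose that can be modified without touching the original, take an explicit copy. A small sketch, assuming a NumPy version that provides `np.shares_memory`:

```python
import numpy as np

row_major = np.arange(12).reshape((3, 4))

view = row_major.T            # shares its data with row_major
copy = row_major.T.copy()     # independent data

print(np.shares_memory(row_major, view))  # True
print(np.shares_memory(row_major, copy))  # False

copy[0, 0] = 100              # only the copy changes
print(row_major[0, 0])        # 0
```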


In [12]:
second_row = row_major[[1]]; print second_row  # returns a 2-d array of shape (1, 4)

[[4 5 6 7]]


In [13]:
second_row.shape

(1, 4)

In [14]:
squeezed = np.squeeze(second_row); print squeezed  # squeeze eliminates any dimensions of length 1

[4 5 6 7]


In [15]:
squeezed.shape  # now it is a 1-d array

(4,)
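
Going the other way, a length-1 dimension can be added with `np.newaxis` or `np.expand_dims`, and `squeeze` accepts an `axis` argument to remove only one particular dimension. The array `cube` below is made up for illustration.

```python
import numpy as np

row = np.arange(4, 8)                          # shape (4,)

as_row = row[np.newaxis, :]                    # shape (1, 4)
as_col = np.expand_dims(row, axis=1)           # shape (4, 1)
print(as_row.shape)                            # (1, 4)
print(as_col.shape)                            # (4, 1)

cube = np.zeros((1, 4, 1))
all_squeezed = np.squeeze(cube)                # drops every length-1 dimension
first_squeezed = np.squeeze(cube, axis=0)      # drops only the first dimension
print(all_squeezed.shape)                      # (4,)
print(first_squeezed.shape)                    # (4, 1)
```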

Now let us learn a few stacking functions:

* `column_stack`: put 1-d arrays together as columns of a 2-d array
* `dstack`: put 2-d arrays together depthwise to create a 3-d array
* `vstack`: put 2-d arrays together vertically (number of rows increases)
* `hstack`: put 2-d arrays together horizontally (number of columns increases)

In [16]:
vec1 = np.arange(4); print vec1

[0 1 2 3]


In [17]:
vec2 = np.arange(5, 9); print vec2

[5 6 7 8]


In [18]:
mat1 = np.column_stack((vec1, vec2)); print mat1

[[0 5]
 [1 6]
 [2 7]
 [3 8]]


In [19]:
mat2 = 2*mat1; print mat2

[[ 0 10]
 [ 2 12]
 [ 4 14]
 [ 6 16]]


In [20]:
mat_3d = np.dstack((mat1, mat2)); print mat_3d.shape  # returns a 3-d array

(4, 2, 2)


In [21]:
print mat_3d[:,:,0]  # this is indeed mat1

[[0 5]
 [1 6]
 [2 7]
 [3 8]]


In [22]:
print mat_3d[:,:,1]  # this is indeed mat2

[[ 0 10]
 [ 2 12]
 [ 4 14]
 [ 6 16]]


In [23]:
print np.vstack((mat1, mat2))  # stack vertically

[[ 0  5]
 [ 1  6]
 [ 2  7]
 [ 3  8]
 [ 0 10]
 [ 2 12]
 [ 4 14]
 [ 6 16]]


In [24]:
print np.hstack((mat1, mat2))  # stack horizontally

[[ 0  5  0 10]
 [ 1  6  2 12]
 [ 2  7  4 14]
 [ 3  8  6 16]]
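
For 2-d inputs, the stacking functions above can be expressed with the more general `np.concatenate` and an explicit `axis`. The following is a sketch of that correspondence, not a statement about how NumPy implements them internally.

```python
import numpy as np

mat1 = np.column_stack((np.arange(4), np.arange(5, 9)))
mat2 = 2 * mat1

v = np.concatenate((mat1, mat2), axis=0)   # more rows, like vstack
h = np.concatenate((mat1, mat2), axis=1)   # more columns, like hstack
# For the depthwise case, give each array a trailing length-1 axis first.
d = np.concatenate((mat1[:, :, np.newaxis], mat2[:, :, np.newaxis]), axis=2)

print(np.array_equal(v, np.vstack((mat1, mat2))))  # True
print(np.array_equal(h, np.hstack((mat1, mat2))))  # True
print(np.array_equal(d, np.dstack((mat1, mat2))))  # True
```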


A smaller array can be tiled to produce bigger arrays.

In [25]:
two_by_two = np.arange(4).reshape((2, 2)); print two_by_two

[[0 1]
 [2 3]]


In [26]:
print np.tile(two_by_two, (2, 2))  # tile in both directions

[[0 1 0 1]
 [2 3 2 3]
 [0 1 0 1]
 [2 3 2 3]]


In [27]:
print np.tile(two_by_two, 4)  # tile horizontally

[[0 1 0 1 0 1 0 1]
 [2 3 2 3 2 3 2 3]]


In [28]:
print np.tile(two_by_two, (4, 1))  # tile vertically

[[0 1]
 [2 3]
 [0 1]
 [2 3]
 [0 1]
 [2 3]
 [0 1]
 [2 3]]
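
As a small application, tiling a 2x2 pattern gives a checkerboard. The names `unit` and `board` are made up for this sketch.

```python
import numpy as np

unit = np.array([[0, 1],
                 [1, 0]])
board = np.tile(unit, (4, 4))    # repeat the pattern 4 times in each direction
print(board.shape)               # (8, 8)

# Cross-check: on a checkerboard, the entry at (i, j) is (i + j) % 2.
rows, cols = np.indices(board.shape)
print(np.array_equal(board, (rows + cols) % 2))  # True
```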


A very useful function is `unique()`, which returns a sorted 1-d array containing the unique elements of an array.

In [29]:
magic_chars = np.array([c for c in 'abracadabra']); print magic_chars

['a' 'b' 'r' 'a' 'c' 'a' 'd' 'a' 'b' 'r' 'a']


In [30]:
unique_chars = np.unique(magic_chars); print unique_chars

['a' 'b' 'c' 'd' 'r']


In [31]:
u, indices = np.unique(magic_chars, return_index=True)

In [32]:
print indices  # indices of elements of u w.r.t. original array

[0 1 4 6 2]


In [33]:
u, indices = np.unique(magic_chars, return_inverse=True)

In [34]:
print indices  # indices of elements of original w.r.t. u

[0 1 4 0 2 0 3 0 1 4 0]


In [35]:
print u[indices]  # gives back the original array

['a' 'b' 'r' 'a' 'c' 'a' 'd' 'a' 'b' 'r' 'a']
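
A common follow-up is counting how often each unique element occurs. The sketch below uses the `return_counts` argument, which assumes NumPy 1.9 or later, and cross-checks the result against `np.bincount` applied to the inverse indices.

```python
import numpy as np

magic_chars = np.array(list('abracadabra'))

letters, counts = np.unique(magic_chars, return_counts=True)
for letter, count in zip(letters, counts):
    print('%s: %d' % (letter, count))
# a: 5
# b: 2
# c: 1
# d: 1
# r: 2

# The same counts can be recovered from the inverse indices.
u, inverse = np.unique(magic_chars, return_inverse=True)
print(np.array_equal(np.bincount(inverse), counts))  # True
```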


Finally, flipping matrices along either axis is easy.

In [36]:
mat = np.arange(24).reshape(4, 6); print mat

[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]]


In [37]:
print np.fliplr(mat)

[[ 5  4  3  2  1  0]
 [11 10  9  8  7  6]
 [17 16 15 14 13 12]
 [23 22 21 20 19 18]]


In [38]:
print np.flipud(mat1)

[[3 8]
 [2 7]
 [1 6]
 [0 5]]
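
Both flips can also be written as slices with a step of `-1`, which makes it easy to flip along both axes at once. A brief sketch:

```python
import numpy as np

mat = np.arange(24).reshape(4, 6)

print(np.array_equal(np.fliplr(mat), mat[:, ::-1]))  # True: reverse the columns
print(np.array_equal(np.flipud(mat), mat[::-1, :]))  # True: reverse the rows

both = mat[::-1, ::-1]    # flipped along both axes
print(both[0, 0])         # 23
```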


# Universal functions

In [39]:
mat_of_squares = np.array([i**2 for i in range(12)]).reshape((3,4)); print mat_of_squares

[[  0   1   4   9]
 [ 16  25  36  49]
 [ 64  81 100 121]]


In [40]:
import math

In [41]:
math.sqrt(mat_of_squares)  # can't do this!

TypeError: only length-1 arrays can be converted to Python scalars

In [42]:
print np.sqrt(mat_of_squares)

[[  0.   1.   2.   3.]
 [  4.   5.   6.   7.]
 [  8.   9.  10.  11.]]


In NumPy terminology, `np.sqrt` is a *ufunc* (short for "universal function") that works elementwise on an ndarray of any size.

In [43]:
print np.power(mat_of_squares, 0.5)  # np.power is a more general ufunc

[[  0.   1.   2.   3.]
 [  4.   5.   6.   7.]
 [  8.   9.  10.  11.]]


In fact, `mat ** exponent` is equivalent to `np.power(mat, exponent)`.

In [44]:
print mat_of_squares ** 0.5

[[  0.   1.   2.   3.]
 [  4.   5.   6.   7.]
 [  8.   9.  10.  11.]]
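
To see what "elementwise" means, the sketch below compares `np.sqrt` with an explicit Python loop over the entries, and then applies the binary ufunc `np.add` to arrays of different shapes, where the smaller one is broadcast. The array `offsets` is made up for illustration.

```python
import math
import numpy as np

mat_of_squares = (np.arange(12) ** 2).reshape((3, 4))

print(isinstance(np.sqrt, np.ufunc))   # True

# What the ufunc does conceptually: apply the scalar function to every entry.
looped = np.array([[math.sqrt(x) for x in row] for row in mat_of_squares])
print(np.allclose(looped, np.sqrt(mat_of_squares)))  # True

# Binary ufuncs also work elementwise; the 1-d array is added to each row.
offsets = np.array([1, 2, 3, 4])
shifted = np.add(mat_of_squares, offsets)
print(shifted.shape)                   # (3, 4)
```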


A list of all available ufuncs can be found [here](http://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs).

# Sorting and searching

In [45]:
python = np.array([c for c in 'python']).reshape((2, 3)); print python

[['p' 'y' 't']
 ['h' 'o' 'n']]


Like the builtin function `sorted` (which returns a sorted copy of a Python list), the numpy function `sort` returns a sorted copy of the array.

In [46]:
print np.sort(python)  # sort along last axis, 2nd (i.e. horizontally) in this case

[['p' 't' 'y']
 ['h' 'n' 'o']]


In [47]:
print np.sort(python, 0)  # sort along first axis (i.e. vertically)

[['h' 'o' 'n']
 ['p' 'y' 't']]


In [48]:
print np.sort(python, None)  # if the axis is None, return all elements sorted as a 1-d array 

['h' 'n' 'o' 'p' 't' 'y']


In [49]:
print python  # since all calls to sort returned copies, the original array is unchanged

[['p' 'y' 't']
 ['h' 'o' 'n']]


Sorting an array in-place is achieved via its `sort` method.

In [50]:
python.sort(); print python  # now it's sorted along the horizontal axis

[['p' 't' 'y']
 ['h' 'n' 'o']]
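
A common pitfall is assigning the result of the in-place method, which returns `None`. The sketch below contrasts the two forms on a made-up array `values`, and shows one way to get descending order by reversing a sorted copy.

```python
import numpy as np

values = np.array([3, 1, 2])

sorted_copy = np.sort(values)    # new array; values is left untouched
print(values)                    # [3 1 2]

result = values.sort()           # sorts in place and returns None
print(result is None)            # True
print(values)                    # [1 2 3]

descending = np.sort(values)[::-1]
print(descending)                # [3 2 1]
```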


Another useful sorting-related function is `argsort`, which returns the indices that would sort the array.

In [51]:
squares_last_digit = np.remainder(np.arange(10)**2, 10); print squares_last_digit

[0 1 4 9 6 5 6 9 4 1]


In [52]:
ind_in_sorted_order = np.argsort(squares_last_digit); print ind_in_sorted_order

[0 1 9 2 8 5 4 6 3 7]


In [53]:
print squares_last_digit[ind_in_sorted_order]  # elements of squares_last_digit will appear in sorted order

[0 1 1 4 4 5 6 6 9 9]
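
A typical use of `argsort` is reordering one array according to the values in another. The `names` and `scores` arrays below are made-up data for illustration.

```python
import numpy as np

names = np.array(['carol', 'alice', 'bob'])
scores = np.array([72, 95, 88])

order = np.argsort(scores)       # indices from lowest to highest score
lowest_first = names[order]
print(lowest_first)              # ['carol' 'bob' 'alice']

highest_first = names[order[::-1]]
print(highest_first)             # ['alice' 'bob' 'carol']
```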


In [54]:
mat = np.array([7, 4, 8, 1, 2, 6, 0, -2, 3]).reshape((3, 3)); print mat

[[ 7  4  8]
 [ 1  2  6]
 [ 0 -2  3]]


In [55]:
print np.amax(mat, 0)  # max along axis 0 (vertical)

[7 4 8]


In [56]:
print np.amax(mat, 1)  # max along axis 1 (horizontal)

[8 6 3]


Confusingly, the default value of the `axis` argument of `numpy.amax` is `None`, unlike `numpy.sort`, where the default is `-1`.

In [57]:
np.amax(mat)  # equivalent to np.amax(mat, None), i.e. flattens and computes max

8

We can also use the `max` method of ndarray objects to achieve the same results.

In [58]:
print mat.max(0)

[7 4 8]


In [59]:
print mat.max(1)

[8 6 3]


In [60]:
print mat.max()

8
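
To find where the maximum occurs rather than its value, the related function `np.argmax` follows the same `axis` convention. With no axis it returns an index into the flattened array, which `np.unravel_index` converts back to a (row, column) pair. A short sketch:

```python
import numpy as np

mat = np.array([7, 4, 8, 1, 2, 6, 0, -2, 3]).reshape((3, 3))

flat_index = np.argmax(mat)                        # index into the flattened array
row, col = np.unravel_index(flat_index, mat.shape)
print(flat_index)                                  # 2
print(mat[row, col])                               # 8

col_argmax = np.argmax(mat, 0)   # row index of the max in each column
row_argmax = np.argmax(mat, 1)   # column index of the max in each row
print(col_argmax)                # [0 0 0]
print(row_argmax)                # [2 2 2]
```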


But do not confuse the `numpy.amax` function or the `numpy.ndarray.max` method with the ufunc `numpy.maximum`!

In [61]:
mat1 = np.array([1, 5, 2, 6]).reshape((2, 2)); print mat1

[[1 5]
 [2 6]]


In [62]:
mat2 = np.array([4, 2, 3, 7]).reshape((2, 2)); print mat2

[[4 2]
 [3 7]]


In [63]:
print np.maximum(mat1, mat2)  # elementwise maximum

[[4 5]
 [3 7]]
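
Because `np.maximum` is a ufunc, its second argument can also be a scalar, which gives a simple way to clip values from below. The array `values` is made up for this sketch.

```python
import numpy as np

values = np.array([-3, -1, 0, 2, 5])

clipped = np.maximum(values, 0)   # elementwise: replaces negative entries with 0
print(clipped)                    # [0 0 0 2 5]

largest = np.amax(values)         # reduction: a single number
print(largest)                    # 5
```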


Finally, a useful function for conditionally choosing elements from two arrays is `numpy.where`.

In [64]:
left_right = np.array(['L', 'R', 'R', 'L']).reshape((2, 2)); print left_right

[['L' 'R']
 ['R' 'L']]


In [65]:
left_mat = np.array([1, 2, 3, 4]).reshape((2, 2)); print left_mat

[[1 2]
 [3 4]]


In [66]:
right_mat = np.array([100, 200, 300, 400]).reshape((2, 2)); print right_mat

[[100 200]
 [300 400]]


In [67]:
print np.where(left_right == 'L', left_mat, right_mat)  # choose entries from 2nd array or 3rd based on first

[[  1 200]
 [300   4]]
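
Two other forms of `np.where` are worth a quick sketch. Scalars may be used in place of either array, and calling it with only the condition returns the indices where the condition holds. The threshold of 5 is arbitrary.

```python
import numpy as np

mat = np.array([7, 4, 8, 1, 2, 6, 0, -2, 3]).reshape((3, 3))

# Scalar in place of an array: cap every entry at 5.
capped = np.where(mat > 5, 5, mat)
print(capped.max())              # 5

# Condition only: one index array per dimension.
rows, cols = np.where(mat > 5)
print(rows)                      # [0 0 1]
print(cols)                      # [0 2 2]
print(mat[rows, cols])           # [7 8 6]
```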
