## Employing Array Methods and Functions
### Diving Into NumPy
*Curtis Miller*

Let's first create a 4×4 array of random 8-bit integers.

In [1]:
import numpy as np
from numpy.random import randn

# Draw a 4x4 standard normal sample, scale it by 10, and truncate to 8-bit integers
arr1 = np.array(randn(4, 4) * 10, dtype=np.int8)
print(arr1)

[[  8   3  12  -2]
 [  4 -15   3   3]
 [-10  -3   2  -5]
 [  0  -5 -26  -3]]


Let's see some manipulations we can apply to this array.

In [2]:
arr1.tolist()    # Convert arr1 to a nested Python list

[[8, 3, 12, -2], [4, -15, 3, 3], [-10, -3, 2, -5], [0, -5, -26, -3]]

In [3]:
arr1.flatten()    # Return a flattened 1D copy of arr1

array([  8,   3,  12,  -2,   4, -15,   3,   3, -10,  -3,   2,  -5,   0,
        -5, -26,  -3], dtype=int8)
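One detail worth knowing about `flatten()` is that it always returns a copy. The sketch below contrasts it with `ravel()`, which returns a view when the memory layout allows it. The array `arr` and the names `flat_copy` and `flat_view` are made up for this illustration and are not part of the original notebook.

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)   # small contiguous example array

flat_copy = arr.flatten()   # always a new array with its own data
flat_view = arr.ravel()     # a view here, because arr is contiguous

flat_copy[0] = 100   # does not touch arr
flat_view[1] = 200   # writes through to arr[0, 1]

print(arr)
print(np.shares_memory(arr, flat_copy), np.shares_memory(arr, flat_view))
```

If you plan to modify the flattened result and want the original left alone, `flatten()` is the safer choice.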

In [4]:
arr2 = np.empty((4, 3), dtype=np.dtype('<U16'))    # An uninitialized 4x3 array of strings (up to 16 characters each)
arr2.fill("Carlos")    # Fill every entry with "Carlos" in place (arr2 itself is modified; no new array is created)
arr2

array([['Carlos', 'Carlos', 'Carlos'],
       ['Carlos', 'Carlos', 'Carlos'],
       ['Carlos', 'Carlos', 'Carlos'],
       ['Carlos', 'Carlos', 'Carlos']], 
      dtype='<U16')

Now for some mathematical manipulations.

In [5]:
arr1.sum()

-34

In [6]:
arr1.sum(axis=0)

array([  2, -20,  -9,  -7])

In [7]:
arr1.sum(axis=1)

array([ 21,  -5, -16, -34])

In [8]:
arr1.cumsum(axis=0)

array([[  8,   3,  12,  -2],
       [ 12, -12,  15,   1],
       [  2, -15,  17,  -4],
       [  2, -20,  -9,  -7]], dtype=int32)

In [9]:
arr1.mean(axis=0)

array([ 0.5 , -5.  , -2.25, -1.75])
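The `axis` argument names the dimension that gets collapsed: `axis=0` reduces down the rows and leaves one value per column, while `axis=1` reduces across the columns and leaves one value per row. Here is a small sketch that hard-codes the values printed for `arr1` above so the results are reproducible; the names `arr`, `col_sums`, `row_sums`, and `col_means` are chosen for this illustration only.

```python
import numpy as np

# Same values as the arr1 printed earlier, hard-coded instead of random
arr = np.array([[  8,   3,  12,  -2],
                [  4, -15,   3,   3],
                [-10,  -3,   2,  -5],
                [  0,  -5, -26,  -3]], dtype=np.int8)

col_sums = arr.sum(axis=0)    # collapse the rows: one sum per column
row_sums = arr.sum(axis=1)    # collapse the columns: one sum per row
col_means = arr.mean(axis=0)  # one mean per column, computed in floating point

print(col_sums)
print(row_sums)
print(col_means)
```

Note that the sums are not returned as `int8`: NumPy accumulates small integer types in the platform's default integer, which is why the `cumsum()` output above reports a wider dtype.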

Here I show the application of `ufunc`s (universal functions), which operate element-wise on arrays.

In [10]:
np.sign(arr1)

array([[ 1,  1,  1, -1],
       [ 1, -1,  1,  1],
       [-1, -1,  1, -1],
       [ 0, -1, -1, -1]], dtype=int8)

In [11]:
arr1 * np.sign(arr1)    # All entries are non-negative now (zeros stay zero)

array([[ 8,  3, 12,  2],
       [ 4, 15,  3,  3],
       [10,  3,  2,  5],
       [ 0,  5, 26,  3]], dtype=int8)

In [12]:
np.sqrt(arr1 * np.sign(arr1))

array([[ 2.82842708,  1.73205078,  3.46410155,  1.41421354],
       [ 2.        ,  3.87298346,  1.73205078,  1.73205078],
       [ 3.1622777 ,  1.73205078,  1.41421354,  2.23606801],
       [ 0.        ,  2.23606801,  5.09901953,  1.73205078]], dtype=float32)
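Multiplying an array by its own sign is just the absolute value, so the same result is available from a single `ufunc`, `np.abs()`. The following sketch checks that the two agree on the values printed for `arr1`; the names `arr`, `via_sign`, and `via_abs` are chosen for this illustration.

```python
import numpy as np

# Same values as the arr1 printed earlier, hard-coded instead of random
arr = np.array([[  8,   3,  12,  -2],
                [  4, -15,   3,   3],
                [-10,  -3,   2,  -5],
                [  0,  -5, -26,  -3]], dtype=np.int8)

via_sign = arr * np.sign(arr)   # the approach used in the text
via_abs = np.abs(arr)           # one ufunc call that does the same job

print(np.array_equal(via_sign, via_abs))
```

Composing `ufunc`s works either way; `np.sqrt(np.abs(arr))` reads a little more directly than `np.sqrt(arr * np.sign(arr))`.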

Here I demonstrate vectorization.

Suppose I want a truncation function: that is, `tr(a)` returns `a` if `a > 0`, and `0` otherwise.

Here is the Python version.

In [13]:
def tr(a):
    if a > 0:
        return a
    else:
        return 0

# Testing tr()
tr(20)

20

In [14]:
tr(-20)

0

Looks good for scalars, but `tr()` is not vectorized: it cannot be applied to a whole array at once. You can see this below.

In [15]:
# Every position that is False here should become 0 under tr(a)
print(arr1 > 0)

[[ True  True  True False]
 [ True False  True  True]
 [False False  True False]
 [False False False False]]


In [16]:
tr(arr1)    # Won't work: `if` needs one truth value, but arr1 > 0 is a whole array of them

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
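The error comes from the `if` statement inside `tr()`: comparing an array with `> 0` yields an array of booleans, and Python cannot reduce that to a single `True` or `False` on its own. A minimal sketch of the same failure, using a made-up three-element array `arr`:

```python
import numpy as np

arr = np.array([3, -1, 2])
mask = arr > 0   # element-wise comparison gives a boolean array

try:
    if mask:     # Python asks for bool(mask), which NumPy refuses for more than one element
        pass
    raised = False
except ValueError:
    raised = True

any_positive = mask.any()   # True if at least one element is positive
all_positive = mask.all()   # True only if every element is positive

print(raised, any_positive, all_positive)
```

The `a.any()` and `a.all()` methods suggested in the message answer a different question (one verdict for the whole array), which is why they are not what `tr()` needs here.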

We'll use `np.vectorize()` to create a vectorized version of `tr()` that can be called on arrays much like a `ufunc`.

In [17]:
tr_vec = np.vectorize(tr)    # vectorize() takes a function as an argument and returns a function
tr_vec(arr1)

array([[ 8,  3, 12,  0],
       [ 4,  0,  3,  3],
       [ 0,  0,  2,  0],
       [ 0,  0,  0,  0]], dtype=int8)
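Two caveats about `np.vectorize()` are worth a quick sketch. First, the object it returns is not an instance of `np.ufunc`, even though it is called the same way. Second, unless `otypes` is given, the output dtype is inferred from the result of the first element, which can silently change results when `tr()` returns the integer `0` first. The float array `floats` and the name `tr_vec_float` below are made up for this illustration.

```python
import numpy as np

def tr(a):
    # Scalar truncation function from the text
    return a if a > 0 else 0

tr_vec = np.vectorize(tr)

is_true_ufunc = isinstance(tr_vec, np.ufunc)    # the wrapper is its own class, not np.ufunc
sign_is_ufunc = isinstance(np.sign, np.ufunc)   # built-in ufuncs are

floats = np.array([-1.5, 2.5])
naive = tr_vec(floats)   # first call returns the int 0, so an integer dtype is inferred

tr_vec_float = np.vectorize(tr, otypes=[float])   # state the output type explicitly
fixed = tr_vec_float(floats)

print(is_true_ufunc, sign_is_ufunc)
print(naive.dtype, fixed.dtype)
```

With an integer output dtype the fractional part of `2.5` is lost, so passing `otypes` is a sensible habit whenever the wrapped function can return mixed types.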

`tr_vec()` works! But it's not the most efficient way to solve this problem, because `np.vectorize()` still calls `tr()` once per element in Python. Here's a better solution:

In [18]:
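def tr_vec_fast(arr):
    ret_arr = arr.copy()      # Work on a copy so the input array is left unchanged
    ret_arr[arr <= 0] = 0     # Boolean-mask assignment: zero out every non-positive entry
    return ret_arr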

tr_vec_fast(arr1)

array([[ 8,  3, 12,  0],
       [ 4,  0,  3,  3],
       [ 0,  0,  2,  0],
       [ 0,  0,  0,  0]], dtype=int8)

In [19]:
%timeit tr_vec(arr1)

1000 loops, best of 3: 210 µs per loop


In [20]:
%timeit tr_vec_fast(arr1)

The slowest run took 4.59 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 27.5 µs per loop


The version that avoids `np.vectorize()` and depends only on NumPy operators is more than seven times faster (27.5 µs versus 210 µs)! This shows that `np.vectorize()` is a convenience rather than a performance tool; if you want speed, think creatively about how to use NumPy operators and `ufunc`s.
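As one illustration of that advice, the truncation can also be written with a single built-in call. The sketch below shows three possibilities, `np.maximum()`, `np.where()`, and the `clip()` method, applied to the values printed for `arr1`. These are alternatives I am suggesting, not something the timings above measured, so no speed claim is made for them here.

```python
import numpy as np

# Same values as the arr1 printed earlier, hard-coded instead of random
arr = np.array([[  8,   3,  12,  -2],
                [  4, -15,   3,   3],
                [-10,  -3,   2,  -5],
                [  0,  -5, -26,  -3]], dtype=np.int8)

via_maximum = np.maximum(arr, 0)        # element-wise max of each entry and 0
via_where = np.where(arr > 0, arr, 0)   # keep entries where the mask is True, else 0
via_clip = arr.clip(min=0)              # clamp everything below 0 up to 0

print(via_maximum)
```

All three return a new array and leave `arr` untouched, just like `tr_vec_fast()`.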