Let's run through an example showing how powerful NumPy is. Suppose we have two lists a and b, each consisting of the first 100,000 non-negative integers, and we want to create a new list c whose ith element is a[i] + 2 * b[i].
## Without NumPy:

In [18]:
%%time
a = [i for i in range(100000)]
b = [i for i in range(100000)]

Wall time: 12 ms


In [23]:
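# For Python lists, + concatenates and * repeats, so neither line is elementwise
c = a + b        # concatenation: 200,000 elements
d = a + 2 * b    # a followed by two copies of b: 300,000 elements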
len(c), len(d), type(c)

(200000, 300000, list)

In [20]:
%%time
c = []
for i in range(len(a)):
    c.append(a[i] + 2 * b[i])
c[:10]

Wall time: 41 ms


[0, 3, 6, 9, 12, 15, 18, 21, 24, 27]

## With NumPy:

In [6]:
import numpy as np

In [7]:
%%time
a = np.arange(100000)
b = np.arange(100000)

Wall time: 4.99 ms


In [8]:
%%time
c = a + 2 * b

Wall time: 3.99 ms


In [22]:
len(c), type(c)

(100000, numpy.ndarray)

# Why vectorize?
* Much faster
* Easier to read and fewer lines of code
* More closely resembles mathematical notation
* Vectorization is one of the main reasons why NumPy is so powerful.
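
The speed claim above can be checked with a small, repeatable measurement. The following is a minimal sketch using the standard-library `timeit` module instead of `%%time`; the helper names `with_loop` and `with_numpy` are made up for illustration, and the exact timings will vary from machine to machine.

```python
import timeit
import numpy as np

n = 100_000
a_list = list(range(n))
b_list = list(range(n))
a_arr = np.arange(n)
b_arr = np.arange(n)

def with_loop():
    # Pure-Python version: one interpreter-level step per element
    return [x + 2 * y for x, y in zip(a_list, b_list)]

def with_numpy():
    # Vectorized version: the loop runs inside NumPy's compiled code
    return a_arr + 2 * b_arr

# Average over several runs to smooth out noise
runs = 20
loop_time = timeit.timeit(with_loop, number=runs) / runs
numpy_time = timeit.timeit(with_numpy, number=runs) / runs
print(f"loop:  {loop_time * 1e3:.2f} ms per run")
print(f"numpy: {numpy_time * 1e3:.2f} ms per run")

# Both approaches compute the same values
assert with_loop() == with_numpy().tolist()
```

Averaging over several runs gives a steadier number than a single `%%time` reading, which can be dominated by noise at the millisecond scale.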

## Vector/Matrix Initialization - 1

In [29]:
import numpy as np

# Can initialize ndarrays with Python lists, for example:
a = np.array([1, 2, 3])   # Create a rank 1 array
print(type(a))            # Prints "<class 'numpy.ndarray'>"
print(a.shape)            # Prints "(3,)"
print(a[0], a[1], a[2])   # Prints "1 2 3"
a[0] = 5                  # Change an element of the array
print(a)                  # Prints "[5 2 3]"

b = np.array([[1, 2, 3],
              [4, 5, 6]])    # Create a rank 2 array
print(b.shape)                     # Prints "(2, 3)"
# print(b[0, 0], b[0, 1], b[1, 0])   # Prints "1 2 4"
print(b[0][0], b[0][1], b[1][0])   # Prints "1 2 4"  --> ok

<class 'numpy.ndarray'>
(3,)
1 2 3
[5 2 3]
(2, 3)
1 2 4


In [32]:
b = [[1, 2, 3],
     [4, 5, 6]]
print(b[0][0], b[0][1], b[1][0])   # Prints "1 2 4"
# print(b[0, 0], b[0, 1], b[1, 0])   # --> TypeError: list indices must be integers or slices, not tuple

1 2 4


## Vector/Matrix Initialization - 2

In [11]:
a = np.zeros((2, 2))   # Create an array of all zeros
print(a)               # Prints "[[ 0.  0.]
                       #          [ 0.  0.]]"

b = np.full((2, 2), 7)  # Create a constant array
print(b)                # Prints "[[7 7]
                        #          [7 7]]"

c = np.eye(2)         # Create a 2 x 2 identity matrix
print(c)              # Prints "[[ 1.  0.]
                      #          [ 0.  1.]]"

d = np.random.random((2, 2))  # Create an array filled with random values
print(d)                      # Might print "[[ 0.91940167  0.08143941]
                              #               [ 0.68744134  0.87236687]]"

[[0. 0.]
 [0. 0.]]
[[7 7]
 [7 7]]
[[1. 0.]
 [0. 1.]]
[[0.29454178 0.98631201]
 [0.3518721  0.72035805]]


## Vector/Matrix Generation & Reshaping

In [2]:
import numpy as np

a = np.ones((2, 2))    # Create an array of all ones
print(a)

[[1. 1.]
 [1. 1.]]


In [3]:
import numpy as np

nums = np.arange(8)
print(nums)
print(nums.shape)

[0 1 2 3 4 5 6 7]
(8,)


In [4]:
nums = nums.reshape((2, 4))
print('Reshaped:\n', nums)
print(nums.shape)

Reshaped:
 [[0 1 2 3]
 [4 5 6 7]]
(2, 4)


* The -1 in reshape corresponds to an unknown dimension that NumPy will figure out,
 based on all other dimensions and the array size.
* We can only specify one unknown dimension.
* For example, sometimes we might have an unknown number of data points, and
 so we can use -1 instead without worrying about the true number.

In [5]:
nums = nums.reshape((4, -1))
print('Reshaped with -1:\n', nums)
print(nums.shape)

Reshaped with -1:
 [[0 1]
 [2 3]
 [4 5]
 [6 7]]
(4, 2)
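
As a small sketch of the "unknown number of data points" case, suppose a flat array of readings arrives with 3 features per data point. The array `flat` and the feature count are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical data: a flat stream of readings, 3 features per data point
flat = np.arange(12)

# Let NumPy infer how many data points (rows) there are
points = flat.reshape((-1, 3))
print(points.shape)   # Prints "(4, 3)"

# Only one dimension may be -1; NumPy cannot infer two at once
try:
    flat.reshape((-1, -1))
except ValueError as err:
    print('ValueError:', err)
```

If the array size is not divisible by the known dimensions, reshape raises a ValueError as well, so the inferred dimension is always exact.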


* NumPy supports an object-oriented paradigm: ndarray has a number of methods
and attributes, many of which mirror functions in the top-level NumPy namespace.
* For example, we can do both:

In [6]:
nums = np.arange(8)
print(nums.min())     # Prints 0
print(np.min(nums))   # Prints 0

0
0
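
The method and function forms accept the same keyword arguments, such as axis. One practical difference, sketched below, is that the function form also accepts plain Python lists, while the method form needs an ndarray to begin with.

```python
import numpy as np

nums = np.arange(8).reshape((2, 4))

# Method form and function form give the same result
print(nums.sum(axis=1))      # Prints "[ 6 22]"
print(np.sum(nums, axis=1))  # Prints "[ 6 22]"

# The function form converts a plain list to an array first
print(np.min([3, 1, 2]))     # Prints "1"
```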


## Array Operations/Math
* NumPy supports many elementwise operations:

In [34]:
import numpy as np

x = np.array([[1, 2],
              [3, 4]], dtype=np.float64)
y = np.array([[5, 6],
              [7, 8]], dtype=np.float64)

# Elementwise sum; both produce the array
# [[ 6.0  8.0]
#  [10.0 12.0]]
print(x + y)
print(np.add(x, y))

# Elementwise difference; both produce the array
# [[-4.0 -4.0]
#  [-4.0 -4.0]]
print(x - y)
print(np.subtract(x, y))

# Elementwise product; both produce the array
# [[ 5.0 12.0]
#  [21.0 32.0]]
print(x * y)
print(np.multiply(x, y))

# Elementwise square root; produces the array
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print(np.sqrt(x))

[[ 6.  8.]
 [10. 12.]]
[[ 6.  8.]
 [10. 12.]]
[[-4. -4.]
 [-4. -4.]]
[[-4. -4.]
 [-4. -4.]]
[[ 5. 12.]
 [21. 32.]]
[[ 5. 12.]
 [21. 32.]]
[[1.         1.41421356]
 [1.73205081 2.        ]]


* How do we elementwise divide between two arrays?

In [40]:
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.float64)
# y = np.array([[5, 6], [7, 8]], dtype=np.float64)
y = np.array([[5, 7], [7, 8]], dtype=np.float64)   # 7 / 2 shows how / and // differ

print(y / x)
print(np.divide(y, x))
print(y // x)
div, mod = np.divmod(y, x)
print(div, mod)

[[5.         3.5       ]
 [2.33333333 2.        ]]
[[5.         3.5       ]
 [2.33333333 2.        ]]
[[5. 3.]
 [2. 2.]]
[[5. 3.]
 [2. 2.]] [[0. 1.]
 [1. 0.]]
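
The difference between / and // matters most for integer arrays. In the sketch below, which uses integer arrays chosen for illustration rather than float64 ones, true division returns floats while floor division and the remainder keep an integer dtype.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])   # integer dtype
y = np.array([[5, 7], [7, 8]])

true_div = y / x     # true division: always floating point
floor_div = y // x   # floor division: stays integer for integer inputs
remainder = y % x    # elementwise remainder

print(true_div.dtype)   # Prints "float64"
print(floor_div)        # Prints "[[5 3]
                        #          [2 2]]"
print(remainder)        # Prints "[[0 1]
                        #          [1 0]]"

# Quotient and remainder together reconstruct y
assert np.array_equal(floor_div * x + remainder, y)
```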


* Note * is elementwise multiplication, not matrix multiplication. 
* We instead use the dot function to compute inner products of vectors, to multiply a vector by a matrix, and to multiply matrices. 
* dot is available both as a function in the numpy module and as an instance method of array objects:

In [35]:
import numpy as np

v = np.array([9, 10])
w = np.array([11, 12])

print(v * w)
print(v.dot(w))  # 219 : 9*11 + 10*12

[ 99 120]
219


In [11]:
import numpy as np

x = np.array([[1, 2], [3, 4]])
v = np.array([9, 10])

print(x * v)
print(x.dot(v))

[[ 9 20]
 [27 40]]
[29 67]


In [12]:
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

print(x * y)
print(x.dot(y))

[[ 5 12]
 [21 32]]
[[19 22]
 [43 50]]


In [41]:
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

v = np.array([9, 10])
w = np.array([11, 12])

# Inner product of vectors; both produce 219
print(v.dot(w))
print(np.dot(v, w))

# Matrix / vector product; both produce the rank 1 array [29 67]
print(x.dot(v))
print(np.dot(x, v))

# The same works with w; both produce the rank 1 array [35 81]
print(x.dot(w))
print(np.dot(x, w))

# Matrix / matrix product; both produce the rank 2 array
# [[19 22]
#  [43 50]]
print(x.dot(y))
print(np.dot(x, y))

219
219
[29 67]
[29 67]
[35 81]
[35 81]
[[19 22]
 [43 50]]
[[19 22]
 [43 50]]
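
For the 1-D and 2-D arrays used here, Python's @ operator (backed by np.matmul) gives the same results as dot. The sketch below repeats the products above with @.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
v = np.array([9, 10])
w = np.array([11, 12])

print(v @ w)   # Inner product; prints "219"
print(x @ v)   # Matrix / vector product; prints "[29 67]"
print(x @ y)   # Matrix / matrix product; prints "[[19 22]
               #                                  [43 50]]"

# Same results as dot for these shapes
assert np.array_equal(x @ y, x.dot(y))
```

The operator form keeps expressions such as x @ y @ v close to how they are written on paper.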


* There are many useful functions built into NumPy, and often we can apply them along specific axes of the ndarray:

In [42]:
import numpy as np

x = np.array([[1, 2, 3], 
              [4, 5, 6]])

print(np.sum(x))          # Compute sum of all elements; prints "21"
print(np.sum(x, axis=0))  # Compute sum of each column; prints "[5 7 9]"
print(np.sum(x, axis=1))  # Compute sum of each row; prints "[6 15]"

print(np.max(x, axis=0))  # Compute max of each column; prints "[4 5 6]" 
print(np.max(x, axis=1))  # Compute max of each row; prints "[3 6]" 

21
[5 7 9]
[ 6 15]
[4 5 6]
[3 6]


* How can we compute the index of the max value of each row? This is useful, say, to find the class that corresponds to the maximum score for an input image.

In [14]:
import numpy as np

x = np.array([[1, 2, 3], 
              [4, 5, 6]])

print(np.argmax(x, axis=0)) # Compute index of max of each column; prints "[1 1 1]"
print(np.argmax(x, axis=1)) # Compute index of max of each row; prints "[2 2]"

[1 1 1]
[2 2]
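
To make the class-score use case concrete, here is a minimal sketch. The score matrix and the class names are invented for illustration: each row holds the scores for one image, and each column corresponds to one class.

```python
import numpy as np

# Hypothetical scores: 3 images (rows) x 4 classes (columns)
scores = np.array([[0.1, 0.7, 0.1, 0.1],
                   [0.3, 0.2, 0.4, 0.1],
                   [0.9, 0.0, 0.05, 0.05]])
class_names = np.array(['cat', 'dog', 'bird', 'fish'])

# Index of the highest score in each row, i.e. the predicted class per image
predicted = np.argmax(scores, axis=1)
print(predicted)                # Prints "[1 2 0]"
print(class_names[predicted])   # Prints "['dog' 'bird' 'cat']"
```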


* Note that the axis along which you apply the operation is removed from the shape of the result.
* This is useful to keep in mind when you're trying to figure out what axis corresponds to what.
* For example:

In [16]:
import numpy as np

x = np.array([[1, 7, 3], 
              [4, 5, 6]])

print(x.shape)               # Has shape (2, 3)
print((x.max(axis=0)))       # prints [4 7 6] 
print((x.max(axis=0)).shape) # Taking the max over axis 0 has shape (3,)
                             # corresponding to the 3 columns.

(2, 3)
[4 7 6]
(3,)


In [17]:
import numpy as np

# An array with rank 3
x = np.array([[[1, 7, 3], 
               [4, 5, 6]],
              [[10, 23, 33], 
               [43, 52, 16]]
             ])

print(x.shape)               # Has shape (2, 2, 3)
print(x.max(axis=1))         # prints [[4 7 6] [43 52 33]]
print((x.max(axis=1)).shape) # Taking the max over axis 1 has shape (2, 3)

(2, 2, 3)
[[ 4  7  6]
 [43 52 33]]
(2, 3)


In [43]:
import numpy as np

# An array with rank 3
x = np.array([[[1, 7, 3], 
               [4, 5, 6]],
              [[10, 23, 33], 
               [43, 52, 16]]
             ])

print(x.shape)               # Has shape (2, 2, 3)
print(x.max(axis=0))         # prints [[10 23 33] [43 52 16]]
print((x.max(axis=0)).shape) # Taking the max over axis 0 has shape (2, 3)

(2, 2, 3)
[[10 23 33]
 [43 52 16]]
(2, 3)


In [18]:
import numpy as np

# An array with rank 3
x = np.array([[[1, 7, 3], 
               [4, 5, 6]],
              [[10, 23, 33], 
               [43, 52, 16]]
             ])

print((x.max(axis=(1, 2))))       # Can take max over multiple axes; prints [7 52]
print((x.max(axis=(1, 2))).shape) # Taking the max over axes 1, 2 has shape (2,)

[ 7 52]
(2,)


In [44]:
import numpy as np

# An array with rank 3
x = np.array([[[1, 7, 3], 
               [4, 5, 6]],
              [[10, 23, 33], 
               [43, 52, 16]]
             ])

print((x.max(axis=(2, 1))))       # The order of the axes does not matter; prints [7 52]
print((x.max(axis=(2, 1))).shape) # Taking the max over axes 2, 1 has shape (2,)

[ 7 52]
(2,)


In [45]:
import numpy as np

# An array with rank 3
x = np.array([[[1, 7, 3], 
               [4, 5, 6]],
              [[10, 23, 33], 
               [43, 52, 16]]
             ])

print((x.max(axis=(0, 2))))       # Can take max over multiple axes; prints [33 52]
print((x.max(axis=(0, 2))).shape) # Taking the max over axes 0, 2 has shape (2,)

[33 52]
(2,)
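
When the reduced axis should stay in the shape, for example to combine the result with the original array, reductions accept keepdims=True. The sketch below is an illustration of that option, not part of the examples above.

```python
import numpy as np

x = np.array([[1, 7, 3],
              [4, 5, 6]])

row_max = x.max(axis=1)                       # axis 1 is removed
row_max_kept = x.max(axis=1, keepdims=True)   # axis 1 is kept with size 1

print(row_max.shape)        # Prints "(2,)"
print(row_max_kept.shape)   # Prints "(2, 1)"

# The kept axis lines up with x, so each row is shifted by its own max
print(x - row_max_kept)     # Prints "[[-6  0 -4]
                            #          [-2 -1  0]]"
```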


## Indexing
* NumPy also provides powerful indexing schemes.

In [22]:
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])

print(a[1][2])   # Prints 7
print(a[1, 2])   # Prints 7

b = a[:2, 1:3]
print(b)

7
7
[[2 3]
 [6 7]]
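
One detail worth knowing about slices such as a[:2, 1:3] is that they are views into the original array, not copies. The sketch below shows the effect; the values written (77 and -1) are arbitrary.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])

b = a[:2, 1:3]    # A view: shares memory with a
b[0, 0] = 77      # Writing through the view changes a as well
print(a[0, 1])    # Prints "77"

c = a[:2, 1:3].copy()   # An independent copy
c[0, 0] = -1
print(a[0, 1])    # Still prints "77"
```

Calling .copy() is the usual way to get a sub-array that can be modified without touching the original.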


* Often, it's useful to select or modify one element from each row of a matrix. 
* The following example employs fancy indexing, where we index into our array using an array of indices (say an array of integers or booleans):

In [49]:
import numpy as np

# Create a new array from which we will select elements
a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9],
              [10, 11, 12]])

# Create an array of indices
b = np.array([0, 2, 0, 1])

# Select one element from each row of a using the indices in b
print(a[[0,1,2,3], [0,2,0,1]])  # Prints "[ 1  6  7 11]"  --> ok!!

# np.arange(4)  # array([0, 1, 2, 3])
# print(a[np.arange(4), b])  # Prints "[ 1  6  7 11]"  --> ok!!

# print(a[0:4, b])  # Not equivalent: selects columns b from every row, giving a (4, 4) array

[ 1  6  7 11]
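
The difference between a[np.arange(4), b] and the commented-out a[0:4, b] is easiest to see from the shapes. In this sketch the names `paired` and `columns` are just labels for the two results.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9],
              [10, 11, 12]])
b = np.array([0, 2, 0, 1])

# Two index arrays are paired up: (0, 0), (1, 2), (2, 0), (3, 1)
paired = a[np.arange(4), b]
print(paired.shape)    # Prints "(4,)"

# A slice combined with an index array picks columns b from every row
columns = a[0:4, b]
print(columns.shape)   # Prints "(4, 4)"
print(columns)         # Prints "[[ 1  3  1  2]
                       #          [ 4  6  4  5]
                       #          [ 7  9  7  8]
                       #          [10 12 10 11]]"
```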


In [24]:
# Mutate one element from each row of a using the indices in b
a[np.arange(4), b] += 10

print(a)  # Prints "[[11  2  3]
          #          [ 4  5 16]
          #          [17  8  9]
          #          [10 21 12]]"

[[11  2  3]
 [ 4  5 16]
 [17  8  9]
 [10 21 12]]


* We can also use boolean indexing/masks. 
* Suppose we want to set all elements greater than MAX to MAX:

In [25]:
import numpy as np

MAX = 5
nums = np.array([1, 4, 10, -1, 15, 0, 5])
print(nums > MAX)            # Prints [False False  True False  True False False]

nums[nums > MAX] = MAX
print(nums)                  # Prints [ 1  4  5 -1  5  0  5]

[False False  True False  True False False]
[ 1  4  5 -1  5  0  5]
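
The mask assignment above modifies nums in place. As a sketch of two related patterns, np.minimum produces the capped values as a new array, and masks can be combined with & and | (with parentheses around each comparison) to express compound conditions.

```python
import numpy as np

MAX = 5
nums = np.array([1, 4, 10, -1, 15, 0, 5])

# Cap at MAX without modifying nums
capped = np.minimum(nums, MAX)
print(capped)   # Prints "[ 1  4  5 -1  5  0  5]"
print(nums)     # Prints "[ 1  4 10 -1 15  0  5]"

# Combine two masks: keep values between 0 and MAX inclusive
in_range = nums[(nums >= 0) & (nums <= MAX)]
print(in_range) # Prints "[1 4 0 5]"
```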


* Note that the indices in fancy indexing can appear in any order and even multiple times:

In [26]:
import numpy as np

nums = np.array([1, 4, 10, -1, 15, 0, 5])
print(nums[[1, 2, 3, 1, 0]])  # Prints [4 10 -1 4 1]

[ 4 10 -1  4  1]
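
Repeated indices behave differently when assigning. With an in-place operation such as +=, an index that appears several times is updated only once. The sketch below shows this, along with np.add.at, which accumulates over repeats; the array names are chosen for illustration.

```python
import numpy as np

idx = np.array([0, 0, 2])

counts = np.zeros(3, dtype=int)
counts[idx] += 1             # Index 0 appears twice but is incremented once
print(counts)                # Prints "[1 0 1]"

counts_at = np.zeros(3, dtype=int)
np.add.at(counts_at, idx, 1) # Unbuffered add: every occurrence counts
print(counts_at)             # Prints "[2 0 1]"
```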
