# NumPy

* One of the most important packages in Python.
* It has almost become part of the standard Python language for anyone doing scientific and data computing in Python.


## What is NumPy
\[ Excerpted From NumPy Website\]

The NumPy library:
* the fundamental package for scientific computing in Python. 
* provides 
    * a multidimensional array object, 
    * various derived objects (such as masked arrays and matrices), and 
* provides routines for fast operations on arrays, including:
    * mathematical, logical, shape manipulation, 
    * sorting, selecting, 
    * I/O, 
    * discrete Fourier transforms, 
    * basic linear algebra, 
    * basic statistical operations, 
    * random simulation 
    * and much more.
  
  
## Key features
* Core of the NumPy package:  **ndarray** object. 
* **ndarray** features:
  * Size fixed at creation
    * Unlike Python lists (which can grow dynamically). 
    * Changing the size of an ndarray will create a new array and delete the original.
  * The elements must be of the same data type 
    * Thus the same size in memory.
  * The exception: 
    * One can have arrays of objects, thereby allowing for arrays of different sized elements.
  
* **NumPy operations**
  * Support advanced mathematical operations, and other types of operations, on large amounts of data. 
  * Such operations are typically executed **more efficiently and with less code** than is possible using Python's built-in sequences.

* **NumPy applications**
  * A growing number of scientific and mathematical Python-based packages are using NumPy arrays.  These packages typically:
    * support Python-sequence input, and convert such input to NumPy arrays before processing, 
    * output NumPy arrays. 


To use much (perhaps even most) of today's scientific/mathematical Python-based software efficiently, knowing how to use Python's built-in sequence types is not enough; one also needs to know how to use NumPy arrays.
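The ndarray properties listed above can be checked directly. The following is a minimal sketch; the variable names (`mixed`, `fixed`, `grown`, `objs`) are illustrative only and do not come from the NumPy documentation.

```python
import numpy as np

# All elements share one dtype: mixing ints and a float upcasts everything to float64.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)              # float64

# An array's size is fixed; np.append returns a new, larger array
# and leaves the original untouched.
fixed = np.array([1, 2, 3])
grown = np.append(fixed, 4)
print(fixed.size, grown.size)   # 3 4

# The exception: dtype=object stores references to arbitrary Python objects.
objs = np.array([1, "two", 3.0], dtype=object)
print(objs.dtype, objs.shape)   # object (3,)
```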

## Installing NumPy

You can install NumPy by using:
* **pip install numpy**

Or, you can install a lot of useful packages at once by typing the following (this is recommended by scipy.org, the official site for NumPy):
* **pip install numpy scipy matplotlib ipython jupyter pandas sympy nose**

## Quick demonstration of NumPy

In [0]:
# Comparison of plain Python and NumPy

# Elementwise multiplication of two vectors in plain Python
a = [1, 2, 3, 4, 5]
b = [3, 3, 2, 2, 6]
c = []

for i in range(len(a)):
    c.append(a[i]*b[i])
print(c)


[3, 6, 6, 8, 30]


In [0]:
# Elementwise arithmetic on two vectors in NumPy
import numpy as np
npa = np.array([1, 2, 3, 4, 5])
npb = np.array([3, 3, 2, 2, 6])

npsum = npa + npb
npdiff = npa - npb
nppairprod = npa * npb
print(npsum)
print(npdiff)
print(nppairprod)

[ 4  5  5  6 11]
[-2 -1  1  2 -1]
[ 3  6  6  8 30]


In [0]:
print(npa**2)
print(npa + 10)
print(np.sin(npa))

[ 1  4  9 16 25]
[11 12 13 14 15]
[ 0.84147098  0.90929743  0.14112001 -0.7568025  -0.95892427]


## NumPy's Power
1. **Vectorization**
  * Operate directly on whole vectors.  No need to write for loops.
    * Vectorized code --> more concise and easier to read
    * Fewer lines of code --> fewer bugs
    * The code more closely resembles standard mathematical notation --> easier to correctly code mathematical constructs
  
2. **Broadcasting**
  * In an operation "a op b", NumPy automatically extends the operand that is a scalar, or an array of smaller size or lower dimension, to match the operand of larger size or higher dimension, so that the operation makes sense.
    * Broadcasting is the term used to describe the implicit element-by-element behavior of operations.
    * In NumPy, broadcasting occurs in all operations: not just arithmetic operations, but also logical, bit-wise, functional, etc. 
    * Moreover, in the example above, a and b could be 
      * multidimensional arrays of the same shape, 
      * one scalar and one array, 
      * two arrays with different shapes, provided that the smaller array is "expandable" to the shape of the larger in such a way that the resulting broadcast is unambiguous. 
  * For the detailed "rules" of broadcasting, see the Broadcasting section in the NumPy docs.
3. **Indexing**
  * Naming or addressing a particular element, or a segment of data, in a NumPy array.
    * NumPy's complete indexing schemes
      * Extend Python's slicing syntax
      * Are complicated (their own beast). 
    * If you're looking to read more on NumPy indexing, grab some coffee and head to the Indexing section in the NumPy docs.
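The broadcasting cases listed above can be sketched with three small arrays. The names `grid`, `row`, and `col` are made up for this illustration.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)      # shape (2, 3)
row = np.array([10, 20, 30])           # shape (3,)
col = np.array([[100], [200]])         # shape (2, 1)

# Scalar and array: the scalar is stretched to every element.
print(grid * 2)

# (2, 3) with (3,): the row is reused for each row of grid.
print(grid + row)

# (2, 3) with (2, 1): the column is reused across the columns of grid.
print(grid + col)

# (2, 1) with (3,): both operands are stretched to shape (2, 3).
print((col + row).shape)

# Broadcasting also applies to comparisons, not only arithmetic.
print(grid >= np.array([1, 1, 5]))
```

Shapes that cannot be stretched to a common shape, such as (2, 3) with (2,), raise a ValueError instead of guessing.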

## Getting started with NumPy

* NumPy's array class is called ndarray. 
  * It is also known by the alias array. 
  * Note that numpy.array is not the same as the Standard Python Library class array.array, which only handles one-dimensional arrays and offers less functionality. 
  
The more important attributes of an ndarray object are:
* ndarray.ndim
  * the number of axes (dimensions) of the array.
* ndarray.shape
  * the dimensions of the array. 
    * This is a tuple of integers indicating the size of the array in each dimension.
    * For a matrix with n rows and m columns, shape will be (n,m). 
    * The length of the shape tuple is therefore the number of axes, ndim.
* ndarray.size
  * the total number of elements of the array. 
  * This is equal to the product of the elements of shape.
* ndarray.dtype
  * an object describing the type of the elements in the array. 
  * One can create or specify dtypes using standard Python types. 
  * Additionally NumPy provides types of its own. numpy.int32, numpy.int16, and numpy.float64 are some examples.
* ndarray.itemsize
  * the size in bytes of each element of the array. 
  * For example, an array of elements of type float64 has itemsize 8 (=64/8), while one of type int32 has itemsize 4 (=32/8). 
  * It is equivalent to ndarray.dtype.itemsize.
* ndarray.data
  * the buffer containing the actual elements of the array. 
  * Normally, we won't need to use this attribute because we will access the elements in an array using indexing facilities.

In [8]:
# Example NumPy array, and the attributes of the array
import numpy as np

a = np.arange(15).reshape(3, 5)    # create a 3 x 5 two-dimensional array (matrix)
print("The array is: ", a)
print("The shape of the array is: ", a.shape)
print("The number of dimension of the array is: ", a.ndim)
print("The type of array element is: ", a.dtype.name)
print("The size of each elements of the array is: ", a.itemsize)
print("The size of the array is: ", a.size)
print("The type of the array is: ", type(a))


The array is:  [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]
The shape of the array is:  (3, 5)
The number of dimension of the array is:  2
The type of array element is:  int64
The size of each elements of the array is:  8
The size of the array is:  15
The type of the array is:  <class 'numpy.ndarray'>


## Creating numpy arrays

numpy.array(): creates an array from element values specified by the user.

### array()
Write array elements directly.

In [0]:
# Example of numpy array creation
import numpy as np

b = np.array([6,7,8]) # create an array with values 6, 7, 8
c = np.array(["apple", "orange", "banana"])  # you can even create arrays whose values are not numbers
af = np.array([2.3, 3.6, 4.7]) # you can create floating-point arrays, too, of course
print(b)
print(c)
print(af)

# you can create a new multi-dimensional array by specifying the values directly
f = np.array([[1, 3, 5],
             [2, 4, 6]])
print(f)


[6 7 8]
['apple' 'orange' 'banana']
[2.3 3.6 4.7]
[[1 3 5]
 [2 4 6]]


Specify element data type:

You can specify the data type of the array elements at creation time.


In [0]:
a = np.array([2, 4, 6], dtype=float)
b = np.array([2, 4, 6], dtype=complex) # numpy supports complex numbers

print(a)
print(b)

[2. 4. 6.]
[2.+0.j 4.+0.j 6.+0.j]


### zeros(), ones(), empty()

In data processing, often the values of an array are unknown when the array is created, but its size is known. The values of the array will be filled in later.  NumPy offers several functions to create arrays with initial placeholder content:
* numpy.zeros() creates an array full of zeros
* numpy.ones() creates an array full of ones
* numpy.empty() creates an array whose initial content is arbitrary (uninitialized) and depends on the state of the memory. 
* The arguments are the shape and, optionally, the dtype of the array.

Notes:
  * These functions minimize the necessity of growing arrays, which is an expensive operation.
  * By default, the dtype of the created array is float64.

In [9]:
a = np.zeros( (3,4) )
b = np.ones( (2,3,4), dtype=np.int16 ) 
c = np.empty( (2,3) ) 

print(a)
print(b)
print(c)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
[[[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]

 [[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]]
[[2.46331250e-316 8.39911598e-323 2.90250257e+127]
 [6.19362300e+175 7.50189709e+247 5.38891779e+228]]
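A common use of these functions is to allocate an array once and fill it in later, rather than growing it step by step. The sketch below assumes a hypothetical task (storing the first few squares); `np.full` and `np.zeros_like` are related NumPy helpers not covered above.

```python
import numpy as np

n = 5
squares = np.zeros(n, dtype=np.int64)   # allocate once, with a known size
for i in range(n):
    squares[i] = i * i                  # fill in place; the array never grows
print(squares)                          # [ 0  1  4  9 16]

# np.full fills with a chosen value; zeros_like copies shape and dtype from another array.
sevens = np.full((2, 3), 7)
same_shape = np.zeros_like(sevens)
print(sevens.shape, same_shape.dtype == sevens.dtype)
```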


### arange()
numpy.arange(): 
* create an array and initialize it with sequential numbers
* Usually the arange() call is followed by a reshape() call.


In [10]:
a = np.arange(12)  # create an array with 12 elements
print(a)

# you can also create multi-dimensional arrays
d = np.arange(12).reshape(3,4)
# you can create a new multi-dimensional array by reshaping an old array
e = d.reshape(2,6)
print(d)
print(e)

[ 0  1  2  3  4  5  6  7  8  9 10 11]
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]]


In [0]:
# Example of NumPy ndarray
# forming a 3-dimensional array with 36 elements
arr = np.arange(36).reshape(3, 4, 3)
print(arr)

# describe the array's shape
print(arr.shape)

[[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]
  [ 9 10 11]]

 [[12 13 14]
  [15 16 17]
  [18 19 20]
  [21 22 23]]

 [[24 25 26]
  [27 28 29]
  [30 31 32]
  [33 34 35]]]
(3, 4, 3)


#### Advanced use of numpy.arange()
numpy.arange() behaves like the array version of the range() function.  You can supply 1, 2, or 3 arguments.

In [11]:
q = np.arange(2,8)
print(q)

a = np.arange( 10, 40, 6 )  # create an array from 10 up to (but not including) 40, with a step of 6
print(a)

b = np.arange( 0, 2, 0.4 )
print(b)

[2 3 4 5 6 7]
[10 16 22 28 34]
[0.  0.4 0.8 1.2 1.6]


### linspace()
numpy.linspace()
* creates an array of equally spaced numbers
* Input: start number, end number, number of points (both end points are included by default)
* Use it when you know how many points you want rather than the step between them.

In [0]:
import numpy as np

a = np.linspace( 3, 6, 16 )    # from 3 to 6, create 16 equally spaced elements
b = np.linspace(0, 2*np.pi, 11)  # from 0 to 2*pi, create 11 equally spaced elements
c = np.linspace(0, 1, 5)
print(a)
print(b)
print(c)

[3.  3.2 3.4 3.6 3.8 4.  4.2 4.4 4.6 4.8 5.  5.2 5.4 5.6 5.8 6. ]
[0.         0.62831853 1.25663706 1.88495559 2.51327412 3.14159265
 3.76991118 4.39822972 5.02654825 5.65486678 6.28318531]
[0.   0.25 0.5  0.75 1.  ]
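As a small, hedged illustration of how the end point is treated, the sketch below uses two optional parameters of `numpy.linspace`, `endpoint` and `retstep`. The variable names are illustrative.

```python
import numpy as np

# linspace includes the end point by default; endpoint=False leaves it out.
closed = np.linspace(0, 1, 5)
half_open = np.linspace(0, 1, 5, endpoint=False)
print(closed)      # values 0, 0.25, 0.5, 0.75, 1
print(half_open)   # values 0, 0.2, 0.4, 0.6, 0.8

# retstep=True also returns the spacing between neighbouring samples.
samples, step = np.linspace(3, 6, 16, retstep=True)
print(len(samples), step)
```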


### The random family of functions
Create arrays using the numpy.random module:
  * numpy.random.choice(): for labeled data and other data
  * numpy.random.randint(): for integers
  * numpy.random.random(): for floating points

In [12]:
import numpy as np

np.random.seed(444)

x = np.random.choice(["happy","sad"], size=20)
y = np.random.choice(["apple", "banana", "orange"], size=10)
z = np.random.choice([True, False], size=5)
w = np.random.randint(5, 10, 10)  # between 5 and "before 10", create 10 random numbers
v = np.random.random((2,3))  # create random floating number in [0, 1) to fill the shape (2,3)
print(x)
print(y)
print(z)
print(w)
print(v)

['sad' 'happy' 'sad' 'happy' 'sad' 'happy' 'sad' 'happy' 'happy' 'sad'
 'happy' 'sad' 'happy' 'happy' 'happy' 'sad' 'happy' 'happy' 'happy' 'sad']
['apple' 'orange' 'apple' 'apple' 'apple' 'orange' 'orange' 'apple'
 'orange' 'apple']
[ True  True False  True False]
[8 7 5 8 7 9 6 7 6 5]
[[0.56915293 0.64959807 0.94676303]
 [0.07810691 0.07490824 0.99666366]]
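Newer NumPy releases (1.17 and later, which is an assumption about your installation) also offer a generator object created by `numpy.random.default_rng`. The sketch below repeats the same kinds of draws with that interface; the drawn values will differ from the output above even with the same seed.

```python
import numpy as np

rng = np.random.default_rng(444)      # seeded generator; 444 mirrors the seed used above

moods = rng.choice(["happy", "sad"], size=20)
ints = rng.integers(5, 10, size=10)   # integers in [5, 10)
floats = rng.random((2, 3))           # floats in [0, 1)
print(moods)
print(ints)
print(floats)
```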


## ndarray operations

### Basic arithmetic operations

When **applying** arithmetic operators to two arrays, NumPy applies the operation to each pair of corresponding elements (elementwise) and creates a new array to hold the result.

In [13]:
import numpy as np

# two arrays
a = np.array( [10, 20, 30, 40] )
b = np.array( [1, 2, 3, 4] )

c = a + b
d = a - b
e = a * b
f = a / b

print(c)
print(d)
print(e)
print(f)

[11 22 33 44]
[ 9 18 27 36]
[ 10  40  90 160]
[10. 10. 10. 10.]


In [15]:
# an array and a constant

a = np.array([1,2,3,4])
b = a**2
c = a + 3
print(b)
print(c)

d = np.array([2,2,4,4])
e = (a==d)
print(e)

f = a > 2.5
print(f)

[ 1  4  9 16]
[4 5 6 7]
[False  True False  True]
[False False  True  True]


#### += and \*= operators
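Operations like  += and \*= modify arrays in place rather than creating new ones.

However, the result must fit the dtype of the existing array.  An error occurs if, for example, a float array is added in place to an integer array, because the float result cannot be stored as integers automatically.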

In [16]:
import numpy as np
a = np.array([1,2,3])
b = np.array([4,5,6])

a += b
print(a)

b *= 3
print(b)

[5 7 9]
[12 15 18]


In [0]:
b = np.array([1.5,2.5,3.5])
a += b  # "a" is an integer array, so the float result cannot be stored in place.  This raises a TypeError

When operating with arrays of different types, the type of the resulting array corresponds to the more general or precise one (a behavior known as upcasting).

This is similar in most programming languages.
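The difference between creating a new array (upcasting allowed) and updating in place (no upcasting) can be sketched as follows. The variable names are illustrative, and `astype` is used here as one possible way to convert first.

```python
import numpy as np

ints = np.array([1, 2, 3])
floats = np.array([1.5, 2.5, 3.5])

# A new array is created, so the result can take the more general type.
total = ints + floats
print(total.dtype)             # float64

# In-place addition would have to store float results in the int array, which fails.
try:
    ints += floats
except TypeError as err:
    print("in-place add failed:", type(err).__name__)

# Converting explicitly first makes the in-place operation legal.
as_float = ints.astype(float)
as_float += floats
print(as_float)                # [2.5 4.5 6.5]
```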

### Element pair product vs. matrix product
Unlike in many matrix languages, the product operator * operates elementwise in NumPy arrays. The matrix product can be performed using the @ operator (in python >=3.5) or the dot function or method:


In [0]:
A = np.array([[1,1],
             [0,1]])
B = np.array([[2,0],
             [3,4]])
C = A * B      # elementwise product
D = A @ B      # matrix product
E = A.dot(B)   # matrix product

print(C)
print(D)
print(E)

[[2 0]
 [0 4]]
[[5 4]
 [3 4]]
[[5 4]
 [3 4]]


### NumPy built-in unary functions

* sum()
* max()
* min()

They can also work along a specific axis.

In [5]:
import numpy as np
a = np.array([[6.3, 5.7, 12.6, 2.8],[1,2,3,4]])

# very convenient and powerful functions
print(a.sum())
print(sum(a))   # Python's built-in sum() adds the rows together, like a.sum(axis=0)
print(a.max())
print(a.min())

37.4
[ 7.3  7.7 15.6  6.8]
12.6
1.0


In [0]:
b = np.arange(12).reshape(3,4)
print(b)

print(b.sum())

print(b.sum(axis=0))
print(b.sum(axis=1))

print(b.max(axis=1))


print("Cumulative sum along each row ---\n", b.cumsum(axis=1))   

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
66
[12 15 18 21]
[ 6 22 38]
[ 3  7 11]
Cumulative sum along each row ---
 [[ 0  1  3  6]
 [ 4  9 15 22]
 [ 8 17 27 38]]
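The axis argument works the same way for other reductions. The sketch below is an illustrative example that also uses the optional `keepdims` parameter, which is not discussed above.

```python
import numpy as np

b = np.arange(12).reshape(3, 4)

# axis=0 collapses the rows (one result per column);
# axis=1 collapses the columns (one result per row).
col_means = b.mean(axis=0)
row_argmax = b.argmax(axis=1)
print(col_means)      # [4. 5. 6. 7.]
print(row_argmax)     # [3 3 3]

# keepdims=True keeps the reduced axis with length 1, which is handy for broadcasting.
row_totals = b.sum(axis=1, keepdims=True)   # shape (3, 1)
fractions = b / row_totals                  # each row now sums to 1
print(fractions.sum(axis=1))
```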


## NumPy universal functions

NumPy provides familiar mathematical functions such as sin, cos, and exp. In NumPy, these are called "universal functions" (ufunc). Within NumPy, these functions operate elementwise on an array, producing an array as output.



In [13]:
a = np.array([1,2,3,4])
print(np.exp(a))
print(np.sqrt(a))
print(np.sin(a))

b = np.array([1,1,2,2])
print(np.add(a,b))
print(a+b)


[ 2.71828183  7.3890561  20.08553692 54.59815003]
[1.         1.41421356 1.73205081 2.        ]
[ 0.84147098  0.90929743  0.14112001 -0.7568025 ]
[2 3 5 6]
[2 3 5 6]
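Two-argument ufuncs also carry methods such as `reduce` and `accumulate`. The following is a brief sketch with made-up inputs.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([4, 3, 2, 1])

biggest = np.maximum(a, b)            # elementwise maximum of the two arrays
total = np.add.reduce(a)              # folds the array with "+", same as a.sum()
running = np.multiply.accumulate(a)   # running product

print(biggest)   # [4 3 3 4]
print(total)     # 10
print(running)   # [ 1  2  6 24]
```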


## Indexing and slicing

One-dimensional NumPy arrays:  indexing is similar to Python lists.

Multi-dimensional indexing and slicing is powerful.

In [14]:
import numpy as np
# Let's first create a 2-d array that is easy to see
# function f() will be used in 2-d array creation.
def f(x,y):
  return 10*x+y

# the position of each cell will be given to f(), return value will be put in the cell
b = np.fromfunction(f,(5,4), dtype=int)  
print(b)


[[ 0  1  2  3]
 [10 11 12 13]
 [20 21 22 23]
 [30 31 32 33]
 [40 41 42 43]]


In [0]:
# Index scheme similar to regular python
print(b[3,2])  # cell at row 3 and column 2
print(b[0:5, 1])  # all of column 1
print(b[:, 1])    # same as above, 
print(b[2, :])    # all of row 2
print(b[1:3, 2:4])   # a sub-matrix consisting of the intersection of row 1, 2 and column 2, 3


32
[ 1 11 21 31 41]
[ 1 11 21 31 41]
[20 21 22 23]
[[12 13]
 [22 23]]


In [0]:
# numpy indexing
print(b[0])       # when only 1 index, it means the whole row.  The first row
print(b[-1])      # The last row 
# "..." means filling in as many colons ":" as needed.
print(b[-1,...])  # Same as above, the last row
print(b[...,0])   # The first column 
print(b[...,1])   # The 2nd column
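The meaning of "..." is easier to see on an array with more than two axes. The sketch below uses a hypothetical 3-D array `t` and also shows boolean-mask indexing, one of the NumPy-specific extensions to Python's slicing.

```python
import numpy as np

t = np.arange(24).reshape(2, 3, 4)   # a 3-D array, chosen only for illustration

# "..." expands to as many ":" as needed to cover the remaining axes.
print(np.array_equal(t[1, ...], t[1, :, :]))   # True
print(np.array_equal(t[..., 0], t[:, :, 0]))   # True

# A boolean mask selects the elements where the condition holds.
multiples_of_ten = t[t % 10 == 0]
print(multiples_of_ten)                         # [ 0 10 20]
```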

## Iterating over numpy arrays

Iterate over a multi-dimensional array:
* One row at a time.
* One element at a time.

In [0]:
import numpy as np
# Let's first create a 2-d array that is easy to see
# function f() will be used in 2-d array creation.
def f(x,y):
  return 10*x+y

# the position of each cell will be given to f(), return value will be put in the cell
b = np.fromfunction(f,(5,4), dtype=int)  
print(b)

[[ 0  1  2  3]
 [10 11 12 13]
 [20 21 22 23]
 [30 31 32 33]
 [40 41 42 43]]


In [0]:
# One row at a time
for row in b:
  print("Get one row:", row)

Get one row: [0 1 2 3]
Get one row: [10 11 12 13]
Get one row: [20 21 22 23]
Get one row: [30 31 32 33]
Get one row: [40 41 42 43]


In [0]:
# To iterate one element at a time, we can use the "flat" attribute.
print("type of b.flat is:", type(b.flat))
for element in b.flat:
  print("Get one element:", element)

type of b.flat is: <class 'numpy.flatiter'>
Get one element: 0
Get one element: 1
Get one element: 2
Get one element: 3
Get one element: 10
Get one element: 11
Get one element: 12
Get one element: 13
Get one element: 20
Get one element: 21
Get one element: 22
Get one element: 23
Get one element: 30
Get one element: 31
Get one element: 32
Get one element: 33
Get one element: 40
Get one element: 41
Get one element: 42
Get one element: 43
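When the position of each element is needed as well as its value, `numpy.ndenumerate` is one option. This is a small sketch on a made-up 2 x 2 array.

```python
import numpy as np

grid = np.array([[0, 1], [10, 11]])

# np.ndenumerate yields (index, value) pairs in the same order as .flat
pairs = []
for (row, col), value in np.ndenumerate(grid):
    pairs.append((row, col, int(value)))
    print(f"grid[{row}, {col}] = {value}")
```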


## Change shape

The following three all return a modified array, but do not change the original array:
* ravel()
* reshape()
* T

In [20]:
# build a 3 by 4 array
a = np.floor(10*np.random.random((3,4)))
print(a)
a.shape
print(a.flatten())
print(a.ravel())

print(a.reshape(6,2))
print(a.T)

[[7. 9. 8. 7.]
 [1. 5. 1. 5.]
 [4. 4. 6. 4.]]
[7. 9. 8. 7. 1. 5. 1. 5. 4. 4. 6. 4.]
[7. 9. 8. 7. 1. 5. 1. 5. 4. 4. 6. 4.]
[[7. 9.]
 [8. 7.]
 [1. 5.]
 [1. 5.]
 [4. 4.]
 [6. 4.]]
[[7. 1. 4.]
 [9. 5. 4.]
 [8. 1. 6.]
 [7. 5. 4.]]


#### reshape() vs resize()
reshape() and resize() perform similar tasks:
* reshape() returns a new array with the specified shape.
* resize() changes the original array itself.

In [0]:
print(a.reshape(2,6))  # reshape() returns a new array, but "a" is not changed.
print(a)
print(a.resize(2,6))  # resize() returns None, but "a" is changed.
print(a)

[[1. 1. 9. 0. 3. 5.]
 [7. 0. 6. 7. 6. 6.]]
[[1. 1. 9. 0.]
 [3. 5. 7. 0.]
 [6. 7. 6. 6.]]
None
[[1. 1. 9. 0. 3. 5.]
 [7. 0. 6. 7. 6. 6.]]
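Two details of reshape() are worth a quick, hedged sketch: a dimension given as -1 is inferred from the array's size, and the returned array normally shares its data with the original. The names `a` and `m` below are local to this example.

```python
import numpy as np

a = np.arange(12)

# -1 asks NumPy to work out that dimension from the array's size.
m = a.reshape(3, -1)
print(m.shape)          # (3, 4)

# The shape of "a" is unchanged, but the data is shared:
# writing through the reshaped array changes the original too.
m[0, 0] = 99
print(a.shape, a[0])    # (12,) 99
```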


## Stacking arrays together

In [21]:
import numpy as np
a = np.array([[1,2],
             [3,4]])
b = np.array([[5,6],
             [7,8]])
c = np.vstack((a,b))
d = np.hstack((a,b))

print(c, c.shape)
print(d)

[[1 2]
 [3 4]
 [5 6]
 [7 8]] (4, 2)
[[1 2 5 6]
 [3 4 7 8]]


The function **column_stack()** stacks 1D arrays as columns into a 2D array. It is equivalent to hstack only for 2D arrays.

The function **row_stack()** is equivalent to vstack() in any case (it is an alias, and it is deprecated as of NumPy 2.0 in favor of vstack()).

In [0]:
c = np.column_stack((a,b)) # for 2D arrays, it's the same as hstack()
print(c)

a1 = np.array([1,2,3,4])
a2 = np.array([5,6,7,8])
a3 = np.array([9,10,11,12])
d = np.column_stack((a1,a2,a3))
print(d)


[[1 2 5 6]
 [3 4 7 8]]
[[ 1  5  9]
 [ 2  6 10]
 [ 3  7 11]
 [ 4  8 12]]
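The stacking helpers can also be expressed with `numpy.concatenate` and an explicit axis. This is a minimal sketch with the same kind of 2 x 2 inputs as above.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

rows = np.concatenate((a, b), axis=0)   # same result as np.vstack((a, b))
cols = np.concatenate((a, b), axis=1)   # same result as np.hstack((a, b))
print(rows.shape, cols.shape)           # (4, 2) (2, 4)
```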


### newaxis

We can turn a 1-d array into a 2-d column by using "newaxis".

In [0]:
import numpy as np
from numpy import newaxis

a = np.array([1,2,3,4])
print(a)
b = a[:,newaxis]
print(b)

[1 2 3 4]
[[1]
 [2]
 [3]
 [4]]


### r_ and c_
You can consider r_ and c_ as shorthand for building arrays by stacking numbers along the row direction and the column direction.

In [0]:
print(np.r_[1:4,0, 4])
print(np.r_[2:5, 11:13])


[1 2 3 0 4]
[ 2  3  4 11 12]
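The cell above only exercises r_. For comparison, here is a short sketch of c_ applied to two made-up 1-D arrays.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

joined = np.r_[x, y]   # joined end to end
paired = np.c_[x, y]   # each input becomes one column
print(joined)          # [1 2 3 4 5 6]
print(paired.shape)    # (3, 2)
```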


## Splitting an array into several smaller ones

Using hsplit, you can split an array along its horizontal axis, either by specifying the number of equally shaped arrays to return, or by specifying the columns after which the division should occur:

In [0]:
import numpy as np
# build a 2x9 array
a = np.r_[0:9]
b = np.r_[9:18]
c = np.vstack((a,b))
print(c)

# split c horizontally into 3 equal parts
d = np.hsplit(c,3)
print(d)

# split c into 3 parts, cutting before column 2 and before column 5
d = np.hsplit(c,(2,5))
print(d)

[[ 0  1  2  3  4  5  6  7  8]
 [ 9 10 11 12 13 14 15 16 17]]
[array([[ 0,  1,  2],
       [ 9, 10, 11]]), array([[ 3,  4,  5],
       [12, 13, 14]]), array([[ 6,  7,  8],
       [15, 16, 17]])]
[array([[ 0,  1],
       [ 9, 10]]), array([[ 2,  3,  4],
       [11, 12, 13]]), array([[ 5,  6,  7,  8],
       [14, 15, 16, 17]])]
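Splitting along the other axis, or into parts of unequal size, works in a similar way. The sketch below uses `numpy.vsplit` and `numpy.array_split` on an array built like the one above.

```python
import numpy as np

c = np.arange(18).reshape(2, 9)

top, bottom = np.vsplit(c, 2)            # split along the vertical axis (rows)
print(top.shape, bottom.shape)           # (1, 9) (1, 9)

# array_split tolerates sections of unequal size;
# hsplit(c, 4) would raise an error because 9 columns do not divide into 4 equal parts.
parts = np.array_split(c, 4, axis=1)
print([p.shape for p in parts])          # [(2, 3), (2, 2), (2, 2), (2, 2)]
```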


## Depth of copy

When we create one NumPy array from another, there can be three kinds of relationship between the two arrays:
* No copy: only a new name is created.  The new name and the old name refer to the same array.
* Shallow copy (view): a new array object is created, but it shares the same data.
* Deep copy: a new array object and a new copy of the data are created.





In [0]:
# Demonstration of no copy
c = np.arange(18).reshape(2,9)
print(c)
d = c
print(d is c)

print(c.shape)
d.shape = (3,6)
print(c.shape)

[[ 0  1  2  3  4  5  6  7  8]
 [ 9 10 11 12 13 14 15 16 17]]
True
(2, 9)
(3, 6)


In [0]:
# demonstration of shallow copy
c.shape = 2, 9
d = c.view()   # the view() method creates a shallow copy
print(d is c)
print(d.base is c)   # False here: c is itself a view made by reshape(), so d.base is c.base
print(d.flags.owndata)

print(c.shape)
d.shape = 3, 6
print(c.shape)

d[0,0] = 100
print(d)
print(c)

False
False
False
(2, 9)
(2, 9)
[[100   1   2   3   4   5]
 [  6   7   8   9  10  11]
 [ 12  13  14  15  16  17]]
[[100   1   2   3   4   5   6   7   8]
 [  9  10  11  12  13  14  15  16  17]]


In [0]:
# Demonstration of deep copy
c[0,0] = 0
d = c.copy()
print(d is c)
print(d.base is c)
print(d.flags.owndata)

d.shape = 3,6
print("c.shape is:", c.shape)
print("d.shape is:", d.shape)
d[0,0] = 200
print("c is", c)
print("d is", d)

False
False
True
c.shape is: (2, 9)
d.shape is: (3, 6)
c is [[ 0  1  2  3  4  5  6  7  8]
 [ 9 10 11 12 13 14 15 16 17]]
d is [[200   1   2   3   4   5]
 [  6   7   8   9  10  11]
 [ 12  13  14  15  16  17]]
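Views also appear without an explicit view() call: basic slicing returns a view, while indexing with a list of positions returns a copy. The sketch below uses `numpy.shares_memory` to check which is which; the array names are illustrative.

```python
import numpy as np

base = np.arange(10)

window = base[2:5]          # basic slicing returns a view
window[:] = 0               # writes through to base
print(base)                 # [0 1 0 0 0 5 6 7 8 9]

picked = base[[7, 8]]       # indexing with a list returns a copy
picked[:] = -1              # base is not affected
print(base[7:9])            # [7 8]

print(np.shares_memory(base, window), np.shares_memory(base, picked))  # True False
```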


## Functions and methods overview

Here is a list of some useful NumPy function and method names, ordered by category. See the Routines section of the NumPy reference for the full list.


* Array Creation
  * arange, array, copy, empty, empty_like, eye, fromfile, fromfunction, identity, linspace, logspace, mgrid, ogrid, ones, ones_like, r_, zeros, zeros_like
* Conversions
  * ndarray.astype, atleast_1d, atleast_2d, atleast_3d, mat
* Manipulations
  * array_split, column_stack, concatenate, diagonal, dsplit, dstack, hsplit, hstack, ndarray.item, newaxis, ravel, repeat, reshape, resize, squeeze, swapaxes, take, transpose, vsplit, vstack
* Questions
  * all, any, nonzero, where
* Ordering
  * argmax, argmin, argsort, max, min, ptp, searchsorted, sort
* Operations
  * choose, compress, cumprod, cumsum, inner, ndarray.fill, imag, prod, put, putmask, real, sum
* Basic Statistics
  * cov, mean, std, var
* Basic Linear Algebra
  * cross, dot, outer, linalg.svd, vdot

## Vectorization?
Vectorization is a powerful ability within NumPy 
* to express operations as occurring on entire arrays rather than their individual elements. 

In [3]:
import numpy as np
x = np.random.choice(["happy","sad"], size=20)
print(x)

['happy' 'sad' 'happy' 'sad' 'sad' 'sad' 'sad' 'sad' 'happy' 'happy' 'sad'
 'happy' 'sad' 'sad' 'happy' 'sad' 'sad' 'happy' 'happy' 'sad']


In [4]:
# Count the number of transitions in x
# That is, places where neighbouring elements differ: happy-->sad or sad-->happy
def count_transitions(x):
  count = 0
  for i, j in zip(x[:-1], x[1:]):
    #print(i, j)
    if i != j:
      count += 1
  return count

print("The number of transition in x is: ", count_transitions(x))

The number of transition in x is:  11


In [6]:
# Count the number of transition in x, using numpy's ndarray operation directly
# Step by step explanation
y = x[:-1] != x[1:]
print(y)

print( np.count_nonzero(y))

[ True  True  True False False False False  True False  True  True  True
 False  True  True False  True False  True]
11


In [5]:
# Count the number of transition in x, using numpy's ndarray operation directly
print(np.count_nonzero(x[:-1] != x[1:]))


11


In [0]:
# Find out the performance difference between the for-loop implementation and the NumPy implementation
x = np.random.choice(["happy","sad"], size=1000)


from timeit import timeit
setup = 'from __main__ import count_transitions, x; import numpy as np'
num = 1000
t1 = timeit('count_transitions(x)', setup=setup, number=num)
t2 = timeit('np.count_nonzero(x[:-1] != x[1:])', setup=setup, number=num)
print('Speed difference: {:0.1f}x'.format(t1 / t2))

# --> numpy implementation is much much faster (and much simpler)

Speed difference: 36.2x


In [7]:
# A stock example
# Find out the maximum profit

prices = (20, 18, 14, 17, 20, 21, 15)

def profit(prices):
  max_prof = 0
  min_px = prices[0]
  for px in prices[1:]:
      min_px = min(min_px, px)
      max_prof = max(px - min_px, max_prof)
  return max_prof

profit(prices)


7

In [0]:
# implement using numpy
# do it in three lines for clarity
# ML: this version is not always correct; it fails when the lowest price comes after the best day to sell
import numpy as np

prices = (20, 18, 14, 17, 20, 21, 15)
mn = np.argmin(prices)  # find out the position with the min value
mx = mn + np.argmax(prices[mn:])   # find out the position with the max value
print(mn)
print(mx)
max_profit = prices[mx] - prices[mn]
print(max_profit)

2
5
7
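To see why the argmin-based version is flagged as "not always correct", the sketch below compares it with the loop version on a made-up price series whose lowest price falls on the last day. The function names `profit_loop` and `profit_argmin` are hypothetical wrappers around the two approaches shown above.

```python
import numpy as np

def profit_loop(prices):
    # Reference version: track the lowest price seen so far.
    max_prof = 0
    min_px = prices[0]
    for px in prices[1:]:
        min_px = min(min_px, px)
        max_prof = max(px - min_px, max_prof)
    return max_prof

def profit_argmin(prices):
    # Buys at the global minimum, then sells at the best later price.
    mn = int(np.argmin(prices))
    mx = mn + int(np.argmax(prices[mn:]))
    return prices[mx] - prices[mn]

late_low = (20, 18, 14, 17, 20, 21, 1)   # the lowest price is on the last day
print(profit_loop(late_low))     # 7 (buy at 14, sell at 21)
print(profit_argmin(late_low))   # 0 (buys at 1, nothing left to sell into)
```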


In [0]:
# numpy version 2
import numpy as np
prices = (20, 18, 14, 17, 20, 21, 15)

min_so_far = np.minimum.accumulate(prices)
print(min_so_far)
profitarray = prices - min_so_far
max_profit = np.max(profitarray)

print(max_profit)

[20 18 14 14 14 14 14]
7


In [0]:
# cleaned-up version of numpy version 2
# only need 2 lines
min_so_far = np.minimum.accumulate(prices)
max_profit = np.max(prices - min_so_far)
print(max_profit)

7


In [0]:
# implement using numpy, the one line version
np.max(prices - np.minimum.accumulate(prices))

7