
<tr>
<td>
<a href="http://www3.canisius.edu/~yany/python/Python4DataAnalysis.pdf" target="_blank">
<img src="https://images-na.ssl-images-amazon.com/images/I/515XdK-YtFL._SX379_BO1,204,203,200_.jpg" style="float:left; width: 250px;">
</a>
</td>
</tr>

# NumPy Basics: Arrays and Vectorized Computation

- ***Num***erical ***Py***thon
- High-performance scientific computing and data analysis
    - **ndarray**, a fast and space-efficient multidimensional array providing vectorized arithmetic operations and sophisticated *broadcast* capabilities
    - Standard mathematical functions for fast operations on entire arrays of data without having to write loops
    - Tools for reading / writing array data to disk and working with memory-mapped files
    - Linear algebra, random number generation, and Fourier transform capabilities
    - Tools for integrating code written in C, C++, and Fortran

In [None]:
import numpy as np

## The NumPy ndarray: A Multidimensional Array Object
- ndarray : N-dimensional array object
- All of the elements must be the **same** type
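
The same-type rule can be seen directly. The following is a minimal sketch (the names `mixed` and `ints` are illustrative, not from the original notes) showing that mixed inputs are upcast to one common dtype and that values written into an existing array are converted to that array's dtype.

```python
import numpy as np

# Mixed ints and floats are upcast to a single common dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)        # float64

# Assigning a float into an integer array truncates it to fit the dtype.
ints = np.array([1, 2, 3])
ints[0] = 7.9
print(ints)               # [7 2 3]
```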

### Creating ndarrays

In [None]:
data1 = [6, 7.5, 8, 0, 1]

In [None]:
arr1 = np.array(data1)
arr1

In [None]:
data2 = [
    [1, 2, 3, 4], 
    [5, 6, 7, 8]
]

In [None]:
arr2 = np.array(data2)
arr2

In [None]:
arr2.ndim

In [None]:
arr1.ndim

In [None]:
arr2.shape

In [None]:
arr1.shape

In [None]:
arr1.dtype

In [None]:
arr2.dtype

In [None]:
np.zeros(10)

In [None]:
np.zeros((3, 6))

In [None]:
np.empty((2, 3, 2))   # allocates memory without initializing it; contents are arbitrary

In [None]:
range(10)

In [None]:
np.arange(15)

*Array creation functions*

|Function|Description|
|--------|-----------|
|array|Convert input data to an ndarray, either by inferring a dtype or by explicitly specifying a dtype|
|asarray|Convert input to an ndarray, but do not copy if the input is already an ndarray|
|arange|Like the built-in range, but returns an ndarray instead of a list|
|ones, ones_like|Produce an array of all 1's with the given shape and dtype|
|zeros, zeros_like|Like ones and ones_like, but producing arrays of 0's instead|
|empty, empty_like|Create new arrays by allocating new memory, but do not populate them with any values the way ones and zeros do|
|eye, identity|Create a square N x N identity matrix|
</div>
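
As a rough illustration of the creation functions in the table, the sketch below builds a few arrays and checks the copy behaviour of `asarray` versus `array`. The variable names (`template`, `same`, `copied`) are made up for this example.

```python
import numpy as np

template = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

ones = np.ones_like(template)   # same shape and dtype as template
zeros = np.zeros((2, 3))        # float64 by default
identity = np.eye(3)            # 3 x 3 identity matrix

# asarray returns the input itself when it is already an ndarray,
# while array makes a copy by default.
same = np.asarray(template)
copied = np.array(template)
print(same is template)     # True
print(copied is template)   # False
```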

### Data Types for ndarrays

In [None]:
arr1 = np.array([1, 2, 3], dtype=np.float64)
arr1

In [None]:
arr2 = np.array([1, 2, 3], dtype=np.int32)
arr2

In [None]:
arr1.dtype

In [None]:
arr2.dtype

*NumPy data types*

|Type|Type Code|Description|
|----|----------|---------|
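|int8, uint8|i1, u1|Signed and unsigned 8-bit (1 byte) integer types|
|int16, uint16|i2, u2|Signed and unsigned 16-bit integer types|
|int32, uint32|i4, u4|Signed and unsigned 32-bit integer types|
|int64, uint64|i8, u8|Signed and unsigned 64-bit integer types|
|float16|f2|Half-precision floating point|
|float32|f4 or f|Standard single-precision floating point; compatible with C float|
|float64|f8 or d|Standard double-precision floating point; compatible with C double and Python float|
|float128|f16 or g|Extended-precision floating point (availability depends on the platform)|
|complex64, complex128, complex256|c8, c16, c32|Complex numbers represented by two 32-, 64-, or 128-bit floats, respectively|
|bool|?|Boolean type storing True and False values|
|object|O|Python object type|
|string_|S|Fixed-length byte string type (1 byte per character); spelled bytes_ in NumPy 2.0 and later|
|unicode_|U|Fixed-length Unicode string type; spelled str_ in NumPy 2.0 and later|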


In [None]:
# type cast 1
arr = np.array([1, 2, 3, 4, 5])

In [None]:
arr.dtype

In [None]:
float_arr = arr.astype(np.float64)
float_arr

In [None]:
float_arr.dtype

In [None]:
# type cast 2
arr = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])
arr

In [None]:
arr.astype(np.int32)

In [None]:
# type cast 3: byte strings to floats (np.bytes_ is the same type as the older np.string_ alias)
numeric_strings = np.array(['1.25', '-9.6', '42'], dtype=np.bytes_)
numeric_strings

In [None]:
numeric_strings.astype(float)

In [None]:
# passing dtype=float parses the strings into floats during construction
numeric_strings = np.array(['1.25', '-9.6', '42'], dtype=float)
numeric_strings

In [None]:
# type cast by another dtype
int_array = np.arange(10)

In [None]:
calibers = np.array([.22, .270, .357, .44, .50], dtype=np.float64)

In [None]:
int_array.astype(calibers.dtype)

In [None]:
# dtype given as a type code string ('u4' means uint32)
empty_uint32 = np.empty(8, dtype='u4')
empty_uint32

### Operations between Arrays and Scalars

In [None]:
arr = np.array([[1., 2., 3.], [4., 5., 6.]])
arr

In [None]:
# element-wise multiplication of equal-sized arrays (not matrix multiplication)
arr * arr

In [None]:
arr - arr

In [None]:
1 / arr

In [None]:
arr ** 0.5

Broadcasting: operations between differently sized arrays

In [None]:
arr * 10
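
Broadcasting also applies between arrays of different shapes, not only between an array and a scalar. The sketch below is a small illustration under that assumption; `row` and `col` are hypothetical names introduced here.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])   # shape (2, 3)
row = np.array([10., 20., 30.])                 # shape (3,)
col = np.array([[100.], [200.]])                # shape (2, 1)

by_row = arr + row   # row is added to each of the 2 rows
by_col = arr + col   # col is added to each of the 3 columns
print(by_row)
print(by_col)
```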

### Basic Indexing and Slicing

In [None]:
arr = np.arange(10)
arr

In [None]:
arr[5]

In [None]:
arr[5:8]

In [None]:
arr[5:8] = 12

In [None]:
arr

In [None]:
# compare with a list

In [None]:
lst = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]   # named lst to avoid shadowing the built-in list

In [None]:
lst[5:8] = [12, 12, 12]

In [None]:
lst

* Distinction from lists
    - Array slices are *views* on the original array.
    - The data is not copied, and any modifications to the view will be reflected in the source array.

In [None]:
arr_slice = arr[5:8]

In [None]:
arr_slice

In [None]:
arr_slice[1] = 12345

In [None]:
arr_slice

In [None]:
arr

In [None]:
arr_slice[:] = 64

In [None]:
arr_slice

In [None]:
# the change made through the slice is visible in the original array
arr

* To get an independent copy instead of a view, copy explicitly: **arr[5:8].copy()**
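
The difference between a view and an explicit copy can be checked with `np.shares_memory`. This is a minimal sketch; the names `view` and `copy` are chosen only for the example.

```python
import numpy as np

arr = np.arange(10)
view = arr[5:8]            # slice: a view onto arr
copy = arr[5:8].copy()     # explicit, independent copy

view[:] = 64               # writes through to arr
copy[:] = -1               # leaves arr untouched

print(arr)                            # [ 0  1  2  3  4 64 64 64  8  9]
print(np.shares_memory(arr, view))    # True
print(np.shares_memory(arr, copy))    # False
```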

In [None]:
# 2D array
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
arr2d.shape

In [None]:
arr2d[2]

In [None]:
arr2d[0][2]

In [None]:
arr2d[0, 2]

<figure style="float:left; width:350px;">
    <center>
    <img src="https://www.safaribooksonline.com/library/view/python-for-data/9781449323592/httpatomoreillycomsourceoreillyimages1346880.png">
    <figcaption>[ Indexing elements in a NumPy array ]</figcaption>
    </center>
</figure>

In [None]:
# 3D array (2 x 2 x 3)     
arr3d = np.array([
        [[1, 2, 3], 
         [4, 5, 6]], 
        [[7, 8, 9], 
         [10, 11, 12]]
])
arr3d

In [None]:
arr3d[0]

In [None]:
old_values = arr3d[0].copy()
arr3d[0] = 42

In [None]:
arr3d

In [None]:
arr3d[0] = old_values

In [None]:
arr3d        # the original values are restored

In [None]:
arr3d[1, 0]

#### Indexing with slices

In [None]:
arr[1:6]

In [None]:
arr2d

In [None]:
arr2d[:2]

In [None]:
arr2d[:2, 1:]

In [None]:
arr2d[1, :2]

In [None]:
arr2d[2, :1]

In [None]:
arr2d[:, :1]

In [None]:
# arr2d[:2, 1:] = 0

<figure style="float:left; width:350px;">
    <center>
    <img src="https://www.safaribooksonline.com/library/view/python-for-data/9781449323592/httpatomoreillycomsourceoreillyimages1346882.png">
    <figcaption>[ Two-dimensional array slicing ]</figcaption>
    </center>
</figure>

### Boolean Indexing

In [None]:
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
names


In [None]:
data = np.random.randn(7, 4)
data

In [None]:
names == 'Bob'

In [None]:
data[names == 'Bob']

In [None]:
data[names == 'Bob', 2:]

In [None]:
data[names == 'Bob', 3]

In [None]:
names != 'Bob'

In [None]:
data[~(names == 'Bob')]   # ~ inverts a boolean array; unary minus is not supported on booleans

In [None]:
mask = (names == 'Bob') | (names == 'Will')

In [None]:
mask

In [None]:
data[mask]

※ Selecting data from an array by boolean indexing *always* creates a **copy** of the data, even if the returned array is unchanged.
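
Note that the copy rule applies to *selection*, while *assignment* through a boolean mask still modifies the array in place. A small sketch of the distinction, using made-up sample values:

```python
import numpy as np

data = np.array([-1.0, 2.0, -3.0, 4.0])

negatives = data[data < 0]   # selection: a new array (copy)
negatives[:] = 0             # does not affect data
print(data)                  # [-1.  2. -3.  4.]

data[data < 0] = 0           # assignment: modifies data in place
print(data)                  # [0. 2. 0. 4.]
```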

In [None]:
data

In [None]:
data[data < 0]

In [None]:
data[data < 0] = 0

In [None]:
data

In [None]:
data[names != 'Joe'] = 7

In [None]:
data

### Fancy Indexing
Fancy indexing means indexing with integer arrays.

In [None]:
arr = np.empty((8, 4))

for i in range(8):
    arr[i] = i

In [None]:
# select rows in a given order by passing a list of indices; note the double brackets [[ ]]
arr[[4, 3, 0, 6]]

In [None]:
arr[[-3, -5, -7]]

In [None]:
arr = np.arange(32).reshape((8, 4))
arr

In [None]:
arr[[1, 5, 7, 2], [0, 3, 1, 2]]

In [None]:
arr[[1, 5, 7, 2]][:, [0, 3, 1, 2]]

In [None]:
arr[np.ix_([1, 5, 7, 2], [0, 3, 1, 2])]

Fancy indexing always copies the data into a new array, unlike slicing.
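
The sketch below contrasts the two forms of two-list indexing used in the cells above and confirms that the result is a copy. The names `rows`, `cols`, `pairs`, and `block` are illustrative.

```python
import numpy as np

arr = np.arange(32).reshape((8, 4))
rows, cols = [1, 5, 7, 2], [0, 3, 1, 2]

pairs = arr[rows, cols]            # elements (1,0), (5,3), (7,1), (2,2)
block = arr[np.ix_(rows, cols)]    # 4 x 4 block: every row/column combination

print(pairs)                           # [ 4 23 29 10]
print(block.shape)                     # (4, 4)
print(np.shares_memory(arr, block))    # False: fancy indexing copies
```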

### Transposing Arrays and Swapping Axes

In [None]:
arr = np.arange(15).reshape((3, 5))
arr

In [None]:
arr.T

In [None]:
arr = np.random.randn(6, 3)

In [None]:
arr

In [None]:
np.dot(arr.T, arr)

In [None]:
arr = np.arange(16).reshape((2, 2, 4))
arr

In [None]:
arr.transpose((1, 0, 2))   # reorder the axes: swap axes 0 and 1, keep axis 2

In [None]:
arr

In [None]:
arr.swapaxes(1, 2)
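
One way to read `transpose((1, 0, 2))` is that element `[i, j, k]` of the result is element `[j, i, k]` of the original. The following sketch checks that for one element and shows that `swapaxes` returns a view rather than a copy; `t` and `s` are names introduced for this example.

```python
import numpy as np

arr = np.arange(16).reshape((2, 2, 4))
t = arr.transpose((1, 0, 2))

# Element [i, j, k] of the result is element [j, i, k] of the original.
print(t[0, 1, 3] == arr[1, 0, 3])   # True

s = arr.swapaxes(1, 2)
print(s.shape)                       # (2, 4, 2)
print(np.shares_memory(arr, s))      # True: a view, no data copied
```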

## Universal Functions: Fast Element-wise Array Functions
*ufunc*: a function that performs element-wise operations on data in ndarrays

In [None]:
arr = np.arange(10)

In [None]:
# unary ufunc
np.sqrt(arr)


In [None]:
np.exp(arr)

In [None]:
x = np.random.randn(8)
y = np.random.randn(8)

In [None]:
x

In [None]:
y

In [None]:
# binary ufunc
np.maximum(x, y)   # element-wise maximum

In [None]:
arr = np.random.randn(7) * 5
arr

In [None]:
np.modf(arr)    # returns the fractional and integral parts as two separate arrays

*Unary ufuncs*

|Function|Description|
|---------|-----------|
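|abs, fabs|Compute the absolute value element-wise; fabs is for non-complex data|
|sqrt|Compute the square root of each element (arr ** 0.5)|
|square|Compute the square of each element (arr ** 2)|
|exp|Compute the exponential e^x of each element|
|log, log10, log2, log1p|Natural logarithm, log base 10, log base 2, and log(1 + x), respectively|
|sign|Compute the sign of each element: 1 (positive), 0 (zero), or -1 (negative)|
|ceil|Compute the smallest integer greater than or equal to each element|
|floor|Compute the largest integer less than or equal to each element|
|rint|Round elements to the nearest integer, preserving the dtype|
|modf|Return the fractional and integral parts of the array as separate arrays|
|isnan|Return a boolean array indicating whether each value is NaN|
|isfinite, isinf|Return a boolean array indicating whether each element is finite or infinite, respectively|
|cos, cosh, sin, sinh, tan, tanh|Regular and hyperbolic trigonometric functions|
|arccos, arccosh, arcsin, arcsinh, arctan, arctanh|Inverse trigonometric functions|
|logical_not|Compute the truth value of not x element-wise; equivalent to ~arr for boolean arrays|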

*Binary universal functions*

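|Function|Description|
|--------|------------|
|add|Add corresponding elements in arrays|
|subtract|Subtract elements in the second array from the first array|
|multiply|Multiply array elements|
|divide, floor_divide|Divide or floor divide (discarding the remainder)|
|power|Raise elements in the first array to the powers given in the second array|
|maximum, fmax|Element-wise maximum; fmax ignores NaN|
|minimum, fmin|Element-wise minimum; fmin ignores NaN|
|mod|Element-wise modulus (remainder of division)|
|copysign|Copy the sign of values in the second argument to values in the first argument|
|greater, greater_equal, less, less_equal, equal, not_equal|Element-wise comparison, yielding a boolean array; equivalent to the operators >, >=, <, <=, ==, !=|
|logical_and, logical_or, logical_xor|Element-wise truth value of the logical operation|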
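
As a small illustration of the binary ufuncs, the sketch below compares `maximum` and `fmax` on inputs that contain NaN. The sample arrays are invented for this example.

```python
import numpy as np

x = np.array([1.0, np.nan, 3.0])
y = np.array([2.0, 5.0, np.nan])

propagated = np.maximum(x, y)   # NaN propagates: [ 2. nan nan]
ignored = np.fmax(x, y)         # NaN ignored where possible: [2. 5. 3.]
shifted = np.add(x, 1)          # same as x + 1

print(propagated)
print(ignored)
print(shifted)
```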

## Data Processing Using Arrays

In [None]:
points = np.arange(-5, 5, 0.01)

In [None]:
xs, ys = np.meshgrid(points, points)

In [None]:
ys

In [None]:
import matplotlib.pyplot as plt

In [None]:
z = np.sqrt(xs ** 2 + ys ** 2)

In [None]:
z

In [None]:
%matplotlib inline
plt.imshow(z, cmap=plt.cm.gray); plt.colorbar()
plt.title(r"Image plot of $\sqrt{x^2 + y^2}$ for a grid of values")   # raw string keeps the LaTeX backslash intact

### Expressing Conditional Logic as Array Operations

In [None]:
xarr = np.array([1.1, 1.2, 1.3, 1.4, 1.5])
yarr = np.array([2.1, 2.2, 2.3, 2.4, 2.5])
cond = np.array([True, False, True, True, False])

In [None]:
[
    (x if c else y)
    for x, y, c in zip(xarr, yarr, cond)
]

The list comprehension above has two problems:
- It will not be very fast for large arrays (because all the work is being done in pure Python)
- It will not work with multidimensional arrays

In [None]:
np.where(cond, xarr, yarr)

In [None]:
# another example
arr = np.random.randn(4, 4)
arr

In [None]:
np.where(arr > 0, 2, -2)

In [None]:
np.where(arr > 0, 2, arr)

In [None]:
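# complicated example: combine two boolean conditions
# 1. pure-Python loop (cond1 and cond2 are sample inputs added for illustration)
cond1 = np.array([True, True, False, False])
cond2 = np.array([True, False, True, False])
n = len(cond1)
result = []
for i in range(n):
    if cond1[i] and cond2[i]:
        result.append(0)
    elif cond1[i]:
        result.append(1)
    elif cond2[i]:
        result.append(2)
    else:
        result.append(3)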

In [None]:
# 2. nested np.where
np.where(cond1 & cond2, 0, np.where(cond1, 1, np.where(cond2, 2, 3)))

In [None]:
# 3. arithmetic on booleans; ~ negates a boolean array (unary minus is not supported on booleans)
result = 1 * (cond1 & ~cond2) + 2 * (cond2 & ~cond1) + 3 * ~(cond1 | cond2)
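
To check that the nested `np.where` and the boolean-arithmetic forms agree, here is a self-contained sketch. The arrays `cond1` and `cond2` are sample inputs assumed for the example, covering all four combinations.

```python
import numpy as np

cond1 = np.array([True, True, False, False])
cond2 = np.array([True, False, True, False])

nested = np.where(cond1 & cond2, 0,
                  np.where(cond1, 1, np.where(cond2, 2, 3)))
arith = 1 * (cond1 & ~cond2) + 2 * (cond2 & ~cond1) + 3 * ~(cond1 | cond2)

print(nested)   # [0 1 2 3]
print(arith)    # [0 1 2 3]
```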

### Mathematical and Statistical Methods

In [None]:
arr = np.random.randn(5, 4)  # normally-distributed data (rand would give uniform samples)

In [None]:
arr.mean()

In [None]:
np.mean(arr)

In [None]:
arr.sum()

In [None]:
arr.mean(axis=1)     # axis=1: mean across the columns, giving one value per row

In [None]:
arr.sum(0)

In [None]:
arr = np.array([
        [0, 1, 2], 
        [3, 4, 5,], 
        [6, 7, 8]])

In [None]:
arr.cumsum(0)

In [None]:
arr.cumprod(1)

*Basic array statistical methods*

|Method|Description|
|------|-----------|
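|sum  |Sum of all the elements in the array or along an axis|
|mean  |Arithmetic mean|
|std, var  |Standard deviation and variance, respectively|
|min, max  |Minimum and maximum|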
|argmin, argmax  |Indices of minimum and maximum elements|
|cumsum  |Cumulative sum of elements starting from 0|
|cumprod  |Cumulative product of elements starting from 1|
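
The `axis` argument is easy to mix up, so the following sketch applies a few of the methods above to a small 3 x 3 array where the results can be checked by hand.

```python
import numpy as np

arr = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])

col_sums = arr.sum(axis=0)   # collapse the rows: one sum per column
row_sums = arr.sum(axis=1)   # collapse the columns: one sum per row
flat_argmax = arr.argmax()   # index into the flattened array

print(col_sums)      # [ 9 12 15]
print(row_sums)      # [ 3 12 21]
print(flat_argmax)   # 8
```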

### Methods for Boolean Arrays

In [None]:
arr = np.random.randn(100)

In [None]:
(arr > 0).sum()

In [None]:
bools = np.array([False, False, True, False])

In [None]:
bools.any()

In [None]:
bools.all()

※ These methods also work with non-boolean arrays, where non-zero elements evaluate to **True**
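
A short sketch of `any` and `all` on a numeric array, with sample values chosen for illustration:

```python
import numpy as np

values = np.array([0, 0, 3, 0])

has_nonzero = values.any()               # True: at least one non-zero element
all_nonzero = values.all()               # False: some elements are zero
nonzero_count = np.count_nonzero(values)

print(has_nonzero, all_nonzero, nonzero_count)
```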

### Sorting

In [None]:
arr = np.random.randn(8)

In [None]:
arr

In [None]:
arr.sort()

In [None]:
arr

In [None]:
# sorting multidimensional arrays along an axis

In [None]:
arr = np.random.rand(5, 3)

In [None]:
arr

In [None]:
arr.sort(1)

In [None]:
arr
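
The `sort` method works in place, while the top-level `np.sort` returns a sorted copy. A minimal sketch of the difference, using small hand-written arrays:

```python
import numpy as np

arr = np.array([3.0, 1.0, 2.0])
sorted_copy = np.sort(arr)   # new sorted array; arr is unchanged
unchanged = arr.copy()

arr.sort()                   # sorts arr in place and returns None

table = np.array([[3, 1, 2], [9, 7, 8]])
table.sort(1)                # sort each row independently

print(sorted_copy)   # [1. 2. 3.]
print(table)
```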

### Unique and Other Set Logic

In [None]:
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])

In [None]:
np.unique(names)

In [None]:
# Pure python
sorted(set(names))

In [None]:
ints = np.array([3, 3, 3, 2, 2, 1, 1, 4, 4])

In [None]:
np.unique(ints)

In [None]:
# sample values for the membership test
values = np.array([6, 0, 0, 3, 2, 5, 6])

In [None]:
# test membership
np.in1d(values, [2, 3, 6])

*Array set operations*

|Method|Description|
|------|-----------|
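|unique(x)  |Compute the sorted, unique elements in x|
|intersect1d(x, y)  |Compute the sorted, common elements in x and y|
|union1d(x, y)  |Compute the sorted union of elements|
|in1d(x, y)  |Compute a boolean array indicating whether each element of x is contained in y|
|setdiff1d(x, y)  |Set difference: elements in x that are not in y|
|setxor1d(x, y)|Set symmetric difference: elements that are in either of the arrays, but not both|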
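
The set operations in the table can be tried on two small arrays. This sketch uses `np.isin` for the membership test, which performs the same check as `in1d` for 1-D input; `x` and `y` are sample arrays made up for the example.

```python
import numpy as np

x = np.array([1, 2, 2, 3, 4])
y = np.array([3, 4, 5])

print(np.unique(x))           # [1 2 3 4]
print(np.intersect1d(x, y))   # [3 4]
print(np.union1d(x, y))       # [1 2 3 4 5]
print(np.setdiff1d(x, y))     # [1 2]
print(np.setxor1d(x, y))      # [1 2 5]
print(np.isin(x, y))          # [False False False  True  True]
```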

## File Input and Output with Arrays
### Storing Arrays on Disk in Binary Format
- np.save
- np.load

In [None]:
arr = np.arange(10)

In [None]:
np.save('some_array', arr)

In [None]:
# np.save appends the .npy extension if the path does not already have it
np.load('some_array.npy')

In [None]:
arr
arr2

In [None]:
# save multiple arrays
np.savez('array_archive.npz', aa=arr, b=arr2)

In [None]:
arch = np.load('array_archive.npz')
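
Loading an `.npz` archive gives a dict-like object whose keys are the keyword names passed to `np.savez`. The sketch below is a self-contained round trip that writes to a temporary directory, an assumption made here so the example leaves no files behind.

```python
import os
import tempfile

import numpy as np

arr = np.arange(10)
arr2 = np.array([1, 2, 3], dtype=np.int32)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'array_archive.npz')
    np.savez(path, aa=arr, b=arr2)
    with np.load(path) as arch:
        names = list(arch.files)   # keys stored in the archive
        loaded_b = arch['b']       # each array is read when accessed

print(names)
print(loaded_b)
```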

### Saving and Loading Text Files
- np.loadtxt
- np.savetxt

In [None]:
np.loadtxt?
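
A minimal text round trip with `np.savetxt` and `np.loadtxt` might look like the following. The file name `array_ex.txt` and the comma delimiter are assumptions for this example, and the file is written to a temporary directory.

```python
import os
import tempfile

import numpy as np

table = np.array([[1.5, 2.0], [3.25, 4.0]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'array_ex.txt')      # hypothetical file name
    np.savetxt(path, table, delimiter=',')        # one row per line
    loaded = np.loadtxt(path, delimiter=',')      # parsed back as float64

print(loaded)
```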

## Linear Algebra

In [None]:
x = np.array([[1., 2., 3.], [4., 5., 6.]])
y = np.array([[6., 23.], [-1, 7], [8, 9]])

In [None]:
x

In [None]:
y

In [None]:
x.dot(y)

In [None]:
np.dot(x, np.ones(3))

**numpy.linalg** : standard set of matrix decompositions and things like inverse and determinant

In [None]:
from numpy.linalg import inv, qr

In [None]:
X = np.random.randn(5, 5)

In [None]:
mat = X.T.dot(X)

In [None]:
inv(mat)

In [None]:
mat.dot(inv(mat))    # approximately the identity matrix, up to floating-point error

In [None]:
q, r = qr(mat)

In [None]:
r

*Commonly-used numpy.linalg functions*

|Function|Description|
|--------|-----------|
|diag |Return the diagonal elements of a square matrix as a 1D array, or convert a 1D array into a square matrix with zeros on the off-diagonal|
|dot  |Matrix multiplication|
|trace|Compute the sum of the diagonal elements|  
|det  |Compute the matrix determinant|
|eig  |Compute the eigenvalues and eigenvectors of a square matrix|
|inv  |Compute the inverse of a square matrix|
|pinv |Compute the Moore-Penrose pseudo-inverse of a matrix|
|qr  |Compute the QR decomposition|
|svd  |Compute the singular value decomposition (SVD)|
|solve|Solve the linear system Ax=b for x, where A is a square matrix|  
|lstsq|Compute the least-squares solution to Ax=b|
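
As a quick illustration of a few functions from the table, the sketch below solves a small 2 x 2 system whose answer can be verified by hand. The matrix `A` and vector `b` are sample values chosen for the example.

```python
import numpy as np
from numpy.linalg import det, inv, solve

# 3x + y = 9 and x + 2y = 8 have the solution x = 2, y = 3.
A = np.array([[3., 1.], [1., 2.]])
b = np.array([9., 8.])

x = solve(A, b)
print(x)                                      # approximately [2. 3.]
print(np.allclose(A.dot(x), b))               # True
print(np.allclose(A.dot(inv(A)), np.eye(2)))  # True, up to rounding
print(det(A))                                 # approximately 5.0
```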

## Random Number Generation

In [None]:
samples = np.random.normal(size=(4,4))
samples

*Partial list of numpy.random functions*

|Function|Description|
|--------|------------|
|seed  |Seed the random number generator|
|permutation|Return a random permutation of a sequence, or return a permuted range|  
|shuffle  |Randomly permute a sequence in place|
|rand  |Draw samples from a uniform distribution|
|randint |Draw random integers from a given low-to-high range|
|randn  |Draw samples from a normal distribution with mean 0 and standard deviation 1|
|binomial  |Draw samples from a binomial distribution|
|normal  |Draw samples from a normal (Gaussian) distribution|
|beta  |Draw samples from a beta distribution|
|chisquare  |Draw samples from a chi-square distribution|
|gamma  |Draw samples from a gamma distribution|
|uniform |Draw samples from a uniform [0, 1) distribution|
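
Seeding makes the draws reproducible. In this sketch the seed value 12345 is arbitrary, and the variable names are illustrative.

```python
import numpy as np

np.random.seed(12345)
first = np.random.randn(3)
np.random.seed(12345)
second = np.random.randn(3)
print(np.array_equal(first, second))   # True: same seed, same draws

perm = np.random.permutation(5)        # shuffled copy of 0..4
print(sorted(perm.tolist()))           # [0, 1, 2, 3, 4]
```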

## Example: Random Walks

In [None]:
import random

position = 0
walk = [position]
steps = 1000
for i in range(steps):
    step = 1 if random.randint(0, 1) else -1
    position += step
    walk.append(position)

In [None]:
nsteps = 1000
draws = np.random.randint(0, 2, size=nsteps)
steps = np.where(draws > 0, 1, -1)   # map 1 to a step up, 0 to a step down
walk = steps.cumsum()

In [None]:
walk.min()

In [None]:
walk.max()

In [None]:
(np.abs(walk) >= 10).argmax()
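
The `argmax` call finds the first crossing because, on a boolean array, it returns the index of the first `True` (the maximum value). The sketch below shows this on a short hand-written walk with a threshold of 3, both assumed only for the example. Since `argmax` returns 0 when there is no `True` at all, checking `any()` first avoids a misleading answer.

```python
import numpy as np

walk = np.array([1, 2, 1, 2, 3, 4, 3, 4, 5])

crossed = np.abs(walk) >= 3
first_crossing = crossed.argmax() if crossed.any() else None
print(first_crossing)   # 4: the first index where |walk| >= 3

never = np.abs(walk) >= 100
print(never.any(), never.argmax())   # False 0: argmax alone would mislead
```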

### Simulating Many Random Walks at Once

In [None]:
nwalks = 500
nsteps = 1000
draws = np.random.randint(0, 2, size=(nwalks, nsteps))
steps = np.where(draws > 0, 1, -1)
walks = steps.cumsum(1)

In [None]:
walks

In [None]:
walks.max()

In [None]:
walks.min()

In [None]:
hits30 = (np.abs(walks) >= 30).any(1)

In [None]:
hits30

In [None]:
hits30.sum()

In [None]:
crossing_times = (np.abs(walks[hits30]) >= 30).argmax(1)

In [None]:
crossing_times.mean()

In [None]:
steps = np.random.normal(loc=0, scale=0.25, size=(nwalks, nsteps))