In [2]:
# libraries
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
import matplotlib.pyplot as plt

# options
pd.options.display.max_rows = 11

# Chap4. NumPy Basics: Arrays and Vectorized Computation

## ndarray: n-dimensional array
`ndarray` is a multidimensional array object.  
Every array has a `shape` attribute and a `dtype` attribute. Unless told otherwise, `np.array` tries to infer a suitable data type.

### Creating
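Function            | Description 
---                 | ---
array               | converts input data (list, tuple, array, or other sequence-like object) to an ndarray; copies the input by default
asarray             | converts input to an ndarray, but does not copy if the input is already an ndarray
arange              | like the built-in `range`, but returns an ndarray and accepts a float step, e.g. `arange(-5, 5, 0.1)`
ones, ones_like     | array of all 1s with the given shape; `ones_like` takes another array and matches its shape and dtype
zeros, zeros_like   | like `ones` and `ones_like`, but filled with 0s
eye, identity       | N\*N identity matrix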

***Notes***
- `np.array` tries to infer a good data type for the array that it creates.

In [3]:
# from a list
data1 = [6, 7.5, 8, 0, 1]
arr1 = np.array(data1)
# from nested sequence
data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]
arr2 = np.array(data2)
# type = np.ndarray
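
The other creation functions in the table can be sketched in the same way. This is a minimal illustration, not part of the original notes; the variable names are made up for the example.

```python
import numpy as np

zeros_2x3 = np.zeros((2, 3))            # 2x3 array filled with 0.0
ones_like_z = np.ones_like(zeros_2x3)   # same shape and dtype, filled with 1.0
identity_3 = np.eye(3)                  # 3x3 identity matrix
steps = np.arange(0, 1, 0.25)           # values 0.0, 0.25, 0.5, 0.75

# asarray hands back the input itself when it is already an ndarray,
# whereas array makes a copy by default
same = np.asarray(zeros_2x3)
copy = np.array(zeros_2x3)
print(same is zeros_2x3, copy is zeros_2x3)
```

The difference between `array` and `asarray` matters mostly for memory: `asarray` avoids a copy when none is needed.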

### Data Types for ndarrays
The numerical dtypes are named the same way as in C or Fortran: a type name, like `float` or `int`, followed by a number indicating the number of bits per element.

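Type             | Type code | Description
---              | ---     | ---
int8, uint8      | i1, u1  | Signed and unsigned 8-bit (1 byte) integer types
int16, uint16    | i2, u2  | Signed and unsigned 16-bit integer types
int32, uint32    | i4, u4  | Signed and unsigned 32-bit integer types
int64, uint64    | i8, u8  | Signed and unsigned 64-bit integer types
float16          | f2      | Half-precision floating point
float32          | f4 or f | Standard single-precision floating point. Compatible with C `float`
float64          | f8 or d | Standard double-precision floating point. Compatible with C `double` and the Python `float` object
float128         | f16 or g | Extended-precision floating point
complex64, complex128, complex256       | c8, c16, c32 | Complex numbers represented by two 32-, 64-, or 128-bit floats, respectively
bool             | ?       | Boolean type storing `True` and `False` values
object           | O       | Python object type
string\_         | S       | Fixed-length string type (1 byte per character). For example, to create a string dtype of length 10, use `'S10'`. NumPy 2.0 removed this alias; use `bytes_` there
unicode\_        | U       | Fixed-length unicode type (number of bytes is platform specific). Same specification semantics as `string_` (e.g. `U10`). NumPy 2.0 removed this alias; use `str_` there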

***Note***
- `astype` can convert an array of strings that represent numbers to numeric form

In [7]:
# np.bytes_ is the same type as np.string_, which was removed in NumPy 2.0
numeric_string = np.array(['1.25', '-9.6', '42'], dtype=np.bytes_)
numeric_string.dtype

dtype('S4')

In [8]:
np.array(['1.25'])

array(['1.25'],
      dtype='<U4')

In [12]:
# astype example
# use astype to convert (cast) an array from one dtype to another
arr = np.array([1, 2, 3])
print(arr.dtype)
print(arr.astype(np.float64).dtype)

int32
float64
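
The note above says `astype` can turn numeric strings into numbers, but the cells only inspect dtypes. A minimal sketch of the conversion itself follows; the variable names are illustrative.

```python
import numpy as np

numeric_strings = np.array(['1.25', '-9.6', '42'])   # unicode strings, dtype '<U4'
as_floats = numeric_strings.astype(np.float64)       # parses each string as a float
print(as_floats.dtype)

# astype always creates a new array; casting float -> int drops the fractional part
floats = np.array([3.7, -1.2, 0.5])
ints = floats.astype(np.int32)
print(ints.tolist())
```

If a string cannot be parsed as a number, `astype` raises a `ValueError` rather than producing `NaN`.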


### Operations between Arrays and Scalars
Arrays are important because they enable you to express **batch** operations on data without writing any `for` loops. This is usually called ***vectorization***.

Operations between differently sized arrays are called ***broadcasting*** and will be discussed in more detail in Chapter 12.
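
A small sketch of both ideas, using a made-up 2x3 array: arithmetic between equal-size arrays and with scalars is applied element by element, and a shorter array is stretched to fit where the shapes are compatible.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])

squared = arr * arr      # elementwise product of two equal-size arrays
halved = arr / 2         # the scalar is applied to every element

# broadcasting: the shape (3,) array is added to each row of the shape (2, 3) array
shifted = arr + np.array([10., 20., 30.])
print(shifted)
```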

### Basic Indexing and Slicing

One-dimensional arrays are simple; on the surface they act similarly to Python lists.

Array slices are *views* on the original array. This means that the data is not copied, and any modifications to the view will be reflected in the source array.

For higher dimensional arrays, there are two equivalent ways to access an individual element:
- `arr[0][2]`
- `arr[0, 2]`
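
The view behavior and the two element-access forms can be checked with a short sketch (array contents are arbitrary):

```python
import numpy as np

arr = np.arange(10)
view = arr[5:8]          # a view, no data copied
view[:] = 12             # writes through to arr
print(arr)               # positions 5, 6, 7 now hold 12

# ask for a copy explicitly when the original must stay untouched
independent = arr[5:8].copy()
independent[:] = 0       # arr is not affected

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2d[0][2], arr2d[0, 2])   # both select row 0, column 2
```

`arr2d[0][2]` indexes in two steps (first the row, then the element), while `arr2d[0, 2]` does it in one indexing operation; the result is the same.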

### Boolean Indexing

Selecting data from an array by boolean indexing always creates a copy of the data, even if the returned array is unchanged.
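
A minimal sketch of the copy semantics, using made-up `names` and `data` arrays: selecting with a boolean mask gives an independent array, while assigning through a mask modifies the original in place.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob'])   # illustrative labels, one per row
data = np.arange(12).reshape((4, 3))

mask = names == 'Bob'     # boolean array: True for rows 0 and 3
selected = data[mask]     # a copy of rows 0 and 3
selected[:] = -1          # changes only the copy
print(data[0])            # still the original first row

# assignment with a boolean mask on the left-hand side writes into data itself
data[data > 9] = 0
print(data[3])
```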

### Fancy Indexing

***Fancy indexing*** is a term adopted by NumPy to describe indexing using integer arrays.

To select out a subset of the rows in a particular order, you can simply pass a list or ndarray of integers specifying the desired order.

Fancy indexing, unlike slicing, always copies the data into a new array.
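
The copy-versus-view distinction can be verified with `np.shares_memory`; this is a small sketch with an arbitrary array, not an example from the original notes.

```python
import numpy as np

arr = np.arange(32).reshape((8, 4))

picked = arr[[1, 5, 7, 2]]    # fancy indexing: rows in the requested order
picked[0, 0] = -100           # modifies the copy only
print(arr[1, 0])              # the source row is unchanged

print(np.shares_memory(arr, picked))     # fancy-indexed result owns its data
print(np.shares_memory(arr, arr[1:3]))   # a slice shares data with arr
```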

In [4]:
arr = np.empty((8, 4))
for i in range(8):
    arr[i] = i
arr[[4, 3, 0, 6]]

array([[ 4.,  4.,  4.,  4.],
       [ 3.,  3.,  3.,  3.],
       [ 0.,  0.,  0.,  0.],
       [ 6.,  6.,  6.,  6.]])

In [5]:
arr = np.arange(32).reshape((8, 4))
arr[[1, 5, 7, 2], [0, 3, 1, 2]]

array([ 4, 23, 29, 10])

In [8]:
# to get the rectangular region formed by selecting a subset of the matrix's rows and columns
print(arr[[1, 5, 7, 2]][:, [0, 3, 1, 2]])
print(arr[np.ix_([1, 5, 7, 2], [0, 3, 1, 2])])

[[ 4  7  5  6]
 [20 23 21 22]
 [28 31 29 30]
 [ 8 11  9 10]]
[[ 4  7  5  6]
 [20 23 21 22]
 [28 31 29 30]
 [ 8 11  9 10]]


### Transposing Arrays and Swapping Axes

Transposing is a special form of reshaping which similarly returns a view on the underlying data without copying anything. Arrays have the `transpose` method and also the special `T` attribute.
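
A brief sketch of transposing, assuming an arbitrary 3x5 array. It also shows `swapaxes`, which the heading mentions, and a matrix product with `np.dot` as a typical use of `T`.

```python
import numpy as np

arr = np.arange(15).reshape((3, 5))

transposed = arr.T                  # shape (5, 3), a view on the same data
print(np.shares_memory(arr, transposed))

transposed[0, 1] = 99               # writes through: this is arr[1, 0]
print(arr[1, 0])

gram = np.dot(arr.T, arr)           # inner matrix product, shape (5, 5)
swapped = arr.swapaxes(0, 1)        # for a 2-D array this equals arr.T
```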

## Universal Functions: Fast Element-wise Array Functions

A universal function, or `ufunc`, is a function that performs elementwise operations on data in ndarrays.

Unary `ufuncs`

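Functions            | Description
---                  | ---
abs, fabs            | Compute the absolute value element-wise for integer, floating point, or complex values. `fabs` is a faster alternative for non-complex data
sqrt                 | Compute the square root of each element. Equivalent to `arr ** 0.5`
square               | Compute the square of each element. Equivalent to `arr ** 2`
exp                  | Compute the exponential e^x of each element
log, log10, log2, log1p | Natural logarithm (base e), log base 10, log base 2, and `log(1 + x)`, respectively
sign                 | Compute the sign of each element: 1 (positive), 0 (zero), or -1 (negative)
ceil                 | Compute the ceiling of each element, i.e. the smallest integer greater than or equal to each element
floor                | Compute the floor of each element, i.e. the largest integer less than or equal to each element
rint                 | Round each element to the nearest integer, preserving the dtype
modf                 | Return the fractional and integral parts of the array as separate arrays
isnan                | Return a boolean array indicating which elements are `NaN` (Not a Number)
isfinite, isinf      | Return boolean arrays indicating which elements are finite (non-inf, non-NaN) or infinite, respectively
cos, cosh, sin, sinh, tan, tanh  | Regular and hyperbolic trigonometric functions
arccos, arccosh, arcsin, arcsinh, arctan, arctanh | Inverse trigonometric functions
logical_not          | Compute the truth value of not x element-wise. Equivalent to `~arr` for boolean arrays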
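
A few of the unary ufuncs from the table applied to a small made-up array, as a quick sketch:

```python
import numpy as np

arr = np.array([-2.5, 0.0, 1.5, 4.0])

absolute = np.abs(arr)          # 2.5, 0.0, 1.5, 4.0
signs = np.sign(arr)            # -1.0, 0.0, 1.0, 1.0
ceilings = np.ceil(arr)         # -2.0, 0.0, 2.0, 4.0
floors = np.floor(arr)          # -3.0, 0.0, 1.0, 4.0

# modf returns two arrays: fractional parts and integral parts
frac, whole = np.modf(arr)

# on boolean arrays, logical_not matches the ~ operator
flags = np.array([True, False, True])
negated = np.logical_not(flags)
print(np.array_equal(negated, ~flags))
```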

Binary `ufuncs`

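Functions                     | Description
---                           | ---
add                           | Add corresponding elements in arrays
subtract                      | Subtract elements in the second array from the first array
multiply                      | Multiply array elements
divide, floor_divide          | Divide or floor divide (truncating the remainder)
power                         | Raise elements in the first array to the powers given in the second array
maximum, fmax                 | Element-wise maximum. `fmax` ignores `NaN`
minimum, fmin                 | Element-wise minimum. `fmin` ignores `NaN`
mod                           | Element-wise modulus (remainder of division)
copysign                      | Copy the sign of the values in the second argument to the values in the first argument
greater, greater_equal, less, less_equal, equal, not_equal | Element-wise comparison, yielding a boolean array. Equivalent to the infix operators `>, >=, <, <=, ==, !=`
logical_and, logical_or, logical_xor | Compute the element-wise truth value of the logical operation. Equivalent to the infix operators &, &#x7c;, ^ on boolean arrays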
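
The difference between `maximum` and `fmax`, and the operator equivalents of the logical ufuncs, can be seen in a short sketch with made-up inputs:

```python
import numpy as np

x = np.array([1.0, np.nan, 3.0])
y = np.array([2.0, 2.0, np.nan])

maxed = np.maximum(x, y)    # NaN propagates: 2.0, nan, nan
fmaxed = np.fmax(x, y)      # NaN is ignored where possible: 2.0, 2.0, 3.0

a = np.array([True, True, False])
b = np.array([True, False, False])
both = np.logical_and(a, b)          # same result as a & b for boolean arrays
print(np.array_equal(both, a & b))

quotient = np.floor_divide(7, 2)     # 3
remainder = np.mod(7, 2)             # 1
```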

## Data Processing Using Arrays
### Expressing Conditional Logic as Array Operations
### Mathematical and Statistical Methods
### Methods for Boolean Arrays
### Sorting
### Unique and Other Set Logic


## File Input and Output with Arrays
### Storing Arrays on Disk in Binary Format
### Saving and Loading Text Files


## Linear Algebra


## Random Number Generation


## Example: Random Walks