### NumPy
NumPy: a library built around a multidimensional array object.  
The rationale behind NumPy is the following: Python is a high-level dynamic language, which makes it easier to use but slower than a low-level language such as C. NumPy implements the multidimensional array structure in C and provides a convenient Python interface, thus combining high performance with ease of use. Many Python libraries depend on NumPy. For example, pandas is built on top of it.  

ndarray: n-dimensional arrays of homogeneous data types  
-- fixed size at creation  
-- same data type  
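
A minimal sketch of what these two properties mean in practice. The variable names (`mixed`, `ints`, `grown`) are only illustrative and do not come from the notes above.

```python
import numpy as np

# Mixing ints and a float: NumPy upcasts every element to one common dtype.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)   # float64

# Storing a float in an integer array truncates it to fit the array's dtype.
ints = np.array([1, 2, 3])
ints[0] = 9.7
print(ints)          # [9 2 3]

# "Growing" an array allocates a new one; the original keeps its size.
grown = np.append(ints, 4)
print(ints.shape, grown.shape)   # (3,) (4,)
```

Because resizing always copies, it is usually cheaper to allocate the final shape once than to append in a loop.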

The main attributes of an ndarray object are:  
-- ndarray.ndim: the number of axes (dimensions) of the array.  
-- ndarray.shape: a tuple of integers giving the size of the array along each axis.  
-- ndarray.size: the total number of elements of the array (the product of the entries of shape).  
-- ndarray.dtype: an object describing the type of the elements in the array.  
-- ndarray.itemsize: the size in bytes of each element of the array.  
-- ndarray.data: the buffer containing the actual elements of the array.  
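
The attributes above can be inspected on a small array. This is only an illustrative sketch: the array `a` is made up for the example, and `nbytes` is an extra attribute not listed above, shown because it equals `size * itemsize`.

```python
import numpy as np

a = np.zeros((2, 3), dtype=np.float64)

print(a.ndim)      # 2
print(a.shape)     # (2, 3)
print(a.size)      # 6
print(a.dtype)     # float64
print(a.itemsize)  # 8
print(a.nbytes)    # 48, i.e. size * itemsize
```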

Print arrays in full instead of abbreviating long ones with `...` (requires `import sys`):  
`np.set_printoptions(threshold=sys.maxsize)`

Show the documentation of a NumPy function from the shell:  
`python3 -c "import numpy; numpy.info(numpy.add)"`

In [2]:
import random
import numpy as np

In [14]:
# Python vs. NumPy
# 1. In NumPy, array operations are implemented internally with C loops rather than Python loops. 
#    Python is typically slower than C because of its interpreted and dynamically-typed nature.
# 2. The data in a NumPy array is stored in a contiguous block of memory in RAM. 
#    This property leads to more efficient use of CPU cycles and cache.

n = 1000000
x = [random.random() for _ in range(n)]
y = [random.random() for _ in range(n)]
print(x[:3])
print(y[:3])
z = [x[i] + y[i] for i in range(n)]
print(z[:3])
%timeit [x[i] + y[i] for i in range(n)]

xa = np.array(x)
ya = np.array(y)
za = xa + ya
print(za[:3])
%timeit xa + ya

%timeit sum(x)
%timeit np.sum(xa)

d = [abs(x[i] - y[j])
     for i in range(1000)
     for j in range(1000)]
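# np.newaxis inserts a new axis: xa[:1000, np.newaxis] has shape (1000, 1),
# so broadcasting against ya[:1000] (shape (1000,)) gives all 1000 x 1000 pairwise differences.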
da = np.abs(xa[:1000, np.newaxis] - ya[:1000])

[0.211522339052433, 0.6463004828834406, 0.33267705861101304]
[0.24731978603035942, 0.5552296500719227, 0.9454938018764355]
[0.45884212508279243, 1.2015301329553632, 1.2781708604874487]
254 ms ± 31.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
[0.45884213 1.20153013 1.27817086]
3.97 ms ± 454 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
7.13 ms ± 599 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
642 µs ± 10.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


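Element-wise: the operands have the same shape, and the operation is applied to each pair of corresponding elements.  
Broadcasting: the operands have different but compatible shapes; axes of size 1 (or missing leading axes) are stretched to match the other operand.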
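
A small sketch contrasting the two cases. The arrays `row` and `col` are hypothetical examples chosen so that one broadcasts along rows and the other along columns.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])           # shape (3,)
col = np.array([[100], [200]])         # shape (2, 1)

# Element-wise: both operands have shape (2, 3).
print(a + a)
# [[ 2  4  6]
#  [ 8 10 12]]

# Broadcasting: row is treated as shape (1, 3) and repeated for each row of a.
print(a + row)
# [[11 22 33]
#  [14 25 36]]

# Broadcasting: col's size-1 axis is stretched across the 3 columns of a.
print(a + col)
# [[101 102 103]
#  [204 205 206]]
```

Shapes are compared from the trailing axis backwards; two axes are compatible when they are equal or one of them is 1. Shapes such as (2, 3) and (2,) are not compatible and raise a `ValueError`.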

In [6]:
a = np.array([[1,2,3], [4,5,6]])
print(a)
print(a.ndim)
print(a.shape)
print(a.sum(axis=0), a.sum(axis=1))

[[1 2 3]
 [4 5 6]]
2
(2, 3)
[5 7 9] [ 6 15]


In [15]:
b = np.array([[[1,2,3], [4,5,6]],[[11,12,13],[14,15,16]]])
print(b)
print(b.ndim)
print(b.shape)
print(b.sum(axis=0))
print(b.sum(axis=1))
print(b.sum(axis=2))

A = np.array([1,2,3])
B = np.array([4,5,6])
C = np.array([11,12,13])
D = np.array([14,15,16])
new_b = np.array([[A, B], [C, D]])
print(new_b)

[[[ 1  2  3]
  [ 4  5  6]]

 [[11 12 13]
  [14 15 16]]]
3
(2, 2, 3)
[[12 14 16]
 [18 20 22]]
[[ 5  7  9]
 [25 27 29]]
[[ 6 15]
 [36 45]]
[[[ 1  2  3]
  [ 4  5  6]]

 [[11 12 13]
  [14 15 16]]]
