# Array Aggregation

### Arithmetic Funcs.
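NumPy provides a variety of aggregation operations, most of which compute mathematical statistics:  
> .mean() : arithmetic mean of the elements along the given axis  
> .std() : standard deviation along the given axis (population by default, ddof=0; pass ddof=1 for the sample standard deviation)  
> .var() : variance along the given axis (same ddof convention as .std())  
> .sum() : sum of the elements along the given axis  
> .prod() : product of the elements along the given axis  
> .cumsum() : cumulative sum along the given axis  
> .cumprod() : cumulative product along the given axis  
> .min(), .max() : minimum / maximum along the given axis  
> .argmin(), .argmax() : index of the minimum / maximum (an index into the flattened array when no axis is given)  
> .all() : returns a boolean telling whether every element is True; usually applied to a condition, e.g. (arr > 0).all()  
> .any() : returns a boolean telling whether at least one element is True, e.g. (arr > 0).any()  
> np.percentile() : q-th percentile of the elements (a module-level function, not an array method)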
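
The difference between the population and sample standard deviation is easy to miss, so here is a minimal sketch of it. The array `data` is made up for illustration; only the `ddof` parameter of `.std()` is being demonstrated.

```python
import numpy as np

# Illustrative data (not from the notebook): mean is 5, squared deviations sum to 32.
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Default ddof=0: divide by N (population standard deviation).
pop_std = data.std()

# ddof=1: divide by N - 1 (sample standard deviation).
sample_std = data.std(ddof=1)

print(pop_std)     # 2.0
print(sample_std)  # slightly larger, since the divisor is 7 instead of 8
```

`.var()` accepts the same `ddof` argument, so the same choice applies to the variance.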

In [2]:
import numpy as np

In [2]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 101, 102]).reshape(4,3)

print(f'arr\t=\n{arr}\n\n'
      f'arr.sum()\t= {arr.sum()}\n'
      f'np.sum(arr)\t= {np.sum(arr)}\n'
      f'arr.max()\t= {arr.max()}\n'
      f'arr.min()\t= {arr.min()}\n'
      f'arr.mean()\t= {arr.mean()}\n'
      f'arr.std()\t= {arr.std()}\n'
      f'np.median(arr)\t= {np.median(arr)}\n')

arr	=
[[  1   2   3]
 [  4   5   6]
 [  7   8   9]
 [ 10 101 102]]

arr.sum()	= 258
np.sum(arr)	= 258
arr.max()	= 102
arr.min()	= 1
arr.mean()	= 21.5
arr.std()	= 35.87362076326652
np.median(arr)	= 6.5



In [4]:
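# Note: the true product is 37383897600, which does not fit in a 32-bit integer.
# The negative value below is the result of integer overflow with a 32-bit default dtype.
arr.prod() , arr.cumprod()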

(-1270808064,
 array([          1,           2,           6,          24,         120,
                720,        5040,       40320,      362880,     3628800,
          366508800, -1270808064]))
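
The negative product in the output above comes from 32-bit integer wrap-around, not from the data. A possible way to avoid it, assuming a 64-bit accumulator is large enough for your values, is to pass `dtype` explicitly:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 101, 102]).reshape(4, 3)

# Request a 64-bit accumulator explicitly so the result does not depend
# on the platform's default integer width.
total_product = arr.prod(dtype=np.int64)
running_product = arr.cumprod(dtype=np.int64)

print(total_product)  # 37383897600
```

For products that exceed even 64 bits, converting to `float` or to Python integers (`dtype=object`) are the usual alternatives, each with its own trade-offs.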

In [6]:
# axis=1 is the second dimension, i.e. the inside of each [...], so this sums the elements within each row
arr.sum(axis = 1)

array([  6,  15,  24, 213])
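
To contrast the two axes, the sketch below reduces the same array along `axis=0` and `axis=1`, and also shows `keepdims=True`, which keeps the reduced axis with length 1 (handy for broadcasting the result back against the original array).

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 101, 102]).reshape(4, 3)

col_sums = arr.sum(axis=0)                     # collapse the rows: one sum per column
row_sums = arr.sum(axis=1)                     # collapse the columns: one sum per row
row_sums_2d = arr.sum(axis=1, keepdims=True)   # same sums, but shape (4, 1)

print(col_sums.tolist())   # [22, 116, 120]
print(row_sums.shape)      # (4,)
print(row_sums_2d.shape)   # (4, 1)
```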

In [8]:
arr.argmax() , arr.argmin()

(11, 0)
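
Without an `axis` argument, `argmax()` and `argmin()` return positions in the flattened array, which is why the result above is 11 rather than a (row, column) pair. One way to recover the 2-D position is `np.unravel_index`:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 101, 102]).reshape(4, 3)

flat_idx = arr.argmax()                            # 11, index into the flattened array
row, col = np.unravel_index(flat_idx, arr.shape)   # convert to (row, column)

print(row, col)        # 3 2
print(arr[row, col])   # 102

# With an axis, the result is one index per column instead.
per_column = arr.argmax(axis=0)
print(per_column.tolist())   # [3, 3, 3]
```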

In [30]:
np.percentile(arr , 25) , np.quantile(arr,0.25)

(3.75, 3.75)
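
`np.percentile` also accepts a sequence of percentiles, which gives the quartiles in one call. A short sketch using the same array; the 50th percentile matches `np.median`:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 101, 102]).reshape(4, 3)

# Percentiles are given on a 0-100 scale; np.quantile uses 0-1 instead.
quartiles = np.percentile(arr, [25, 50, 75])

print(quartiles.tolist())   # [3.75, 6.5, 9.25]
print(np.median(arr))       # 6.5
```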

In [17]:
x = np.array([False, False, True, False])
x.any(), x.all()

(True, False)
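
In practice `.any()` and `.all()` are usually applied to a boolean array produced by a comparison, rather than to hand-written booleans. A minimal sketch on the array used earlier:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 101, 102]).reshape(4, 3)

has_large = (arr > 100).any()            # is at least one element above 100?
all_positive = (arr > 0).all()           # is every element positive?
rows_all_small = (arr < 10).all(axis=1)  # per row: are all elements below 10?

print(has_large)                 # True
print(all_positive)              # True
print(rows_all_small.tolist())   # [True, True, True, False]
```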

In [11]:
o=np.array(False)
z=np.any([-1, 4, 5], out=o)
z, o

(array(True), array(True))

In [13]:
# Check now that z is a reference to o
z is o

True

In [15]:
# Both names refer to the same memory address
id(z), id(o)             

(2819467649328, 2819467649328)

### Logical Funcs.
To combine the True/False values of arrays element by element, use the logical_ functions:
> logical_and(arr1 , arr2)   
> logical_or(arr1 , arr2)   
> logical_xor(arr1 , arr2)  
> logical_not(arr1) : takes a single array   

In [19]:
l1 = np.array([True, False, True, False])
l2 = np.array([False, False, True, False])

In [20]:
np.logical_and(l1, l2), l1*l2 

(array([False, False,  True, False]), array([False, False,  True, False]))

In [21]:
np.logical_or(l1, l2), l1+l2

(array([ True, False,  True, False]), array([ True, False,  True, False]))

In [22]:
np.logical_xor(l1, l2)

array([ True, False, False, False])

In [23]:
np.logical_not(l2)

array([ True,  True, False,  True])
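
A common use of these functions is building a mask from several conditions. The sketch below uses a made-up array `values` and shows that `np.logical_and` and the `&` operator give the same mask for boolean arrays:

```python
import numpy as np

# Illustrative data (not from the notebook).
values = np.array([3, 8, 12, 17, 21])

# Keep the elements strictly between 5 and 20.
in_range = np.logical_and(values > 5, values < 20)
same_mask = (values > 5) & (values < 20)   # parentheses are required with &

print(values[in_range].tolist())            # [8, 12, 17]
print(np.array_equal(in_range, same_mask))  # True
```

The operators `&`, `|`, `^` and `~` tend to read more clearly than `*` and `+` when the intent is boolean logic.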

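### Broadcasting
When NumPy operates on arrays with different shapes, it applies the broadcasting mechanism, which follows these rules: 
1. The array with fewer dimensions is padded with new dimensions on the left, each of size 1  
2. Where the two arrays' sizes differ in a dimension, the array whose size is 1 in that dimension is stretched to match the other array's size, as if its elements were copied  
3. If the two shapes still differ in some dimension after stretching, an error is raised  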
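
The three rules can be sketched as a small function that works on shapes only. `broadcast_shape` is a hypothetical helper written for illustration, not part of NumPy; the last line cross-checks it against the shape NumPy itself reports via `np.broadcast`.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the three broadcasting rules to two shapes."""
    ndim = max(len(shape_a), len(shape_b))
    # Rule 1: pad the shorter shape with 1s on the left.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for size_a, size_b in zip(a, b):
        if size_a == size_b or size_b == 1:
            result.append(size_a)   # sizes agree, or b is stretched (rule 2)
        elif size_a == 1:
            result.append(size_b)   # a is stretched (rule 2)
        else:
            # Rule 3: sizes differ and neither is 1.
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(result)

print(broadcast_shape((2, 3), (3,)))   # (2, 3)
print(broadcast_shape((3, 1), (3,)))   # (3, 3)

# Cross-check against NumPy's own result.
print(np.broadcast(np.empty((3, 1)), np.empty((3,))).shape)   # (3, 3)
```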

In [33]:
# Ex1 - single broadcasting
M = np.ones((2,3))
a = np.arange(3)
M+a

# shape of a : (3,) => (1,3) => (2,3)
# so the addition that is finally performed is
# [[0,1,2],  plus  [[1,1,1],
# [0,1,2]]         [1,1,1]] 

array([[1., 2., 3.],
       [1., 2., 3.]])

In [35]:
# Ex2 - double broadcasting (both arrays are stretched)
M = np.arange(3).reshape(3,1)
a = np.arange(3)
M+a

# shape of a : (3,)  => (1,3) => (3,3)
# shape of M : (3,1) => -    => (3,3)

array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])

In [36]:
# Ex3 - incompatible shapes: (3,) is padded to (1,3), and 3 cannot match 2
M = np.ones((3,2))
a = np.arange(3)
M+a


ValueError: operands could not be broadcast together with shapes (3,2) (3,) 
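
If the intent in Ex3 was to add one value of `a` to each row of `M`, one possible fix is to give `a` an explicit trailing axis so its shape becomes (3,1), which is compatible with (3,2). This is a sketch of that assumption, not the only way to reconcile the shapes:

```python
import numpy as np

M = np.ones((3, 2))
a = np.arange(3)

# (3,) -> (3, 1); the size-1 axis is then stretched to 2 to match M.
result = M + a[:, np.newaxis]

print(result.shape)     # (3, 2)
print(result.tolist())  # [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
```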

In [8]:
# Classic application 1 - pairwise squared distances (see 02_ArrayManipulation, Sorting arrays)
X=np.random.randn(10,2)
X
dist_q = np.sum( (X[:,np.newaxis,:]-X[np.newaxis,:,:])**2 , axis=-1)
dist_q

array([[ 0.        ,  9.96775381,  1.55930007,  2.86005518,  7.18412371,
         1.17028387,  0.13250094,  9.85380194,  4.64774115,  3.47061987],
       [ 9.96775381,  0.        ,  4.38746059,  2.45341074,  2.6677942 ,
         8.91448262, 10.18061969,  7.37444553,  2.62080291,  4.70270749],
       [ 1.55930007,  4.38746059,  0.        ,  0.2791195 ,  2.08663767,
         0.84913154,  1.3350174 ,  4.36614484,  2.99182129,  0.57901696],
       [ 2.86005518,  2.45341074,  0.2791195 ,  0.        ,  1.39523333,
         2.05324998,  2.74276254,  4.278097  ,  2.0829946 ,  0.78450865],
       [ 7.18412371,  2.6677942 ,  2.08663767,  1.39523333,  0.        ,
         3.90240743,  6.36622476,  1.22011102,  5.94414964,  0.84806277],
       [ 1.17028387,  8.91448262,  0.84913154,  2.05324998,  3.90240743,
         0.        ,  0.56753488,  4.63808709,  6.56572472,  1.11414585],
       [ 0.13250094, 10.18061969,  1.3350174 ,  2.74276254,  6.36622476,
         0.56753488,  0.        ,  8.25906647
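
Because the random input makes the result hard to check by eye, here is the same broadcasting pattern on three hand-picked points (an illustrative assumption, not the notebook's data), where the squared distances are easy to verify:

```python
import numpy as np

# Three points on a line: consecutive points are 5 apart (3-4-5 triangle).
points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# (3, 1, 2) - (1, 3, 2) broadcasts to (3, 3, 2): every pair of points.
diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]

# Sum the squared coordinate differences over the last axis.
sq_dist = np.sum(diff ** 2, axis=-1)

print(diff.shape)        # (3, 3, 2)
print(sq_dist.tolist())  # [[0.0, 25.0, 100.0], [25.0, 0.0, 25.0], [100.0, 25.0, 0.0]]
```

The result is symmetric with zeros on the diagonal, as expected for a distance matrix.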

In [5]:
# Classic application 2 - (see 02_ArrayManipulation, Fancy indexing)
X=np.arange(12).reshape(3,4)
row = np.array([0,1,2]) 
col = np.array([2,1,3])
X[row[:,np.newaxis] , col]

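# row's shape changes from (3,) to (3,1) 
# By the broadcasting rules, a new dimension is padded on the left, so col's shape (3,) becomes (1,3)
# (3,1) and (1,3) are each stretched once => (3,3) and (3,3)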

array([[ 2,  1,  3],
       [ 6,  5,  7],
       [10,  9, 11]])
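
For comparison, the sketch below shows what happens without the extra axis (the index arrays are paired element by element) and an equivalent spelling of the broadcasted version using `np.ix_`, which builds the (3,1) and (1,3) index arrays for you.

```python
import numpy as np

X = np.arange(12).reshape(3, 4)
row = np.array([0, 1, 2])
col = np.array([2, 1, 3])

# Without newaxis: indices are paired, giving X[0,2], X[1,1], X[2,3].
paired = X[row, col]

# With broadcasting: every row index is combined with every column index.
manual = X[row[:, np.newaxis], col]
with_ix = X[np.ix_(row, col)]

print(paired.tolist())                  # [2, 5, 11]
print(np.array_equal(manual, with_ix))  # True
```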