# Aggregations: Min, Max, and Everything In Between

Computing summary statistics and representative values from data is a very common task:
 - mean, standard deviation
 - minimum, maximum, median, quantiles, and so on

NumPy provides built-in functions for these computations.

## Summing the Values in an Array


In [0]:
import numpy as np

In [3]:
L = np.random.random(100)
sum(L)

48.92258357388564

The ``sum`` function from the NumPy package can be used in the same way.

In [4]:
np.sum(L)

48.92258357388566

However, the NumPy version is much faster, because the operation runs in compiled code.

In [5]:
big_array = np.random.rand(10000000)
%timeit sum(big_array)
%timeit np.sum(big_array)

1 loop, best of 3: 924 ms per loop
100 loops, best of 3: 9.1 ms per loop
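Speed is not the only difference between the two functions. The sketch below, which uses a small hand-made array ``M`` (an illustrative assumption, not data from this notebook), suggests that the built-in ``sum`` and ``np.sum`` can return different results on multi-dimensional input, and that their second positional argument means different things.

```python
import numpy as np

# Small 2-D array so the results are easy to check by hand.
M = np.array([[1, 2, 3],
              [4, 5, 6]])

# Built-in sum iterates over the first axis and adds the rows together,
# so the result is an array of column sums.
builtin_total = sum(M)

# np.sum aggregates over every element by default and returns a scalar.
numpy_total = np.sum(M)

# The second positional argument differs too:
# for the built-in it is the start value, for np.sum it is the axis.
builtin_start = sum(M, 1)   # 1 + row 0 + row 1
numpy_axis = np.sum(M, 1)   # sum along axis 1, one value per row

print(builtin_total)  # [5 7 9]
print(numpy_total)    # 21
print(builtin_start)  # [ 6  8 10]
print(numpy_axis)     # [ 6 15]
```

For one-dimensional arrays such as ``L`` above the two agree up to floating-point rounding, which is why the outputs in the earlier cells differ only in the last digit.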


## Minimum and Maximum

The Python built-in functions ``min`` and ``max`` return the minimum and maximum values.

In [6]:
min(big_array), max(big_array)

(3.0741408330037245e-07, 0.9999999916707769)

As with ``sum``, NumPy versions exist, and they are faster.

In [7]:
np.min(big_array), np.max(big_array)

(3.0741408330037245e-07, 0.9999999916707769)

In [8]:
%timeit min(big_array)
%timeit np.min(big_array)

1 loop, best of 3: 616 ms per loop
100 loops, best of 3: 9.76 ms per loop
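``%timeit`` is an IPython magic and is not available in a plain Python script. As a rough sketch, the same comparison could be made with the standard-library ``timeit`` module. The array size, the seed, and the repeat count here are arbitrary choices, and the measured times will vary by machine.

```python
import timeit

import numpy as np

# Smaller array than in the notebook so the script finishes quickly.
rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
arr = rng.random(100_000)

# Time five calls of each version; timeit returns total seconds.
builtin_time = timeit.timeit(lambda: min(arr), number=5)
numpy_time = timeit.timeit(lambda: np.min(arr), number=5)

print(f"built-in min: {builtin_time:.4f} s, np.min: {numpy_time:.4f} s")
```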


The same aggregates can also be called as methods of the array object.

In [9]:
print(big_array.min(), big_array.max(), big_array.sum())
%timeit big_array.min()

3.0741408330037245e-07 0.9999999916707769 4999776.902031196
100 loops, best of 3: 9.78 ms per loop


### Multi-dimensional aggregates

Aggregates also work on multi-dimensional arrays, and they are useful for computing results per row or per column.

In [10]:
M = np.random.random((3, 4))
print(M)

[[0.62608199 0.54785511 0.06788672 0.78666662]
 [0.82705194 0.78730405 0.01606485 0.14580166]
 [0.81594492 0.66182143 0.98550994 0.07666495]]


In [11]:
M.sum()

6.34465418291964

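The *axis* argument specifies the axis along which the aggregate is computed. **The axis is the dimension that gets collapsed: ``axis=0`` aggregates down the rows and returns one value per column, while ``axis=1`` aggregates across the columns and returns one value per row.**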


In [12]:
M.min(axis=0)

array([0.62608199, 0.54785511, 0.01606485, 0.07666495])

In [13]:
M.max(axis=1)

array([0.78666662, 0.82705194, 0.98550994])
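Because the random values above are hard to verify by eye, here is a minimal sketch with a small integer array (an illustrative assumption) that shows which dimension each ``axis`` value collapses.

```python
import numpy as np

# 3 rows, 4 columns, with values chosen so the answers are easy to check.
M = np.array([[1, 8, 3, 4],
              [5, 2, 7, 0],
              [9, 6, 1, 2]])

# axis=0 collapses the rows: one result per column (4 values).
col_min = M.min(axis=0)

# axis=1 collapses the columns: one result per row (3 values).
row_max = M.max(axis=1)

print(col_min, col_min.shape)  # [1 2 1 0] (4,)
print(row_max, row_max.shape)  # [8 7 9] (3,)
```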

In [14]:
X=np.random.random((3,4,5))
print(X)
print(X.min(axis=0))
print(X.min(axis=2))

[[[0.94824711 0.37162791 0.00527233 0.89928834 0.46046801]
  [0.47842962 0.13026198 0.65985931 0.50585459 0.74120714]
  [0.3873338  0.40612624 0.64353675 0.78596622 0.9796971 ]
  [0.14882668 0.40035696 0.68811616 0.08646612 0.45010584]]

 [[0.46382713 0.99341781 0.84097154 0.98563442 0.70844273]
  [0.87902012 0.42811665 0.57059865 0.62716617 0.08264858]
  [0.14056881 0.99729929 0.34928111 0.81716249 0.19790847]
  [0.42809678 0.94815324 0.82686706 0.6339326  0.27985192]]

 [[0.96068862 0.34789951 0.05266733 0.84212431 0.62538431]
  [0.07313173 0.76679394 0.5152389  0.63708015 0.39850965]
  [0.29345605 0.5064507  0.32384479 0.2968774  0.77505964]
  [0.0517643  0.12976861 0.77870676 0.87867527 0.49127667]]]
[[0.46382713 0.34789951 0.00527233 0.84212431 0.46046801]
 [0.07313173 0.13026198 0.5152389  0.50585459 0.08264858]
 [0.14056881 0.40612624 0.32384479 0.2968774  0.19790847]
 [0.0517643  0.12976861 0.68811616 0.08646612 0.27985192]]
[[0.00527233 0.13026198 0.3873338  0.08646612]
 [0.46382713 0.08264858 0.14056881 0.27985192]
 [0.05266733 0.07313173 0.29345605 0.0517643 ]]
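For arrays with three or more dimensions, it can help to look at the shape of the result rather than its values. The following sketch uses an ``np.arange`` array of the same shape ``(3, 4, 5)`` (the contents are an assumption made for readability) and also shows the ``keepdims`` option and a tuple of axes, both of which are standard parameters of NumPy reductions.

```python
import numpy as np

# Same shape as X above, but with predictable values 0..59.
X = np.arange(60).reshape(3, 4, 5)

# The aggregated axis disappears from the shape.
min_axis0 = X.min(axis=0)       # shape (4, 5)
min_axis2 = X.min(axis=2)       # shape (3, 4)

# A tuple of axes collapses several dimensions at once.
min_axes02 = X.min(axis=(0, 2))  # shape (4,)

# keepdims=True keeps the collapsed axis with length 1,
# which is convenient for broadcasting against X.
min_keep = X.min(axis=2, keepdims=True)  # shape (3, 4, 1)

print(min_axis0.shape, min_axis2.shape, min_axes02.shape, min_keep.shape)
```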

### Other aggregation functions


|Function Name      |   NaN-safe Version  | Description                                   |
|-------------------|---------------------|-----------------------------------------------|
| ``np.sum``        | ``np.nansum``       | Compute sum of elements                       |
| ``np.prod``       | ``np.nanprod``      | Compute product of elements                   |
| ``np.mean``       | ``np.nanmean``      | Compute mean of elements                      |
| ``np.std``        | ``np.nanstd``       | Compute standard deviation                    |
| ``np.var``        | ``np.nanvar``       | Compute variance                              |
| ``np.min``        | ``np.nanmin``       | Find minimum value                            |
| ``np.max``        | ``np.nanmax``       | Find maximum value                            |
| ``np.argmin``     | ``np.nanargmin``    | Find index of minimum value                   |
| ``np.argmax``     | ``np.nanargmax``    | Find index of maximum value                   |
| ``np.median``     | ``np.nanmedian``    | Compute median of elements                    |
| ``np.percentile`` | ``np.nanpercentile``| Compute rank-based statistics of elements     |
| ``np.any``        | N/A                 | Evaluate whether any elements are true        |
| ``np.all``        | N/A                 | Evaluate whether all elements are true        |
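To illustrate the NaN-safe column of the table, here is a small sketch with one missing value. The array ``data`` is made up for the example.

```python
import numpy as np

# One missing value, represented as NaN.
data = np.array([1.0, 2.0, np.nan, 4.0])

# The regular aggregate propagates NaN into the result.
plain_mean = np.mean(data)

# The NaN-safe versions ignore the missing value.
safe_mean = np.nanmean(data)     # mean of 1, 2, and 4
safe_sum = np.nansum(data)       # 1 + 2 + 4
safe_argmax = np.nanargmax(data)  # index of the largest non-NaN value

print(plain_mean)   # nan
print(safe_mean)
print(safe_sum)     # 7.0
print(safe_argmax)  # 3
```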



In [21]:
sample = [[1,2,3,4,5],[6,7,8,9,10]]

np.percentile(sample, 100)

10.0
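The 100th percentile is simply the maximum. As a sketch of a few other cases on the same ``sample`` data: the 0th percentile is the minimum, the 50th matches ``np.median``, and ``axis`` works as it does for the other aggregates. The quartile values assume NumPy's default linear interpolation between data points.

```python
import numpy as np

sample = np.array([[1, 2, 3, 4, 5],
                   [6, 7, 8, 9, 10]])

p0 = np.percentile(sample, 0)      # minimum of all elements
p50 = np.percentile(sample, 50)    # same as the median
p100 = np.percentile(sample, 100)  # maximum of all elements
median = np.median(sample)

# Several percentiles at once; default method interpolates linearly.
quartiles = np.percentile(sample, [25, 75])

# Per-row median via the axis argument.
row_p50 = np.percentile(sample, 50, axis=1)

print(p0, p50, p100, median)  # 1.0 5.5 10.0 5.5
print(quartiles)              # [3.25 7.75]
print(row_p50)                # [3. 8.]
```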