# Aggregations: Min, Max, and Everything In Between

## Summing the Values in an Array

As a quick example, consider computing the sum of all values in an array.
Python itself can do this using the built-in ``sum`` function:

In [6]:
import numpy as np

In [7]:
# Ten random numbers drawn uniformly from the half-open interval [0, 1)

L = np.random.random(10)
sum(L)


5.369664854089518

In [15]:
L

[0.6032385296899062,
 0.23308537147246888,
 0.41401048934283813,
 0.4415433651011791,
 0.7211875111231658,
 0.958759644266595,
 0.4627474410087715,
 0.8637258661476861,
 0.35846963731884485,
 0.3128969986180635]

In [16]:
# Compute the same sum with NumPy's sum function

np.sum(L)

5.369664854089519

However, because it executes the operation in compiled code, NumPy's version of the operation is computed much more quickly:

In [17]:
big_array = np.random.rand(1000000)
%timeit sum(big_array)
%timeit np.sum(big_array)

78.2 ms ± 257 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
188 µs ± 335 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
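Outside IPython, where the `%timeit` magic is unavailable, a similar comparison can be sketched with the standard-library `timeit` module. The seed, array size, and repeat count below are illustrative assumptions, and absolute timings will differ from machine to machine.

```python
import timeit

import numpy as np

# Illustrative assumptions: a fixed seed and a smaller array than the one above.
rng = np.random.default_rng(0)
sample_array = rng.random(100_000)

# Time each approach over a handful of calls.
runs = 5
builtin_seconds = timeit.timeit(lambda: sum(sample_array), number=runs)
numpy_seconds = timeit.timeit(lambda: np.sum(sample_array), number=runs)

print(f"built-in sum: {builtin_seconds / runs:.6f} s per call")
print(f"np.sum:       {numpy_seconds / runs:.6f} s per call")

# Both approaches compute the same total, up to floating-point rounding.
builtin_total = sum(sample_array)
numpy_total = np.sum(sample_array)
```

The two totals can differ in the last digit or so because the additions are not performed in the same order.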


## Minimum and Maximum

Similarly, Python has built-in ``min`` and ``max`` functions, used to find the minimum value and maximum value of any given array:

In [37]:
min(big_array), max(big_array)

(1.4059931222609734e-06, 0.9999991826833814)

NumPy's corresponding functions have similar syntax, and again operate much more quickly:

In [38]:
np.min(big_array), np.max(big_array)

(1.4059931222609734e-06, 0.9999991826833814)

In [39]:
%timeit min(big_array)
%timeit np.min(big_array)

54.7 ms ± 416 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
288 µs ± 2.54 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


For ``min``, ``max``, ``sum``, and several other NumPy aggregates, a shorter syntax is to use methods of the array object itself:

In [18]:
print(big_array.min(), big_array.max(), big_array.sum())


# Most of these NumPy utilities are exposed both as functions and as methods.

# np.min(array) and array.min() return the same result.
# The former is a call to the min function in the numpy module.
# The latter is a method exposed on the NumPy array object.
# Which one to use is a question of readability and preference. I prefer the latter.


# A function is a standalone block of code that is called by name.
# A method, on the other hand, is a function that is associated with an object.
# It is called on an object using dot notation, and it can access and modify the object's properties.
# Methods are defined within a class and are called on instances of that class.

1.4059931222609734e-06 0.9999991826833814 499594.0056677854
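As a small check of the function and method forms, the following sketch compares them on a hand-written array. The name `values` and its contents are hypothetical sample data, not taken from the cells above.

```python
import numpy as np

values = np.array([3.0, 1.5, 4.0, 1.0, 5.5])  # hypothetical sample data

# Function form: call the aggregate from the numpy module.
function_result = (np.min(values), np.max(values), np.sum(values))

# Method form: call the aggregate on the array object itself.
method_result = (values.min(), values.max(), values.sum())

print(function_result == method_result)  # True
```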


### Multidimensional aggregates

One common type of aggregation operation is an aggregate along a row or column.
Say you have some data stored in a two-dimensional array:

In [40]:
M = np.random.random((3, 4))
print(M)

[[0.86801634 0.69651846 0.05937256 0.6996731 ]
 [0.84244288 0.28072694 0.26898981 0.02445196]
 [0.14280812 0.29095161 0.24783402 0.09488798]]


By default, each NumPy aggregation function will return the aggregate over the entire array:

In [41]:
M.sum()

4.516673784668968

In [42]:
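# Python's built-in sum: do not use it on multidimensional arrays.
# It adds the rows together and returns one sum per column, not the overall total.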
sum(M)

array([1.85326734, 1.26819702, 0.57619639, 0.81901303])
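To make the difference concrete, here is a minimal sketch on a small hand-written array (the name `table` and its values are hypothetical). The built-in `sum` iterates over the rows and adds them elementwise, whereas the array's own `sum` method aggregates every element.

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])  # hypothetical 2x3 array

builtin_result = sum(table)  # adds the rows elementwise: row 0 + row 1
full_total = table.sum()     # aggregates over every element

print(builtin_result)  # [5 7 9]
print(full_total)      # 21
```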

Aggregation functions take an additional argument specifying the *axis* along which the aggregate is computed. For example, we can find the sum of the values within each column by specifying ``axis=0``, or the maximum value within each row by specifying ``axis=1``:

In [43]:
M.sum(axis=0)

array([1.85326734, 1.26819702, 0.57619639, 0.81901303])

In [44]:
M.max(axis=1)

# The axis named in the call is the one that is collapsed.

array([0.86801634, 0.84244288, 0.29095161])


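The ``axis`` keyword specifies the dimension of the array that will be collapsed, rather than the dimension that will be returned. So ``axis=0`` means that the first axis will be collapsed: for two-dimensional arrays, this means that values within each column will be aggregated.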

In [45]:
M.sum(axis=0)

array([1.85326734, 1.26819702, 0.57619639, 0.81901303])
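The effect of the axis argument is easier to verify on an array with known values. This is a minimal sketch; the name `grid` and its contents are assumptions chosen so the sums can be checked by hand.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)  # hypothetical 3x4 array holding 0..11

column_sums = grid.sum(axis=0)  # collapses the 3 rows, leaving shape (4,)
row_sums = grid.sum(axis=1)     # collapses the 4 columns, leaving shape (3,)

print(column_sums)  # [12 15 18 21]
print(row_sums)     # [ 6 22 38]
```

In both cases the length of the result matches the axis that was not collapsed.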

In [46]:
L = np.random.randn(10)
L[1]=np.nan
print(L)
print( " Normal function call - " , np.sum(L) )
print( " Nan safe function call - " , np.nansum(L) )


# nansum and the other NaN-safe aggregates are module-level functions, not array methods.
# Calling L.nansum() raises an AttributeError.

[ 1.02132766         nan -1.01572761 -1.26578379 -2.36861326 -1.88737034
  0.92802256 -0.46733915 -0.05887439  0.50546987]
 Normal function call -  nan
 Nan safe function call -  -4.608888466716256
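The same pattern holds for the other NaN-safe aggregates. The sketch below uses a short hand-written array (`readings` is a hypothetical name) so the results can be checked by eye, and shows one way to get the same sum by masking out the missing values first.

```python
import numpy as np

readings = np.array([1.0, np.nan, 3.0, 5.0])  # hypothetical data with one missing value

print(np.mean(readings))       # nan
print(np.nanmean(readings))    # 3.0
print(np.nanmax(readings))     # 5.0
print(np.nanargmin(readings))  # 0

# Equivalent by hand: drop the NaN entries, then use the ordinary method.
masked_sum = readings[~np.isnan(readings)].sum()
print(masked_sum == np.nansum(readings))  # True
```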


### Other aggregation functions


The following table provides a list of useful aggregation functions available in NumPy:

|Function Name      |   NaN-safe Version  | Description                                   |
|-------------------|---------------------|-----------------------------------------------|
| ``np.sum``        | ``np.nansum``       | Compute sum of elements                       |
| ``np.prod``       | ``np.nanprod``      | Compute product of elements                   |
| ``np.mean``       | ``np.nanmean``      | Compute mean of elements                      |
| ``np.std``        | ``np.nanstd``       | Compute standard deviation                    |
| ``np.var``        | ``np.nanvar``       | Compute variance                              |
| ``np.min``        | ``np.nanmin``       | Find minimum value                            |
| ``np.max``        | ``np.nanmax``       | Find maximum value                            |
| ``np.argmin``     | ``np.nanargmin``    | Find index of minimum value                   |
| ``np.argmax``     | ``np.nanargmax``    | Find index of maximum value                   |
| ``np.median``     | ``np.nanmedian``    | Compute median of elements                    |
| ``np.percentile`` | ``np.nanpercentile``| Compute rank-based statistics of elements     |
| ``np.any``        | N/A                 | Evaluate whether any elements are true        |
| ``np.all``        | N/A                 | Evaluate whether all elements are true        |
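A few of the functions in the table can be tried on a small sample. This is a sketch only; the array `scores` is hypothetical data chosen so that the results come out to round numbers.

```python
import numpy as np

scores = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # hypothetical sample

print(np.mean(scores))            # 5.0
print(np.std(scores))             # 2.0
print(np.var(scores))             # 4.0
print(np.median(scores))          # 4.5
print(np.argmax(scores))          # 7
print(np.percentile(scores, 50))  # 4.5
print(np.any(scores > 8))         # True
print(np.all(scores > 8))         # False
```

Note that ``np.std`` and ``np.var`` compute the population statistics by default (``ddof=0``).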

