<h1>Statistics with NumPy Arrays</h1><p><img src="images/1line.png" width="100%" /></p>
<ul>
<li>NumPy provides many useful statistical functions for finding the minimum, maximum, percentile, standard deviation, variance, and so on of the elements in an array.</li>

</ul>
<h3>numpy.amin() and numpy.amax()</h3>
<ul>
<li>These functions return the minimum and the maximum from the elements in the given array (along the specified axis for multi-dimensional arrays).</li>
</ul>

In [2]:
import numpy as np

# Array of 20 random numbers between 0 and 10.
x = np.random.random(20)*10
print(x)
print('Minimum: {0:8.6f}, Maximum: {1:8.6f}'.format(np.amin(x), np.amax(x)))

[8.60247691 4.46315047 7.20059397 4.28241302 4.04127719 5.15461061
 3.98114808 5.97519773 4.34306247 9.49033591 9.1202005  5.25182255
 2.76623943 6.15041097 7.56462365 4.58024319 7.1062114  2.5829754
 9.54145709 9.333849  ]
Minimum: 2.582975, Maximum: 9.541457
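
<p>As a minimal sketch of the <code>axis</code> argument mentioned above, the following uses a small 2x3 array (the values are made up for illustration) to take the minimum of each column, the maximum of each row, and the maximum over all elements.</p>

```python
import numpy as np

# Hypothetical 2x3 array chosen only for illustration.
a = np.array([[3, 7, 5],
              [8, 4, 6]])

col_min = np.amin(a, axis=0)  # minimum down each column
row_max = np.amax(a, axis=1)  # maximum across each row
overall_max = np.amax(a)      # no axis: maximum over all elements

print(col_min)      # [3 4 5]
print(row_max)      # [7 8]
print(overall_max)  # 8
```

<p>With <code>axis=0</code> the result has one entry per column, and with <code>axis=1</code> one entry per row.</p>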


<h3 class="dd-content">numpy.argmax(),&nbsp;numpy.argmin() &amp;&nbsp;numpy.argsort()</h3>
<ul>
<li class="dd-content"><span>argmax: Returns the indices of the maximum values along an axis.<br /><code>numpy.argmax(a, axis=None, out=None)</code><br /></span></li>
<li class="dd-content"><span>argmin: Returns the indices of the minimum values along an axis.<br /></span><span><code>numpy.argmin(a, axis=None, out=None)</code></span></li>
<li class="dd-content"><span>argsort: Returns the indices that would sort an array.<br /><code>numpy.argsort(a, axis=-1, kind='quicksort', order=None)</code></span></li>
</ul>

In [3]:
print('x[{}]={}'.format(x.argmax(), x[x.argmax()]))


x[18]=9.541457088376937


In [4]:
# Sorting: argsort returns the indices that would sort x,
# so indexing x with them yields the sorted array.
y = np.argsort(x)
z = x[y]
z

array([2.5829754 , 2.76623943, 3.98114808, 4.04127719, 4.28241302,
       4.34306247, 4.46315047, 4.58024319, 5.15461061, 5.25182255,
       5.97519773, 6.15041097, 7.1062114 , 7.20059397, 7.56462365,
       8.60247691, 9.1202005 , 9.333849  , 9.49033591, 9.54145709])
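
<p>Because the array above is random, the indices differ from run to run. The sketch below repeats the same three calls on a small fixed array (values chosen only for illustration) so that the results can be checked by hand.</p>

```python
import numpy as np

# Hypothetical fixed array so the indices are easy to verify.
b = np.array([40, 10, 30, 20])

i_min = np.argmin(b)   # index of the smallest value (10)
i_max = np.argmax(b)   # index of the largest value (40)
order = np.argsort(b)  # indices that put b in ascending order

print(i_min, i_max)    # 1 0
print(order)           # [1 3 2 0]
print(b[order])        # [10 20 30 40]
print(b[order[::-1]])  # [40 30 20 10]
```

<p>Reversing the index array, as in the last line, is one simple way to obtain a descending sort.</p>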

<h3>numpy.median()</h3>
<ul>
<li><strong>Median</strong><span>&nbsp;</span>is defined as the value separating the higher half of a data sample from the lower half.</li>
<li>The<span>&nbsp;</span><strong>numpy.median()</strong><span>&nbsp;</span>function is used below.</li>
</ul>

In [5]:
print(np.median(x))

5.613510138743397
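
<p>To make the definition concrete, here is a small sketch with made-up values. For an odd number of elements the median is the middle value of the sorted data. For an even number it is the average of the two middle values. The <code>axis</code> argument works as it does for the other functions.</p>

```python
import numpy as np

# Hypothetical values for illustration.
odd = np.array([7, 1, 5])      # sorted: 1, 5, 7
even = np.array([7, 1, 5, 3])  # sorted: 1, 3, 5, 7

med_odd = np.median(odd)    # middle value
med_even = np.median(even)  # average of the two middle values, (3 + 5) / 2

m = np.array([[1, 9, 2],
              [4, 6, 8]])
med_rows = np.median(m, axis=1)  # median of each row

print(med_odd)   # 5.0
print(med_even)  # 4.0
print(med_rows)  # [2. 6.]
```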


<h3>numpy.mean()</h3>
<ul>
<li>Arithmetic mean is the sum of elements along an axis divided by the number of elements.</li>
<li>The<span>&nbsp;</span><strong>numpy.mean()</strong><span>&nbsp;</span>function returns the arithmetic mean of elements in the array.</li>
<li>If an axis is specified, the mean is calculated along that axis.</li>
</ul>

In [6]:
print(np.mean(x))

6.076614976484812
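
<p>The effect of the <code>axis</code> argument can be sketched with a small 2x3 array (the values are illustrative only).</p>

```python
import numpy as np

# Hypothetical 2x3 array for illustration.
m = np.array([[1, 2, 3],
              [4, 5, 6]])

mean_all = np.mean(m)           # mean of all six elements
mean_cols = np.mean(m, axis=0)  # mean of each column
mean_rows = np.mean(m, axis=1)  # mean of each row

print(mean_all)   # 3.5
print(mean_cols)  # [2.5 3.5 4.5]
print(mean_rows)  # [2. 5.]
```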


<h3>Standard Deviation</h3>
<ul>
<li>Standard deviation is the square root of the average of the squared deviations from the mean.</li>
<li>The formula for standard deviation is:</li>
</ul>
<pre class="result notranslate">std = sqrt(mean(abs(x - x.mean())**2))
</pre>
<ul>
<li>If the array is [1, 2, 3, 4], then its mean is 2.5.</li>
<li>Hence the squared deviations are [2.25, 0.25, 0.25, 2.25]. Their sum is 5, so their mean is 5/4 = 1.25, and the standard deviation is sqrt(5/4) = 1.1180339887498949.</li>
</ul>

In [7]:
print(np.std(x))

2.2221578012031307
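
<p>The worked example for [1, 2, 3, 4] can be reproduced step by step and compared with <code>np.std()</code>. The sketch below also shows the <code>ddof</code> parameter of <code>np.std()</code>, which is not covered above: with <code>ddof=1</code> the sum of squared deviations is divided by N - 1 instead of N, giving the sample standard deviation.</p>

```python
import numpy as np

v = np.array([1, 2, 3, 4])

deviations = v - v.mean()    # [-1.5 -0.5  0.5  1.5]
squared = deviations ** 2    # [2.25 0.25 0.25 2.25]
manual_std = np.sqrt(squared.mean())  # sqrt(5/4), about 1.118

# np.std divides by N by default (population standard deviation).
population_std = np.std(v)

# ddof=1 divides by N - 1 instead: sqrt(5/3), about 1.291.
sample_std = np.std(v, ddof=1)

print(squared)
print(manual_std, population_std)
print(sample_std)
```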


<h3>Variance</h3>
<ul>
<li>Variance is the average of squared deviations, i.e.,<span>&nbsp;</span><strong>mean(abs(x - x.mean())**2)</strong>. In other words, the standard deviation is the square root of variance.</li>
<li>The<span>&nbsp;</span><strong>numpy.var()</strong><span>&nbsp;</span>function is used below.</li>
</ul>

In [8]:
print(np.var(x))

4.937985293447932
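
<p>The relationship between variance and standard deviation can be checked on the same small array [1, 2, 3, 4] used in the standard deviation example. This is a minimal sketch rather than a full treatment.</p>

```python
import numpy as np

v = np.array([1, 2, 3, 4])

# Variance computed directly from the definition.
manual_var = np.mean(np.abs(v - v.mean()) ** 2)
variance = np.var(v)

print(manual_var)  # 1.25
print(variance)    # 1.25

# The standard deviation is the square root of the variance.
print(np.isclose(np.sqrt(variance), np.std(v)))  # True
```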


<hr><h3>Reference</h3>
<p>NumPy Statistical Functions, TutorialsPoint, <a href="https://www.tutorialspoint.com/numpy/numpy_statistical_functions.htm" target="_blank" rel="noopener">https://www.tutorialspoint.com/numpy/numpy_statistical_functions.htm</a></p>