NumPy is a Python package for numerical analysis. Its central object is the NumPy array. Like a Python list, a NumPy array stores many values without giving each one its own name, but it is more powerful and comes with more convenient functionality. For example, adding two Python lists concatenates them, whereas adding two NumPy arrays of the same length adds the elements at matching positions:

In [1]:
import numpy as np
import scipy

a = np.array([1,3,3,5,6])
b = np.array([2,8,4,6,9])
a+b

array([ 3, 11,  7, 11, 15])
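To make the contrast with plain lists concrete, here is a small sketch that applies `+` to the same numbers twice, once as lists and once as arrays. The names `list_a`, `list_b`, `joined` and `summed` are only illustrative.

```python
import numpy as np

list_a = [1, 3, 3, 5, 6]
list_b = [2, 8, 4, 6, 9]

# "+" on lists concatenates: the result has 10 items
joined = list_a + list_b

# "+" on arrays adds position by position: the result has 5 items
summed = np.array(list_a) + np.array(list_b)

print(joined)
print(summed)
```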

NumPy has an arange function that is similar to Python's range, except that the step size can be a non-integer value:

In [2]:
x = np.arange(0,10,0.1)
x

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. , 1.1, 1.2,
       1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2. , 2.1, 2.2, 2.3, 2.4, 2.5,
       2.6, 2.7, 2.8, 2.9, 3. , 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8,
       3.9, 4. , 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5. , 5.1,
       5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.8, 5.9, 6. , 6.1, 6.2, 6.3, 6.4,
       6.5, 6.6, 6.7, 6.8, 6.9, 7. , 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7,
       7.8, 7.9, 8. , 8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7, 8.8, 8.9, 9. ,
       9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9])
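As a quick check of the difference, the sketch below tries the same float step with both `range` and `np.arange`. The step of 0.25 and the variable names are chosen only for illustration.

```python
import numpy as np

# The built-in range() only accepts integers, so a float step raises TypeError
try:
    range(0, 1, 0.25)
    range_accepts_float = True
except TypeError:
    range_accepts_float = False

# np.arange accepts the float step; like range, it leaves out the stop value
quarters = np.arange(0, 1, 0.25)

print(range_accepts_float)
print(quarters)
```

With a step that cannot be represented exactly in binary, such as 0.1, the number of elements can occasionally differ by one from what you expect, so it is worth checking the length of the result.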

We can then apply a function to this whole series of numbers at once to produce an array of results. Nearly every common mathematical function has a NumPy equivalent, so you rarely need to import Python's built-in math module.

In [4]:
y = np.sin(x)
y

array([ 0.        ,  0.09983342,  0.19866933,  0.29552021,  0.38941834,
        0.47942554,  0.56464247,  0.64421769,  0.71735609,  0.78332691,
        0.84147098,  0.89120736,  0.93203909,  0.96355819,  0.98544973,
        0.99749499,  0.9995736 ,  0.99166481,  0.97384763,  0.94630009,
        0.90929743,  0.86320937,  0.8084964 ,  0.74570521,  0.67546318,
        0.59847214,  0.51550137,  0.42737988,  0.33498815,  0.23924933,
        0.14112001,  0.04158066, -0.05837414, -0.15774569, -0.2555411 ,
       -0.35078323, -0.44252044, -0.52983614, -0.61185789, -0.68776616,
       -0.7568025 , -0.81827711, -0.87157577, -0.91616594, -0.95160207,
       -0.97753012, -0.993691  , -0.99992326, -0.99616461, -0.98245261,
       -0.95892427, -0.92581468, -0.88345466, -0.83226744, -0.77276449,
       -0.70554033, -0.63126664, -0.55068554, -0.46460218, -0.37387666,
       -0.2794155 , -0.1821625 , -0.0830894 ,  0.0168139 ,  0.1165492 ,
        0.21511999,  0.31154136,  0.40484992,  0.49411335,  0.57
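For comparison, the following sketch computes the same sines twice: once with `math.sin` in a loop and once with a single `np.sin` call. The short array `angles` is an illustrative stand-in for `x`.

```python
import math
import numpy as np

angles = np.arange(0, 1, 0.25)

# math.sin handles one number at a time, so a loop is needed
looped = [math.sin(t) for t in angles]

# np.sin takes the whole array and returns an array of the same shape
vectorized = np.sin(angles)

# Both approaches give the same values
print(np.allclose(looped, vectorized))
```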

The random submodule in NumPy can likewise stand in for Python's random package. It can also draw random samples from other distributions, such as a normal distribution with any mean and standard deviation:

In [6]:
z = np.random.normal(10,5,100)
z

array([ 6.32661061,  9.22603113, 10.20753968, 15.2987159 , 11.71995738,
       10.08237665,  1.2730159 ,  8.29468201, 11.3348266 , 13.57250763,
       18.12756801,  8.8344966 ,  9.65816501, 11.62704738, 10.7873862 ,
        6.40608349, 12.92403389, 12.8169504 ,  5.73724094, 10.63830116,
       10.06882997,  5.28577132,  8.20066165,  7.30547912,  9.52376403,
        5.25350411, 11.72785505, 11.56155754,  3.71727915,  5.32415184,
        9.58428154, 15.59374174,  7.67125332,  6.77004703, 12.82008549,
       19.20255035, 11.20658359,  4.30911651, 11.59940019,  8.38598113,
       15.62626207, 16.21742913, 14.6361134 ,  3.56297859, 10.30487784,
       10.01509941, 11.32627059,  3.93477757,  5.51552932,  8.54985805,
       14.33844832, -1.3161907 ,  9.42233148,  5.56752074, 15.05572506,
       12.75809277, 12.30482692, 19.49862948,  7.1261972 , 13.46588559,
        7.5569572 , 12.42580985,  4.88674583,  5.51461038, 11.48708075,
        4.75068464,  9.34265276,  7.40918616, 15.29790835, 11.00
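Random draws change on every run, which can make results hard to reproduce. One possible approach, sketched here, is to use a seeded generator from `np.random.default_rng` instead of calling `np.random.normal` directly as the cell above does. The seed value 42 and the variable names are arbitrary choices for this example.

```python
import numpy as np

# A seeded generator produces the same draws on every run
rng = np.random.default_rng(seed=42)

# 100 draws from a normal distribution with mean 10 and standard deviation 5
normal_draws = rng.normal(loc=10, scale=5, size=100)

# 100 draws from a uniform distribution on [0, 1)
uniform_draws = rng.uniform(low=0, high=1, size=100)

print(normal_draws.shape, uniform_draws.shape)
```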

NumPy also has many built-in statistical functions widely used in data analysis, such as the mean and median:

In [8]:
print(np.mean(z))
print(np.median(z))

10.126156010407772
10.124034592723532


Standard deviation:

In [11]:
np.std(a)

5.280775517006143

Percentile:

In [12]:
np.percentile(a,90)

9.457512704066193
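To see these statistics on numbers small enough to check by hand, here is a sketch on a made-up array called `scores` (not part of the notebook). Note that `np.std` computes the population standard deviation by default, dividing by n rather than n - 1.

```python
import numpy as np

# A small illustrative data set whose statistics are easy to verify by hand
scores = np.array([2, 4, 4, 4, 5, 5, 7, 9])

mean = np.mean(scores)             # 40 / 8 = 5
median = np.median(scores)         # average of the two middle values, 4 and 5
std = np.std(scores)               # sqrt(32 / 8) = 2, population formula
p90 = np.percentile(scores, 90)    # interpolated between 7 and 9

print(mean, median, std, p90)
```

Passing `ddof=1` to `np.std` gives the sample standard deviation instead.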

SciPy, a sister package to NumPy, provides further functionality, notably Pearson's correlation coefficient between two NumPy arrays:

In [8]:
a = np.array([1,3,3,5,6])
b = np.array([2,8,4,6,9])

from scipy import stats
stats.pearsonr(a,b)
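The result of `stats.pearsonr` carries two numbers: the correlation coefficient and a p-value. The sketch below unpacks them and cross-checks the coefficient against NumPy's own `np.corrcoef`. The names `r`, `p_value` and `r_numpy` are only illustrative.

```python
import numpy as np
from scipy import stats

a = np.array([1, 3, 3, 5, 6])
b = np.array([2, 8, 4, 6, 9])

# pearsonr returns the correlation coefficient and a p-value
r, p_value = stats.pearsonr(a, b)

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal entry is r
r_numpy = np.corrcoef(a, b)[0, 1]

print(r, p_value)
print(np.isclose(r, r_numpy))
```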

NumPy regularly uses arrays within arrays, that is, 2-dimensional arrays. A 2-dimensional array must be rectangular and is used to represent a matrix in mathematics. There are two ways of indexing an entry in a 2-dimensional NumPy array: A[i,j] and A[i][j]. In both, the first number is the row index and the second is the column index:

In [6]:
A = np.array([[1,2],
              [3,4]])
print(A[0,1])
print(A[0][1])

2
2
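Beyond single entries, the same row-then-column convention lets you pull out a whole row or column. This is a minimal sketch using slicing; the names `first_row`, `first_col` and `same` are only illustrative.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])

first_row = A[0]       # row 0 as a 1-dimensional array
first_col = A[:, 0]    # ":" selects every row, 0 selects column 0

# Both indexing styles reach the same entry (row 1, column 0)
same = A[1, 0] == A[1][0]

print(first_row, first_col, same)
```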


Because matrices are represented by NumPy arrays, many NumPy functions relate to matrices. For example, the matrix inverse:

In [7]:
np.linalg.inv(A)

array([[-2. ,  1. ],
       [ 1.5, -0.5]])
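One way to convince yourself that the result really is the inverse is to multiply it by the original matrix and compare with the identity matrix. The sketch below does this with the `@` matrix-multiplication operator; `np.allclose` is used because floating-point results may differ from the exact identity by tiny rounding errors.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])

A_inv = np.linalg.inv(A)

# "@" is matrix multiplication; "*" would multiply element by element
product = A @ A_inv

# The product should be the 2x2 identity matrix, up to rounding
print(np.allclose(product, np.eye(2)))
```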

And matrix transpose:

In [None]:
A.T
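Since the cell above shows no output, here is a short sketch of what the transpose does: rows become columns, so the entry at [i, j] moves to [j, i]. The non-square matrix `B` is an extra example added here to make the change of shape visible.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])

# Rows become columns: the first row [1, 2] becomes the first column
A_transposed = A.T
print(A_transposed)

# For a non-square matrix the shape flips as well
B = np.array([[1, 2, 3],
              [4, 5, 6]])
print(B.shape, B.T.shape)
```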

Read the documentation for more NumPy and SciPy functionality, especially if you are a math and science enthusiast. Chances are that a NumPy or SciPy function already exists for your intended use.

Exercise: <br>
Read up on the median function in NumPy. Find the median of the list <code>a</code> above with NumPy. <br>
Either write a program that counts the number of positive elements in a NumPy array, or search for an efficient way to do that in NumPy. <br>
Search the NumPy documentation for the function that computes the eigenvalues of the matrix A above. <br>
Search the NumPy documentation for the function that solves a system of linear equations.