# Introduction to NumPy


NumPy is a Python library that you will likely use a lot for data science.

It provides a wide range of numerical tools through its functions.

We start by installing NumPy (if it is not already available) and then importing it:

In [5]:
!pip install numpy

In [1]:
import numpy as np

**Arrays**

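A NumPy array is a mutable collection of elements, similar to a list in Python. Unlike a list, all elements of an array share a single data type, and arithmetic is applied element by element.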
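
To make the comparison with lists concrete, here is a minimal sketch. The names `py_list` and `np_arr` are illustrative only and are not used elsewhere in this notebook.

```python
import numpy as np

py_list = [1, 2, 3]
np_arr = np.array([1, 2, 3])

# Multiplying a list repeats it; multiplying an array scales each element.
doubled_list = py_list * 2
doubled_arr = np_arr * 2

# Like a list, an array can be modified in place.
np_arr[0] = 10

print(doubled_list)  # [1, 2, 3, 1, 2, 3]
print(doubled_arr)   # [2 4 6]
print(np_arr)        # [10  2  3]
```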

In [2]:
x = np.zeros(8)
y = np.ones(5)

We have just created an array of zeros by using the zeros function. The array has a length of 8, so essentially we have a list of 8 zeros.

Likewise, we have created an array of 5 ones.

When we print these, we can see that the elements are floating-point numbers, which is the default for zeros and ones.

In [3]:
print(x)
print(y)

print("Numbers in this array are: ", type(x[1]))

[0. 0. 0. 0. 0. 0. 0. 0.]
[1. 1. 1. 1. 1.]
Numbers in this array are:  <class 'numpy.float64'>
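
If a different element type is needed, zeros and ones accept a dtype argument. The sketch below uses illustrative names (`float_zeros`, `int_zeros`). The exact integer width printed depends on the platform, so no output is shown for it.

```python
import numpy as np

float_zeros = np.zeros(3)           # default element type is float64
int_zeros = np.zeros(3, dtype=int)  # request integer elements instead

print(float_zeros, float_zeros.dtype)
print(int_zeros, int_zeros.dtype)
```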


Arrays have what is called a shape. We can view the shape of the array like so:

In [5]:
print("shape of x:", x.shape)

shape of x: (8,)


This shows that x has 8 elements along a single dimension. However, arrays in NumPy can be multi-dimensional.

Changing the shape cannot change the number of elements: x holds 8 items, so any new shape must also hold exactly 8.

We can, for example, change the shape to 4 x 2 to get a two-dimensional array.

In [9]:
x.shape = (4,2)
print("shape of x:", x.shape)

shape of x: (4, 2)
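
As an alternative to assigning to the shape attribute, the reshape method returns a reshaped array and leaves the original untouched. The sketch below is illustrative (the names `flat`, `grid` and `auto` are not part of this notebook) and also shows what happens when the requested shape does not match the number of elements.

```python
import numpy as np

flat = np.arange(8)          # [0 1 2 3 4 5 6 7]
grid = flat.reshape(4, 2)    # 4 rows, 2 columns; flat itself keeps shape (8,)
auto = flat.reshape(2, -1)   # -1 asks NumPy to infer the missing dimension

print(grid.shape)  # (4, 2)
print(auto.shape)  # (2, 4)

# 3 x 3 would need 9 elements, but flat only has 8, so this fails.
reshape_failed = False
try:
    flat.reshape(3, 3)
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```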


Using lists of lists, we can also create multi-dimensional arrays.

In [10]:
alistoflists = [[1,2,3,4,5],[1,2,3,4,5]]
z = np.array(alistoflists)

print("shape of z:", z.shape)

shape of z: (2, 5)
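
Once an array has more than one dimension, elements are addressed with one index per dimension. A small sketch, rebuilding the same z as above so that it runs on its own:

```python
import numpy as np

z = np.array([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]])

print(z.ndim)    # 2  (number of dimensions)
print(z.size)    # 10 (total number of elements)
print(z[1, 2])   # 3  (row 1, column 2, counting from zero)
print(z[:, 0])   # [1 1]  (first column of every row)
```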


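We can also create random arrays in NumPy. Seeding the random number generator first is optional, but it makes the results reproducible: the same seed always produces the same numbers. Below, randint(10, size=35) draws 35 integers from 0 to 9.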


In [13]:
np.random.seed(0)
randArray = np.random.randint(10, size=35)
print("Array randArray: ", randArray)

Array randArray:  [5 0 3 3 7 9 3 5 2 4 7 6 8 8 1 6 7 7 8 1 5 9 8 9 4 3 0 3 5 0 2 3 8 1 3]
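
The effect of seeding can be checked directly: reseeding with the same value replays the same numbers. The sketch below also shows NumPy's newer Generator interface (np.random.default_rng) as an alternative; this notebook itself uses np.random.seed, and the names `first`, `second`, `rng` and `sample` are illustrative.

```python
import numpy as np

# Same seed, same draws.
np.random.seed(0)
first = np.random.randint(10, size=5)
np.random.seed(0)
second = np.random.randint(10, size=5)
print(np.array_equal(first, second))  # True

# Alternative: a Generator object that carries its own seed.
rng = np.random.default_rng(0)
sample = rng.integers(10, size=5)  # 5 integers from 0 to 9
print(sample)
```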


Now that we have covered the basics of arrays, let's go through how NumPy can be used to perform important calculations on arrays.

NumPy has many built-in functions for mathematical calculations on arrays and individual values. Below are a few examples:


In [15]:
print("sin:" , np.sin(randArray))
print("sum:" , np.sum(randArray))
print("prod:" , np.prod(randArray))
print("mean:" , np.mean(randArray))
print("std:" , np.std(randArray))
print("min:" , np.min(randArray))
print("Index of min value:" , np.argmin(randArray))
print("Index of max value:" , np.argmax(randArray))

sin: [-0.95892427  0.          0.14112001  0.14112001  0.6569866   0.41211849
  0.14112001 -0.95892427  0.90929743 -0.7568025   0.6569866  -0.2794155
  0.98935825  0.98935825  0.84147098 -0.2794155   0.6569866   0.6569866
  0.98935825  0.84147098 -0.95892427  0.41211849  0.98935825  0.41211849
 -0.7568025   0.14112001  0.          0.14112001 -0.95892427  0.
  0.90929743  0.14112001  0.98935825  0.84147098  0.14112001]
sum: 163
prod: 0
mean: 4.6571428571428575
std: 2.8177281339289446
min: 0
Index of min value: 1
Index of max value: 5
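
On multi-dimensional arrays, most of these functions accept an axis argument that controls the direction of the calculation. A minimal sketch on a small 2 x 3 array (the name `grid` is illustrative):

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

total = np.sum(grid)                # all elements together
col_sums = np.sum(grid, axis=0)     # collapse the rows: one sum per column
row_means = np.mean(grid, axis=1)   # collapse the columns: one mean per row

print(total)      # 21
print(col_sums)   # [5 7 9]
print(row_means)  # [2. 5.]
```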


We can also use NumPy to perform masking tasks, which would otherwise require loops and several if statements.

Two examples are as follows:
*   If we want to test whether numbers are less than 20, returning True/False
*   If we want to mask for only numbers that are greater than 35.





In [17]:
numbers = np.array([10,20,30,40,50])
print("Tested for <20:", numbers < 20)
print("Masked for >35:", numbers[numbers > 35])

Tested for <20: [ True False False False False]
Masked for >35: [40 50]
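
Conditions can also be combined. The sketch below is one possible extension of the example, using & for an element-wise "and" (the parentheses are required) and np.where to replace values instead of filtering them. The names `in_range`, `selected` and `capped` are illustrative.

```python
import numpy as np

numbers = np.array([10, 20, 30, 40, 50])

# Element-wise "and" of two conditions.
in_range = (numbers > 15) & (numbers < 45)
selected = numbers[in_range]

# Replace values above 35 with 35 and keep the rest unchanged.
capped = np.where(numbers > 35, 35, numbers)

print(in_range)  # [False  True  True  True False]
print(selected)  # [20 30 40]
print(capped)    # [10 20 30 35 35]
```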


Lastly, we will touch briefly on vectors. Vectors can be represented as arrays in NumPy.

We can perform vector operations in a similar way to normal Python arithmetic, as demonstrated below. Operators such as + act element by element.

In [None]:
a = z+10
print("Array z:", z)
print("Array a:", a)

az = a+z
print("Array az:", az)

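# Dot product: a and z both have shape (2, 5), so a @ z would raise a
# ValueError. Transposing z gives a (2, 5) @ (5, 2) product whose entries
# are the dot products between the rows of a and the rows of z.
a_dot_z = a @ z.T
print("a z dot product:", a_dot_z)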
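
For plain one-dimensional vectors, the difference between element-wise multiplication and the dot product is easier to see. A minimal sketch with illustrative vectors `u` and `v`, followed by the row-by-row dot products of the two-dimensional arrays from this section:

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

elementwise = u * v   # multiplies matching elements
dot = u @ v           # 1*4 + 2*5 + 3*6

print(elementwise)  # [ 4 10 18]
print(dot)          # 32

# For 2-D arrays, @ is a matrix product, so the inner dimensions must match.
z = np.array([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]])
a = z + 10
row_dots = a @ z.T    # (2, 5) @ (5, 2) gives a (2, 2) result
print(row_dots.shape)  # (2, 2)
```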