NumPy - Numerical Python

NumPy is an open source Python library that’s used in almost every field of science and engineering. It’s the universal standard for working with numerical data in Python, and it’s at the core of the scientific Python and PyData ecosystems. NumPy users include everyone from beginning coders to experienced researchers doing state-of-the-art scientific and industrial research and development. The NumPy API is used extensively in Pandas, SciPy, Matplotlib, scikit-learn, scikit-image and most other data science and scientific Python packages.

The NumPy library contains multidimensional array and matrix data structures (you’ll find more information about this in later sections). It provides ndarray, a homogeneous n-dimensional array object, with methods to efficiently operate on it. NumPy can be used to perform a wide variety of mathematical operations on arrays. It adds powerful data structures to Python that guarantee efficient calculations with arrays and matrices and it supplies an enormous library of high-level mathematical functions that operate on these arrays and matrices.

Advantages of NumPy arrays:

1. Support for a wide range of mathematical operations
2. Faster operations than Python lists

Use "pip install numpy" to install NumPy.

To access NumPy and its functions, import it in your Python code like this:

In [2]:
import numpy as np

List vs NumPy - Time Taken

What’s the difference between a Python list and a NumPy array?

NumPy gives you an enormous range of fast and efficient ways of creating arrays and manipulating numerical data inside them. While a Python list can contain different data types within a single list, all of the elements in a NumPy array should be homogeneous. The mathematical operations that are meant to be performed on arrays would be extremely inefficient if the arrays weren’t homogeneous.
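
As a small illustration of the homogeneity point above, the sketch below builds a list and an array from the same mixed values. The variable names are made up for this example. The list keeps each element's own type, while NumPy converts everything to one common dtype.

```python
import numpy as np

mixed_list = [1, 2.5, 3]             # a list keeps each element's own type
mixed_array = np.array(mixed_list)   # NumPy upcasts all values to one dtype

list_types = [type(v).__name__ for v in mixed_list]
print(list_types)          # ['int', 'float', 'int']
print(mixed_array.dtype)   # float64
print(mixed_array)         # [1.  2.5 3. ]
```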

Why use NumPy?

NumPy arrays are faster and more compact than Python lists. An array consumes less memory and is convenient to use. NumPy uses much less memory to store data and it provides a mechanism of specifying the data types. This allows the code to be optimized even further.
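
One rough way to see the memory difference is to compare the bytes used by a list and by an array holding the same integers. This is only a sketch: it assumes CPython, where sys.getsizeof reports the size of each object, and the exact list figure varies between Python versions.

```python
import sys
import numpy as np

n = 10000
python_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

# A list stores references to separate int objects, so count both parts.
list_bytes = sys.getsizeof(python_list) + sum(sys.getsizeof(v) for v in python_list)

# nbytes is the size of the array's data buffer: n elements * 8 bytes each.
array_bytes = np_array.nbytes

print("list :", list_bytes, "bytes")
print("array:", array_bytes, "bytes")
```

Choosing a smaller dtype (for example np.int32) would shrink the array's buffer further, which is the "specifying the data types" mechanism mentioned above.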

In [None]:
from time import process_time

Time taken by a list

In [None]:
python_list = [i for i in range(10000)]

start_time = process_time()  # CPU time in seconds

python_list = [i+5 for i in python_list]  #add 5 to each of the elements in the list

end_time = process_time()

print(end_time - start_time)

0.0006923420000002345


In [None]:
np_array = np.array([i for i in range(10000)])

start_time = process_time()

np_array += 5  # in-place shorthand for np_array = np_array + 5

end_time = process_time()

print(end_time - start_time)

0.00011753800000002812
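
A single process_time() measurement of such a short operation can be noisy. As an alternative sketch (not the method used in the cells above), the standard-library timeit module repeats each statement many times and reports the total. The repeat count of 200 is an arbitrary choice for this example, and the timings will differ from machine to machine.

```python
import timeit
import numpy as np

python_list = list(range(10000))
np_array = np.arange(10000)

# Run each operation 200 times and report the total elapsed time.
list_time = timeit.timeit(lambda: [i + 5 for i in python_list], number=200)
array_time = timeit.timeit(lambda: np_array + 5, number=200)

print(f"list : {list_time:.5f} s for 200 runs")
print(f"array: {array_time:.5f} s for 200 runs")
```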


NumPy Arrays

What is an array?

An array is a central data structure of the NumPy library. An array is a grid of values and it contains information about the raw data, how to locate an element, and how to interpret an element. It has a grid of elements that can be indexed in various ways. The elements are all of the same type, referred to as the array dtype.

An array can be indexed by a tuple of nonnegative integers, by booleans, by another array, or by integers. The rank of the array is the number of dimensions. The shape of the array is a tuple of integers giving the size of the array along each dimension.
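
The indexing styles and the rank/shape terms described above can be sketched on a small 2-D array. The array name and its values are made up for illustration.

```python
import numpy as np

grid = np.array([[10, 20, 30],
                 [40, 50, 60]])

print(grid.ndim)    # 2      -> rank: number of dimensions
print(grid.shape)   # (2, 3) -> size along each dimension

by_tuple = grid[1, 2]                # tuple of integers -> 60
by_mask = grid[grid > 25]            # boolean mask      -> [30 40 50 60]
by_array = grid[np.array([1, 0])]    # integer array     -> rows 1 and 0

print(by_tuple)
print(by_mask)
print(by_array)
```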

One way we can initialize NumPy arrays is from Python lists, using nested lists for two- or higher-dimensional data.

In [None]:
# list
list1 = [1,2,3,4,5]
print(list1)
type(list1)

[1, 2, 3, 4, 5]


list

A printed list separates its values with commas.

In [None]:
np_array = np.array([1,2,3,4,5])  #declared np_array variable to store the array
print(np_array)
type(np_array)

[1 2 3 4 5]


numpy.ndarray

In [None]:
# creating a 1 dim array
a = np.array([1,2,3,4])
print(a)

[1 2 3 4]


A printed NumPy array separates its values with spaces rather than commas, similar to matrix notation.

In [None]:
a.shape

(4,)

2D array or matrix with two rows

In [None]:
b = np.array([(1,2,3,4),(5,6,7,8)])
print(b)

[[1 2 3 4]
 [5 6 7 8]]


In [None]:
b.shape

(2, 4)

In [None]:
c = np.array([(1,2,3,4),(5,6,7,8)],dtype=float)
print(c)

[[1. 2. 3. 4.]
 [5. 6. 7. 8.]]


The NumPy ndarray class is used to represent both matrices and vectors. A vector is an array with a single dimension (there’s no difference between row and column vectors), while a matrix refers to an array with two dimensions. For 3-D or higher dimensional arrays, the term tensor is also commonly used.
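
To make the vector / matrix / tensor terminology concrete, the following sketch creates one array of each kind and prints its number of dimensions and shape. The names are illustrative only.

```python
import numpy as np

vector = np.array([1, 2, 3])             # 1 dimension
matrix = np.array([[1, 2], [3, 4]])      # 2 dimensions
tensor = np.zeros((2, 3, 4))             # 3 dimensions

for name, arr in [("vector", vector), ("matrix", matrix), ("tensor", tensor)]:
    print(name, arr.ndim, arr.shape)
# vector 1 (3,)
# matrix 2 (2, 2)
# tensor 3 (2, 3, 4)
```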

Initial placeholders in NumPy arrays

In [None]:
a = np.zeros(2)
print(a)

[0. 0.]


In [None]:
# create a NumPy array of zeros
x = np.zeros((4,5))
print(x)

[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]


In [None]:
b=np.ones(2)
print(b)

[1. 1.]


In [None]:
# create a numpy array of ones
y = np.ones((3,3))
print(y)

[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]


The function empty creates an array whose initial content is arbitrary: it is whatever happened to be in that memory, so it differs from run to run. The reason to use empty over zeros (or something similar) is speed. Just make sure to fill every element afterwards!

In [None]:
# Create an empty array with 2 elements
np.empty(2)

array([1., 1.])
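
The values shown for np.empty are whatever was left in memory, so they should not be relied on. A minimal sketch of the intended pattern, allocate first and then fill every element, could look like this (the name buffer is just for illustration):

```python
import numpy as np

buffer = np.empty(4)            # contents are arbitrary at this point

for i in range(buffer.size):    # overwrite every element before reading any
    buffer[i] = i ** 2

print(buffer)   # [0. 1. 4. 9.]
```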

In [None]:
# array of a particular value
z = np.full((5,4),5)
print(z)

[[5 5 5 5]
 [5 5 5 5]
 [5 5 5 5]
 [5 5 5 5]
 [5 5 5 5]]


In [None]:
# create an identity matrix
a = np.eye(5)  # np.eye(5, 4) also works, but gives a 5x4 array rather than a square identity matrix
print(a)

[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
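
np.eye also accepts a second argument for the number of columns and a k argument that shifts the diagonal. The sketch below shows both; the result is only an identity matrix when the array is square and k is 0.

```python
import numpy as np

rect = np.eye(3, 4)         # 3 rows, 4 columns, ones on the main diagonal
shifted = np.eye(3, k=1)    # 3x3, ones on the diagonal just above the main one

print(rect)
# [[1. 0. 0. 0.]
#  [0. 1. 0. 0.]
#  [0. 0. 1. 0.]]
print(shifted)
# [[0. 1. 0.]
#  [0. 0. 1.]
#  [0. 0. 0.]]
```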


In [None]:
# create a numpy array with random values
b = np.random.random((3,4))
print(b)

[[0.4266738  0.8005022  0.83193969 0.61212403]
 [0.43586797 0.6559109  0.32552164 0.50222238]
 [0.29984265 0.29563813 0.55025982 0.55121929]]


In [None]:
# random integer values array within a specific range
c = np.random.randint(10,100,(3,5))
print(c)

[[87 42 26 16 96]
 [70 20 29 39 70]
 [91 62 67 13 61]]


In [None]:
# array of evenly spaced values --> specifying the number of values required
d = np.linspace(10,30,6)
print(d)

[10. 14. 18. 22. 26. 30.]


In [None]:
d = np.linspace(10,30,5)
print(d)

[10. 15. 20. 25. 30.]


In [None]:
# array of evenly spaced values --> specifying the step
e = np.arange(10,30,5)
print(e)

[10 15 20 25]
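
The two outputs above differ in one detail worth spelling out: np.linspace includes the stop value by default, while np.arange excludes it. A short sketch comparing them side by side (the endpoint=False call is added here for illustration):

```python
import numpy as np

by_count = np.linspace(10, 30, 5)                    # 5 values, stop included
by_step = np.arange(10, 30, 5)                       # step of 5, stop excluded
open_ended = np.linspace(10, 30, 4, endpoint=False)  # 4 values, stop excluded

print(by_count)     # [10. 15. 20. 25. 30.]
print(by_step)      # [10 15 20 25]
print(open_ended)   # [10. 15. 20. 25.]
```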


In [None]:
# convert a list to a numpy array
list2 = [10,20,20,20,50]

np_array = np.asarray(list2)
print(np_array)
type(np_array)

[10 20 20 20 50]


numpy.ndarray

Specifying your data type

While the default data type is floating point (np.float64), you can explicitly specify which data type you want using the dtype keyword.

In [None]:
x = np.ones(2, dtype=np.int64)
print(x)
type(x)

[1 1]


numpy.ndarray
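
Note that type(x) reports the container class (numpy.ndarray) in both cases. To see the element type, inspect the dtype attribute instead. A minimal sketch, with illustrative variable names:

```python
import numpy as np

default_ones = np.ones(2)                  # no dtype given
int_ones = np.ones(2, dtype=np.int64)      # dtype requested explicitly
as_float32 = int_ones.astype(np.float32)   # convert an existing array

print(default_ones.dtype)   # float64
print(int_ones.dtype)       # int64
print(as_float32.dtype)     # float32
```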

Adding, removing, and sorting elements

Sorting an array is simple with np.sort(), which returns a sorted copy and leaves the original unchanged. You can specify the axis, kind, and order when you call the function.

In [3]:
arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])
print(np.sort(arr))

[1 2 3 4 5 6 7 8]
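
For arrays with more than one dimension, the axis argument decides the direction of the sort. The sketch below uses a small made-up 2-D array to show the default (last axis), axis=0, and axis=None.

```python
import numpy as np

grid = np.array([[3, 7, 2],
                 [9, 1, 8]])

within_rows = np.sort(grid)             # default axis=-1: sort each row
within_cols = np.sort(grid, axis=0)     # sort each column
flattened = np.sort(grid, axis=None)    # flatten first, then sort

print(within_rows)   # [[2 3 7]
                     #  [1 8 9]]
print(within_cols)   # [[3 1 2]
                     #  [9 7 8]]
print(flattened)     # [1 2 3 7 8 9]
```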


Joining arrays with np.concatenate()

In [None]:
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])
np.concatenate((a, b))

array([1, 2, 3, 4, 5, 6, 7, 8])

In [None]:
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6]])
a = np.concatenate((x, y))
print(a)
a.shape

[[1 2]
 [3 4]
 [5 6]]


(3, 2)
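
By default np.concatenate joins along axis 0 (rows). Passing axis=1 joins along columns instead, provided the other dimensions match. A small sketch reusing the same x and y values:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6]])

stacked_rows = np.concatenate((x, y), axis=0)     # y appended as a new row
stacked_cols = np.concatenate((x, y.T), axis=1)   # y.T (shape (2, 1)) as a new column

print(stacked_rows.shape)   # (3, 2)
print(stacked_cols)
# [[1 2 5]
#  [3 4 6]]
```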

Analysing a NumPy array

In [None]:
c = np.random.randint(10,90,(5,5))
print(c)

[[69 47 76 41 51]
 [54 42 87 31 78]
 [36 86 60 61 71]
 [88 46 33 11 12]
 [38 45 32 36 74]]


In [None]:
# shape of the array (size along each dimension)
print(c.shape)

(5, 5)


In [None]:
# number of dimensions
print(c.ndim)

2


In [None]:
# number of elements in an array
print(c.size)

25


In [None]:
# checking the data type of the values in the array
print(c.dtype)

int64


Mathematical operations on a NumPy array

In [None]:
list1 = [1,2,3,4,5]
list2 = [6,7,8,9,10]

print(list1 + list2)    # + concatenates (joins) two lists; it does not add their values

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


In [None]:
a = np.random.randint(0,10,(3,3))
b = np.random.randint(10,20,(3,3))

In [None]:
print(a)
print(b)

[[9 2 7]
 [0 7 5]
 [9 1 6]]
[[11 12 16]
 [16 10 19]
 [13 11 10]]


In [None]:
print(a+b)
print(a-b)
print(a*b)
print(a/b)

[[20 14 23]
 [16 17 24]
 [22 12 16]]
[[ -2 -10  -9]
 [-16  -3 -14]
 [ -4 -10  -4]]
[[ 99  24 112]
 [  0  70  95]
 [117  11  60]]
[[0.81818182 0.16666667 0.4375    ]
 [0.         0.7        0.26315789]
 [0.69230769 0.09090909 0.6       ]]


In [None]:
a = np.random.randint(0,10,(3,3))
b = np.random.randint(10,20,(3,3))

In [None]:
print(a)
print(b)

[[1 3 2]
 [8 6 6]
 [1 5 7]]
[[12 15 14]
 [15 17 15]
 [15 10 12]]


In [None]:
print(np.add(a,b))
print(np.subtract(a,b))
print(np.multiply(a,b))
print(np.divide(a,b))

[[13 18 16]
 [23 23 21]
 [16 15 19]]
[[-11 -12 -12]
 [ -7 -11  -9]
 [-14  -5  -5]]
[[ 12  45  28]
 [120 102  90]
 [ 15  50  84]]
[[0.08333333 0.2        0.14285714]
 [0.53333333 0.35294118 0.4       ]
 [0.06666667 0.5        0.58333333]]
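
The operators and the np.add / np.subtract / np.multiply / np.divide functions shown above give the same results, and all of them work element by element. In particular, * is not matrix multiplication; for that NumPy provides the @ operator. The sketch below uses small fixed arrays (chosen for this example) so the results can be checked by hand.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])

same_result = np.array_equal(a + b, np.add(a, b))   # operator vs function
elementwise = a * b          # multiplies matching positions
matrix_product = a @ b       # rows of a times columns of b
quotient = a / b             # true division always yields floats

print(same_result)       # True
print(elementwise)       # [[ 10  40]
                         #  [ 90 160]]
print(matrix_product)    # [[ 70 100]
                         #  [150 220]]
print(quotient.dtype)    # float64
```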


Array Manipulation

In [None]:
array = np.random.randint(0,10,(2,3))
print(array)
print(array.shape)

[[5 6 6]
 [8 7 7]]
(2, 3)


In [None]:
# transpose
trans = np.transpose(array)
print(trans)
print(trans.shape)

[[5 8]
 [6 7]
 [6 7]]
(3, 2)


In [None]:
array = np.random.randint(0,10,(2,3))
print(array)
print(array.shape)

[[5 1 4]
 [9 5 0]]
(2, 3)


In [None]:
trans2 = array.T
print(trans2)
print(trans2.shape)

[[5 9]
 [1 5]
 [4 0]]
(3, 2)


In [None]:
# reshaping an array
a = np.random.randint(0,10,(2,3))
print(a)
print(a.shape)

[[3 6 0]
 [2 9 6]]
(2, 3)


In [None]:
b = a.reshape(3,2)
print(b)
print(b.shape)

[[3 6]
 [0 2]
 [9 6]]
(3, 2)
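
Although transposing and reshaping a (2, 3) array both produce a (3, 2) result, they arrange the values differently: reshape keeps the elements in their original row-by-row order, while transpose swaps rows and columns. The sketch below uses the same values as the output above; the -1 argument, which asks NumPy to infer that dimension, is added for illustration.

```python
import numpy as np

a = np.array([[3, 6, 0],
              [2, 9, 6]])

reshaped = a.reshape(3, 2)    # same element order, new row length
transposed = a.T              # rows become columns
inferred = a.reshape(-1, 2)   # -1 lets NumPy work out the row count

print(reshaped)     # [[3 6]
                    #  [0 2]
                    #  [9 6]]
print(transposed)   # [[3 2]
                    #  [6 9]
                    #  [0 6]]
print(inferred.shape)   # (3, 2)
```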


Indexing and slicing

In [None]:
data = np.array([1, 2, 3])
print(data[1])
print(data[0:2])
print(data[1:])
print(data[-2:])   # last two elements (negative indices count from the end)

2
[1 2]
[2 3]
[2 3]
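
The same slicing syntax extends to 2-D arrays by separating the row and column selections with a comma. A minimal sketch on a made-up 3x4 array:

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

single = grid[1, 2]        # row 1, column 2
first_col = grid[:, 0]     # every row, column 0
block = grid[0:2, 1:3]     # rows 0-1, columns 1-2
last_row = grid[-1]        # negative index counts from the end

print(single)      # 6
print(first_col)   # [0 4 8]
print(block)       # [[1 2]
                   #  [5 6]]
print(last_row)    # [ 8  9 10 11]
```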
