# NumPy

## What is NumPy?

NumPy is a Python library used for working with arrays.

It also provides functions for linear algebra, Fourier transforms, and matrix operations.

NumPy was created in 2005 by Travis Oliphant. It is an open source project and you can use it freely.

NumPy stands for Numerical Python.

## Why Use NumPy?

In Python we have lists that serve the purpose of arrays, but they are slow to process.

NumPy aims to provide an array object that is up to 50x faster than traditional Python lists.

The array object in NumPy is called `ndarray`. It comes with many supporting functions that make working with arrays easy.

Arrays are very frequently used in data science, where speed and resources are very important.

<a id="section-one"></a>
# Section 1 - Import NumPy & basic check

In [1]:
import numpy as np

In [2]:
# code to check version of numpy library

np.__version__

'1.22.4'

In [3]:
# To list everything in the numpy namespace, type np.<TAB>
# To display NumPy's built-in documentation, type np? (this syntax works in IPython and Jupyter only)

np?

In [4]:
p=np.array([3.14,6,8,9,10,15,19,21])
print(p)

[ 3.14  6.    8.    9.   10.   15.   19.   21.  ]


## Array creation

NumPy arrays can be created from Python lists:

In [5]:
my_array = np.array([1, 2, 3])
my_array

array([1, 2, 3])

NumPy supports arrays of arbitrary dimension. For example, we can create a two-dimensional array (e.g. to store a matrix) as follows:

In [6]:
my_2d_array = np.array([[1, 2, 3], [4, 5, 6]])
my_2d_array

array([[1, 2, 3],
       [4, 5, 6]])

The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.

In [7]:
print(my_2d_array.shape)

(2, 3)
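
Alongside `shape`, the attributes `ndim` and `size` report the number of dimensions and the total number of elements. The following is a minimal sketch; the array name `example` is a hypothetical stand-in with the same layout as `my_2d_array`.

```python
import numpy as np

# Hypothetical 2x3 array, same layout as my_2d_array above
example = np.array([[1, 2, 3], [4, 5, 6]])

print(example.ndim)   # number of dimensions (rank): 2
print(example.shape)  # size along each dimension: (2, 3)
print(example.size)   # total number of elements: 6
```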


NumPy also provides many functions to create arrays:

In [8]:
a = np.zeros((2,2))  # Create an array of all zeros
print(a)

[[0. 0.]
 [0. 0.]]


In [9]:
b = np.ones((1,2))   # Create an array of all ones
print(b)

[[1. 1.]]


In [10]:
c = np.full((2,2), 7) # Create a constant array
print(c)

[[7 7]
 [7 7]]


In [11]:
d = np.eye(2)        # Create a 2x2 identity matrix
print(d)

[[1. 0.]
 [0. 1.]]


In [12]:
e = np.random.random((2,2)) # Create an array filled with random values
print(e)

[[0.21519715 0.08954442]
 [0.8298968  0.28524857]]


In [13]:
# Create a NumPy array of 5 random integers drawn from [0, 10)
a = np.random.randint(10, size = 5)

print(a)

[9 7 8 8 0]


In [14]:
# Create a 2x3 NumPy array of random integers drawn from [1, 20)
a=np.random.randint(1,20,size=(2,3))

print(a,'a.shape :',a.shape)
print('a.dtype :',a.dtype)

[[ 8  8  1]
 [ 9  1 10]] a.shape : (2, 3)
a.dtype : int32
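
The `np.random.random` and `np.random.randint` calls above draw from NumPy's global random state, so the values differ on every run. One possible alternative, sketched here, is the `Generator` object returned by `np.random.default_rng`, which accepts a seed for repeatable output. The seed value `0` and the variable names are arbitrary choices made for illustration.

```python
import numpy as np

# Seeded generator: the same seed gives the same sequence of draws
rng = np.random.default_rng(0)

floats = rng.random((2, 2))              # uniform floats in [0, 1)
ints = rng.integers(1, 20, size=(2, 3))  # integers in [1, 20), like randint(1, 20, ...)

print(floats.shape, ints.shape)

# A second generator with the same seed reproduces the first draw
rng_again = np.random.default_rng(0)
print(np.array_equal(floats, rng_again.random((2, 2))))
```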


## Datatypes

Every numpy array is a grid of elements of the same type. Numpy provides a large set of numeric datatypes that you can use to construct arrays. Numpy tries to guess a datatype when you create an array, but functions that construct arrays usually also include an optional argument to explicitly specify the datatype. Here is an example:

In [15]:
x = np.array([1, 2])  # Let numpy choose the datatype
y = np.array([1.0, 2.0])  # Let numpy choose the datatype
z = np.array([1, 2], dtype=np.int64)  # Force a particular datatype

print(x.dtype, y.dtype, z.dtype)

int32 float64 int64
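
To make the dtype handling more concrete, here is a small sketch that passes `dtype` to a constructor and converts an existing array with `astype`. The variable names are illustrative. Note that the default integer type printed for `x` above depends on the platform and NumPy version, so other machines may show `int64` instead of `int32`.

```python
import numpy as np

# Request a dtype explicitly when constructing an array
zeros32 = np.zeros(3, dtype=np.float32)
print(zeros32.dtype)                 # float32

# Convert an existing array; astype returns a new array
floats = np.array([1.7, 2.2, 3.9])
as_ints = floats.astype(np.int64)    # fractional part is truncated toward zero
print(as_ints)                       # [1 2 3]
print(floats.dtype, as_ints.dtype)   # float64 int64
```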


## Array indexing and slicing

Numpy offers several ways to index into arrays.

Slicing: Similar to Python lists, numpy arrays can be sliced. Since arrays may be multidimensional, you must specify a slice for each dimension of the array:

In [16]:
import numpy as np

# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; b is the following array of shape (2, 2):
# [[2 3]
#  [6 7]]
b = a[:2, 1:3]
print(b)

[[2 3]
 [6 7]]


A slice of an array is a view into the same data, so modifying it will modify the original array.

In [17]:
print(a[0, 1])
b[0, 0] = 77    # b[0, 0] is the same piece of data as a[0, 1]
print(a[0, 1]) 

2
77
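
When the original array must stay untouched, an explicit `.copy()` is one way to avoid this aliasing. The sketch below uses `np.shares_memory` to check whether two arrays overlap in memory; the names `view` and `copy` are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

view = a[:2, 1:3]          # basic slicing: shares memory with a
copy = a[:2, 1:3].copy()   # explicit copy: independent data

print(np.shares_memory(a, view))  # True
print(np.shares_memory(a, copy))  # False

copy[0, 0] = 77            # does not touch a
print(a[0, 1])             # still 2
```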


You can also mix integer indexing with slice indexing. However, doing so will yield an array of lower rank than the original array. Note that this is quite different from the way that MATLAB handles array slicing:

In [18]:
# Create the following rank 2 array with shape (3, 4)
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(a)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


There are two ways of accessing the data in the middle row of the array. Mixing integer indexing with slices yields an array of lower rank, while using only slices yields an array of the same rank as the original array:

In [19]:
row_r1 = a[1, :]    # Rank 1 view of the second row of a  
row_r2 = a[1:2, :]  # Rank 2 view of the second row of a
row_r3 = a[[1], :]  # Rank 2 copy of the second row of a (indexing with a list returns a copy, not a view)

print(row_r1, row_r1.shape)
print(row_r2, row_r2.shape)
print(row_r3, row_r3.shape)

[5 6 7 8] (4,)
[[5 6 7 8]] (1, 4)
[[5 6 7 8]] (1, 4)


In [20]:
# We can make the same distinction when accessing columns of an array:
col_r1 = a[:, 1]
col_r2 = a[:, 1:2]
print(col_r1, col_r1.shape)
print()
print(col_r2, col_r2.shape)

[ 2  6 10] (3,)

[[ 2]
 [ 6]
 [10]] (3, 1)


We can also select only the elements that satisfy a condition by indexing with a boolean mask:

In [21]:
x = np.arange(10)
mask = x >= 5
x[mask]

array([5, 6, 7, 8, 9])
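
Masks can also be combined and used for assignment. The following is a brief sketch under the same setup (`x = np.arange(10)`); the names `middle` and `clipped` are illustrative.

```python
import numpy as np

x = np.arange(10)

# Combine conditions element-wise with & (and) and | (or);
# parentheses are required because of operator precedence
middle = x[(x >= 3) & (x < 7)]
print(middle)                        # [3 4 5 6]

# Count how many elements satisfy a condition
print(np.count_nonzero(x % 2 == 0))  # 5 even values

# Assign through a mask (on a copy, so x stays unchanged)
clipped = x.copy()
clipped[clipped > 6] = 6             # replace every value above 6
print(clipped)                       # [0 1 2 3 4 5 6 6 6 6]
```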

## Basic operations

In NumPy, we express computations directly over arrays. This makes the code much more succinct.

Arithmetic operations can be performed directly over arrays. For instance, assuming two arrays have compatible shapes, we can add them as follows:

In [22]:
array_a = np.array([1, 2, 3])
array_b = np.array([4, 5, 6])
array_a + array_b

array([5, 7, 9])

Compare this with the equivalent computation using a for loop:

In [23]:
array_out = np.zeros_like(array_a)
for i in range(len(array_a)):
    array_out[i] = array_a[i] + array_b[i]
array_out

array([5, 7, 9])

Not only is this code more verbose, it will also run much more slowly.
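
One way to check the speed difference on your own machine is to time both versions with the standard-library `timeit` module. This is only a rough sketch: the array size `n` and the repeat count are arbitrary, and the measured ratio will vary with hardware and NumPy version.

```python
import timeit
import numpy as np

n = 100_000  # arbitrary size chosen for illustration
array_a = np.arange(n)
array_b = np.arange(n)

def add_with_loop():
    # Element-by-element addition in a Python loop
    out = np.zeros_like(array_a)
    for i in range(len(array_a)):
        out[i] = array_a[i] + array_b[i]
    return out

def add_vectorized():
    # Same computation expressed directly over the arrays
    return array_a + array_b

# Both approaches give the same result
print(np.array_equal(add_with_loop(), add_vectorized()))

loop_time = timeit.timeit(add_with_loop, number=3)
vec_time = timeit.timeit(add_vectorized, number=3)
print(f"loop: {loop_time:.4f} s, vectorized: {vec_time:.4f} s")
```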

In NumPy, functions that operate on arrays in an element-wise fashion are called [universal functions](https://numpy.org/doc/stable/reference/ufuncs.html). For instance, `np.sin` is a universal function:

In [24]:
np.sin(array_a)

array([0.84147098, 0.90929743, 0.14112001])

The vector inner product can be computed using `np.dot`:

In [25]:
np.dot(array_a, array_b)

32

When the two arguments to `np.dot` are both 2D arrays, `np.dot` performs matrix multiplication:

In [26]:
array_A = np.random.rand(5, 3)
array_B = np.random.randn(3, 4)
np.dot(array_A, array_B)

array([[-1.72028946,  0.71865934, -2.35388235,  0.23162774],
       [-1.34698384,  0.53372523, -1.8373839 ,  0.23185902],
       [-1.34153361,  0.55001382, -1.74264567,  0.38897184],
       [-1.474681  ,  0.43400299, -1.73194549,  1.03872229],
       [-1.00765838,  0.36273305, -1.53493937, -0.11345402]])
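
For 2D arrays, the `@` operator is an equivalent way to write matrix multiplication. The sketch below uses small integer arrays (hypothetical names `A` and `B`) instead of random values so the shapes are easy to follow, and shows what happens when the inner dimensions do not match.

```python
import numpy as np

A = np.arange(15).reshape(5, 3)   # shape (5, 3)
B = np.arange(12).reshape(3, 4)   # shape (3, 4)

product = A @ B                   # matrix product, shape (5, 4)
print(product.shape)
print(np.array_equal(product, np.dot(A, B)))  # True

# The inner dimensions must agree: (5, 3) @ (5, 3) is not defined
try:
    A @ A
except ValueError as err:
    print("shape mismatch:", err)
```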

The matrix transpose can be obtained using `.transpose()`, or `.T` for short:

In [27]:
array_A.T

array([[0.8759689 , 0.71282475, 0.68832786, 0.90786101, 0.57544646],
       [0.85984638, 0.65212531, 0.72891982, 0.78578637, 0.33958507],
       [0.77103784, 0.59808777, 0.43256276, 0.16719383, 0.74756399]])

## Computation on Arrays: Broadcasting

In [28]:
a = np.array([1, 1, 2])
b = np.array([5, 6, 5])
a + b

array([6, 7, 7])

In [29]:
a+5

array([6, 6, 7])

In [30]:
# We can similarly extend this to arrays of higher dimension. Observe the result when we add a one-dimensional array to a two-dimensional array

M=np.ones((3,3))
M

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [31]:
M+a
# Here the one-dimensional array a is stretched, or broadcast, across the rows (the first dimension) in order to match the shape of M

array([[2., 2., 3.],
       [2., 2., 3.],
       [2., 2., 3.]])

In [32]:
# While these examples are relatively easy to understand, more complicated cases can involve broadcasting of both arrays. Consider the following example:

a=np.arange(3)
b=np.arange(3)[:,np.newaxis]
print(a)
print(b)

print(a+b)

[0 1 2]
[[0]
 [1]
 [2]]
[[0 1 2]
 [1 2 3]
 [2 3 4]]
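
One way to see what broadcasting does to each operand is to inspect the result shape with `np.broadcast_shapes` (available in NumPy 1.20 and later) and to stretch each array explicitly with `np.broadcast_to`. The following is a brief sketch using the same shapes as the example above; the `*_stretched` names are illustrative.

```python
import numpy as np

a = np.arange(3)                 # shape (3,)
b = np.arange(3)[:, np.newaxis]  # shape (3, 1)

# Shape of the result without computing it
print(np.broadcast_shapes(a.shape, b.shape))  # (3, 3)

# Explicitly stretched versions of each operand (read-only views)
a_stretched = np.broadcast_to(a, (3, 3))
b_stretched = np.broadcast_to(b, (3, 3))
print(a_stretched)
print(b_stretched)

# Adding the stretched arrays matches the broadcast sum
print(np.array_equal(a_stretched + b_stretched, a + b))  # True
```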


In [33]:
# Rules of Broadcasting
# Broadcasting in NumPy follows a strict set of rules to determine the interaction between the two arrays
# Rule 1: If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.
# Rule 2: If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape
# Rule 3: If in any dimension the sizes disagree and neither is equal to 1, an error is raised.

#Broadcasting example 1
#Let’s look at adding a two-dimensional array to a one-dimensional array:

M=np.ones((2,3))
a=np.arange(3)

# Let’s consider an operation on these two arrays. The shapes of the arrays are:

M.shape

(2, 3)

In [34]:
a.shape
# By rule 1, the array a has fewer dimensions, so its shape is padded on the left with ones: (3,) -> (1, 3)
# By rule 2, the first dimension of a is then stretched to match M: (1, 3) -> (2, 3)

(3,)

In [35]:
M+a   # addition will be a 2x3 matrix

array([[1., 2., 3.],
       [1., 2., 3.]])

In [36]:
# Broadcasting example 2

# Let’s take a look at an example where both arrays need to be broadcast:

a=np.arange(3).reshape((3,1))
b=np.arange(3)

a+b

array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])
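
The two examples above both succeed. For completeness, here is a sketch of a case where rule 3 applies and broadcasting fails, followed by one possible fix using `np.newaxis`. The shapes are chosen for illustration.

```python
import numpy as np

M = np.ones((3, 2))
a = np.arange(3)

# Rule 1 pads a's shape (3,) to (1, 3); rule 2 stretches it to (3, 3).
# M stays (3, 2), so the last dimensions (2 vs 3) disagree: rule 3 applies.
try:
    M + a
except ValueError as err:
    print("broadcast failed:", err)

# Adding a new axis to a makes the shapes compatible: (3, 2) + (3, 1) -> (3, 2)
result = M + a[:, np.newaxis]
print(result.shape)  # (3, 2)
```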

## Fancy Indexing

In [37]:
# Fancy indexing is like the simple indexing we’ve already seen, but we pass arrays of indices in place of single scalars. This allows us to very quickly access and
# modify complicated subsets of an array’s values.

import numpy as np
rand = np.random.RandomState(42)

x = rand.randint(100, size=10)
print(x)

[51 92 14 71 60 20 82 86 74 74]


In [38]:
# Suppose we want to access three different elements. We could do it like this
[x[3],x[4],x[7]]

[71, 60, 86]

In [39]:
# Alternatively, we can pass a single list or array of indices to obtain the same result

ind=[3,4,7]
x[ind]

array([71, 60, 86])

In [40]:
# With fancy indexing, the shape of the result reflects the shape of the index arrays rather than the shape of the array being indexed:

ind=np.array([[3,7],[4,5]])
x[ind]

array([[71, 86],
       [60, 20]])

In [41]:
# Fancy indexing also works in multiple dimensions. Consider the following array

X=np.arange(12).reshape((3,4))
X

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [42]:
# Like with standard indexing, the first index refers to the row, and the second to the column

row=np.array([0,1,2]) 
col=np.array([2,1,3])

X[row,col]
# The indices are paired element-wise: X[0, 2], X[1, 1], X[2, 3]

array([ 2,  5, 11])

In [43]:
print(row[:, np.newaxis])

X[row[:, np.newaxis], col]

[[0]
 [1]
 [2]]


array([[ 2,  1,  3],
       [ 6,  5,  7],
       [10,  9, 11]])

In [44]:
row[:, np.newaxis] * col

array([[0, 0, 0],
       [2, 1, 3],
       [4, 2, 6]])

In [45]:
# Combined Indexing

# For even more powerful operations, fancy indexing can be combined with the other indexing schemes we’ve seen

print(X)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


In [46]:
# We can combine fancy and simple indices

X[2, [2, 0, 1]]

array([10,  8,  9])

In [47]:
# We can also combine fancy indexing with slicing

X[1:, [2, 0, 1]]

array([[ 6,  4,  5],
       [10,  8,  9]])

In [48]:
# And we can combine fancy indexing with masking

mask = np.array([1, 0, 1, 0], dtype=bool)
X[row[:, np.newaxis], mask]

array([[ 0,  2],
       [ 4,  6],
       [ 8, 10]])
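
Fancy indexing can be used to modify values as well as read them. The sketch below assigns through an index array and shows one caveat with repeated indices; `idx` and `counts` are hypothetical names introduced for illustration.

```python
import numpy as np

x = np.arange(10)
idx = np.array([2, 1, 8, 4])   # hypothetical index array

x[idx] = 99                    # assign one value to several positions
print(x)                       # [ 0 99 99  3 99  5  6  7 99  9]

x[idx] -= 10                   # in-place arithmetic on the same positions
print(x)                       # [ 0 89 89  3 89  5  6  7 89  9]

# With repeated indices, += applies only once per position;
# np.add.at accumulates every occurrence instead
counts = np.zeros(5, dtype=int)
np.add.at(counts, [0, 0, 2, 2, 2], 1)
print(counts)                  # [2 0 3 0 0]
```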

## Sorting Arrays

In [49]:
# Fast Sorting in NumPy: np.sort and np.argsort

x = np.array([2, 1, 4, 3, 5])
np.sort(x)

array([1, 2, 3, 4, 5])

In [50]:
x.sort()
print(x)

[1 2 3 4 5]


In [51]:
# A related function is argsort, which instead returns the indices of the sorted elements

x = np.array([2, 1, 4, 3, 5])
i = np.argsort(x)
print(i)

[1 0 3 2 4]


In [52]:
x[i]

array([1, 2, 3, 4, 5])
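
A common use of `argsort` is reordering one array according to the values in another. The following is a minimal sketch with made-up data; `names` and `scores` are hypothetical.

```python
import numpy as np

# Hypothetical paired data: one score per name
names = np.array(["ann", "bob", "cat", "dan"])
scores = np.array([72, 95, 60, 88])

order = np.argsort(scores)         # indices that sort scores ascending
print(names[order])                # ['cat' 'ann' 'dan' 'bob']

top_two = names[order[::-1][:2]]   # reverse for descending, keep two
print(top_two)                     # ['bob' 'dan']
```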

In [53]:
# Sorting along rows or columns

# A useful feature of NumPy’s sorting algorithms is the ability to sort along specific rows or columns of a multidimensional array using the axis argument

rand = np.random.RandomState(42)
X = rand.randint(0, 10, (4, 6))
X

array([[6, 3, 7, 4, 6, 9],
       [2, 6, 7, 4, 3, 7],
       [7, 2, 5, 4, 1, 7],
       [5, 1, 4, 0, 9, 5]])

In [54]:
# sort each column of X
np.sort(X, axis=0)

array([[2, 1, 4, 0, 1, 5],
       [5, 2, 5, 4, 3, 7],
       [6, 3, 7, 4, 6, 7],
       [7, 6, 7, 4, 9, 9]])

In [55]:
# sort each row of X
np.sort(X, axis=1)

array([[3, 4, 6, 6, 7, 9],
       [2, 3, 4, 6, 7, 7],
       [1, 2, 4, 5, 7, 7],
       [0, 1, 4, 5, 5, 9]])
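
Keep in mind that sorting along an axis treats each row or column independently, so values that started in the same row no longer line up. If whole rows should stay together, one option is to index with `argsort` on a chosen column. The sketch below reuses the values of `X` printed above; `X_by_first_col` is an illustrative name.

```python
import numpy as np

# Same values as the X shown above
X = np.array([[6, 3, 7, 4, 6, 9],
              [2, 6, 7, 4, 3, 7],
              [7, 2, 5, 4, 1, 7],
              [5, 1, 4, 0, 9, 5]])

# Row order that sorts column 0 in ascending order
order = np.argsort(X[:, 0])
X_by_first_col = X[order]          # rows are moved as whole units

print(X_by_first_col[:, 0])        # [2 5 6 7]
print(X_by_first_col[0])           # [2 6 7 4 3 7], an intact original row
```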