# NumPy

## 1. Introduction

**NumPy** stands for _**Num**erical **Py**thon_, and it is a fundamental package for scientific computing in Python. NumPy provides Python with an extensive math library capable of performing numerical computations effectively and efficiently. In this script, we give an overview of NumPy and introduce some of its features.

In the following you will learn:

* How to import NumPy
* How to create multidimensional NumPy ndarrays using various methods
* How to access and change elements in ndarrays
* How to load and save ndarrays
* How to use slicing to select or change subsets of an ndarray
* Understand the difference between a view and a copy of an ndarray
* How to use Boolean indexing and set operations to select or change subsets of an ndarray
* How to sort ndarrays
* How to perform element-wise operations on ndarrays
* Understand how NumPy uses broadcasting to perform operations on ndarrays of different sizes

## 2. Downloading NumPy
**NumPy** is included with Anaconda. If you don't already have Anaconda installed on your computer, please refer to the Anaconda section for instructions on how to install it on your PC or Mac.

In [3]:
# check the version of package numpy

import numpy  # import the numpy package
print(numpy.__version__)  # print the current numpy version
# print(numpy.__doc__)  # print the brief documentation of package numpy

1.19.2


Alternatively, you can check which version of **NumPy** you have by typing <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">!conda list numpy</code> in your **Jupyter Notebook** or by typing <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">conda list numpy</code> in the **Anaconda prompt**. If you have a different version of NumPy installed on your computer, you can change it (for example, downgrade to version 1.13) by typing <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">conda install numpy=1.13</code> in the **Anaconda prompt**. As newer versions of NumPy are released, some functions may become obsolete or be replaced, so make sure you have the correct NumPy version before running the code.

In [2]:
!conda list numpy

# packages in environment at C:\ProgramData\Anaconda3:
#
# Name                    Version                   Build  Channel
numpy                     1.19.2           py38hadc3359_0  
numpy-base                1.19.2           py38ha3acd2a_0  
numpydoc                  1.1.0              pyhd3eb1b0_1  


## 3. NumPy Documentation
NumPy is a remarkable math library and it has many functions and features. In these introductory lessons we will only scratch the surface of what NumPy can do. If you want to explore this package and know more about NumPy, make sure you check out the NumPy Documentation:

* [NumPy Manual](https://numpy.org/doc/stable/contents.html)
* [NumPy User Guide](https://numpy.org/doc/stable/user/index.html)
* [NumPy Reference](https://docs.scipy.org/doc/numpy-1.13.0/reference/index.html#reference)

## 4. Why NumPy?

You may be wondering why we use NumPy. After all, Python can handle lists, as you learned in the class.

1. **Speed**

   When executing operations on large arrays, NumPy can often be **orders of magnitude faster** than equivalent code that uses Python lists and loops. This speed comes from the nature of NumPy arrays being memory-efficient and from optimized algorithms used by NumPy for doing arithmetic, statistical, and linear algebra operations.

2. **Array structures**

   Another great feature of NumPy is that it has **multidimensional array data structures** that can represent vectors and matrices. Nowadays, a lot of machine learning algorithms rely on matrix operations. For example, when training a Neural Network, you often have to carry out many matrix multiplications. NumPy is optimized for matrix operations and it allows us to do Linear Algebra operations effectively and efficiently, making it very suitable for solving machine learning problems.

3. **Optimized built-in mathematical functions**

   Another great advantage of NumPy is that it has a large number of optimized built-in mathematical functions. These functions enable us to do a variety of complex mathematical computations very fast and with very little code (avoiding the use of complicated loops) making your programs more readable and easier to understand.

These are just some of the key features that have made NumPy an essential package for scientific computing in Python. In fact, NumPy has become so popular that a lot of Python packages, such as Pandas, are built on top of NumPy.

In [4]:
# Example: Experience the computation speed of NumPy

import numpy as np  # import the NumPy package into Python
import time         # import the time package to measure execution time

numbers = np.random.random(100000000)   # randomly generate a large ndarray of floats

# Time the mean computed with Python's built-in sum() and len()
starttime = time.time()            # record the start time
mean = sum(numbers) / len(numbers) # compute the mean of the numbers
endtime = time.time()              # record the end time
print(endtime - starttime)

15.920655250549316


In [4]:
# Test the speed of the built-in mean function of NumPy
starttime = time.time()
np_mean = np.mean(numbers)
endtime = time.time()
print(endtime - starttime)

0.14168286323547363


## 5. Explore the NumPy ndarrays

At the core of NumPy is the **ndarray**, where **nd** stands for _n-dimensional_. An ndarray is a multidimensional array of elements all of the same type. In other words, an ndarray is a grid that can take on many shapes and can hold either numbers or strings. In many Machine Learning problems you will often find yourself using ndarrays in many different ways. For instance, you might use an ndarray to hold the pixel values of an image that will be fed into a Neural Network for image classification.

But before we can dive in and start using NumPy to create ndarrays we need to import it into Python. We can import packages into Python using the **import** command and it has become a convention to import NumPy as **np**. Therefore, you can import NumPy by typing the following command in your Jupyter notebook:

<p style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px;"><code style="color:#fff;background-color:#2f3d48">import numpy as np</code></p>

In [3]:
import numpy as np

There are several ways to create ndarrays in NumPy. In the following lessons we will see two ways to create ndarrays:

1. Using regular Python lists

2. Using built-in NumPy functions

In the following, we will create ndarrays by providing Python lists to the NumPy **np.array()** function. **np.array()** is a function that returns an **ndarray**. For the sake of clarity, the examples throughout this section use small and simple ndarrays. Let's start by creating 1-Dimensional (1D) ndarrays.


In [6]:
import numpy as np  # import the Numpy package into Python

# We create a 1D ndarray that contains only integers
x = np.array([1, 2, 3, 4, 5])

# Let's print the ndarray we just created using the print() command
print('x = ', x)

x =  [1 2 3 4 5]


Before we continue, let's introduce some useful terminology. We refer to 1D arrays as rank 1 arrays. In general, N-dimensional arrays have rank N. Therefore, we refer to a 2D array as a rank 2 array. Another important property of arrays is their shape. The shape of an array is the size along each of its dimensions. For example, the shape of a rank 2 array corresponds to the number of rows and columns of the array. As you will see, NumPy ndarrays have attributes that allow us to get information about them in a very intuitive way. For example, the shape of an ndarray can be obtained using the .shape attribute. The shape attribute returns a tuple of N non-negative integers that specify the size of each dimension. In the example below we will create a rank 1 array and learn how to obtain its shape, its type, and the data-type (dtype) of its elements.

In [6]:
# Practice: Create a 1D ndarray that contains only integers
import numpy as np
x = np.array([1, 2, 3, 4, 5])

# Print x
print()
print('x = ', x)
print()

# Print information about x
print('x has dimensions:', x.shape)
print('x is an object of type:', type(x))
print('The elements in x are of type:', x.dtype)


x =  [1 2 3 4 5]

x has dimensions: (5,)
x is an object of type: <class 'numpy.ndarray'>
The elements in x are of type: int32


We can see that the shape attribute returns the tuple <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">(5,)</code> telling us that <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">x</code> is of rank 1 (i.e. <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">x</code> only has 1 dimension) and it has 5 elements. The <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">type()</code> function tells us that <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">x</code> is indeed a NumPy ndarray. Finally, the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">.dtype</code> attribute tells us that the elements of <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">x</code> are stored in memory as signed *32-bit integers*. The default integer type depends on the platform and the NumPy version, so on other systems you may see <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">int64</code> (signed 64-bit integers) instead. Another great advantage of NumPy is that it can handle more data-types than Python lists. You can check out all the different data types NumPy supports in the link below:

[NumPy Data Types](https://numpy.org/doc/stable/user/basics.types.html)
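
As a minimal sketch of how the element type can be chosen explicitly rather than inferred, the snippet below passes the `dtype` keyword to `np.array()`. The variable names `x_default`, `x_int64`, and `x_float` are illustrative only and do not appear elsewhere in this script.

```python
import numpy as np

# Let NumPy infer the dtype from the Python list (platform-dependent for integers)
x_default = np.array([1, 2, 3, 4, 5])

# Request specific dtypes explicitly with the dtype keyword
x_int64 = np.array([1, 2, 3, 4, 5], dtype=np.int64)
x_float = np.array([1, 2, 3, 4, 5], dtype=np.float64)

print('Inferred dtype:', x_default.dtype)
print('Explicit int64 dtype:', x_int64.dtype)
print('Explicit float64 dtype:', x_float.dtype)
print('x_float = ', x_float)
```

Setting the dtype explicitly is one way to get the same element type regardless of the platform the code runs on.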

ndarrays can also hold strings. Let's see how we can create a rank 1 ndarray of strings in the same manner as before, by providing the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.array()</code> function a Python list of strings.

In [7]:
# Practice: Create a rank 1 ndarray that only contains strings
import numpy as np
x = np.array(['Hello', 'World'])

# Print x
print()
print('x = ', x)
print()

# Print information about x
print('x has dimensions:', x.shape)
print('x is an object of type:', type(x))
print('The elements in x are of type:', x.dtype)


x =  ['Hello' 'World']

x has dimensions: (2,)
x is an object of type: <class 'numpy.ndarray'>
The elements in x are of type: <U5


As we can see the shape attribute tells us that <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">x</code> now has only 2 elements, and even though <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">x</code> now holds strings, the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">type()</code> function tells us that <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">x</code> is still an ndarray as before. In this case however, the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">.dtype</code> attribute tells us that the elements in <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">x</code> are stored in memory as **Unicode strings of 5 characters**.

It is important to remember that one big difference between Python lists and ndarrays is that, unlike Python lists, all the elements of an ndarray must be of the same type. So, while we can create Python lists with both integers and strings, **we can't mix types in ndarrays**. If you provide a list that mixes numbers and strings, NumPy will convert all elements to strings. For example,

In [8]:
# Practice: Create a rank 1 ndarray with mixed types of data
x = np.array([1, 2, 'World'])

# Print the ndarray
print()
print('x = ', x)
print()

# Print information about x
print('x has dimensions:', x.shape)
print('x is an object of type:', type(x))
print('The elements in x are of type:', x.dtype)


x =  ['1' '2' 'World']

x has dimensions: (3,)
x is an object of type: <class 'numpy.ndarray'>
The elements in x are of type: <U11
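
Mixing types does not always produce strings. As a small sketch, assuming a list that contains only integers and floats, NumPy upcasts the integers so that every element shares a single floating point type:

```python
import numpy as np

# Mixing integers and floats: NumPy upcasts every element to float64
x = np.array([1, 2.5, 4])

print('x = ', x)
print('The elements in x are of type:', x.dtype)
```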


## 6. Using Built-in Functions to Create ndarrays

There are several built-in functions in NumPy to create ndarrays:

* <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.zeros()</code>
* <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.ones()</code>
* <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.eye()</code>
* <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.diag()</code>
* <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.arange()</code>

Let's start by creating an ndarray with a specified shape that is full of zeros. We can do this by using the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.zeros()</code> function, which creates an ndarray full of zeros with the given shape. So, for example, if you wanted to create a rank 2 array with 3 rows and 4 columns, you will pass the shape to the function in the form of (rows, columns), as in the example below:

In [14]:
# Practice: Create a 3 x 4 ndarray full of zeros. 
X = np.zeros((3,4))  # X = np.zeros((3,4), dtype=np.float64)

# Print X
print()
print('X = \n', X)
print()

# Print information about X
print('X has dimensions:', X.shape)
print('X is an object of type:', type(X))
print('The elements in X are of type:', X.dtype)


X = 
 [[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]

X has dimensions: (3, 4)
X is an object of type: <class 'numpy.ndarray'>
The elements in X are of type: float64


Similarly, we can create an ndarray with a specified shape that is full of ones. We can do this by using the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.ones()</code> function. For example,

In [10]:
# Practice: Create a 3 x 2 ndarray full of ones
X = np.ones((3, 2))

# Print X
print()
print('X = \n', X)
print()

# Print information about X
print('X has dimensions:', X.shape)
print('X is an object of type:', type(X))
print('The elements in X are of type:', X.dtype) 


X = 
 [[1. 1.]
 [1. 1.]
 [1. 1.]]

X has dimensions: (3, 2)
X is an object of type: <class 'numpy.ndarray'>
The elements in X are of type: float64


An **Identity matrix** is a square matrix that has only 1s in its main diagonal and zeros everywhere else. The function <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.eye(N)</code> creates a square **N x N** ndarray corresponding to the Identity matrix. 

In [13]:
# Practice: Create a 5 x 5 Identity matrix. 
X = np.eye(5)

# Print X
print()
print('X = \n', X)
print()

# Print information about X
print('X has dimensions:', X.shape)
print('X is an object of type:', type(X))
print('The elements in X are of type:', X.dtype)  


X = 
 [[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]

X has dimensions: (5, 5)
X is an object of type: <class 'numpy.ndarray'>
The elements in X are of type: float64


The <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.diag()</code> function creates an ndarray corresponding to a diagonal matrix, as shown in the example below:

In [19]:
# Practice: Create a 4 x 4 diagonal matrix that contains the numbers 10,20,30, and 50
# on its main diagonal
X = np.diag([10, 20, 30, 50])

# Print X
print()
print('X = \n', X)
print()


X = 
 [[10  0  0  0]
 [ 0 20  0  0]
 [ 0  0 30  0]
 [ 0  0  0 50]]



NumPy's <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.arange()</code> function is very versatile and can be used with either one, two, or three arguments. Below we will see examples of each case and how they are used to create different kinds of ndarrays.

1.  When used with only **one argument**, <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.arange(N)</code> will create a rank 1 ndarray with consecutive integers between **0** and **N - 1**.

In [21]:
# Practice: Create a rank 1 ndarray that has sequential integers from 0 to 9
x = np.arange(10)

# Print the ndarray
print()
print('x = ', x)
print()

# We print information about the ndarray
print('x has dimensions:', x.shape)
print('x is an object of type:', type(x))
print('The elements in x are of type:', x.dtype) 


x =  [0 1 2 3 4 5 6 7 8 9]

x has dimensions: (10,)
x is an object of type: <class 'numpy.ndarray'>
The elements in x are of type: int32


2. When used with **two arguments**, <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.arange(start,stop)</code> will create a rank 1 ndarray with evenly spaced values within the half-open interval <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">[start, stop)</code>. This means the evenly spaced numbers will include <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">start</code> but exclude <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">stop</code>. Let's see an example:

In [22]:
# Practice: Create a rank 1 ndarray that has sequential integers from 4 to 9. 
x = np.arange(4, 10)

# Print the ndarray
print()
print('x = ', x)
print()

# We print information about the ndarray
print('x has dimensions:', x.shape)
print('x is an object of type:', type(x))
print('The elements in x are of type:', x.dtype) 


x =  [4 5 6 7 8 9]

x has dimensions: (6,)
x is an object of type: <class 'numpy.ndarray'>
The elements in x are of type: int32
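
For the **three-argument** case mentioned earlier, `np.arange(start, stop, step)` takes the distance between consecutive values as its third argument. The following is a minimal sketch with integer values chosen only for illustration:

```python
import numpy as np

# Start at 1, stop before 14, and step by 3
x = np.arange(1, 14, 3)

print('x = ', x)
print('x has dimensions:', x.shape)
```

As with the two-argument form, `start` is included and `stop` is excluded, so the values here are 1, 4, 7, 10, and 13.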


Even though the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.arange()</code> function allows for non-integer steps, such as 0.3, the output can be inconsistent (for example, the number of elements may not be what you expect) due to finite floating point precision. For this reason, in the cases where non-integer steps are required, it is usually better to use the function <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.linspace()</code>. The <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.linspace(start, stop, N)</code> function returns <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">N</code> evenly spaced numbers over the closed interval <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">[start, stop]</code>. This means that both the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">start</code> and the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">stop</code> values are included.

In [23]:
# Practice: Create a rank 1 ndarray that has 10 numbers evenly spaced between 0 and 25.
x = np.linspace(0, 25, 10)

# Print the ndarray
print()
print('x = \n', x)
print()

# Print information about the ndarray
print('x has dimensions:', x.shape)
print('x is an object of type:', type(x))
print('The elements in x are of type:', x.dtype) 


x = 
 [ 0.          2.77777778  5.55555556  8.33333333 11.11111111 13.88888889
 16.66666667 19.44444444 22.22222222 25.        ]

x has dimensions: (10,)
x is an object of type: <class 'numpy.ndarray'>
The elements in x are of type: float64


However, you can let the endpoint of the interval be excluded (just like in the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.arange()</code> function) by setting the keyword <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">endpoint = False</code> in the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.linspace()</code> function.

In [3]:
# Practice: Create a rank 1 ndarray that has 10 numbers evenly spaced between 0 and 25,
# with 25 excluded.
x = np.linspace(0, 25, 10, endpoint=False)
y = np.linspace(0, 25, 10)  # for comparison

# Print the ndarray
print()
print('x = ', x)
print('y = ', y)
print()

# Print information about the ndarray
print('x has dimensions:', x.shape)
print('x is an object of type:', type(x))
print('The elements in x are of type:', x.dtype) 


x =  [ 0.   2.5  5.   7.5 10.  12.5 15.  17.5 20.  22.5]
y =  [ 0.          2.77777778  5.55555556  8.33333333 11.11111111 13.88888889
 16.66666667 19.44444444 22.22222222 25.        ]

x has dimensions: (10,)
x is an object of type: <class 'numpy.ndarray'>
The elements in x are of type: float64


As we can see, because we have excluded the endpoint, the spacing between values had to change in order to fit 10 evenly spaced numbers in the given interval.
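
To make the change in spacing explicit, the sketch below uses the `retstep=True` keyword of `np.linspace()`, which also returns the step between consecutive samples. The names `step_open` and `step_closed` are illustrative only.

```python
import numpy as np

# retstep=True makes np.linspace also return the spacing between samples
x, step_open = np.linspace(0, 25, 10, endpoint=False, retstep=True)
y, step_closed = np.linspace(0, 25, 10, retstep=True)

print('Spacing with the endpoint excluded:', step_open)
print('Spacing with the endpoint included:', step_closed)
```

With the endpoint excluded the interval is divided into 10 steps of 25 / 10 = 2.5, while with the endpoint included it is divided into 9 steps of 25 / 9.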

We can use these functions to create rank 2 ndarrays of any shape by combining them with the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.reshape()</code> function. The <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.reshape(ndarray, new_shape)</code> function converts the given **ndarray** into the specified **new_shape**. The examples below use the equivalent <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">.reshape()</code> method of the ndarray itself. It is important to note that the **new_shape** should be compatible with the number of elements in the given **ndarray**. For example, you can convert a rank 1 ndarray with 6 elements into a 3 x 2 rank 2 ndarray, or a 2 x 3 rank 2 ndarray, since both of these rank 2 arrays will have a total of 6 elements. However, you can't reshape the rank 1 ndarray with 6 elements into a 3 x 3 rank 2 ndarray, since this rank 2 array will have 9 elements, which is greater than the number of elements in the original ndarray. Let's see some examples:

In [7]:
# Create a rank 1 ndarray with 6 numbers evenly spaced between 0 and 50,
# with 50 excluded. We then reshape it to a 3 x 2 ndarray.
X = np.linspace(0, 50, 6, endpoint=False).reshape(3, 2)

# Print X
print()
print('X = \n', X)
print()

# Print information about X
print('X has dimensions:', X.shape)
print('X is an object of type:', type(X))
print('The elements in X are of type:', X.dtype)


X = 
 [[ 0.          8.33333333]
 [16.66666667 25.        ]
 [33.33333333 41.66666667]]

X has dimensions: (3, 2)
X is an object of type: <class 'numpy.ndarray'>
The elements in X are of type: float64


In [9]:
# Alternatively, you can do the same thing in separate steps

# 1. Create a rank 1 ndarray with 6 numbers evenly spaced between 0 and 50,
# with 50 excluded.
X = np.linspace(0, 50, 6, endpoint=False)

# 2. Reshape it to a 3 x 2 ndarray.
X = X.reshape(3, 2)

# Print X
print()
print('X = \n', X)
print()

# Print information about X
print('X has dimensions:', X.shape)
print('X is an object of type:', type(X))
print('The elements in X are of type:', X.dtype)


X = 
 [[ 0.          8.33333333]
 [16.66666667 25.        ]
 [33.33333333 41.66666667]]

X has dimensions: (3, 2)
X is an object of type: <class 'numpy.ndarray'>
The elements in X are of type: float64
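
The compatibility rule for reshaping described above can be sketched as follows. This is only an illustration, using a small ndarray built with `np.arange(6)` and the hypothetical names `a` and `b`:

```python
import numpy as np

x = np.arange(6)  # a rank 1 ndarray with 6 elements

# Both shapes hold exactly 6 elements, so these reshapes work
a = x.reshape(3, 2)
b = np.reshape(x, (2, 3))
print('a has dimensions:', a.shape)
print('b has dimensions:', b.shape)

# A 3 x 3 ndarray needs 9 elements, so this raises a ValueError
try:
    x.reshape(3, 3)
except ValueError as err:
    print('Reshape failed:', err)
```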


The last type of ndarrays we are going to create are **random ndarrays**. Random ndarrays are arrays that contain random numbers. Often in Machine Learning, you need to create random matrices, for example, when initializing the weights of a Neural Network. NumPy offers a variety of random functions to help us create random ndarrays of any shape.

Let's start by using the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.random.random(shape)</code> function to create an ndarray of the given **shape** with random floats in the half-open interval [0.0, 1.0).

In [None]:
# Create a 3 x 3 ndarray with random floats in the half-open interval [0.0, 1.0).
X = np.random.random((3,3))

# Print X
print()
print('X = \n', X)
print()

# Print information about X
print('X has dimensions:', X.shape)
print('X is an object of type:', type(X))
print('The elements in X are of type:', X.dtype)

NumPy also allows us to create ndarrays with random integers within a particular interval. The function <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.random.randint(start, stop, size = shape)</code> creates an ndarray of the given **shape** with random integers in the half-open interval **[start, stop)**. Let's see an example:

In [11]:
# Create a 3 x 2 ndarray with random integers in the half-open interval [4, 15).
X = np.random.randint(4,15,size=(3,2))

# Print X
print()
print('X = \n', X)
print()

# Print information about X
print('X has dimensions:', X.shape)
print('X is an object of type:', type(X))
print('The elements in X are of type:', X.dtype)


X = 
 [[11 10]
 [ 6  8]
 [ 6 10]]

X has dimensions: (3, 2)
X is an object of type: <class 'numpy.ndarray'>
The elements in X are of type: int32


In some cases, you may need to create ndarrays with random numbers that satisfy certain statistical properties. For example, you may want the random numbers in the ndarray to have an average of 0. NumPy allows you to create random ndarrays with numbers drawn from various probability distributions. The function <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.random.normal(mean, standard deviation, size=shape)</code>, for example, creates an ndarray with the given **shape** that contains random numbers picked from a normal (Gaussian) distribution with the given mean and standard deviation. Let's create a 1,000 x 1,000 ndarray of random floating point numbers drawn from a normal distribution with a mean (average) of zero and a standard deviation of 0.1.
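
A minimal sketch of that call might look as follows. No random seed is set here, so the exact values will differ between runs, and the sample mean and standard deviation will only be close to 0 and 0.1 rather than exactly equal to them.

```python
import numpy as np

# Draw 1,000 x 1,000 samples from a normal distribution
# with a mean of 0 and a standard deviation of 0.1
X = np.random.normal(0, 0.1, size=(1000, 1000))

# Print information about X
print('X has dimensions:', X.shape)
print('The elements in X are of type:', X.dtype)
print('The elements in X have a mean of:', X.mean())
print('The elements in X have a standard deviation of:', X.std())
print('The maximum value in X is:', X.max())
print('The minimum value in X is:', X.min())
```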

## 7. Accessing, Deleting, and Inserting Elements Into ndarrays

We will now see how NumPy allows us to effectively manipulate the data within the ndarrays. NumPy ndarrays are **mutable**, meaning that the elements in ndarrays can be changed after the ndarray has been created. NumPy ndarrays can also be **sliced**, which means that ndarrays can be split in many different ways. This allows us, for example, to retrieve any subset of the ndarray that we want. Often in Machine Learning you will use slicing to separate data, as for example when dividing a data set into training, cross validation, and testing sets.

We will start by looking at how the elements of an ndarray can be **accessed** or modified by **indexing**. Elements can be accessed using indices inside square brackets, <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">[ ]</code>. NumPy allows you to use both positive and negative indices to access elements in the ndarray. Positive indices are used to access elements from the beginning of the array, while negative indices are used to access elements from the end of the array. Let's see how we can access elements in rank 1 ndarrays:

In [14]:
# Create a rank 1 ndarray that contains integers from 1 to 5
x = np.array([1, 2, 3, 4, 5])

# Print x
print()
print('x = ', x)
print()

# Access some elements with positive indices
print('This is First Element in x:', x[0]) 
print('This is Second Element in x:', x[1])
print('This is Fifth (Last) Element in x:', x[4])
print()

# Access the same elements with negative indices
print('This is First Element in x:', x[-5])
print('This is Second Element in x:', x[-4])
print('This is Fifth (Last) Element in x:', x[-1])


x =  [1 2 3 4 5]

This is First Element in x: 1
This is Second Element in x: 2
This is Fifth (Last) Element in x: 5

This is First Element in x: 1
This is Second Element in x: 2
This is Fifth (Last) Element in x: 5


Notice that to access the **first element** in the ndarray we have to use the **index 0** not 1. Also notice, that the same element can be accessed using both positive and negative indices. As mentioned earlier, positive indices are used to access elements from the beginning of the array, while negative indices are used to access elements from the end of the array.
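
The relationship between the two kinds of indices can be sketched in a short loop. The helper variable `n` is introduced here only for illustration:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
n = len(x)

# For every valid negative index -k, x[-k] is the same element as x[n - k]
for k in range(1, n + 1):
    print('x[-%d] = %d and x[%d] = %d' % (k, x[-k], n - k, x[n - k]))
```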

Now let's see how we can change the elements in rank 1 ndarrays. We do this by accessing the element we want to change and then using the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">=</code> sign to assign the new value:

In [15]:
# Create a rank 1 ndarray that contains integers from 1 to 5
x = np.array([1, 2, 3, 4, 5])

# Print the original x
print()
print('Original:\n x = ', x)
print()

# Change the fourth element in x from 4 to 20
x[3] = 20

# Print x after it was modified 
print('Modified:\n x = ', x)


Original:
 x =  [1 2 3 4 5]

Modified:
 x =  [ 1  2  3 20  5]
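
Because all elements of an ndarray share one dtype, an assigned value is converted to that dtype. As a small sketch, assuming an integer ndarray, assigning a float silently drops the fractional part:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])   # integer dtype inferred from the list

# Assigning a float to an integer ndarray drops the fractional part
x[3] = 20.7

print('x = ', x)
print('The elements in x are of type:', x.dtype)
```

If the fractional part matters, one option is to create the ndarray with a floating point dtype in the first place.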


Similarly, we can also access and modify specific elements of rank 2 ndarrays. To access elements in rank 2 ndarrays we need to provide 2 indices in the form <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">[row, column]</code>. Let's see some examples:

In [10]:
# Create a 3 x 3 rank 2 ndarray that contains integers from 1 to 9
X = np.array([[1,2,3],[4,5,6],[7,8,9]])

# Print X
print()
print('X = \n', X)
print()

# Access some elements in X
print('This is (0,0) Element in X:', X[0,0])
print('This is (0,1) Element in X:', X[0,1])
print('This is (2,2) Element in X:', X[2,2])


X = 
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

This is (0,0) Element in X: 1
This is (0,1) Element in X: 2
This is (2,2) Element in X: 9


Remember that the index <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">[0, 0]</code> refers to the element in the first row, first column.

Elements in rank 2 ndarrays can be modified in the same way as with rank 1 ndarrays. 

In [11]:
# Create a 3 x 3 rank 2 ndarray that contains integers from 1 to 9
X = np.array([[1,2,3],[4,5,6],[7,8,9]])

# Print the original X
print()
print('Original:\n X = \n', X)
print()

# We change the (0,0) element in X from 1 to 20
X[0,0] = 20

# Print X after it was modified 
print('Modified:\n X = \n', X)


Original:
 X = 
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

Modified:
 X = 
 [[20  2  3]
 [ 4  5  6]
 [ 7  8  9]]


Now, let's take a look at how we can add and delete elements from ndarrays. We can delete elements using the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.delete(ndarray, elements, axis)</code> function. This function returns a new ndarray in which the entries at the given **elements** (an index or a list of indices) have been **deleted** from the given **ndarray** along the specified **axis**; the original ndarray is not modified. For rank 1 ndarrays the **axis** keyword is not required. For rank 2 ndarrays, **axis = 0** is used to select rows, and **axis = 1** is used to select columns. Let's see some examples:

In [12]:
# Create a rank 1 ndarray
x = np.array([1, 2, 3, 4, 5])

# Create a rank 2 ndarray
Y = np.array([[1,2,3],[4,5,6],[7,8,9]])

# Print x
print()
print('Original x = ', x)

# Delete the first and last element of x
x = np.delete(x, [0,4])

# Print x with the first and last element deleted
print()
print('Modified x = ', x)

# Print Y
print()
print('Original Y = \n', Y)

# Delete the first row of Y
w = np.delete(Y, 0, axis=0)

# Delete the first and last column of Y
v = np.delete(Y, [0,2], axis=1)

# Print w
print()
print('w = \n', w)

# Print v
print()
print('v = \n', v)


Original x =  [1 2 3 4 5]

Modified x =  [2 3 4]

Original Y = 
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

w = 
 [[4 5 6]
 [7 8 9]]

v = 
 [[2]
 [5]
 [8]]
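
Two details of the example above are easy to miss: the second argument is a position rather than a value, and the function returns a new ndarray instead of changing its input. The following is a minimal sketch of both points; the names `a` and `b` are only illustrative.

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50])

# The second argument is an index: this removes the element at position 1 (the value 20)
b = np.delete(a, 1)

# The input array is left untouched; np.delete returns a new ndarray
print('a =', a)
print('b =', b)
```

This is why the examples above assign the result back to a variable, as in `x = np.delete(x, [0,4])`.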


Now, let's see how we can **append** elements to ndarrays. We can append elements to ndarrays using the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.append(ndarray, elements, axis)</code> function. This function appends the given list of **elements** to **ndarray** along the specified **axis**. Let's see some examples:

In [14]:
# Create a rank 1 ndarray 
x = np.array([1, 2, 3, 4, 5])

# Create a rank 2 ndarray 
Y = np.array([[1,2,3],[4,5,6]])

# Print x
print()
print('Original x = ', x)

# Append the integer 6 to x
x = np.append(x, 6)

# Print x
print()
print('x = ', x)

# Append the integers 7 and 8 to x
x = np.append(x, [7,8])

# Print x
print()
print('x = ', x)

# Print Y
print()
print('Original Y = \n', Y)

# Append a new row containing 7,8,9 to Y
v = np.append(Y, [[7,8,9]], axis=0)

# Append a new column containing 9 and 10 to Y
q = np.append(Y, [[9],[10]], axis=1)

# We print v
print()
print('v = \n', v)

# We print q
print()
print('q = \n', q)


Original x =  [1 2 3 4 5]

x =  [1 2 3 4 5 6]

x =  [1 2 3 4 5 6 7 8]

Original Y = 
 [[1 2 3]
 [4 5 6]]

v = 
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

q = 
 [[ 1  2  3  9]
 [ 4  5  6 10]]


**Notice**: when appending rows or columns to rank 2 ndarrays the rows or columns must have the correct shape, so as to match the shape of the rank 2 ndarray.
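
To illustrate the note above, here is a small sketch of what happens when the shape does not match, and when the **axis** keyword is left out. The variable names `flat` and `mismatch_raised` are only illustrative.

```python
import numpy as np

Y = np.array([[1, 2, 3], [4, 5, 6]])

# Without the axis keyword, np.append flattens both inputs before joining them
flat = np.append(Y, [[7, 8, 9]])
print('flat =', flat)

# With axis=0 the new row must have as many columns as Y (3), so a 2-element row fails
try:
    np.append(Y, [[7, 8]], axis=0)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print('ValueError:', err)
```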

Now let's see how we can insert values into ndarrays. We can insert values into ndarrays using the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.insert(ndarray, index, elements, axis)</code> function. This function inserts the given list of **elements** into **ndarray** right before the given **index** along the specified axis. Let's see some examples:

In [22]:
# Create a rank 1 ndarray 
x = np.array([1, 2, 5, 6, 7])

# Create a rank 2 ndarray 
Y = np.array([[1,2,3],[7,8,9]])

# Print x
print()
print('Original x = ', x)

# Insert the integers 3 and 4 between 2 and 5 in x.
x = np.insert(x,2,[3,4])

# Print x with the inserted elements
print()
print('x = ', x)

# Print Y
print()
print('Original Y = \n', Y)

# Insert a row between the first and last row of Y
w = np.insert(Y, 1, [4,5,6], axis=0)

# Insert a column full of 5s between the first and second column of Y
v = np.insert(Y, 1, 5, axis=1)

# Print w
print()
print('w = \n', w)

# Print v
print()
print('v = \n', v)


Original x =  [1 2 5 6 7]

x =  [1 2 3 4 5 6 7]

Original Y = 
 [[1 2 3]
 [7 8 9]]

w = 
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

v = 
 [[1 5 2 3]
 [7 5 8 9]]


NumPy also allows us to stack ndarrays on top of each other, or to stack them side by side. The stacking is done using either the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.vstack()</code> function for vertical stacking, or the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.hstack()</code> function for horizontal stacking. It is important to note that in order to stack ndarrays, their shapes must be compatible: for rank 2 ndarrays, vertical stacking requires the same number of columns, and horizontal stacking requires the same number of rows. Let's see some examples:

In [25]:
# Create a rank 1 ndarray 
x = np.array([1,2])

# Create a rank 2 ndarray 
Y = np.array([[3,4],[5,6]])

# Print x
print()
print('x = ', x)

# Print Y
print()
print('Y = \n', Y)

# Stack x on top of Y
z = np.vstack((x,Y))

# Stack x on the right of Y. We need to reshape x in order to stack it on the right of Y. 
w = np.hstack((Y,x.reshape(2,1)))

# Print z
print()
print('z = \n', z)

# Print w
print()
print('w = \n', w)


x =  [1 2]

Y = 
 [[3 4]
 [5 6]]

z = 
 [[1 2]
 [3 4]
 [5 6]]

w = 
 [[3 4 1]
 [5 6 2]]
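
As a quick check of the shape requirement, the sketch below tries to stack a 3-element ndarray on top of a 2 x 2 ndarray, which fails, and then stacks a 2 x 1 column to the right of it, which works. The names `col` and `stacked` are only illustrative.

```python
import numpy as np

x = np.array([1, 2, 3])         # shape (3,)
Y = np.array([[3, 4], [5, 6]])  # shape (2, 2)

# Vertical stacking needs the same number of columns: 3 vs 2 does not match
try:
    np.vstack((x, Y))
    stacked = True
except ValueError as err:
    stacked = False
    print('ValueError:', err)

# Horizontal stacking needs the same number of rows: (2, 2) and (2, 1) match
col = np.array([[7], [8]])
w = np.hstack((Y, col))
print('w = \n', w)
```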


## 8. Slicing ndarrays

In addition to being able to access individual elements one at a time, NumPy provides a way to **access subsets of ndarrays**. This is known as **slicing**. Slicing is performed by combining indices with the colon : symbol inside the square brackets. In general you will come across three types of slicing:

1. <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">ndarray[start:end]</code>
2. <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">ndarray[start:]</code>
3. <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">ndarray[:end]</code>

* The first method is used to select elements between the **start** and **end** indices. 
* The second method is used to select all elements from the **start** index till the **last** index.
* The third method is used to select all elements from the **first** index till the **end** index.

We should note that in methods one and three, the **end index is excluded**. We should also note that since ndarrays can be multidimensional, when doing slicing you usually have to specify a slice for each dimension of the array.
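
Before moving to rank 2 ndarrays, the three methods can be sketched on a rank 1 ndarray, where only one slice is needed. This is a minimal illustration; the array `x` is just an example.

```python
import numpy as np

x = np.arange(10)  # the integers 0 through 9

# Method 1: start:end selects indices 2, 3 and 4 (the end index 5 is excluded)
print('x[2:5] =', x[2:5])

# Method 2: start: selects from index 7 through the last element
print('x[7:]  =', x[7:])

# Method 3: :end selects from the first element up to, but not including, index 3
print('x[:3]  =', x[:3])
```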

We will now see some examples of how to use the above methods to select different subsets of a rank 2 ndarray.

In [26]:
# Create a 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)

# Print X
print()
print('X = \n', X)
print()

# Select all the elements that are in the 2nd through 4th rows and in the 3rd to 5th columns
Z = X[1:4,2:5]

# Print Z
print('Z = \n', Z)

# We can select the same elements as above using method 2
W = X[1:,2:5]

# Print W
print()
print('W = \n', W)

# Select all the elements that are in the 1st through 3rd rows and in the 3rd to 5th columns
Y = X[:3,2:5]

# Print Y
print()
print('Y = \n', Y)

# Select all the elements in the 3rd row
v = X[2,:]

# Print v
print()
print('v = ', v)

# Select all the elements in the 3rd column
q = X[:,2]

# Print q
print()
print('q = ', q)

# Select all the elements in the 3rd column but return a rank 2 ndarray
R = X[:,2:3]

# Print R
print()
print('R = \n', R)


X = 
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]

Z = 
 [[ 7  8  9]
 [12 13 14]
 [17 18 19]]

W = 
 [[ 7  8  9]
 [12 13 14]
 [17 18 19]]

Y = 
 [[ 2  3  4]
 [ 7  8  9]
 [12 13 14]]

v =  [10 11 12 13 14]

q =  [ 2  7 12 17]

R = 
 [[ 2]
 [ 7]
 [12]
 [17]]


Notice that when we selected all the elements in the 3rd column, variable **q** above, the slice returned a rank 1 ndarray instead of a rank 2 ndarray. However, slicing **X** in a slightly different way, variable **R** above, we can actually get a rank 2 ndarray instead.

---

### * Extended Reading

It is important to note that when we perform slices on ndarrays and save them into new variables, as we did above, the data is not copied into the new variable. This is one feature that often causes confusion for beginners. Therefore, we will look at this in a bit more detail.

In the above examples, when we make assignments, such as:

In [27]:
Z = X[1:4,2:5]

the slice of the original array **X** is not copied into the variable **Z**. Rather, **Z** is a new ndarray object that refers to the same underlying data as the selected part of **X**. We say that slicing only creates a view of the original array. This means that if you make changes in **Z** you will in effect be changing the elements in **X** as well. Let's see this with an example:

In [29]:
# Create a 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)

# Print X
print()
print('X = \n', X)
print()

# Select all the elements that are in the 2nd through 4th rows and in the 3rd to 5th columns
Z = X[1:4,2:5]

# Print Z
print()
print('Z = \n', Z)
print()

# Change the last element in Z to 555
Z[2,2] = 555

# Print X
print()
print('X = \n', X)
print()


X = 
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]


Z = 
 [[ 7  8  9]
 [12 13 14]
 [17 18 19]]


X = 
 [[  0   1   2   3   4]
 [  5   6   7   8   9]
 [ 10  11  12  13  14]
 [ 15  16  17  18 555]]



We can clearly see in the above example that if we make changes to **Z**, **X** changes as well.

However, if we want to create a new ndarray that contains a copy of the values in the slice we need to use the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.copy()</code> function. The <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.copy(ndarray)</code> function creates a copy of the given **ndarray**. This function can also be used as a method, in the same way as we did before with the reshape function. Let's do the same example we did before but now with copies of the arrays. We'll use **copy** both as a function and as a method.

In [30]:
# Create a 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)

# Print X
print()
print('X = \n', X)
print()

# Create a copy of the slice using the np.copy() function
Z = np.copy(X[1:4,2:5])

# Create a copy of the slice using the copy as a method
W = X[1:4,2:5].copy()

# Change the last element in Z to 555
Z[2,2] = 555

# Change the last element in W to 444
W[2,2] = 444

# Print X
print()
print('X = \n', X)

# Print Z
print()
print('Z = \n', Z)

# Print W
print()
print('W = \n', W)


X = 
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]


X = 
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]

Z = 
 [[  7   8   9]
 [ 12  13  14]
 [ 17  18 555]]

W = 
 [[  7   8   9]
 [ 12  13  14]
 [ 17  18 444]]
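
If it is unclear whether a variable holds a view or a copy, one way to check is the `np.shares_memory()` function, which reports whether two ndarrays overlap in memory. The sketch below is a minimal illustration; the names `Z_view` and `Z_copy` are only illustrative.

```python
import numpy as np

X = np.arange(20).reshape(4, 5)

# A plain slice is a view on the data of X
Z_view = X[1:4, 2:5]

# Calling copy() produces an independent ndarray
Z_copy = X[1:4, 2:5].copy()

print('View shares memory with X:', np.shares_memory(X, Z_view))
print('Copy shares memory with X:', np.shares_memory(X, Z_copy))
```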


---

NumPy also offers built-in functions to select specific elements within ndarrays. For example, the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.diag(ndarray, k=N)</code> function extracts the elements along the **diagonal** defined by **N**. The default is **k=0**, which refers to the main diagonal. Values of **k > 0** are used to select elements in diagonals above the main diagonal, and values of **k < 0** are used to select elements in diagonals below the main diagonal. Let's see an example:

In [31]:
# Create a 5 x 5 ndarray that contains integers from 0 to 24
X = np.arange(25).reshape(5, 5)

# Print X
print()
print('X = \n', X)
print()

# Print the elements in the main diagonal of X
print('z =', np.diag(X))
print()

# Print the elements above the main diagonal of X
print('y =', np.diag(X, k=1))
print()

# Print the elements below the main diagonal of X
print('w = ', np.diag(X, k=-1))


X = 
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]

z = [ 0  6 12 18 24]

y = [ 1  7 13 19]

w =  [ 5 11 17 23]


It is often useful to extract only the unique elements in an ndarray. We can find the unique elements in an ndarray by using the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.unique()</code> function. The <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.unique(ndarray)</code> function returns the **unique** elements in the given **ndarray**, as in the example below:

In [32]:
# Create 3 x 3 ndarray with repeated values
X = np.array([[1,2,3],[5,2,8],[1,2,3]])

# Print X
print()
print('X = \n', X)
print()

# Print the unique elements of X 
print('The unique elements in X are:',np.unique(X))


X = 
 [[1 2 3]
 [5 2 8]
 [1 2 3]]

The unique elements in X are: [1 2 3 5 8]
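
Often we also want to know how many times each unique value occurs. As a small sketch, `np.unique()` accepts a `return_counts=True` keyword that returns the counts alongside the sorted unique values; the names `values` and `counts` are only illustrative.

```python
import numpy as np

X = np.array([[1, 2, 3], [5, 2, 8], [1, 2, 3]])

# values holds the sorted unique elements, counts holds how often each one appears
values, counts = np.unique(X, return_counts=True)

print('values =', values)
print('counts =', counts)
```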


## 9. Boolean Indexing, Set Operations, and Sorting

Up to now we have seen how to make slices and select elements of an ndarray using indices. This is useful when we know the exact indices of the elements we want to select. However, there are many situations in which **we don't know the indices of the elements we want to select**. For example, suppose we have a 10,000 x 10,000 ndarray of random integers ranging from 1 to 15,000 and we only want to select those integers that are less than 20. Boolean indexing can help us in these cases, by allowing us to select elements using logical arguments instead of explicit indices. Let's see some examples:

In [34]:
# Create a 5 x 5 ndarray that contains integers from 0 to 24
X = np.arange(25).reshape(5, 5)

# Print X
print()
print('Original X = \n', X)
print()

# Use Boolean indexing to select elements in X:
print('The elements in X that are greater than 10:', X[X > 10])
print('The elements in X that are less than or equal to 7:', X[X <= 7])
print('The elements in X that are between 10 and 17:', X[(X > 10) & (X < 17)])

# Use Boolean indexing to assign the elements that are between 10 and 17 the value of -1
X[(X > 10) & (X < 17)] = -1

# Print X
print()
print('X = \n', X)
print()


Original X = 
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]

The elements in X that are greater than 10: [11 12 13 14 15 16 17 18 19 20 21 22 23 24]
The elements in X that are less than or equal to 7: [0 1 2 3 4 5 6 7]
The elements in X that are between 10 and 17: [11 12 13 14 15 16]

X = 
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 -1 -1 -1 -1]
 [-1 -1 17 18 19]
 [20 21 22 23 24]]
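
It may help to look at what the expression inside the square brackets produces on its own. The sketch below shows that a comparison such as `X > 10` returns an ndarray of Booleans with the same shape as **X**, and that indexing with it returns a rank 1 ndarray of the selected elements. The names `mask` and `selected` are only illustrative.

```python
import numpy as np

X = np.arange(25).reshape(5, 5)

# The comparison is applied element-wise and yields a Boolean ndarray
mask = X > 10
print('mask dtype and shape:', mask.dtype, mask.shape)

# Count how many elements satisfy the condition
print('number of True entries:', np.count_nonzero(mask))

# Indexing with the mask returns the matching elements as a rank 1 ndarray
selected = X[mask]
print('selected =', selected)
```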



---

### * Extended Reading
In addition to Boolean indexing, NumPy also allows for set operations. This is useful when comparing ndarrays, for example, to find common elements between two ndarrays. Let's see some examples:

In [35]:
# Create a rank 1 ndarray
x = np.array([1,2,3,4,5])

# Create a rank 1 ndarray
y = np.array([6,7,2,8,4])

# Print x
print()
print('x = ', x)

# Print y
print()
print('y = ', y)

# Use set operations to compare x and y:
print()
print('The elements that are both in x and y:', np.intersect1d(x,y))
print('The elements that are in x that are not in y:', np.setdiff1d(x,y))
print('All the elements of x and y:',np.union1d(x,y))


x =  [1 2 3 4 5]

y =  [6 7 2 8 4]

The elements that are both in x and y: [2 4]
The elements that are in x that are not in y: [1 3 5]
All the elements of x and y: [1 2 3 4 5 6 7 8]
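
The set functions above return sorted values. When the positions of the common elements matter, one option is `np.isin()`, which returns a Boolean ndarray that can be used for Boolean indexing. This is a minimal sketch; the name `in_y` is only illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([6, 7, 2, 8, 4])

# For each element of x, test whether it also appears in y
in_y = np.isin(x, y)
print('in_y =', in_y)

# Use the Boolean result to select the common elements from x
print('x[in_y] =', x[in_y])
```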


---

We can also sort ndarrays in NumPy. We will learn how to use the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.sort()</code> function to sort rank 1 and rank 2 ndarrays in different ways. Like with other functions we saw before, the **sort** function can also be used as a method. However, there is a big difference in how the data is stored in memory in this case. When <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.sort()</code> is used as a function, it sorts the ndarray out of place, meaning that it doesn't change the original ndarray being sorted.

In [36]:
# Create an unsorted rank 1 ndarray
x = np.random.randint(1,11,size=(10,))

# Print x
print()
print('Original x = ', x)

# Sort x and print the sorted array using sort as a function.
print()
print('Sorted x (out of place):', np.sort(x))

# When we sort out of place the original array remains intact. To see this we print x again
print()
print('x after sorting:', x)


Original x =  [ 6  3  1  8  4  5 10  7  6  5]

Sorted x (out of place): [ 1  3  4  5  5  6  6  7  8 10]

x after sorting: [ 6  3  1  8  4  5 10  7  6  5]


In [39]:
# Assign the result back to x to keep the sorted rank 1 ndarray
x = np.sort(x)
print('x after sorting (stored):', x)

x after sorting (stored): [ 1  3  4  5  5  6  6  7  8 10]


However, when you use **sort** as a method, <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">ndarray.sort()</code> sorts the ndarray in place, meaning, that the original array will be changed to the sorted one. Let's see some examples:

In [41]:
# Create an unsorted rank 1 ndarray
x = np.random.randint(1,11,size=(10,))

# Print x
print()
print('Original x = ', x)

# Sort x and print the sorted array using sort as a method.
x.sort()

# When we sort in place the original array is changed to the sorted array. To see this we print x again
print()
print('x after sorting:', x)


Original x =  [ 7  6  6  6  8  7 10 10  2  2]

x after sorting: [ 2  2  6  6  6  7  7  8 10 10]


When sorting rank 2 ndarrays, we need to specify to the <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.sort()</code> function whether we are sorting by rows or columns. This is done by using the **axis** keyword. Let's see some examples:

In [40]:
# Create an unsorted rank 2 ndarray
X = np.random.randint(1,11,size=(5,5))

# Print X
print()
print('Original X = \n', X)
print()

# Sort the columns of X and print the sorted array
print()
print('X with sorted columns :\n', np.sort(X, axis = 0))

# Sort the rows of X and print the sorted array
print()
print('X with sorted rows :\n', np.sort(X, axis = 1))


Original X = 
 [[ 6  3  7  8  2]
 [ 5 10 10  8  7]
 [ 3  1  7  3  3]
 [ 2 10  6  7  8]
 [ 8  3  7  6  1]]


X with sorted columns :
 [[ 2  1  6  3  1]
 [ 3  3  7  6  2]
 [ 5  3  7  7  3]
 [ 6 10  7  8  7]
 [ 8 10 10  8  8]]

X with sorted rows :
 [[ 2  3  6  7  8]
 [ 5  7  8 10 10]
 [ 1  3  3  3  7]
 [ 2  6  7  8 10]
 [ 1  3  6  7  8]]
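
Two related tasks come up often when sorting: sorting in descending order, and finding the indices that would sort an ndarray. As a sketch, the first can be done by reversing the sorted result with a slice, and the second with `np.argsort()`. The names `descending` and `order` are only illustrative.

```python
import numpy as np

x = np.array([6, 3, 1, 8, 4])

# np.sort always sorts in ascending order; reversing the result gives descending order
descending = np.sort(x)[::-1]
print('descending =', descending)

# np.argsort returns the indices that would sort x in ascending order
order = np.argsort(x)
print('order =', order)
print('x[order] =', x[order])
```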


## 10. Arithmetic operations and Broadcasting

NumPy allows element-wise operations on ndarrays as well as matrix operations. In this lesson we will only be looking at element-wise operations on ndarrays. In order to do element-wise operations, NumPy sometimes uses something called Broadcasting. Broadcasting is the term used to describe how NumPy handles element-wise arithmetic operations with ndarrays of different shapes. For example, broadcasting is used implicitly when doing arithmetic operations between scalars and ndarrays.

Let's start by doing element-wise addition, subtraction, multiplication, and division between ndarrays. NumPy offers two ways to do this: a functional approach, using functions such as <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">np.add()</code>, and arithmetic symbols, such as <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">+</code>, which more closely resemble how we write mathematical equations. Both forms perform the same operation; the only difference is that the functions usually have options that you can tweak using keywords. It is important to note that when performing element-wise operations, the ndarrays being operated on must have the same shape or be broadcastable. We'll explain more about this later in this lesson. Let's start by performing element-wise arithmetic operations on rank 1 ndarrays:

In [42]:
# Create two rank 1 ndarrays
x = np.array([1,2,3,4])
y = np.array([5.5,6.5,7.5,8.5])

# Print x
print()
print('x = ', x)

# Print y
print()
print('y = ', y)
print()

# Perform basic element-wise operations using arithmetic symbols and functions
print('x + y = ', x + y)
print('add(x,y) = ', np.add(x,y))
print()
print('x - y = ', x - y)
print('subtract(x,y) = ', np.subtract(x,y))
print()
print('x * y = ', x * y)
print('multiply(x,y) = ', np.multiply(x,y))
print()
print('x / y = ', x / y)
print('divide(x,y) = ', np.divide(x,y))


x =  [1 2 3 4]

y =  [5.5 6.5 7.5 8.5]

x + y =  [ 6.5  8.5 10.5 12.5]
add(x,y) =  [ 6.5  8.5 10.5 12.5]

x - y =  [-4.5 -4.5 -4.5 -4.5]
subtract(x,y) =  [-4.5 -4.5 -4.5 -4.5]

x * y =  [ 5.5 13.  22.5 34. ]
multiply(x,y) =  [ 5.5 13.  22.5 34. ]

x / y =  [0.18181818 0.30769231 0.4        0.47058824]
divide(x,y) =  [0.18181818 0.30769231 0.4        0.47058824]
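
As one example of the keyword options mentioned earlier, the function forms accept an `out` keyword that writes the result into an existing ndarray instead of allocating a new one. The sketch below assumes a preallocated float buffer named `result`, which is only an illustrative name.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([5.5, 6.5, 7.5, 8.5])

# Preallocate a float ndarray with the same number of elements as x and y
result = np.empty(4)

# Write x + y directly into result instead of creating a new ndarray
np.add(x, y, out=result)
print('result =', result)
```

This option has no equivalent when using the `+` symbol, which always returns a new ndarray.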


We can also perform the same element-wise arithmetic operations on **rank 2 ndarrays**. Again, remember that in order to do these operations the ndarrays being operated on must have the same shape or be broadcastable.

In [43]:
# Create two rank 2 ndarrays
X = np.array([1,2,3,4]).reshape(2,2)
Y = np.array([5.5,6.5,7.5,8.5]).reshape(2,2)

# Print X
print()
print('X = \n', X)

# Print Y
print()
print('Y = \n', Y)
print()

# Perform basic element-wise operations using arithmetic symbols and functions
print('X + Y = \n', X + Y)
print()
print('add(X,Y) = \n', np.add(X,Y))
print()
print('X - Y = \n', X - Y)
print()
print('subtract(X,Y) = \n', np.subtract(X,Y))
print()
print('X * Y = \n', X * Y)
print()
print('multiply(X,Y) = \n', np.multiply(X,Y))
print()
print('X / Y = \n', X / Y)
print()
print('divide(X,Y) = \n', np.divide(X,Y))


X = 
 [[1 2]
 [3 4]]

Y = 
 [[5.5 6.5]
 [7.5 8.5]]

X + Y = 
 [[ 6.5  8.5]
 [10.5 12.5]]

add(X,Y) = 
 [[ 6.5  8.5]
 [10.5 12.5]]

X - Y = 
 [[-4.5 -4.5]
 [-4.5 -4.5]]

subtract(X,Y) = 
 [[-4.5 -4.5]
 [-4.5 -4.5]]

X * Y = 
 [[ 5.5 13. ]
 [22.5 34. ]]

multiply(X,Y) = 
 [[ 5.5 13. ]
 [22.5 34. ]]

X / Y = 
 [[0.18181818 0.30769231]
 [0.4        0.47058824]]

divide(X,Y) = 
 [[0.18181818 0.30769231]
 [0.4        0.47058824]]


We can also apply mathematical functions, such as <code style="color:#fff;background-color:#2f3d48;border-radius: 4px;border: 1px solid #737b83;padding: 2px 4px">sqrt(x)</code>, to all elements of an ndarray at once.

In [44]:
# Create a rank 1 ndarray
x = np.array([1,2,3,4])

# Print x
print()
print('x = ', x)

# Apply different mathematical functions to all elements of x
print()
print('EXP(x) =', np.exp(x))
print()
print('SQRT(x) =',np.sqrt(x))
print()
print('POW(x,2) =',np.power(x,2)) # We raise all elements to the power of 2


x =  [1 2 3 4]

EXP(x) = [ 2.71828183  7.3890561  20.08553692 54.59815003]

SQRT(x) = [1.         1.41421356 1.73205081 2.        ]

POW(x,2) = [ 1  4  9 16]


Another great feature of NumPy is that it has a wide variety of statistical functions. Statistical functions provide us with statistical information about the elements in an ndarray. Let's see some examples:

In [45]:
# Create a 2 x 2 ndarray
X = np.array([[1,2], [3,4]])

# Print X
print()
print('X = \n', X)
print()

print('Average of all elements in X:', X.mean())
print('Average of all elements in the columns of X:', X.mean(axis=0))
print('Average of all elements in the rows of X:', X.mean(axis=1))
print()
print('Sum of all elements in X:', X.sum())
print('Sum of all elements in the columns of X:', X.sum(axis=0))
print('Sum of all elements in the rows of X:', X.sum(axis=1))
print()
print('Standard Deviation of all elements in X:', X.std())
print('Standard Deviation of all elements in the columns of X:', X.std(axis=0))
print('Standard Deviation of all elements in the rows of X:', X.std(axis=1))
print()
print('Median of all elements in X:', np.median(X))
print('Median of all elements in the columns of X:', np.median(X,axis=0))
print('Median of all elements in the rows of X:', np.median(X,axis=1))
print()
print('Maximum value of all elements in X:', X.max())
print('Maximum value of all elements in the columns of X:', X.max(axis=0))
print('Maximum value of all elements in the rows of X:', X.max(axis=1))
print()
print('Minimum value of all elements in X:', X.min())
print('Minimum value of all elements in the columns of X:', X.min(axis=0))
print('Minimum value of all elements in the rows of X:', X.min(axis=1))


X = 
 [[1 2]
 [3 4]]

Average of all elements in X: 2.5
Average of all elements in the columns of X: [2. 3.]
Average of all elements in the rows of X: [1.5 3.5]

Sum of all elements in X: 10
Sum of all elements in the columns of X: [4 6]
Sum of all elements in the rows of X: [3 7]

Standard Deviation of all elements in X: 1.118033988749895
Standard Deviation of all elements in the columns of X: [1. 1.]
Standard Deviation of all elements in the rows of X: [0.5 0.5]

Median of all elements in X: 2.5
Median of all elements in the columns of X: [2. 3.]
Median of all elements in the rows of X: [1.5 3.5]

Maximum value of all elements in X: 4
Maximum value of all elements in the columns of X: [3 4]
Maximum value of all elements in the rows of X: [2 4]

Minimum value of all elements in X: 1
Minimum value of all elements in the columns of X: [1 2]
Minimum value of all elements in the rows of X: [1 3]
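
Because **X** above is square, it can be hard to tell which axis is being collapsed. The sketch below repeats the idea on a 2 x 3 ndarray, where the shapes of the results make the difference visible; it also uses the `keepdims=True` keyword, which keeps the collapsed axis with length 1. The name `A` is only illustrative.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)  # [[0 1 2], [3 4 5]]

# axis=0 collapses the rows: one result per column, shape (3,)
col_sums = A.sum(axis=0)
print('Column sums:', col_sums, col_sums.shape)

# axis=1 collapses the columns: one result per row, shape (2,)
row_sums = A.sum(axis=1)
print('Row sums:', row_sums, row_sums.shape)

# keepdims=True keeps the result as a rank 2 ndarray of shape (2, 1)
row_means = A.mean(axis=1, keepdims=True)
print('Row means:\n', row_means)
```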


Finally, let's see how NumPy can apply arithmetic operations between a single number and all the elements of an ndarray without the use of complicated loops.

In [46]:
# Create a 2 x 2 ndarray
X = np.array([[1,2], [3,4]])

# Print X
print()
print('X = \n', X)
print()

print('3 * X = \n', 3 * X)
print()
print('3 + X = \n', 3 + X)
print()
print('X - 3 = \n', X - 3)
print()
print('X / 3 = \n', X / 3)


X = 
 [[1 2]
 [3 4]]

3 * X = 
 [[ 3  6]
 [ 9 12]]

3 + X = 
 [[4 5]
 [6 7]]

X - 3 = 
 [[-2 -1]
 [ 0  1]]

X / 3 = 
 [[0.33333333 0.66666667]
 [1.         1.33333333]]


In the examples above, NumPy is working behind the scenes to **broadcast** **3** along the ndarray so that they have the same shape. This allows us to add 3 to each element of **X** with just one line of code.

Subject to certain constraints, NumPy can do the same for two ndarrays of different shapes, as we can see below.

In [47]:
# Create a rank 1 ndarray
x = np.array([1,2,3])

# Create a 3 x 3 ndarray
Y = np.array([[1,2,3],[4,5,6],[7,8,9]])

# Create a 3 x 1 ndarray
Z = np.array([1,2,3]).reshape(3,1)

# Print x
print()
print('x = ', x)
print()

# Print Y
print()
print('Y = \n', Y)
print()

# Print Z
print()
print('Z = \n', Z)
print()

print('x + Y = \n', x + Y)
print()
print('Z + Y = \n',Z + Y)


x =  [1 2 3]


Y = 
 [[1 2 3]
 [4 5 6]
 [7 8 9]]


Z = 
 [[1]
 [2]
 [3]]

x + Y = 
 [[ 2  4  6]
 [ 5  7  9]
 [ 8 10 12]]

Z + Y = 
 [[ 2  3  4]
 [ 6  7  8]
 [10 11 12]]


As before, NumPy is able to add the rank 1 ndarray **x** (treated as a 1 x 3 ndarray) and the 3 x 1 ndarray **Z** to the 3 x 3 ndarray **Y** by broadcasting the smaller ndarrays along the larger one so that they have compatible shapes. In general, NumPy compares the shapes of the two ndarrays dimension by dimension, starting from the last one: two dimensions are compatible when they are equal or when one of them is 1, and a missing leading dimension is treated as 1. If any pair of dimensions is incompatible, NumPy raises a ValueError.
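
A quick way to check whether two ndarrays are compatible is to let NumPy attempt the broadcast and catch the error if it fails. The sketch below does this with `np.broadcast()`; the helper `can_broadcast` is a hypothetical function written for this illustration, not part of NumPy.

```python
import numpy as np

def can_broadcast(a, b):
    """Hypothetical helper: return True if NumPy can broadcast a and b together."""
    try:
        np.broadcast(a, b)
        return True
    except ValueError:
        return False

Y = np.ones((3, 3))

# Shapes (3, 3) and (3,): the trailing dimensions are equal
print(can_broadcast(Y, np.ones(3)))

# Shapes (3, 3) and (3, 1): the trailing dimension of the second ndarray is 1
print(can_broadcast(Y, np.ones((3, 1))))

# Shapes (3, 3) and (2,): trailing dimensions 3 and 2 differ and neither is 1
print(can_broadcast(Y, np.ones(2)))

# The shape of a successful broadcast can be inspected as well
print(np.broadcast(np.ones((4, 1)), np.ones(3)).shape)
```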

Make sure you check out the NumPy Documentation for more information on Broadcasting and its rules:
[Broadcasting](https://numpy.org/doc/stable/user/basics.broadcasting.html)