# PETI8123 Lab 2: NumPy

<!--

Ex 1. np.arange(12, 38)

Ex 2. np.arange(2, 11).reshape(3, 3)

Ex 3. print(a.max(), a.mean())

Ex 4.
np.eye(4) * 5
np.array([[5., 0., 0., 0.],
          [0., 5., 0., 0.],
          [0., 0., 5., 0.],
          [0., 0., 0., 5.]])

Ex 5.
a = np.arange(0, 51, 10).reshape(2, 3)
a[a > 15]

-->

**NumPy** (Numerical Python) is an open-source numerical computing library for Python. It can store and process large matrices far more efficiently than Python's own nested lists (which can also be used to represent matrices). It supports multi-dimensional arrays and matrix operations, and provides a large library of mathematical functions that operate on arrays. These features make it an essential tool for data analysis in Python.

## 1. Modules and Import


A **module** is a code library containing a set of functions you want to include in your application. Modules need to be imported before use.

In [1]:
# This line of code imports the math module in Python,
# which contains various mathematical functions and constants,
# including the ceiling function ``ceil``.
import math

# This line of code calls the ``ceil`` function from the math module
# and passes 8.7 as its argument.
# The ceil function's purpose is to round up the given parameter to
# the nearest integer. In this example, 8.7 will be rounded up to 9.
# Therefore, the result of this line of code is 9.
math.ceil(8.7)

9

There are multiple ways to import a module in Python:

1. ``import <module_name>``
1. ``from <module_name> import <name(s)>``
1. ``import <module_name> as <another_name>``


In [2]:
# Now you don't need to write ``math.`` every time you use ceil()
from math import ceil

# This line of code directly calls the ceil function that was imported from the math module
# and passes 8.7 as its argument.
ceil(8.7)

9
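The third import form in the list above, giving a module an alias, works the same way. Here is a minimal sketch; the alias `m` is an arbitrary name chosen for illustration, not a convention.

```python
# Import the math module under a shorter alias (the name `m` is arbitrary).
import math as m

# The alias is used exactly like the full module name.
rounded_up = m.ceil(8.7)
print(rounded_up)  # 9
```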

## 2. Arrays in NumPy

In [3]:
# Lists are good for storing small amounts of one-dimensional data
# This line of code creates a list named a containing the integers 1, 3, 5, 7, and 9.
a = [1, 3, 5, 7, 9]

# This line of code creates a list named b containing the integers 3, 5, 6, 7, and 9.
b = [3, 5, 6, 7, 9]

# This line of code concatenates (joins) the two lists a and b, creating a new list, and assigns it to the variable c.
# So, the value of c will be [1, 3, 5, 7, 9, 3, 5, 6, 7, 9], which includes all the elements from both a and b.
c = a + b
print(c)

[1, 3, 5, 7, 9, 3, 5, 6, 7, 9]


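Lists do not support element-wise arithmetic: ``+`` concatenates two lists and ``*`` repeats a list, while ``-`` and ``/`` are not defined for lists at all. We need efficient arrays with arithmetic and better multidimensional tools. NumPy arrays are similar to lists but much more capable, except that their size is fixed once they are created.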
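The difference between list operators and array operators can be seen in a short sketch. The variable names below are made up for illustration.

```python
import numpy as np

a_list = [1, 3, 5]
b_list = [2, 4, 6]

# With lists, + concatenates and * repeats; neither is element-wise arithmetic.
joined = a_list + b_list
repeated = a_list * 2
print(joined)    # [1, 3, 5, 2, 4, 6]
print(repeated)  # [1, 3, 5, 1, 3, 5]

# With NumPy arrays, the same operators act element by element.
a_arr = np.array(a_list)
b_arr = np.array(b_list)
summed = a_arr + b_arr
doubled = a_arr * 2
print(summed)   # [ 3  7 11]
print(doubled)  # [ 2  6 10]
```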


### 2.1. Creating NumPy arrays

There are several ways to initialize new NumPy arrays, for example:

1. From Python lists or tuples

2. Using functions that are dedicated to generating NumPy arrays, such as ``arange``, ``linspace``, etc.

3. By reading data from files
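The three routes can be sketched side by side. This is only an illustration: an in-memory text buffer stands in for a real file, and ``np.loadtxt`` is one of several NumPy functions that can read text data.

```python
import io
import numpy as np

# 1. From a Python list or tuple.
from_list = np.array([1, 2, 3])
from_tuple = np.array((4, 5, 6))

# 2. From a dedicated function such as arange.
from_function = np.arange(3)

# 3. From a file. An in-memory text buffer stands in for a real file here.
fake_file = io.StringIO("1 2 3\n4 5 6")
from_file = np.loadtxt(fake_file)

print(from_list, from_tuple, from_function)
print(from_file.shape)  # (2, 3)
```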


NumPy adds a new data structure to Python – **ndarray**

1. An N-dimensional array is a homogeneous collection of “items” indexed using N integers
2. Defined by:
    1. the shape of the array, and
    1. the kind of item the array is composed of

### 2.2. Array Shapes

ndarrays are rectangular: the shape of the array is a tuple of N integers (one for each dimension).

In [4]:
# The convention is to import numpy as np.
# We only need to import it once; after that, it can be used everywhere in the notebook.

import numpy as np

In [7]:
# Now, the functions of NumPy can be accessed from np
a = np.array([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

# a.shape returns the dimensions of a NumPy array a as a tuple,
# representing the size along each axis or dimension of the array.
a.shape


(3, 3)
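To see how the shape tuple grows with the number of dimensions, here is a small sketch. The ``ndim`` attribute (the number of dimensions) is not covered in this lab and appears only for illustration.

```python
import numpy as np

one_d = np.array([1, 2, 3])
two_d = np.array([[1, 2, 3], [4, 5, 6]])
three_d = np.zeros((2, 3, 4))

# The shape tuple has one entry per dimension.
print(one_d.shape, one_d.ndim)      # (3,) 1
print(two_d.shape, two_d.ndim)      # (2, 3) 2
print(three_d.shape, three_d.ndim)  # (2, 3, 4) 3
```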

### 2.3. Array data types (dtype)

Every ndarray is a homogeneous collection of items of exactly the same data type:

1. every item takes up the same size block of memory
2. each block of memory in the array is interpreted in exactly the same way


In [8]:
a = np.array([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

# a.dtype retrieves and returns the data type (e.g., integer, float)
# of the elements within the NumPy array a.
a.dtype


dtype('int64')
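Because every item must share one data type, NumPy picks a common type when the inputs are mixed. A type can also be requested explicitly. The sketch below illustrates both; note that the default integer type depends on the platform, so it is not asserted here.

```python
import numpy as np

# Mixing integers and a float: every item is stored as a float.
mixed = np.array([1, 2.5, 3])

# Requesting a specific type with the dtype argument.
as_float32 = np.array([1, 2, 3], dtype=np.float32)

print(mixed.dtype)          # float64
print(as_float32.dtype)     # float32
print(as_float32.itemsize)  # 4 (bytes per item)
```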

### 2.4. Some ndarray methods

1. ``ndarray.tolist()``
    * Returns the content of the array as Python list(s)

1. ``ndarray.fill(scalar)``
    * Fills an array with the scalar value

1. ``ndarray.min()``
    * Finds the minimum value in the array

1. ``ndarray.max()``
    * Finds the maximum value in the array

1. ``ndarray.mean()``
    * Computes the mean of the values in the array
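The methods listed above can be tried on a small array. This is a minimal sketch with made-up values.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# Convert to nested Python lists.
as_list = a.tolist()
print(as_list)  # [[1, 2, 3], [4, 5, 6]]

# Minimum, maximum and mean over all elements.
lowest, highest, average = a.min(), a.max(), a.mean()
print(lowest, highest, average)  # 1 6 3.5

# fill() overwrites every element in place and returns nothing.
a.fill(7)
print(a)
```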


## 3. Create Arrays from Lists


In [9]:
# This line creates a NumPy array a containing the elements 1, 3, 5, 7, and 9.
a = np.array([1, 3, 5, 7, 9])

# This line creates another NumPy array b containing the elements 3, 5, 6, 7, and 9.
b = np.array([3, 5, 6, 7, 9])

#This line performs element-wise addition of arrays a and b and
# stores the result in a new NumPy array c. The result of the addition is
# [4, 8, 11, 14, 18], so c will contain these values.
c = a + b

# This line prints the array c to the console, which will display [4, 8, 11, 14, 18].
print(c)

# This line retrieves and prints the shape of the array c, which
# is a tuple representing the dimensions of the array. In this case, since c
# is a 1-dimensional array, c.shape will return (5,), indicating that c has 5
# elements along its only dimension.
print(c.shape)

[ 4  8 11 14 18]
(5,)


In [10]:
# create a list
l = [[1, 2, 3], [3, 6, 9], [2, 4, 6]]

# convert a list to an array
a = np.array(l)

print(a)

[[1 2 3]
 [3 6 9]
 [2 4 6]]


In [11]:
# let's show the shape and the dtype of an array
print(a.shape, a.dtype)

(3, 3) int64


We can also create a matrix (that is, a two dimensional array):

In [12]:
# This line creates a NumPy array M which is a 2x2 matrix.
# It's initialized with the values [1, 2] in the first row and
# [3, 4] in the second row.
M = np.array([[1, 2], [3, 4]])

# This line retrieves and returns the shape of the array M,
# which is a tuple (2, 2). This indicates that M has 2 rows and 2 columns.
print(M.shape)

(2, 2)


## 4. Indexing

Array indexing uses square brackets (``[ ]``) to select elements of the array so that they can be referred to individually.


In [13]:
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(a)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [14]:
# Accessing an element of the NumPy array 'a' at row index 1 and column index 2.
a[1, 2]

6

In [15]:
# Accessing all elements in the first row of the NumPy array 'a'.
a[0, :]

array([1, 2, 3])

In [16]:
# Accessing all elements in the first column of the NumPy array 'a'.
a[:, 0]

array([1, 4, 7])

In [17]:
# Accessing a 2x2 subarray of 'a' starting from the top-left corner.
a[0:2, 0:2]

array([[1, 2],
       [4, 5]])
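The same bracket syntax also accepts negative indices (counting from the end) and a step in a slice, as Python lists do. The sketch below goes slightly beyond the cells above.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Negative indices count from the end of an axis.
last_row = a[-1, :]
last_col = a[:, -1]

# A slice start:stop:step can skip elements; ::2 takes every second one.
corners = a[::2, ::2]

print(last_row)  # [7 8 9]
print(last_col)  # [3 6 9]
print(corners)
```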

## 5. Creating Arrays using Functions

In [18]:
# Create a range of numbers from 0 to 9 with a step of 1.
np.arange(0, 10, 1)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [19]:
# Generate an array of 25 evenly spaced values between 0 and 10 (inclusive).
np.linspace(0, 10, 25)

array([ 0.        ,  0.41666667,  0.83333333,  1.25      ,  1.66666667,
        2.08333333,  2.5       ,  2.91666667,  3.33333333,  3.75      ,
        4.16666667,  4.58333333,  5.        ,  5.41666667,  5.83333333,
        6.25      ,  6.66666667,  7.08333333,  7.5       ,  7.91666667,
        8.33333333,  8.75      ,  9.16666667,  9.58333333, 10.        ])

In [20]:
np.arange(3, 7, 0.5)  # arange with arbitrary start, stop and step

array([3. , 3.5, 4. , 4.5, 5. , 5.5, 6. , 6.5])
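One difference between the two functions is easy to miss: ``arange`` takes a step and excludes the stop value, while ``linspace`` takes a number of points and includes the stop value by default. A minimal sketch:

```python
import numpy as np

stepped = np.arange(3, 7, 0.5)  # step of 0.5; the stop value 7 is excluded
counted = np.linspace(3, 7, 9)  # 9 points; the stop value 7 is included

print(len(stepped), stepped[-1])  # 8 6.5
print(len(counted), counted[-1])  # 9 7.0
```

When the step is not an integer, ``linspace`` is often the safer choice because the number of points is stated explicitly.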

In [22]:
# Generate an array of 10 logarithmically spaced values between e^0 and e^10.
np.logspace(0, 10, 10, base=np.e)

array([1.00000000e+00, 3.03773178e+00, 9.22781435e+00, 2.80316249e+01,
       8.51525577e+01, 2.58670631e+02, 7.85771994e+02, 2.38696456e+03,
       7.25095809e+03, 2.20264658e+04])
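To see what ``logspace`` does, the same values can be built by hand: take evenly spaced exponents with ``linspace`` and raise the base to them. This sketch only checks that the two routes agree.

```python
import numpy as np

# Evenly spaced exponents from 0 to 10.
exponents = np.linspace(0, 10, 10)

# Raise the base e to each exponent.
by_hand = np.e ** exponents

from_logspace = np.logspace(0, 10, 10, base=np.e)

print(np.allclose(by_hand, from_logspace))  # True
```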

In [23]:
# a diagonal matrix
np.diag([1, 2, 3])

array([[1, 0, 0],
       [0, 2, 0],
       [0, 0, 3]])

In [24]:
# Create a NumPy array filled with zeros and containing 5 elements.
np.zeros(5)

array([0., 0., 0., 0., 0.])

In [25]:
# Create a 3x3 NumPy array filled with ones.
np.ones((3, 3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [26]:
# Create a 3x3 identity matrix (ones on the main diagonal, zeros elsewhere).
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [27]:
# Create a NumPy array with values ranging from 0 to 5 and reshape it into a 3x2 matrix.
d = np.arange(6).reshape(3, 2)
d

array([[0, 1],
       [2, 3],
       [4, 5]])

In [28]:
d * 0.4  # operations create a new array

array([[0. , 0.4],
       [0.8, 1.2],
       [1.6, 2. ]])
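The sketch below contrasts an operation that creates a new array with an in-place update. It assumes a float array (``dtype=float``), because multiplying an integer array in place by ``0.4`` raises a casting error.

```python
import numpy as np

# A float version of the array used above (dtype=float is an assumption
# made so that the in-place update below is allowed).
d = np.arange(6, dtype=float).reshape(3, 2)
original = d.copy()

# d * 0.4 builds a new array; d itself is untouched.
scaled = d * 0.4
unchanged = np.array_equal(d, original)
print(unchanged)  # True

# d *= 0.4 overwrites d in place.
d *= 0.4
print(np.allclose(d, scaled))  # True
```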

## 6. Arithmetic operations and filtering


In [29]:
# Create a NumPy array 'a' with values ranging from 0.0 to 3.0.
# Can you guess what's in array 'a'?
a = np.arange(4.0)

# Multiply 'a' by 23.4 and store the result in 'b'.
b = a * 23.4

# Divide 'b' by (a + 1) and store the result in 'c'.
c = b / (a + 1)

# Add 10 to all elements of 'c'.
c += 10

# show the result:
print(c)

[10.   21.7  25.6  27.55]
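Tracing the intermediate arrays makes the element-wise behaviour explicit. The following sketch repeats the calculation and compares each step with values worked out by hand.

```python
import numpy as np

a = np.arange(4.0)   # 0, 1, 2, 3 as floats
b = a * 23.4         # each element multiplied by 23.4
c = b / (a + 1)      # element i of b divided by element i of (a + 1)
c += 10              # 10 added to every element

# Values worked out by hand for comparison.
expected_b = np.array([0.0, 23.4, 46.8, 70.2])
expected_c = np.array([10.0, 21.7, 25.6, 27.55])

print(np.allclose(b, expected_b))  # True
print(np.allclose(c, expected_c))  # True
```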


In [30]:
# Create a NumPy array 'arr' with values ranging from 100 to 199.
arr = np.arange(100, 200)
print(arr, "\n")

# Create a list 'select' containing index values to select elements from 'arr'.
select = [5, 25, 50, 75, -5]
print(arr[select])  # can use integer lists as indices

[100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117
 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135
 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153
 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171
 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189
 190 191 192 193 194 195 196 197 198 199] 

[105 125 150 175 195]


In [31]:
# Create a NumPy array 'arr' with values ranging from 10 to 19.
arr = np.arange(10, 20)

# Create a Boolean array 'div_by_3' by checking if each element
# of 'arr' is divisible by 3.
div_by_3 = arr % 3 == 0

# Print the Boolean array 'div_by_3'.
print(div_by_3)


[False False  True False False  True False False  True False]


In [32]:
# We can use Boolean arrays as indices
# (keep values at True positions)
print(arr[div_by_3])


[12 15 18]
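Boolean conditions can also be combined before filtering. In the sketch below, ``&`` means "and", ``|`` means "or", and ``~`` means "not"; each condition needs its own parentheses.

```python
import numpy as np

arr = np.arange(10, 20)
div_by_3 = arr % 3 == 0

# Divisible by 3 AND greater than 12.
both = arr[div_by_3 & (arr > 12)]

# Divisible by 3 OR greater than 17.
either = arr[div_by_3 | (arr > 17)]

# NOT divisible by 3.
not_div = arr[~div_by_3]

print(both)     # [15 18]
print(either)   # [12 15 18 19]
print(not_div)  # [10 11 13 14 16 17 19]
```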


## 7. Descriptive Statistics


In [33]:
# Create a NumPy array 'arr' with values ranging from
# 10 to 19 and reshape it into a 2x5 matrix.
arr = np.arange(10, 20).reshape(2, 5)
arr

array([[10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [34]:
# Calculate the sum of all elements in the NumPy array 'arr'.
arr.sum()

145

In [35]:
# Calculate the mean (average) of all elements in the NumPy array 'arr'.
arr.mean()

14.5

In [36]:
# Calculate the standard deviation of all elements in the NumPy array 'arr'.
arr.std()

2.8722813232690143
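By default, ``std()`` computes the population standard deviation, dividing by the number of elements. The sketch below reproduces it by hand. It also shows the ``ddof=1`` argument, which is not covered in this lab and gives the sample standard deviation instead.

```python
import numpy as np

arr = np.arange(10, 20).reshape(2, 5)

# Population standard deviation: square root of the mean squared deviation.
by_hand = np.sqrt(((arr - arr.mean()) ** 2).mean())
print(np.isclose(by_hand, arr.std()))  # True

# Sample standard deviation divides by (n - 1) instead of n.
sample_std = arr.std(ddof=1)
print(sample_std > arr.std())  # True
```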

In [37]:
# Find the maximum value among all elements in the NumPy array 'arr'.
arr.max()

19

In [38]:
# Find the minimum value among all elements in the NumPy array 'arr'.
arr.min()

10
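The methods above reduce the whole array to a single number. They also accept an ``axis`` argument to work per column or per row. The ``axis`` argument is not used elsewhere in this lab; the sketch below shows it on the same array.

```python
import numpy as np

arr = np.arange(10, 20).reshape(2, 5)

# axis=0 collapses the rows, giving one result per column.
col_sums = arr.sum(axis=0)

# axis=1 collapses the columns, giving one result per row.
row_sums = arr.sum(axis=1)
row_means = arr.mean(axis=1)

print(col_sums)   # [25 27 29 31 33]
print(row_sums)   # [60 85]
print(row_means)  # [12. 17.]
```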

In [39]:
# Check if all elements in the Boolean array 'div_by_3' are True.
div_by_3.all()

False

In [40]:
# Check if any element in the Boolean array 'div_by_3' is True.
div_by_3.any()

True

In [41]:
# Count the number of True values in the Boolean array 'div_by_3'.
div_by_3.sum()

3

In [42]:
# Find the indices of True values in the Boolean array 'div_by_3'.
div_by_3.nonzero()

(array([2, 5, 8]),)
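``nonzero()`` returns a tuple with one index array per dimension, which is why the result above is wrapped in parentheses. A short sketch of how those indices can be used, rebuilding the one-dimensional array from the filtering section:

```python
import numpy as np

arr = np.arange(10, 20)
div_by_3 = arr % 3 == 0

# Take the first (and only) index array out of the tuple.
positions = div_by_3.nonzero()[0]
print(positions)  # [2 5 8]

# Indexing with the positions selects the same values as the Boolean mask.
selected = arr[positions]
print(selected)  # [12 15 18]
```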

## 8. Sorting


In [43]:
# Create a NumPy array 'arr' with the specified values.
arr = np.array([4.5, 2.3, 6.7, 1.2, 1.8, 5.5])

# Sort the elements of the NumPy array 'arr' in ascending order (in-place).
arr.sort()

# Print the sorted array 'arr'.
print(arr)


[1.2 1.8 2.3 4.5 5.5 6.7]


In [44]:
# Create a NumPy array 'x' with the specified values.
x = np.array([4.5, 2.3, 6.7, 1.2, 1.8, 5.5])

# Return a new NumPy array containing the sorted elements of 'x' in ascending order.
np.sort(x)


array([1.2, 1.8, 2.3, 4.5, 5.5, 6.7])

In [45]:
# Print the original NumPy array 'x'. Hasn't changed!
print(x)

[4.5 2.3 6.7 1.2 1.8 5.5]


In [46]:
# Create an array 's' containing the indices that would sort the array 'x'.
s = x.argsort()

# Display the array 's'.
s

array([3, 4, 1, 0, 5, 2])

In [47]:
# Use the array of sorted indices 's' to rearrange the elements of 'x'.
x[s]

array([1.2, 1.8, 2.3, 4.5, 5.5, 6.7])

In [48]:
# Reverse the order of elements in the array obtained by using sorted indices 's' on 'x'.
np.flip(x[s])

array([6.7, 5.5, 4.5, 2.3, 1.8, 1.2])
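The index array from ``argsort`` is mainly useful for reordering a second array in step with the first. The sketch below uses a made-up ``labels`` array to show this, along with a slice-based alternative for descending order.

```python
import numpy as np

x = np.array([4.5, 2.3, 6.7, 1.2, 1.8, 5.5])
# Hypothetical labels, one per value in x.
labels = np.array(["a", "b", "c", "d", "e", "f"])

# Indices that would sort x in ascending order.
order = x.argsort()

# Reorder the labels so they follow the sorted values.
labels_by_x = labels[order]
print(labels_by_x)  # ['d' 'e' 'b' 'a' 'f' 'c']

# Descending order using a reversed slice instead of np.flip.
descending = np.sort(x)[::-1]
print(descending)  # [6.7 5.5 4.5 2.3 1.8 1.2]
```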

## ⚠️ Exercises

**1.** Write a NumPy program to create an array with values ranging from 12 to 38. Expected Output:

``[12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37]``

In [53]:
np.arange(12, 38)

array([12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
       29, 30, 31, 32, 33, 34, 35, 36, 37])

**2.** Write a NumPy program to create a 3x3 matrix with values ranging from 2 to 10. Expected Output:

<code>[[ 2  3  4]
 [ 5  6  7]
 [ 8  9 10]]</code>


In [60]:
np.arange(2, 11).reshape(3, 3)

array([[ 2,  3,  4],
       [ 5,  6,  7],
       [ 8,  9, 10]])

**3.** Given ``a = np.array([15, 7, 5, 9, 20])``, find the maximum value in this array and the average value of the array.

In [61]:
a = np.array([15, 7, 5, 9, 20])
print(a.max(), a.mean())

20 11.2

**4.** Think of 2 ways to create the following array:

<code>[[5. 0. 0. 0.]
 [0. 5. 0. 0.]
 [0. 0. 5. 0.]
 [0. 0. 0. 5.]]</code>

In [66]:
# Use float values so that the result matches the float array above.
np.diag([5., 5., 5., 5.])

array([[5., 0., 0., 0.],
       [0., 5., 0., 0.],
       [0., 0., 5., 0.],
       [0., 0., 0., 5.]])

In [82]:
a = []
for i in range(4):
    z_arr = np.zeros(4)
    z_arr[i] = 5
    a.append(z_arr)
np.array(a)

array([[5., 0., 0., 0.],
       [0., 5., 0., 0.],
       [0., 0., 5., 0.],
       [0., 0., 0., 5.]])

**5.** Given the array below, create this array with ``arange`` and ``reshape``, then find the values that are greater than 15.

<code>[[ 0 10 20]
 [30 40 50]]</code>

In [71]:
m = np.arange(0, 60, 10).reshape(2, 3)
# Boolean mask that is True where the value is greater than 15.
gt_15 = m > 15
# Use the mask to select the values themselves.
m[gt_15]

array([20, 30, 40, 50])