$\textbf{DS 131 - DATA STRUCTURES and ALGORITHMS} \\ \text{1Q SY2324}$

$\text{Edgar M. Adina} \\ \textit{Instructor}$

## <center> DATA STRUCTURES - ARRAYS

NumPy is probably the most fundamental numerical computing module in Python. It is important in scientific computing and is written in both Python and C (for speed). Some of its important features are:

     (i) a powerful N-dimensional array object

     (ii) sophisticated (broadcasting) functions

     (iii) tools for integrating C/C++ and Fortran code

     (iv) useful linear algebra, Fourier transform, and random number capabilities

This lesson focuses only on the NumPy array, which is the part related to data structures. To use the NumPy module, we first need to import it. The conventional way is to import it under the shortened name “np”.

In [1]:
import numpy as np

To define an array in Python, you could use the $np.array$ function to convert a list.

**Example:** Create the following arrays:

$ x= \left( \begin{array}{ccc}  1 & 4 & 3 \end{array} \right) $

$ y= \left( \begin{array}{ccc}  1 & 4 & 3 \\ 9 & 2 & 7 \end{array} \right) $

In [2]:
x = np.array([1, 4, 3])
x

array([1, 4, 3])

In [3]:
y = np.array([[1, 4, 3], [9, 2, 7]])
y

array([[1, 4, 3],
       [9, 2, 7]])

*Remark:* A 2-D array can be represented with nested lists, where each inner list represents one row.

Many times we would like to know the size or length of an array. The shape attribute of a 2D array M returns a tuple of two elements, where the first element is the number of rows in M and the second element is the number of columns in M. The size attribute of an array M returns the total number of elements in M.

**Example:** Find the rows, columns and the total size for array y.

In [4]:
y.shape

(2, 3)

In [5]:
y.size

6

*Remark:* You may notice that we write y.shape instead of y.shape(). This is because shape is an attribute rather than a method of the array object.
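As a small supplementary sketch (not part of the original cells), the snippet below compares the shape of a 1D array with that of a 2D array and checks that size equals the product of the entries in shape. The arrays reuse the values of $x$ and $y$ from the earlier examples.

```python
import numpy as np

x = np.array([1, 4, 3])                 # 1D array with 3 elements
y = np.array([[1, 4, 3], [9, 2, 7]])    # 2D array with 2 rows and 3 columns

print(x.shape)   # (3,)   a one-element tuple for a 1D array
print(y.shape)   # (2, 3)

# shape is a tuple, so it can be unpacked into rows and columns
rows, cols = y.shape
print(rows * cols == y.size)   # True
```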

Very often we would like to generate arrays that have a structure or pattern. For instance, we may wish to create the array z = [1 2 3 … 2000]. It would be very cumbersome to type the entire description of z into Python. For generating arrays that are in order and evenly spaced, it is useful to use the arange function in Numpy.

**Example:** Create an array z from 1 to 2000 with an increment 1.

In [6]:
z = np.arange(1, 2001, 1)
z

array([   1,    2,    3, ..., 1998, 1999, 2000])

Using *np.arange*, we could create $z$ easily. The first two numbers are the start and end of the sequence, and the last one is the increment. The end value itself is not included in the result, which is why the end is given as 2001 to obtain an array that stops at 2000. Since it is very common to have an increment of 1, if an increment is not specified, NumPy will use a default value of 1. Therefore np.arange(1, 2001) will have the same result as np.arange(1, 2001, 1). Negative or noninteger increments can also be used. If the increment “misses” the end value, the array only extends to the last value before the end value. For example, x = np.arange(1,8,2) would be [1, 3, 5, 7].
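The following sketch illustrates these cases with small integer ranges; the variable names are only for illustration. Note that the end value is never part of the result.

```python
import numpy as np

default_step = np.arange(1, 8)       # increment defaults to 1
odd_numbers = np.arange(1, 8, 2)     # the increment "misses" 8
countdown = np.arange(10, 0, -3)     # negative increment
excluded_end = np.arange(1, 9, 2)    # 9 would be next, but the end is excluded

print(default_step)   # [1 2 3 4 5 6 7]
print(odd_numbers)    # [1 3 5 7]
print(countdown)      # [10  7  4  1]
print(excluded_end)   # [1 3 5 7]
```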

**Example:** Generate an array with [0.5, 1, 1.5, 2, 2.5].

In [7]:
np.arange(0.5, 3, 0.5)

array([0.5, 1. , 1.5, 2. , 2.5])

Sometimes we want to guarantee a start and end point for an array but still have evenly spaced elements. For instance, we may want an array that starts at 1, ends at 8, and has exactly 10 elements. For this purpose you can use the function *np.linspace*. *linspace* takes three input values separated by commas. So **A = np.linspace(a, b, n)** generates an array of $n$ equally spaced elements starting from $a$ and ending at $b$.

**Example:** Use linspace to generate an array starting at 3, ending at 9, and containing 10 elements.

In [8]:
np.linspace(3, 9, 10)

array([3.        , 3.66666667, 4.33333333, 5.        , 5.66666667,
       6.33333333, 7.        , 7.66666667, 8.33333333, 9.        ])
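The case mentioned in the text (start at 1, end at 8, exactly 10 elements) can be sketched as follows. The spacing check with np.diff and np.allclose is an addition for illustration: with $n$ points there are $n - 1$ gaps, so the step is $(b - a)/(n - 1)$.

```python
import numpy as np

A = np.linspace(1, 8, 10)

print(A.size)          # 10
print(A[0], A[-1])     # 1.0 8.0  both end points are included

# 10 points leave 9 gaps between 1 and 8
step = (8 - 1) / (10 - 1)
print(np.allclose(np.diff(A), step))   # True
```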

Accessing the elements of a 1D NumPy array is similar to what we described for lists or tuples: each element has an index that indicates its location.

**Example:**

In [9]:
# get the 2nd element of x
x[1]

4

In [10]:
# get all the elements from the 2nd element of x onward
x[1:]

array([4, 3])

In [11]:
# get the last element of x
x[-1]

3

For 2D arrays, it is slightly different, since we have rows and columns. To get access to the data in a 2D array M, we use M[r, c], where the row r and column c are separated by a comma. This is referred to as array indexing. The r and c could each be a single number, a list, and so on. If you only think about the row index or the column index, then it is similar to the 1D array. Let’s use $ y= \left( \begin{array}{ccc}  1 & 4 & 3 \\ 9 & 2 & 7 \end{array} \right) $ as an example.

**Example:**

In [12]:
y[0,1]

4

**Example:** Get the first row of array y.

In [13]:
y[0, :]

array([1, 4, 3])

**Example:** Get the last column of array y.

In [14]:
y[:, -1]

array([3, 7])

**Example:** Get the first and third column of array y.

In [15]:
y[:, [0, 2]]

array([[1, 3],
       [9, 7]])

There are some predefined arrays that are really useful. For example, *np.zeros*, *np.ones*, and *np.empty* are three functions for creating them.

Check the following **examples**.

In [16]:
np.zeros((3, 5))

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [17]:
np.ones((5, 3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

*Remark:* The shape of the array is defined in a tuple with row as the first item, and column as the second. If you only need a 1D array, then it could be only one number as the input: np.ones(5).
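A brief sketch of the difference between passing a single number and passing a tuple. The variable names are illustrative only.

```python
import numpy as np

v = np.ones(5)         # a single number gives a 1D array
m = np.ones((1, 5))    # a tuple gives a 2D array with 1 row and 5 columns

print(v.shape)   # (5,)
print(m.shape)   # (1, 5)
```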

**Example:** Generate a 1D empty array with 3 elements.

In [18]:
np.empty(3)

array([4.24399158e-314, 4.23762559e-311, 4.24186959e-311])

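*Remark:* The empty array is not really empty. Its memory is allocated but not initialized, so it holds whatever arbitrary values happened to be there, which often show up as very small numbers.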

You can reassign a value of an array by using array indexing and the assignment operator. You can reassign multiple elements to a single number using array indexing on the left side. You can also reassign multiple elements of an array to different values, as long as the number of elements being assigned matches the number of values assigned to them. You can also fill in an array element by element using array indexing.

**Example:** Let $a$ = [1, 2, 3, 4, 5, 6]. Reassign the fourth element of $a$ to 7. Reassign the first, second, and third elements to 1. Reassign the second, third, and fourth elements to 9, 8, and 7.

In [19]:
a = np.arange(1, 7)
a

array([1, 2, 3, 4, 5, 6])

In [20]:
a[3] = 7
a

array([1, 2, 3, 7, 5, 6])

In [21]:
a[:3] = 1
a

array([1, 1, 1, 7, 5, 6])

In [22]:
a[1:4] = [9, 8, 7]
a

array([1, 9, 8, 7, 5, 6])
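The requirement that the counts match can be checked with a small sketch. Here three positions are selected but only two values are supplied, so NumPy is expected to raise a ValueError; the flag name `mismatch_raised` is only for illustration.

```python
import numpy as np

a = np.arange(1, 7)

try:
    a[1:4] = [9, 8]          # 3 positions selected, only 2 values given
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(mismatch_raised)   # True
print(a)                 # [1 2 3 4 5 6]  the array is left unchanged
```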

**Example:** Create a zero array b with shape 2 by 2, and set $ b = \left( \begin{array}{cc}  1 & 2 \\ 3 & 4 \end{array} \right) $ using array indexing.

In [23]:
b = np.zeros((2, 2))
b[0, 0] = 1
b[0, 1] = 2
b[1, 0] = 3
b[1, 1] = 4
b

array([[1., 2.],
       [3., 4.]])

**Remark:** Although you can create an array from scratch using indexing, we do not advise it. It can confuse you and errors will be harder to find in your code later. For example, starting from a new 2 by 2 zero array, b[1, 1] = 1 alone will give the result $ b = \left( \begin{array}{cc}  0 & 0 \\ 0 & 1 \end{array} \right) $, where b[0, 0], b[0, 1], and b[1, 0] silently keep their initial value of 0 because they were never specified.

Basic arithmetic is defined for arrays. There are two cases: operations between a scalar (a single number) and an array, and operations between two arrays. We will start with operations between a scalar and an array. To illustrate, let $c$ be a scalar, and $b$ be a matrix.

$b + c$, $b − c$, $b * c$ and $\frac{b}{c}$ adds $c$ to every element of $b$, subtracts $c$ from every element of $b$, multiplies every element of $b$ by $c$, and divides every element of $b$ by $c$, respectively.

**Example:** Let $ b = \left( \begin{array}{cc}  1 & 2 \\ 3 & 4 \end{array} \right) $. Add and subtract 2 from $b$. Multiply and divide $b$ by 2. Square every element of $b$. Let $c$ be a scalar. On your own, verify the commutativity of scalar addition and multiplication: $b + c = c + b$ and $cb = bc$.

In [24]:
b + 2

array([[3., 4.],
       [5., 6.]])

In [25]:
b - 2

array([[-1.,  0.],
       [ 1.,  2.]])

In [26]:
2 * b

array([[2., 4.],
       [6., 8.]])

In [27]:
b / 2

array([[0.5, 1. ],
       [1.5, 2. ]])

In [28]:
b**2

array([[ 1.,  4.],
       [ 9., 16.]])
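One possible way to carry out the verification suggested in the example is sketched below. The choice $c = 2$ and the use of np.array_equal are assumptions made for illustration; any scalar could be used.

```python
import numpy as np

b = np.array([[1.0, 2.0], [3.0, 4.0]])
c = 2   # any scalar works here

# np.array_equal returns True when both arrays have the same shape and elements
print(np.array_equal(b + c, c + b))   # True
print(np.array_equal(b * c, c * b))   # True
```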

Describing operations between two matrices is more complicated. Let $b$ and $d$ be two matrices of the same size. $b − d$ takes every element of $b$ and subtracts the corresponding element of $d$. Similarly, $b + d$ adds every element of $d$ to the corresponding element of $b$.

**Example:** Let $ b = \left( \begin{array}{cc}  1 & 2 \\ 3 & 4 \end{array} \right) $ and $ d = \left( \begin{array}{cc}  3 & 4 \\ 5 & 6 \end{array} \right) $. Compute $b + d$ and $b - d$.

In [29]:
b = np.array([[1, 2], [3, 4]])
d = np.array([[3, 4], [5, 6]])

In [30]:
b + d

array([[ 4,  6],
       [ 8, 10]])

In [31]:
b - d

array([[-2, -2],
       [-2, -2]])

There are two different kinds of matrix multiplication (and division). There is element-by-element matrix multiplication and standard matrix multiplication. We will only show how element-by-element matrix multiplication and division work. Python takes the $*$ symbol to mean element-by-element multiplication. For matrices $b$ and $d$ of the same size, $b * d$ takes every element of $b$ and multiplies it by the corresponding element of $d$. The same is true for $/$ and $**$.

**Example:** Compute $b * d$, $b / d$, and $b**d$.

In [32]:
b * d

array([[ 3,  8],
       [15, 24]])

In [33]:
b / d

array([[0.33333333, 0.5       ],
       [0.6       , 0.66666667]])

In [34]:
b**d

array([[   1,   16],
       [ 243, 4096]])
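Although this lesson only covers element-by-element operations, a short side-by-side sketch may help keep the two kinds of multiplication apart. The `@` operator for the standard matrix product is not used elsewhere in this lesson and is shown here only for contrast.

```python
import numpy as np

b = np.array([[1, 2], [3, 4]])
d = np.array([[3, 4], [5, 6]])

elementwise = b * d   # multiplies matching positions
standard = b @ d      # rows of b times columns of d

print(elementwise)
# [[ 3  8]
#  [15 24]]
print(standard)
# [[13 16]
#  [29 36]]
```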

The **transpose** of an array, $b$, is an array, $d$, where $b[i, j] = d[j, i]$. In other words, the transpose switches the rows and the columns of $b$. You can transpose an array in Python using the array attribute $T$.

**Example:** Compute the transpose of array $b$.

In [35]:
b.T

array([[1, 3],
       [2, 4]])
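The definition $b[i, j] = d[j, i]$ can be checked directly. The sketch below uses a hypothetical 2 by 3 array (not the $b$ from the example) so that the change in shape is visible.

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6]])   # hypothetical 2 x 3 array
d = b.T

print(b.shape, d.shape)   # (2, 3) (3, 2)

# check b[i, j] == d[j, i] for every position
matches = all(b[i, j] == d[j, i] for i in range(2) for j in range(3))
print(matches)            # True
```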

NumPy has many arithmetic functions, such as $sin$, $cos$, etc., that can take arrays as input arguments. The output is the function evaluated for every element of the input array. A function that takes an array as input and performs the function on it is said to be **vectorized**.

**Example:** Compute *np.sqrt* for $x = [1, 4, 9, 16]$.

In [36]:
x = [1, 4, 9, 16]
np.sqrt(x)

array([1., 2., 3., 4.])
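To make the idea of a vectorized function concrete, the sketch below compares *np.sqrt* with an explicit loop that calls math.sqrt on one element at a time. The loop version is only there for comparison.

```python
import math
import numpy as np

x = np.array([1, 4, 9, 16])

# element-by-element loop using the scalar function from the math module
looped = np.array([math.sqrt(v) for v in x])

# vectorized call: one function call handles the whole array
vectorized = np.sqrt(x)

print(np.array_equal(looped, vectorized))   # True
```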

Logical operations are only defined between a scalar and an array and between two arrays of the same size. Between a scalar and an array, the logical operation is conducted between the scalar and each element of the array. Between two arrays, the logical operation is conducted element-by-element.

**Example:** Check which elements of the array $x = [1, 2, 4, 5, 9, 3]$ are larger than 3. Check which elements in $x$ are larger than the corresponding element in $y = [0, 2, 3, 1, 2, 3]$.

In [37]:
x = np.array([1, 2, 4, 5, 9, 3])
y = np.array([0, 2, 3, 1, 2, 3])

In [38]:
x > 3

array([False, False,  True,  True,  True, False])

In [39]:
x > y

array([ True, False,  True,  True,  True, False])

Python can index elements of an array that satisfy a logical expression.

**Example:** Let $x$ be the same array as in the previous example. Create a variable $y$ that contains all the elements of $x$ that are strictly bigger than 3. Assign the value 0 to all the elements of $x$ that are bigger than 3.

In [40]:
y = x[x > 3]
y

array([4, 5, 9])

In [41]:
x[x > 3] = 0
x

array([1, 2, 0, 0, 0, 3])
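A short sketch of what happens behind these two cells: the logical expression produces a boolean array (called `mask` here for illustration), and indexing with it returns a copy of the selected elements. That is why $y$ keeps its values even after the matching elements of $x$ are set to 0.

```python
import numpy as np

x = np.array([1, 2, 4, 5, 9, 3])

mask = x > 3     # boolean array, True where the condition holds
y = x[mask]      # a copy of the selected elements
x[mask] = 0      # changes x only

print(mask)   # [False False  True  True  True False]
print(x)      # [1 2 0 0 0 3]
print(y)      # [4 5 9]
```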

### <center> Exercises

**1.** Create array $x$ and $y$, where $x = [1, 4, 3, 2, 9, 4]$ and $y=[2, 3, 4, 1, 2, 3]$. Perform all possible operations that can be defined between $x$ and $y$.

In [42]:
import numpy as np

In [48]:
x = np.array([1, 4, 3, 2, 9, 4])
y = np.array([2, 3, 4, 1, 2, 3])


addition_result = x + y
print("Addition:")
print(addition_result)


subtraction_result = x - y
print("\nSubtraction:")
print(subtraction_result)


multiplication_result = x * y
print("\nMultiplication:")
print(multiplication_result)


division_result = x / y
print("\nDivision:")
print(division_result)


exponentiation_result = x ** y
print("\nExponentiation:")
print(exponentiation_result)


greater_than_result = x > y
print("\nGreater than:")
print(greater_than_result)


less_than_result = x < y
print("\nLess than:")
print(less_than_result)


equal_to_result = x == y
print("\nEqual to:")
print(equal_to_result)


not_equal_to_result = x != y
print("\nNot equal to:")
print(not_equal_to_result)

Addition:
[ 3  7  7  3 11  7]

Subtraction:
[-1  1 -1  1  7  1]

Multiplication:
[ 2 12 12  2 18 12]

Division:
[0.5        1.33333333 0.75       2.         4.5        1.33333333]

Exponentiation:
[ 1 64 81  2 81 64]

Greater than:
[False  True False  True  True  True]

Less than:
[ True False  True False False False]

Equal to:
[False False False False False False]

Not equal to:
[ True  True  True  True  True  True]


**2.** Generate an array with size 100 evenly spaced between -10 to 10 using linspace function in Numpy.

In [45]:
import numpy as np

In [44]:
array = np.linspace(-10, 10, 100)
print(array)

[-10.          -9.7979798   -9.5959596   -9.39393939  -9.19191919
  -8.98989899  -8.78787879  -8.58585859  -8.38383838  -8.18181818
  -7.97979798  -7.77777778  -7.57575758  -7.37373737  -7.17171717
  -6.96969697  -6.76767677  -6.56565657  -6.36363636  -6.16161616
  -5.95959596  -5.75757576  -5.55555556  -5.35353535  -5.15151515
  -4.94949495  -4.74747475  -4.54545455  -4.34343434  -4.14141414
  -3.93939394  -3.73737374  -3.53535354  -3.33333333  -3.13131313
  -2.92929293  -2.72727273  -2.52525253  -2.32323232  -2.12121212
  -1.91919192  -1.71717172  -1.51515152  -1.31313131  -1.11111111
  -0.90909091  -0.70707071  -0.50505051  -0.3030303   -0.1010101
   0.1010101    0.3030303    0.50505051   0.70707071   0.90909091
   1.11111111   1.31313131   1.51515152   1.71717172   1.91919192
   2.12121212   2.32323232   2.52525253   2.72727273   2.92929293
   3.13131313   3.33333333   3.53535354   3.73737374   3.93939394
   4.14141414   4.34343434   4.54545455   4.74747475   4.94949495
   5.151515

**3.** Let array_a be an array $[-1, 0, 1, 2, 0, 3]$. Write a command that will return an array consisting of all the elements of array_a that are larger than zero.

In [47]:
array_a = np.array([-1, 0, 1, 2, 0, 3])
result_array = array_a[array_a > 0]
print(result_array)

[1 2 3]


**4.** Create a zero array with size (2, 4).

In [49]:
zero_array = np.zeros((2, 4))
print(zero_array)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]]


**5.** Change the 2nd column in the above array to 1.

In [50]:
zero_array = np.zeros((2, 4))

In [51]:
zero_array[:, 1] = 1

In [52]:
print(zero_array)


[[0. 1. 0. 0.]
 [0. 1. 0. 0.]]
