13016213 Data Structures and Algorithms Laboratory

**NOTE** click here to select this cell, press Esc-Enter to enter cell edit mode, press Shift-Enter to put the cell back to display mode.

#### Name: *Araya Siriadun*

#### Student ID: *58090046*

Laboratory 2: Arrays
===

## Overview

**Array** is the most fundamental structure for storing and accessing a collection of data items. Most high-level programming languages provide the array as a primitive data type and allow the creation of arrays with multiple dimensions. In this laboratory, we implement a one-dimensional array ADT (Abstract Data Type) and then use it to implement a two-dimensional array ADT.


## The Array Structure

A one-dimensional array, as shown in Figure 2.1 below, is composed of sequential elements stored in *contiguous* bytes of memory. The entire content of an array is identified by a single name. Each element within the array can be accessed directly by specifying an index value, which indicates an offset from the start of the array. For instance, to access the fourth element of the array in Figure 2.1, we write $x[3]$.




<center>![Array Structure](figs/0201.png)
**Figure 2.1 A one-dimensional array.**</center>
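
The offset arithmetic described above can be sketched with Python's *ctypes* module. This is only an illustration, assuming a small array of C integers; the names `IntArray5`, `x`, and `element_at` are hypothetical and not part of the lab code. The address of element *i* is the base address plus *i* times the element size.

```python
import ctypes

# Assumption for illustration: an array of five C ints.
IntArray5 = ctypes.c_int * 5
x = IntArray5(10, 20, 30, 40, 50)

base = ctypes.addressof(x)                # address of x[0]
item_size = ctypes.sizeof(ctypes.c_int)   # bytes occupied by one element

def element_at(index):
    '''Read element *index* by computing its offset from the start of the array.'''
    return ctypes.c_int.from_address(base + index * item_size).value

# Both expressions read the fourth element.
print(element_at(3), x[3])
```

Because every element has the same size, the offset is a single multiplication, which is why access by index takes the same time for any position.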

## Defining the Array Abstract Data Type

### Array ADT interface
A *one-dimensional array* is a collection of contiguous elements in which individual elements are identified by a unique integer subscript starting with zero. Once an array is created, its size cannot be changed.

* **Array( size )** <br>
&emsp;&emsp;Creates a one-dimensional array consisting of *size* elements with each element initially set to *None*. *size* must be greater than zero.

* **length( )** <br>
&emsp;&emsp;Returns the length or number of elements in the array.

* **getitem( index )** <br>
&emsp;&emsp;Returns the value stored in the array at element position *index*. The *index* argument must be within the valid range. Accessed using the subscript operator. 

* **setitem( index, value )** <br>
&emsp;&emsp;Modifies the contents of the array element at position *index* to contain *value*. The index must be within the valid range. Accessed using the subscript operator.

* **clear( value )** <br>
&emsp;&emsp;Clears the array by setting every element to *value*.

* **iterator( )** <br>
&emsp;&emsp;Creates and returns an iterator that can be used to traverse the elements of the array.

<hr>
### Question 2.1 [2 marks]
Python provides a built-in **list** data type. The array ADT is *very similar* to Python's list. Both structures are sequences of elements that can be accessed by index. What are the major differences between our array ADT and Python's list data type?

### Answer 2.1: 


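Our array ADT has a fixed size: the number of elements is set when the array is created and cannot change afterwards, whereas a Python list can grow and shrink (for example with append, insert, and pop). The array ADT also offers only a small set of operations (length, element access by a non-negative index, clear, and iteration), while the list adds slicing, negative indices, searching, sorting, and more. Both can hold references to objects of any type, because the array in Listing 2.2 stores `py_object` references. The array is therefore the simpler structure, and it suits cases where the number of elements is known in advance.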
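
As a rough illustration of the fixed-size point, the sketch below compares a Python list with a raw *ctypes* array of object references, the same building block used later in Listing 2.2. The variable names are chosen for this example only.

```python
import ctypes

# A Python list can grow and shrink after it is created.
numbers = [1, 2, 3]
numbers.append(4)
numbers.pop(0)

# A ctypes array has a fixed length and no append/insert/remove methods.
fixed = (ctypes.py_object * 3)()
fixed[0], fixed[1], fixed[2] = 1, 2, 3

print(len(numbers), len(fixed))
print(hasattr(numbers, "append"), hasattr(fixed, "append"))

# Writing one position past the end is rejected instead of growing the array.
try:
    fixed[3] = 4
    grew = True
except IndexError as err:
    grew = False
    print("IndexError:", err)
```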



<hr>

## Using the Array ADT

The following program illustrates the creation and the usage of an array object based on the Array ADT.

```python
# Listing 2.1
import random

# create an array size 10
floatArray = Array( 10 )

# fill in the array with random floating-point values.
for i in range( len(floatArray) ):
    floatArray[i] = random.random()

# print the values, one per line.
for value in floatArray:
    print( value )
```

## Implementing the Array
The implementation of the Array ADT using a hardware-supported array created with Python's *ctypes* module is provided in Listing 2.2.

In [1]:
# Listing 2.2

import ctypes

class Array:
    '''Implements the Array ADT with the ctypes module.'''
    
    def __init__(self, size):
        '''
        Creates a one-dimensional array consisting of *size* elements 
        with each element initially set to None. *size* must be greater than zero.
    
        '''        
        assert size > 0, "Array size must be > 0"
        self._size = size
        
        # create the array 
        DSA_ArrayType = ctypes.py_object * size
        self._elements = DSA_ArrayType()
        
        # initialize each element with None
        self.clear(None)
        
    def __len__(self):
        '''
        Returns the length or number of elements in the array.
        '''    
        return self._size
   
    def __getitem__(self, index):
        '''
        Returns the value stored in the array at element position *index*. 
        The *index* argument must be within the valid range. Accessed using the subscript operator.
        '''
        assert index >= 0 and index < len(self), "Array subscript out of range"
        return self._elements[index]
    
    def __setitem__(self, index, value):
        '''
        Modifies the contents of the array element at position *index* to contain *value*. 
        The index must be within the valid range. Accessed using the subscript operator.
        '''
        assert index >= 0 and index < len(self), "Array subscript out of range"
        self._elements[index] = value
    
    def clear(self, value):
        '''
        Clears the array by setting every element to *value*.
        '''
        for i in range(len(self)):
            self._elements[i] = value
    
    def __iter__(self):
        '''
        Returns the array's iterator for traversing the elements.
        '''
        return _ArrayIterator(self._elements)

class _ArrayIterator:
    '''An iterator for the Array ADT.'''
    def __init__(self, theArray):
        self._arrayRef = theArray
        self._curNdx = 0
        
    def __iter__(self):
        return self
    
    def __next__(self):
        if self._curNdx < len(self._arrayRef):
            entry = self._arrayRef[self._curNdx]
            self._curNdx += 1
            return entry
        else:
            raise StopIteration
            

Now, let us test the ctypes-based implementation using the sample client program in Listing 2.1.

In [2]:
import random

# create an array size 10
floatArray = Array( 10 )

# fill in the array with random floating-point values.
for i in range( len(floatArray) ):
    floatArray[i] = random.random()

# print the values, one per line.
for value in floatArray:
    print( value )

0.14463259921204785
0.7033633863297051
0.3832845281395906
0.14736420703179431
0.338594407133336
0.7757901637251546
0.889968033969156
0.09545458079166025
0.1762026668925516
0.2622084148533129


<hr>
### Question 2.2 [2 marks]
Describe the following block of code. 
You may need to refer to the Python Standard Library documentation for the *ctypes* module (https://docs.python.org/3/library/ctypes.html).

```python
        # create the array 
        DSA_ArrayType = ctypes.py_object * size
        self._elements = DSA_ArrayType()
```

### Answer 2.2: 


This block creates the low-level storage for the array using Python's *ctypes* module.

`ctypes.py_object` is the ctypes data type that represents a C `PyObject *`, that is, a reference to any Python object. Multiplying a ctypes type by a positive integer creates a new array type, so `ctypes.py_object * size` is the type "array of *size* object references", and it is bound to the name `DSA_ArrayType`. Calling `DSA_ArrayType()` with no arguments creates an instance of that type, and the result is stored in `self._elements`. The elements of a newly created `py_object` array are NULL references, which is why the constructor then calls `self.clear(None)` to initialize every element.
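
The two lines can also be tried on their own. The following is a minimal sketch, assuming a size of 3; it shows that multiplying `ctypes.py_object` by an integer yields a new array *type*, that calling the type creates the array, and that the elements are NULL references until they are assigned.

```python
import ctypes

size = 3                                   # assumed size for this example
DSA_ArrayType = ctypes.py_object * size    # a new array type, not yet an array
elements = DSA_ArrayType()                 # an instance with `size` slots

print(issubclass(DSA_ArrayType, ctypes.Array))
print(len(elements))

# Slots are NULL until assigned, so reading one raises ValueError.
try:
    elements[0]
    null_read_failed = False
except ValueError as err:
    null_read_failed = True
    print("ValueError:", err)

# Assigning None to every slot makes the array safe to read.
for i in range(size):
    elements[i] = None
print(list(elements))
```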



<hr>

<hr>
### Question 2.3 [4 marks]

The Python implementation of the Array ADT in Listing 2.2 makes use of many "special methods" (e.g. `__init__`, `__len__`).<br> 
List all *special methods* in Listing 2.2 and describe when these methods are called by Python.


### Answer 2.3: 


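`__init__` is called right after a new object is created, for example by `Array(10)`, and receives the arguments passed to the constructor call.

`__len__` is called by the built-in `len()` function and returns the length of the container.

`__getitem__` is called when an element is read with the subscript operator, as in `y = x[i]`.

`__setitem__` is called when an element is assigned with the subscript operator, as in `x[i] = y`.

`__iter__` is called by the built-in `iter()` function and at the start of a `for` loop; it returns an iterator for the container.

`__next__` is called by the built-in `next()` function and on every pass of a `for` loop to retrieve the next value from the iterator; it raises `StopIteration` when no elements remain.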
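
One way to see when Python calls each special method is to record the calls. The sketch below uses a hypothetical `Traced` class backed by a plain list (it is not the Array class from Listing 2.2) and appends the name of each special method to a `calls` list as it runs.

```python
calls = []   # names of special methods, in the order Python invokes them

class Traced:
    '''Hypothetical container that logs every special-method call.'''
    def __init__(self, size):
        calls.append("__init__")
        self._items = [None] * size

    def __len__(self):
        calls.append("__len__")
        return len(self._items)

    def __getitem__(self, index):
        calls.append("__getitem__")
        return self._items[index]

    def __setitem__(self, index, value):
        calls.append("__setitem__")
        self._items[index] = value

    def __iter__(self):
        calls.append("__iter__")
        return _TracedIterator(self._items)

class _TracedIterator:
    def __init__(self, items):
        self._items = items
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        calls.append("__next__")
        if self._pos < len(self._items):
            entry = self._items[self._pos]
            self._pos += 1
            return entry
        raise StopIteration

t = Traced(2)       # __init__
n = len(t)          # __len__
t[0] = "a"          # __setitem__
first = t[0]        # __getitem__
for item in t:      # __iter__ once, then __next__ for each item and once more to stop
    pass

print(calls)
```

Note that the loop over two elements triggers three `__next__` calls: the third one raises `StopIteration`, which is how the `for` statement knows to finish.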



<hr>

<hr>
### Question 2.4 [6 marks] (Programming exercise)

Write a client program that uses the *Array* ADT to count the number of lowercase and uppercase letters in an input text file (encoded in ASCII). Use the text file "grimm_fairytales.txt" provided within your lab package files to test your program. 

#### Hint: 
* You might want to use the following Python built-in functions.
   * chr() https://docs.python.org/3/library/functions.html#chr
   * ord() https://docs.python.org/3/library/functions.html#ord


* With "grimm_fairytales.txt" as the input text file, your program should print the letter counts like this.

<img src="figs/q0204_result.png" width="280px" height="200px" />
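
The hint about `chr()` and `ord()` can be illustrated with a short sketch. The helper `letter_slot` below is hypothetical and shows just one possible way to turn a letter into an array index, assuming 52 slots with uppercase letters first.

```python
# ord() maps a character to its integer code; chr() is the inverse.
print(ord('A'), ord('Z'), ord('a'), ord('z'))
print(chr(65), chr(97))

def letter_slot(char):
    '''Hypothetical helper: map 'A'-'Z' to 0-25 and 'a'-'z' to 26-51, else None.'''
    if 'A' <= char <= 'Z':
        return ord(char) - ord('A')
    if 'a' <= char <= 'z':
        return ord(char) - ord('a') + 26
    return None

print(letter_slot('A'), letter_slot('z'), letter_slot('!'))
```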


In [7]:
### Answer 2.4 (Put your code here)

# Slots 0-25 count 'A'-'Z'; slots 26-51 count 'a'-'z'.
counter = Array(52)
counter.clear(0)

# The with statement closes the file even if an error occurs while reading.
with open("grimm_fairytales.txt", 'r') as txt:
    for line in txt:
        for char in line:
            if 'A' <= char <= 'Z':
                counter[ord(char) - ord('A')] += 1
            elif 'a' <= char <= 'z':
                counter[ord(char) - ord('a') + 26] += 1

# Print the uppercase and lowercase counts side by side.
for i in range(26):
    print(chr(i + 65), '-', format(counter[i], '5'), "\t", chr(i + 97), '-', format(counter[i + 26], '5'))


A -   840 	 a - 33465
B -   383 	 b -  5611
C -   253 	 c -  7534
D -   234 	 d - 21412
E -   533 	 e - 52406
F -   240 	 f -  7967
G -   417 	 g -  8653
H -   704 	 h - 31947
I -  1586 	 i - 23254
J -    69 	 j -   376
K -    85 	 k -  4162
L -   317 	 l - 16946
M -   232 	 m -  8607
N -   441 	 n - 27510
O -   377 	 o - 30708
P -   211 	 p -  5175
Q -    11 	 q -   285
R -   346 	 r - 21202
S -   658 	 s - 22374
T -  1824 	 t - 37754
U -   115 	 u - 10842
V -    48 	 v -  3215
W -   657 	 w - 11598
X -     8 	 x -   302
Y -   194 	 y -  7543
Z -     2 	 z -   144


<hr>

## Two-Dimensional Arrays

In this section, we will use the Array ADT to construct a two-dimensional array data type. The 2-D array, as illustrated in Figure 2.2, organizes data into rows and columns similar to a table. The individual elements can be accessed by specifying two indices, one for the row and one for the column, $[i, j]$. For example, to access the 3rd element of the 2nd row of the 2-D array in Figure 2.2, one writes $Y[1, 2]$.

<center>![Array Structure](figs/0202.png)
**Figure 2.2 A two-dimensional array.**</center>

## Defining the Array2D Abstract Data Type

### Array2D ADT interface
A *two-dimensional array* consists of a collection of elements organized into rows and columns. Individual elements are referenced by specifying the row and column indices $(r, c)$, both of which start at 0.

* **Array2D( nrows, ncols )** <br>
&emsp;&emsp;Creates a two-dimensional array organized into rows and columns. The nrows and ncols arguments indicate the size of the table. Each element of the table is initialized to *None*.

* **numRows( )** <br>
&emsp;&emsp;Returns the number of rows in the 2-D array.

* **numCols( )** <br>
&emsp;&emsp;Returns the number of columns in the 2-D array.

* **clear( value )** <br>
&emsp;&emsp;Clears the array by setting every element to *value*.

* **getitem( r, c )** <br>
&emsp;&emsp;Returns the value stored in the 2-D array at element position indicated by *(r, c)*. Both *r* and *c* must be within the valid range. Accessed using the subscript operator: $y = x[1, 2]$.

* **setitem( r, c , value )** <br>
&emsp;&emsp;Modifies the contents of the 2-D array element at position indicated by *(r, c)* to contain *value*. The index must be within the valid range. Accessed using the subscript operator: $x[0, 3] = y$.
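
The two-index subscript syntax relies on a standard Python behavior: the comma-separated indices inside the brackets are packed into a single tuple before the special method is called. The sketch below uses a hypothetical `SubscriptProbe` class that simply returns the key it receives.

```python
class SubscriptProbe:
    '''Hypothetical class that returns whatever key it is given.'''
    def __getitem__(self, key):
        return key

probe = SubscriptProbe()
print(probe[1, 2])     # the two indices arrive packed in one tuple
print(probe[3])        # a single index arrives as a plain int

# A 2-D container can unpack the tuple into a row and a column.
row, col = probe[1, 2]
print(row, col)
```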

## Using the Array2D ADT

The following program illustrates the creation and the usage of a 2-D array object based on the Array2D ADT.

```python
# Listing 2.3
import random

# create a 2-D array consisting of 5 rows and 5 columns
g = Array2D( 5, 5 )

# fill in the array with random floating-point values.
for i in range( g.numRows() ):
    for j in range( g.numCols() ):
        g[i, j] = random.random()

# print the values, one row per line.
for i in range( g.numRows() ):
    for j in range( g.numCols() ):
        print("{0:.4f}\t".format(g[i,j]), end="")
    print()
```

## Implementing the Array2D
The implementation of the Array2D ADT using the Array ADT is provided in Listing 2.4.

In [3]:
# Listing 2.4

class Array2D:
    '''Implements a 2-D array ADT'''
    def __init__(self, nrows, ncols):
        '''
        Creates a two-dimensional array organized into rows and columns. 
        The *nrows* and *ncols* arguments indicate the size of the table. 
        Each element of the table is initialized to *None*.
        '''
        
        # Create a 1-D array to store an array reference for each row.        
        self._theRows = Array( nrows )
        
        # Create the 1-D arrays for each row of the 2-D array.
        for i in range( nrows ):
            self._theRows[i] = Array( ncols )
            
    def numRows(self):
        '''
        Returns the number of rows in the 2-D array.
        '''
        return len( self._theRows )
    
    def numCols(self):
        '''
        Returns the number of columns in the 2-D array.
        '''
        return len( self._theRows[0] )
    
    def clear(self, value):
        '''
        Clears the array by setting every element to *value*.
        '''
        for row in range( self.numRows() ):
            self._theRows[row].clear( value )
            
    def __getitem__(self, ndxTuple):
        '''
        Returns the value stored in the 2-D array at element position indicated by *(r, c)*. 
        Both *r* and *c* must be within the valid range. 
        Accessed using the subscript operator: $y = x[1, 2]$.
        '''
        assert len(ndxTuple) == 2, "Invalid number of array subscripts."
        row = ndxTuple[0]
        col = ndxTuple[1]
        assert row >= 0 and row < self.numRows() \
            and col >= 0 and col < self.numCols(), \
            "Array subscript out of range."
        
        the1dArray = self._theRows[row]
        
        return the1dArray[col]
    
    
    def __setitem__(self, ndxTuple, value):
        '''
        Modifies the contents of the 2-D array element at position indicated by *(r, c)* to contain *value*. 
        The index must be within the valid range. Accessed using the subscript operator: $x[0, 3] = y$.
        '''
        assert len(ndxTuple) == 2, "Invalid number of array subscripts."
        
        row = ndxTuple[0]
        col = ndxTuple[1]
        assert row >= 0 and row < self.numRows() \
            and col >= 0 and col < self.numCols(), \
            "Array subscript out of range."

        the1dArray = self._theRows[row]
        the1dArray[col] = value
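
One detail of the constructor worth noting is that each row gets its own 1-D array. The sketch below uses plain Python lists rather than the Array ADT (an assumption made so that the example is self-contained) to show what can go wrong when every row refers to the same object.

```python
nrows, ncols = 2, 3

# Every row is the same list object, so a write shows up in all rows.
shared = [[None] * ncols] * nrows
shared[0][1] = 'x'

# Each row is created separately, as the loop in the constructor does.
separate = [[None] * ncols for _ in range(nrows)]
separate[0][1] = 'x'

print(shared)
print(separate)
```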
        

Now, let us test the Array2D implementation using the sample client program in Listing 2.3.

In [4]:
import random

# create a 2-D array consisting of 5 rows and 5 columns
g = Array2D( 5, 5 )

# fill in the array with random floating-point values.
for i in range( g.numRows() ):
    for j in range( g.numCols() ):
        g[i, j] = random.random()

# print the values, one row per line.
for i in range( g.numRows() ):
    for j in range( g.numCols() ):
        print("{0:.4f}\t".format(g[i,j]), end="")
    print()

0.0178	0.3781	0.9543	0.1394	0.7505	
0.8361	0.1113	0.1798	0.2336	0.3996	
0.3522	0.8372	0.4606	0.1655	0.8034	
0.0712	0.9430	0.7743	0.5553	0.2588	
0.8774	0.0811	0.0487	0.0392	0.3818	


<hr>
## Programming Quiz 2 [10 marks]

### Implementing the Matrix ADT 

Write a Python program for a matrix class that can *add*, *subtract*, and *multiply* two-dimensional arrays of numbers.<br>
Your program must check that the dimensions agree appropriately for the operation.

The specification of the Matrix ADT and a test function are provided below.


#### Matrix ADT interface
A *matrix* is a collection of scalar values arranged in rows and columns as a rectangular grid of a fixed size. The elements of the matrix can be accessed by specifying a given row and column index with indices starting at 0.

* **Matrix( nrows, ncols )** <br>
&emsp;&emsp;Creates a new matrix containing *nrows* rows and *ncols* columns with each element initialized to 0.

* **numRows( )** <br>
&emsp;&emsp;Returns the number of rows in the matrix.

* **numCols( )** <br>
&emsp;&emsp;Returns the number of columns in the matrix.

* **getitem( r, c )** <br>
&emsp;&emsp;Returns the value stored in the matrix at element position indicated by *(r, c)*. Both *r* and *c* must be within the valid range. 

* **setitem( r, c , value )** <br>
&emsp;&emsp;Modifies the contents of the matrix element at position indicated by *(r, c)* to contain *value*. The index must be within the valid range.

* **add( rhsMatrix )** <br>
&emsp;&emsp;Creates and returns a new matrix that is the result of adding this matrix to the given *rhsMatrix*. The size of the two matrices must be the same.

* **subtract( rhsMatrix )** <br>
&emsp;&emsp;Creates and returns a new matrix that is the result of subtracting *rhsMatrix* from this matrix. The size of the two matrices must be the same.

* **multiply( rhsMatrix )** <br>
&emsp;&emsp;Creates and returns a new matrix that is the result of multiplying this matrix by *rhsMatrix*. The two matrices must be of appropriate sizes as defined for matrix multiplication.
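
The size rule for multiplication can be stated concretely: an $m \times n$ matrix can be multiplied by an $n \times p$ matrix, giving an $m \times p$ result whose element $(i, j)$ is $\sum_k a_{ik} b_{kj}$. The sketch below illustrates the size check and the triple loop on plain nested lists; the function name `multiply_lists` is hypothetical, and this is not the required Matrix implementation.

```python
def multiply_lists(a, b):
    '''Multiply two matrices stored as lists of rows.'''
    m, n = len(a), len(a[0])
    assert n == len(b), "columns of a must equal rows of b"
    p = len(b[0])
    result = [[0] * p for _ in range(m)]
    for i in range(m):
        for j in range(p):
            for k in range(n):
                result[i][j] += a[i][k] * b[k][j]
    return result

a = [[0, 1], [2, 3], [4, 5]]      # 3 x 2
c = [[6, 7, 8], [9, 1, 0]]        # 2 x 3
print(multiply_lists(a, c))       # a 3 x 3 result
print(multiply_lists(c, a))       # a 2 x 2 result
```

The two products have different shapes, which shows that matrix multiplication is not commutative in general.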


In [5]:
def testMatrix():
    A = Matrix(3, 2)
    A[0,0] = 0
    A[0,1] = 1
    A[1,0] = 2
    A[1,1] = 3
    A[2,0] = 4
    A[2,1] = 5
    print("A = ")
    print(A)
    
    B = Matrix(3, 2)
    B[0,0] = 6
    B[0,1] = 7
    B[1,0] = 8
    B[1,1] = 9
    B[2,0] = 1
    B[2,1] = 0
    print("B = ")
    print(B)

    print("-----------------------------")
    
    print("A + B =")
    print(A + B)

    print("B + A =")
    print(B + A)

    print("-----------------------------")

    print("A - B =")
    print(A - B)

    print("B - A =")
    print(B - A)

    print("-----------------------------")

    print("A = ")
    print(A)

    C = Matrix(2, 3)
    C[0,0] = 6
    C[0,1] = 7
    C[0,2] = 8
    C[1,0] = 9
    C[1,1] = 1
    C[1,2] = 0
    print("C =")
    print(C)

    print("A * C =")
    print(A * C)

    print("C * A =")
    print(C * A)

testMatrix()

A = 
  0  1
  2  3
  4  5

B = 
  6  7
  8  9
  1  0

-----------------------------
A + B =
  6  8
 10 12
  5  5

B + A =
  6  8
 10 12
  5  5

-----------------------------
A - B =
 -6 -6
 -6 -6
  3  5

B - A =
  6  6
  6  6
 -3 -5

-----------------------------
A = 
  0  1
  2  3
  4  5

C =
  6  7  8
  9  1  0

A * C =
  9  1  0
 39 17 16
 69 33 32

C * A =
 46 67
  2 12



In [6]:
### TODO Put your code for Matrix ADT here

class Matrix:
    '''Implements the Matrix ADT using the Array2D ADT for storage.'''
    def __init__(self, nrows, ncols):
        '''Creates an nrows x ncols matrix with every element set to 0.'''
        self.rows = nrows
        self.cols = ncols
        self.matrix = Array2D(nrows, ncols)
        self.matrix.clear(0)

    def numRows(self):
        return self.rows

    def numCols(self):
        return self.cols

    def __getitem__(self, ndxTuple):
        # ndxTuple is the (r, c) pair; Array2D checks that it is in range.
        return self.matrix[ndxTuple]

    def __setitem__(self, ndxTuple, value):
        self.matrix[ndxTuple] = value

    def __add__(self, rhsMatrix):
        '''Returns a new matrix equal to self + rhsMatrix (sizes must match).'''
        assert self.rows == rhsMatrix.numRows() and self.cols == rhsMatrix.numCols(), \
            "Matrix sizes not compatible for addition."
        result = Matrix(self.rows, self.cols)
        for i in range(self.rows):
            for j in range(self.cols):
                result[i, j] = self.matrix[i, j] + rhsMatrix[i, j]
        return result

    def __sub__(self, rhsMatrix):
        '''Returns a new matrix equal to self - rhsMatrix (sizes must match).'''
        assert self.rows == rhsMatrix.numRows() and self.cols == rhsMatrix.numCols(), \
            "Matrix sizes not compatible for subtraction."
        result = Matrix(self.rows, self.cols)
        for i in range(self.rows):
            for j in range(self.cols):
                result[i, j] = self.matrix[i, j] - rhsMatrix[i, j]
        return result

    def __mul__(self, rhsMatrix):
        '''Returns self * rhsMatrix; self.cols must equal rhsMatrix.numRows().'''
        assert self.cols == rhsMatrix.numRows(), \
            "Matrix sizes not compatible for multiplication."
        result = Matrix(self.rows, rhsMatrix.numCols())
        for i in range(self.rows):
            for j in range(rhsMatrix.numCols()):
                for k in range(self.cols):
                    result[i, j] += self.matrix[i, k] * rhsMatrix[k, j]
        return result

    def __str__(self):
        '''Formats one row per line, each value right-aligned in a width of 3.'''
        result = ""
        for i in range(self.rows):
            for j in range(self.cols):
                result += format(self[i, j], '3')
            result += "\n"
        return result