In [9]:
#include <iostream>
#include <iomanip>
using namespace std;
cout << boolalpha; 

# Arrays

## Memory Allocation

- Memory allocation can be classified as either
    - Contiguous
    - Linked
    - Indexed

- Prototypical examples:
    - Contiguous allocation: arrays
    - Linked allocation: linked lists

## Contiguous Allocation

- An array stores *n* objects in a single contiguous space of memory
- Unfortunately, if more memory is required, a request for new memory usually requires copying all information into the new memory
    - In general, you cannot ask the operating system to extend an existing block with the next *n* memory locations

## Linked Allocation

- Linked storage such as a linked list associates two pieces of data with each item being stored:
    - The object itself, and
    - A reference to the next item
        - In C++ that reference is the address of the next node
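
As a minimal sketch of linked allocation, assume a hypothetical `Node` struct holding an `int` (the type and the helper `value_at` are illustrative names, not part of the course code). Reaching the `k`-th item means following `k` links, since the nodes need not be adjacent in memory:

```cpp
#include <cassert>

// Hypothetical node: the object itself plus the address of the next node.
struct Node {
    int value;   // the object being stored
    Node* next;  // address of the next node, or nullptr at the end of the list
};

// Return the value stored k links after head (k = 0 is the head itself).
int value_at(const Node* head, int k) {
    const Node* p = head;
    for (int step = 0; step < k; ++step) {
        assert(p != nullptr);  // caller must keep k within the list
        p = p->next;
    }
    assert(p != nullptr);
    return p->value;
}
```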

## Indexed Allocation

- With indexed allocation, an array of pointers (possibly NULL) links to a sequence of allocated memory locations

- Used in the C++ Standard Template Library (for example, `std::deque` is commonly implemented this way)

- Matrices can be implemented using indexed allocation:
    - Most implementations of matrices (or higher-dimensional arrays) use indices pointing into a single contiguous block of memory

## Other Allocation Formats

- We will look at some variations or hybrids of these memory allocations including:
    - Trees
    - Graphs
    - Deques (linked arrays)
    - inodes

## Linear List

- A *linear list* is a data object whose instances are of the form ($e_1,e_2,\ldots,e_n$)
    - $e_i$ is an element of the list
    - $e_1$ is the first element, and $e_n$ is the last element
    - $n$ is the length of the list
    - when $n = 0$, it is called an empty list.
    - $e_1$ comes before $e_2$, $e_2$ comes before $e_3$, and so on.
- Examples
    - student names in alphabetical order
    - a list of exam scores sorted in descending order

## Implementations of Linear List

- Array-based (Formula-based)
    - Uses a mathematical formula to determine where (i.e., the memory address) to store each element of a list
- Linked list (Pointer-based)
    - The elements of a list may be stored in any arbitrary set of locations
    - Each element has an explicit pointer (or link) to the next element
- Indirect addressing
    - The elements of a list may be stored in any arbitrary set of locations
    - Maintain a table such that the ith table entry tells us where the ith element is stored
- Simulated pointer
    - Similar to linked representation but integers replace the C++ pointers
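
The simulated-pointer idea can be sketched as follows. This is only an illustration under assumed names (`SimNode`, `traverse`, and the use of `-1` as the "null" link are choices made here, not taken from the course code): the nodes live in an ordinary array and each link is an integer index into that array.

```cpp
#include <cassert>

// Hypothetical node for a simulated-pointer list: next is an array index, not a C++ pointer.
struct SimNode {
    char element;
    int next;  // index of the next node in the pool, or -1 for "no next node"
};

// Copy the elements into out in list order by following the integer links.
// out must have room for every element in the chain; returns the number copied.
int traverse(const SimNode pool[], int first, char out[]) {
    int count = 0;
    for (int p = first; p != -1; p = pool[p].next) {
        assert(p >= 0);  // any other negative index would be a corrupted link
        out[count++] = pool[p].element;
    }
    return count;
}
```

The physical order of the pool does not have to match the logical order of the list; only the links determine the sequence.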

## ArrayList

- The **Vector** or **Array List ADT** extends the notion of array by storing a sequence of objects
- It uses an array to store the elements of linear list.
- An element can be accessed, inserted or removed by specifying its index (number of elements preceding it)
- An exception is thrown if an incorrect index is given (e.g., a negative index)
- Each element is located in the array using a mathematical formula.
    - typical formula: **location(i) = i - 1**, that is, list element $e_i$ (numbered from 1) is stored at array position $i - 1$

## ArrayList Interface

- Main methods:
    - `at(integer i)`: returns the element at index `i` without removing it
    - `set(integer i, object o)`: replace the element at index `i` with `o`
    - `insert(integer i, object o)`: insert a new element `o` to have index `i`
    - `erase(integer i)`: removes element at index `i`

- Additional methods:
    - `size()`
    - `max_size()`
    - `empty()`

## Applications of Array Lists

- Direct applications
    - Sorted collection of objects (elementary database)
- Indirect applications
    - Auxiliary data structure for algorithms
    - Component of other data structures

## Array-based Implementation

- Use a fixed-capacity array `A` of size `N`
- A variable `n` keeps track of the size of the array list (number of elements stored)
- Operation `at(i)` is implemented in $O(1)$ time by returning `A[i]`
- Operation `set(i,o)` is implemented in $O(1)$ time by performing `A[i] = o`

![](img/arraylist.png)

In [None]:
template <typename T>
class ArrayList {
protected:
    int n;       // current size
    int N;       // maximum allowed size
    T* elements; // storage for the objects
public:
    ArrayList(int size = 10) : n{0}, N{size} {
        elements = new T[N];
    }
    ~ArrayList() {
        delete[] elements;
    }
    // Capacity
    bool empty() const;
    int size() const;
    int max_size() const;
    // Element access
    T at(int i) const;
    void set(int i, T o);
    // Modifiers
    void insert(int i, T o);
    void erase(int i);
};
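
The capacity and element-access members could be defined along the following lines. This is a sketch, not the contents of the course header: `MiniArrayList` is a hypothetical stand-in for the class above, `push_back` is an assumed helper added only so the sketch can be exercised without `insert`, and the choice of `std::out_of_range` for a bad index is an assumption.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the ArrayList declared above.
template <typename T>
class MiniArrayList {
    int n;        // current size
    int N;        // maximum allowed size
    T* elements;  // storage for the objects
public:
    explicit MiniArrayList(int size = 10) : n{0}, N{size}, elements{new T[size]} {}
    ~MiniArrayList() { delete[] elements; }
    // Copying would make two objects delete the same storage, so it is disabled here.
    MiniArrayList(const MiniArrayList&) = delete;
    MiniArrayList& operator=(const MiniArrayList&) = delete;

    bool empty() const { return n == 0; }
    int size() const { return n; }
    int max_size() const { return N; }

    // O(1): one bounds check, then direct indexing.
    T at(int i) const {
        if (i < 0 || i >= n) throw std::out_of_range("index out of range");
        return elements[i];
    }
    void set(int i, T o) {
        if (i < 0 || i >= n) throw std::out_of_range("index out of range");
        elements[i] = o;
    }
    // Assumed helper (not in the declaration above): append at the end.
    void push_back(T o) {
        if (n == N) throw std::length_error("array list is full");
        elements[n++] = o;
        assert(n <= N);
    }
};
```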

## Insertion

- In operation `insert(i, o)`, we need to make room for the new element in the $i$th position
    - by shifting forward the $n-i$ elements $A[i], \ldots, A[n - 1]$
- In the worst case $(i = 0)$, this takes $O(n)$ time

![](../img/arraylist-insert.png)
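
The shifting step can be sketched as a free function over a raw array. The name `insert_at` and its parameter list are assumptions made for this illustration; the member function in the course header may be organized differently.

```cpp
#include <cassert>
#include <stdexcept>

// Insert o at index i of array A, which holds n elements and has capacity N.
template <typename T>
void insert_at(T A[], int& n, int N, int i, const T& o) {
    if (i < 0 || i > n) throw std::out_of_range("index out of range");
    if (n == N) throw std::length_error("array list is full");
    // Shift A[i..n-1] one slot toward the end, starting from the last element
    // so that nothing is overwritten before it has been moved.
    for (int k = n - 1; k >= i; --k)
        A[k + 1] = A[k];
    A[i] = o;
    ++n;
    assert(n <= N);
}
```

The loop runs $n - i$ times, which is where the $O(n)$ worst case for $i = 0$ comes from.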

In [2]:
#include "../src/ArrayList.h"

ArrayList<int> al;
cout << "Current size: " << al.size() << endl;
cout << "Max size: " << al.max_size() << endl;
cout << "Empty: " << al.empty() << endl;

Current size: 0
Max size: 10
Empty: true


In [3]:
al.insert(0, 1);
cout << "ArrayList: "; al.print(); cout << endl;

ArrayList: 1


In [4]:
al.insert(0, 2);
cout << "ArrayList: "; al.print(); cout << endl;

ArrayList: 2-1


In [5]:
al.insert(1, 3);
cout << "ArrayList: "; al.print(); cout << endl;
cout << "Current size: " << al.size() << endl;

ArrayList: 2-3-1
Current size: 3


## Element Removal

- In operation `erase(i)`, we need to fill the hole left by the removed element
    - by shifting backward the $n - i - 1$ elements $A[i + 1], \ldots, A[n - 1]$.
- In the worst case $(i = 0)$, this takes $O(n)$ time

![](../img/arraylist-erase.png)
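
Removal can be sketched the same way. As before, `erase_at` is a hypothetical free function used for illustration, not the member function from the course header.

```cpp
#include <cassert>
#include <stdexcept>

// Remove the element at index i of array A, which holds n elements.
template <typename T>
void erase_at(T A[], int& n, int i) {
    if (i < 0 || i >= n) throw std::out_of_range("index out of range");
    // Shift A[i+1..n-1] one slot toward the front to close the hole.
    for (int k = i + 1; k < n; ++k)
        A[k - 1] = A[k];
    --n;
    assert(n >= 0);
}
```

Here the loop runs $n - i - 1$ times, so erasing the last element costs no shifts at all.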

In [6]:
al.insert(2, 4);
cout << "ArrayList: "; al.print(); cout << endl;
cout << "Current size: " << al.size() << endl;

ArrayList: 2-3-4-1
Current size: 4


In [7]:
al.erase(1);
cout << "ArrayList: "; al.print(); cout << endl;
cout << "Current size: " << al.size() << endl;

ArrayList: 2-4-1
Current size: 3


In [8]:
al.erase(0);
cout << "ArrayList: "; al.print(); cout << endl;
cout << "Current size: " << al.size() << endl;

ArrayList: 4-1
Current size: 2


## Performance

- In the array based implementation of an array list:
    - The space used by the data structure is $O(n)$ (linear)
    - `size`, `empty`, `at` and `set` run in $O(1)$ (constant) time
    - `insert` and `erase` run in $O(n)$ (linear) time in worst case
- If we use the array in a circular fashion, operations `insert(0, x)` and `erase(0)` run in $O(1)$ time
- In an `insert` operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one
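
Replacing a full array with a larger one might look like the sketch below. Several things here are assumptions rather than the course's design: the type name `GrowableInts`, the starting capacity of 2, the doubling policy, and the restriction to appending at the end to keep the example short.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical growable buffer of ints.
struct GrowableInts {
    int n = 0;                   // number of elements stored
    int N = 2;                   // current capacity
    int* elements = new int[N];  // N is initialized first, so this allocates 2 slots

    GrowableInts() = default;
    ~GrowableInts() { delete[] elements; }
    GrowableInts(const GrowableInts&) = delete;
    GrowableInts& operator=(const GrowableInts&) = delete;

    void push_back(int o) {
        if (n == N) {
            // Full: allocate a block twice as large and copy all n elements over.
            int* bigger = new int[2 * N];
            std::copy(elements, elements + n, bigger);
            delete[] elements;
            elements = bigger;
            N *= 2;
        }
        assert(n < N);
        elements[n++] = o;
    }
};
```

Each growth step copies every stored element, which is the cost of contiguous allocation noted earlier; growing by a constant factor keeps such copies infrequent.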

## Tabular Representation

- Data is often available in tabular form
- Tabular data is often represented in arrays
- A matrix is an example of tabular data and is often represented as a 2-dimensional array
    - Matrices are normally indexed beginning at 1 rather than 0
    - Matrices also support operations such as `add`, `multiply`, and `transpose`, which are NOT supported by C++’s 2D array
    
- It is possible to **reduce time and space** using a customized representation of multidimensional arrays
    - Row- and column-major mapping and representations of multidimensional arrays
    - Special matrices: diagonal, tridiagonal, triangular, symmetric, sparse
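
As one illustration of a customized representation, a diagonal matrix needs storage only for its diagonal. The class below is a sketch under assumed names (`DiagonalMatrix`, `get`, `set_diagonal`, `stored_elements`) and uses 0-based indices for simplicity:

```cpp
#include <cassert>
#include <vector>

// Hypothetical n x n diagonal matrix: off-diagonal entries are always zero,
// so only n values are stored instead of n * n.
class DiagonalMatrix {
    int n;
    std::vector<int> diag;  // diag[i] holds entry (i, i)
public:
    explicit DiagonalMatrix(int size) : n{size}, diag(size, 0) {}

    int get(int i, int j) const {
        assert(0 <= i && i < n && 0 <= j && j < n);
        return i == j ? diag[i] : 0;
    }
    void set_diagonal(int i, int value) {
        assert(0 <= i && i < n);
        diag[i] = value;
    }
    int stored_elements() const { return static_cast<int>(diag.size()); }
};
```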

## 2D Arrays

The elements of a 2-dimensional array `a` declared as:

```cpp
int a[3][4];
```

may be shown as a table

    a[0][0]     a[0][1]    a[0][2]    a[0][3]
    a[1][0]     a[1][1]    a[1][2]    a[1][3]
    a[2][0]     a[2][1]    a[2][2]    a[2][3]
    
- the table can be laid out in memory either row by row or column by column

## Jagged Arrays

- A **jagged array** is an array whose elements are **arrays**
    - The elements of a jagged array can be of different dimensions and sizes
    - A jagged array is sometimes called an **array of arrays**
    
- Represent a 2D array as a **1D array of rows**: store each of the 3 rows as its own 1D array, plus one 1D array of row pointers

```cpp
int row0[]{1, 2, 3, 4};
int row1[]{5, 6, 7, 8};
int row2[]{9, 8, 7, 6};
int* x[]{row0, row1, row2}; // array of row pointers: 4 separate 1D arrays in total
```

In [11]:
char row0[]{'a', 'b', 'c', 'd'};
char row1[]{'e', 'f', 'j', 'h'};
char row2[]{'w', 'x', 'y', 'z'};
char* x[]{row0, row1, row2};

cout << "x[1][1] = " << (x[1])[1] << endl;
cout << "x[2][3] = " << x[2][3] << endl;

x[1][1] = f
x[2][3] = z


## Array Representation in C++

- Requires contiguous memory blocks of size 3, 4, 4, and 4 for the 4 1D arrays
- That is, 1 memory block of size **number of rows** and **number of rows** blocks of size **number of columns**
- space overhead = (number of rows + 1) x 4 bytes, assuming 4 bytes of overhead per 1D array
    - overhead for 4 1D arrays = 4 * 4 bytes = 16 bytes

## Row-Major Order

    a b c d
    e f g h
    w x y z

- Convert 2D into 1D array y by collecting elements by rows
- Within a row elements are collected from left to right
- Rows are collected from top to bottom
    - y[] = {a, b, c, d, e, f, g, h, w, x, y, z}

## Locating Element x[i][j]

- assume $x$ has $r$ rows and $c$ columns
- each row has $c$ elements
- in the 1D array, $i$ rows precede row $i$
    - so $ic$ elements precede $x[i][0]$
- $x[i][j]$ is mapped to position  $ic + j$ of the 1D array

In [17]:
char y[12]; // convert to row-major order
int k = 0, r = 3, c = 4;
for (int i=0;i<r; i++)       // iterate through rows
    for (int j=0;j<c; j++) { // iterate through columns
        y[k] = x[i][j];
        k++;
    }

In [18]:
cout << "y[] = { ";
for (auto e : y)
    cout << e << ", ";
cout << "}";

y[] = { a, b, c, d, e, f, j, h, w, x, y, z, }

In [14]:
int i = 2, j = 3;
cout << "x[i][j] = " << x[i][j] << endl;
cout << "y[i*c+j] = " << y[i*c+j] << endl;

x[i][j] = z
y[i*c+j] = z


## Column-Major Order

    a b c d
    e f g h
    w x y z

- Convert 2D into 1D array y by collecting elements by columns
- Within a column elements are collected from top to bottom
- Columns are collected from left to right
    - y[] = {a, e, w, b, f, x, c, g, y, d, h, z}

In [19]:
char y[12]; // convert to column-major order
int k = 0, r = 3, c = 4;
for (int j=0;j<c; j++)       // iterate through columns
    for (int i=0;i<r; i++) { // iterate through rows    
        y[k] = x[i][j];
        k++;
    }

In [20]:
cout << "y[] = { ";
for (auto e : y)
    cout << e << ", ";
cout << "}";

y[] = { a, e, w, b, f, x, c, j, y, d, h, z, }

In [21]:
int i = 1, j = 3;
cout << "x[i][j] = " << x[i][j] << endl;
cout << "y[i+r*j] = " << y[i+r*j] << endl;

x[i][j] = h
y[i+r*j] = h


## Row- and Column-Major Mappings

- Row-major order mapping functions (where $c_k$ is the size of the $k$th dimension)
    - for 2D arrays: $map(i_1,i_2) = i_1 c_2+i_2$
    - for 3D arrays: $map(i_1,i_2,i_3) = i_1 c_2 c_3 + i_2 c_3 + i_3$

- What is the mapping function for the following 2D array?
```
    1  2  3  4  5  6
    7  8  9 10 11 12
   13 14 15 16 17 18
```

- Answer: $map(i_1,i_2) = 6i_1+i_2$, so $map(2,3) = ?$
- Column-major order mapping functions
    - do this as an exercise
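
The row-major formulas translate directly into code. The function names `row_major_2d` and `row_major_3d` below are chosen for this sketch only:

```cpp
#include <cassert>

// Row-major position of element (i1, i2) in a 2D array with c2 columns.
int row_major_2d(int i1, int i2, int c2) {
    assert(0 <= i2 && i2 < c2);
    return i1 * c2 + i2;
}

// Row-major position of element (i1, i2, i3) in a 3D array whose
// second and third dimensions have sizes c2 and c3.
int row_major_3d(int i1, int i2, int i3, int c2, int c3) {
    assert(0 <= i2 && i2 < c2 && 0 <= i3 && i3 < c3);
    return i1 * c2 * c3 + i2 * c3 + i3;
}
```

For the 3 x 6 array above, `row_major_2d(1, 4, 6)` gives position 10, which holds the value 11 because the entries start at 1 while positions start at 0. Note that the size of the first dimension never appears in either formula.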

## Dynamic Allocation of Matrices

- If the dimensions of a two-dimensional array are not known in advance, it is necessary to allocate the array dynamically.
- The matrix is an array of row pointers. Since each row pointer is of type `int*`, the matrix is of type `int**`, that is, a pointer to a pointer to `int`.

In [2]:
int n = 3, m = 4;
int** M = new int*[n]; // allocate an array of row pointers
for (int i = 0; i < n; i++) {
    M[i] = new int[m]; // allocate the i-th row
    for (int j = 0; j < m; j++)
        M[i][j] = i*10+j; // initialize with some data (optional)
}

## Dynamic Allocation of Matrices (cont.)
- Once allocated, we can access its elements just as before, for example, as `M[i][j]`

In [3]:
cout << "M[2][3] = " << M[2][3] << endl;

M[2][3] = 23


Deallocating the matrix involves reversing these steps
- First, we deallocate each of the rows, one by one
- Then deallocate the array of row pointers
- Since each of these is an array, we use `delete[]` rather than `delete`

In [4]:
for (int i = 0; i < n; i++)
    delete[] M[i]; // delete the i-th row
delete[] M;        // delete the array of row pointers
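
An alternative that avoids the manual cleanup is to keep the whole matrix in one contiguous block and apply the row-major mapping on each access. The `Matrix` class below is a sketch under assumed names, using `std::vector` so that the storage is released automatically; it is not the representation used in the cells above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical matrix wrapper: rows * cols ints in a single contiguous block.
class Matrix {
    int rows, cols;
    std::vector<int> data;  // freed automatically when the Matrix goes out of scope
public:
    Matrix(int r, int c)
        : rows{r}, cols{c}, data(static_cast<std::size_t>(r) * c, 0) {}

    // Element (i, j) lives at row-major position i * cols + j.
    int& operator()(int i, int j) {
        assert(0 <= i && i < rows && 0 <= j && j < cols);
        return data[static_cast<std::size_t>(i) * cols + j];
    }
};
```

Elements are then accessed as `M(i, j)` instead of `M[i][j]`, and there is a single allocation instead of one per row.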