# **Computation II: Algorithms & Data Structures** <br/>
**Bachelor's Degree Programs in Data Science and Information Systems**<br/>
**NOVA IMS**<br/>

**NOTE:** Adapted from Prof. Dr. Illya Bakurov's class materials.


## References
1. Data Structures and Algorithms with Python (2015), by K. D. Lee and S. Hubbard
2. Data Structures and Algorithms using Python (2011), by R. D. Necaise. John Wiley & Sons, Inc.
3. [Python's official documentation: Design and History FAQ](https://docs.python.org/3.7/faq/design.html#how-are-lists-implemented-in-cpython)
4. [NumPy's official documentation: ``numpy.append()``](https://numpy.org/doc/stable/reference/generated/numpy.append.html)
5. [NumPy's official documentation: ``numpy.ndarray``](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html)

Imports ``numpy`` to generate random values and manipulate arrays.

In [1]:
import sys
import timeit
import random
import numpy as np

# 1. Arrays
"*A one-dimensional array (...) is composed of multiple sequential elements stored in contiguous bytes of memory and allows for random access to the individual elements*" [2]. In other words, this data structure maps directly onto the way memory is organized at the hardware level. In visual terms:

<center><img src="https://beginnersbook.com/wp-content/uploads/2018/10/array.jpg" width=400/></center>

Individual elements within the array can be accessed directly by using an integer index value, which indicates an offset from the start of the array. When creating an array, the user must know the maximum number of elements a priori. Because the size is fixed, adding an element (e.g., appending) or removing one implies creating a copy of the source array with that element added or removed. This makes the array best suited for problems in which the maximum number of elements is known in advance. Moreover:
- "An array object represents a multidimensional, homogeneous array of fixed-size items." [5]
- NumPy arrays support element-wise arithmetic operations directly, without explicit Python loops
- Working with ``numpy.ndarray`` requires installing and importing the NumPy library
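
As a small illustration of the copy-on-append behaviour described above, the sketch below uses ``numpy.append()``, which returns a new array instead of modifying its input. The variable names are illustrative only.

```python
import numpy as np

# A fixed-size, homogeneous array of four integers
source = np.array([1, 2, 3, 4])

# numpy.append() does not modify `source`; it allocates and returns a new array
extended = np.append(source, 5)

print(source.size)                         # 4: the original array is unchanged
print(extended.size)                       # 5: the copy holds the extra element
print(np.shares_memory(source, extended))  # False: the data were copied

# Arithmetic is applied element-wise, without an explicit Python loop
print(source * 2)                          # [2 4 6 8]
```

Since every call to ``numpy.append()`` copies the whole array, calling it repeatedly inside a loop costs $O(n)$ per call.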

# 2. Lists
Technically, a Python ``list`` is also backed by a contiguous block of memory, similarly to an array; that block stores references to the list's items. In fact, the ``list`` type is implemented as a dynamic array in CPython (the reference and most widely used implementation of the Python interpreter). In visual terms:

<center><img src="https://www.interviewcake.com/images/svgs/dynamic_array__preview.svg?bust=210" width=400/></center>


Following the official documentation [3]:

"*CPython's lists are really variable-length arrays (...). The implementation uses a contiguous array (...). This makes indexing a list ``a[i]`` an operation whose cost is independent of the size of the list or the value of the index. When items are appended or inserted, the array of references is resized. Some cleverness is applied to improve the performance of appending (...) some extra space is allocated so the next few times don’t require an actual resize.*"

Specifically: "*(...) a list contains more storage space than is needed to store the items currently in the list. This extra space, the size of which can be up to twice the necessary capacity, allows for quick and easy expansion as new items are added to the list.*" [2].

In this sense, when compared to arrays, lists are more useful when the size of the sequence needs to change frequently after its creation.
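
The over-allocation described in the quotes above can be observed, at least on CPython, with ``sys.getsizeof()``. The sketch below records the list length each time the reported size changes; ``track_list_growth`` is a hypothetical helper written for this illustration, and the exact byte counts vary across Python versions and platforms.

```python
import sys

def track_list_growth(n_appends):
    """Return (length, size_in_bytes) pairs recorded whenever the list's size changes."""
    lst = []
    last_size = sys.getsizeof(lst)
    changes = [(0, last_size)]
    for i in range(n_appends):
        lst.append(i)
        size = sys.getsizeof(lst)
        # The reported size only changes when the underlying array is resized
        if size != last_size:
            changes.append((len(lst), size))
            last_size = size
    return changes

# Most appends reuse the spare capacity, so only a few lengths trigger a resize
for length, size in track_list_growth(32):
    print(f"length={length:2d} size={size} bytes")
```

The number of printed rows is much smaller than the number of appends, which is what makes appending cheap on average.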

# 3. Linked list
Following [1]:

"*If a programmer wants to insert a large number of items towards the beginning of a list, a different organization for a sequence might be better suited to their needs. A linked list is an organization of a list where each item in the list is in a separate node. Linked lists look like the links in a chain. Each link is attached to the next link by a reference that points to the next link in the chain. When working with a linked list, each link in the chain is called a Node. Each node consists of two pieces of information, an item, which is the data associated with the node, and a link to the next node in the linked list, often called next.*"

In visual terms:

<center><img src="https://static.javatpoint.com/ds/images/ds-linked-list.png" width=400/></center>

Creates a class ``Node``.

In [1]:
class Node:
    def __init__(self, data, link=None):
        self.data = data
        self.link = link

In [2]:
my_node = Node("a")
print(my_node.data)
print(my_node.link)

a
None


Tests the ``Node`` type.

In [3]:
node_object = Node({"anything": 123, "you": 987, "want": 0})
print("Data:", node_object.data)                   
print("Link to the next node:", node_object.link)

Data: {'anything': 123, 'you': 987, 'want': 0}
Link to the next node: None


Creates a class called ``LinkedList``.

In [4]:
class LinkedList:
    def __init__(self, data):        
        self.head = Node(data)
    
    def print(self):
        temp = self.head
        while temp:
            print(temp.data, end=' ')
            temp = temp.link

In [5]:
llist = LinkedList("Hello")
print(llist.head)
print(llist.head.data)
print(llist.head.link)

<__main__.Node object at 0x0000020274ECB410>
Hello
None


Tests ``LinkedList`` and ``print()``.

In [6]:
llist = LinkedList("Hello")
node1 = Node("world")
node2 = Node(",")
node3 = Node("students!")
llist.print()

Hello 

Testing ``print()`` after connecting the nodes.

In [8]:
llist.head.link = node1
node1.link = node2
node2.link = node3
llist.print()

Hello world , students! 

## 3.1. Initialize from a sequence

In [9]:
class LinkedList:
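    def __init__(self, data=None):
        # Compare against None so that falsy values (0, "", False) are still stored
        self.head = Node(data) if data is not None else None
    
    def init_from_list(self, lst):
        # An empty sequence yields an empty linked list
        if len(lst) == 0:
            self.head = None
            return
        self.head = Node(lst[0])
        temp = self.head
        for data in lst[1:]:
            temp.link = Node(data)
            temp = temp.link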
    
    def print(self):
        temp = self.head
        while temp:
            print(temp.data, end=" ")
            temp = temp.link  

Tests ``init_from_list()``.

In [10]:
llist = LinkedList()
llist.init_from_list(["a", "b", "c", "d"])
llist.head.data

'a'

## 3.2. Searching
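The search operation requires $O(n)$ time in the worst case, which happens when the target item is not in the list or is in the last position. The ``search()`` method below returns the index of the first match and implicitly returns ``None`` when the value is absent.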

In [11]:
class LinkedList:
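    def __init__(self, data=None):
        # Compare against None so that falsy values (0, "", False) are still stored
        self.head = Node(data) if data is not None else None
    
    def init_from_list(self, lst):
        # An empty sequence yields an empty linked list
        if len(lst) == 0:
            self.head = None
            return
        self.head = Node(lst[0])
        temp = self.head
        for data in lst[1:]:
            temp.link = Node(data)
            temp = temp.link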
    
    def print(self):
        temp = self.head
        while temp:
            print(temp.data, end=" ")
            temp = temp.link  
            
    def search(self,value):
        index = 0
        temp = self.head
        while temp:
            if temp.data == value:
                return index
            temp = temp.link
            index += 1 

Tests the ``search()`` method.

In [12]:
llist = LinkedList()
llist.init_from_list(["a", "b", "c", "d"])
llist.search("c")

2
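
To make the best and worst cases concrete, the following sketch counts how many nodes are visited during a search. It is a standalone illustration: ``build_chain`` and ``search_with_steps`` are hypothetical helpers, not methods of the class above, and ``Node`` is repeated so the block runs on its own.

```python
class Node:
    def __init__(self, data, link=None):
        self.data = data
        self.link = link

def build_chain(values):
    """Link the given values into a chain of nodes and return the head."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head

def search_with_steps(head, value):
    """Return (index, visited); index is None when the value is absent."""
    visited = 0
    index = 0
    current = head
    while current:
        visited += 1
        if current.data == value:
            return index, visited
        current = current.link
        index += 1
    return None, visited

head = build_chain(["a", "b", "c", "d"])
print(search_with_steps(head, "a"))  # (0, 1): best case, found at the head
print(search_with_steps(head, "d"))  # (3, 4): last position, every node visited
print(search_with_steps(head, "z"))  # (None, 4): absent, every node visited
```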

## 3.3. Adding a new value
When working with an unordered sequence, new data values can be added anywhere. Given that a linked list maintains a reference to its head, if one does not care where the new data goes, one can simply add a new node at the head (aka *push*) with little effort: this requires just $O(1)$ time. Consider the following visualization:

<center><img src="https://www.alphacodingskills.com/imgfiles/linked-list-add-node-at-start.PNG" width=400/></center>

If one is interested in appending a new node at the end of the linked list, one can either:
1. traverse the whole list to reach the tail, and then append a new node. This requires $O(n) + O(1) = O(n)$ time.
1. maintain a pointer to the tail and append in $O(1)$ time.

Consider the following visualization:

<center><img src="https://www.alphacodingskills.com/imgfiles/linked-list-add-node-at-end.PNG" width=400/></center>


Consider the following implementation of ``push()`` and ``append()``. The latter is implemented in $O(n)$ time.

In [13]:
class LinkedList:
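    def __init__(self, data=None):
        # Compare against None so that falsy values (0, "", False) are still stored
        self.head = Node(data) if data is not None else None
    
    def init_from_list(self, lst):
        # An empty sequence yields an empty linked list
        if len(lst) == 0:
            self.head = None
            return
        self.head = Node(lst[0])
        temp = self.head
        for data in lst[1:]:
            temp.link = Node(data)
            temp = temp.link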
    
    def print(self):
        temp = self.head
        while temp:
            print(temp.data, end=" ")
            temp = temp.link  
            
    def search(self,value):
        index = 0
        temp = self.head
        while temp:
            if temp.data == value:
                return index
            temp = temp.link
            index += 1 
            
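    def push(self, data):
        # The new node becomes the head and links to the former head: O(1)
        new_node = Node(data)
        new_node.link = self.head
        self.head = new_node
        
    def append(self, data):
        # Appending to an empty list is the same as pushing
        if self.head is None:
            self.head = Node(data)
            return
        # Walk to the last node (O(n)), then attach the new node
        current = self.head
        while current.link:
            current = current.link
        current.link = Node(data)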

Tests the ``push()`` and ``append()`` methods.

In [14]:
llist = LinkedList()
llist.init_from_list(["a", "b", "c", "d"])
llist.push("x")
llist.append("y")
llist.print()

x a b c d y 

Adds a tail to the linked list and modifies the ``append()`` function to render it $O(1)$.

In [15]:
class LinkedList:
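    def __init__(self, data=None):
        # Compare against None so that falsy values (0, "", False) are still stored
        self.head = Node(data) if data is not None else None
        # With at most one node, the tail coincides with the head
        self.tail = self.head
    
    def init_from_list(self, lst):
        # An empty sequence yields an empty linked list
        if len(lst) == 0:
            self.head = self.tail = None
            return
        self.head = Node(lst[0])
        temp = self.head
        for data in lst[1:]:
            temp.link = Node(data)
            temp = temp.link
        # The last node created is the tail
        self.tail = temp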
    
    def print(self):
        temp = self.head
        while temp:
            print(temp.data, end=" ")
            temp = temp.link  
            
    def search(self,value):
        index = 0
        temp = self.head
        while temp:
            if temp.data == value:
                return index
            temp = temp.link
            index += 1 
            
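    def push(self, data):
        new_node = Node(data)
        new_node.link = self.head
        self.head = new_node
        # If the list was empty, the new node is also the tail
        if new_node.link is None:
            self.tail = new_node
        
    def append(self, data):
        new_node = Node(data)
        if self.head is None:
            # Empty list: the new node is both head and tail
            self.head = new_node
        else:
            self.tail.link = new_node
        self.tail = new_node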

Tests the new ``append()`` method.

In [16]:
llist = LinkedList()
llist.init_from_list(["a", "b", "c", "d"])
llist.append("y")
llist.print()

a b c d y 
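
The difference between the two appending strategies can be measured with ``timeit``. The sketch below is a standalone illustration rather than part of the class above: ``build_by_traversal``, ``build_with_tail``, and ``to_list`` are hypothetical helpers, and the list size ``n = 1000`` is an arbitrary choice.

```python
import timeit

class Node:
    def __init__(self, data, link=None):
        self.data = data
        self.link = link

def build_by_traversal(n):
    """Append n values, walking from the head to the end each time: O(n) per append."""
    head = Node(0)
    for value in range(1, n):
        current = head
        while current.link:
            current = current.link
        current.link = Node(value)
    return head

def build_with_tail(n):
    """Append n values through a tail reference: O(1) per append."""
    head = tail = Node(0)
    for value in range(1, n):
        tail.link = Node(value)
        tail = tail.link
    return head

def to_list(head):
    """Collect the node data into a Python list for inspection."""
    values = []
    while head:
        values.append(head.data)
        head = head.link
    return values

# Both strategies produce the same sequence
print(to_list(build_by_traversal(5)))  # [0, 1, 2, 3, 4]
print(to_list(build_with_tail(5)))     # [0, 1, 2, 3, 4]

# Timings depend on the machine, so no expected values are shown
n = 1000
print("traversal:", timeit.timeit(lambda: build_by_traversal(n), number=3))
print("tail:     ", timeit.timeit(lambda: build_with_tail(n), number=3))
```

Building the whole list by traversal takes $O(n^2)$ time overall, while the tail-based version takes $O(n)$, so the gap widens as ``n`` grows.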

## 3.4. Removing elements
An item can be removed from a linked list by unlinking the node containing that item. Removing the head requires $O(1)$ time. Removing the tail of a singly linked list requires $O(n)$ time even when the tail is being tracked, because the node preceding the tail must first be found so that it can become the new tail. Likewise, if one is interested in removing some node with a given value, one needs to find that value in the linked list and then remove it: $O(n) + O(1) = O(n)$.


Consider the following visualization:

<center><img src="https://www.alphacodingskills.com/imgfiles/linked-list-delete-first-node.PNG" width=400/></center>

In [17]:
class LinkedList:
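    def __init__(self, data=None):
        # Compare against None so that falsy values (0, "", False) are still stored
        self.head = Node(data) if data is not None else None
        # With at most one node, the tail coincides with the head
        self.tail = self.head
    
    def init_from_list(self, lst):
        # An empty sequence yields an empty linked list
        if len(lst) == 0:
            self.head = self.tail = None
            return
        self.head = Node(lst[0])
        temp = self.head
        for data in lst[1:]:
            temp.link = Node(data)
            temp = temp.link
        # The last node created is the tail
        self.tail = temp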
    
    def print(self):
        temp = self.head
        while temp:
            print(temp.data, end=" ")
            temp = temp.link  
            
    def search(self,value):
        index = 0
        temp = self.head
        while temp:
            if temp.data == value:
                return index
            temp = temp.link
            index += 1 
            
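    def push(self, data):
        new_node = Node(data)
        new_node.link = self.head
        self.head = new_node
        # If the list was empty, the new node is also the tail
        if new_node.link is None:
            self.tail = new_node
        
    def append(self, data):
        new_node = Node(data)
        if self.head is None:
            # Empty list: the new node is both head and tail
            self.head = new_node
        else:
            self.tail.link = new_node
        self.tail = new_node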
    
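    def remove_value(self, value):
        # Removes the first node whose data equals value (no effect if absent)
        previous = None
        current = self.head
        while current:
            if current.data == value:
                if previous:
                    previous.link = current.link
                else:
                    self.head = current.link
                # Removing the last node moves the tail back by one
                if current is self.tail:
                    self.tail = previous
                current.link = None
                return
            previous = current
            current = current.link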

Tests the ``remove_value()`` method.

In [18]:
llist = LinkedList()
llist.init_from_list(["a", "b", "c", "d"])
llist.remove_value("b")
llist.print()

a c d 
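
Removing the head itself is the $O(1)$ case mentioned at the start of this section. A minimal standalone sketch follows; ``pop_head`` is a hypothetical helper rather than a method of the class above, and ``Node`` is repeated so the block runs on its own.

```python
class Node:
    def __init__(self, data, link=None):
        self.data = data
        self.link = link

def pop_head(head):
    """Unlink the first node; return (removed_data, new_head)."""
    if head is None:
        raise IndexError("pop from an empty linked list")
    new_head = head.link
    head.link = None  # cut the connection to the rest of the list
    return head.data, new_head

# Build a -> b -> c, then remove the head without traversing the list
head = Node("a", Node("b", Node("c")))
removed, head = pop_head(head)
print(removed)    # a
print(head.data)  # b
```

No traversal is needed because only the head reference changes, regardless of the list's length.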

One can also remove elements by index. This operation is known as ``pop()``.

In [19]:
class LinkedList:
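    def __init__(self, data=None):
        # Compare against None so that falsy values (0, "", False) are still stored
        self.head = Node(data) if data is not None else None
        # With at most one node, the tail coincides with the head
        self.tail = self.head
    
    def init_from_list(self, lst):
        # An empty sequence yields an empty linked list
        if len(lst) == 0:
            self.head = self.tail = None
            return
        self.head = Node(lst[0])
        temp = self.head
        for data in lst[1:]:
            temp.link = Node(data)
            temp = temp.link
        # The last node created is the tail
        self.tail = temp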
    
    def print(self):
        temp = self.head
        while temp:
            print(temp.data, end=" ")
            temp = temp.link  
            
    def search(self,value):
        index = 0
        temp = self.head
        while temp:
            if temp.data == value:
                return index
            temp = temp.link
            index += 1 
            
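    def push(self, data):
        new_node = Node(data)
        new_node.link = self.head
        self.head = new_node
        # If the list was empty, the new node is also the tail
        if new_node.link is None:
            self.tail = new_node
        
    def append(self, data):
        new_node = Node(data)
        if self.head is None:
            # Empty list: the new node is both head and tail
            self.head = new_node
        else:
            self.tail.link = new_node
        self.tail = new_node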
    
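    def remove_value(self, value):
        # Removes the first node whose data equals value (no effect if absent)
        previous = None
        current = self.head
        while current:
            if current.data == value:
                if previous:
                    previous.link = current.link
                else:
                    self.head = current.link
                # Removing the last node moves the tail back by one
                if current is self.tail:
                    self.tail = previous
                current.link = None
                return
            previous = current
            current = current.link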
        
    def pop(self, index):
        # Removes the node at the given index and returns its data
        previous = None
        current = self.head
        index_count = 0
        while current:
            if index_count == index:
                if previous:
                    previous.link = current.link
                else:
                    # erasing the head: assign the new head
                    self.head = current.link
                # erasing the tail: the previous node becomes the tail
                if current is self.tail:
                    self.tail = previous
                # cut the connection
                current.link = None
                return current.data
            previous = current
            current = current.link
            index_count += 1
        # No node at that position (empty list or index too large)
        raise IndexError("pop index out of range")

Tests ``pop()``.

In [20]:
llist = LinkedList()
llist.init_from_list(["a", "b", "c", "d"])
llist.pop(1)
llist.print()

a c d 