# SE_02 Portfolio

Created By: Domenico Di Ruocco

# Table of Contents

- [Introduction to Theory](#introductionT)
    - [Time Complexity](#timeComplexity)
    - [Space Complexity](#spaceComplexity)
    - [Asymptotic Notation](#asymptoticNotation)
        - [Big O](#bigO)
        - [Big Ω](#bigOmega)
        - [Big Θ](#bigTheta)
        - [Working with the Asymptotic Notation](#workingWithNotation)
    - [Order of Dominance in the Asymptotic Limit](#orderOfDominance)
- [Introduction to Data Structures](#introductionDS)
    - [Arrays](#arrays)
    - [Linked Lists](#linkedLists)
    - [Stacks](#stacks)
    - [Queues](#queues)
    - [Hash Tables](#hashTables)
    - [Trees](#trees)
        - [Binary Trees](#binaryTrees)
            - [Binary Search Trees](#binarySearchTrees)
            - [Red Black Trees](#redBlackTrees)
            - [Ropes](#ropes)
        - [Heaps](#heaps)
    - [Graphs](#graphs)
- [Sources](#sources)

<div id="introductionT"></div>

# Introduction To Theory

In Computer Science, an algorithm is a set of instructions that must be followed in a fixed order to calculate an answer to a mathematical problem. Since it is common to find more than one algorithm that has been developed to solve the same problem, we need a way to analyze and compare them.

<div id="timeComplexity"></div>

## Time Complexity

One way to compare two algorithms that solve the same problem, assuming that the solutions provided by both are correct, is to compare the time it takes each of them to get to the solution. The problem with this method is that it depends on the hardware where the algorithm runs. For this reason, in machine-independent algorithm design we consider our algorithm to be running on a hypothetical machine called the "Random Access Machine" or RAM.

On the RAM, we consider each simple operation (+, *, -, =, if, call) and each memory access to take one time step, while loops and subroutines are considered to be the composition of many single-step operations.

<div id="spaceComplexity"></div>

## Space Complexity
Another way to compare two algorithms is to compare the total space it takes them to get to the solution. The total space includes the size of the input and the auxiliary space, which is the extra or temporary space used by the algorithm.

<div id="asymptoticNotation"></div>

## Asymptotic Notation

We can use the RAM model to determine the number of steps it will take an algorithm to finish with an input we choose, but estimating the worst, average, and best case runtime scenarios with the RAM model can be inconvenient.

```python
def even_numbers_avg(array):
    '''
    This function returns either the average of the even numbers
    in the array, or None if there are none.
    '''
    even_sum = 0                    #1 time step
    even_count = 0                  #1 time step
                                    #n times:
    for n in array:                    #1 time step
        if n % 2 == 0:                 #2 time steps (% and ==)
            even_sum += n                  #1 time step
            even_count += 1                #1 time step

    if even_count > 0:              #1 time step
        return even_sum/even_count     #1 time step
    else:                           #1 time step
        return None                    #1 time step
```

In the example above, we can try to generalize the time complexity of this algorithm by counting every step, and we will find that in the worst case its time complexity will be: $T(n) = 5n + 6$. Its space complexity will be the size of the array n, plus the two variables we initialize.

The problem with the notation we used above is that it is difficult to work precisely with it. In the example above we can see that both return statements have been counted, as well as every step in the for loop, which is correct for the worst case but not for the average and best case scenarios.

Since we can approximate an algorithm to a mathematical function, we can also determine its growth as a function of the input and define an upper bound function (Big O), a lower bound function (Big Ω), or both (Big Θ) in order to understand how it grows.

In Asymptotic Notation, we only consider the fastest growing term without any multiplicative constant. For example, $T(n) = 5n + 6$ from the example above is $O(n)$ and not $O(5n)$.

<div id="bigO"></div>

### Big O

The Big O of a function is its asymptotic upper bound. This means that, for large inputs, the running time $T$ grows no faster than $f$, up to a constant factor. To generalize, we can say that a function $T(n)$ is $O(f(n))$ if there is a constant $k > 0$ such that $T(n) ≤ k·f(n)$ for large enough $n$.

<img src="pics/Big O.png" alt="Big O"/>

<div id="bigOmega"></div>

### Big Ω

The Big Ω of a function is its asymptotic lower bound. This means that, for large inputs, the running time $T$ grows at least as fast as $f$, up to a constant factor. To generalize, we can say that a function $T(n)$ is $Ω(f(n))$ if there is a constant $k > 0$ such that $T(n) ≥ k·f(n)$ for large enough $n$.

<img src="pics/Big Omega.png" alt="Big Omega"/>

<div id="bigTheta"></div>

### Big Θ

The Big Θ of a function is its asymptotic tight bound. This means that the function always runs in a time comprised between the run times of the two asymptotic bounds. To generalize, we can say that a function $T(n)$ is $Θ(f(n))$ if there are two constants $k_{1}$ and $k_{2}$ such that $T(n) ≥ k_{1}·f(n)$ and $T(n) ≤ k_{2}·f(n)$ for large enough $n$.
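
As a rough numerical illustration (not a proof), the sketch below checks these two inequalities for the step count $T(n) = 5n + 6$ from the earlier example, with $f(n) = n$. The constants $k_{1} = 5$ and $k_{2} = 6$, the threshold $n ≥ 6$ and the function names are choices made here for illustration only.

```python
def T(n):
    """Step count of the earlier example: T(n) = 5n + 6."""
    return 5 * n + 6


def f(n):
    """Candidate bounding function f(n) = n."""
    return n


def within_theta_bounds(n, k1=5, k2=6):
    """Check k1*f(n) <= T(n) <= k2*f(n) for a single input size n."""
    return k1 * f(n) <= T(n) <= k2 * f(n)


# 5n <= 5n + 6 always holds; 5n + 6 <= 6n holds once n >= 6.
print(all(within_theta_bounds(n) for n in range(6, 10_000)))
# Below the threshold the upper bound fails, which the definition allows.
print(within_theta_bounds(5))
```

The check only samples a finite range of inputs, so it supports the claim that $T(n)$ is $Θ(n)$ without proving it.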

<img src="pics/Big Theta.png" alt="Big Theta"/>

<div id="workingWithNotation"></div>

### Working with the Asymptotic Notation

1. Addition

    We can sum two functions together and the result will be the dominant one:
    $f(n) + g(n) = Θ(max(f(n), g(n)))$
2. Multiplication

    Multiplying a function by a constant results in the constant being ignored. If instead we are multiplying two functions, we proceed as follows: $Θ(f(n))·Θ(g(n)) = Θ(f(n)·g(n))$
    

**The same rules also apply to Big O and Big Ω**

<div id="orderOfDominance"></div>

## Order of Dominance in the Asymptotic Limit

Let's consider some common asymptotic growths, valid for both space and time complexity:
- Constant: $O(1)$
- Logarithmic: $O(log(n))$
- Linear: $O(n)$
- Quasilinear: $O(n·log(n))$
- Quadratic: $O(n^{2})$
- Exponential: $O(2^{n})$
- Factorial: $O(n!)$

To understand the difference between some of the most common time complexities, take a look at the graph below, where the x-axis represents the size of the input of the functions and the y-axis represents the result of the functions.

<img src="pics/Comparison.png" alt="Functions Compared" width="800"/>

We can see that the dominance order of these functions is: $$n! >> 2^n >> n^2 >> n·log(n) >> n >> log(n) >> 1$$

It is also clear that an efficient algorithm can really make the difference in terms of time and space efficiency, especially as the input size grows.
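
To make the order of dominance concrete, the following sketch evaluates each growth function at a few input sizes. The sample sizes (10, 20, 30), the base-2 logarithm and all names are assumptions chosen for this illustration.

```python
import math

# Common growth functions, listed from slowest to fastest growing.
GROWTH_FUNCTIONS = [
    ("1", lambda n: 1),
    ("log(n)", lambda n: math.log2(n)),
    ("n", lambda n: n),
    ("n*log(n)", lambda n: n * math.log2(n)),
    ("n^2", lambda n: n ** 2),
    ("2^n", lambda n: 2 ** n),
    ("n!", lambda n: math.factorial(n)),
]


def growth_row(n):
    """Return the value of every growth function at input size n."""
    return [func(n) for _, func in GROWTH_FUNCTIONS]


for n in (10, 20, 30):
    row = growth_row(n)
    cells = [f"{name} = {value:.3g}" for (name, _), value in zip(GROWTH_FUNCTIONS, row)]
    print(f"n = {n}: " + ", ".join(cells))
```

For very small inputs the ordering can differ (for example, at $n = 3$ we have $n^2 = 9$ but $2^n = 8$); the dominance order only describes what happens as $n$ grows.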

<div id="introductionDS"></div>

# Introduction to Data Structures

Data Structures are constructs that allow you to store and manage data values and provide you with methods to access or manipulate such data. There are different kinds of data structures available, each with its pros and cons. It is important to understand the strengths and weaknesses of different Data Structures in order to choose the right one for your use case and make your software more efficient.

Data structures can be classified as "contiguous" or "linked". The former are based upon arrays and the latter on pointers.

<div id="arrays"></div>

## Arrays

An Array is an example of a contiguous data structure. Arrays are collections of data of fixed size allocated in contiguous memory locations, which makes accessing the data values by index really efficient.

<figure>
    <img src="pics/Array.png" alt="Example of array in memory" width="450"/>
    <figcaption>
        Image source: <a href="https://www.geeksforgeeks.org/array-data-structure">Geeks for Geeks</a>
    </figcaption>
</figure>

Analysis of common operations on arrays:

- Access:

    Since the values are indexed and the data structure is contiguous in memory, it is possible to access data in the array with a worst-case time complexity of $O(1)$ (constant time complexity);
- Search:

    The data stored in an array is not always sorted, so we need to assume that every memory slot could be searched before finding the element we are looking for, therefore the time complexity of this operation is $O(n)$ (linear time complexity);
- Insertion:

    To insert an element at a specific index we might, in the worst case scenario, need to shift all the other elements in the array, so the time complexity for this operation will be $O(n)$;
- Deletion:

    To delete an element in an array we may run into the same issue as with insertion, hence the time complexity will be, again, $O(n)$.

The space efficiency of arrays is also one of their major strengths: since arrays are only made up of pure data, no space is wasted.

Another advantage of arrays is their memory locality, which takes full advantage of the speed of cache memory.

Their main disadvantage is the impossibility of changing their size while the program is running. It is however possible to avoid this limitation using dynamic arrays, arrays whose size doubles every time they are full.

Arrays can be used for nearly everything, and other data structures are based on them as well.
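
The doubling strategy of dynamic arrays and the $O(n)$ shift on insertion can be sketched as follows. This is a minimal illustration under stated assumptions: a Python list of fixed length stands in for a fixed-size block of memory, and the class and method names are hypothetical.

```python
class DynamicArray:
    """Array that doubles its capacity when full (illustrative sketch)."""

    def __init__(self):
        self._capacity = 1
        self._size = 0
        self._slots = [None] * self._capacity  # stands in for a fixed-size block

    def __len__(self):
        return self._size

    def __getitem__(self, index):
        if not 0 <= index < self._size:
            raise IndexError("index out of range")
        return self._slots[index]  # O(1) access by index

    def append(self, value):
        if self._size == self._capacity:
            self._resize(2 * self._capacity)  # O(n) copy, but only when full
        self._slots[self._size] = value
        self._size += 1

    def insert(self, index, value):
        """Insert at index, shifting later elements right: O(n) worst case."""
        if not 0 <= index <= self._size:
            raise IndexError("index out of range")
        if self._size == self._capacity:
            self._resize(2 * self._capacity)
        for i in range(self._size, index, -1):
            self._slots[i] = self._slots[i - 1]
        self._slots[index] = value
        self._size += 1

    def _resize(self, new_capacity):
        new_slots = [None] * new_capacity
        new_slots[: self._size] = self._slots[: self._size]
        self._slots = new_slots
        self._capacity = new_capacity


numbers = DynamicArray()
for value in (10, 20, 30):
    numbers.append(value)
numbers.insert(0, 5)
print([numbers[i] for i in range(len(numbers))])  # [5, 10, 20, 30]
```

Because the capacity doubles, the expensive copy happens rarely, which is why appending is usually described as taking constant time on average.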

<div id="linkedLists"></div>

## Linked Lists

Linked lists are an example of linked data structures. They are collections of data not allocated in contiguous memory locations, in which the elements are linked using a pointer to the next value in the case of a Singly Linked List, or to both the previous and the next value in the case of a Doubly Linked List.
Linked lists are made up of "Nodes", the first of which is called the Head. Every node has a pointer to the next node (or to a null value if it is the last element), while the nodes of a Doubly Linked List also have a pointer to the previous node (or to a null value in the case of the first element).

<figure>
    <img src="pics/LinkedList.png" alt="Representation of singly linked lists" width="500"/>
    <figcaption>
        Singly Linked List. Image source: <a href="https://www.geeksforgeeks.org/data-structures/linked-list/">Geeks for Geeks</a>
    </figcaption>
</figure>

<figure>
    <img src="pics/DoublyLinkedList.png" alt="Representation of doubly linked lists" width="500"/>
    <figcaption>
        Doubly Linked List. Image source: <a href="https://www.geeksforgeeks.org/doubly-linked-list/">Geeks for Geeks</a>
    </figcaption>
</figure>

Analysis of common operations on linked lists:

- Access:

    Linked Lists (both Singly and Doubly Linked) are not indexed data structures. This means that we cannot access one value directly, but we need to start from the Head (or also from the last element if we are in a Doubly Linked List) and follow the pointers until we find the element we are looking for or we find a null value. For this reason, the time complexity of this operation is $O(n)$;
- Search:

    The same concept from the access operation applies here;
- Insertion:

    To insert a new node "B" in a Singly Linked List, between two nodes "A" and "C", we just need to make sure that node "A" points to node "B" and that node "B" points to node "C". If the list is a Doubly Linked one, we also need to make sure that the backward pointers of nodes "C" and "B" are set correctly.
    We can also add an element at the beginning of a Linked List by simply making it the new Head and setting its pointer to the previous Head; in the case of a Doubly Linked List we also need to point the backward pointer of the previous Head to the new Head. The time complexity of this operation, once you know the position where you want to insert the new element, is $O(1)$, since you only need to change the value of the pointer(s);
- Deletion:

    The same concept from the insertion operation applies here, except that instead of changing the pointer(s) to include a new element we change them to exclude one. For this reason the time complexity of this operation is also $O(1)$.

Linked lists are not really space efficient since they need to store the pointers and the extra space needed will be $O(n)$.

As we have seen, their main disadvantages are that they are slow in searching and occupy more memory than arrays. Moreover, another disadvantage is that they don't benefit from the speed of cache memory since they are not stored contiguously.

They are used where dynamic memory allocation is required, and just as arrays, other data structures are based on them.
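
The pointer updates described above can be sketched for a Singly Linked List as follows. The class and function names are hypothetical, chosen only for this illustration.

```python
class Node:
    """A node of a singly linked list."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def insert_after(node, value):
    """Insert a new node right after `node` in O(1): only pointers change."""
    node.next = Node(value, node.next)
    return node.next


def delete_after(node):
    """Unlink the node that follows `node` in O(1)."""
    if node.next is not None:
        node.next = node.next.next


def search(head, value):
    """Follow the pointers from the head until the value is found: O(n)."""
    current = head
    while current is not None and current.value != value:
        current = current.next
    return current


def to_list(head):
    """Collect the values in order, for display purposes."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


head = Node("A", Node("C"))
insert_after(head, "B")   # A -> B -> C
print(to_list(head))
delete_after(head)        # A -> C
print(to_list(head))
```

Note that both constant-time operations assume we already hold a reference to the node before the position we want to change; finding that node is the $O(n)$ part.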

<div id="stacks"></div>

## Stacks

A Stack is a linear data structure that allows two operations: insertion at the top (push), and read and removal at the top (pop). For this reason stacks follow the "Last In First Out" or "LIFO" order.

<figure>
    <img src="pics/Stack.png" alt="Representation of a stack" width="500"/>
    <figcaption>
        Stacks. Image source: <a href="https://www.geeksforgeeks.org/stack-data-structure/">Geeks for Geeks</a>
    </figcaption>
</figure>

Analysis of common operations on Stacks:
    
- Access:

    Accessing an element in a Stack entails reading and removing the top item until we reach the one we're looking for or we are left with an empty Stack. Because of this, the time complexity of this operation depends on the size of the Stack and is therefore $O(n)$;
- Search:

    The search operation in a Stack is the same as the access operation;
- Insertion:

    We can only insert an element at the top of the Stack, and for this reason this operation will have a time complexity of $O(1)$ (constant time);
- Deletion:

    Just like insertion, we can only delete the top element of a Stack and thus the time complexity of this operation will also be $O(1)$.
    

Depending on how Stacks are implemented they can be more or less space efficient.

Their main advantage is that they are easy to implement and that the insertion and deletion operations are really time efficient.

Uses of Stacks include situations in which the order in which the elements are inserted/deleted is not important, or when you specifically need to retrieve the last elements first.
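
A minimal sketch of a Stack, assuming a Python list as the underlying storage with the top of the Stack at the end of the list. The class and method names are hypothetical.

```python
class Stack:
    """LIFO stack backed by a Python list (top of the stack = end of the list)."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)  # insertion at the top

    def pop(self):
        if not self._items:
            raise IndexError("pop from an empty stack")
        return self._items.pop()  # read and removal at the top

    def is_empty(self):
        return not self._items


stack = Stack()
for item in (1, 2, 3):
    stack.push(item)
print(stack.pop(), stack.pop(), stack.pop())  # 3 2 1
```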

<div id="queues"></div>

## Queues

A Queue is a linear data structure similar to the Stack, but it supports a different set of operations: Enqueue (inserting an item at the rear of the queue) and Dequeue (reading and deleting an item from the front of the queue). Instead of LIFO, Queues follow the FIFO order (First In, First Out).

<figure>
    <img src="pics/Queue.png" alt="Representation of a queue" width="500"/>
    <figcaption>
        Queues. Image source: <a href="https://www.geeksforgeeks.org/queue-data-structure/">Geeks for Geeks</a>
    </figcaption>
</figure>

Analysis of common operations on Queues:
    
- Access:

    Accessing an element in a Queue is similar to the access operation in a Stack: we need to read and remove (dequeue) the front item until we find the one we are looking for or no elements are left in the Queue. The time complexity of this operation will hence be $O(n)$;
- Search:

    The search operation in a Queue is the same as the access operation;
- Insertion:

    We can only insert an element at the rear of the Queue, and for this reason this operation will have a time complexity of $O(1)$ (constant time);
- Deletion:

    We can only delete the front element of a Queue and thus the time complexity of this operation will also be $O(1)$.

The space complexity of Queues also depends on how they are implemented.

Just like Stacks, they are easy to implement, and their insertion and deletion operations are really time efficient.

Queues are useful in situations in which the order in which the elements are retrieved matters.
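
A minimal sketch of a Queue, assuming `collections.deque` from the Python standard library as the underlying storage. The class and method names are hypothetical.

```python
from collections import deque


class Queue:
    """FIFO queue backed by collections.deque."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, item):
        self._items.append(item)  # insert at the rear

    def dequeue(self):
        if not self._items:
            raise IndexError("dequeue from an empty queue")
        return self._items.popleft()  # read and remove from the front

    def __len__(self):
        return len(self._items)


queue = Queue()
for item in (1, 2, 3):
    queue.enqueue(item)
print(queue.dequeue(), queue.dequeue(), queue.dequeue())  # 1 2 3
```

A `deque` is used here instead of a plain list because removing the first element of a Python list shifts all the remaining ones, while a `deque` supports fast insertion and removal at both ends.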

<div id="hashTables"></div>

## Hash Tables

A Hash Table is a data structure in which key-value pairs are stored. It is usually implemented with an array, and it works by hashing the key (generating a value from it, an integer in this case) and storing the key-value data at the index returned by the hash function. In this way, given the key, it is really efficient to locate its value.

The main challenge of Hash Tables is to remain memory efficient while avoiding collisions (having 2 or more elements at the same index).

<figure>
    <img src="pics/HashTable.png" alt="Representation of a hash table" width="400"/>
    <figcaption>
        Hash Table. Image source: <a href="https://en.wikipedia.org/wiki/Hash_table#/media/File:Hash_table_3_1_1_0_1_0_0_SP.svg/">Wikipedia</a>
    </figcaption>
</figure>

Analysis of common operations on Hash Tables:

- Search / Access:

    Since the data stored in a Hash Table is indexed, it will take constant time to search or access it ($O(1)$), although, depending on the hashing algorithm, the time could also depend on the size of the key. Other edge cases include collisions, and we will take such situations into account later;

- Insertion:

    To insert an element in a Hash Table, we just need to execute the hash function and insert the data in the array, which happens, depending on the hashing function, either with a time complexity of $O(1)$ (constant time) or with a time complexity dependent on the key size. This can change in case of collisions;
- Deletion:

    Deleting an element in a Hash Table is a process similar to searching for it, except that instead of being read, it gets deleted. Normally this process happens with a time complexity of $O(1)$, but this operation can also be slowed down by collisions.


**Collisions** become more frequent as the array fills up, because fewer empty slots remain. It is common to use the variable $a=n/m$ (where $n$ is the number of elements and $m$ the length of the array), known as the load factor, to measure how full an array is. Once a collision happens, there are different ways to deal with it:
- Chaining:

    We add more than 1 key-value pair to the same index. In this way inserting a new element always has a time complexity of $O(1)$, while searching for an element has a time complexity of $O(1+a)$;
    
- Open Addressing:

    We store the elements in the same array without using additional data structures. Ways to find a new index include:
    
    - Linear Probing:

        We store the colliding element in the first available slot in the array. This method, however, can be really inefficient. Inserting and searching for an element can have a time complexity of $O(n)$;
    - Quadratic Probing:
        
        We find a new index by adding an arbitrary number that increases quadratically. Just like Linear Probing, this method can be really inefficient, with inserting and searching operations that can have a time complexity of $O(n)$;
    - Double Hashing:

        We generate a new hash if a collision is detected. We can choose between different functions to implement this technique, but generally this method is more time efficient compared to the other 2, especially in searching.

Hash Tables are not really space efficient.

Their main advantage is the efficiency with which insert, search and delete operations can be performed.

Common implementations of Hash Tables include Python dictionaries.
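
Collision handling by chaining can be sketched as follows. This is an illustration under stated assumptions, not how Python dictionaries are implemented: the class name, the `slots` parameter and the use of Python's built-in `hash()` as the hash function are all choices made for this example.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by chaining (illustrative sketch)."""

    def __init__(self, slots=8):
        self._buckets = [[] for _ in range(slots)]
        self._count = 0

    def _bucket(self, key):
        # Python's built-in hash() plays the role of the hash function.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))
        self._count += 1

    def get(self, key):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        raise KeyError(key)

    def delete(self, key):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                del bucket[i]
                self._count -= 1
                return
        raise KeyError(key)

    def load_factor(self):
        """The ratio a = n / m of stored elements to array length."""
        return self._count / len(self._buckets)


table = ChainedHashTable(slots=4)
for number in range(6):
    table.put(number, number * number)
print(table.get(5), table.load_factor())  # 25 1.5
```

Unlike the pure chaining insert described above, `put` in this sketch first scans the bucket so that an existing key is overwritten rather than duplicated, which makes it $O(1+a)$ instead of $O(1)$.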

<div id="trees"></div>

## Trees

Trees are non-linear data structures that can be considered an extension of a linked list. In a Tree, each node points to some "children" nodes or a null value, creating a hierarchical data structure.

Trees have a lot of properties. To understand them, we are going to take into consideration the Binary Tree in the image below (a Binary Tree is a Tree in which each node has at most 2 children).

<figure>
    <img src="pics/BinaryTree.png" alt="Representation of a binary tree" width="500"/>
    <figcaption>
        Binary Tree. Image source: <a href="https://www.geeksforgeeks.org/binary-tree-data-structure//">GeeksforGeeks</a>
    </figcaption>
</figure>

**Properties and Terminology of Trees:**

- **Node**: an element of the Tree, contains data and pointers to its children;
- **Edge**: the "link" between 2 nodes. A Tree with $N$ nodes has exactly $N-1$ edges;
- **Parent Node**: the predecessor of a node, or the node that points to another. In the example above, among the others, "1" is the parent of "2" and "3", and "2" is the parent of "4" and "5";
- **Child node**: the descendant of a node. In the example above, among the others, "11" is a child of "2" and "3" is a child of "1";
- **Root Node**: The first element of the Tree and the only node without a parent node, in the example above the node is "1";
- **Sibling Nodes**: nodes that are children of the same parent node. In the example above, "4" and "5", and "8" and "9", are examples of siblings;
- **Leaf**: a node without children, like "11" or "14" in the example above.
- **Internal Node**: a node with at least 1 child. "7", "4" and "1" are internal nodes in the example above;
- **Degree**: the number of children that a node has. In a binary tree, this number is never greater than 2. The "Degree of Tree" is the Degree of the node with the highest Degree;
- **Level**: the "distance" of a Node from the Root Node, starting at 1. In the example above, the level of "1" is 1, the level of "3" is 2, the level of "5" is 3 and the level of "9" is 4;
- **Height**: the "distance" between a node and its furthest descendant leaf, starting at 0 for the leaves. In the example above, the height of "10" is 0, the height of "4" is 1, the height of "3" is 2 and the height of "1" is 4. The Height of the Root Node is also the Height of the Tree;
- **Depth**: the number of edges between a Node and the Root Node. For example, it is 0 for "1" and 2 for "7" in the Tree above.

Analysis of common operations on trees:

- Access:
    
    Since a Tree is not an indexed data structure, in the worst case it is possible that we need to look through all the nodes until we find the one we are looking for. For this reason the time complexity of this operation is $O(n)$;
- Search:
    
    Trees are not ordered data structures, or at least not all of them. For this reason searching in a tree could also mean that we need to search through all the other nodes, either with a Depth-First-Search or a Breadth-First-Search approach (we will look at both of these algorithms in the algorithms section of this portfolio), with a time complexity that in both cases is $O(n)$;
- Insertion:
    
    Inserting a Node in a tree may require changing the positions of the other nodes as well to keep the properties of the tree. For this reason this operation also has a time complexity of $O(n)$;
- Deletion:

    Deletion, just like insertion, may require rearranging all the other Nodes, so this operation also has a time complexity of $O(n)$.
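
The terminology and the $O(n)$ search can be made concrete with a small sketch. The tree built here is a hypothetical one (not the tree in the image), and the names `TreeNode`, `height` and `depth_first_search` are chosen only for illustration.

```python
class TreeNode:
    """Node of a general tree: some data plus pointers to its children."""

    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []


def height(node):
    """Edges on the longest downward path to a leaf (0 for a leaf)."""
    if not node.children:
        return 0
    return 1 + max(height(child) for child in node.children)


def depth_first_search(node, target, depth=0):
    """Visit nodes depth-first; return the depth of the target or None."""
    if node.value == target:
        return depth
    for child in node.children:
        found = depth_first_search(child, target, depth + 1)
        if found is not None:
            return found
    return None


# Hypothetical small tree used only for this example.
root = TreeNode(1, [
    TreeNode(2, [TreeNode(4), TreeNode(5)]),
    TreeNode(3, [TreeNode(6, [TreeNode(7)])]),
])
print(height(root))                   # 3
print(depth_first_search(root, 5))    # 2
print(depth_first_search(root, 42))   # None
```

Searching for a value that is not in the tree, as in the last call, visits every node, which is the $O(n)$ worst case.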

<div id="binaryTrees"></div>

### Binary Trees

As I mentioned before, a Binary Tree is a Tree in which each node can have a maximum of 2 children; therefore each node contains some data, a pointer to its left child and a pointer to its right child.

A binary tree is not an ordered tree by definition, so the time complexity of common operations is the same as that of a normal tree.

There are, however, a lot of different implementations of Binary Trees that make them more efficient in those common operations:

<div id="binarySearchTrees"></div>

#### Binary Search Trees

Binary Search Trees (BST) are Binary Trees in which the left child of a node (and all of its descendants) has a value smaller than that of the parent node, and the right child of a node (and all of its descendants) has a value bigger than that of the parent node.

<figure>
    <img src="pics/BST.png" alt="Representation of a binary search tree" width="400"/>
    <figcaption>
        Binary Search Tree. Image source: <a href="https://www.geeksforgeeks.org/binary-search-tree-data-structure/">GeeksforGeeks</a>
    </figcaption>
</figure>

Analysis of common operations on BSTs:

- Search / Access:
    
    Binary Search Trees are not indexed data structures, so to access an element we need to search for it. BSTs are ordered data structures that work really well with the Binary Search algorithm (an analysis of this algorithm can be found in the algorithms part of this portfolio). Because we can use Binary Search with this data structure, we can find an element with a time complexity of $O(h)$, where $h$ is the height of the tree;
    
- Insertion:
    
    Inserting a Node in a BST requires walking down from the root to find the position where the new node belongs. In the worst case scenario, when the tree is completely skewed and its height equals the number of nodes, the time complexity of this operation will be $O(n)$, but in the average case it will depend on the height of the tree, so $Θ(h)$;
- Deletion:

    Deleting a Node has the same impact as inserting a Node. The time complexity of this operation will therefore be $Θ(h)$ in the average case and $O(n)$ in the worst.
    
A typical use of a BST is as the underlying structure for a binary search algorithm.

The main problem with BSTs is that they can become unbalanced when the nodes skew to one of the sides of the tree. In that case the height of the tree is $n$ and the operations on the tree become inefficient. To avoid this and have a height of $log(n)$ we can use a self-balancing Binary Tree.
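
The effect of the insertion order on the height can be seen in the following sketch of a plain (not self-balancing) BST. All names are hypothetical, duplicates are simply ignored, and the height is measured in edges, so a chain of $n$ nodes has height $n-1$.

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key and return the (possibly new) root. Cost: O(h)."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # duplicates are ignored in this sketch


def contains(root, key):
    """Walk down one branch per comparison. Cost: O(h)."""
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False


def height(root):
    """Height in edges; an empty tree has height -1."""
    if root is None:
        return -1
    return 1 + max(height(root.left), height(root.right))


balanced = None
for key in (8, 3, 10, 1, 6, 14):
    balanced = insert(balanced, key)

skewed = None
for key in (1, 3, 6, 8, 10, 14):  # sorted input skews the tree to the right
    skewed = insert(skewed, key)

print(contains(balanced, 6), contains(balanced, 7))  # True False
print(height(balanced), height(skewed))              # 2 5
```

Both trees hold the same six keys, but the one built from sorted input degenerates into a chain, so a search in it behaves like a search in a linked list.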

<div id="redBlackTrees"></div>

#### Red-Black Trees

Red-Black trees are a common kind of self-balancing Binary Tree in which:
- Each node has a color property (either red or black);
- The root is always black;
- A red node cannot have a red parent or red children;
- The leaves have a null value and are considered black;
- Every path from a node to any of its null descendants contains the same number of black nodes.

<figure>
    <img src="pics/RedBlack.png" alt="Representation of a red black tree" width="500"/>
    <figcaption>
        Red Black Tree. Image source: <a href="https://en.wikipedia.org/wiki/Red–black_tree#/media/File:Red-black_tree_example.svg">Wikipedia</a>
    </figcaption>
</figure>

Analysis of common operations on Red-Black Trees:

- Search / Access:
    
    It is possible to perform a Binary Search on a Red-Black Tree, which as we have seen before allows us to search an element with a time complexity of $O(h)$. Since Red-Black trees are a balanced data structure, $h$ will be $log(n)$ and therefore this operation will happen with a time complexity of $O(log(n))$;
    
- Insertion:
    
    Since a Red Black Tree is a balanced Tree, and the insertion operation depends on the height of the tree, we can say that this operation will have a worst-case time complexity of $O(log(n))$;
- Deletion:

    Deleting a node can have the same impact as inserting a node, and this operation depends on the height of the Tree too. Since the height of a Red Black tree is $O(log(n))$, this operation will have a time complexity of $O(log(n))$. 
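
Implementing the rebalancing itself is beyond the scope of this section, but the coloring rules can be checked with a short sketch. It assumes the usual formulation of the rules (black root, no red node with a red child, same number of black nodes on every path to a null leaf); the names `RBNode`, `black_height` and `is_red_black` are hypothetical, and the ordering of the keys is not verified.

```python
RED, BLACK = "red", "black"


class RBNode:
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color
        self.left = left    # None stands for a black null leaf
        self.right = right


def black_height(node):
    """Return the black height of the subtree, or None if a rule is broken."""
    if node is None:
        return 1  # null leaves count as black
    for child in (node.left, node.right):
        if node.color == RED and child is not None and child.color == RED:
            return None  # a red node may not have a red child
    left = black_height(node.left)
    right = black_height(node.right)
    if left is None or right is None or left != right:
        return None  # paths with different numbers of black nodes
    return left + (1 if node.color == BLACK else 0)


def is_red_black(root):
    """Check the coloring rules only; key ordering is not verified here."""
    if root is not None and root.color != BLACK:
        return False
    return black_height(root) is not None


valid = RBNode(10, BLACK, RBNode(5, RED), RBNode(15, RED))
invalid = RBNode(10, BLACK, RBNode(5, RED, RBNode(1, RED)), RBNode(15, RED))
print(is_red_black(valid), is_red_black(invalid))  # True False
```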

<div id="ropes"></div>

#### Ropes

A Rope is a Binary Tree used for string manipulation. In a rope, each leaf holds a substring and each inner node the total length of the substrings that are descendants of its left child.

In the example below, the root node has as value the total length of the string, which is not a mandatory feature but can be useful in common operations, as we will see below.

<figure>
    <img src="pics/Rope.jpg" alt="Representation of a rope" width="500"/>
    <figcaption>
        Rope. Image source: <a href="https://www.geeksforgeeks.org/ropes-data-structure-fast-string-concatenation/">GeeksForGeeks</a>
    </figcaption>
</figure>

Common operations that can be done on a rope are different than those done on other trees. These operations include:

- Index:

    Searching for an element by its index is a very common operation in string manipulation, and thanks to the value of the inner nodes this operation can be done on Ropes with a time complexity of $O(log(n))$.
    
    To understand how that is possible let's consider an example: finding the character at position i=8 in the example above. We start by comparing i to the value of the root A (which in this case holds the total length), which quickly tells us whether the index is part of the string. Since i is smaller than the value of A, we move to A's left child (B). We compare i to the value of B, and since 8 is smaller than 9 we move to B's left child (C). We compare i to the value of C (6), and since 8 is bigger we move to C's right child (F); because we are moving to the right, we update i to i-C (8-6 = 2). Since F is a leaf, we access the character at position i (which is now 2) of its substring (y), which is the 8th character of the whole string.

- Concat

    To concatenate 2 Ropes we just need to assign them to a new common root node with the value equal to the sum of the length of the substrings that descend from its new left child. This operation can be made in $O(1)$ time complexity, but computing the value for the new root node is an operation that has a time complexity of $O(log(n))$;

- Split

    When splitting a string starting from an index we need to make a distinction between 2 major cases: either we start splitting after the last character of a leaf, or we start splitting from a character in the middle of a leaf. In the latter case, we assign 2 children to the leaf (which becomes an inner node), the left one containing the characters that we don't need to split off, and the right one containing the characters that we need to split off.
    
    After finding the leaf from which we need to split the Rope, we separate the nodes at the right of that leaf and we fix the weight of the inner nodes that were ancestors of the nodes we separated. We then assign the split nodes to a new common root node.
    
    At this point, it may be necessary to rebalance both Ropes. 
    
    This operation has a time complexity of $O(log(n))$ since it is the sum of the time complexities of the operations that are needed to complete this operation;

- Insert

    Inserting a Rope in the middle of another Rope is an operation that can be done by splitting the original Rope, concatenating the Rope we need to insert, and then concatenating the right part of the Rope we originally split. Rebalancing the tree may also be required. The time complexity of this operation will be the sum of the time complexities of 1 split operation and 2 concatenation operations, so $O(log(n))$;

- Delete

    To delete a substring in the middle of a Rope, we need to split the original Rope starting at the first character that we want to delete. We then split the resulting right Rope starting after the last character that we need to delete, and we finally concatenate the left Rope of the first split operation with the right Rope of the last split operation. This operation will also have a time complexity given by the sum of the operations that it uses (2 splits and 1 concatenation), which results in a time complexity of $O(log(n))$.

Ropes are widely used in text editors and email clients because of their performance in managing strings, especially compared with a traditional string implemented as an array of characters, which also requires contiguous memory allocation.
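
The Index and Concat operations can be sketched as follows. This is a simplified illustration with hypothetical names: positions are 0-based (unlike the walkthrough above), the strings are made up for the example, and no rebalancing is performed.

```python
class RopeLeaf:
    """Leaf node holding a substring."""

    def __init__(self, text):
        self.text = text

    def __len__(self):
        return len(self.text)


class RopeNode:
    """Inner node storing the total length of its left subtree."""

    def __init__(self, left, right):
        self.left = left
        self.right = right
        self.weight = len(left)

    def __len__(self):
        return self.weight + len(self.right)


def concat(left, right):
    """Join two ropes under a new common root."""
    return RopeNode(left, right)


def index(rope, i):
    """Return the character at 0-based position i."""
    if not 0 <= i < len(rope):
        raise IndexError("rope index out of range")
    while isinstance(rope, RopeNode):
        if i < rope.weight:
            rope = rope.left
        else:
            i -= rope.weight  # moving right: skip the left subtree
            rope = rope.right
    return rope.text[i]


rope = concat(concat(RopeLeaf("Hello "), RopeLeaf("my ")),
              concat(RopeLeaf("name "), RopeLeaf("is Rope")))
print("".join(index(rope, i) for i in range(len(rope))))  # Hello my name is Rope
```

Each step of `index` descends one level, so the cost is proportional to the height of the Rope, which is $O(log(n))$ only as long as the Rope is kept balanced.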

<div id="heaps"></div>

### Heaps

Heaps are trees in which the parent node always stores a value smaller than that of its children (in the case of a Min Heap) or bigger than that of its children (in case of a Max Heap). In this way the root node always stores the smallest value (in a Min Heap) or the biggest value (in a Max Heap).

Heaps do not need to be Binary Trees, but they need to be complete: every level should have the maximum number of nodes, and if the last level is not full, new elements are added to it from left to right. Because of this last property, Heaps are usually stored as arrays.

<figure>
    <img src="pics/Heap.png" alt="Representation of a heap" width="500"/>
    <figcaption>
        Heap. Image source: <a href="https://www.geeksforgeeks.org/heap-data-structure">GeeksForGeeks</a>
    </figcaption>
</figure>

Analysis of common operations on Heaps:

- Search / Access:

    When searching for an element that is not the root node (the node with the max value in a Max Heap or the node with the min value in a Min Heap), we may need to go through all the nodes to find the one we're looking for. Because of this, this operation has a time complexity of $O(n)$;
- Insertion:
    
    
    To insert an element in its correct position in a Heap we start by appending it to the last level, which can be done with an amortized time complexity of $O(1)$ (if the heap is stored in an array). We then swap it with its parent node (if the parent is smaller and our Heap is a Max Heap, or if the parent is bigger and our Heap is a Min Heap) until the properties of the heap are satisfied. This second step needs at most one swap per level, so it has a time complexity of $O(log(n))$;
- Deletion:
   
   
    To delete an element from a Heap we also need to rearrange its nodes until the properties of the heap are satisfied again. To do this we may need to swap an element once for every level of the Heap, and since the number of levels is proportional to $log(n)$, the time complexity of this operation is $O(log(n))$.
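
The insertion and deletion steps above can be sketched for a binary Min Heap stored in a Python list. The class name `MinHeap` and its methods are illustrative assumptions, and deletion is shown only for the root, which is the usual case.

```python
class MinHeap:
    """Illustrative binary Min Heap stored in a list."""

    def __init__(self):
        self.items = []

    def push(self, value):
        # Append to the last level, then swap upwards while the parent is bigger.
        self.items.append(value)
        i = len(self.items) - 1
        while i > 0:
            p = (i - 1) // 2
            if self.items[p] <= self.items[i]:
                break
            self.items[p], self.items[i] = self.items[i], self.items[p]
            i = p

    def pop(self):
        # Remove the root, move the last element to the top, then swap downwards.
        items = self.items
        root = items[0]
        last = items.pop()
        if items:
            items[0] = last
            i = 0
            while True:
                smallest = i
                for c in (2 * i + 1, 2 * i + 2):
                    if c < len(items) and items[c] < items[smallest]:
                        smallest = c
                if smallest == i:
                    break
                items[i], items[smallest] = items[smallest], items[i]
                i = smallest
        return root


heap = MinHeap()
for value in (5, 3, 8, 1):
    heap.push(value)
print(heap.pop())  # 1
print(heap.pop())  # 3
```

Each loop moves one level up or down per iteration, which is where the $O(log(n))$ bound comes from.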
   
Heaps are used as an auxiliary data structure in various algorithms, like the Heapsort algorithm.
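
As a rough illustration of that use, the sketch below sorts a list by building a Min Heap and repeatedly removing the root. It relies on Python's standard `heapq` module and returns a new list, whereas the classic Heapsort works in place; the function name `heapsort` is an assumption.

```python
import heapq


def heapsort(values):
    """Return the values in ascending order using a Min Heap."""
    heap = list(values)
    heapq.heapify(heap)  # build the heap in O(n)
    # Each pop removes the current minimum in O(log(n)).
    return [heapq.heappop(heap) for _ in range(len(heap))]


print(heapsort([5, 1, 4, 2, 3]))  # [1, 2, 3, 4, 5]
```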

<div id="graphs"></div>

## Graphs

Graphs are non-linear data structures made of vertices (or nodes) that store data and edges that connect them (and can also store data). Graphs are used to represent the relationships between their vertices. We have already examined one subset of Graphs: Trees.

<figure>
    <img src="pics/Graph.png" alt="Representation of a graph" width="500"/>
    <figcaption>
        Graph. Image source: <a href="https://www.geeksforgeeks.org/graph-data-structure-and-algorithms/">GeeksForGeeks</a>
    </figcaption>
</figure>

Graphs terminology:

- **Vertex** (or Node), an element of the graph that always contains some data;
- **Edge**, the relationship between 2 vertices, which can also contain information;
- **Adjacency**, the relation between two vertices that are connected by an edge;
- **Path**, a sequence of edges that leads from one vertex to another;
- **Eulerian Path**, a path that visits every edge exactly once (but can visit vertices more than once) and ends in a vertex which is not the starting one;
- **Eulerian Cycle**, a path that visits every edge exactly once (but can visit vertices more than once) and ends in the starting vertex;
- **Hamiltonian Path**, a path that visits every vertex exactly once (and therefore does not need to use every edge) and ends in a vertex which is not the starting one;
- **Hamiltonian Cycle**, a path that visits every vertex exactly once and then returns to the starting vertex;
- **Parallel Edges**, two or more edges that connect the same pair of vertices;
- **Loop**, an edge that connects a vertex to itself.
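
To make the path definitions more concrete, the sketch below checks whether a sequence of vertices is a path, a Hamiltonian Path or a Hamiltonian Cycle. It assumes a small undirected graph stored as a list of neighbour lists; the function names are made up for this example.

```python
# Neighbour lists of a small undirected graph with vertices 0..4.
graph = [[1, 4], [0, 2, 3, 4], [1, 3], [1, 2, 4], [0, 1, 3]]


def is_path(graph, vertices):
    """Every pair of consecutive vertices must be connected by an edge."""
    return all(b in graph[a] for a, b in zip(vertices, vertices[1:]))


def is_hamiltonian_path(graph, vertices):
    """A path that contains every vertex of the graph exactly once."""
    return is_path(graph, vertices) and sorted(vertices) == list(range(len(graph)))


def is_hamiltonian_cycle(graph, vertices):
    """A Hamiltonian Path whose last vertex has an edge back to the first one."""
    return is_hamiltonian_path(graph, vertices) and vertices[0] in graph[vertices[-1]]


print(is_path(graph, [0, 2]))                        # False
print(is_hamiltonian_path(graph, [0, 1, 2, 3, 4]))   # True
print(is_hamiltonian_cycle(graph, [0, 1, 2, 3, 4]))  # True
```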

Types of Graphs:

- **Finite**. A finite Graph contains a finite number of edges and vertices;
- **Infinite**. An infinite Graph contains an infinite number of vertices or edges;
- **Trivial**. A trivial Graph contains only one vertex and no edges;
- **Simple**. A simple Graph contains at most one edge between any pair of vertices and no loops;
- **Non Simple**. A non simple Graph contains parallel edges or loops;
- **Multi-Graph**. A multi-graph contains some parallel edges but no loops;
- **Pseudo-Graph**. A pseudo-graph is a graph with at least one loop and one parallel edge;
- **Null**. A null graph contains vertices but no edges;
- **Complete** (or Full Graph). In a complete graph every vertex is adjacent to all the others;
- **Unweighted**. In an unweighted graph the edges do not store a value (weight);
- **Weighted**. In a weighted graph every edge stores a value (weight);
- **Directed**. In a directed graph the edges connect 2 vertices only in one direction;
- **Undirected**. In an undirected graph the edges connect 2 vertices in both directions;
- **Topological**. In a topological graph, the vertices are represented by distinct points in space.
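
Some of these types can be told apart mechanically. The sketch below assumes an undirected graph given as a list of `(u, v)` pairs and checks for loops and parallel edges; the function names are illustrative only.

```python
from collections import Counter


def has_loop(edges):
    """True if some edge connects a vertex to itself."""
    return any(u == v for u, v in edges)


def has_parallel_edges(edges):
    """True if two or more edges connect the same pair of distinct vertices."""
    # frozenset makes (0, 1) and (1, 0) count as the same undirected edge.
    counts = Counter(frozenset((u, v)) for u, v in edges if u != v)
    return any(n > 1 for n in counts.values())


def is_simple(edges):
    return not has_loop(edges) and not has_parallel_edges(edges)


print(is_simple([(0, 1), (1, 2)]))           # True
print(has_parallel_edges([(0, 1), (1, 0)]))  # True
print(has_loop([(0, 1), (2, 2)]))            # True
```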

Ways of representing a graph:

- **Adjacency List:**

    With an Adjacency List, we use an array of arrays to store information about the graph. In an Adjacency List, the array element whose index equals the id of a vertex contains the ids of its adjacent vertices. An Adjacency List for the graph in the image above will look like this: ```[[1,4],[0,2,3,4],[1,3],[1,2,4],[0,1,3]]```.
    
    Analysis of common operations on Adjacency Lists:
    
    - Storage:
    
        Storing a graph as an Adjacency List can be done with a time complexity of $O(|V| + |E|)$, where $|V|$ is the number of vertices (and also the length of the array) and $|E|$ is the number of edges;
        
    - Add Vertex:
    
        Adding a vertex to a graph represented as an Adjacency List can be done with a time complexity of $O(1)$ since we just need to append a new element to the list;
    
    - Add Edge:
    
        Adding an edge to a graph represented as an Adjacency List can be done with a time complexity of $O(1)$, since we just need to append 1 value to each of the arrays representing the two nodes;
    
    - Remove Vertex:
    
        Removing a vertex from a graph represented as an Adjacency List can be done with a time complexity of $O(|V|+|E|)$, since we need to remove the edges to that vertex from all the other vertices as well;
    
    - Remove Edge:
    
        Removing an edge from a graph represented as an Adjacency List can be done with a time complexity of $O(|E|)$, since we need to search and remove the edge from the list of edges of the two nodes that it connects.
    
    
    The space complexity of an Adjacency List is $O(|V|+|E|)$.
    
    
- **Adjacency Matrix:**

    With an Adjacency Matrix, we use a 2D matrix to store information about the graph. In an Adjacency Matrix, the array element whose index equals the id of a vertex contains an array that indicates, for every vertex, whether it is connected to that vertex or not (1 if it is, 0 if it is not). An Adjacency Matrix for the graph in the image above will look like this:
    
    ```
    [
    [0,1,0,0,1],
    [1,0,1,1,1],
    [0,1,0,1,0],
    [0,1,1,0,1],
    [1,1,0,1,0]
    ]
    ```
    
    Analysis of common operations on Adjacency Matrices:
    
    - Storage:
    
        Storing a graph as an Adjacency Matrix can be done with a time complexity of $O(|V|^{2})$, where $|V|$ is the number of vertices, which is both the number of arrays in the matrix and the length of each array;

    - Add Vertex:
    
        Adding a vertex to a graph represented as an Adjacency Matrix can be done with a time complexity of $O(|V|^{2})$ since we need to update all the arrays of which the matrix is made up as well as adding a new array;
        
    - Add Edge:
        
        Adding an edge to a graph represented as an Adjacency Matrix can be done with a time complexity of $O(1)$, since we just need to update 2 values in the matrix;
   
    - Remove Vertex:
    
        Removing a vertex from a graph represented as an Adjacency Matrix can be done with a time complexity of $O(|V|^{2})$, because we need to update all the other arrays of which the matrix is made up;

    - Remove Edge:
    
        Removing an edge from a graph represented as an Adjacency Matrix can be done with a time complexity of $O(1)$, since we just need to update 2 values in the matrix.
        
    The space complexity of an Adjacency Matrix is $O(|V|^{2})$.
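
The Adjacency List operations analysed above can be sketched as follows. This is a minimal version that assumes an undirected graph without loops, with vertex ids equal to list indices; the class and method names are hypothetical.

```python
class AdjacencyListGraph:
    """Illustrative undirected graph stored as a list of neighbour lists."""

    def __init__(self):
        self.neighbours = []

    def add_vertex(self):
        # Append one empty neighbour list; the new id is its index.
        self.neighbours.append([])
        return len(self.neighbours) - 1

    def add_edge(self, u, v):
        # One append in each of the two affected lists.
        self.neighbours[u].append(v)
        self.neighbours[v].append(u)

    def remove_edge(self, u, v):
        # Each removal has to search the neighbour list first.
        self.neighbours[u].remove(v)
        self.neighbours[v].remove(u)

    def remove_vertex(self, v):
        # Drop v's own list, then visit every remaining list to delete
        # references to v and shift the ids above v down by one.
        del self.neighbours[v]
        self.neighbours = [
            [w - 1 if w > v else w for w in adj if w != v]
            for adj in self.neighbours
        ]


g = AdjacencyListGraph()
for _ in range(5):
    g.add_vertex()
for u, v in [(0, 1), (0, 4), (1, 2), (1, 3), (1, 4), (2, 3), (3, 4)]:
    g.add_edge(u, v)
print(g.neighbours)  # [[1, 4], [0, 2, 3, 4], [1, 3], [1, 2, 4], [0, 1, 3]]
```

The printed structure matches the Adjacency List given earlier for the example graph.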


Because of their space complexity, it makes sense to use Adjacency Matrices either for small Graphs or for dense Graphs (Graphs with a lot of edges), where $|E|$ is close to $|V|^{2}$.
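
For comparison, the Adjacency Matrix operations could be sketched like this, again assuming an undirected, unweighted graph and hypothetical class and method names. In this sketch the Python lists grow in place, so `add_vertex` only touches each row once; an implementation on fixed-size arrays would have to copy the whole matrix, which is where the $O(|V|^{2})$ bound comes from.

```python
class AdjacencyMatrixGraph:
    """Illustrative undirected graph stored as a |V| x |V| matrix of 0s and 1s."""

    def __init__(self):
        self.matrix = []

    def add_vertex(self):
        # Extend every existing row by one column, then add a new row.
        for row in self.matrix:
            row.append(0)
        self.matrix.append([0] * (len(self.matrix) + 1))
        return len(self.matrix) - 1

    def add_edge(self, u, v):
        # Two cells are updated because the graph is undirected.
        self.matrix[u][v] = 1
        self.matrix[v][u] = 1

    def remove_edge(self, u, v):
        self.matrix[u][v] = 0
        self.matrix[v][u] = 0

    def remove_vertex(self, v):
        # Delete row v, then column v from every remaining row.
        del self.matrix[v]
        for row in self.matrix:
            del row[v]


g = AdjacencyMatrixGraph()
for _ in range(5):
    g.add_vertex()
for u, v in [(0, 1), (0, 4), (1, 2), (1, 3), (1, 4), (2, 3), (3, 4)]:
    g.add_edge(u, v)
print(g.matrix[0])  # [0, 1, 0, 0, 1]
```

The matrix always holds $|V|^{2}$ cells no matter how few edges exist, which is the cost the paragraph above refers to.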

One possible real-world application of Graphs is representing road maps.

<div id="sources"></div>

# Sources
- [Udacity - Data Structures & Algorithms in Python](https://classroom.udacity.com/courses/ud513)
- [Cambridge Dictionary - Algorithm](https://dictionary.cambridge.org/dictionary/english/algorithm)
- [GeeksforGeeks - Space Complexity](https://www.geeksforgeeks.org/g-fact-86/)
- [Khan Academy - Asymptotic Notation](https://www.khanacademy.org/computing/computer-science/algorithms/asymptotic-notation/a/asymptotic-notation)
- [The Lean Blogs - Some common runtime complexities and their meanings](https://medium.com/learn-with-the-lean-programmer/some-common-runtime-complexities-and-their-meanings-5a2bf4320f48)
- [GeeksforGeeks - Array](https://www.geeksforgeeks.org/array-data-structure/)
- [GeeksforGeeks - Linked Lists](https://www.geeksforgeeks.org/data-structures/linked-list/)
- [GeeksforGeeks - Doubly Linked Lists](https://www.geeksforgeeks.org/doubly-linked-list/)
- [GeeksforGeeks - Stacks](https://www.geeksforgeeks.org/stack-data-structure/)
- [GeeksforGeeks - Queues](https://www.geeksforgeeks.org/queue-data-structure/)
- [CS Dojo - Hash Tables](https://www.youtube.com/watch?v=sfWyugl4JWA&list=PLBZBJbE_rGRV8D7XZ08LK6z-4zPoWzu5H&index=13)
- [Ananda Gunawardena - CMU - Hash Table Conflict Resolution](http://www.cs.cmu.edu/~ab/15-121N11/lectures/lecture16.pdf)
- [Typeocaml - Height, Depth and Level of a Tree](http://typeocaml.com/2014/11/26/height-depth-and-level-of-a-tree/)
- [GeeksforGeeks - Red-Black Trees](https://www.geeksforgeeks.org/red-black-tree-set-1-introduction-2/)
- [GeeksforGeeks - Ropes](https://www.geeksforgeeks.org/ropes-data-structure-fast-string-concatenation/)
- [Opengenus - Ropes](https://iq.opengenus.org/rope-data-structure/)
- [GeeksforGeeks - Heaps](https://www.geeksforgeeks.org/heap-data-structure/)
- [HackerRank - Heaps](https://www.youtube.com/watch?v=t0Cq6tVNRBA)
- [Tutorialspoint - Graphs](https://www.tutorialspoint.com/data_structures_algorithms/graph_data_structure.htm)
- [GeeksforGeeks - Types of Graphs](https://www.geeksforgeeks.org/graph-types-and-applications/)
- [BigO complexities [pdf]](http://souravsengupta.com/cds2016/lectures/Complexity_Cheatsheet.pdf)
- Skiena, S. The Algorithm Design Manual. 1998. Springer.