### **Question 1 (12 points)**

Determine the runtime complexity T(n) using the Master Theorem for the following recurrences. If the Master Theorem doesn't apply, indicate why:

\begin{align*}
1. \quad T(n) &= 4T\left(\frac{n}{3}\right) + n^2\log n \\
2. \quad T(n) &= 3T\left(\frac{n}{4}\right) + n\sqrt{n} \cdot \log n \\
3. \quad T(n) &= 2T(n - 1) + \log n \\
4. \quad T(n) &= nT\left(\frac{n}{2}\right) + n^{1.5}\log n \\
5. \quad T(n) &= T\left(\frac{n}{2}\right) + T\left(\frac{n}{4}\right) + n\log n \\
6. \quad T(n) &= 4T\left(\frac{n}{2}\right) + n\log n\\
\end{align*}


### **Solution - Question 1**

**1.** $T(n) = 4T\left(\frac{n}{3}\right) + n^2\log n$

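Using the Master Theorem:
- $a = 4$, $b = 3$, $f(n) = n^2\log n$

The critical term is $n^{\log_b a} = n^{\log_3 4} \approx n^{1.26}$.
Comparing $f(n) = n^2\log n$ with $n^{\log_3 4}$, we find that $f(n)$ is polynomially larger: $f(n) = \Omega(n^{\log_3 4 + \epsilon})$ for any $0 < \epsilon \le 2 - \log_3 4 \approx 0.74$.
The regularity condition also holds, since $4f(n/3) = \frac{4}{9}n^2\log\frac{n}{3} \le \frac{4}{9}f(n)$, so we fall into Case 3 of the Master Theorem.
Thus, the solution is $T(n) = \Theta(n^2\log n)$.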

**2.** $T(n) = 3T\left(\frac{n}{4}\right) + n\sqrt{n} \cdot \log n$

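Using the Master Theorem:
- $a = 3$, $b = 4$, $f(n) = n\sqrt{n} \cdot \log n = n^{1.5}\log n$

The critical term is $n^{\log_b a} = n^{\log_4 3} \approx n^{0.79}$.
Comparing $f(n) = n\sqrt{n} \cdot \log n$ with $n^{\log_4 3}$, we find that $f(n)$ is polynomially larger: $f(n) = \Omega(n^{\log_4 3 + \epsilon})$ for any $0 < \epsilon \le 1.5 - \log_4 3 \approx 0.71$.
The regularity condition also holds, since $3f(n/4) = \frac{3}{8}n^{1.5}\log\frac{n}{4} \le \frac{3}{8}f(n)$, so we again fall into Case 3 of the Master Theorem.
Thus, the solution is $T(n) = \Theta(n\sqrt{n} \cdot \log n)$.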

**3.** $T(n) = 2T(n - 1) + \log n$

The Master Theorem does not apply. In this recurrence the problem size is reduced by subtracting a constant (1) at each step, rather than by dividing $n$ by a constant factor. The Master Theorem only covers recurrences of the form $T(n) = aT(n/b) + f(n)$ with $b > 1$.

**4.** $T(n) = nT\left(\frac{n}{2}\right) + n^{1.5}\log n$

The Master Theorem doesn't apply here because of the multiplicative factor of $n$ in front of the recursive term: the number of subproblems, $a = n$, is not a constant. The Master Theorem only considers a fixed number of subproblems of reduced size. Expanding this recurrence shows that the problem size halves at each level, while the branching factor at level $i$ is $n/2^i$, so the number of subproblems grows faster than any polynomial in $n$. Such input-dependent branching is outside the scope of the Master Theorem.

**5.** $T(n) = T\left(\frac{n}{2}\right) + T\left(\frac{n}{4}\right) + n\log n$

This recurrence has two recursive calls on subproblems of different sizes ($n/2$ and $n/4$). The Master Theorem requires all $a$ subproblems to have the same size $n/b$, so it does not apply directly.

**6.** $T(n) = 4T\left(\frac{n}{2}\right) + n\log(n)$

- $a = 4$, $b = 2$
- $f(n) = n\log(n)$

We first calculate the critical term $n^{\log_b(a)}$:
$n^{\log_b(a)} = n^{\log_2(4)} = n^2$

Case 2 of the Master Theorem would require $f(n) = \Theta(n^2\log^k n)$ for some $k \ge 0$, which is not the case here: $f(n) = n\log(n)$ is polynomially smaller than $n^2$. Specifically, $f(n) = O(n^{2-\epsilon})$ for any $0 < \epsilon < 1$, since $\log n = O(n^{1-\epsilon})$. Thus, Case 1 of the Master Theorem applies.

The solution is:
$T(n) = \Theta(n^2)$


### **Justification and Proof of Correctness - Question 1**

**1.**
 
Using the Master Theorem:
- $a = 4$, $b = 3$, $f(n) = n^2\log n$
- Critical exponent: $k = \log_b a = \log_3 4 \approx 1.26$

Comparing $f(n)$ to $n^k$:
$f(n) = \Omega(n^{k+\epsilon})$
for some $\epsilon > 0$ (in this case, any $\epsilon \le 2 - k \approx 0.74$ works, due to the $n^2$ term).

Therefore, using Case 3 of the Master Theorem, the solution is $T(n) = \Theta(n^2\log n)$.

**2.**  
Using the Master Theorem:
- $a = 3$, $b = 4$, $f(n) = n\sqrt{n} \cdot \log n$
- Critical exponent: $k = \log_b a = \log_4 3 \approx 0.79$

Comparing $f(n)$ to $n^k$:
$f(n) = \Omega(n^{k+\epsilon})$
for some $\epsilon > 0$ (in this case, any $\epsilon \le 1.5 - k \approx 0.71$ works, due to the $n^{1.5}$ term).

Using Case 3 of the Master Theorem, the solution is $T(n) = \Theta(n\sqrt{n} \cdot \log n)$.

**3.**  
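This recurrence shrinks the problem by subtraction (from $n$ to $n-1$), not by division, whereas the Master Theorem is designed for recurrences of the form $T(n) = aT(n/b) + f(n)$. When expanded:

$$\begin{aligned}
T(n) &= 2[2T(n-2) + \log(n-1)] + \log n \\
&= 4T(n-2) + 2\log(n-1) + \log n
\end{aligned}$$

Expanding this further results in a weighted sum of logarithms in which the $i$-th term carries a factor of $2^i$. The recurrence doesn't follow the required format of dividing $n$ by a constant factor. Hence, the Master Theorem doesn't apply.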
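The expansion can be carried through to a closed form. The following is a worked sketch that assumes a constant base case $T(1) = \Theta(1)$, which the problem statement does not specify:

$$\begin{aligned}
T(n) &= 2^{j}\,T(n-j) + \sum_{i=0}^{j-1} 2^{i}\log(n-i) \\
     &= 2^{n-1}\,T(1) + \sum_{i=0}^{n-2} 2^{i}\log(n-i) && (j = n-1) \\
     &= 2^{n-1}\,T(1) + 2^{n}\sum_{m=2}^{n} \frac{\log m}{2^{m}} && (m = n-i)
\end{aligned}$$

The series $\sum_{m \ge 2} \frac{\log m}{2^m}$ converges to a constant, so under this assumption $T(n) = \Theta(2^n)$, obtained by direct expansion rather than by the Master Theorem.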

**4.**  
This recurrence is outside the theorem's scope because the coefficient of the recursive term is $n$ rather than a constant. When expanded:

$T(n) = n\left[\frac{n}{2}T\left(\frac{n}{4}\right) + \left(\frac{n}{2}\right)^{1.5}\log\left(\frac{n}{2}\right)\right] + n^{1.5}\log n$

Each further expansion multiplies in another non-constant factor ($n$, $n/2$, $n/4$, and so on), so the number of subproblems is not a fixed constant $a$, and the recurrence does not match the form $T(n) = aT(n/b) + f(n)$ that the Master Theorem requires.

**5.**  
The presence of two recursive terms makes this an irregular recurrence. To understand its behavior:

$T(n) = T\left(\frac{n}{2}\right) + T\left(\frac{n}{4}\right) + n\log n$

Expanding $T\left(\frac{n}{2}\right)$ and $T\left(\frac{n}{4}\right)$ in the same way produces subproblems of sizes $n/4$, $n/8$, and $n/16$, with size $n/8$ reached along two different branches. The recursion tree therefore has branches of unequal depth, which doesn't fit the Master Theorem's standard format $T(n) = aT(n/b) + f(n)$.

**6.**  
According to the Master Theorem, we need to compare $f(n)$ with $n^{\log_b(a)} = n^{\log_2(4)} = n^2$.
For Case 2 of the Master Theorem to apply, the following conditions must be satisfied:

1. $f(n)$ is $\Theta(n^{\log_b(a)}\log^k(n))$
2. $k \ge 0$

Given $f(n) = n\log(n)$, these conditions fail: the polynomial factor of $f(n)$ is $n^1$, not $n^{\log_b(a)} = n^2$, so $f(n)$ is not $\Theta(n^2\log^k(n))$ for any $k \ge 0$.  
Instead, $f(n) = O(n^{2-\epsilon})$ for any $0 < \epsilon < 1$, which is the condition for Case 1 of the Master Theorem.

According to Case 1, the solution is:  
$T(n) = \Theta(n^{\log_b(a)})$    
Substituting in the values for $a$ and $b$: 
$T(n) = \Theta(n^2)$ 
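
The case analysis for recurrences 1, 2, and 6 can be sanity-checked mechanically. The helper below is a minimal sketch, not a general solver: the function name `master_theorem` and its parameters are illustrative, and it only handles driving functions of the form $f(n) = \Theta(n^c\log^k n)$ with constant $a \ge 1$, $b > 1$, and $k \ge 0$. For such $f$ the regularity condition of Case 3 holds automatically whenever $c > \log_b a$.

```python
import math


def master_theorem(a, b, c, k=0):
    """Classify T(n) = a*T(n/b) + Theta(n^c * log^k n).

    Returns (case, exponent, log_power), meaning
    T(n) = Theta(n^exponent * log^log_power n).
    """
    if a < 1 or b <= 1 or k < 0:
        raise ValueError("this sketch only handles a >= 1, b > 1, k >= 0")
    crit = math.log(a, b)  # critical exponent log_b(a)
    if math.isclose(c, crit):
        # Case 2 (extended form): f matches n^crit up to log factors.
        return 2, crit, k + 1
    if c < crit:
        # Case 1: f is polynomially smaller, the leaves dominate.
        return 1, crit, 0
    # Case 3: f is polynomially larger, the root dominates.
    return 3, c, k


if __name__ == "__main__":
    print(master_theorem(4, 3, 2, 1))    # recurrence 1
    print(master_theorem(3, 4, 1.5, 1))  # recurrence 2
    print(master_theorem(4, 2, 1, 1))    # recurrence 6
```

Recurrences 3, 4, and 5 cannot be passed to this helper at all, which mirrors the reason the Master Theorem does not apply to them: they have no constant pair $(a, b)$.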
 

### **Question 2 (15 points)**

A pharmaceutical company produces medicines through a series of stages. Each stage transforms a type of chemical compound into another. Some stages also require specialized equipment which has an associated rental cost. A type of compound that cannot be transformed further is termed a 'finished drug'. A type of compound that cannot be derived from another is termed a 'base compound'.

Each base compound has a procurement price and a storage shelf life, after which it becomes unusable. Each transformation stage incurs a cost and has an associated time duration. Every finished drug has a market price and an expiration duration post which it cannot be sold. The prices, costs, and durations are variable, changing every week. Moreover, new transformation stages, types of finished drugs, and equipment can be introduced every week.

Considering the constraints of shelf life and drug expiration, the company must decide every week on the optimal 'base compound' to start with and produce a 'finished drug' that offers maximum revenue, taking into account procurement, transformation costs, equipment rental, and potential wastage due to expiration.

Design an algorithm that determines the most optimal production strategy for the week efficiently, while also ensuring minimal wastage due to expiration of compounds and finished drugs. Write the pseudocode and justify the efficiency of your algorithm.

### **Solution - Question 2**

The pharmaceutical process can be modeled as a directed graph. Nodes in this graph represent chemical compounds, whether base, intermediate, or finished. Directed edges between these nodes denote transformation stages.

- Nodes associated with base compounds are weighted by their procurement costs, while those linked to finished drugs bear potential revenue weights. 
- Edges, which signify transformation stages, are burdened with costs. 
- If equipment is required for a particular stage, the rental for said equipment is integrated into the edge weight. 
- Furthermore, time elements, such as the shelf life of compounds and the expiration of drugs, must be accounted for.

Using a modification of the shortest path algorithm, the most cost-effective path from a base compound to a finished drug is found, taking into account financial costs and time constraints. Because costs change and new stages appear every week, the paths are recomputed from scratch each week, framing the solution as a weekly batch process.

The devised solution assesses the most profitable path from every base compound to every finished drug. Assuming the transformation graph is acyclic, a shortest path computation that relaxes edges in topological order costs O(V+E) for each compound pair, where V represents the number of compounds and E stands for the number of transformation stages. As all potential compound pairs are examined, the worst-case time complexity is O(V^2(V+E)). For a weekly evaluation, especially when the compound and stage count is reasonably bounded, this complexity is manageable.

**Pseudocode to determine the most optimal production strategy for the week:**

```plaintext
function getBestStrategy(graph, compoundPrices, stageCosts, drugPrices, equipmentCosts, shelfLives, expirationTimes, timeForStage):
    highestProfit = -Infinity
    bestPlan = None

    // Base compounds are the nodes with no incoming edges
    for each baseCompound in baseCompounds(graph):

        // Finished drugs are the nodes with no outgoing edges
        for each drug in finishedDrugs(graph):

            if canMakeDrug(graph, baseCompound, drug):
                profit = computeProfit(graph, baseCompound, drug, compoundPrices, stageCosts, drugPrices, equipmentCosts, shelfLives, expirationTimes, timeForStage)

                if profit > highestProfit:
                    highestProfit = profit
                    bestPlan = (baseCompound, drug)

    // bestPlan stays None if no pair can be produced within the time limits
    return bestPlan
```

**Pseudocode to compute the profit for a given base compound and finished drug:**
```plaintext
function computeProfit(graph, baseCompound, drug, compoundPrices, stageCosts, drugPrices, equipmentCosts, shelfLives, expirationTimes, timeForStage):
    totalCost = compoundPrices[baseCompound]
    revenue = drugPrices[drug]

    // Cheapest chain of stages, with edge weight = stage cost + equipment rental
    path = findShortestPath(graph, baseCompound, drug)
    timeTaken = 0

    // Accumulate cost and duration over every transformation stage in the path
    for each stage in path:
        totalCost += stageCosts[stage]

        if needsEquipment(stage):
            totalCost += equipmentCosts[stage]

        timeTaken += timeForStage[stage]

    if timeTaken > shelfLives[baseCompound] or timeTaken > expirationTimes[drug]:
        return -Infinity   // Can't make the drug within time limits

    return revenue - totalCost
```
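
The two routines above can be combined into a runnable sketch. Everything in it is an assumption made for illustration: the graph encoding (`graph[u]` is a list of `(v, cost, duration)` tuples with equipment rental already folded into `cost`), the function names, and the sample data. It runs one Dijkstra search per base compound instead of one search per pair, and, like the pseudocode, it checks the time limits only on the cheapest path, so a slower-to-find but feasible costlier path would be missed.

```python
import heapq


def cheapest_paths(graph, source):
    """Dijkstra by cost. Returns {node: (cost, duration)} along the cheapest path."""
    best = {source: (0, 0)}
    heap = [(0, 0, source)]
    done = set()
    while heap:
        cost, duration, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, stage_cost, stage_time in graph.get(u, []):
            candidate = (cost + stage_cost, duration + stage_time)
            if v not in done and (v not in best or candidate < best[v]):
                best[v] = candidate
                heapq.heappush(heap, (candidate[0], candidate[1], v))
    return best


def best_strategy(graph, base_prices, drug_prices, shelf_lives, expirations):
    """Return ((base, drug), profit) for the most profitable feasible pair."""
    best_profit, best_plan = float("-inf"), None
    for base, price in base_prices.items():
        reachable = cheapest_paths(graph, base)
        for drug, revenue in drug_prices.items():
            if drug not in reachable:
                continue  # this drug cannot be made from this base compound
            cost, duration = reachable[drug]
            if duration > shelf_lives[base] or duration > expirations[drug]:
                continue  # violates shelf life or expiration
            profit = revenue - price - cost
            if profit > best_profit:
                best_profit, best_plan = profit, (base, drug)
    return best_plan, best_profit


if __name__ == "__main__":
    # Hypothetical data: two base compounds (A, B), one intermediate (X), two drugs.
    graph = {
        "A": [("X", 2, 1)],
        "X": [("D1", 3, 1), ("D2", 1, 5)],
        "B": [("D2", 4, 1)],
    }
    plan, profit = best_strategy(
        graph,
        base_prices={"A": 1, "B": 2},
        drug_prices={"D1": 10, "D2": 12},
        shelf_lives={"A": 4, "B": 4},
        expirations={"D1": 10, "D2": 10},
    )
    print(plan, profit)
```

Sharing one search across all drugs reachable from a base compound is a design choice of this sketch, not part of the pseudocode above; it avoids recomputing the same shortest-path tree for every pair.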

### **Justification and Proof of Correctness - Question 2**


- **Graph-based Representation:** The problem presents a series of transformations from one compound to another, naturally resembling a flow process. Representing this as a directed graph captures the inherent structure of the pharmaceutical process. In this representation, each node signifies a compound, while each edge represents a transformation stage.

- **Use of Dynamic Programming via Modified Shortest Path:** The modified shortest path algorithm ensures that for each pair of base compound and finished drug, the most cost-effective transformation path is obtained. Dynamic programming ensures that the best solution for subproblems is reused, aiding in the efficiency of the solution.

- **Iterative Weekly Examination:** Given the dynamic nature of costs, stages, and other factors that can change weekly, an iterative approach ensures the algorithm's relevancy and timeliness.

**Pseudocode explanation:**

- Completeness:
    - The algorithm iterates over every base compound and every possible finished drug. This ensures that all potential production paths are examined.
    - By examining all pairs, the algorithm captures all potential routes to profitability.

- Optimality:
    - Within each pair of base compound and finished drug, a shortest path algorithm is used. This ensures that the best path, in terms of cost, is obtained for that particular pair.
    - Dynamic programming ensures that previously computed solutions for subproblems are reused, which keeps the computation efficient.

- Time-boundedness:
    - Time constraints, such as shelf life and drug expiration, are integrated into the algorithm. Any path that exceeds these constraints is assigned a profit of -Infinity, so it can never be selected.
    - By incorporating these constraints, only paths that adhere to the time limitations are considered viable.

- Consistency with Dynamic Changes:
    - Because the algorithm is rerun every week on that week's data, it adapts to any changes, ensuring the strategy remains relevant.
    - The solution's adaptability ensures it remains accurate, even when confronted with changes in costs, stages, or other dynamic factors.

### **Question 3 (25 points)**

**Efficient Removal and Reconnection in Undirected Graph**

*Problem Statement:*
You're given an undirected graph G=(V,E). Design an algorithm that repeatedly finds a vertex with degree 1, outputs it, and removes it and its connected edge from the graph. Once a vertex is removed, its neighboring vertex (the vertex to which it was connected) should then be connected to its next neighboring vertex (if it exists). 

- Design an algorithm for the problem statement above.
- Implement the algorithm such that it runs in time O(V+E). Write pseudocode for your solution.
- What would be the implications if a vertex of degree 0 or degree 2 exists in the graph?

*Input Format:*

- List of edges representing the undirected graph.

*Output Format:*

- List of vertices in the order they are removed.
- Modified list of edges after all removals.

*Sample Inputs and Outputs:*

- Input: [(1, 2), (2, 3), (3, 4), (4, 5)]
- Output:
    - Removed vertices: [1, 5, 2, 4, 3]
    - Modified edges: []


### **Solution - Question 3**

**Algorithm:**
```plaintext
Initialize an empty list called removed_vertices.
Initialize an adjacency list representation of the graph.
Initialize a queue containing every vertex that currently has degree 1.
While the queue is not empty:
    Take the next vertex from the queue.
    Append the vertex to the removed_vertices list.
    Note the neighbor of this vertex (if its last edge is still present).
    Remove the vertex and its edge from the graph.
    If the neighbor now has degree 1, it remains connected to its next neighbor (if it exists) through its one remaining edge; add it to the queue.
Return the removed_vertices list and the modified edge list.
```

*Vertex of degree 0:*

- A vertex that starts with degree 0 is isolated from the rest of the graph, and our algorithm will not touch or remove it since it never had a degree of 1 to begin with.

*Vertex of degree 2:*

- If a vertex of degree 2 exists, our algorithm will not remove it at first, because the criterion for removal is a degree of 1. However, once one of its neighbors has been removed, this vertex drops to degree 1 and is handled in a subsequent iteration. If the vertex lies on a cycle, its degree never drops below 2 and it is never removed.

**Pseudocode:**

```plaintext
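function removeAndReconnect(edges):
    graph = constructAdjList(edges)
    removed_vertices = []
    // Queue every vertex whose degree is 1 in the input graph
    queue = [vertex for vertex in graph if degree(vertex) == 1]

    while queue is not empty:
        vertex = queue.popFront()
        neighbor = getNeighbor(vertex, graph)   // None if the last edge is already gone

        removed_vertices.append(vertex)
        removeVertex(vertex, graph)

        if neighbor and degree(neighbor) == 1:
            // neighbor's single remaining edge already links it to its next neighbor
            next_neighbor = getNextNeighbor(neighbor, graph)
            if next_neighbor:
                connect(neighbor, next_neighbor, graph)   // no-op if the edge exists
            queue.pushBack(neighbor)

    return removed_vertices, getEdgeList(graph)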
```
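
A queue-based version of this procedure can be sketched in Python as follows. The function name `remove_degree_one` is illustrative, and two interpretations are assumptions of this sketch: a vertex that was queued at degree 1 is still output if its last edge disappears before it is dequeued (this is what reproduces vertex 3 in the sample output), and "connecting the neighbor to its next neighbor" adds no new edge because that edge is already present.

```python
from collections import deque


def remove_degree_one(edges):
    """Repeatedly remove degree-1 vertices; return (removal order, remaining edges)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Seed the worklist with every vertex that has degree 1 in the input graph.
    queue = deque(v for v in adj if len(adj[v]) == 1)
    removed = []

    while queue:
        v = queue.popleft()
        if v not in adj:
            continue  # defensive: already removed
        removed.append(v)
        for nb in adj.pop(v):  # at most one neighbor remains
            adj[nb].discard(v)
            if len(adj[nb]) == 1:
                queue.append(nb)  # nb just became a degree-1 vertex

    # An edge survives exactly when both of its endpoints survive.
    remaining = [(u, v) for u, v in edges if u in adj and v in adj]
    return removed, remaining


if __name__ == "__main__":
    print(remove_degree_one([(1, 2), (2, 3), (3, 4), (4, 5)]))
    print(remove_degree_one([(1, 2), (2, 3), (3, 1), (3, 4)]))
```

On the sample path graph this yields the removal order `[1, 5, 2, 4, 3]` with no edges left. On a triangle with a pendant vertex, only the pendant vertex is removed and the cycle remains, which illustrates the degree-2 case discussed above.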



### **Justification and Proof of Correctness - Question 3**

**Algorithm and Pseudocode justification:**

*1. Loop Invariant:*
- After each iteration of the while loop, the graph equals the input graph minus the vertices listed in removed_vertices and their incident edges, and each listed vertex had degree at most 1 when it was removed.

    - Initialization: Before the loop, no vertices are removed; the invariant is true.
    - Maintenance: Each removal deletes one vertex of degree at most 1 together with its edge. Its neighbor may drop to degree 1 and then becomes a candidate for a later iteration, maintaining the invariant.
    - Termination: The loop ends when no more degree-1 vertices are left, making the invariant true at termination.

*2. Postcondition:*
Upon algorithm termination, the graph contains no vertices of degree 1. Every remaining vertex is either isolated (degree 0) or has degree at least 2.

*3. Implications for Vertices:*

- Degree 0 (Isolated): Unaffected, as they're never targeted for removal.
- Degree 2: These become degree-1 vertices once a neighbor is removed, and are then removed in a subsequent iteration. Vertices on a cycle never drop below degree 2, so they are never removed.

*4. Edge Reduction:*
- Every removal deletes one vertex and at most one edge. This guarantees that the algorithm doesn't loop indefinitely, as the graph strictly shrinks with every iteration.

*5. Termination:*
- With every iteration, at least one vertex is removed. Hence, the algorithm will halt after at most |V| iterations.

By consolidating these points, we can reasonably assert that our algorithm correctly and accurately removes vertices of degree 1 and modifies the graph according to the problem statement.


**Proof for O(V+E) Time Complexity:**

- Adjacency List Creation: We iterate over all edges (E) once to create the adjacency list, recording each vertex's degree as we go. This takes O(V+E) time.

- Main Loop: In the worst case, every vertex (V) is processed once, provided degree-1 vertices are kept in a queue (or similar worklist) instead of being rediscovered by rescanning the whole graph. Inside this loop:
    - Checking the degree of a vertex: O(1)
    - Removing a vertex and its edge: O(1), assuming each neighbor set supports constant-time deletion
    - Connecting neighbors: O(1)

- Given the above operations are constant time (O(1)) and each vertex is processed once, the loop's complexity is O(V).

- Edge Iteration: Since every edge is considered at most twice (once for each endpoint), this contributes O(2E), which is O(E).

- Combining the complexities: O(E) for adjacency list creation, O(V) for the vertex loop, and O(E) for edge iteration gives us O(V+E) as the total time complexity.

Thus, the algorithm operates in O(V+E) time.

### **Question 4 (10 points)**

**Storage Optimization of Precious Metals**

Problem Statement:  

A jeweler has recently acquired a collection of precious metals. Each metal has an associated value and weight. The jeweler has a limited storage box that can only accommodate a certain weight. The jeweler wishes to store the metals in such a way that he maximizes the total value of metals in the box.

Given the weights and values of the metals, determine which metals to store in the storage box to maximize the total value, keeping within the weight limit of the box.

Input:
- A list of n metals, each with a weight and a value.
- An integer, W, representing the weight limit of the storage box.

Output:
- A list of metals that should be stored in the box to maximize the value.

| Metal (i) | Value (v_i) | Weight (w_i) |
|----------|------------|-------------|
| Gold     | 6          | 4           |
| Silver   | 4          | 3           |
| Platinum | 5          | 5           |
| Copper   | 3          | 2           |




### **Solution - Question 4**

*Algorithm Approach: Dynamic Programming*

- Start by creating a 2D DP table where the row represents each metal and the column represents each weight from 0 up to W.
- Initialize the first row and the first column to 0 because the maximum value with 0 weight or 0 metals is 0.
- For each metal and each weight, check if the weight of the current metal is less than or equal to the current weight limit.

- If it is, compare the value of:  
    **a. Including the current metal:** The value of the current metal plus the value from the previous row with a reduced weight (current weight - weight of current metal).  
    **b. Excluding the current metal:** The value from the previous row with the same weight.
    Update the current cell with the maximum of the two values above. If the metal does not fit, copy the value from the previous row with the same weight.

- After filling out the table, backtrack to find which metals were included.
- Start from the last metal and the full weight W.
- If the value is the same as the value in the row above, the metal was not included. If it's different, the metal was included.

*Pseudocode:*  
```plaintext
function maximizeMetalValue(values, weights, W):
    // values and weights are indexed from 1 to n
    n = length of values
    dp = array of size (n+1) x (W+1) initialized to zero

    // dp[i][w] = best total value using only the first i metals with weight limit w
    for i from 1 to n:
        for w from 0 to W:
            if weights[i] <= w:
                dp[i][w] = max(dp[i-1][w], dp[i-1][w-weights[i]] + values[i])
            else:
                dp[i][w] = dp[i-1][w]

    storedMetals = empty list
    w = W
    for i from n down to 1:
        if dp[i][w] != dp[i-1][w]:  // This indicates the metal was included
            storedMetals.append(i)
            w -= weights[i]

    return storedMetals
```
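
The pseudocode translates directly into Python with 0-indexed lists. In the sketch below the function name and the `names` parameter are illustrative additions, and the weight limit `W = 9` is a hypothetical value chosen for the demonstration, since the problem statement gives the table of metals but no concrete limit.

```python
def maximize_metal_value(names, values, weights, capacity):
    """0/1 knapsack: return (best total value, list of chosen metal names)."""
    n = len(values)
    # dp[i][w] = best value using only the first i metals with weight limit w
    dp = [[0] * (capacity + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):
        value, weight = values[i - 1], weights[i - 1]
        for w in range(capacity + 1):
            dp[i][w] = dp[i - 1][w]  # exclude metal i
            if weight <= w:          # include metal i if it fits
                dp[i][w] = max(dp[i][w], dp[i - 1][w - weight] + value)

    # Backtrack from the last metal and the full capacity.
    chosen, w = [], capacity
    for i in range(n, 0, -1):
        if dp[i][w] != dp[i - 1][w]:  # value changed, so metal i was included
            chosen.append(names[i - 1])
            w -= weights[i - 1]
    chosen.reverse()
    return dp[n][capacity], chosen


if __name__ == "__main__":
    names = ["Gold", "Silver", "Platinum", "Copper"]
    values = [6, 4, 5, 3]
    weights = [4, 3, 5, 2]
    print(maximize_metal_value(names, values, weights, 9))  # W = 9 is assumed
```

With the assumed limit of 9, the sketch selects Gold, Silver, and Copper for a total value of 13 at total weight 9, which beats Gold plus Platinum (value 11 at the same weight).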

### **Justification and Proof of Correctness - Question 4**

**Proof by Induction:**

- **Base Case:**
    The base cases are straightforward. If we have 0 items or a knapsack of 0 weight, the maximum value we can obtain is 0. This is reflected in the initialization, where the first row and the first column of the dp table are initialized to 0.

- **Inductive Hypothesis:**
    Assume that for some k items and weight w, dp[k][w] stores the maximum value that can be obtained by considering the first k items and a knapsack weight of w. We want to prove that dp[k+1][w] will store the maximum value that can be obtained by considering the first k+1 items and a knapsack weight of w.

- **Inductive Step:**
    For item k+1, we have two choices:  
    a. Exclude the item. If we exclude, the solution would just be dp[k][w] since we rely on the previous row's solution.
    b. Include the item, which is only possible when weights[k+1] <= w. If we include the item, the solution would be the value of the k+1 item plus the solution from the previous row with a reduced weight (dp[k][w-weights[k+1]] + values[k+1]). By the inductive hypothesis, both dp[k][w] and dp[k][w-weights[k+1]] are already optimal.

The algorithm correctly considers both these choices by using the max function:
dp[k+1][w] = max(dp[k][w], dp[k][w-weights[k+1]] + values[k+1])

This means that for each item and weight combination, the table is storing the maximum possible value that can be obtained.

- **Optimal Substructure:**
    When we choose to include an item in the knapsack, the remaining problem we have to solve is finding the optimal solution for the remaining weight and the remaining items. This is a subproblem of the original problem. The dynamic programming approach computes solutions for these subproblems and uses them to construct a solution for the original problem.

- **Overlapping Subproblems:**
    Different include/exclude decisions for later items can lead to the same subproblem (the same first k items with the same remaining capacity), so a naive recursion would solve that subproblem repeatedly. The dynamic programming approach stores the solution to each subproblem in the dp table so that it is computed only once, which gives an O(nW) running time.

- **Backtracking:**
    The backtracking part of the pseudocode reconstructs the solution using the dp table. If the value in dp[i][w] is not equal to the value in dp[i-1][w], this means that item i was included in the solution for weight w. By backtracking from the last item and full weight down to the first item and weight 0, we can determine which items were included in the optimal solution.

### **Question 5 (15 points)**

From the weighted undirected graph below,

Express it as:  
A. An adjacency list (2.5 points)  
B. An adjacency matrix (2.5 points)

Find the shortest paths to all the vertices from vertex 0 using Dijkstra's algorithm. Show steps as well as distances to each vertex. (10 points):

![prob1.png](attachment:image.png)

### **Solution - Question 5**

**A. Adjacency List:**  

![image.png](attachment:image.png)  

**B. Adjacency Matrix:**  

![image-2.png](attachment:image-2.png)

**Step by step process:**

- Start with vertex 0; its distance to itself is 0, and every other distance is initially unknown (infinite).
- For each neighbor of the current vertex, update the neighbor's distance if the path through the current vertex is shorter than its current known distance. After updating all neighbors of the current vertex, mark the current vertex as visited.
- Pick the unvisited vertex with the shortest known distance and repeat the previous step until every vertex has been visited.

- From vertex 0:
    - Distance to vertex 3 = 3
    - Distance to vertex 4 = 9

- Next shortest distance is to vertex 3 (distance 3):
    - Distance to vertex 7 = 3 + 4 = 7

- Vertex 7 (distance 7) is now the closest unvisited vertex, so it is processed before vertex 4:
    - Distance to vertex 5 = 7 + 8 = 15

- Next is vertex 4 (distance 9):
    - Distance to vertex 2 = 9 + 2 = 11

- Vertex 2 follows (distance 11):
    - Distance to vertex 6 = 11 + 3 = 14

- Vertex 6 (distance 14) is visited next; it yields no shorter distance to any vertex.

- From vertex 5 (distance 15):
    - Distance to vertex 1 = 15 + 5 = 20

Thus, we obtain the paths as follows:

- To vertex 0: 0 (Distance: 0)
- To vertex 1: 0 -> 3 -> 7 -> 5 -> 1 (Distance: 20)
- To vertex 2: 0 -> 4 -> 2 (Distance: 11)
- To vertex 3: 0 -> 3 (Distance: 3)
- To vertex 4: 0 -> 4 (Distance: 9)
- To vertex 5: 0 -> 3 -> 7 -> 5 (Distance: 15)
- To vertex 6: 0 -> 4 -> 2 -> 6 (Distance: 14)
- To vertex 7: 0 -> 3 -> 7 (Distance: 7)
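
The procedure can be checked with a short heap-based implementation. This is a sketch under a stated assumption: the original graph is only available as an image, so the edge list below contains just the seven edges named in the walk-through, not the full graph. The function name and return values are illustrative.

```python
import heapq


def dijkstra(adj, source):
    """Return (dist, parent, visit_order) for non-negative edge weights."""
    dist = {source: 0}
    parent = {source: None}
    visited = set()
    order = []
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale heap entry
        visited.add(u)
        order.append(u)
        for v, w in adj.get(u, []):
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent, order


def build_undirected(edges):
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    return adj


# Only the edges mentioned in the walk-through (assumed subset of the graph).
WALKTHROUGH_EDGES = [
    (0, 3, 3), (0, 4, 9), (3, 7, 4), (4, 2, 2),
    (7, 5, 8), (2, 6, 3), (5, 1, 5),
]

if __name__ == "__main__":
    dist, parent, order = dijkstra(build_undirected(WALKTHROUGH_EDGES), 0)
    print(dist)
    print(order)
```

The `order` list records the sequence in which vertices are finalized, which is always in non-decreasing order of distance; a vertex at distance 7 is therefore finalized before a vertex at distance 9.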

### **Justification and Proof of Correctness - Question 5**

In our walk-through of the provided graph:

- We started with vertex 0 and set its distance to 0.
- At each step, we consistently picked the vertex with the smallest known distance that hadn't been visited. For instance, after starting with vertex 0, we chose vertex 3 because its distance (3) was the shortest among unvisited vertices.
- Each time we visited a vertex, we updated the distances to its neighbors if we found a shorter path. This ensured that we always had the shortest known distances for all vertices.
- As we progressed, we never revisited an already visited vertex or changed its distance. This is the invariant Dijkstra's algorithm maintains: once a vertex is visited, its recorded distance is final.

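Dijkstra's algorithm is correct whenever all edge weights are non-negative, as they are for every weight used in this walk-through: no path through a farther, unvisited vertex can then be shorter than the distance of the closest unvisited vertex. Since we applied the algorithm without deviation, the paths and distances we found are the shortest paths from vertex 0 to all other vertices in the graph.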

### **Question 6 (20 points)**

Given the DAG below,  

![image.png](attachment:image.png)

Express the directed graph above as:  
A. An adjacency list (2.5 points)  
B. An adjacency matrix (2.5 points)  

Can the directed graph be topologically sorted? If so, produce a topological sort for the graph using DFS. 
Give the pseudocode for the DFS approach for topological sort. (15 points)

### **Solution - Question 6**

**A. Adjacency List:**  

![image.png](attachment:image.png)  

**B. Adjacency Matrix:**  

![image-2.png](attachment:image-2.png)  

Since the Graph is a DAG, it can be topologically sorted.

**Pseudocode for Topological Sort:**

```plaintext
function DFS_Topological_Sort(graph):
    Initialize an empty stack S
    Mark all vertices as not visited

    for each vertex v in graph:
        if v is not visited:
            DFS_Util(v, visited, S)

    while S is not empty:
        pop an item from S and print it (or store in a result list)

function DFS_Util(v, visited, S):
    mark v as visited

    for each vertex u adjacent to v:
        if u is not visited:
            DFS_Util(u, visited, S)

    push v onto S
```

**Topological Sort:**

- Start with Vertex 5:
    - Visit Vertex 6: No unvisited adjacent vertices. Push 6 onto stack.
    - Visit Vertex 7: No unvisited adjacent vertices. Push 7 onto stack.
    - Push 5 onto the stack.   
     

- Move to Vertex 1:  
    - Visit Vertex 2: No unvisited adjacent vertices. Push 2 onto stack.
    - Push 1 onto the stack.  
    
- Move to Vertex 0:
    - Visit Vertex 3: Moves to Vertex 7, which is visited. Push 3 onto stack.
    - Visit Vertex 4: Adjacent vertices 6 and 7 are visited. Push 4 onto stack.
    - Push 0 onto the stack.

All vertices are now visited.

Push order onto the stack: 6, 7, 5, 2, 1, 3, 4, 0.

Popping the stack reverses this order and gives the topological order: 0, 4, 3, 1, 2, 5, 7, 6.
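
The walk-through can be reproduced with a direct Python translation of the pseudocode. Two things here are assumptions: the edge set is reconstructed from the traversal descriptions in this solution (the graph itself is only available as an image), and the `start_order` parameter is an illustrative addition that lets the outer loop begin at vertices 5, 1, 0 as the walk-through does. Any order in which every edge points forward is a valid topological sort, so a different visiting order can give a different but equally valid answer.

```python
def dfs_topological_sort(adj, start_order):
    """DFS-based topological sort; returns vertices in topological order."""
    visited = set()
    stack = []  # vertices in order of completion (post-order)

    def visit(v):
        visited.add(v)
        for u in adj.get(v, []):
            if u not in visited:
                visit(u)
        stack.append(v)  # pushed only after all descendants are on the stack

    for v in start_order:
        if v not in visited:
            visit(v)
    return stack[::-1]  # popping the stack reverses the completion order


# Edges reconstructed from the walk-through (an assumption, see above).
DAG = {
    0: [2, 3, 4],
    1: [2],
    3: [7],
    4: [6, 7],
    5: [6, 7],
}

if __name__ == "__main__":
    print(dfs_topological_sort(DAG, [5, 1, 0, 2, 3, 4, 6, 7]))
```

A quick way to validate any candidate order is to check that for every edge `u -> v`, the position of `u` is smaller than the position of `v`.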

### **Justification and Proof of Correctness - Question 6**

**Proof to check if the graph can be topologically sorted:**

Using Depth First Search (DFS), one can detect cycles. If no cycles are present, it's possible to perform a topological sort.

- Starting from vertex 0:

    - Vertex 2 is visited. It doesn't have further adjacent vertices that are unvisited.
    - Vertex 3 is visited next, leading to vertex 7. Vertex 7 doesn't have further unvisited adjacent vertices.
    - Vertex 4 is then visited, leading to vertices 6 and 7. Neither has further unvisited adjacent vertices.  
    
- By continuing this traversal for all vertices:
    - It's observed that vertices 2, 6, and 7 have no outgoing edges. All other vertices have been covered.  
    
- During the traversal, no back edges were encountered, which would have indicated a cycle.

- From the evidence gathered:

    - The DFS traversal from all vertices didn't detect any cycles.
    - All vertices in the graph were reachable.
    - The graph is directed and doesn't contain any cycles.
    
Thus, it is concluded that a topological sort can be performed on the graph.

**Pseudocode and approach justification:**

- The key insight is that in a DAG, a node is pushed onto the stack only after all its descendants are visited and already on the stack.
- When the DFS_Util function visits a node, it first recursively explores all its adjacent (neighbor) nodes. Only after all these nodes are visited is the current node pushed onto the stack.
- This guarantees that for every directed edge from vertex u to vertex v, u appears before v in the topological ordering (i.e., v is pushed onto the stack before u).
- The outer loop in DFS_Topological_Sort ensures that every vertex is visited. This ensures that all vertices, even if they're part of a disconnected subgraph, are included in the topological ordering.
- The pseudocode ensures all nodes are visited and processed. The use of the stack ensures the correct order is maintained, as explained above: each node is pushed only after all of its descendants.
- The marking of nodes as "visited" ensures that nodes are not processed more than once and that cycles (if they existed, which they shouldn't in a DAG) don't lead to infinite loops.

Given that the graph is a DAG (which is a prerequisite for topological sorting), the provided DFS-based pseudocode correctly computes a topological ordering of its vertices.

### **Question 7 (15 points)**

Given the following graph,

![image.png](attachment:image.png)

Express the graph above as:  
A. An adjacency list (2.5 points)  
B. An adjacency matrix (2.5 points)  

Find the minimum spanning tree using Kruskal's algorithm. Show the steps as well as the edges selected. (10 points)

### **Solution - Question 7**

**A. Adjacency List:**    

![image-4.png](attachment:image-4.png)

**B. Adjacency Matrix:**  

![image-3.png](attachment:image-3.png)

**Minimum Spanning Tree:**

Step 1: List all edges of the graph in non-decreasing order of their weight.

Sorted edges by weight:
(1,2) - 1  
(5,6) - 4  
(6,7) - 5  
(0,3) - 6  
(0,4) - 6  
(1,6) - 6  
(4,7) - 7  
(2,6) - 9  

Step 2: Start with an empty graph and add edges one by one in increasing order of their weights. If adding an edge creates a cycle, we discard it.

Let's go edge by edge:

(1,2) - Add it. No cycle formed.  
(5,6) - Add it. No cycle formed.  
(6,7) - Add it. No cycle formed.  
(0,3) - Add it. No cycle formed.  
(0,4) - Add it. No cycle formed.  
(1,6) - Add it. No cycle formed.  
(4,7) - Add it. No cycle formed.  
(2,6) - Discard. It would create a cycle with vertices 1, 2, and 6.  

Result: The edges in the minimum spanning tree are:  
(1,2), (5,6), (6,7), (0,3), (0,4), (1,6), and (4,7)

The weight of this minimum spanning tree is 1 + 4 + 5 + 6 + 6 + 6 + 7 = 35.

![image.png](attachment:image.png)
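
The edge-by-edge procedure can be reproduced with a small union-find implementation. This sketch uses exactly the eight weighted edges from the sorted list above; the function name, the `(weight, u, v)` tuple layout, and the returned triple are illustrative choices.

```python
def kruskal(num_vertices, edges):
    """edges: iterable of (weight, u, v). Returns (tree_edges, discarded, total_weight)."""
    parent = list(range(num_vertices))

    def find(x):
        # Find the component representative, halving paths along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree, discarded, total = [], [], 0
    for weight, u, v in sorted(edges):  # non-decreasing weight
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            discarded.append((u, v))  # would close a cycle
        else:
            parent[root_u] = root_v   # merge the two components
            tree.append((u, v))
            total += weight
    return tree, discarded, total


EDGES = [
    (1, 1, 2), (4, 5, 6), (5, 6, 7), (6, 0, 3),
    (6, 0, 4), (6, 1, 6), (7, 4, 7), (9, 2, 6),
]

if __name__ == "__main__":
    tree, discarded, total = kruskal(8, EDGES)
    print(tree)
    print(discarded, total)
```

Union-find answers the "would this edge create a cycle?" question by checking whether both endpoints already belong to the same component, which avoids an explicit graph search for each edge.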



### **Justification and Proof of Correctness - Question 7**

**Validity of the Minimum Spanning Tree:**

Edges included:
(1,2), (5,6), (6,7), (0,3), (0,4), (1,6), and (4,7)

- **Connectivity:** These edges ensure that every vertex is connected directly or indirectly to every other vertex.

- **No Cycles:** None of these edges introduces a cycle. For example, edge (4,7) does not close a cycle because, up to that point in the algorithm, vertex 4 was not connected to vertex 7 through any other path: it joins the component {0, 3, 4} to the component {1, 2, 5, 6, 7}.

- **Optimality:** The edges were added based on increasing weight. By the time we reached the heaviest edge (2,6) with a weight of 9, adding it would have caused a cycle. Hence, the MST we derived avoids the heavier edges that would form cycles, ensuring the overall weight is minimized.

Given the nature of Kruskal's algorithm, the greedy choice property, and by ensuring we're not forming cycles, the derived solution is guaranteed to be the Minimum Spanning Tree for the given graph.