### Question 1 (10 points)

There are x research groups at Northeastern University, each focusing on a particular subject such as HCI, Data Analytics, or Bioinformatics. Each research group has a fixed grant amount for supporting research and stipends, as well as a limited number of open positions. Your job is to assign n graduate students to these research groups based on the criteria below.
Students and research groups both have preferences:

- Students rank groups by research focus, publication prospects, and mentorship quality.
- Research groups rank students by their abilities, previous research, academic records, and specialized skill sets.

**Objectives:**
1. Fill all open research positions within each group.
2. Do not exceed the grant amount of any research group.
3. Match according to the preferences and skill sets of the students and groups.

**Constraints:**

- The number x denotes the number of research groups.
- n denotes the total number of graduate students.
- Each research group has a finite number of open slots y, where y>0.
- Each student possesses a non-empty set of skills.
- To be considered for a position, a student may need one or more skills required by the research group.

**Common Input and Output Formats:**

*-Input:* Lists of students' choices, stipend needs, and skill sets, followed by lists of research groups' preferences, grant amounts, and necessary skills.

*-Output:* A stable list of student-research group matches that complies with the funding and skill set requirements.

Derive an algorithm to create a stable matching between the students and research groups. Write pseudocode and explain the time complexity of the algorithm used.


### **Solution - Question 1**

The problem is an extension of the Stable Marriage problem, and we can solve it using a modified Gale-Shapley algorithm. Checking for grant amount limits and ensuring that students have the necessary abilities for a certain research group are among the modifications.

### Pseudocode

```plaintext
Initialize all students as "unmatched"
Initialize each research group as "not full" with remaining grant = its initial grant
Create an empty list of matches for each research group

while there is an "unmatched" student who has not proposed to every research group:
    student = first unmatched student who has not proposed to every group
    group = highest-ranked group on student's list to which the student has not yet proposed
    Record that student has proposed to group

    if student lacks a skill required by group:
        continue    // rejected; student stays "unmatched"

    if group is "not full":
        if group's remaining grant >= student's stipend demand:
            Add student to group's list
            Mark student as "matched"
            Decrease group's remaining grant by student's stipend demand
            if group's positions are filled:
                Mark group as "full"
    else:
        worst_matched_student = student in group ranked lowest by group's preference
        // the replaced student's stipend returns to the grant before the check
        if group prefers student over worst_matched_student
           and group's remaining grant + worst_matched_student's stipend >= student's stipend demand:
            Remove worst_matched_student from group
            Add student to group
            Mark worst_matched_student as "unmatched"
            Mark student as "matched"
            group's remaining grant += worst_matched_student's stipend - student's stipend demand
```
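The pseudocode above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the function name `match_students`, the dictionary layout, and the sample data are all assumptions made for the example, and a displaced student's stipend is returned to the group's grant before the newcomer's stipend is checked.

```python
from collections import deque


def match_students(students, groups):
    """Student-proposing deferred acceptance with skill and grant checks.

    students: name -> {"prefs": [group, ...], "skills": set, "stipend": int}
    groups:   name -> {"prefs": [student, ...], "slots": int,
                       "grant": int, "required": set}
    (This data layout is an assumption made for the sketch.)
    """
    remaining = {g: info["grant"] for g, info in groups.items()}
    members = {g: [] for g in groups}
    # rank[g][s] is the position of student s on group g's list (0 = best).
    rank = {g: {s: i for i, s in enumerate(info["prefs"])}
            for g, info in groups.items()}
    next_choice = {s: 0 for s in students}
    free = deque(students)

    while free:
        s = free.popleft()
        prefs = students[s]["prefs"]
        if next_choice[s] >= len(prefs):
            continue  # proposed everywhere; the student stays unmatched
        g = prefs[next_choice[s]]
        next_choice[s] += 1
        stipend = students[s]["stipend"]

        qualified = (s in rank[g]
                     and groups[g]["required"] <= students[s]["skills"])
        if not qualified:
            free.append(s)
        elif len(members[g]) < groups[g]["slots"]:
            if remaining[g] >= stipend:
                members[g].append(s)
                remaining[g] -= stipend
            else:
                free.append(s)
        else:
            worst = max(members[g], key=lambda m: rank[g][m])
            # Budget available once the displaced student's stipend is returned.
            freed = remaining[g] + students[worst]["stipend"]
            if rank[g][s] < rank[g][worst] and freed >= stipend:
                members[g].remove(worst)
                members[g].append(s)
                remaining[g] = freed - stipend
                free.append(worst)
            else:
                free.append(s)
    return members


# Hypothetical sample data for illustration only.
students = {
    "ana": {"prefs": ["hci", "bio"], "skills": {"python"}, "stipend": 10},
    "ben": {"prefs": ["hci", "bio"], "skills": {"python", "stats"}, "stipend": 10},
    "cy": {"prefs": ["hci"], "skills": {"design"}, "stipend": 10},
}
groups = {
    "hci": {"prefs": ["ben", "ana", "cy"], "slots": 1, "grant": 15, "required": {"python"}},
    "bio": {"prefs": ["ana", "ben"], "slots": 1, "grant": 10, "required": {"python"}},
}
result = match_students(students, groups)
```

In this sample, `cy` lacks the skill that `hci` requires and has no other choices, so `cy` ends up unmatched, which the problem's constraints allow.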

### Analysis of Time Complexity

1. Each student may propose to each of the x research groups, so there are at most *O(n · x)* proposals.
2. Each proposal checks whether the student has the skills the group requires. Assuming the number of required skills per group is bounded by a small constant, this check takes *O(1)* time.
3. Each proposal to a full group may also need to find the worst-matched student in that group, which takes *O(y)* time, where y is the number of slots in the group.

As a result, the overall time complexity is *O(n · x · 1 · y)* = *O(n · x · y)*.

Take note that x represents the number of research groups, n represents the number of students, and y represents the maximum number of slots in any group.

### **Justification and Proof of Correctness - Question 1**

Proving the correctness of an algorithm generally involves demonstrating that it satisfies the properties that define a "correct" answer. We want to prove two key properties of the modified Gale-Shapley algorithm:

- **Stability**: There should be no motivation for any student or research group to break their allocated matching to pair with each other.
- **Feasibility**: All constraints, such as skills, stipend expectations, and grant amounts, must be met.

#### Proof of Stability

For Students:

- A student proposes to research groups in order of preference.
- A student is never moved from a more preferred group to a less preferred one without first being rejected or displaced by the more preferred group.
- As a result, once matched, a student has no reason to leave the assigned group: every group they prefer has already rejected or displaced them.

For Research Groups:

- A group only replaces a current member if it prefers the new student. Once a group is full, any replacement improves the group's set of members according to its preference list.
- As a result, a research group has no incentive to break the existing matching for a different, less preferred student.

#### Proof of Feasibility

- Skill sets are verified before any match is made, so no student is placed in a group without the required skills. The stipend is likewise checked against, and deducted from, the group's remaining grant before matching, which guarantees that each group can afford all of its members.

#### Inductive Proof of Correctness

- Base Case (k=0): No one is matched when the process starts, which is a valid state of the system. Thus the claim holds after 0 iterations.
- Inductive Hypothesis (iteration k): Assume that after k iterations, all current matches between students and groups are stable and feasible.

- Inductive Step (iteration k+1): Consider iteration k+1.

    - If a student is matched, they are assigned to the most preferred group that has not yet rejected them, for which they have the necessary skills and whose remaining grant can cover their stipend.
    - If a full group accepts the student, it does so by replacing a less preferred member with a more preferred one, which preserves stability.
    - Replacements happen only when the group strictly prefers the newcomer, so the matching after iteration k+1 remains stable and feasible whenever the matching after iteration k was. As a result, the claim holds after k+1 iterations.

Therefore, the modified Gale-Shapley algorithm used here produces a stable and feasible matching, which makes it a viable solution to the problem.

### Question 2 (10 points)

You are provided a function called *OptimalVendorPairing*, which can be used to solve the problem statement.

*OptimalVendorPairing()*

- *Input:* A set of *n* food trucks and *n* popular locations. Each food truck has a preference list ranking all *n* locations, and each location has a preference list ranking all *n* food trucks.
- *Output:* A stable pairing of food trucks to locations.

You're in charge of coordinating *n* art events and *n* food trucks. Each art event requires exactly one food truck. However, both the events and the food trucks have their own criteria: An art event might find some food trucks "incompatible" due to space constraints, noise level, or type of cuisine. Similarly, a food truck may find some art events "incompatible" due to lack of audience interest, parking issues, or event duration.

In this context, an allocation is considered "stable" if:

- No food truck is allocated to an art event it deems incompatible.
- No art event is allocated a food truck it finds incompatible.
- There are no unallocated food truck-art event pairs that both find the other compatible and would prefer each other over their current allocation.

**Typical Input and Output Format:**

*-Input:*     
- artEvents: A list of *n* art events.
- foodTrucks:  A list of *n* food trucks.
- eventPrefs: Each art event's preference list of food trucks, which may include both acceptable and unacceptable choices.
- truckPrefs: Each food truck's preference list of art events, which may include both acceptable and unacceptable choices.
- eventUnfit: A list of *n* lists, each containing boolean values that indicate if an art event finds the corresponding food trucks unacceptable.
- truckUnfit: A list of *n* lists, each containing boolean values that indicate if a food truck finds the corresponding art events unacceptable.

*-Output:* 
- matchings: A list of *n* tuples, each containing an art event and a food truck that it's paired with. If an art event is not paired, it would be represented as None. 

Note: It's not necessary for every food truck to be allocated to have a stable system, due to the concept of "incompatible" choices. The preference lists may include both acceptable and unacceptable choices.

(a) Given *n* art events and *n* food trucks, along with their respective preference lists and incompatible choices, derive an algorithm and describe how you can use *OptimalVendorPairing* to find a stable pairing.

(b) Estimate the runtime of your proposed algorithm. Assume that for an input with *n* art events and *n* food trucks, OptimalVendorPairing has a runtime of O(n^2).

### **Solution - Question 2**

**(a) Pseudo-code**

Pseudo-code for the algorithm that handles stable pairings of art events and food trucks:

```plaintext
Algorithm StableEventTruckPairing(artEvents, foodTrucks, eventPrefs, truckPrefs, eventUnfit, truckUnfit)
    
    // Step 1: Preprocessing to Remove Incompatible Choices
    Declare cleanedEventPrefs as empty list
    Declare cleanedTruckPrefs as empty list

    For i from 0 to length(artEvents) - 1:
        Declare tempList as empty list
        For j from 0 to length(foodTrucks) - 1:
            If eventUnfit[i][j] is False:
                Append eventPrefs[i][j] to tempList
        End For
        Append tempList to cleanedEventPrefs
    End For

    For i from 0 to length(foodTrucks) - 1:
        Declare tempList as empty list
        For j from 0 to length(artEvents) - 1:
            If truckUnfit[i][j] is False:
                Append truckPrefs[i][j] to tempList
        End For
        Append tempList to cleanedTruckPrefs
    End For

    // Step 2: Call to OptimalVendorPairing Function
    matchings = OptimalVendorPairing(artEvents, foodTrucks, cleanedEventPrefs, cleanedTruckPrefs)

    // Step 3: Postprocessing for Unallocated Events or Trucks
    Declare finalMatchings as empty list

    For each tuple in matchings:
        If tuple contains None:
            Append None to finalMatchings
        Else:
            Append tuple to finalMatchings
        End If
    End For

    Return finalMatchings

End Algorithm
```
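The preprocessing step can be sketched in Python as below. This is only an illustration under a stated assumption that differs slightly from the index convention in the pseudocode: each preference list is taken to hold partner indices in ranked order, and `unfit[i][p]` is taken to be indexed by partner index `p` rather than by rank position. The helper name `remove_incompatible` is hypothetical.

```python
def remove_incompatible(prefs, unfit):
    """Drop every partner p from prefs[i] for which unfit[i][p] is True.

    Assumes prefs[i] lists partner indices in ranked order and unfit[i][p]
    is indexed by partner index (an assumption made for this sketch).
    The relative order of the remaining choices is preserved.
    """
    return [[p for p in row if not unfit[i][p]] for i, row in enumerate(prefs)]


# Hypothetical example with 2 events and 3 trucks.
event_prefs = [[1, 0, 2], [0, 2, 1]]
event_unfit = [[False, True, False], [False, False, True]]
cleaned_event_prefs = remove_incompatible(event_prefs, event_unfit)
```

Each list is scanned once, so for n events and n trucks the filtering takes O(n^2) time, in line with the runtime estimate for the preprocessing step.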

**(b) Estimating Runtime**

1. Preprocessing Step: It would take O(n^2) time to go through every art event and every food truck and update their preference lists.

2. OptimalVendorPairing Call: According to the problem statement, the OptimalVendorPairing() function has a runtime of O(n^2).

3. Postprocessing Step: Going through each tuple in the matchings list takes O(n) time.

Therefore, the total runtime is O(n^2) for preprocessing + O(n^2) for OptimalVendorPairing + O(n) for postprocessing.

The O(n^2) terms dominate, so the overall runtime complexity of the proposed algorithm is O(n^2).


### **Justification and Proof of Correctness - Question 2**

**Pre-Conditions and Assumptions**

- OptimalVendorPairing is assumed to be correct and stable when given any set of preference lists.
- Incompatible choices can be completely and correctly identified by the eventUnfit and truckUnfit lists.

**Proof of Correctness**

*Part 1: Preprocessing Step*

Claim: After preprocessing, the cleaned-up preference lists (cleanedEventPrefs and cleanedTruckPrefs) will only contain compatible choices.

*Proof:*

- We loop through each list in eventPrefs and truckPrefs, removing choices that are marked as incompatible in eventUnfit and truckUnfit.
- Therefore, all the remaining choices in cleanedEventPrefs and cleanedTruckPrefs are compatible by definition.

*Part 2: Application of OptimalVendorPairing*

Claim: When applied to the cleaned-up preference lists, OptimalVendorPairing will produce a stable matching.

*Proof:*

- The input preference lists only contain compatible choices, thanks to the preprocessing.
- OptimalVendorPairing is assumed to be correct and produces stable pairings.
- Therefore, the output should be a stable matching based on the definitions provided in the problem statement.

*Part 3: Postprocessing Step*

Claim: Any unmatched (None) entries in the matching list are inevitable and do not violate the stability of the system.

*Proof:*

- Any "None" entries are results of not finding a compatible and preferable choice for a given art event or food truck.
- By definition, this means there are no better or acceptable alternatives for these entries, so they do not violate stability.

### Question 3 (20 points)

(a) Construct an undirected graph as described on the right, using an adjacency matrix named "Adj". This matrix should be implemented as a direct access array set. The vertices are labeled from 0 to 5. For each vertex u in the set {0,1,2,3,4,5}, Adj[u] will represent its adjacency list. The adjacency lists themselves should also be implemented as direct access array sets. An element Adj[u][v] should be set to 1 if there is an edge connecting vertices u and v. [5 points]

Adjacency Matrix (Adj):

![Adjacency Matrix](image.png)

(b) Write down the adjacency list representation of the graph below using Python's Dictionary structure, where each node v has its list of adjacent nodes Adj[v]. Here, Adj[v] should be a Python List that contains the nodes connected to v, sorted in alphabetical order. [5 points]

![Graph](image-4.png)

(c) Execute both Breadth-First Search (BFS) and Depth-First Search (DFS) on the graph illustrated in part (b). Start your search at node A. Visit the adjacent nodes of each vertex in alphabetical order. Draw the tree that each algorithm would construct, and list the nodes in the order in which they were first discovered. [5 points]

(d) It is possible to remove a single edge from the graph in part (b) so that the graph becomes a Directed Acyclic Graph (DAG). Identify every edge that has this property. Additionally, for each of these edges, specify the resulting topological ordering(s) of the modified graph. [5 points]

### **Solution - Question 3**

(a) The adjacency matrix Adj represents an undirected graph with vertices labeled from 0 to 5. In this matrix, Adj[u][v] = 1 signifies that there's an edge between vertex u and vertex v.

Let's interpret the given adjacency matrix:

- Vertex 0 is connected to Vertex 2.
- Vertex 1 is connected to Vertices 3, 4, and 5.
- Vertex 2 is connected to Vertices 0, 3, and 4.
- Vertex 3 is connected to Vertices 1 and 2.
- Vertex 4 is connected to Vertices 1, 2, and 5.
- Vertex 5 is connected to Vertices 1 and 4.

![Graph](image-5.png) 

(b) The adjacency list representation in Python Dictionary form is as follows:

Adj = {  
    'A': ['B'],  
    'B': ['C', 'D'],  
    'C': ['E', 'F'],  
    'D': ['E', 'F'],  
    'E': [],  
    'F': ['D']    
}

(c) 
In a Breadth-First Search (BFS), nodes at the same depth are visited before moving to the next depth level. Depth-First Search (DFS) explores as deeply as possible before backtracking.

BFS: 
- Starting from node 'A', you visit its only neighbor 'B'. Then, from 'B', you go on to visit its neighbors 'C' and 'D'. Once at 'C', you visit 'E' and 'F'. 
- By the time you reach 'D', you find that its neighbors 'E' and 'F' are already visited, so you don't add anything new to the BFS tree.

DFS: 
- You start at 'A', go to 'B', then to 'C', and keep going until you hit a leaf node or revisit a node. Starting from 'A', you first explore 'B', then move to 'C'. 
- 'C' leads you to 'E', a leaf node. You backtrack to 'C' and then go to 'F'. 
- From 'F', you can move to 'D', which hasn't been visited yet.

BFS [A, B, C, D, E, F]

![BFS Tree](image-6.png)

DFS [A, B, C, E, F, D]

![DFS Tree](image-7.png)
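
The two discovery orders can be reproduced with a short Python sketch. It assumes the adjacency dictionary from part (b); the function names `bfs_order` and `dfs_order` are chosen for this example only.

```python
from collections import deque

# Adjacency lists from part (b), neighbors in alphabetical order.
ADJ = {
    "A": ["B"],
    "B": ["C", "D"],
    "C": ["E", "F"],
    "D": ["E", "F"],
    "E": [],
    "F": ["D"],
}


def bfs_order(adj, start):
    """Return nodes in the order BFS first discovers them."""
    order, seen, queue = [start], {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                order.append(v)
                queue.append(v)
    return order


def dfs_order(adj, start):
    """Return nodes in the order recursive DFS first discovers them."""
    order, seen = [], set()

    def visit(u):
        seen.add(u)
        order.append(u)
        for v in adj[u]:
            if v not in seen:
                visit(v)

    visit(start)
    return order


print(bfs_order(ADJ, "A"))  # ['A', 'B', 'C', 'D', 'E', 'F']
print(dfs_order(ADJ, "A"))  # ['A', 'B', 'C', 'E', 'F', 'D']
```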

(d) In this graph, there's a cycle involving vertices 'D' and 'F'. A Directed Acyclic Graph (DAG) doesn't have any cycles. Therefore, to convert the graph into a DAG, you can remove either edge (D, F) or (F, D). This breaks the cycle, making the graph acyclic.

- Removing edge (D, F) leaves you with a unique topological ordering: (A, B, C, F, D, E).
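- Removing edge (F, D) gives you four possible topological orderings, because C and D can appear in either order after B, and E and F can appear in either order at the end: (A, B, C, D, E, F), (A, B, C, D, F, E), (A, B, D, C, E, F), and (A, B, D, C, F, E).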

In a DAG, a topological ordering is a linear ordering of its vertices such that for every directed edge (u, v), vertex u comes before v in the ordering. Once the graph becomes a DAG, you can safely perform topological sorting.
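
One way to check the orderings is to enumerate them by brute force. The sketch below assumes the edge set implied by the adjacency list in part (b) and simply tests all 720 permutations of the six nodes, which is only practical because the graph is tiny; the names `topological_orders` and `without` are illustrative.

```python
from itertools import permutations

NODES = "ABCDEF"
# Directed edges implied by the adjacency list in part (b).
EDGES = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "E"),
         ("C", "F"), ("D", "E"), ("D", "F"), ("F", "D")]


def topological_orders(nodes, edges):
    """Return every permutation in which u precedes v for each edge (u, v)."""
    orders = []
    for perm in permutations(nodes):
        pos = {v: i for i, v in enumerate(perm)}
        if all(pos[u] < pos[v] for u, v in edges):
            orders.append(perm)
    return orders


def without(edge):
    """Copy of EDGES with one edge removed."""
    return [e for e in EDGES if e != edge]


orders_without_df = topological_orders(NODES, without(("D", "F")))
orders_without_fd = topological_orders(NODES, without(("F", "D")))
```

Printing the two result lists shows every valid ordering for each case, and calling `topological_orders(NODES, EDGES)` on the unmodified graph returns an empty list because of the cycle.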

### **Justification and Proof of Correctness - Question 3**

(a) Adj = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 0]
]

The procedure was to derive each vertex's adjacency list from the adjacency matrix. For each vertex u, we traversed its row in the matrix and, whenever Adj[u][v]=1, added v to the adjacency list of u.

To verify, check that for each Adj[u][v]=1 in the matrix, v appears in the adjacency list of u. In the given matrix, every 1 corresponds to a vertex listed in the adjacency list of its row's vertex, and the matrix is symmetric, as required for an undirected graph. The adjacency lists are therefore a correct translation of the adjacency matrix, and the undirected graph is correctly represented.
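
The row-by-row translation described above can be sketched in Python. The matrix is the one given in part (a); the helper names `neighbors` and `is_symmetric` are assumptions for this example.

```python
# Adjacency matrix from part (a).
ADJ = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
]


def neighbors(adj):
    """Map each vertex u to the sorted list of v with adj[u][v] == 1."""
    return {u: [v for v, bit in enumerate(row) if bit] for u, row in enumerate(adj)}


def is_symmetric(adj):
    """An undirected graph requires adj[u][v] == adj[v][u] for all u, v."""
    n = len(adj)
    return all(adj[u][v] == adj[v][u] for u in range(n) for v in range(n))


adjacency_lists = neighbors(ADJ)
```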

(b)
- Node A: It only has an edge to B. This is correctly represented in the adjacency list as 'A': ['B'].   
- Node B: It has edges to C and D. In the adjacency list, this is represented as 'B': ['C', 'D'].    
- Node C: It has edges to E and F. This corresponds to 'C': ['E', 'F'] in the adjacency list.    
- Node D: It has edges to E and F, which are represented as 'D': ['E', 'F'] in the adjacency list.   
- Node E: It has no outgoing edges, appropriately represented by an empty list 'E': [].  
- Node F: It has an edge to D, accurately captured as 'F': ['D'] in the adjacency list.

*Note: From the above representation we can see that every directed edge in the graph from a node u to a node v is represented exactly once, in the adjacency list for u. Additionally, nodes without outgoing edges are represented by an empty list, indicating that no other vertices are reachable from them. The representation therefore accurately captures all the directed edges in the graph.*

(c)

Claim for BFS and DFS:

When we follow the BFS or DFS approach, starting from node 'A,' we'll traverse all nodes reachable from 'A' exactly once, capturing the most efficient (shortest) path in BFS and exploring all connected branches in DFS.

Justification:

- Start Point - Initialization:
    - We initially start from node 'A' for both techniques because it's our starting point. At this moment, our "visited list" contains only 'A' in both BFS and DFS.

- Next Steps - Exploration and Backtracking:

    - BFS: From 'A,' we look at immediate neighbors first, which is 'B' in this case. Once we "move" to 'B,' we look at its neighbors, 'C' and 'D,' and so on. We never visit a node twice, which prevents redundancy.

    - DFS: Again starting from 'A,' we first consider its neighbor 'B' but then continue as far down this branch as possible (to 'C,' then 'E' and 'F') before coming back. In this process, we don't visit a node twice; when we reach a node, we either explore further from it or backtrack.

- Stop Point - Ending The Thought Process:

    - For both BFS and DFS, we can conclude our exploration when there are no more new nodes to consider. By this point, we've visited all nodes reachable from 'A' exactly once.

By following these steps, we do exactly what the formal algorithms do. This ensures that each node reachable from 'A' is visited exactly once in both BFS and DFS, and that BFS reaches each node along a shortest path.

(d)

Claim: Removing either edge (D, F) or (F, D) will result in a Directed Acyclic Graph (DAG) and the given topological orderings are correct.

Proof:

- Acyclicity: The original graph contains only one cycle between 'D' and 'F'. Removing either edge (D, F) or (F, D) breaks this cycle, resulting in an acyclic graph, thereby making it a DAG.

- Topological Ordering:
    - When edge (D, F) is removed, the only possible topological ordering is (A, B, C, F, D, E).
    - When edge (F, D) is removed, four topological orderings are possible: (A, B, C, D, E, F), (A, B, C, D, F, E), (A, B, D, C, E, F), and (A, B, D, C, F, E).

These orderings satisfy the topological ordering condition: if there is a directed edge (u, v), then u comes before v in the ordering.

### Question 4 (10 points)

For each group of functions, sort the functions in increasing order of growth in terms of computational complexity.

Set 1: Linear, Logarithmic, and Power Functions (check for large values of n, i.e., n > 500) [5 points]

\begin{align*}
f1(n) &= 3n^{0.7} \log n \\
f2(n) &= n (\log n)^{0.5} \\
f3(n) &= 5n\log n \\
f4(n) &= n^{2} \\
f5(n) &= (0.9)^n n^2
\end{align*}


Set 2: Polynomial and Exponential Functions [5 points]

\begin{align*}
f1(n) &= n^3 (\log n) \\
f2(n) &= 10n (\log^2 n) \\
f3(n) &= n^2 1.5^n \\
f4(n) &= n^2 2^{\log n} \\
f5(n) &= {3^n}
\end{align*}



### **Solution - Question 4**

(a)

\begin{align*}
\hspace{0pt} f5(n) &\hspace{0pt} < f1(n) < f2(n) < f3(n) < f4(n)\\
\\
\hspace{0pt} O((0.9)^n n^2) &\hspace{0pt} < O(n^{0.7} \log n) < O(n (\log n)^{0.5}) < O(5n\log n) < O(n^{2})
\end{align*}

(b)

\begin{align*}
\hspace{0pt} f2(n) &\hspace{0pt} < f4(n) < f1(n) < f3(n) < f5(n)\\
\\
\hspace{0pt} O(n \log^2 n) &\hspace{0pt} < O(n^{3}) < O(n^3 \log n) < O(n^2 1.5^n) < O(3^n)
\end{align*}


### **Justification and Proof of Correctness - Question 4**

(a)

*Justification*

- f5 is dominated by (0.9)^n, which is an exponential decay. For large n, it approaches zero, making it the slowest-growing function.
- f1 combines sub-linear growth n^0.7 with logarithmic growth log n. It grows slower than linear-logarithmic functions but faster than purely logarithmic ones.
- f2 grows linearly in n, multiplied by the slowly growing term (log n)^0.5. It grows faster than f1(n) because of the linear factor n.
- f3 is a linear-logarithmic function. It grows faster than both f1(n) and f2(n), but not as fast as quadratic or higher-degree polynomial functions.
- f4 grows quadratically, which is the fastest growth among these functions for large n.
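
The ordering can also be checked with limits of consecutive ratios as n grows without bound. This is a sketch using the Set 1 definitions; each ratio tending to zero means the numerator grows strictly slower than the denominator. The limits describe asymptotic behavior and do not by themselves say at which n each inequality starts to hold.

\begin{align*}
\frac{f5(n)}{f1(n)} &= \frac{(0.9)^n n^{1.3}}{3 \log n} \to 0 \\
\frac{f1(n)}{f2(n)} &= \frac{3 (\log n)^{0.5}}{n^{0.3}} \to 0 \\
\frac{f2(n)}{f3(n)} &= \frac{1}{5 (\log n)^{0.5}} \to 0 \\
\frac{f3(n)}{f4(n)} &= \frac{5 \log n}{n} \to 0
\end{align*}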

*Graph Plotting*

![Part(a) Computational Complexity Graph](image-8.png)

(b)

*Justification*
- f2(n) combines a linear term with a squared logarithmic term, making it the slowest-growing function in the set.
- f4(n) simplifies to a cubic term n^3 when the logarithm is taken base 2, which makes it faster than f2(n) but slower than the remaining functions.
- f1(n) enhances a cubic term with a logarithmic multiplier, placing it ahead of f4(n) but behind exponential functions.
- f3(n) combines a quadratic term with an exponential one, resulting in a faster growth than cubic but slower than the exponential with the larger base 3.
- f5(n) is a pure exponential function with a base of 3, making it the fastest-growing function in this set.
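
The simplification of f4 and the ordering can be sketched with the following limits as n grows without bound. The derivation assumes the logarithm in f4 is base 2; each ratio tending to zero means the numerator grows strictly slower than the denominator.

\begin{align*}
f4(n) &= n^2 \cdot 2^{\log_2 n} = n^2 \cdot n = n^3 \\
\frac{f2(n)}{f4(n)} &= \frac{10 \log^2 n}{n^2} \to 0 \\
\frac{f4(n)}{f1(n)} &= \frac{1}{\log n} \to 0 \\
\frac{f1(n)}{f3(n)} &= \frac{n \log n}{1.5^n} \to 0 \\
\frac{f3(n)}{f5(n)} &= \frac{n^2}{2^n} \to 0
\end{align*}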

*Graph Plotting*

![Part(b) Computational Complexity Graph ](image-9.png)
![Part(b) Computational Complexity Graph](image-10.png)

### Question 5 (25 points)

**Problem Statement**

Alex, a recruiter at NUTech Solutions, faces the annual challenge of matching interns with mentors. Intrigued by the Gale-Shapley algorithm, Alex wonders if a modified version could offer a more equitable matching system that accounts for various complexities. Your job is to adapt the Gale-Shapley algorithm to these real-world conditions by developing Python functions. ***Please refer to Ques5_Coding.py for the starter code template***

(a) Not all mentors can guide interns on all projects due to differences in skills. Different mentors specialize in different skills, and interns have their own skill preferences. [10 points]

Create a Python function *create_skill_based_preferences(mentors, interns)* to construct initial preference lists based on matching skills.

- Input: Two lists of dictionaries for mentors and interns. Each dictionary contains a 'name' and a list of 'skills'.
- Output: Two lists of preference lists for mentors and interns based on skill compatibility.

(b) Implement the Gale-Shapley algorithm to find the stable matchings between mentors and interns based on their preferences. [10 points]

Create a Python function *find_stable_matching(mentor_preferences, intern_preferences)*, which finds the stable pairing based on the derived preferences from the previous problem.
- Input:
    - mentor_preferences: A dictionary where the keys are mentor names and the values are lists of intern names, sorted by preference.
    - intern_preferences: A dictionary where the keys are intern names and the values are lists of mentor names, sorted by preference.
- Output: A dictionary containing stable matches where the key is the mentor and the value is the intern.
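
A minimal sketch of part (b) is shown below. It is not the starter-code solution: it assumes that mentors propose (the problem statement does not say which side proposes), that names in one dictionary appear as keys in the other, and that a participant missing from a partner's list is treated as unacceptable to that partner.

```python
def find_stable_matching(mentor_preferences, intern_preferences):
    """Mentor-proposing Gale-Shapley; returns {mentor: intern}."""
    # rank[i][m] is the position of mentor m on intern i's list (0 = best).
    rank = {i: {m: r for r, m in enumerate(prefs)}
            for i, prefs in intern_preferences.items()}
    next_idx = {m: 0 for m in mentor_preferences}
    engaged_to = {}  # intern -> mentor
    free = list(mentor_preferences)

    while free:
        m = free.pop()
        prefs = mentor_preferences[m]
        if next_idx[m] >= len(prefs):
            continue  # list exhausted; this mentor stays unmatched
        i = prefs[next_idx[m]]
        next_idx[m] += 1
        if m not in rank.get(i, {}):
            free.append(m)  # intern does not list this mentor
        elif i not in engaged_to:
            engaged_to[i] = m
        elif rank[i][m] < rank[i][engaged_to[i]]:
            free.append(engaged_to[i])  # current partner is displaced
            engaged_to[i] = m
        else:
            free.append(m)  # intern keeps the current partner
    return {m: i for i, m in engaged_to.items()}


# Hypothetical example data.
mentor_preferences = {"M1": ["I1", "I2"], "M2": ["I1", "I2"]}
intern_preferences = {"I1": ["M2", "M1"], "I2": ["M1", "M2"]}
matches = find_stable_matching(mentor_preferences, intern_preferences)
```

With complete lists of length n on both sides, each mentor proposes at most n times, so the loop runs O(n^2) iterations.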

(c) Sometimes, after matching, we get feedback from either the interns or the mentors that they're not happy with the pairing for specific reasons not covered in the initial skill-based matching. These could be due to location, timing, or project alignment. [5 points]

Your task is to write a function *remove_unstable_pairs(stable_pairs, unhappy_mentors, unhappy_interns)* that takes the stable pairs, a list of "unhappy" mentors, and a list of "unhappy" interns, then returns a new set of stable pairs after removing the specified unstable pairs.

- Input
    - stable_pairs: A dictionary representing stable pairs of mentors and interns.
        - Key: Mentor name as a string
        - Value: Intern name as a string
    - unhappy_mentors: A list of mentor names (strings) that are unhappy with their pairing.
    - unhappy_interns: A list of intern names (strings) that are unhappy with their pairing.

- Output
    - A dictionary representing new stable pairs after removing the unstable pairs.

### Question 6 (25 points)

**Problem Statement**

You are working with the traffic department of a smart city project. The city is represented as a graph where intersections are nodes and roads are edges. Your task is to develop algorithms that help drivers and the traffic department in various ways:

(a) Emergency Vehicle Routing (BFS) [5 points]

Emergency services like ambulances and fire brigades need to reach their destinations in the shortest time possible. Write a function shortest_path(graph: dict, start: str, end: str) that takes the city graph and a starting and ending intersection, then returns the shortest path from the start to the end.

- Input

    - graph: A dictionary where keys are intersection names and values are lists of neighboring intersections.
    - start: The starting intersection as a string.
    - end: The ending intersection as a string.
    
- Output

    - A list containing the names of the intersections in the order they should be followed to get from start to end in the shortest time.
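
As a rough sketch of part (a), BFS with parent pointers finds a path with the fewest roads in an unweighted graph. Two assumptions here are not stated in the problem: "shortest" is measured in number of roads, and an empty list is returned when `end` cannot be reached.

```python
from collections import deque


def shortest_path(graph, start, end):
    """Return a fewest-edges path from start to end, or [] if none exists."""
    if start == end:
        return [start]
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt in parent:
                continue
            parent[nxt] = node
            if nxt == end:
                # Walk the parent pointers back to the start.
                path = []
                while nxt is not None:
                    path.append(nxt)
                    nxt = parent[nxt]
                return path[::-1]
            queue.append(nxt)
    return []


# Hypothetical city graph for illustration.
city = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
route = shortest_path(city, "A", "E")
```

Each intersection is enqueued at most once and each road is examined at most once per direction, so the search runs in O(V + E) time.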


(b) Road Maintenance Scheduling (DFS) [5 points]

The city is planning road maintenance and wants to ensure every road is checked for potholes. However, they want to do this in the most efficient way, where each intersection/node is traversed exactly once if possible. Write a Python function maintenance_path(graph: dict, start: str) -> list to find such a path.

- Input

    - graph: A dictionary where keys are intersection names and values are lists of neighboring intersections.
    - start: The starting intersection as a string.

- Output

    - A list of intersection names in the order they should be traversed for efficient road inspection. If multiple paths exist, return any.


(c) Safest Route (BFS with Weighted Edges) [7 points]

Now each road has a safety rating associated with it. The city council wants to know the safest route to travel from one point to another. For the purposes of this problem, the "safest" route is defined as the route with the highest sum of safety ratings along its path.

Write a Python function safest_path(graph: dict, start: str, end: str) -> list that uses BFS to find the safest route.

- Input:
    - graph: A dictionary where keys are intersection names and values are dictionaries with neighboring intersections and their safety rating.
    - start: The starting intersection as a string.
    - end: The destination intersection as a string.

- Output:
    - A list of intersection names, representing the safest path from start to end according to the sum of safety ratings. If multiple paths exist with the same highest sum, returning any is acceptable.

- Constraints:
    - Safety ratings are positive integers.
    - If there's a tie for the safest route according to the sum of safety ratings, any route is acceptable.

(d) Minimum Stops to Refuel (BFS) [8 points]

Given that every car has a sufficient charge to travel directly from one intersection to any neighboring intersection, you need to find the path from a starting point to an end point that minimizes the number of charging stops. This is critical as fewer stops mean reduced charging queue times and a smoother traffic flow, thereby promoting the use of electric cars and contributing to the city's sustainability goals.

Write a Python function min_charging_stops(graph, start, end, stations) to find the path that minimizes the number of charging stops.

- Input:
    - graph: A dictionary where keys are intersection names and values are lists of neighboring intersections.
    - start: The starting intersection as a string.
    - end: The destination intersection as a string.
    - stations: A list of intersections where charging stations are present.

- Output:
    - A list of intersection names from start to end that minimizes the number of charging stops.


### Question 7 (10 points)

Analyze the following code snippets and determine the overall time complexity of each.

(a)

This search_emails function filters a list of emails (emails) based on a list of target keywords (targets). It iterates through the 'subject' and 'body' of each email, appending matching emails to a found_emails list before returning it.

```python
def search_emails(emails, targets):
    found_emails = []
    for email in emails:           
        for target in targets:     
            if target in email['subject'] or target in email['body']:
                found_emails.append(email)
                break
    return found_emails
```
(b)

This find_mutual_friends function identifies mutual friends and shared interests between two users (user1_data and user2_data). It iterates through the 'friends' and 'interests' fields of both users' data, appending matching pairs to a mutuals list before returning it.

```python
def find_mutual_friends(user1_data, user2_data):
    mutuals = []
    for friend in user1_data['friends']:        # Loop 1: n iterations
        if friend in user2_data['friends']:     # List search: m iterations
            for interest in user1_data['interests']:  # Loop 2: p iterations
                if interest in user2_data['interests']: # List search: q iterations
                    mutuals.append((friend, interest))
    return mutuals
```
(c)

The generate_combinations function creates a list of combinations for available pizza toppings, considering a list of 'exclusions'. It iterates through the 'toppings' list to generate pairs, excluding any pairs present in the 'exclusions' list, and then appends the remaining combinations to a combinations list. Your analysis should consider the nested loops and the list search for exclusions. 

```python
def generate_combinations(toppings, exclusions):
    combinations = []
    for i in range(len(toppings)):              # Loop 1: n iterations
        for j in range(i+1, len(toppings)):     # Loop 2: n - 1, n - 2, ... 1 iterations
            if (toppings[i], toppings[j]) not in exclusions:  # List search: m iterations
                combinations.append((toppings[i], toppings[j]))
    return combinations
```

(d)

The function performs a binary search on an array (arr) to find a target value (target) but also has the option to calculate all permutations of the array when a specific depth parameter is set to 0. *Your analysis should consider the logarithmic complexity of binary search and the factorial complexity of calculating permutations.*

```python
def advanced_search(arr, target, depth):
    if depth == 0:
        return calculate_permutations(arr)  
    low = 0
    high = len(arr) - 1
    while low <= high:     
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1
```
(e) 

The exponential_logarithmic_function combines two operations: generating all subsets of a set s and performing a binary search on an array arr for each subset's length. 

```python
# Function to perform binary search
def binary_search(arr, target):
    low = 0
    high = len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1

# Function to generate all subsets of a set
def generate_subsets(s):
    if len(s) == 0:
        return [[]]
    subsets = []
    first = s[0]
    remaining = s[1:]
    for subset in generate_subsets(remaining):
        subsets.append(subset)
        subsets.append([first] + subset)
    return subsets

# Main function that combines both exponential and logarithmic complexities
def exponential_logarithmic_function(arr, s):
    for subset in generate_subsets(s):
        binary_search(arr, len(subset))
```

### **Solution - Question 7**

\begin{aligned}
&(a)  \quad O(n \times m \times p)\\
&(b)  \quad O(n \times m \times p \times q)\\
&(c) \quad O(n^2 \times m)\\
&(d) \quad O(n!)\\
&(e) \quad O(2^n \times \log m)
\end{aligned}



### **Justification and Proof of Correctness - Question 7**

(a)
- There are two nested loops:
    - The first loop iterates through each email in the list emails. Let's assume there are n emails.
    - The second loop iterates through each target keyword in the list targets. Let's assume there are m target keywords.

- Inside the inner loop, the function performs substring matching using the in operator on both the subject and the body. For a short keyword, a substring search with in takes time roughly linear in the length of the string being searched, so we treat each check as O(p), where p is the combined length of the email's subject and body.

Given these, the worst-case time complexity for the function can be calculated as follows:

- The nested loops contribute a time complexity of O(n * m).
- For each iteration of the inner loop, there's a string matching operation with time complexity O(p).
- The append() operation is generally O(1).

Putting these together, we get a worst-case time complexity of O(n * m * p). The worst case occurs when no keyword matches, so the break is never reached.

(b)

- The outer loop iterates through each friend of user1_data. This contributes a time complexity of O(n).
- Inside the outer loop, there's a conditional statement that checks if friend is in user2_data['friends']. Checking for membership in a list has a time complexity of O(m) in the worst case.
- Inside the outer loop and the conditional statement, there's a nested loop that iterates through each interest of user1_data. This contributes a time complexity of O(p).
- Inside the nested loop, another conditional statement checks if interest is in user2_data['interests']. Checking for membership in a list has a time complexity of O(q) in the worst case.

- Finally, the append() operation has a constant time complexity of O(1).

Putting all these together, each iteration of the outer loop (n iterations in total) performs:

- one list search over user2_data['friends'], costing O(m), and
- if the friend is found, the nested loop of p iterations, each with an O(q) list search, costing O(p×q).

These two costs add within a single outer iteration rather than multiply, so a tighter worst-case bound is O(n×(m + p×q)).

The overall time complexity is therefore bounded above by O(n×m×p×q).

(c)
- Outer Loop: The outer loop iterates n times, where n is the number of toppings. So, its time complexity is O(n).

- Inner Loop: For each i, the inner loop runs j from i+1 to n−1. Summed over all iterations of the outer loop, the inner body runs (n−1)+(n−2)+…+1 times, which forms an arithmetic series. The sum of this series is (n−1)×n/2, which simplifies to O(n^2).

- Exclusion Check: Inside the inner loop, there is a check to see if a particular combination is in the exclusions list. This operation is O(m), where m is the number of exclusions.

- Append Operation: This operation is O(1), a constant time operation.

Combining all these elements, we get the time complexity as O(n^2×m).
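
As a quick sanity check of the arithmetic-series count above, the following sketch counts how many times the inner loop body runs for n toppings and compares it with (n−1)×n/2. The helper name `count_pair_iterations` is introduced here for illustration only and is not part of the original question.

```python
def count_pair_iterations(n):
    """Count inner-loop iterations of the nested loops used in generate_combinations."""
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            count += 1
    return count


for n in (1, 2, 5, 10):
    # The closed form (n - 1) * n / 2 matches the loop count for every n tried.
    assert count_pair_iterations(n) == n * (n - 1) // 2
```

Each of these iterations then pays for one O(m) search of the exclusions list, which is where the extra factor of m comes from.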

(d)

1. When depth is 0:

    In this scenario, the function invokes calculate_permutations(arr). Calculating all permutations of an array of length n has a time complexity of O(n!), as you are generating every possible arrangement of n items.

2. When depth is not 0:

    In this case, a binary search is performed on the array arr. The time complexity of a binary search operation is O(log n), where n is the size of the array.


Since the function performs either the permutation calculation or the binary search depending on the value of depth, the worst-case time complexity would be the maximum of these two, which is O(n!) for calculating the permutations when depth is 0.

So, the worst-case time complexity of the advanced_search function is O(n!).
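
The question does not show the body of calculate_permutations, so the sketch below uses a hypothetical stand-in built on `itertools.permutations` and assumes the function returns every ordering as a list. It only illustrates that the size of the output already grows as n!.

```python
import itertools
import math


def calculate_permutations(arr):
    """Hypothetical stand-in: return every ordering of arr as a list of tuples."""
    return list(itertools.permutations(arr))


perms = calculate_permutations([1, 2, 3, 4])
# 4 elements yield 4! = 24 orderings, so the output alone has factorial size.
assert len(perms) == math.factorial(4)
```

Under this assumption, the depth == 0 branch dominates the O(log n) binary search for any array with more than a few elements.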


(e)

1. Generating Subsets (generate_subsets function):

    The generate_subsets function generates all possible subsets of the set s. For a set of size n, there are 2^n subsets. Thus, the time complexity for generating all subsets of s is O(2^n).

2. Binary Search (binary_search function):

    The binary_search function has a time complexity of O(log m), where m is the size of the array arr.

3. Combined Complexity (exponential_logarithmic_function function):

    In the exponential_logarithmic_function, the function iterates over each of the 2^n subsets generated by generate_subsets(s) and performs one binary search per subset using binary_search(arr, len(subset)), so the two costs multiply.

Hence, the overall time complexity of the exponential_logarithmic_function will be O(2^n×log m).
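
To make the 2^n count concrete, the sketch below reuses generate_subsets exactly as given in the question and checks the number of subsets it returns for a few small inputs. The test sizes are arbitrary choices for illustration.

```python
def generate_subsets(s):
    # Same recursive construction as in the question.
    if len(s) == 0:
        return [[]]
    subsets = []
    first = s[0]
    remaining = s[1:]
    for subset in generate_subsets(remaining):
        subsets.append(subset)
        subsets.append([first] + subset)
    return subsets


for n in range(6):
    # Each extra element doubles the number of subsets: 1, 2, 4, 8, ...
    assert len(generate_subsets(list(range(n)))) == 2 ** n
```

Since the main function runs one binary search per returned subset, the number of binary searches is also 2^n.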


### Question 8 (10 points)

**Problem Statement:**

Emergency Evacuation Plan in a Building Using Graph Traversal Algorithms

Consider a building with multiple floors and rooms, represented as a graph where each node is a room and each edge represents a doorway between two rooms. Some rooms have staircases that lead to other floors. Due to a fire emergency, the occupants of the building need to be evacuated as quickly and safely as possible.

Develop a pseudocode for a modified BFS or DFS algorithm to find the quickest path from any given room to the nearest emergency exit. The algorithm should account for various constraints like blocked doorways, room capacities, and other hazards. The goal is to evacuate everyone to safety while minimizing the total evacuation time.

- Describe the problem-solving approach you plan to use before diving into the pseudocode.
- Write a pseudocode for the modified BFS or DFS algorithm to solve this problem.

Input Format:

- Number of Rooms: An integer N indicating the number of rooms (nodes).
- Number of Doorways: An integer M indicating the number of doorways (edges).
- Edge Weights: A list of M tuples (a, b, w), indicating there is a doorway between room a and room b with a time-weight of w (the time needed to cross it).
- Emergency Exits: A list E of rooms that are designated as emergency exits.
- Impassable Rooms: A list I of rooms that are unsafe or impassable.
- Start Room: An integer S indicating the starting room.

Output Format:

- Shortest Evacuation Time: An integer representing the shortest time needed to reach an emergency exit from the starting room. If no path is found, return -1.

**Constraints:**

- The graph representing the building is connected but may contain cycles.
- Some nodes (rooms) may be marked as unsafe or impassable.
- Each edge (doorway) may have a weight representing how long it takes to pass through.
- There are limited emergency exits in the building, represented as special nodes in the graph.


### **Solution - Question 8**

**Approach:**

- Initialization: Create a data structure to hold the state of each room, which includes details like its distance from the starting room and whether it has been visited or not.

- Constraints: Incorporate checks to skip impassable rooms and to adjust edge weights according to the constraints like blocked doorways, room capacities, and other hazards.

- Emergency Exits: Identify the emergency exits within the building and incorporate them into the algorithm's stopping criteria.

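- Traversal Algorithm: Use a modified Breadth-First Search (BFS) algorithm to explore the graph. Plain BFS finds shortest paths only when every edge has the same weight, so the FIFO queue is replaced with a priority queue ordered by distance, which makes the traversal equivalent to Dijkstra's algorithm. This is chosen over Depth-First Search (DFS) because DFS gives no guarantee that the first path it finds to an exit is the shortest one.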


**Pseudocode:**

```plaintext
Initialize an empty priority queue Q
Initialize an empty set visited
Initialize a dictionary distance with all rooms set to infinity, except the starting room S set to 0

Push (0, S) into Q (distance, room)
while Q is not empty:
    current_distance, current_room = Pop the minimum element from Q

    # Check safety first so an impassable room is never reported as an exit
    if current_room is in Impassable Rooms:
        continue

    # Skip stale queue entries for rooms that were already processed
    if current_room is visited:
        continue

    if current_room is in Emergency Exits:
        return current_distance

    Mark current_room as visited
    for each neighbor, edge_weight in neighbors of current_room:
        if neighbor is not visited:
            new_distance = current_distance + edge_weight

            if new_distance < distance[neighbor]:
                distance[neighbor] = new_distance
                Push (new_distance, neighbor) into Q

return -1  # If no path to any emergency exit is found
```
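
The pseudocode above can be sketched in Python roughly as follows. This is one possible rendering, not a definitive implementation: the function name `shortest_evacuation_time` and its parameter names are made up for this example, and it assumes doorways are bidirectional and all weights are non-negative, neither of which the problem states explicitly. It uses the standard-library `heapq` module as the priority queue.

```python
import heapq
from collections import defaultdict


def shortest_evacuation_time(edges, exits, impassable, start):
    """Return the minimum time from start to any exit, or -1 if none is reachable.

    Assumption: doorways can be crossed in both directions and weights are >= 0.
    """
    exits, impassable = set(exits), set(impassable)
    graph = defaultdict(list)
    for a, b, w in edges:
        graph[a].append((b, w))
        graph[b].append((a, w))

    distance = {start: 0}  # rooms missing from the dict are treated as infinity
    visited = set()
    queue = [(0, start)]

    while queue:
        current_distance, room = heapq.heappop(queue)
        # Skip unsafe rooms and stale entries for rooms already processed.
        if room in impassable or room in visited:
            continue
        if room in exits:
            return current_distance
        visited.add(room)
        for neighbor, weight in graph[room]:
            if neighbor in visited or neighbor in impassable:
                continue
            new_distance = current_distance + weight
            if new_distance < distance.get(neighbor, float("inf")):
                distance[neighbor] = new_distance
                heapq.heappush(queue, (new_distance, neighbor))
    return -1  # no safe path to any exit


# Small made-up building: room 3 is the only exit.
edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 5), (2, 3, 8)]
print(shortest_evacuation_time(edges, [3], [], 0))      # 0 -> 2 -> 1 -> 3, prints 8
print(shortest_evacuation_time(edges, [3], [1], 0))     # room 1 blocked, prints 9
print(shortest_evacuation_time(edges, [3], [1, 2], 0))  # no safe route, prints -1
```

With a binary heap, this runs in O((N + M) log N) time for N rooms and M doorways, since each doorway can push at most one entry per direction onto the queue.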

### **Justification and Proof of Correctness - Question 8**

Pseudocode Justification and correctness proof:

1. Initialization Step

    - The distance of every room is initialized to infinity, except for the start room, which is set to zero. This makes sure that any actual path found later will be shorter than infinity and will replace the initialized value, making it the minimal known value at that point in time.

    - At the start, distance[start_room] = 0 and distance[any_other_room] = ∞. This ensures that the algorithm will correctly identify if it is able to improve upon this initial assignment.

2. Priority Queue (Q)

    - Priority queues ensure that we always choose the room closest to the starting room at every step. This is crucial for optimality.

    - Whenever we pop an element from the priority queue, it's guaranteed to be the room with the smallest distance label. This property ensures that we are exploring nodes in an order that aligns with finding the shortest paths first, validating the use of a priority queue.

3. Emergency Exit Check

    - The goal is the shortest path to the nearest emergency exit, so as soon as an exit is popped from the priority queue, we can safely terminate the algorithm.

    - Assuming all edge weights are non-negative, the priority queue ensures that the first exit popped has the shortest path, and no other path to any other exit could be shorter. Therefore, stopping the algorithm when an emergency exit is reached is correct.

4. Impassable Rooms

    - The algorithm checks for impassable rooms and avoids them, which is necessary for the safety of the evacuation.

    - The algorithm only marks a room as visited if it is not in the impassable rooms list. This ensures that impassable rooms never affect the outcome of the evacuation plan.

5. Edge Relaxation and Update

    - When a room is visited, the algorithm updates the distances to its neighbors, ensuring that the shortest path is updated iteratively.

    - By updating only when a shorter path is found (new_distance < distance[neighbor]), the algorithm ensures optimality. This ensures that the shortest path to each room is found by the time that room is removed from the priority queue.