### Question 1 (10 points)

There are x research groups at Northeastern University, each focusing on a particular subject such as HCI, Data Analytics, Bioinformatics, and so on. Each research group has a fixed grant amount for supporting research and stipends, as well as a limited number of positions. Your job is to assign n graduate students to these research groups based on particular criteria.
Students and research groups both have preferences: 

- Students rank research groups by research focus, publication prospects, and mentorship quality.
- Research groups rank students by ability, previous research, academic record, and specialized skill sets.

**Objectives:**
1. Fill all open research positions within each group.
2. Do not exceed the funding amount for any research group.
3. Match according to the preferences and skill sets of the students and groups.

**Constraints:**

- The number x denotes the number of research groups.
- n denotes the total number of graduate students.
- Each research group has a finite number of open slots y, where y>0.
- Each student possesses a non-empty set of skills.
- Each research group may require one or more skills for a student to be considered for a position.

**Common Input and Output Formats:**

*-Input:* Lists of students' preferences, stipend needs, and skill sets, followed by lists of research groups' preferences, grant amounts, and required skills.

*-Output:* A stable list of student-research group matches that complies with the funding and skill set requirements.

Derive an algorithm to create a stable matching between the students and research groups. Write pseudocode and explain the time complexity of the algorithm used.


### **Solution - Question 1**

The problem is an extension of the Stable Marriage problem, and we can solve it using a modified Gale-Shapley algorithm. The modifications are a check against each group's remaining grant and a check that the student has the skills the group requires.

### Pseudocode

```plaintext
Initialize all students as "unmatched"
Initialize all research groups as "not full" and their remaining grant to the initial grant available
Create empty lists to hold matches for each research group

while there is an "unmatched" student who has not proposed to every research group:
    student = first unmatched student who has not proposed to every group
    group = first group on student's list to which the student has not yet proposed
    Record that student has proposed to group

    if student lacks a skill required by group:
        continue    // rejected; student stays "unmatched"

    if group is "not full":
        if group's remaining grant >= student's stipend demand:
            Add student to group's list
            Mark student as "matched"
            Decrease group's remaining grant by student's stipend demand
            if group's positions are filled:
                Mark group as "full"
    else:
        worst_matched_student = least-preferred student currently in group according to group's preference
        // the displaced student's stipend returns to the grant before the new stipend is deducted
        grant_after_removal = group's remaining grant + worst_matched_student's stipend demand
        if group prefers student over worst_matched_student and grant_after_removal >= student's stipend demand:
            Remove worst_matched_student from group
            Add student to group
            Mark worst_matched_student as "unmatched"
            Mark student as "matched"
            Set group's remaining grant to grant_after_removal - student's stipend demand
```
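The proposal loop can also be sketched in Python. This is a minimal sketch under stated assumptions, not a reference implementation: the function name `match_students` and the dictionary layout (`prefs`, `stipend`, `skills`, `grant`, `slots`, `required`) are hypothetical, and a displaced student's stipend is assumed to return to the group's grant before the newcomer's stipend is deducted. The sketch only mirrors the proposal rules; it does not by itself prove stability under the grant constraint.

```python
from collections import deque


def match_students(students, groups):
    """Student-proposing matching with skill, slot, and grant checks (sketch)."""
    # rank[g][s] = position of student s in group g's preference list
    rank = {g: {s: i for i, s in enumerate(info["prefs"])} for g, info in groups.items()}
    members = {g: [] for g in groups}
    grant_left = {g: info["grant"] for g, info in groups.items()}
    next_choice = {s: 0 for s in students}  # index of the next group to propose to
    free = deque(students)

    while free:
        s = free.popleft()
        prefs = students[s]["prefs"]
        need = students[s]["stipend"]
        while next_choice[s] < len(prefs):
            g = prefs[next_choice[s]]
            next_choice[s] += 1
            # Skill check: the student must cover every skill the group requires
            if s not in rank[g] or not groups[g]["required"] <= students[s]["skills"]:
                continue
            if len(members[g]) < groups[g]["slots"]:
                if grant_left[g] >= need:
                    members[g].append(s)
                    grant_left[g] -= need
                    break
                continue
            # Group is full: compare against its least-preferred current member
            worst = max(members[g], key=lambda m: rank[g][m])
            freed = grant_left[g] + students[worst]["stipend"]
            if rank[g][s] < rank[g][worst] and freed >= need:
                members[g].remove(worst)
                members[g].append(s)
                grant_left[g] = freed - need
                free.append(worst)  # displaced student resumes proposing later
                break
    return members


students = {
    "s1": {"prefs": ["HCI", "Bio"], "stipend": 10, "skills": {"python", "stats"}},
    "s2": {"prefs": ["HCI", "Bio"], "stipend": 10, "skills": {"python"}},
    "s3": {"prefs": ["HCI"], "stipend": 10, "skills": {"python"}},
}
groups = {
    "HCI": {"prefs": ["s3", "s1", "s2"], "grant": 10, "slots": 1, "required": {"python"}},
    "Bio": {"prefs": ["s1", "s2"], "grant": 20, "slots": 2, "required": {"python"}},
}
print(match_students(students, groups))
```

Each student's `next_choice` index only moves forward, so no student proposes to the same group twice, which is what bounds the number of proposals by n · x.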

### Analysis of Time Complexity

1. Each student proposes to each of the x research groups at most once, so there are at most *O(n · x)* proposals.
2. For each proposal, we check whether the student has the skills the group requires. If the number of required skills per group is bounded by a small constant (say, at most 5), this check takes *O(1)* time.
3. For each proposal to a full group, we find the group's least-preferred current member. A linear scan takes *O(y)* time, where y is the number of slots in the group.

As a result, the overall time complexity is *O(n · x · 1 · y)* = *O(n · x · y)*.

Take note that x represents the number of research groups, n represents the number of students, and y represents the maximum number of slots in any group.

### **Justification and Proof of Correctness - Question 1**

Proving the correctness of an algorithm generally involves demonstrating that it satisfies the properties that define a "correct" answer. We want to prove two key properties of the modified Gale-Shapley algorithm:

- **Stability**: No student and research group should both prefer to abandon their assigned matches in order to pair with each other.
- **Feasibility**: All constraints, such as skills, stipend expectations, and grant amounts, must be met.

#### Proof of Stability

For Students:

- A student proposes to research groups in order of preference.
- A student moves to a less preferred group only after every group ranked higher has rejected or displaced them.
- As a result, once the algorithm terminates, a matched student cannot gain by leaving: every group they prefer has already turned them down, either in favor of students it ranks higher or because of its skill or grant constraints.

For Research Groups:

- A group replaces a current member only if it prefers the new student. Once a group is full, every replacement strictly improves its set of members according to its own ranking.
- As a result, a research group has no incentive to break the existing matching for a different, less desirable student.

#### Proof of Feasibility

- Before making any matches, the skill sets are verified. As a result, no student is in a group who does not possess the necessary abilities. Before matching, the stipend is reviewed and taken from the group's funding, guaranteeing that the group can afford all of its matches.

#### Inductive Proof of Correctness

The induction is over the number of loop iterations k (not over the number of students n).

- Base Case (k=0): No one is matched when the process starts, which trivially satisfies every constraint. Thus the claim holds for k=0.
- Inductive Hypothesis (k): Assume that after k iterations the current partial matching is feasible and that no group would trade a current member for a student it has already rejected.

- Inductive Step (k+1): Consider iteration k+1.

    - If a student is matched, they are assigned to the most preferred group that has not yet rejected them, for which they have the necessary skills and whose remaining grant can support them.
    - If a full group accepts the proposer, it does so only by replacing a less preferred member with a more preferred one, so the group's position never worsens.
    - The displaced student becomes unmatched and continues proposing down their list, so no earlier rejection is ever reversed. As a result, the claim holds after iteration k+1.

Therefore the modified Gale-Shapley algorithm used here is stable, feasible, and complete. As a result, it is a valid solution to the problem.

### Question 2 (10 points)

To answer the problem statement, you are given a function called *OptimalVendorPairing*.

*OptimalVendorPairing()*

- *Input:* A collection of *n* food trucks and *n* well-known locations. Each food truck has a ranking list of all *n* locations, and each location has a ranking list of all *n* food trucks.
- *Result:* A stable matching of food trucks and locations.

You are in charge of organizing *n* art events as well as *n* food trucks. Each art event necessitates the use of one food truck. Both the events and the food trucks, however, have their own set of criteria: Some food trucks may be "incompatible" with an art event owing to space limits, noise level, or cuisine type. Similarly, some art events may be "incompatible" with a food truck owing to a lack of audience engagement, parking constraints, or event duration.

In this context, an allocation is deemed "stable" if and only if the following conditions are met:

- No food truck is assigned to an art event deemed incompatible.
- No art event is assigned a food truck with which it is incompatible.
- There is no compatible food truck and art event, not assigned to each other, that would both prefer each other over their current allocation.

**Typical Input and Output Format:**

*-Input:*     
- artEvents: A collection of *n* art events.
- foodTrucks: An inventory of *n* food trucks.
- eventPrefs: Each art event's preference list of food trucks, which may contain both acceptable and unacceptable choices.
- truckPrefs: Each food truck's preference list of art events, which may contain both acceptable and unacceptable choices.
- eventUnfit: A collection of *n* lists, each having boolean values indicating if the related food trucks are undesirable to an art event.
- truckUnfit: A collection of *n* lists, each with boolean values indicating if a food truck considers the related art events undesirable.

*-Output:*
- matchings: A list of *n* tuples, each of which contains an art event and the food truck with which it is paired. If an art event is not paired, its food truck is shown as None.

Note: Because of the notion of "incompatible" selections, it is not required to allocate every food truck in order to have a stable system. The preference lists may include both acceptable and unacceptable choices.

(a) Given *n* art events and *n* food trucks, each with its own set of preferences and incompatible options, create an algorithm and explain how you may use it to identify a stable pairing using *OptimalVendorPairing*.

(b) Calculate the execution time of your suggested method. Assume that OptimalVendorPairing has a runtime of O(n^2) for an input containing *n* art events and *n* food trucks.

### **Solution - Question 2**

**(a) Pseudo-code**

Pseudo-code for the algorithm that handles stable pairings of art events and food trucks:

```plaintext
Algorithm StableEventTruckPairing(artEvents, foodTrucks, eventPrefs, truckPrefs, eventUnfit, truckUnfit)
    
    // Step 1: Preprocessing to Remove Incompatible Choices
    Declare cleanedEventPrefs as empty list
    Declare cleanedTruckPrefs as empty list

    For i from 0 to length(artEvents) - 1:
        Declare tempList as empty list
        For j from 0 to length(foodTrucks) - 1:
            If eventUnfit[i][j] is False:
                Append eventPrefs[i][j] to tempList
        End For
        Append tempList to cleanedEventPrefs
    End For

    For i from 0 to length(foodTrucks) - 1:
        Declare tempList as empty list
        For j from 0 to length(artEvents) - 1:
            If truckUnfit[i][j] is False:
                Append truckPrefs[i][j] to tempList
        End For
        Append tempList to cleanedTruckPrefs
    End For

    // Step 2: Call to OptimalVendorPairing Function
    matchings = OptimalVendorPairing(artEvents, foodTrucks, cleanedEventPrefs, cleanedTruckPrefs)

    // Step 3: Postprocessing for Unallocated Events or Trucks
    Declare finalMatchings as empty list

    For each tuple in matchings:
        If tuple contains None:
            Append None to finalMatchings
        Else:
            Append tuple to finalMatchings
        End If
    End For

    Return finalMatchings

End Algorithm
```
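Step 1 of the pseudocode can be sketched in Python as follows. This is a minimal sketch assuming, as the pseudocode does, that `unfit[i][j]` refers to the j-th entry of `prefs[i]`; the helper name `clean_preferences` and the sample data are hypothetical.

```python
def clean_preferences(prefs, unfit):
    """Keep only the choices whose positional 'unfit' flag is False."""
    return [
        [choice for choice, is_unfit in zip(ranked, flags) if not is_unfit]
        for ranked, flags in zip(prefs, unfit)
    ]


# Hypothetical data: two art events ranking three food trucks
event_prefs = [["T2", "T1", "T3"], ["T1", "T3", "T2"]]
event_unfit = [[False, True, False], [False, False, True]]

print(clean_preferences(event_prefs, event_unfit))
```

The relative order of the surviving choices is unchanged, so each cleaned list is still a valid ranking of the compatible options.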

**(b) Estimating Runtime**

1. Preprocessing: Going through each art event and food truck and filtering its preference list of up to n entries takes O(n^2) time.

2. OptimalVendorPairing Call: The OptimalVendorPairing() function has an O(n^2) runtime, according to the problem statement.

3. Postprocessing Step: Going over each tuple in the matchings list requires O(n) time.

As a result, the overall runtime is O(n^2) for preprocessing + O(n^2) for OptimalVendorPairing + O(n) for postprocessing = O(n^2), because the quadratic terms dominate.


### **Justification and Proof of Correctness - Question 2**

**Pre-Conditions and Assumptions**
- OptimalVendorPairing is assumed to return a valid, stable matching for any set of preference lists it is given.
- The eventUnfit and truckUnfit lists identify incompatible options completely and accurately.

**Proof of Correctness**

*Step 1: Preprocessing*

The cleaned-up preference lists (cleanedEventPrefs and cleanedTruckPrefs) will only include compatible selections after preprocessing.

*Proof:*

- In eventPrefs and truckPrefs, we cycle over each list, deleting choices indicated as incompatible in eventUnfit and truckUnfit. As a result, all of the remaining options in cleanedEventPrefs and cleanedTruckPrefs are by definition compatible.

*Step 2: Application of OptimalVendorPairing*

OptimalVendorPairing will generate a stable matching when applied to the cleaned-up preference lists.

*Proof:*

- Because of the preprocessing, the input preference lists only include compatible options.
- OptimalVendorPairing is assumed to be valid and to produce stable pairings.
- As a result, under the definitions given in the problem description, the output is a stable matching.

*Step 3: Postprocessing*

Claim: Any unmatched (None) entries in the matching list are unavoidable and do not jeopardize the system's stability.

*Proof:*

- A "None" entry means that no compatible partner was available for that art event or food truck once preferences were taken into account. Such an entry does not violate stability, because every compatible alternative is already matched to a partner it prefers.

### Question 3 (20 points)

(a) Construct an undirected graph as described on the right, using an adjacency matrix named "Adj". This matrix should be implemented as a direct access array set. The vertices are labeled from 0 to 5. For each vertex u in the set {0,1,2,3,4,5}, Adj[u] will represent its adjacency list. The adjacency lists themselves should also be implemented as direct access array sets. An element Adj[u][v] should be set to 1 if there is an edge connecting vertices u and v. [5 points]

Adjacency Matrix (Adj):

![Adjacency Matrix](image.png)

(b) Write down the adjacency list representation of the graph below by using Python's Dictionary structure, where each node v has its list of adjacent nodes Adj[v]. Here, Adj[v] should be a Python List that contains the nodes connected to v, sorted in alphabetical order. [5 points]

![Graph](image-4.png)

(c) Execute both Breadth-First Search (BFS) and Depth-First Search (DFS) on the graph illustrated in part (b). Start your search at node A. Visit the adjacent nodes of each vertex in alphabetical order. Draw the tree that each algorithm would construct and list the nodes in the order in which they were first discovered. [5 points]

(d) It is possible to remove one edge from the graph in part (b) so that the graph becomes a Directed Acyclic Graph (DAG). Identify every edge that has this property. Additionally, for each of these edges, specify the resulting topological ordering(s) of the modified graph. [5 points]

### **Solution - Question 3**

(a) The adjacency matrix Adj represents an undirected graph with vertices labeled from 0 to 5. In this matrix, Adj[u][v] = 1 signifies that there's an edge between vertex u and vertex v.

Let's interpret the given adjacency matrix:

- Vertex 0 is connected to Vertex 2.
- Vertex 1 is connected to Vertices 3, 4, and 5.
- Vertex 2 is connected to Vertices 0, 3, and 4.
- Vertex 3 is connected to Vertices 1 and 2.
- Vertex 4 is connected to Vertices 1, 2, and 5.
- Vertex 5 is connected to Vertices 1 and 4.

![Graph](image-5.png) 

(b) The adjacency list representation in Python Dictionary form is as follows:

Adj = {  
    'A': ['B'],  
    'B': ['C', 'D'],  
    'C': ['E', 'F'],  
    'D': ['E', 'F'],  
    'E': [],  
    'F': ['D']    
}

(c) 
In a Breadth-First Search (BFS), nodes at the same depth are visited before moving to the next depth level. Depth-First Search (DFS) explores as deeply as possible before backtracking.

BFS: 
- Starting from node 'A', you visit its only neighbor 'B'. Then, from 'B', you go on to visit its neighbors 'C' and 'D'. Once at 'C', you visit 'E' and 'F'. 
- By the time you reach 'D', you find that its neighbors 'E' and 'F' are already visited, so you don't add anything new to the BFS tree.

DFS: 
- You start at 'A', explore its only neighbor 'B', then move to 'C', and keep going until you reach a node with no unvisited neighbors.
- 'C' leads you to 'E', which has no outgoing edges. You backtrack to 'C' and then go to 'F'.
- From 'F', you can move to 'D', which hasn't been visited yet. Both of D's neighbors ('E' and 'F') are already visited, so the search ends.

BFS [A, B, C, D, E, F]

![BFS Tree](image-6.png)

DFS [A, B, C, E, F, D]

![DFS Tree](image-7.png)
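Both traversals can be reproduced with a short script. This is a minimal sketch using the dictionary from part (b); the function names `bfs_order` and `dfs_order` are illustrative and not part of the assignment.

```python
from collections import deque

Adj = {
    'A': ['B'],
    'B': ['C', 'D'],
    'C': ['E', 'F'],
    'D': ['E', 'F'],
    'E': [],
    'F': ['D'],
}


def bfs_order(adj, start):
    """Return nodes in the order BFS first discovers them."""
    order, seen, queue = [start], {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:  # lists are already in alphabetical order
            if v not in seen:
                seen.add(v)
                order.append(v)
                queue.append(v)
    return order


def dfs_order(adj, start):
    """Return nodes in the order recursive DFS first discovers them."""
    order, seen = [], set()

    def visit(u):
        seen.add(u)
        order.append(u)
        for v in adj[u]:
            if v not in seen:
                visit(v)

    visit(start)
    return order


print(bfs_order(Adj, 'A'))  # ['A', 'B', 'C', 'D', 'E', 'F']
print(dfs_order(Adj, 'A'))  # ['A', 'B', 'C', 'E', 'F', 'D']
```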

(d) In this graph, there's a cycle involving vertices 'D' and 'F'. A Directed Acyclic Graph (DAG) doesn't have any cycles. Therefore, to convert the graph into a DAG, you can remove either edge (D, F) or (F, D). This breaks the cycle, making the graph acyclic.

- Removing edge (D, F) leaves you with a unique topological ordering: (A, B, C, F, D, E).
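- Removing edge (F, D) gives four possible topological orderings, because C and D may appear in either order after B, and E and F (which then have no outgoing edges) may appear in either order at the end: (A, B, C, D, E, F), (A, B, C, D, F, E), (A, B, D, C, E, F), and (A, B, D, C, F, E).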

In a DAG, a topological ordering is a linear ordering of its vertices such that for every directed edge (u, v), vertex u comes before v in the ordering. Once the graph becomes a DAG, you can safely perform topological sorting.
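The answer to part (d) can be checked by brute force. The sketch below assumes the edge set implied by the adjacency list in part (b); the names `EDGES`, `NODES`, and `topological_orders` are illustrative. It removes each edge in turn and enumerates every ordering of the six vertices that respects the remaining edges; a graph is a DAG exactly when at least one such ordering exists.

```python
from itertools import permutations

# Directed edges read off the adjacency list in part (b)
EDGES = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "E"),
         ("C", "F"), ("D", "E"), ("D", "F"), ("F", "D")]
NODES = "ABCDEF"


def topological_orders(edges):
    """Return every vertex ordering in which u precedes v for each edge (u, v)."""
    orders = []
    for perm in permutations(NODES):
        position = {v: i for i, v in enumerate(perm)}
        if all(position[u] < position[v] for u, v in edges):
            orders.append("".join(perm))
    return orders


for removed in EDGES:
    remaining = [e for e in EDGES if e != removed]
    orders = topological_orders(remaining)
    if orders:  # non-empty means the remaining graph is acyclic
        print("remove", removed, "->", orders)
```

With only six vertices there are 720 permutations, so exhaustive enumeration is cheap here; for larger graphs Kahn's algorithm or a DFS-based sort would be the usual choice.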

### **Justification and Proof of Correctness - Question 3**

(a) Adj = [  
        [0, 0, 1, 0, 0, 0],  
        [0, 0, 0, 1, 1, 1],  
        [1, 0, 0, 1, 1, 0],  
        [0, 1, 1, 0, 0, 0],  
        [0, 1, 1, 0, 0, 1],  
        [0, 1, 0, 0, 1, 0]  
    ]

The adjacency lists in the solution were derived from this adjacency matrix. We went over the matrix row by row: for each vertex u, if Adj[u][v]=1, we added v to the list of u's neighbors.

For every 1 in the adjacency matrix, the corresponding vertex v therefore appears in the adjacency list of the row's vertex u, so the lists are an accurate reading of the matrix. The matrix is also symmetric (Adj[u][v] = Adj[v][u] for every pair u, v), which is exactly what an undirected graph requires. As a result, the adjacency matrix appropriately represents the undirected graph.
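The row-by-row reading and the symmetry requirement can be checked mechanically. A minimal sketch, with illustrative helper names `neighbors` and `is_symmetric`:

```python
Adj = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
]


def neighbors(adj, u):
    """List every v with an edge u-v, read from row u of the matrix."""
    return [v for v, bit in enumerate(adj[u]) if bit == 1]


def is_symmetric(adj):
    """An undirected graph must satisfy adj[u][v] == adj[v][u] for all u, v."""
    n = len(adj)
    return all(adj[u][v] == adj[v][u] for u in range(n) for v in range(n))


for u in range(len(Adj)):
    print(u, neighbors(Adj, u))
print("symmetric:", is_symmetric(Adj))
```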

(b)
- Node A: It has only one edge to B. In the adjacency list, this is appropriately written as 'A': ['B'].   
- Node B: It has edges to Nodes C and D. This is expressed in the adjacency list as 'B': ['C', 'D'].    
- Node C: It has edges to Nodes E and F. In the adjacency list, this corresponds to 'C': ['E', 'F'].    
- Node D: It has edges to E and F, which are represented in the adjacency list as 'D': ['E', 'F'].   
- Node E: It has no outgoing edges, which is indicated by an empty list 'E': [].  
- Node F: It has an edge to D, which is appropriately represented in the adjacency list as 'F': ['D'].

*Note: As can be seen from the preceding representation, every directed edge in the graph from node u to node v is represented exactly once, in the adjacency list for u. A node with no outgoing edges, such as E, is represented by an empty list; it can still be reached from other vertices (here from C and D), but no vertex can be reached from it. As a result, all of the graph's directed edges are precisely captured*.

(c)

BFS and DFS claim:

Starting with node 'A,' we'll explore all nodes accessible from 'A' precisely once, capturing the most efficient (shortest) path in BFS and examining all related branches in DFS.

Justification:

- Starting Point - Initialization: For both strategies, we begin with node 'A' because it is our beginning point. Our "visited list" currently comprises only 'A' in both BFS and DFS.

- Next Steps - Exploration and Backtracking:

    - BFS: From 'A,' we search immediate neighbors first, which is 'B' in this case. When we "move" to 'B,' we examine its neighbors, 'C' and 'D,' and so on. To eliminate redundancy, we avoid visiting a node twice.

    - DFS: Starting with 'A,' we consider its neighbor 'B,' but then proceed along this branch as far as feasible (to 'C', then 'E', then back to 'C' and on to 'F' and 'D') before returning. We do not visit a node twice in this operation. If we reach an already visited node, we skip it and backtrack.

- Stop Point - Conclusion of the Process:

    - For both BFS and DFS, we can end the process when there are no new nodes to evaluate. We've visited every node accessible from 'A' precisely once by this stage.

We're effectively performing what the formal algorithm does by following these steps. As a result, in both BFS and DFS, each node reachable from 'A' is visited precisely once. BFS additionally reaches each node along a shortest path (fewest edges) from 'A'; DFS makes no such guarantee.

(d)

Claim: Removing either edge (D, F) or (F, D) yields a Directed Acyclic Graph (DAG), and the topological orderings supplied are valid.

Proof:

- Acyclicity: There is only one cycle between 'D' and 'F' in the original graph. Removing either edge (D, F) or (F, D) breaks this cycle, resulting in an acyclic graph and hence a DAG.

- Topological Ordering: 
    - When edge (D, F) is deleted, the only topological ordering that remains is (A, B, C, F, D, E).
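    - When edge (F, D) is deleted, there are four potential topological orderings: (A, B, C, D, E, F), (A, B, C, D, F, E), (A, B, D, C, E, F), and (A, B, D, C, F, E). C and D are interchangeable after B, and E and F are interchangeable at the end.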

These orderings meet the topological ordering condition: if a directed edge (u, v) exists, then u is ordered before v.

### Question 4 (10 points)

Sort each group of functions in increasing order of growth in terms of computational complexity.

Set 1: Linear, Logarithmic, and Power Functions (check for large values of n, i.e. n > 500) [5 points]

\begin{align*}
f1(n) &= 3n^{0.7} \log n \\
f2(n) &= n (\log n)^{0.5} \\
f3(n) &= 5n\log n \\
f4(n) &= n^{2} \\
f5(n) &= (0.9)^n n^2
\end{align*}


Set 2: Polynomial and Exponential Functions [5 points]

\begin{align*}
f1(n) &= n^3 (\log n) \\
f2(n) &= 10n (\log^2 n) \\
f3(n) &= n^2 1.5^n \\
f4(n) &= n^2 2^{\log n} \\
f5(n) &= {3^n}
\end{align*}



### **Solution - Question 4**

(a)

\begin{align*}
f5(n) &< f1(n) < f2(n) < f3(n) < f4(n)\\
\\
O((0.9)^n n^2) &< O(n^{0.7} \log n) < O(n (\log n)^{0.5}) < O(5n\log n) < O(n^{2})
\end{align*}

(b)

\begin{align*}
f2(n) &< f4(n) < f1(n) < f3(n) < f5(n)\\
\\
O(n \log^2 n) &< O(n^{3}) < O(n^3 \log n) < O(n^2 1.5^n) < O(3^n)
\end{align*}


### **Justification and Proof of Correctness - Question 4**

(a)

*Justification*

- f5 is dominated by (0.9)^n, which is an exponential decay. For large n, it approaches zero, making it the slowest-growing function.
- f1 combines a sub-linear power n^0.7 with a logarithmic factor log n. It grows faster than any purely logarithmic function but slower than linear.
- f2 grows linearly in n, multiplied by the slowly growing factor (log n)^0.5. It grows faster than f1(n) because n / n^0.7 = n^0.3 eventually outgrows (log n)^0.5.
- f3 is a linear-logarithmic function. It grows faster than both f1(n) and f2(n), but not as fast as quadratic or higher-degree polynomial functions.
- f4 grows quadratically, which is the fastest-growing among the list of functions for large n.

*Graph Plotting*

![Part(a) Computational Complexity Graph](image-8.png)

(b)

*Justification*
- f2(n) combines a linear term with a squared logarithmic term, making it the slowest-growing function in the set.
- f4(n) simplifies to a cubic term n^3, because 2^{log n} = n when the logarithm is taken base 2. It grows faster than f2(n) but slower than the remaining functions.
- f1(n) multiplies a cubic term by a logarithmic factor, placing it ahead of f4(n) but behind the exponential functions.
- f3(n) combines a quadratic term with an exponential one of base 1.5, so it grows faster than any polynomial but slower than 3^n.
- f5(n) is a pure exponential function with a base of 3, making it the fastest-growing function in this set.

*Graph Plotting*

![Part(b) Computational Complexity Graph ](image-9.png)
![Part(b) Computational Complexity Graph](image-10.png)
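The two orderings can also be sanity-checked numerically. The sketch below compares the natural logarithm of each function rather than the function itself, so that (0.9)^n and 3^n neither underflow nor overflow. It assumes base-2 logarithms inside the functions and the sample point n = 10^6; the helper names are illustrative. A single sample point illustrates the ordering but does not prove it.

```python
import math


def log_values_set1(n):
    """ln of each Set 1 function at n (log inside the functions taken base 2)."""
    ln, lg = math.log(n), math.log2(n)
    return {
        "f1": math.log(3) + 0.7 * ln + math.log(lg),   # 3 n^0.7 log n
        "f2": ln + 0.5 * math.log(lg),                 # n (log n)^0.5
        "f3": math.log(5) + ln + math.log(lg),         # 5 n log n
        "f4": 2 * ln,                                  # n^2
        "f5": n * math.log(0.9) + 2 * ln,              # (0.9)^n n^2
    }


def log_values_set2(n):
    """ln of each Set 2 function at n (log inside the functions taken base 2)."""
    ln, lg = math.log(n), math.log2(n)
    return {
        "f1": 3 * ln + math.log(lg),                   # n^3 log n
        "f2": math.log(10) + ln + 2 * math.log(lg),    # 10 n log^2 n
        "f3": 2 * ln + n * math.log(1.5),              # n^2 1.5^n
        "f4": 2 * ln + lg * math.log(2),               # n^2 2^(log n)
        "f5": n * math.log(3),                         # 3^n
    }


def ranking(log_values):
    """Function names sorted from smallest to largest value."""
    return sorted(log_values, key=log_values.get)


n = 10**6
print(ranking(log_values_set1(n)))
print(ranking(log_values_set2(n)))
```

The sample point needs to be large: at n = 1000 with base-2 logarithms, the constant factor 3 still places f1 above f2, even though f2 dominates asymptotically.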

### Question 5 (25 points)

**Problem Statement**

Alex, a NUTech Solutions recruiter, faces the annual difficulty of pairing interns with mentors. Alex is fascinated by the Gale-Shapley algorithm and wonders if a modified version may provide a fairer matching system that takes various practical difficulties into consideration. It is your responsibility to adapt the Gale-Shapley method to these real-world settings using Python functions. ***Please refer to Ques5_Coding.py for the starter code template***

(a) Due to differences in expertise, not all mentors can help interns on all projects. Different mentors specialize in different areas, while interns have varying skill preferences. [10 points]

Write a Python function called *create_skill_based_preferences(mentors, interns)* that generates initial preference lists based on matching skills.

- Input: Two lists of dictionaries, one for mentors and one for interns. Each dictionary has a 'name' and a 'skills' list.
- Output: Two collections of preference lists, one for mentors and one for interns, based on skill compatibility.

(b) Use the Gale-Shapley algorithm to discover stable pairings of mentors and interns based on their preferences.[10 points]

Make a Python method called *find_stable_matching(mentor_preferences, intern_preferences)* that identifies the stable pairing based on the preferences derived from the preceding task.
- Input:
    - mentor_preferences: A dictionary with mentor names as keys and lists of intern names sorted by preference as values.
    - intern_preferences: A dictionary with intern names as keys and lists of mentor names sorted by preference as values.

- Output: A dictionary containing stable matches with the mentor as the key and the intern as the value.

(c) Occasionally, following matching, we get feedback from either interns or mentors that they are unhappy with the pairing for reasons that were not addressed in the initial skill-based matching. These might be owing to factors such as geography, time, or project alignment.[5 points]

Your objective is to develop a function that accepts the stable pairings, a list of "unhappy" mentors, and a list of "unhappy" interns, and produces a new set of stable pairs after deleting the given unstable pairs.

- Input
    - stable_pairs: A dictionary containing stable pairs of mentors and interns.
        - String mentor name as a key
        - String value for intern name
    - unhappy_mentors: A list of mentor names (strings) that are unhappy with their pairing.
    - unhappy_interns: A list of intern names (strings) that are unhappy with their pairing.  
  
- Output - A dictionary containing new stable pairings once the unstable pairs have been removed.

### **Solution - Question 5**

(a)

```python
def create_skill_based_preferences(mentors, interns):
    mentor_pref = {}
    intern_pref = {}
    
    # Loop through mentors to create their preference lists
    for mentor in mentors:
        mentor_name = mentor['name']
        mentor_skills = set(mentor['skills'])
        mentor_pref[mentor_name] = []
        
        # Sort interns based on the number of matching skills
        for intern in sorted(interns, key=lambda x: len(mentor_skills.intersection(set(x['skills']))), reverse=True):
            intern_name = intern['name']
            if len(mentor_skills.intersection(set(intern['skills']))) > 0:
                mentor_pref[mentor_name].append(intern_name)

    # Loop through interns to create their preference lists
    for intern in interns:
        intern_name = intern['name']
        intern_skills = set(intern['skills'])
        intern_pref[intern_name] = []
        
        # Sort mentors based on the number of matching skills
        for mentor in sorted(mentors, key=lambda x: len(intern_skills.intersection(set(x['skills']))), reverse=True):
            mentor_name = mentor['name']
            if len(intern_skills.intersection(set(mentor['skills']))) > 0:
                intern_pref[intern_name].append(mentor_name)

    return mentor_pref, intern_pref
```
(b)
    
```python
from collections import deque


def find_stable_matching(mentor_preferences, intern_preferences):
    # Precompute each mentor's ranking of interns for O(1) comparisons
    mentor_rank = {
        mentor: {intern: pos for pos, intern in enumerate(prefs)}
        for mentor, prefs in mentor_preferences.items()
    }
    unassigned_interns = deque(intern_preferences)
    mentor_current = {}  # mentor -> currently matched intern
    next_proposal = {intern: 0 for intern in intern_preferences}  # next list index to propose to

    while unassigned_interns:
        intern = unassigned_interns.popleft()
        preferred_mentors = intern_preferences[intern]

        # Resume from the first mentor this intern has not proposed to yet
        while next_proposal[intern] < len(preferred_mentors):
            mentor = preferred_mentors[next_proposal[intern]]
            next_proposal[intern] += 1
            rank = mentor_rank.get(mentor, {})
            if intern not in rank:
                continue  # Mentor does not list this intern, so the pair is not acceptable
            current_intern = mentor_current.get(mentor)
            if current_intern is None:  # Mentor is unassigned
                mentor_current[mentor] = intern
                break
            if rank[intern] < rank[current_intern]:  # New intern is preferred
                mentor_current[mentor] = intern
                unassigned_interns.append(current_intern)  # Displaced intern proposes again later
                break
        # An intern who exhausts their list stays unmatched

    return mentor_current
```
(c)

```python
def remove_unstable_pairs(stable_pairs, unhappy_mentors, unhappy_interns):
    new_stable_pairs = {}
    
    for mentor, intern in stable_pairs.items():
        if mentor not in unhappy_mentors and intern not in unhappy_interns:
            new_stable_pairs[mentor] = intern
    
    return new_stable_pairs
```

### **Correctness Proof with Test Cases - Question 5**

The code solution was tested and confirmed for the following test cases:

1. Handle empty lists of mentors and interns.
2. All mentors and interns have identical skills.
3. All mentors and interns have mutual first-choice preferences.
4. Multiple iterations required to find a stable match.
5. All mentors and interns are unhappy with their initial pairings.
6. Some mentors and interns are unhappy with their initial pairings.

Example usage is given along with code implementation in the file **Ques5_Sol.py**
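One way to check a test case is to search a returned matching for blocking pairs. The sketch below is an illustrative helper, not part of Ques5_Sol.py; the names `prefers` and `find_blocking_pairs` are hypothetical. It assumes the matching maps mentor names to intern names and that a pair is acceptable only when each side appears on the other's preference list.

```python
def prefers(ranked, candidate, current):
    """True if `candidate` is listed and ranked above `current` (or `current` is absent)."""
    if candidate not in ranked:
        return False
    if current is None or current not in ranked:
        return True
    return ranked.index(candidate) < ranked.index(current)


def find_blocking_pairs(matching, mentor_preferences, intern_preferences):
    """Return every (mentor, intern) pair that both sides prefer to their current match."""
    intern_to_mentor = {intern: mentor for mentor, intern in matching.items()}
    blocking = []
    for mentor, ranked_interns in mentor_preferences.items():
        for intern in ranked_interns:
            mentor_gains = prefers(ranked_interns, intern, matching.get(mentor))
            intern_gains = prefers(intern_preferences.get(intern, []), mentor,
                                   intern_to_mentor.get(intern))
            if mentor_gains and intern_gains:
                blocking.append((mentor, intern))
    return blocking


mentor_prefs = {"M1": ["I1", "I2"], "M2": ["I1", "I2"]}
intern_prefs = {"I1": ["M1", "M2"], "I2": ["M1", "M2"]}

print(find_blocking_pairs({"M1": "I1", "M2": "I2"}, mentor_prefs, intern_prefs))  # []
print(find_blocking_pairs({"M1": "I2", "M2": "I1"}, mentor_prefs, intern_prefs))  # [('M1', 'I1')]
```

An empty result means the matching is stable with respect to the given lists; any returned pair is a concrete counterexample.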

### Question 6 (25 points)

**Problem Statement**

You are a member of a smart city project's traffic department. The city is represented as a graph, with junctions acting as nodes and roads acting as edges. Your mission is to create algorithms that assist drivers and the traffic department in a variety of ways. ***Please refer to Ques6_Coding.py for the starter code template***

(a) Emergency Vehicle Routing (BFS) [5 points]

Emergency vehicles, such as ambulances and fire engines, must reach their destinations as quickly as possible. Create a function shortest_path(graph: dict, start: str, end: str) that accepts a city graph, a starting intersection, and a destination intersection, and returns the shortest path from start to end.

- Input

    - graph: A dictionary with intersection names as keys and lists of neighboring intersections as values.
    - start: The starting intersection, as a string.
    - end: The destination intersection, as a string.
    
- Output

    - A list of intersection names in the order they should be followed to get from start to end using the fewest roads.


(b) Road Maintenance Scheduling (DFS) [5 points]

The city is planning road repair and wants to guarantee that every route is pothole-free. They want to execute this in the most efficient way possible, with each intersection/node visited exactly once if feasible. To discover such a path, create a Python function maintenance_path(graph: dict, start: str) -> list.

- Input

    - graph: A dictionary with intersection names as keys and lists of neighboring intersections as values.
    - start: The starting intersection, as a string.

- Output

    - A list of intersection names, arranged in the order in which they should be traveled for effective road inspection. If there are several valid paths, return any of them.


(c) Safest Route (BFS with Weighted Edges) [7 points]

Each road now has a safety grade assigned to it. The city council wants to know the best way to get from one location to another. The "safest" route is defined for this problem as the one having the highest total of safety ratings along its path.

Write a Python method safest_path(graph: dict, start: str, end: str) -> list that searches for the safest path using BFS.

- Input:
    - graph: A dictionary with intersection names as keys and dictionaries with surrounding intersections and their safety rating as values.
    - start: As a string, the first intersection.
    - end: As a string, the destination intersection.

- Output: 
    - A list of junction names that reflect the safest path from start to finish based on the total of safety ratings. Returning any is allowed if many pathways exist with the same greatest sum.

- Constraints: 
    - Safety ratings are positive integers.
    - If the safest route is a tie based on the total of safety ratings, any route is acceptable.

(d) Minimum Stops to Refuel (BFS) [8 points]

Given that each car has enough charge to go directly from one junction to any nearby intersection, you must discover the path from a beginning point to an endpoint with the fewest charging stops. This is crucial since fewer stops imply shorter charging lines and smoother traffic flow, encouraging the usage of electric vehicles and contributing to the city's environmental goals.

Write a Python function called min_charging_stops(graph, start, end, stations) to identify the path with the fewest charging stops.

- Input:
    - graph: A dictionary with intersection names as keys and lists of surrounding junctions as values.
    - start: As a string, the first intersection.
    - end: As a string, the destination intersection.
    - stations: A list of intersections where charging stations are present.

- Output:
    - A list of intersection names from start to end that minimizes the number of charging stops.


### **Solution - Question 6**

(a)

```python
from collections import deque


def shortest_path(graph, start, end):
    # Mark intersections as visited when they are enqueued,
    # so each one enters the queue at most once.
    visited = {start}
    queue = deque([(start, [start])])

    while queue:
        current_intersection, path = queue.popleft()
        if current_intersection == end:
            return path

        for neighbor in graph[current_intersection]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [neighbor]))

    # No route exists between start and end
    return None
```

(b)

```python
def maintenance_path(graph, start):
    visited = set()
    path = []

    def dfs(current_intersection):
        visited.add(current_intersection)
        path.append(current_intersection)
        for neighbor in graph[current_intersection]:
            if neighbor not in visited:
                dfs(neighbor)

    dfs(start)
    return path
```

(c)

```python
def safest_path(graph, start, end):
    candidates = [(0, start, [])]
    visited = set()
    
    while candidates:
        candidates.sort(reverse=True)  # Sort by safety, descending
        safety_sum, current, path = candidates.pop(0)  # Get the safest route
        
        if current in visited:
            continue
        visited.add(current)
        
        path = path + [current]
        
        if current == end:
            return path
        
        for neighbor, safety in graph[current].items():
            if neighbor not in visited:
                new_safety_sum = safety_sum + safety
                candidates.append((new_safety_sum, neighbor, path))
                
    return "Not possible"
```
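Because the search above stops the first time `end` is popped, it can settle on a route whose total rating is lower than that of a longer alternative. As a point of comparison, here is a minimal sketch that enumerates every simple path (no junction repeated) and keeps the one with the largest total. The function name `safest_path_exhaustive` and the sample graph `city` are illustrative assumptions, not part of the starter code, and the exhaustive search is only practical for small graphs.

```python
def safest_path_exhaustive(graph, start, end):
    """Return the simple path from start to end with the largest total rating."""
    best_sum = -1      # ratings are positive, so -1 means "no path found yet"
    best_path = None

    def explore(node, path, total):
        nonlocal best_sum, best_path
        if node == end:
            if total > best_sum:
                best_sum, best_path = total, list(path)
            return
        for neighbor, safety in graph[node].items():
            if neighbor not in path:  # keep the path simple
                path.append(neighbor)
                explore(neighbor, path, total + safety)
                path.pop()

    explore(start, [start], 0)
    return best_path


# Hypothetical example graph: A-B-D totals 6, A-C-D totals 11
city = {
    "A": {"B": 5, "C": 1},
    "B": {"A": 5, "D": 1},
    "C": {"A": 1, "D": 10},
    "D": {"B": 1, "C": 10},
}
print(safest_path_exhaustive(city, "A", "D"))  # ['A', 'C', 'D']
```

On this graph the route through C wins even though its first road has the lower rating, which is the case a greedy choice of the next road tends to miss.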

(d)

```python
from collections import deque


def min_charging_stops(graph, start, end, stations):
    visited = set()
    queue = deque([(start, [start], 0)])  # Node, Path, Stops

    while queue:
        current_node, path, stops = queue.popleft()
        visited.add(current_node)

        # If we reach the destination, return the path
        if current_node == end:
            return path

        for neighbor in graph[current_node]:
            if neighbor not in visited:
                new_stops = stops
                # If there is a charging station at the neighbor, count one more stop
                if neighbor in stations:
                    new_stops += 1
                # Add the neighbor node to the queue for future processing
                queue.append((neighbor, path + [neighbor], new_stops))

    # If a path is not found
    return None
```
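The queue above is processed in order of hop count, so the first route to reach `end` is returned regardless of how many stations it passes. If the stop count itself should drive the search, one option is a 0-1 BFS: entering a station junction costs 1, entering any other junction costs 0, and zero-cost moves are placed at the front of the deque. The sketch below is an illustration under those assumptions; the name `min_charging_stops_01bfs`, the tuple return value, and the sample graph are hypothetical.

```python
from collections import deque


def min_charging_stops_01bfs(graph, start, end, stations):
    """Return (path, stops) minimizing the number of station junctions entered."""
    station_set = set(stations)
    best = {start: 0}       # fewest stops known so far for each junction
    parent = {start: None}  # predecessor on the best route found so far
    queue = deque([start])

    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            cost = 1 if neighbor in station_set else 0
            new_stops = best[node] + cost
            if neighbor not in best or new_stops < best[neighbor]:
                best[neighbor] = new_stops
                parent[neighbor] = node
                # Zero-cost moves go to the front, one-cost moves to the back
                if cost == 0:
                    queue.appendleft(neighbor)
                else:
                    queue.append(neighbor)

    if end not in best:
        return None

    # Walk the parent links back from end to start
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1], best[end]


# Hypothetical example: the two-hop route passes station B, the three-hop route passes none
roads = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "E"],
    "E": ["C", "D"],
    "D": ["B", "E"],
}
print(min_charging_stops_01bfs(roads, "A", "D", ["B"]))  # (['A', 'C', 'E', 'D'], 0)
```

Here the longer route is preferred because it avoids the only station, whereas a hop-ordered search would return A, B, D.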


### **Correctness Proof with Test Cases - Question 6**

The code solutions for all parts were tested against multiple edge cases and constraints, following the test cases given in the question:

- Test Case 1: Basic Graph with a Loop
- Test Case 2: Graph with Multiple Branching Paths
- Test Case 3: More Complex Graph with Multiple Unique Paths
- Test Case 4: Small-Scale Grid Path
- Test Case 5: Simple Four-Node Graph
- Test Case 6: Graph with Isolated Intersections

Example usage is given along with code implementation in the file **Ques6_Sol.py**

### Question 7 (10 points)

Analyze the following code snippets and determine the overall time complexity of each.

(a)

This search_emails function filters a list of emails (emails) based on a list of target keywords (targets). It checks the 'subject' and 'body' of each email, appending matching emails to a found_emails list before returning it.

```python
def search_emails(emails, targets):
    found_emails = []
    for email in emails:           
        for target in targets:     
            if target in email['subject'] or target in email['body']:
                found_emails.append(email)
                break
    return found_emails
```
(b)

This find_mutual_friends function identifies mutual friends and shared interests between two users (user1_data and user2_data). It iterates through the 'friends' and 'interests' fields of both users' data, appending matching pairs to a mutuals list before returning it.

```python
def find_mutual_friends(user1_data, user2_data):
    mutuals = []
    for friend in user1_data['friends']:        # Loop 1: n iterations
        if friend in user2_data['friends']:     # List search: m iterations
            for interest in user1_data['interests']:  # Loop 2: p iterations
                if interest in user2_data['interests']: # List search: q iterations
                    mutuals.append((friend, interest))
    return mutuals
```
(c)

The generate_combinations function creates a list of combinations for available pizza toppings, considering a list of 'exclusions'. It iterates through the 'toppings' list to generate pairs, excluding any pairs present in the 'exclusions' list, and then appends the remaining combinations to a combinations list. Your analysis should consider the nested loops and the list search for exclusions. 

```python
def generate_combinations(toppings, exclusions):
    combinations = []
    for i in range(len(toppings)):              # Loop 1: n iterations
        for j in range(i+1, len(toppings)):     # Loop 2: n - 1, n - 2, ... 1 iterations
            if (toppings[i], toppings[j]) not in exclusions:  # List search: m iterations
                combinations.append((toppings[i], toppings[j]))
    return combinations
```

(d)

The function performs a binary search on an array (arr) to find a target value (target) but also has the option to calculate all permutations of the array when a specific depth parameter is set to 0. *Your analysis should consider the logarithmic complexity of binary search and the factorial complexity of calculating permutations.*

```python
def advanced_search(arr, target, depth):
    if depth == 0:
        return calculate_permutations(arr)  
    low = 0
    high = len(arr) - 1
    while low <= high:     
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1
```
(e) 

The exponential_logarithmic_function combines two operations: generating all subsets of a set s and performing a binary search on an array arr for each subset's length. 

```python
# Function to perform binary search
def binary_search(arr, target):
    low = 0
    high = len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1

# Function to generate all subsets of a set
def generate_subsets(s):
    if len(s) == 0:
        return [[]]
    subsets = []
    first = s[0]
    remaining = s[1:]
    for subset in generate_subsets(remaining):
        subsets.append(subset)
        subsets.append([first] + subset)
    return subsets

# Main function that combines both exponential and logarithmic complexities
def exponential_logarithmic_function(arr, s):
    for subset in generate_subsets(s):
        binary_search(arr, len(subset))
```

### **Solution - Question 7**

$$
\begin{aligned}
&(a)  \quad O(n \times m \times p)\\
&(b)  \quad O(n \times m \times p \times q)\\
&(c) \quad O(n^2 \times m)\\
&(d) \quad O(n!)\\
&(e) \quad O(2^n \times \log m)
\end{aligned}
$$



### **Justification and Proof of Correctness - Question 7**

(a)
- There are two nested loops: 

    - The first loop iterates through all of the emails in the list. Let us suppose there are n emails.
    - The second loop iterates through the list of target keywords. Assume you have m target keywords.  

- Inside the inner loop, the function uses the in operator for substring matching. In the worst case this costs O(p), where p is the length of the text being searched (subject plus body).

Given this, the function's worst-case time complexity may be determined as follows:

- The nested loops run O(n * m) iterations in the worst case; the break only helps when a match is found early.
- Each inner iteration performs a substring search costing O(p).
- The append() operation is amortized O(1).


Putting these together, we get a worst-case time complexity of O(n * m * p).

(b)

- The outer loop iterates through each of user1_data's friends, giving n iterations.
- Within the outer loop, a conditional statement checks whether friend is in user2_data['friends']. In the worst case, this list membership test has a time complexity of O(m).
- When the friend is mutual, a nested loop iterates over each interest of user1_data, giving p iterations.
- Within the nested loop, another conditional statement checks whether interest exists in user2_data['interests']. In the worst case, this list membership test has a time complexity of O(q).
- Finally, the append() operation has a constant time complexity of O(1).

Multiplying the four factors gives the upper bound O(n×m×p×q). Note that the O(m) friend lookup runs once per outer iteration rather than once per interest, so the bound can be tightened to O(n×(m + p×q)).

(c) 

- Outer Loop: Iterates n times, where n is the number of toppings. As a result, its time complexity is O(n).

- Inner Loop: The inner loop starts from i+1 to n-1. The number of iterations for the inner loop would be (n−1)+(n−2)+…+1, which forms an arithmetic series. The sum of this series is (n−1)×n/2, which simplifies to O(n^2).

- Exclusion Check: Inside the inner loop, there is a check to see if a particular combination is in the exclusions list. This operation is O(m), where m is the number of exclusions.

- Append Operation: This operation is O(1), a constant time operation.

Combining all these elements, we get the time complexity as O(n^2×m).
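The pair count used above can be written out explicitly. For each outer index i (from 0 to n−1) the inner loop runs n−1−i times, so the total number of exclusion checks is:

$$
\begin{aligned}
\sum_{i=0}^{n-1} (n-1-i) &= (n-1) + (n-2) + \dots + 1 + 0 = \frac{n(n-1)}{2}\\
\text{total cost} &= \frac{n(n-1)}{2} \times O(m) = O(n^2 \times m)
\end{aligned}
$$

For example, with n = 4 toppings the inner body runs 3 + 2 + 1 = 6 times, which matches 4 × 3 / 2.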

(d)

1. When the depth is zero:

    The function in this case calls calculate_permutations(arr). Because you are creating every conceivable arrangement of n elements, calculating all permutations of an array of length n has a time complexity of O(n!).

2. When depth is nonzero:

    In this scenario, a binary search is performed on the array arr. Binary search has a time complexity of O(log n), where n is the size of the array.


Because the function executes either the permutation calculation or the binary search depending on depth, the worst-case time complexity would be the maximum of these two, which is O(n!) when depth is 0.

As a result, the advanced_search function's worst-case time complexity is O(n!).


(e)

1. Generating Subsets (generate_subsets function):

    The generate_subsets function generates all possible subsets of the set s. For a set of size n, there are 2^n subsets. Thus, the time complexity for generating all subsets of s is O(2^n).

2. Binary Search (binary_search function):

    The binary_search function has a time complexity of O(log m), where m is the size of the array arr.

3. Combined Complexity (exponential_logarithmic_function function):

    In the exponential_logarithmic_function, the function iterates over each subset generated by generate_subsets(s) and performs a binary search using binary_search(arr, len(subset)).

Hence, the overall time complexity of the exponential_logarithmic_function will be O(2^n×log m).
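As a quick sanity check on the subset count, the sketch below reuses generate_subsets from the question and prints how many subsets it returns for small inputs. The loop bounds are arbitrary and chosen only for illustration.

```python
def generate_subsets(s):
    # Same recursive definition as in the question
    if len(s) == 0:
        return [[]]
    subsets = []
    first = s[0]
    remaining = s[1:]
    for subset in generate_subsets(remaining):
        subsets.append(subset)
        subsets.append([first] + subset)
    return subsets


# The number of subsets doubles each time one element is added
for n in range(5):
    count = len(generate_subsets(list(range(n))))
    print(n, count, 2 ** n)
```

Each row prints n, the observed count, and 2^n, and the last two columns agree, so the loop in exponential_logarithmic_function performs one binary search per subset.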


### Question 8 (10 points)

**Problem Statement:**

Emergency Evacuation Plan in a Building Using Graph Traversal Algorithms

Consider a building with numerous floors and rooms, represented as a graph with each node representing a room and each edge representing a doorway connecting two rooms. Some rooms have staircases leading to higher stories. In the event of a fire, the building's residents must be evacuated as fast and safely as possible.

Create pseudocode for a modified BFS or DFS algorithm that will discover the shortest path from each given room to the nearest emergency exit. The algorithm should take into consideration numerous restrictions such as blocked doors, room capacity, and other potential risks. The aim is to get everyone to safety while keeping the overall evacuation time to a minimum.

- Before writing the pseudocode, describe your problem-solving strategy.
- To tackle this problem, write a pseudocode for the updated BFS or DFS algorithm.

Input Format:

- Rooms: An integer N representing the number of rooms (nodes).
- Doorway Count: An integer M representing the number of doors (edges).
- Edge Weights: A list of M tuples (a, b, w) denoting the existence of a doorway with a time-weight of w (to-cross) between rooms a and b.
- Emergency Exits: A list E of rooms marked as emergency exits.
- Impassable Rooms: A list of rooms that are inaccessible or harmful.
- Start Room: An integer S denoting the initial room.

Output Format:

- Shortest Evacuation Time: An integer reflecting the time it takes to go from the starting room to an emergency exit. Return -1 if no route is identified.

**Constraints:**

- The building's graph is connected, although it may contain cycles.
- Some nodes (rooms) may be designated as dangerous or inaccessible.
- Each edge (doorway) may have a weight that represents the amount of time it takes to pass through.
- The building has a restricted number of emergency exits, which are depicted as special nodes in the network.


### **Solution - Question 8**

**Approach:**

- Initialization: Create a data structure to maintain the state of each room, which contains information such as its distance from the beginning room and whether or not it has been visited.

- Constraints: Incorporate checks to skip impassable rooms and to adjust edge weights according to the constraints like blocked doorways, room capacities, and other hazards.

- Emergency Exits: Locate the building's emergency exits and include them in the algorithm's stopping criteria.

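- Traversal approach: To traverse the graph, use a modified Breadth-First Search (BFS) in which the FIFO queue is replaced by a priority queue keyed on travel time, which is Dijkstra's algorithm. Plain BFS finds shortest paths only when every edge has the same cost, and DFS gives no shortest-path guarantee; the priority-queue variant handles the non-negative doorway weights in this problem.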


**Pseudocode:**

```plaintext
Initialize an empty priority queue Q
Initialize a dictionary distance with all rooms set to infinity, except the starting room S set to 0
Initialize an empty set visited

Push (0, S) into Q (distance, room)
while Q is not empty:
    current_distance, current_room = Pop the minimum element from Q

    # Skip stale queue entries and rooms that cannot be entered
    if current_room is visited or current_room is in Impassable Rooms:
        continue

    if current_room is in Emergency Exits:
        return current_distance

    Mark current_room as visited
    for each neighbor, edge_weight in neighbors of current_room:
        if neighbor is not visited and neighbor is not in Impassable Rooms:
            new_distance = current_distance + edge_weight

            if new_distance < distance[neighbor]:
                distance[neighbor] = new_distance
                Push (new_distance, neighbor) into Q

return -1  # If no path to any emergency exit is found
```
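A minimal Python sketch of the priority-queue search described above, using the standard-library heapq module. The function name `shortest_evacuation_time`, its parameter order, and the sample edge list are illustrative assumptions; the sketch also assumes every doorway can be crossed in both directions.

```python
import heapq
from collections import defaultdict


def shortest_evacuation_time(edges, exits, impassable, start):
    # Assumption: each doorway (a, b, w) can be crossed in both directions
    neighbors = defaultdict(list)
    for a, b, w in edges:
        neighbors[a].append((b, w))
        neighbors[b].append((a, w))

    exit_rooms, blocked = set(exits), set(impassable)
    distance = {start: 0}
    visited = set()
    heap = [(0, start)]  # entries are (time so far, room)

    while heap:
        current_distance, room = heapq.heappop(heap)
        if room in visited or room in blocked:
            continue  # stale entry or a room that cannot be entered
        if room in exit_rooms:
            return current_distance
        visited.add(room)
        for neighbor, weight in neighbors[room]:
            if neighbor in visited or neighbor in blocked:
                continue
            new_distance = current_distance + weight
            if new_distance < distance.get(neighbor, float("inf")):
                distance[neighbor] = new_distance
                heapq.heappush(heap, (new_distance, neighbor))
    return -1  # no reachable emergency exit


# Hypothetical building: room 4 is the only exit
doors = [(0, 1, 2), (1, 2, 2), (0, 3, 1), (3, 2, 10), (2, 4, 1)]
print(shortest_evacuation_time(doors, exits=[4], impassable=[], start=0))   # 5
print(shortest_evacuation_time(doors, exits=[4], impassable=[1], start=0))  # 12
```

Blocking room 1 forces the slower route through room 3, which is why the second call reports a longer time.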

### **Justification and Proof of Correctness - Question 8**

Pseudocode Justification and correctness proof:

1. Initialization Procedure

    - The distance to every room is set to infinity, except for the starting room, which is set to zero. This ensures that any actual path discovered later will be shorter than infinity and will replace the initialized value, making it the minimum known at that time.

    - At the beginning, distance[start_room] equals 0 and distance[any_other_room] equals infinity. This guarantees that the algorithm correctly determines whether or not it can improve on the initial assignment.

2. Priority Queue (Q)

    - Priority queues guarantee that at every stage, we select the room closest to the beginning point. This is critical for effectiveness.

    - Whenever we remove an element from the priority queue, it is always the room with the shortest distance label. This characteristic assures that we explore nodes in the order that corresponds to discovering the shortest pathways first, justifying the usage of a priority queue.

3. Check the Emergency Exit

    - The goal is the shortest path to the nearest emergency exit; once an exit is removed from the priority queue, the algorithm may be safely terminated.

    - The priority queue assures that the first exit encountered has the shortest path, and no other exit has a shorter path. Stopping the algorithm when an emergency exit is reached is thus accurate.

4. Impassable Rooms

    - The algorithm detects and avoids inaccessible rooms, which is crucial for the evacuation's safety.

    - The algorithm only marks a room as visited if it is not on the list of impassable rooms. This guarantees that impassable rooms never have an impact on the evacuation plan's outcome.

5. Edge Relaxation and Update

    - When a room is visited, the algorithm iteratively updates the distances to its neighbors, guaranteeing that the shortest path is found.

    - The method assures optimality by updating only when a shorter path is identified (new_distance < distance[neighbor]). Given non-negative edge weights, this guarantees that by the time a room is removed from the priority queue, the quickest path to that room has been identified.