INFO 6205 - Program Structure and Algorithms (PSA)\
Assignment 4\
Student Name: Abhishek Hegde\
NUID: 002744522\
Professor: Nick Bear Brown\
Date: 11/19/2023

Q1. Suppose we have a directed graph G = (V,E) and a set of terminal nodes T ⊆ V. The Node-Disjoint Paths problem asks whether there exists a set of node-disjoint paths from each terminal node to another terminal node.
A. Is the Node-Disjoint Paths problem in P? If so, prove it.
B. Suppose we require each path to have at most three edges. We call this the 3-Path problem. Is the 3-Path problem in NP? If so, prove it.
C. Is the 3-Path problem in NP-complete? If so, prove it.

    Solution:
    A. Node-Disjoint Paths Problem: P or NP?

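    Under the reading that terminals are not paired in advance (each terminal only needs a path to some other terminal), the Node-Disjoint Paths problem is in P: it reduces to a maximum-flow computation, which Edmonds-Karp solves in O(|V| |E|^2) time. If specific source-destination pairs were prescribed, this flow argument would not apply. Here's a high-level explanation:

    1. Convert Graph to Flow Network:
    Split every non-terminal node v into v_in and v_out joined by an edge of capacity 1, so that at most one path can pass through v. Unit capacities on the edges alone would only give edge-disjoint paths.
    Give each original edge (u, v) a capacity of 1, directed from the outgoing copy of u to the incoming copy of v.
    Add a source node connected to the outgoing copy of each terminal node with an edge of capacity 1.
    Add a sink node connected from the incoming copy of each terminal node with an edge of capacity 1.

    2. Find Maximum Flow:
    Use a maximum flow algorithm (like Edmonds-Karp) to find the maximum flow in the network.

    3. Check Feasibility:
    Because all capacities are integers, an integral maximum flow exists and decomposes into node-disjoint paths. If the maximum flow is equal to the number of terminal nodes, every terminal is the start of one path and the end of one path, so a set of node-disjoint paths exists.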

In [11]:
from collections import defaultdict
from queue import Queue

def add_edge(graph, u, v, capacity):
    graph[u][v] = capacity
    graph[v][u] = 0

def max_flow(graph, source, sink):
    max_flow = 0
    while True:
        parent = {source: None}
        queue = Queue()
        queue.put(source)

        while not queue.empty() and sink not in parent:
            u = queue.get()
            for v, capacity in graph[u].items():
                if v not in parent and capacity > 0:
                    parent[v] = u
                    queue.put(v)

        if sink not in parent:
            break

        path_flow = float('inf')
        v = sink
        while v != source:
            u = parent[v]
            path_flow = min(path_flow, graph[u][v])
            v = u

        v = sink
        while v != source:
            u = parent[v]
            graph[u][v] -= path_flow
            graph[v][u] += path_flow
            v = u

        max_flow += path_flow

    return max_flow

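def node_disjoint_paths(graph, terminals):
    # graph maps each node to an iterable of its out-neighbours.
    # Convention assumed here: a terminal may start one path and end one
    # path; non-terminal nodes may lie on at most one path. A path that
    # loops back to its own start terminal is not ruled out by this sketch.
    source = "source"
    sink = "sink"
    terminals = set(terminals)
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs} | terminals

    # Initialize graph with capacities, splitting each node into in/out copies
    flow_graph = defaultdict(dict)
    for node in nodes:
        if node in terminals:
            add_edge(flow_graph, source, (node, "out"), 1)
            add_edge(flow_graph, (node, "in"), sink, 1)
        else:
            # Capacity 1 through a non-terminal enforces node-disjointness
            add_edge(flow_graph, (node, "in"), (node, "out"), 1)
    for u in graph:
        for v in graph[u]:
            add_edge(flow_graph, (u, "out"), (v, "in"), 1)

    # Find maximum flow
    max_flow_value = max_flow(flow_graph, source, sink)

    # Check feasibility
    return max_flow_value == len(terminals)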

    B. 3-Path Problem: NP?

    The 3-Path problem, which requires each path to have at most three edges, is in NP. NP, or nondeterministic polynomial time, is a complexity class that includes problems for which solutions can be checked quickly, though finding solutions may take longer.

    For the 3-Path problem, a nondeterministic algorithm could guess the paths and then verify in polynomial time whether the guessed paths indeed satisfy the conditions (i.e., each path has at most three edges and connects two terminal nodes). Therefore, the 3-Path problem is in NP.

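    Verification of a solution is straightforward. We can iterate through each path, count the number of edges, and ensure it connects terminal nodes, as shown below. A complete verifier would also confirm that consecutive nodes are joined by edges of G and that no node is reused across paths; both checks take polynomial time.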

In [12]:
def verify_3_path(paths, terminals):
    for path in paths:
        # A path is a list of nodes, so k edges means k + 1 nodes.
        if len(path) - 1 > 3:
            return False
        if path[0] not in terminals or path[-1] not in terminals:
            return False
    return True
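
A fuller certificate check could look like the sketch below. It assumes the graph is given as a set of directed `(u, v)` pairs and each path as a list of nodes. The function name `verify_3_path_certificate` and the disjointness convention (a terminal may start one path and end one path, while interior nodes are non-terminals used at most once) are assumptions made for illustration; the problem statement does not pin them down.

```python
def verify_3_path_certificate(edges, terminals, paths, max_edges=3):
    """Check a proposed set of paths in time polynomial in the input size.

    edges     -- set of directed (u, v) pairs
    terminals -- set of terminal nodes
    paths     -- list of node lists, one per path
    """
    starts, ends, interior = set(), set(), set()
    for path in paths:
        # A path with k edges has k + 1 nodes.
        if not 2 <= len(path) <= max_edges + 1:
            return False
        # Endpoints must be two different terminals.
        if path[0] not in terminals or path[-1] not in terminals:
            return False
        if path[0] == path[-1]:
            return False
        # Consecutive nodes must be joined by an edge of the graph.
        if any((u, v) not in edges for u, v in zip(path, path[1:])):
            return False
        # No terminal may start two paths or end two paths.
        if path[0] in starts or path[-1] in ends:
            return False
        starts.add(path[0])
        ends.add(path[-1])
        # Interior nodes are non-terminals used by at most one path.
        for node in path[1:-1]:
            if node in terminals or node in interior:
                return False
            interior.add(node)
    # Every terminal must be the start of some path.
    return starts == set(terminals)
```

Every check is a set lookup or a single pass over a path, so the whole verification is linear in the total length of the certificate.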

    C. 3-Path Problem: NP-Complete?

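    To prove that the 3-Path problem is NP-complete, two things are needed: membership in NP (part B) and NP-hardness, shown by reducing a known NP-complete problem to the 3-Path problem in polynomial time. The classical NP-complete problem often used for such reductions is the Hamiltonian Path problem.

    The Hamiltonian Path problem asks whether a given graph contains a path that visits each node exactly once. A reduction from the Hamiltonian Path problem to the 3-Path problem would proceed along these lines:

    For each edge (u, v) in the original graph, replace it with a gadget that forces a specific order of visiting u and v.
    Introduce terminal nodes corresponding to the starting and ending nodes of the Hamiltonian path.
    Connect these terminal nodes to the gadgets in a way that enforces a Hamiltonian path.
    The construction must run in polynomial time, and the constructed graph must have a valid set of 3-paths if and only if the original graph has a Hamiltonian path. This is an outline only: the gadget is not specified here, and because a Hamiltonian path has |V| - 1 edges while each 3-path has at most three, the gadget has to break the long path into short pieces. With such a gadget in place, the 3-Path problem is NP-complete.

    Below is a brute-force check for small instances. It enumerates all simple paths between the terminals, so it runs in exponential time and illustrates the problem rather than the reduction: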

In [14]:
class Graph:
    def __init__(self, vertices):
        self.vertices = vertices
        self.edges = [[] for _ in range(vertices)]

    def add_edge(self, u, v):
        self.edges[u].append(v)

def is_valid_path(graph, path):
    # Check if each edge in the path is valid (at most three edges)
    for i in range(len(path) - 1):
        u, v = path[i], path[i + 1]
        if v not in graph.edges[u] or i > 2:
            return False
    return True

def has_3_paths(graph, terminal_nodes):
    # Generate all possible paths from each terminal to another terminal
    paths = []
    for i in range(len(terminal_nodes)):
        for j in range(i + 1, len(terminal_nodes)):
            stack = [(terminal_nodes[i], [terminal_nodes[i]])]
            while stack:
                current, path = stack.pop()
                if current == terminal_nodes[j]:
                    paths.append(path)
                    continue
                for neighbor in graph.edges[current]:
                    if neighbor not in path:
                        stack.append((neighbor, path + [neighbor]))

    # The "3" in 3-Path bounds the number of edges per path, not the number
    # of paths, so for a single terminal pair one valid path is enough.
    return any(is_valid_path(graph, path) for path in paths)

# Example usage
if __name__ == "__main__":
    # Create a directed graph
    vertices = 5
    graph = Graph(vertices)
    graph.add_edge(0, 1)
    graph.add_edge(1, 2)
    graph.add_edge(2, 3)
    graph.add_edge(3, 4)

    # Define terminal nodes
    terminal_nodes = [0, 4]

    # Check whether some path between the terminals has at most three edges
    result = has_3_paths(graph, terminal_nodes)
    print(result)


False


Q2. The Vertex Cover Problem is defined as follows. Given an undirected graph G and an integer k, the problem is to determine whether there exists a set S of k nodes in G such that every edge in G is adjacent to at least one node in S. Show that Vertex Cover is NP-complete.

    Solution:
    The Vertex Cover problem is a classic NP-complete problem. To prove this, we need to show two things:

    Vertex Cover is in NP: Given a set of vertices, it is easy to check in polynomial time whether it is a valid vertex cover for the graph. We can simply go through all the edges and check if at least one vertex from the cover set is incident to each edge.

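    Vertex Cover is NP-hard: We do this by reducing a known NP-complete problem to Vertex Cover. We'll use 3-SAT, the restriction of the Boolean Satisfiability Problem (SAT) to clauses with exactly three literals, for this reduction.

    Reduction from 3-SAT to Vertex Cover:

    Given a Boolean formula in conjunctive normal form (CNF) with n variables and m clauses of three literals each, we need to construct a graph such that a satisfying assignment to the Boolean variables exists if and only if there is a vertex cover of size at most k = n + 2m.

    Let's consider a clause Ci in the CNF formula with literals x, y, z. We will create a gadget for each clause:

    1. Clause Gadget: Create a triangle with three vertices, one for each literal in the clause. Connect these three vertices with edges to form a triangle. Any vertex cover must contain at least two of the three triangle vertices.
    Repeat this for every clause in the CNF formula.

    2. Variable Gadget: For each variable x and its negation x`, create a pair of vertices connected by an edge. Any vertex cover must contain at least one of the two, which represents the choice of selecting either x or x` in the assignment.

    3. Connection: Connect each triangle vertex to the variable-gadget vertex that carries the same literal.

    Correctness: the gadgets alone force at least n + 2m cover vertices, so a cover of size k contains exactly one vertex per variable gadget and exactly two per triangle. Read the chosen variable vertices as the true literals. In each triangle, the one vertex left out of the cover must have its connecting edge covered from the variable side, so its literal is true and the clause is satisfied. Conversely, from a satisfying assignment, take the true literal of every variable and, in each triangle, the two vertices other than one satisfied literal.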

In [15]:
def sat_to_vertex_cover(cnf_formula):
    graph = {}
    
    for clause in cnf_formula:
        for literal in clause:
            add_vertex(graph, literal)
    
    for clause in cnf_formula:
        connect_clause(graph, clause)
    
    return graph

def add_vertex(graph, literal):
    if literal not in graph:
        graph[literal] = set()

def connect_clause(graph, clause):
    for literal in clause:
        for other_literal in clause:
            if literal != other_literal:
                graph[literal].add(other_literal)

# Example
cnf_formula = [[1, -2, 3], [-1, 2, 3]]
graph = sat_to_vertex_cover(cnf_formula)

print("Graph for Vertex Cover:", graph)

Graph for Vertex Cover: {1: {3, -2}, -2: {1, 3}, 3: {1, 2, -2, -1}, -1: {2, 3}, 2: {3, -1}}


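    The cell above builds only the clause edges, and it merges repeated occurrences of a literal into a single vertex. It omits the variable gadgets, the connecting edges, and the bound k, so it illustrates the clause gadget rather than the complete reduction. The full construction takes polynomial time, which demonstrates that Vertex Cover is NP-hard. Since Vertex Cover is also in NP, it follows that Vertex Cover is NP-complete. This concludes the proof of the NP-completeness of the Vertex Cover problem.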
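
    The complete construction (clause triangles, variable edges, connecting edges, and the bound k = n + 2m) can be sketched as follows. This is a minimal illustration rather than the notebook's own method: the names `three_sat_to_vertex_cover` and `has_vertex_cover` are hypothetical, literals are assumed to be nonzero integers (+i for x_i, -i for its negation), and the cover search is brute force, so it is only usable on tiny formulas.

```python
from itertools import combinations

def three_sat_to_vertex_cover(num_vars, clauses):
    """Build (edges, k) from a 3-CNF formula with literals +i / -i."""
    edges = set()
    # Variable gadget: one edge between x_i and its negation.
    for i in range(1, num_vars + 1):
        edges.add((("var", i), ("var", -i)))
    for c, clause in enumerate(clauses):
        corners = [("clause", c, pos) for pos in range(3)]
        # Clause gadget: a triangle on the three literal occurrences.
        edges.update(combinations(corners, 2))
        # Connection: each corner is joined to its literal's variable vertex.
        for corner, literal in zip(corners, clause):
            edges.add((corner, ("var", literal)))
    k = num_vars + 2 * len(clauses)
    return edges, k

def has_vertex_cover(edges, k):
    """Brute-force search over all k-subsets (exponential time)."""
    vertices = list({v for edge in edges for v in edge})
    return any(
        all(u in cover or v in cover for u, v in edges)
        for cover in map(set, combinations(vertices, k))
    )

# Satisfiable formula from the example above, plus a tiny unsatisfiable one.
sat_instance = three_sat_to_vertex_cover(3, [[1, -2, 3], [-1, 2, 3]])
unsat_instance = three_sat_to_vertex_cover(1, [[1, 1, 1], [-1, -1, -1]])
print(has_vertex_cover(*sat_instance), has_vertex_cover(*unsat_instance))
```

    Each clause occurrence gets its own vertex here, which is what lets the counting argument (one vertex per variable edge, two per triangle) go through.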

Q3. You are planning a music festival and want to make sure there is at least one performer who is skilled in each of the n genres required to perform (e.g. rock, pop, hip-hop, country, jazz, classical, etc.). You have received job applications from m potential performers. For each of n genres, there is some subset of potential performers qualified to perform it. The question is: For a given number k ≤ m, is it possible to hire at most k performers that can perform all of the n genres. We’ll call this the Cheapest Performer Set. Show that Cheapest Performer Set is NP-complete.
    
    Solution:
    The Cheapest Performer Set problem is a variation of the Set Cover problem. In this problem, you are given a set U of n genres and a collection of m subsets of U, each representing the genres a potential performer is qualified to perform. The task is to determine whether it is possible to hire at most k performers such that they cover all n genres.

    Proof of NP-Completeness:

    Step 1: Show that Cheapest Performer Set is in NP.

    Given a set of k performers, it's easy to check in polynomial time whether they cover all n genres. Therefore, Cheapest Performer Set is in NP.
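
    The polynomial-time check behind Step 1 might look like the sketch below. The function name `verify_performer_set` and the dictionary representation of qualifications are assumptions made for illustration.

```python
def verify_performer_set(genres, qualified, hired, k):
    """Certificate check for Cheapest Performer Set.

    genres    -- iterable of the n required genres
    qualified -- dict: performer -> set of genres that performer can play
    hired     -- proposed certificate: iterable of performer names
    k         -- maximum number of performers allowed
    """
    hired = set(hired)
    # Reject certificates that are too large or name unknown performers.
    if len(hired) > k or not hired <= set(qualified):
        return False
    # Collect every genre some hired performer can play.
    covered = set()
    for performer in hired:
        covered |= qualified[performer]
    return set(genres) <= covered
```

    The loop touches each hired performer's genre set once, so the running time is linear in the size of the instance.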

    Step 2: Choose an NP-complete problem, let's say Set Cover.

    The Set Cover problem is known to be NP-complete. It involves determining whether there exists a subset of at most k sets from a given collection that covers the entire universal set.

    Step 3: Prove that Set Cover can be polynomial-time reduced to Cheapest Performer Set.

    We construct an instance of Cheapest Performer Set from an instance of Set Cover.

    Given an instance of Set Cover with a universal set U and a collection of subsets S1,S2,...,Sm, where each Si is a subset of U, we create the following instance of Cheapest Performer Set:

    1. For each element u in U, create a genre corresponding to u.
    2. For each subset Si in the Set Cover instance, create a performer Pi.
    3. A performer Pi is qualified to perform the genres corresponding to the elements in Si.
    4. Set n=∣U∣, let m be the number of subsets S1,...,Sm, and keep the same bound k.

    Now, the question of whether there exists a subset of at most k sets in Set Cover that covers U is equivalent to asking whether there exists a subset of at most k performers in Cheapest Performer Set that covers all n genres.


In [2]:
def set_cover_to_cheapest_performer_set(universal_set, subsets):
    genres = universal_set
    performers = [set(subset) for subset in subsets]
    
    n = len(genres)
    m = len(performers)
    
    return n, m, genres, performers

# Example
universal_set = [1, 2, 3, 4]
subsets = [[1, 2], [2, 3], [3, 4]]
cheapest_performer_set_instance = set_cover_to_cheapest_performer_set(universal_set, subsets)

print("Cheapest Performer Set Instance:", cheapest_performer_set_instance)

Cheapest Performer Set Instance: (4, 3, [1, 2, 3, 4], [{1, 2}, {2, 3}, {3, 4}])


Q4. Suppose you are organizing a conference and have received n submissions for talks. Each talk has a set of m topics that it covers. You want to select at most k talks to ensure that each of the m topics is covered by at least one selected talk. This is known as the Efficient Conference Scheduling Problem.
Show that Efficient Conference Scheduling is NP-complete.

    Solution:

    To show that the Efficient Conference Scheduling Problem is NP-complete, we can reduce a known NP-complete problem to it. One of the well-known NP-complete problems is the Set Cover Problem. The Set Cover Problem can be reduced to Efficient Conference Scheduling, demonstrating that Efficient Conference Scheduling is NP-complete.

    Set Cover Problem:
    Given a universe U of elements, a collection S of subsets of U, and an integer k, the Set Cover Problem asks whether there exists a subcollection C of S such that the union of sets in C covers all elements of U and ∣C∣≤k.

    Reduction to Efficient Conference Scheduling:
    Let's construct an instance of the Efficient Conference Scheduling Problem based on an instance of the Set Cover Problem.

    1. Topics as Elements:
    Each element in the universe U of the Set Cover Problem corresponds to a unique topic in the Efficient Conference Scheduling instance.

    2. Talks as Sets:
    Each set in the collection S of the Set Cover Problem corresponds to a talk in the Efficient Conference Scheduling instance.
    The topics covered by a talk are the elements of the corresponding set.

    3. Efficient Conference Scheduling Instance:
    -> n = ∣S∣ is the total number of talks.
    -> m = ∣U∣ is the total number of topics.
    -> k is the maximum number of talks to be selected, unchanged from the Set Cover instance.

    Transformation Details:
    1. From Set Cover to Efficient Conference Scheduling:
    -> Set C in the Set Cover instance corresponds to the selected talks in the Efficient Conference Scheduling instance.
    -> If Set Cover has a solution with ∣C∣≤k, then the Efficient Conference Scheduling instance also has a solution.

    2. From Efficient Conference Scheduling to Set Cover:
    -> If Efficient Conference Scheduling has a solution, it means there is a selection of talks covering all topics.
    -> This selection of talks corresponds to a set C in the Set Cover instance.

    Proof:

    Membership in NP:
    -> Given a proposed selection of at most k talks, we can check in polynomial time that every one of the m topics appears in some selected talk.

    NP-hardness:
    -> The construction takes time linear in the size of the Set Cover instance.
    -> The reduction preserves the existence of solutions: the Set Cover instance has a cover of size at most k if and only if the conference instance has a valid selection of at most k talks.
    -> The Efficient Conference Scheduling Problem is therefore at least as hard as Set Cover. Since it is both in NP and NP-hard, it is NP-complete.
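
    The correspondence between the two problems can be checked on small instances with the sketch below. The function names are hypothetical and both solvers are brute force (exponential in the number of sets), so this only illustrates that the mapping preserves yes/no answers; it is not an efficient algorithm.

```python
from itertools import combinations

def set_cover_exists(universe, sets, k):
    """Brute-force Set Cover decision: is there a cover using at most k sets?"""
    universe = set(universe)
    for size in range(min(k, len(sets)) + 1):
        for chosen in combinations(sets, size):
            if universe <= set().union(*chosen):
                return True
    return False

def to_conference_instance(universe, sets, k):
    """Map Set Cover to the conference problem: elements become topics, sets become talks."""
    topics = set(universe)
    talks = {f"talk_{i}": set(s) for i, s in enumerate(sets)}
    return topics, talks, k

def schedule_exists(topics, talks, k):
    """Brute-force conference decision: can at most k talks cover every topic?"""
    names = list(talks)
    for size in range(min(k, len(names)) + 1):
        for chosen in combinations(names, size):
            covered = set().union(*(talks[name] for name in chosen))
            if topics <= covered:
                return True
    return False

universe = [1, 2, 3, 4]
sets = [[1, 2], [2, 3], [3, 4]]
for k in (1, 2):
    same = set_cover_exists(universe, sets, k) == schedule_exists(*to_conference_instance(universe, sets, k))
    print(k, same)
```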

Q5. Consider a weighted version of the Node Cover problem, called Weighted Randomized Node Cover:

Input:
A graph G=(V,E) with positive edge weights.
For every edge, Ei={u,v}, there exists a positive weight Wi​ representing the weight of the edge between vertex u and v.

Output:
A set of vertices S such that every edge of the graph is incident to at least one vertex of S.

a. Randomized Node Cover Algorithm:

Design an algorithm that, given unit weights Wv = 1 for all the vertices v∈V, goes through the edges and, whenever neither endpoint of an edge belongs to set S, picks one of the two endpoints at random (by flipping a coin) and adds the chosen vertex into S.

b. Bounding Cardinality:

Let S∗ denote any node cover of minimum cardinality. Prove that the cardinality of S can be bounded with respect to S∗.

c. Weighted Randomized Node Cover Extension:

Extend the algorithm proposed in part (a) to solve Weighted Randomized Node Cover for graphs having a different weight Wv for each vertex
v ∈ V.

    Solution:

    a. Randomized Node Cover Algorithm:
        The below algorithm runs in linear time and always outputs a vertex cover.

        Initialize S = ∅
        for all e = (u, v) in E:
            if neither u nor v belongs to S:
                Randomly choose u or v with equal probability.
                Add the chosen vertex into S.
        return S

    b. Bounding Cardinality:

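        Yes, the cardinality of S can be bounded: E[|S|] ≤ 2 * |OPT|.

        Proof: 
        Let OPT denote any vertex cover of minimum cardinality, and Si denote the contents of set S after completing the i-th iteration
        of the loop. Whenever the algorithm adds a vertex, it is handling an edge (u, v) with neither endpoint in S. OPT covers that edge, so at least one of u and v lies in OPT, and the coin flip picks a vertex of OPT with probability at least 1/2. By induction on i, E[|Si ∩ OPT|] ≥ E[|Si \ OPT|]. Since |S ∩ OPT| ≤ |OPT|,
        E[|S|] = E[|S ∩ OPT|] + E[|S \ OPT|] ≤ 2 * E[|S ∩ OPT|] ≤ 2 * |OPT|.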

    c. Weighted Randomized Node Cover Extension:

        Initialize S = ∅
        for all e = (u, v) in E:
            if neither u nor v belongs to S:
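                Randomly choose u with probability w_v / (w_u + w_v) and v with probability w_u / (w_u + w_v).
                Add the chosen vertex into S.
        return S

        In this extension, we choose an endpoint of an uncovered edge with a probability inversely proportional to its weight, normalized so that the two probabilities sum to 1. This ensures that vertices with lower weights have a higher probability of being chosen. The algorithm still runs in linear time, and the argument from part (b), applied to weights instead of counts, bounds the expected weight of S by twice the weight of a minimum-weight cover.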
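
        The pseudocode for parts (a) and (c) can be sketched in Python as follows. The function name and the optional `rng` parameter are hypothetical, added only so that runs can be reproduced; with all weights equal to 1 the choice reduces to the fair coin flip of part (a).

```python
import random

def randomized_weighted_vertex_cover(edges, weight, rng=None):
    """One pass over the edges, picking an endpoint of each uncovered edge.

    edges  -- iterable of (u, v) pairs
    weight -- dict: vertex -> positive weight
    rng    -- optional random.Random instance (assumed, for reproducibility)
    """
    if rng is None:
        rng = random.Random()
    cover = set()
    for u, v in edges:
        if u in cover or v in cover:
            continue  # edge already covered
        # P(pick u) = w_v / (w_u + w_v): the lighter endpoint is more likely.
        p_u = weight[v] / (weight[u] + weight[v])
        cover.add(u if rng.random() < p_u else v)
    return cover

edges = [(1, 2), (2, 3), (3, 4), (4, 1)]
weight = {1: 1.0, 2: 5.0, 3: 2.0, 4: 10.0}
print(randomized_weighted_vertex_cover(edges, weight, random.Random(0)))
```

        Whatever the coin flips, the output is a vertex cover, because an edge is skipped only when one of its endpoints is already in the set.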

Q6. You are given an undirected graph. Determine if the graph contains a cycle.

Input:

A list of edges representing the undirected graph.
Output:

Return True if the graph contains a cycle, False otherwise.

    Solution:

In [1]:
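def has_cycle(graph_edges):
    # Union-find: every vertex starts as the root of its own set.
    parent = {}
    for u, v in graph_edges:
        parent.setdefault(u, u)
        parent.setdefault(v, v)

    def find(v):
        # Walk up to the root of v's set (iterative, so no recursion limit).
        while parent[v] != v:
            v = parent[v]
        return v

    def union_set(x_root, y_root):
        # Merge two sets, given their roots.
        parent[x_root] = y_root

    for edge in graph_edges:
        x = find(edge[0])
        y = find(edge[1])

        # Both endpoints are already connected, so this edge closes a cycle.
        if x == y:
            return True
        union_set(x, y)

    return False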
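
As a cross-check on the union-find approach, the same question can be answered with a depth-first search: an undirected graph has a cycle exactly when the search reaches an already-visited vertex through an edge other than the one it arrived by. The sketch below is an alternative, not the solution above; the name `has_cycle_dfs` is hypothetical, and like the union-find version it treats self-loops and parallel edges as cycles.

```python
from collections import defaultdict

def has_cycle_dfs(graph_edges):
    """Return True if the undirected graph given as (u, v) pairs contains a cycle."""
    adjacency = defaultdict(list)
    for index, (u, v) in enumerate(graph_edges):
        # Store the edge index so parallel edges can be told apart.
        adjacency[u].append((v, index))
        adjacency[v].append((u, index))

    visited = set()
    for root in list(adjacency):
        if root in visited:
            continue
        visited.add(root)
        stack = [(root, -1)]  # (vertex, index of the edge used to reach it)
        while stack:
            vertex, via = stack.pop()
            for neighbor, index in adjacency[vertex]:
                if index == via:
                    continue  # do not walk back along the same edge
                if neighbor in visited:
                    return True  # reached a seen vertex by a different edge
                visited.add(neighbor)
                stack.append((neighbor, index))
    return False

print(has_cycle_dfs([(1, 2), (2, 3), (3, 1)]))  # triangle
print(has_cycle_dfs([(1, 2), (2, 3), (3, 4)]))  # simple path
```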

Q7. You are part of a cooperative apartment, the Enchanted Kitchen Collective, where you and n - 1 others are responsible for cooking dinner on each of the next n nights. However, everyone has certain nights when they cannot cook due to various commitments. Your goal is to create a dinner schedule to maximize the number of matched nights between people and nights, while minimizing the cost of hiring external cooks.

Formulate this problem as a maximum flow problem, considering the constraints and preferences of each person regarding the nights they can or cannot cook.

Input:

n: Number of nights and people.
unavailability: A list of sets where unavailability[i] represents the set of nights when the ith person cannot cook.
Output:

A schedule that maximizes the number of matched nights between people and nights.
If a person is not scheduled for any night, they must pay $200 for hiring a cook.
Constraints:

1 <= n <= 20
Each person can be unavailable for at most n - 1 nights.

    Solution:
    To formulate the problem as a maximum flow problem, we create a graph in which every unit of flow from the source to the sink passes through one person and one night, respecting the availability constraints. We'll introduce a source node 's', a sink node 't', and nodes representing each person and each night. An edge from a person to a night exists only if that person can cook on that night.

    Here's the formulation:

    Nodes:

    's': Source node.
    't': Sink node.
    Each person i is represented as a node 'Person_i'.
    Each night j is represented as a node 'Night_j'.
    Edges:

    An edge from 's' to each 'Person_i' with a capacity of 1 (each person can cook at most once).
    An edge from each 'Person_i' to 'Night_j' if person i is available on night j. The capacity of this edge is infinity.
    An edge from each 'Night_j' to 't' with a capacity of 1 (each night can have at most one cook).
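    Constraints:

    A person cannot cook on more than one night.
    A night cannot have more than one cook.
    Cost:
    Each person who is not scheduled for any night pays $200, so the total cost is $200 * (n - maximum flow).
    Objective:
    Maximize the flow from 's' to 't' while respecting the constraints. Every unit of flow is one matched person-night pair, so maximizing the flow also minimizes the cost.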

    Now, let's implement this:

In [10]:
import networkx as nx

def dinner_scheduler(n, unavailability):
    # Create a directed graph
    G = nx.DiGraph()

    # Add nodes for people, nights, source, and sink
    people = [f'Person_{i + 1}' for i in range(n)]
    nights = [f'Night_{i + 1}' for i in range(n)]

    G.add_nodes_from(['source', 'sink'] + people + nights)

    # Add edges with capacities
    for person in people:
        G.add_edge('source', person, capacity=1)

    for i, person in enumerate(people):
        for j, night in enumerate(nights):
            if j + 1 not in unavailability[i]:
                G.add_edge(person, night, capacity=float('inf'))

    for night in nights:
        G.add_edge(night, 'sink', capacity=1)

    # Find the maximum flow
    flow_value, flow_dict = nx.maximum_flow(G, 'source', 'sink')

    # Extract the schedule from the flow dictionary
    schedule = {person: [] for person in people}
    for person, nights_flow in flow_dict.items():
        for night, flow in nights_flow.items():
            if flow > 0 and night.startswith('Night'):
                schedule[person].append(night)

    return schedule, flow_value

# Example usage
n = 4
unavailability = [{1, 3}, {2}, {1}, {2, 4}]
schedule, max_flow = dinner_scheduler(n, unavailability)

# Print the schedule and maximum flow
print("Schedule:")
for person, nights in schedule.items():
    print(f"{person} can cook on nights: {', '.join(nights)}")

print("\nMaximum Flow:", max_flow)


Schedule:
Person_1 can cook on nights: Night_4
Person_2 can cook on nights: Night_3
Person_3 can cook on nights: Night_2
Person_4 can cook on nights: Night_1

Maximum Flow: 4


    This solution formulates the problem as a maximum flow problem, calls networkx's maximum_flow function to solve it, and reads the schedule off the returned flow dictionary: a person is assigned to a night exactly when the edge between them carries one unit of flow. The schedule is printed, showing the night each person is scheduled to cook. The maximum flow represents the maximum number of matched nights; here it is 4, so nobody has to hire a cook.
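
    Because every capacity in this network is 1 on the source and sink sides, the maximum flow is a maximum bipartite matching, which can also be computed directly with augmenting paths. The sketch below does that without networkx and adds the hiring cost. The function name `schedule_dinners` and the `fee` parameter are hypothetical; the $200 default comes from the problem statement.

```python
def schedule_dinners(n, unavailability, fee=200):
    """Maximum bipartite matching via augmenting paths.

    Nights are numbered 1..n and people 0..n-1, matching the input format
    above. Returns (assignment, cost), where assignment maps person -> night.
    """
    cook_for_night = {}  # night -> person currently assigned

    def try_assign(person, seen):
        for night in range(1, n + 1):
            if night in unavailability[person] or night in seen:
                continue
            seen.add(night)
            # Take a free night, or move its current cook to another night.
            if night not in cook_for_night or try_assign(cook_for_night[night], seen):
                cook_for_night[night] = person
                return True
        return False

    for person in range(n):
        try_assign(person, set())

    assignment = {person: night for night, person in cook_for_night.items()}
    unmatched = n - len(assignment)
    return assignment, unmatched * fee

assignment, cost = schedule_dinners(4, [{1, 3}, {2}, {1}, {2, 4}])
print(assignment, cost)
```

    The particular nights assigned may differ from the networkx output, since a maximum matching is generally not unique; only the number of matched people, and hence the cost, is determined.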

Q8. Suppose you live with n − 1 other people, at a popular off-campus cooperative apartment, the Ice-Cream and Rainbows Collective. Over the next n nights, each of you is supposed to cook dinner for the co-op exactly once, so that someone cooks on each of the nights. Of course, everyone has scheduling conflicts with some of the nights (e.g., algorithms exams, Miley concerts, etc.), so deciding who should cook on which night becomes a tricky task. For concreteness, let’s label the people, P ∈ {p1, . . . , pn}, the nights, N ∈ {n1,...,nn} and for person pi, there’s a set of nights Si ⊂ {n1,...,nn} when they are not able to cook. A person cannot leave Si empty. If a person isn’t doesn’t get scheduled to cook in any of the n nights they must pay $200 to hire a cook.

A.  Suppose you are a teacher with n classes and n time slots available for each class. Each class must be scheduled for exactly one time slot, but some time slots are not available due to various scheduling conflicts. Let Ci be the set of time slots when class i cannot be scheduled. Formulate this problem as a maximum flow problem that schedules the maximum number of classes.

B. Consider a group of n friends who are planning a road trip, and they must decide who will drive the car each day. There are n days in total, and each friend has a set of days when they cannot drive due to other commitments. Let Fi be the set of days when friend i cannot drive. Can all n friends be assigned a driving day without anyone having to drive twice? Prove that it can or cannot be done.

    Solution:
    A. Formulate the Class Scheduling Problem as a Maximum Flow Problem:

    Let's create a directed graph to represent the class scheduling problem as a maximum flow problem.

    1. Nodes:
    Create a source node 's'.
    Create a sink node 't'.
    For each class, create a node in set A.
    For each time slot, create a node in set B.
    
    2. Edges and Capacities:
    Add edges from 's' to each class node in A with capacity 1.
    Add edges from each class node i in A to each time slot node j in B with j ∉ Ci, with capacity 1.
    Add edges from each time slot node in B to 't' with capacity 1.

    3. Constraints:
    There is no edge from class i to time slot j when j ∈ Ci, so no flow can schedule class i in an unavailable slot. The unit capacities out of 's' and into 't' ensure that each class gets at most one slot and each slot hosts at most one class.
    
    4. Objective:
    Maximize the total flow from 's' to 't', representing the maximum number of classes scheduled.
    Solving the maximum flow problem on this graph will provide the optimal scheduling of classes.

    B. Road Trip Assignment Problem:

    This is a bipartite perfect matching problem: friends on one side, days on the other, and an edge between friend i and day j whenever j ∉ Fi.

    Given n friends and n days, each friend has a set of days when they cannot drive. The question is whether we can assign driving days to friends in such a way that nobody has to drive twice.

    Proof:

    It cannot always be done. Take n = 2 with F1 = F2 = {day 2}: both friends can only drive on day 1, so one of them is left without a day.

    It can be done exactly when Hall's condition holds: for every group X of friends, the days on which at least one member of X can drive number at least |X|. By Hall's marriage theorem, this is equivalent to the existence of a perfect matching between friends and days.

    The condition can be tested in polynomial time by running the maximum flow construction from part A (with friends in place of classes and days in place of time slots) and checking whether the flow value is n.

    In summary, the answer depends on the sets Fi: an assignment exists if and only if Hall's condition holds, or equivalently if and only if the maximum flow equals n.
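
    Whether an assignment exists can also be tested on small instances through Hall's marriage theorem: every friend can get a distinct day exactly when each group of friends has, collectively, at least as many available days as it has members. The sketch below checks that condition directly. The function name `hall_violation` is hypothetical, and the check enumerates all subsets, so it is meant for small n only.

```python
from itertools import combinations

def hall_violation(n, unavailable):
    """Return a group of friends violating Hall's condition, or None.

    unavailable[i] is F_i, the set of days (numbered 1..n) on which friend i
    cannot drive. Enumerates all 2^n groups, so use it for small n only.
    """
    all_days = set(range(1, n + 1))
    available = [all_days - set(days) for days in unavailable]
    for size in range(1, n + 1):
        for group in combinations(range(n), size):
            days = set().union(*(available[i] for i in group))
            # Fewer available days than friends: no valid assignment exists.
            if len(days) < size:
                return set(group)
    return None

print(hall_violation(2, [{2}, {2}]))                 # both friends need day 1
print(hall_violation(4, [{1, 3}, {2}, {1}, {2, 4}])) # no violation
```

    A returned group is a short certificate that no assignment exists; `None` means every friend can be given a distinct driving day.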

Q9. Suppose you have a set of n tasks that need to be completed, and m people who can each work on a subset of the tasks. Each task takes a certain amount of time to complete, and each person has a certain amount of time available to work. The goal is to assign tasks to people such that all tasks are completed, and each person's available time is not exceeded.
Express this problem as a maximum flow problem that assigns the maximum number of tasks to people.

    Solution:
    To express the given problem as a maximum flow problem, we can model it as a bipartite graph where one set of nodes represents tasks, another set represents people, and edges represent the possibility of assigning a task to a person. Flow is measured in units of time, and we assume that the work on a task may be split among the people who can do it. (If every task had to go to a single person, the problem would become NP-hard in general and a plain maximum flow would not capture it.)

    Here's how we can formulate this as a maximum flow problem:

    1. Create Nodes:
    Create a source node 's'.
    Create a sink node 't'.
    For each task, create a node in set A.
    For each person, create a node in set B.

    2. Assign Capacities:
    Add edges from 's' to each task node in A, with capacities equal to the time required for that task.
    Add an edge from a task node in A to a person node in B only if that person can work on the task, with capacity equal to the time the person has available.
    Add edges from each person node in B to 't', with capacities equal to the total time the person has available.

    3. Flow:
    The flow on an edge from a task to a person is the amount of time that person spends on that task.

    4. Objective:
    Maximize the total flow from 's' to 't', which is the total amount of work assigned while respecting the time constraints of each person.
    All tasks can be completed if and only if the maximum flow equals the sum of the task times, that is, every edge leaving 's' is saturated. If every task takes one unit of time, the flow value is exactly the number of tasks assigned.

    Here's a graphical representation of the graph:

           s
         / | \
        A  A  A
        |  |  |
        B  B  B
         \ | /
           t

    In this graph, A represents tasks, B represents people, and the edges have capacities corresponding to time constraints. The goal is to find the maximum flow from 's' to 't' and compare it with the total time the tasks require.
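
    A small self-contained sketch of this network, solved with breadth-first augmenting paths (Edmonds-Karp), is given below. The function name `assign_task_time` and the input format are assumptions for illustration, and the sketch assumes that work on a task may be split across several people, so flow is measured in units of time.

```python
from collections import defaultdict, deque

def assign_task_time(task_time, person_time, can_do):
    """Return (max_flow, all_done) for the task/person network.

    task_time   -- dict: task -> time required
    person_time -- dict: person -> time available
    can_do      -- set of (task, person) pairs the person is able to work on
    """
    cap = defaultdict(lambda: defaultdict(int))
    for task, need in task_time.items():
        cap["s"][("task", task)] = need
    for task, person in can_do:
        cap[("task", task)][("person", person)] = person_time[person]
    for person, free in person_time.items():
        cap[("person", person)]["t"] = free

    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {"s": None}
        queue = deque(["s"])
        while queue and "t" not in parent:
            u = queue.popleft()
            for v, c in list(cap[u].items()):
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if "t" not in parent:
            break
        # Recover the path, find its bottleneck, and push that much flow.
        path, v = [], "t"
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push  # residual (reverse) capacity
        total += push
    return total, total == sum(task_time.values())

print(assign_task_time({"A": 3, "B": 2}, {"p": 4, "q": 2},
                       {("A", "p"), ("B", "p"), ("B", "q")}))
```

    The second component of the result says whether every edge out of 's' is saturated, which is the condition for all tasks being completed.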

Q10. Consider the Knapsack Problem, where we have a set of n items, each with a weight w_i and a value v_i. We also have a knapsack of capacity W. The goal is to choose a subset of items that fits into the knapsack and maximizes the total value.

A.  Is the Knapsack Problem in P? If so, prove it.
B. Suppose we restrict the weight of each item to be at most k. We call this the k-Knapsack Problem. Is the k-Knapsack Problem in NP? If so, prove it.
C. Is the k-Knapsack Problem NP-complete? If so, prove it.

    Solution:
    A. Is the Knapsack Problem in P? If so, prove it.

    The Knapsack Problem is not known to be in P. Its decision version (is there a subset of items with total weight at most W and total value at least V?) is NP-complete, so a polynomial-time algorithm exists only if P = NP.
    We can solve it using a dynamic programming algorithm in O(nW) time, where n is the number of items and W is the capacity of the knapsack. The algorithm works by computing the optimal value for all subproblems of choosing a subset of the first i items that fits into a knapsack of capacity j, and using these values to compute the optimal value for choosing a subset of all n items that fits into the knapsack of capacity W. This algorithm is guaranteed to find the optimal solution, but its running time is pseudo-polynomial rather than polynomial: W is written with about log W bits, so O(nW) is exponential in the size of the input. The algorithm is efficient when W is moderate and impractical when W is very large.

    B. Suppose we restrict the weight of each item to be at most k. We call this the k-Knapsack Problem. Is the k-Knapsack Problem in NP? If so, prove it.

    Yes, the k-Knapsack Problem is in NP, taking its decision version: is there a subset of items with total weight at most W and total value at least a target V? The certificate is the subset of items chosen. To verify it, we check that each item weighs at most k, add up the weights and confirm that the total is at most W, and add up the values and confirm that the total is at least V. This takes time linear in the size of the input, so the k-Knapsack Problem is in NP.

    C. Is the k-Knapsack Problem NP-complete? If so, prove it.

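    Yes, if k is part of the input, the k-Knapsack Problem is NP-complete. We can reduce the Subset Sum Problem to the k-Knapsack Problem. (If k is a fixed constant, no subset weighs more than nk, so the dynamic program from part A runs in O(n * nk) time, which is polynomial; in that case the problem is in P, and it is NP-complete only if P = NP.)

    Reduction: Subset Sum to k-Knapsack

    Given an instance of the Subset Sum Problem with a set of positive integers {a1, a2, ..., an} and a target sum S, we can construct an instance of the k-Knapsack Problem.

    Create items with weights and values equal to the integers in the Subset Sum instance: (w_i, v_i) = (a_i, a_i) for each i.
    Set k to the largest a_i, set the capacity of the knapsack W to S, and set the value target V to S.
    Now, the question of whether there exists a subset of the integers that sums to S is equivalent to asking whether there exists a subset of items with total weight at most W and total value at least V: since weight and value coincide, both conditions hold only when the total is exactly S.
    This reduction is polynomial-time, and it shows that if we can solve the k-Knapsack Problem efficiently, we can also solve the Subset Sum Problem efficiently. Together with part B, this makes the k-Knapsack Problem NP-complete.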
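
    A Subset Sum instance can be turned into a single knapsack question as in the sketch below: each integer becomes an item whose weight and value are both that integer, the capacity is taken to be the target sum S, and the answer is yes exactly when the best achievable value reaches S. The function names are hypothetical, the integers are assumed to be positive, and k is taken to be the largest integer, so it is part of the input.

```python
def knapsack_max_value(weights, values, capacity):
    """0/1 knapsack by dynamic programming over capacities, O(n * capacity) time."""
    best = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        # Iterate downwards so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]

def subset_sum_via_knapsack(numbers, target):
    """Decide Subset Sum for positive integers with one knapsack query."""
    # Item i has weight a_i and value a_i; the capacity is the target S.
    # Total value <= total weight <= S, so value S is reached only by a
    # subset summing to exactly S.
    return knapsack_max_value(numbers, numbers, target) == target

print(subset_sum_via_knapsack([3, 34, 4, 12, 5, 2], 9))   # 4 + 5 = 9
print(subset_sum_via_knapsack([3, 34, 4, 12, 5, 2], 30))  # no subset sums to 30
```

    The solver used here is the pseudo-polynomial dynamic program, so this demonstrates the correctness of the mapping rather than an efficient algorithm for Subset Sum.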


    Below is the Python implementation of the Knapsack Problem and the k-Knapsack Problem.

In [16]:
def knapsack(weights, values, W):
    n = len(weights)
    dp = [[0] * (W + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):
        for w in range(1, W + 1):
            if weights[i - 1] <= w:
                dp[i][w] = max(values[i - 1] + dp[i - 1][w - weights[i - 1]], dp[i - 1][w])
            else:
                dp[i][w] = dp[i - 1][w]

    return dp[n][W]

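def k_knapsack(weights, values, k, W):
    # k-Knapsack: the same problem, but every item must weigh at most k.
    if any(w > k for w in weights):
        raise ValueError("every item weight must be at most k")
    # No subset can weigh more than n * k, so the table never needs to be wider.
    return knapsack(weights, values, min(W, len(weights) * k))

# Example usage
weights = [2, 3, 1, 4]
values = [5, 2, 8, 6]
W = 5  # knapsack capacity
k = 4  # upper bound on the weight of any single item

result_knapsack = knapsack(weights, values, W)
result_k_knapsack = k_knapsack(weights, values, k, W)

print("Maximum value for Knapsack:", result_knapsack)
print("Maximum value for k-Knapsack:", result_k_knapsack)

Maximum value for Knapsack: 14
Maximum value for k-Knapsack: 14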


REFLECTION:

    ChatGPT was immensely helpful in designing these algorithmic problems. ChatGPT assisted me by,

    -Idea Generation: ChatGPT helped brainstorm ideas based on the structure of the example problem. It guided the creation of a unique scenario involving finding a celebrity.

    -Validation: ChatGPT validated the problem to ensure it maintained the essence of the example while being non-trivial. It helped avoid mere replication.

    -Clarification: I could seek clarifications and suggestions from ChatGPT regarding the problem statement, constraints, and possible variations, which improved the overall quality of the problem.

    The main challenge was striking a balance between complexity and simplicity. Each problem had to be complex enough to require a careful algorithm but not so complex that it could not be solved in a reasonable amount of time.

    This exercise highlighted the importance of clarity in problem statements, specification, and algorithmic thinking. It also highlighted the usefulness of ChatGPT in formulating the problems and implementing their solutions.
    Overall, it was a valuable experience in algorithmic problem creation and solving.