# Exercise — Scheduling Optimisation

## Scenario

You must assign 5 tasks to 5 engineers.
Each engineer has a task-specific “cost” (lower is better).
You must assign exactly 1 task per engineer.

### Input

```python
import numpy as np

cost_matrix = np.array([
    [9, 2, 7, 8, 6],
    [6, 4, 3, 7, 5],
    [5, 8, 1, 8, 7],
    [7, 6, 9, 4, 3],
    [8, 5, 6, 9, 4]
])
```
- Rows: engineers
- Columns: tasks

### Task

Write:

```python
def assign_tasks(cost_matrix: np.ndarray):
    ...
```

### Requirements

- Use scipy.optimize.linear_sum_assignment.
- Return a dictionary:
```python
{
    "assignments": [(engineer_idx, task_idx), ...],
    "total_cost": float
}
```

### Evaluation Criteria
- Correct use of the algorithm
- Type hints
- Nicely formatted output
- No unnecessary loops



## Thinking through the problem

### 1. What is the problem, in plain terms?

We have:
- `n_engineers`
- `n_tasks`
- `cost_matrix[i, j]` = cost if engineer `i` is assigned to task `j`

We want to assign each engineer to exactly one task (and each task to at most one engineer) such that the total cost is minimized.

The function must return:

```python
{
    "assignments": [(engineer_idx, task_idx), ...],
    "total_cost": float
}
```
Where:
- `engineer_idx` and `task_idx` are 0-based indices (consistent with numpy).
- `total_cost` is the sum of `cost_matrix[i, j]` for all chosen pairs.
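
To make the objective concrete, here is a brute-force sketch that simply tries every one-to-one assignment and keeps the cheapest. It is not part of the exercise (the requirements call for `linear_sum_assignment`), and it only works for tiny matrices because it enumerates all `n!` permutations; the names `best_perm` and `best_cost` are illustrative.

```python
from itertools import permutations

import numpy as np

# Exercise cost matrix: rows are engineers, columns are tasks.
cost_matrix = np.array([
    [9, 2, 7, 8, 6],
    [6, 4, 3, 7, 5],
    [5, 8, 1, 8, 7],
    [7, 6, 9, 4, 3],
    [8, 5, 6, 9, 4],
])

n = cost_matrix.shape[0]
engineers = np.arange(n)

# Enumerate every way to give each engineer a distinct task (n! options)
# and keep the cheapest one. perm[i] is the task given to engineer i.
best_perm = min(
    permutations(range(n)),
    key=lambda perm: cost_matrix[engineers, list(perm)].sum(),
)
best_cost = float(cost_matrix[engineers, list(best_perm)].sum())

print(list(zip(range(n), best_perm)), best_cost)
```

For this 5 x 5 matrix that is only 120 candidates, which makes it a handy reference answer to compare the real solver against.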



### 2. Shape and assumptions

`cost_matrix` is a 2D numpy array of shape:
```python
(n_engineers, n_tasks)
```

Some quick considerations:
- Typical assignment problems are square: the same number of workers and tasks.
- `scipy.optimize.linear_sum_assignment` can handle rectangular matrices.
    - If there are more tasks than engineers, we will assign each engineer to exactly one task, and some tasks remain unassigned.
    - If there are more engineers than tasks, some engineers remain unassigned.
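
The rectangular behaviour described above can be checked with a small sketch. The 2 x 3 matrix below is made up for illustration (it is not part of the exercise), and its transpose serves as the "more engineers than tasks" case.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical 2 engineers x 3 tasks matrix (illustrative values only).
wide = np.array([
    [4, 1, 6],
    [2, 0, 5],
])
rows, cols = linear_sum_assignment(wide)
wide_pairs = list(zip(rows.tolist(), cols.tolist()))
wide_total = float(wide[rows, cols].sum())

# Transposed: 3 engineers x 2 tasks, so one engineer stays unassigned.
tall = wide.T
rows_t, cols_t = linear_sum_assignment(tall)
tall_pairs = list(zip(rows_t.tolist(), cols_t.tolist()))
tall_total = float(tall[rows_t, cols_t].sum())

print("wide:", wide_pairs, wide_total)
print("tall:", tall_pairs, tall_total)
```

In both cases the solver returns `min(n_engineers, n_tasks)` pairs, so the leftover task or engineer simply does not appear in the output.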

<!-- The problem statement is a bit vague about this, but since they just say “cost matrix (n_engineers, n_tasks)” and don’t complain about rectangular, we can just trust linear_sum_assignment to behave sensibly. -->

### 3. Using `linear_sum_assignment`







`linear_sum_assignment(cost_matrix)` returns two arrays:
```python
row_ind, col_ind = linear_sum_assignment(cost_matrix)
```

- `row_ind[k]` = index of an engineer
- `col_ind[k]` = index of the task assigned to that engineer

These indices are aligned, i.e. assignment `k` is the pair `(row_ind[k], col_ind[k])`.

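So, the list of pairs we want is just the two arrays zipped together. For example, with the exercise matrix:
```python
import numpy as np
from scipy.optimize import linear_sum_assignment

cost_matrix_demo = np.array([
    [9, 2, 7, 8, 6],
    [6, 4, 3, 7, 5],
    [5, 8, 1, 8, 7],
    [7, 6, 9, 4, 3],
    [8, 5, 6, 9, 4]
])

# Solve, then pair each engineer index with its assigned task index
row_ind, col_ind = linear_sum_assignment(cost_matrix_demo)
assignments = list(zip(row_ind.tolist(), col_ind.tolist()))
```
You will get `row_ind` and `col_ind` arrays, and each `(row_ind[k], col_ind[k])` is an `(engineer_idx, task_idx)` pair.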






### 4. Computing the total cost

Once we have the assignments, we compute:
$$
\text{total\_cost} = \sum_k \text{cost\_matrix}\big[\text{row\_ind}[k],\ \text{col\_ind}[k]\big]
$$
In NumPy terms:
- We can loop:
```python
total_cost = 0.0
for i, j in assignments:
    total_cost += cost_matrix[i, j]
```
- Or we can use advanced indexing, which avoids the explicit loop:
```python
total_cost = float(cost_matrix[row_ind, col_ind].sum())
```
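
As a quick check that the two approaches agree, the sketch below runs both on the exercise matrix. The variable names `loop_total`, `picked`, and `vector_total` are illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

cost_matrix = np.array([
    [9, 2, 7, 8, 6],
    [6, 4, 3, 7, 5],
    [5, 8, 1, 8, 7],
    [7, 6, 9, 4, 3],
    [8, 5, 6, 9, 4],
])
row_ind, col_ind = linear_sum_assignment(cost_matrix)

# Loop version: add up one cost per (engineer, task) pair.
loop_total = 0.0
for i, j in zip(row_ind, col_ind):
    loop_total += cost_matrix[i, j]

# Vectorised version: advanced indexing picks the same entries at once.
picked = cost_matrix[row_ind, col_ind]
vector_total = float(picked.sum())

print(picked.tolist(), loop_total, vector_total)
```

The `float(...)` call matters for the return type: `.sum()` on an integer array yields a NumPy integer scalar, while the exercise asks for a plain `float`.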

### 5. Edge cases / sanity checks

Conceptually:
- `cost_matrix` should be numeric and 2D.
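- Infeasibility is not normally a concern: for a finite cost matrix the algorithm always returns a minimum-cost assignment. (`linear_sum_assignment` raises a `ValueError` if the matrix contains NaN, or if infinite entries leave no feasible assignment.)
- Negative costs are allowed by `linear_sum_assignment`, but cost matrices are usually non-negative.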
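
If explicit input checks are wanted, they could look something like the sketch below. `validate_cost_matrix` is a hypothetical helper, not part of the exercise or of SciPy, and the exact set of checks is an assumption about what "numeric and 2D" should mean here.

```python
import numpy as np


def validate_cost_matrix(cost_matrix) -> np.ndarray:
    """Hypothetical helper: return the input as a 2D numeric array or raise."""
    arr = np.asarray(cost_matrix)
    if arr.ndim != 2:
        raise ValueError(f"cost_matrix must be 2D, got {arr.ndim}D")
    # Accept integer and floating-point dtypes only (assumption: no bool/complex).
    is_real_number = np.issubdtype(arr.dtype, np.integer) or np.issubdtype(
        arr.dtype, np.floating
    )
    if not is_real_number:
        raise TypeError(f"cost_matrix must be numeric, got dtype {arr.dtype}")
    if np.isnan(arr).any():
        raise ValueError("cost_matrix must not contain NaN")
    return arr


checked = validate_cost_matrix([[9, 2], [6, 4]])
print(checked.shape)
```

Calling this before the solver turns malformed input into an early, readable error rather than one raised deep inside SciPy.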

Sanity checks after implementation:
- Number of assignments equals `min(n_engineers, n_tasks)` (for rectangular).
- Each engineer index appears at most once.
- Each task index appears at most once.
- `total_cost` equals sum of `cost_matrix[i, j]` for each pair in `assignments`.
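
The four checks above translate almost directly into assertions. The sketch below uses a hypothetical `check_assignment` helper and feeds it a result built straight from `linear_sum_assignment` on the exercise matrix.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def check_assignment(cost_matrix, assignments, total_cost) -> None:
    """Hypothetical helper: assert the sanity checks listed above."""
    n_engineers, n_tasks = cost_matrix.shape
    engineers = [i for i, _ in assignments]
    tasks = [j for _, j in assignments]

    # One pair per engineer or per task, whichever is scarcer.
    assert len(assignments) == min(n_engineers, n_tasks)
    # No engineer and no task is used twice.
    assert len(set(engineers)) == len(engineers)
    assert len(set(tasks)) == len(tasks)
    # Reported total matches the costs of the chosen pairs.
    expected = sum(cost_matrix[i, j] for i, j in assignments)
    assert np.isclose(total_cost, expected)


cost_matrix = np.array([
    [9, 2, 7, 8, 6],
    [6, 4, 3, 7, 5],
    [5, 8, 1, 8, 7],
    [7, 6, 9, 4, 3],
    [8, 5, 6, 9, 4],
])
row_ind, col_ind = linear_sum_assignment(cost_matrix)
assignments = list(zip(row_ind.tolist(), col_ind.tolist()))
total_cost = float(cost_matrix[row_ind, col_ind].sum())

check_assignment(cost_matrix, assignments, total_cost)
print("all sanity checks passed")
```

Note that these checks confirm the result is a valid assignment with a consistent total; they do not by themselves prove it is optimal.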

## Code implementation

```python
from typing import Any, Dict, List, Tuple

import numpy as np
from scipy.optimize import linear_sum_assignment


def assign_tasks(cost_matrix: np.ndarray) -> Dict[str, Any]:
    """
    Assign tasks to engineers so that the total cost is minimized.

    This is the classic linear assignment problem: given engineers and tasks,
    find the one-to-one matching with the lowest total cost. It is a fundamental
    optimization problem in operations research, useful for resource allocation,
    scheduling, and matching.

    linear_sum_assignment solves it exactly in polynomial time. (Recent SciPy
    versions implement a modified Jonker-Volgenant algorithm; older versions
    used the Hungarian algorithm.)

    Args:
        cost_matrix: 2D numpy array of shape (n_engineers, n_tasks).
                     cost_matrix[i, j] is the cost of assigning engineer i to task j.
                     Lower values are better (we're minimizing total cost).

    Returns:
        {
            "assignments": [(engineer_idx, task_idx), ...],
            "total_cost": float
        }
        Each tuple (i, j) in assignments means engineer i is assigned to task j.
        total_cost is the sum of cost_matrix[i, j] for all assignment pairs.
    """
    # STEP 1: Solve the assignment problem.
    # linear_sum_assignment returns two aligned arrays:
    #   - row_ind: engineer indices in the optimal assignment
    #   - col_ind: task indices assigned to those engineers
    # so engineer row_ind[k] is assigned to task col_ind[k].
    # Unlike a greedy approach, the result is guaranteed to be optimal.
    row_ind, col_ind = linear_sum_assignment(cost_matrix)

    # STEP 2: Format assignments as a list of (engineer_idx, task_idx) tuples.
    # .tolist() converts NumPy integers to plain Python ints.
    # Example: row_ind=[0,1,2], col_ind=[1,0,2] → [(0,1), (1,0), (2,2)]
    #          means engineer 0→task 1, engineer 1→task 0, engineer 2→task 2
    assignments: List[Tuple[int, int]] = list(zip(row_ind.tolist(), col_ind.tolist()))

    # STEP 3: Compute the total cost of the optimal assignment.
    # Advanced indexing cost_matrix[row_ind, col_ind] extracts one cost per pair;
    # sum() adds them up without an explicit Python loop.
    # Example: for [(0,1), (1,0), (2,2)] this computes
    #          cost_matrix[0,1] + cost_matrix[1,0] + cost_matrix[2,2]
    total_cost = float(cost_matrix[row_ind, col_ind].sum())

    # STEP 4: Return a structured result that is easy to serialize (JSON),
    # display, or use in downstream processing.
    return {
        "assignments": assignments,  # List of (engineer_idx, task_idx) pairs
        "total_cost": total_cost,  # Sum of costs for the optimal assignment
    }


cost_matrix_demo = np.array([
    [9, 2, 7, 8, 6],
    [6, 4, 3, 7, 5],
    [5, 8, 1, 8, 7],
    [7, 6, 9, 4, 3],
    [8, 5, 6, 9, 4],
])

result3 = assign_tasks(cost_matrix_demo)
print("Exercise result:", result3)
```

### Sanity check

Assignments:
- Engineer 0 → Task 1 → cost = 2
- Engineer 1 → Task 0 → cost = 6
- Engineer 2 → Task 2 → cost = 1
- Engineer 3 → Task 3 → cost = 4
- Engineer 4 → Task 4 → cost = 4

Total cost = 2 + 6 + 1 + 4 + 4 = 17.

Output:
```
Exercise result: {'assignments': [(0, 1), (1, 0), (2, 2), (3, 3), (4, 4)], 'total_cost': 17.0}
```
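
Since the evaluation criteria mention nicely formatted output, one option is a small formatting helper along these lines. `format_result` is a hypothetical function, not part of the exercise requirements, and the layout is just one reasonable choice; the `result` dictionary is copied from the output shown above.

```python
from typing import Any, Dict


def format_result(result: Dict[str, Any]) -> str:
    """Hypothetical helper: render an assign_tasks-style result as readable text."""
    lines = [
        f"Engineer {engineer} -> Task {task}"
        for engineer, task in result["assignments"]
    ]
    lines.append(f"Total cost: {result['total_cost']:.1f}")
    return "\n".join(lines)


# Result dictionary as printed by the exercise solution.
result = {
    "assignments": [(0, 1), (1, 0), (2, 2), (3, 3), (4, 4)],
    "total_cost": 17.0,
}
print(format_result(result))
```

Keeping the formatting separate from `assign_tasks` leaves the function's return value in the exact shape the requirements specify.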
