# Lab 05. Heap (Priority Queue)

## Introduction

In this lab, you will gain a deeper understanding of the heap (priority queue) and apply it to two scenarios: scheduling tasks on a single-threaded CPU and dividing an array into subarrays with minimum cost.

Please **note** that NO LATE SUBMISSION will be accepted. If you truly have extraordinary circumstances that prevent you from submitting on time, please send an email with evidence to *Professor WANG* and the TA of your experimental class.

### Goal:

1. To acquire a comprehensive understanding of 'heap' data structures.

2. To acquire practical problem-solving skills by applying the 'heap' data structure in realistic scenarios.

## Task 1. Single-Threaded CPU

You are given n tasks labeled from 0 to n - 1 represented by a 2D integer array tasks, where tasks[i] = [$enqueueTime_i$, $processingTime_i$] means that the i-th task will be available to process at $enqueueTime_i$ and will take $processingTime_i$ to finish processing.

You have a single-threaded CPU that can process at most one task at a time and will act in the following way:

- If the CPU is idle and there are no available tasks to process, the CPU remains idle.

- If the CPU is idle and there are available tasks, the CPU will choose the one with the shortest processing time. If multiple tasks have the same shortest processing time, it will choose the task with the smallest index.

- Once a task is started, the CPU will process the entire task without stopping.

- The CPU can finish a task then start a new one instantly.

**Return the order in which the CPU will process the tasks.**
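The selection rule above (shortest processing time first, smallest index on ties) maps naturally onto Python's `heapq` when each entry is a tuple, because tuples compare element by element. The following is only a minimal sketch of that idea; the variable name `ready` and the sample values are illustrative assumptions, not part of the lab specification.

```python
import heapq

# Each entry is (processingTime, index). Tuples compare element by element,
# so the heap orders by processing time first and by index on ties.
ready = []
heapq.heappush(ready, (4, 1))  # task 1 needs 4 time units
heapq.heappush(ready, (2, 2))  # task 2 needs 2 time units
heapq.heappush(ready, (2, 0))  # task 0 also needs 2 time units

first = heapq.heappop(ready)   # (2, 0): shortest time, smaller index wins the tie
second = heapq.heappop(ready)  # (2, 2)
third = heapq.heappop(ready)   # (4, 1)
print(first, second, third)
```

With this layout no custom comparator is needed; the second tuple element does the tie-breaking.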


Example 1:

Input: tasks = [[1,2],[2,4],[3,2],[4,1]]

Output: [0,2,3,1]

Explanation: The events go as follows: 

- At time = 1, task 0 is available to process. Available tasks = {0}.

- Also at time = 1, the idle CPU starts processing task 0. Available tasks = {}.

- At time = 2, task 1 is available to process. Available tasks = {1}.

- At time = 3, task 2 is available to process. Available tasks = {1, 2}.

- Also at time = 3, the CPU finishes task 0 and starts processing task 2 as it is the shortest. Available tasks = {1}.

- At time = 4, task 3 is available to process. Available tasks = {1, 3}.

- At time = 5, the CPU finishes task 2 and starts processing task 3 as it is the shortest. Available tasks = {1}.

- At time = 6, the CPU finishes task 3 and starts processing task 1. Available tasks = {}.

- At time = 10, the CPU finishes task 1 and becomes idle.


Example 2:

Input: tasks = [[7,10],[7,12],[7,5],[7,4],[7,2]]

Output: [4,3,2,0,1]

Explanation: The events go as follows:

- At time = 7, all the tasks become available. Available tasks = {0,1,2,3,4}.

- Also at time = 7, the idle CPU starts processing task 4. Available tasks = {0,1,2,3}.

- At time = 9, the CPU finishes task 4 and starts processing task 3. Available tasks = {0,1,2}.

- At time = 13, the CPU finishes task 3 and starts processing task 2. Available tasks = {0,1}.

- At time = 18, the CPU finishes task 2 and starts processing task 0. Available tasks = {1}.

- At time = 28, the CPU finishes task 0 and starts processing task 1. Available tasks = {}.

- At time = 40, the CPU finishes task 1 and becomes idle.
 
Constraints:

- tasks.length == n

**TODO: Please write the function below to meet the requirements above.** (20% Marks)

Hint: To simulate the process, note that whenever no tasks are enqueued, the CPU has to wait until the smallest enqueue time among the unprocessed tasks. You may use sorted() or list.sort() when necessary.
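To make the hint concrete, here is a heap-free reference simulation that may be useful for cross-checking a heap-based answer on small inputs. It is a sketch under assumptions: the name `cpu_tasks_bruteforce` is hypothetical, and the O(n^2) scan is chosen for readability, not efficiency.

```python
def cpu_tasks_bruteforce(tasks: list[list[int]]) -> list[int]:
    """Heap-free O(n^2) reference simulation (hypothetical helper name)."""
    remaining = set(range(len(tasks)))
    time = 0
    order = []
    while remaining:
        available = [i for i in remaining if tasks[i][0] <= time]
        if not available:
            # CPU is idle and nothing is queued: jump to the earliest enqueue time.
            time = min(tasks[i][0] for i in remaining)
            continue
        # Shortest processing time first; the smallest index breaks ties.
        chosen = min(available, key=lambda i: (tasks[i][1], i))
        order.append(chosen)
        remaining.remove(chosen)
        time += tasks[chosen][1]
    return order


print(cpu_tasks_bruteforce([[1, 2], [2, 4], [3, 2], [4, 1]]))  # [0, 2, 3, 1]
```

A heap-based solution should return the same order while avoiding the repeated scan over all remaining tasks.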

import heapq    # you can follow the https://docs.python.org/3/library/heapq.html to find the help of relevant functions

def cpu_tasks(tasks: list[list[int]]) -> list[int]:
    # TODO (20% Marks)
    taskId = [i for i in range(len(tasks))]
    taskId.sort(key = lambda x : tasks[x][0])
    order = []
    heap = []
    ti, pos = 0, 0
    while len(order) < len(tasks):
        if len(heap) == 0:
            # CPU is idle with nothing queued: jump to the next enqueue time
            # (looked up through the sorted index list taskId) and push every
            # task that becomes available at that moment.
            ti = tasks[taskId[pos]][0]
            while pos < len(tasks) and tasks[taskId[pos]][0] == ti:
                heapq.heappush(heap, (tasks[taskId[pos]][1], taskId[pos]))
                pos += 1
        else:
            order.append(heapq.heappop(heap)[1])
            ti += tasks[order[-1]][1]
            while pos < len(tasks) and tasks[taskId[pos]][0] <= ti:
                heapq.heappush(heap, (tasks[taskId[pos]][1], taskId[pos]))
                pos += 1
        # print(f"{order} {heap}")
        # print(f"{ti} {pos}")
    return order

# Example usage:
tasks1 = [[1,2],[2,4],[3,2],[4,1]]
print(cpu_tasks(tasks1))  # Output: [0, 2, 3, 1]

tasks2 = [[7,10],[7,12],[7,5],[7,4],[7,2]]
print(cpu_tasks(tasks2))  # Output: [4, 3, 2, 0, 1]

tasks3 = [[1,2],[3,2],[4,1],[2,4]]
print(cpu_tasks(tasks3))  # Output: [0, 1, 2, 3]

# You are allowed to modify the code structure above to suit your own coding style, but keep it as simple as possible and make sure your code produces the same expected output.

## Task 2. Divide an Array Into Subarrays With Minimum Cost


You are given a __0-indexed__ array of integers `nums` of length `n`, and two __positive__ integers `k` and `dist`.


The __cost__ of an array is the value of its __first__ element. For example, the cost of [1,2,3] is 1 while the cost of [3,4,1] is 3.


You need to divide `nums` into `k` __disjoint contiguous subarrays (A subarray is a contiguous non-empty sequence of elements within an array.)__, such that the difference between the starting index of the __second__ subarray and the starting index of the `kth` subarray should be __less than or equal to__ `dist`. In other words, if you divide `nums` into the subarrays `nums[0..(i_1 - 1)], nums[i_1..(i_2 - 1)], ..., nums[i_(k-1)..(n - 1)]`, then `i_(k-1) - i_1 <= dist`.

_Return the __minimum__ possible sum of the cost of these subarrays._
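One way to read the constraint (an observation, not something the lab text states): the first subarray always starts at index 0, so `nums[0]` is always paid, and the other `k - 1` starting indices must all fit inside some window of `dist + 1` consecutive positions of `nums[1:]`. Under that reading, a brute-force sketch just tries every such window and adds up its `k - 1` smallest values. The function name `minimum_cost_bruteforce` is a hypothetical helper, intended only for checking small inputs.

```python
def minimum_cost_bruteforce(nums: list[int], k: int, dist: int) -> int:
    """Brute-force reference (hypothetical helper): try every window of
    dist + 1 consecutive positions in nums[1:] and keep the k - 1 smallest."""
    n = len(nums)
    best = None
    # The window covers indices start .. start + dist, so start + dist <= n - 1.
    for start in range(1, n - dist):
        window = sorted(nums[start:start + dist + 1])
        cost = sum(window[:k - 1])
        if best is None or cost < best:
            best = cost
    # nums[0] is the cost of the first subarray in every division.
    return nums[0] + best


print(minimum_cost_bruteforce([1, 3, 2, 6, 4, 2], k=3, dist=3))  # 5
```

Sorting every window costs roughly O(n * dist * log dist), which is fine for small tests but too slow for large inputs.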

 

__Example 1:__


> __Input__: nums = [1,3,2,6,4,2], k = 3, dist = 3
>
> __Output__: 5
>
> __Explanation__: The best possible way to divide nums into 3 subarrays is: [1,3], [2,6,4], and [2]. This choice is valid because i_(k-1) - i_1 is 5 - 2 = 3 which is equal to dist. The total cost is nums[0] + nums[2] + nums[5] which is 1 + 2 + 2 = 5.
> It can be shown that there is no possible way to divide nums into 3 subarrays at a cost lower than 5.
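The validity check and cost computation used in the example explanations can be sketched as a small helper. This is an illustration only: the name `division_cost` and the convention of passing the starting indices of all `k` subarrays as a list are assumptions made for this sketch.

```python
def division_cost(nums: list[int], starts: list[int], dist: int):
    """Return the cost of a division given the starting index of each subarray,
    or None when i_(k-1) - i_1 > dist (hypothetical helper for illustration)."""
    increasing = all(a < b for a, b in zip(starts, starts[1:]))
    if starts[0] != 0 or not increasing or starts[-1] >= len(nums):
        raise ValueError("starts must begin at 0 and be strictly increasing valid indices")
    # starts[1] is i_1 (second subarray), starts[-1] is i_(k-1) (k-th subarray).
    if starts[-1] - starts[1] > dist:
        return None
    # The cost of each subarray is its first element.
    return sum(nums[i] for i in starts)


print(division_cost([1, 3, 2, 6, 4, 2], [0, 2, 5], dist=3))  # 5
```

For instance, the division [1,3], [2,6,4], [2] corresponds to `starts = [0, 2, 5]`.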


__Example 2:__

> __Input__: nums = [10,1,2,2,2,1], k = 4, dist = 3
>
> __Output__: 15
>
> __Explanation__: The best possible way to divide nums into 4 subarrays is: [10], [1], [2], and [2,2,1]. This choice is valid because i_(k-1) - i_1 is 3 - 1 = 2 which is less than dist. The total cost is nums[0] + nums[1] + nums[2] + nums[3] which is 10 + 1 + 2 + 2 = 15.
> The division [10], [1], [2,2,2], and [1] is not valid, because the difference between i_(k-1) and i_1 is 5 - 1 = 4, which is greater than dist.
> It can be shown that there is no possible way to divide nums into 4 subarrays at a cost lower than 15.


__Example 3:__

> __Input__: nums = [10,8,18,9], k = 3, dist = 1
> 
> __Output__: 36
>
> __Explanation__: The best possible way to divide nums into 3 subarrays is: [10], [8], and [18,9]. This choice is valid because i_(k-1) - i_1 is 2 - 1 = 1 which is equal to dist. The total cost is nums[0] + nums[1] + nums[2] which is 10 + 8 + 18 = 36.
> The division [10], [8,18], and [9] is not valid, because the difference between i_(k-1) and i_1 is 3 - 1 = 2, which is greater than dist.
> It can be shown that there is no possible way to divide nums into 3 subarrays at a cost lower than 36.
 

__Constraints:__

- 3 <= n <= $10^5$

- 1 <= nums[i] <= $10^9$

- 3 <= k <= n

- k - 2 <= dist <= n - 2

**TODO: Please write the function below to meet the requirements above.** (30% Marks)

Python list operations such as append() and remove() can be used in your code.
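Since this lab is about heaps, one possible heap-based approach is sketched below. It rests on an assumption not stated in the task text: `nums[0]` is always paid, and the remaining `k - 1` starting indices can be taken as the `k - 1` smallest values inside a sliding window of `dist + 1` consecutive positions of `nums[1:]`. The sketch keeps those values in a max-heap (`small`) and the rest of the window in a min-heap (`large`), and discards expired entries lazily. The names `minimum_cost_two_heaps` and `_prune` are hypothetical.

```python
import heapq


def _prune(heap: list[tuple[int, int]], left: int) -> None:
    # Lazy deletion: drop top entries whose index has slid out of the window.
    while heap and heap[0][1] < left:
        heapq.heappop(heap)


def minimum_cost_two_heaps(nums: list[int], k: int, dist: int) -> int:
    n, m = len(nums), k - 1              # m start indices are chosen after index 0
    small: list[tuple[int, int]] = []    # max-heap via negation: (-value, index)
    large: list[tuple[int, int]] = []    # min-heap: (value, index)
    in_small = [False] * n               # which indices currently count toward small_sum
    small_sum = small_count = 0
    best = None
    left = 1                             # the window is nums[left .. right]
    for right in range(1, n):
        if right - left > dist:          # window too wide: expire nums[left]
            if in_small[left]:
                in_small[left] = False
                small_sum -= nums[left]
                small_count -= 1
            left += 1
        _prune(small, left)
        value = nums[right]
        if small and value < -small[0][0]:
            heapq.heappush(small, (-value, right))
            in_small[right] = True
            small_sum += value
            small_count += 1
        else:
            heapq.heappush(large, (value, right))
        if small_count > m:              # too many: move the largest of small to large
            neg, idx = heapq.heappop(small)
            in_small[idx] = False
            small_sum += neg             # neg is the negated value
            small_count -= 1
            heapq.heappush(large, (-neg, idx))
        _prune(large, left)
        while small_count < m and large:  # too few: pull the smallest of large
            val, idx = heapq.heappop(large)
            heapq.heappush(small, (-val, idx))
            in_small[idx] = True
            small_sum += val
            small_count += 1
            _prune(large, left)
        if right - left == dist:         # window holds exactly dist + 1 elements
            best = small_sum if best is None else min(best, small_sum)
    return nums[0] + best


print(minimum_cost_two_heaps([1, 3, 2, 6, 4, 2], k=3, dist=3))  # 5
```

Each element is pushed and popped a bounded number of times, so this sketch runs in roughly O(n log n) time, at the cost of the extra bookkeeping needed for lazy deletion.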

import heapq

def minimumCost(arr : list[int], k : int, dist : int) -> int :
    # f[i][j][s] = min cost of choosing j group starts such that the second group starts at
    # element s and the j-th group starts at element i (1-indexed); -1 means unreachable
    f : list[list[list[int]]] = [[[-1 for i in range(len(arr) + 1)] for j in range(0, k + 1)] for k0 in range(0, len(arr) + 1)]
    for i in range(2, len(arr) + 1):
        f[i][2][i] = arr[i - 1] + arr[0]
    for j in range(3, k + 1):
        for k0 in range(2, len(arr) + 1):
            mnf = f[k0][j - 1][k0]
            for i in range(k0 + 1, min(len(arr), k0 + dist) + 1):
                if mnf != -1 :
                    f[i][j][k0] = mnf + arr[i - 1]
                if f[i][j - 1][k0] != -1 and (mnf == -1 or f[i][j - 1][k0] < mnf) :
                    mnf = f[i][j - 1][k0]
    minCostSum = -1
    for st in range(2, len(arr) + 1):
        for ed in range(st, min(st + dist, len(arr)) + 1):
            if f[ed][k][st] != -1 and (minCostSum == -1 or f[ed][k][st] < minCostSum):
                minCostSum = f[ed][k][st]
    return minCostSum

# Testing the function with the provided examples
print(minimumCost([1, 3, 2, 6, 4, 2], k=3, dist=3))  # Expected Output: 5
print(minimumCost([10, 1, 2, 2, 2, 1], k=4, dist=3))  # Expected Output: 15
print(minimumCost([10, 8, 18, 9], k=3, dist=1))  # Expected Output: 36

# You are allowed to modify the code structure above to suit your own coding style, but keep it as simple as possible and make sure your code produces the same expected output.

## Grading Policy

The marks for this lab are composed of:

* Submission: 50%

* Task1: 20%

* Task2: 30%