# Metadata

**L1 Taxonomy** - Problem Solving & Algorithms

**L2 Taxonomy** - Searching Algorithms

**Subtopic** - Interval Containment Search with Linear Scan

**Use Case** - Given two arrays representing each student’s start and end times for doing homework and a queryTime, determine how many students are actively working at that exact moment. A student i is considered active if queryTime lies between startTime[i] and endTime[i] inclusive. Return the total count of such active students.

**Programming Language** - Python

**Target Model** - o1

# Model Breaking Hints

1) What is the initial use case?
The initial problem asks to determine the number of students actively working on homework at a given queryTime, based on each student's start and end times.

2) Why is the initial use case easy?
It requires only simple iteration over the intervals to count how many include the queryTime, involving basic comparisons without advanced algorithms or data structures.

3) How could we make it harder?
By using Hint 1 to extend each student's homework to multiple intervals and process numerous query times, Hint 2 to introduce dependencies where intervals depend on the completion of others, and Hint 3 to require optimizing for the maximum number of students active simultaneously under these constraints, necessitating advanced algorithms.

4) Which parameters can we change?
We can allow multiple intervals per student, introduce dependencies between intervals (forming a graph), and require computation of maximum concurrency, which involves changing the problem into one that requires advanced data structures and algorithms like graph traversal, topological sorting, and interval scheduling optimization.

5) What can be a final hard prompt?
"Given a set of students where each student has multiple homework intervals with varying durations, and certain intervals cannot start until specific other intervals (possibly from other students) have finished (dependencies), determine the maximum number of students that can be concurrently active at any moment while satisfying all given dependencies. Use advanced algorithms to compute this maximum concurrency under these constraints."

# Setup

```requirements.txt
```


You are tasked with designing a system to track student homework activity across multiple time intervals. Each student may have several intervals during which they work on homework. Some of these intervals cannot begin until other intervals (either their own or from other students) are completed. Additionally, you are given a list of time points (query times), and your goal is to compute how many students are actively working at each time point. Finally, among all these query times, determine the maximum number of students that were active at any single time point.

An interval is considered active for a student at a given query time if:

1. The interval has all of its dependencies completed by that time.
2. The query time lies within the interval's start and end time (inclusive).

A student is considered active at a query time if at least one of their intervals is active at that time.
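
The definition above can be checked directly with a brute-force counter, which is useful as a cross-check for faster solutions. This is a minimal sketch, not the intended solution: the name `count_active_brute` is hypothetical, and it assumes that "completed by that time" means the prerequisite's end time is less than or equal to the query time, and that only direct prerequisites with fixed end times need to be checked.

```python
def count_active_brute(intervals, dependencies, query_time):
    """Count students with at least one active interval at query_time."""
    active = 0
    for student_id, student_intervals in intervals.items():
        for idx, (start, end) in enumerate(student_intervals):
            # Condition 2: the query time lies inside the interval (inclusive).
            if not (start <= query_time <= end):
                continue
            # Condition 1 (assumed inclusive): every direct prerequisite
            # has an end time at or before the query time.
            prereqs = dependencies.get((student_id, idx), [])
            if all(intervals[s][i][1] <= query_time for s, i in prereqs):
                active += 1
                break  # a student is counted at most once
    return active
```

Taking the maximum of this count over all query times gives the required answer, at a cost proportional to the number of queries times the number of intervals and dependencies, which is too slow at the constraint boundaries.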

## Input Format

The input consists of the following:

1. intervals: A dictionary mapping integer student IDs to lists of intervals.
   Each interval is a list of two integers: [start_time, end_time], inclusive.
   Format: `intervals: dict[int, list[list[int]]]`

2. dependencies: A dictionary mapping target intervals to lists of prerequisite intervals.
   Each key is a tuple (student_id, interval_index) and each value is a list of tuples representing dependencies in the format (student_id, interval_index).
   Format: `dependencies: dict[tuple[int, int], list[tuple[int, int]]]`

3. query_times: A list of integers representing times at which the number of active students should be evaluated.
   Format: `query_times: list[int]`

## Output Format

Return a single integer: the maximum number of students who were active at any query time from the given list.

## Constraints

* 0 <= number of students <= 10^4
* 0 <= total number of intervals <= 10^5
* 0 <= number of dependencies <= 10^5
* 0 <= start_time <= end_time <= 10^9
* 0 <= length of query_times <= 10^5
* query_times may be in any order (do not assume they are sorted)

## Examples

Example 1:

Input:

```python
intervals = {
    0: [[1, 3], [5, 6]],
    1: [[2, 4]],
    2: [[3, 5], [6, 8]]
}

dependencies = {
    (0, 1): [(0, 0)],
    (2, 0): [(1, 0)],
    (2, 1): [(2, 0)]
}

query_times = [2, 4, 6]
```

Output:

```python
2
```

Example 2:

Input:

```python
intervals = {
    0: [[1, 4], [6, 8]],
    1: [[2, 5]],
    2: [[5, 7]]
}

dependencies = {
    (0, 1): [(0, 0)],     # Student 0's second interval depends on the first
    (2, 0): [(1, 0)]      # Student 2's only interval depends on Student 1's interval
}

query_times = [3, 5, 6, 7]
```

Output:

```python
2
```


# Requirements

## Explicit Requirements

1. Each student may have zero or more intervals.
2. Dependencies may refer to:

   * Intervals of the same student.
   * Intervals of other students.
3. A dependent interval is considered valid for evaluation only if all of its prerequisite intervals have ended at or before the query time.
4. The student is counted only once per query time, even if multiple of their intervals are active.
5. An interval is valid for inclusion only if all of its dependencies have been satisfied.
6. Dependencies form a directed acyclic graph (DAG). No cycles will be present.
7. Intervals are non-overlapping per student, but may overlap across students.
8. You must compute the number of students active at each time in query_times and return the maximum among those.
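
Requirement 4 (count a student once per query time) can be enforced by collapsing each student's intervals into disjoint spans before counting. The sketch below is one possible way to do that, under assumptions: `merge_inclusive` is a hypothetical name, and although requirement 7 states that a student's intervals do not overlap, the sketch merges overlapping spans defensively in case the input does not honor that.

```python
def merge_inclusive(spans):
    """Merge overlapping inclusive integer spans; input need not be sorted."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous span: extend it instead of adding a new one.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged
```

After merging, each student contributes at most one span at any time point, so a plain counter over span starts and ends never counts the same student twice.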

## Implicit Requirements

1. Efficient dependency evaluation is needed, so consider topological sorting to determine the order of interval unlocks.
2. You may preprocess intervals to resolve dependency chains before answering queries.
3. Treat all start and end times as integers.
4. All time comparisons are inclusive: query_time is considered inside the interval if start_time <= query_time <= end_time.
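
The topological sorting mentioned in implicit requirement 1 can be sketched with Kahn's algorithm, which also detects cycles as a side effect. This is an illustrative sketch only: `topo_order` is a hypothetical name, `nodes` is assumed to be a list of `(student_id, interval_index)` keys, and every key and prerequisite in `dependencies` is assumed to appear in `nodes`.

```python
def topo_order(nodes, dependencies):
    """Return nodes in dependency order, or None if a cycle exists."""
    indegree = {node: 0 for node in nodes}
    dependents = {node: [] for node in nodes}
    for target, prereqs in dependencies.items():
        for prereq in prereqs:
            dependents[prereq].append(target)  # edge: prereq -> target
            indegree[target] += 1
    # Start with every interval that has no prerequisites.
    order = [node for node in nodes if indegree[node] == 0]
    head = 0
    while head < len(order):
        node = order[head]
        head += 1
        for nxt in dependents[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                order.append(nxt)
    # Nodes left with a positive indegree belong to a cycle.
    return order if len(order) == len(nodes) else None
```

Using a list with a moving `head` index instead of `collections.deque` keeps the sketch free of imports while still processing each node and edge once.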

## Function definition

```python
def max_active_students(intervals: dict[int, list[list[int]]],
                        dependencies: dict[tuple[int, int], list[tuple[int, int]]],
                        query_times: list[int]) -> int:
```

## Edge Cases

1. If the input is invalid (for example, negative numbers, decimals, wrong data types, or empty or missing values), return 0.

2. If the `intervals` dictionary is empty, return 0 regardless of query times.

3. If `query_times` is empty, return 0 even if intervals exist.

4. If interval dependencies form a cycle, treat the input as invalid and return 0.

5. If all students have overlapping intervals and no dependencies, count each unique student active at the query time.

6. If an interval's prerequisites cannot complete before any query time, it should never be counted as active.

## Constraints

* Dependencies must be resolved efficiently.

* Activity must be evaluated at every time point in `query_times`.

* Don't use any external libraries.

* Don't use any helper function.

* Do not use vectorized operations, numpy, or parallelization; iterate over arrays explicitly.

* Each interval is uniquely identified by its student ID and local index and must be treated independently in the graph.

* Input lists are not guaranteed to be sorted, so the implementation must handle ordering as needed.

* Memory usage should scale linearly with the number of students, intervals, and dependencies.

* The solution must complete within acceptable time limits for inputs at the constraint boundaries.
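
One way to meet the time limits with unsorted queries is to sort the queries once and sweep over start and end events in time order. The following is a minimal sketch under assumptions: `max_active_at_queries` is a hypothetical name, and `spans` is assumed to already hold dependency-adjusted, per-student merged inclusive `(start, end)` pairs, so that each span represents one distinct active student.

```python
def max_active_at_queries(spans, query_times):
    """Return the largest number of spans covering any single query time."""
    events = []
    for start, end in spans:
        events.append((start, 1))     # span becomes active at start
        events.append((end + 1, -1))  # inclusive end: inactive from end + 1
    events.sort()
    best = 0
    active = 0
    ptr = 0
    for q in sorted(query_times):
        # Apply every event that has happened at or before this query time.
        while ptr < len(events) and events[ptr][0] <= q:
            active += events[ptr][1]
            ptr += 1
        best = max(best, active)
    return best
```

With E events and Q queries, the work is dominated by the two sorts, roughly O(E log E + Q log Q), and memory stays linear in the input size.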


In [None]:
# code

def max_active_students(intervals: dict[int, list[list[int]]],
                        dependencies: dict[tuple[int, int], list[tuple[int, int]]],
                        query_times: list[int]) -> int:

    # Performing input validation and checking for edge cases
    try:
        if not isinstance(intervals, dict) or not isinstance(dependencies, dict) or not isinstance(query_times, list):
            return 0
        if not intervals or not query_times:
            return 0
        for student_id, student_intervals in intervals.items():
            if not isinstance(student_id, int) or not isinstance(student_intervals, list):
                return 0
            for interval in student_intervals:
                if (not isinstance(interval, list) or len(interval) != 2 or
                    not all(isinstance(x, int) and x >= 0 for x in interval) or
                    interval[0] > interval[1]):
                    return 0
        for interval_key, prereqs in dependencies.items():
            if (not isinstance(interval_key, tuple) or len(interval_key) != 2 or
                not isinstance(prereqs, list)):
                return 0
            for prereq in prereqs:
                if not isinstance(prereq, tuple) or len(prereq) != 2:
                    return 0
        for time in query_times:
            if not isinstance(time, int) or time < 0:
                return 0
    except Exception:
        return 0

    # Step 1: Assigning unique IDs to all intervals
    interval_id_map = {}  # Maps (student_id, interval_index) to unique ID
    start_times = []
    end_times = []
    student_owners = []
    current_id = 0

    for student_id, student_intervals in intervals.items():
        for idx, (start, end) in enumerate(student_intervals):
            interval_id_map[(student_id, idx)] = current_id
            start_times.append(start)
            end_times.append(end)
            student_owners.append(student_id)
            current_id += 1

    total_intervals = current_id

    # Validating that all dependency references are valid
    for target_interval, prereq_list in dependencies.items():
        if target_interval not in interval_id_map:
            return 0
        for prereq_interval in prereq_list:
            if prereq_interval not in interval_id_map:
                return 0

    # Step 2: Build graph and compute indegrees for topological sort
    graph = [[] for _ in range(total_intervals)]
    indegree = [0] * total_intervals

    for target, prereq_list in dependencies.items():
        target_id = interval_id_map[target]
        for prereq in prereq_list:
            prereq_id = interval_id_map[prereq]
            graph[prereq_id].append(target_id)
            indegree[target_id] += 1

    # Step 3: Doing topological sort using queue (manual)
    topo_queue = []
    for i in range(total_intervals):
        if indegree[i] == 0:
            topo_queue.append(i)

    topo_order = []
    head = 0
    while head < len(topo_queue):
        current = topo_queue[head]
        head += 1
        topo_order.append(current)
        for neighbor in graph[current]:
            indegree[neighbor] -= 1
            if indegree[neighbor] == 0:
                topo_queue.append(neighbor)

    # If not all intervals are sorted, there's a cycle
    if len(topo_order) < total_intervals:
        return 0

    # Step 4: Computing earliest valid start time after dependencies are met
    latest_dependency_end = [0] * total_intervals
    for interval in topo_order:
        for neighbor in graph[interval]:
            latest_dependency_end[neighbor] = max(latest_dependency_end[neighbor], end_times[interval])

    # Step 5: Building per-student list of valid (active) intervals
    student_valid_intervals = {}
    for (student_id, interval_idx), internal_id in interval_id_map.items():
        effective_start = max(start_times[internal_id], latest_dependency_end[internal_id])
        effective_end = end_times[internal_id]
        if effective_start <= effective_end:
            if student_id not in student_valid_intervals:
                student_valid_intervals[student_id] = []
            student_valid_intervals[student_id].append([effective_start, effective_end])

    # Step 6: Merging intervals for each student (per student only counts once)
    all_merged_intervals = []
    for intervals_list in student_valid_intervals.values():
        intervals_list.sort()
        merged_start, merged_end = intervals_list[0]
        for start, end in intervals_list[1:]:
            if start <= merged_end + 1:
                merged_end = max(merged_end, end)
            else:
                all_merged_intervals.append((merged_start, merged_end))
                merged_start, merged_end = start, end
        all_merged_intervals.append((merged_start, merged_end))

    if not all_merged_intervals:
        return 0

    # Step 7: Preparing sweep line events
    events = []
    for start, end in all_merged_intervals:
        events.append((start, 1))      # Student becomes active
        events.append((end + 1, -1))   # Student becomes inactive

    events.sort()

    # Step 8: Processing all queries to count active students
    queries = [(time, idx) for idx, time in enumerate(query_times)]
    queries.sort()

    active_students = 0
    result = [0] * len(query_times)
    event_ptr = 0

    for query_time, original_idx in queries:
        while event_ptr < len(events) and events[event_ptr][0] <= query_time:
            active_students += events[event_ptr][1]
            event_ptr += 1
        result[original_idx] = active_students

    return max(result)

In [None]:
# tests
"""Unit tests for the max_active_students function."""

import unittest
from main import max_active_students


class TestMaxActiveStudentsFull(unittest.TestCase):
    """Test suite for the max_active_students function."""

    def test_case_01_empty_inputs(self):
        """Test with all empty inputs."""
        self.assertEqual(max_active_students({}, {}, []), 0)

    def test_case_02_wrong_types(self):
        """Test with wrong input types."""
        self.assertEqual(max_active_students([], {}, [1]), 0)
        self.assertEqual(max_active_students({}, [], [1]), 0)
        self.assertEqual(max_active_students({}, {}, 'invalid'), 0)

    def test_case_03_invalid_interval_ranges(self):
        """Test with invalid interval ranges."""
        self.assertEqual(max_active_students({1: [[1, 0]]}, {}, [1]), 0)
        self.assertEqual(max_active_students({1: [[-1, 5]]}, {}, [1]), 0)

    def test_case_04_invalid_dependencies(self):
        """Test with invalid dependency definitions."""
        deps = {(1, 0): [('x', 1)]}
        self.assertEqual(max_active_students({1: [[1, 5]]}, deps, [1]), 0)

        deps = {(1, 0): [(1, 1)]}
        self.assertEqual(max_active_students({1: [[1, 5]]}, deps, [1]), 0)

    def test_case_05_invalid_query(self):
        """Test with invalid query values."""
        self.assertEqual(max_active_students({1: [[1, 5]]}, {}, [-1]), 0)

    def test_case_06_single_student_no_dependencies(self):
        """Test a single student with multiple intervals."""
        intervals = {1: [[1, 5], [10, 15]]}
        queries = [0, 1, 5, 10, 12, 15, 16]
        self.assertEqual(max_active_students(intervals, {}, queries), 1)

    def test_case_07_self_dependency(self):
        """Test where a student's interval depends on another of their own."""
        intervals = {1: [[1, 5], [6, 10]]}
        dependencies = {(1, 1): [(1, 0)]}
        queries = [1, 5, 6, 10]
        self.assertEqual(
            max_active_students(intervals, dependencies, queries), 1
        )

    def test_case_08_cross_student_dependencies(self):
        """Test intervals with inter-student dependencies."""
        intervals = {1: [[1, 4]], 2: [[2, 6], [7, 10]]}
        dependencies = {(2, 1): [(1, 0), (2, 0)]}
        queries = [1, 3, 6, 7, 9, 10]
        self.assertEqual(
            max_active_students(intervals, dependencies, queries), 2
        )

    def test_case_09_cycle_detection(self):
        """Test with cyclic dependencies between intervals."""
        intervals = {1: [[1, 5], [6, 10]]}
        dependencies = {(1, 0): [(1, 1)], (1, 1): [(1, 0)]}
        self.assertEqual(
            max_active_students(intervals, dependencies, [1, 2, 3]), 0
        )

    def test_case_10_merging_intervals(self):
        """Test overlapping intervals."""
        intervals = {1: [[1, 3], [2, 5], [7, 9]], 2: [[4, 6]]}
        queries = [1, 3, 4, 5, 6, 7, 8, 9]
        self.assertEqual(max_active_students(intervals, {}, queries), 2)

    def test_case_11_disjoint_students(self):
        """Test disjoint student intervals."""
        intervals = {1: [[1, 2]], 2: [[3, 4]], 3: [[5, 6]]}
        self.assertEqual(max_active_students(intervals, {}, [1, 3, 5]), 1)

    def test_case_12_chain_dependencies(self):
        """Test chain of dependencies across students."""
        intervals = {
            1: [[1, 2]],
            2: [[3, 4]],
            3: [[5, 6]],
            4: [[7, 8]],
        }
        dependencies = {
            (2, 0): [(1, 0)],
            (3, 0): [(2, 0)],
            (4, 0): [(3, 0)],
        }
        queries = [1, 3, 5, 7]
        self.assertEqual(
            max_active_students(intervals, dependencies, queries), 1
        )

    def test_case_13_parallel_dependencies(self):
        """Test parallel dependencies for one interval."""
        intervals = {1: [[1, 3]], 2: [[1, 3]], 3: [[4, 6]]}
        dependencies = {(3, 0): [(1, 0), (2, 0)]}
        self.assertEqual(
            max_active_students(intervals, dependencies, [1, 4]), 2
        )

    def test_case_14_multiple_deps_same_interval(self):
        """Test interval with multiple dependencies."""
        intervals = {1: [[1, 2]], 2: [[1, 2]], 3: [[3, 4]]}
        dependencies = {(3, 0): [(1, 0), (2, 0)]}
        self.assertEqual(max_active_students(intervals, dependencies, [3]), 1)

    def test_case_15_all_active_same_time(self):
        """Test all students active at the same time with no dependencies."""
        intervals = {
            1: [[1, 5]],
            2: [[1, 5]],
            3: [[1, 5]],
        }
        self.assertEqual(max_active_students(intervals, {}, [3]), 3)

    def test_case_16_dependency_causes_expiry(self):
        """Test where dependency still allows activity in a tight window."""
        intervals = {
            1: [[1, 2]],
            2: [[2, 2]],
        }
        dependencies = {(2, 0): [(1, 0)]}
        self.assertEqual(
            max_active_students(intervals, dependencies, [2]), 2
        )

    def test_case_17_large_range(self):
        """Test single large interval with wide query range."""
        intervals = {1: [[0, 1000000]]}
        self.assertEqual(max_active_students(intervals, {}, [500000]), 1)

    def test_case_18_dependency_skips_window(self):
        """Test where dependency causes window to be missed."""
        intervals = {
            1: [[1, 2]],
            2: [[1, 1]],
        }
        dependencies = {(2, 0): [(1, 0)]}
        self.assertEqual(
            max_active_students(intervals, dependencies, [1, 2]), 1
        )

    def test_case_19_query_before_any_interval(self):
        """Test query timestamp before any interval starts."""
        intervals = {1: [[10, 20]]}
        self.assertEqual(max_active_students(intervals, {}, [5]), 0)

    def test_case_20_query_after_all_intervals(self):
        """Test query timestamp after all intervals end."""
        intervals = {1: [[1, 5]]}
        self.assertEqual(max_active_students(intervals, {}, [10]), 0)

    def test_case_21_disconnected_dependency_components(self):
        """Test multiple dependency trees isolated from each other."""
        intervals = {
            1: [[1, 2]],
            2: [[2, 3]],
            3: [[10, 11]],
            4: [[12, 13]],
        }
        dependencies = {
            (2, 0): [(1, 0)],
            (4, 0): [(3, 0)],
        }
        queries = [2, 3, 10, 13]
        self.assertEqual(
            max_active_students(intervals, dependencies, queries), 2
        )


# Model Breaking Proof

#### Model Breaking Task URL: https://labeling-ms.turing.com/conversations/44967/view

#### Model code:

```python
def max_active_students(intervals: dict[int, list[list[int]]],
                        dependencies: dict[tuple[int, int], list[tuple[int, int]]],
                        query_times: list[int]) -> int:
    # 1) Basic validations and early returns
    if not isinstance(intervals, dict) or not isinstance(dependencies, dict) or not isinstance(query_times, list):
        return 0
    if len(intervals) == 0 or len(query_times) == 0:
        return 0
    
    # Check for obviously invalid data (negative times, etc.)
    # Since problem statement just says "return 0" for invalid, we do a quick check.
    # (We won't be exhaustive; just a minimal sanity check.)
    for s_id, ivs in intervals.items():
        if s_id < 0:
            return 0
        if not isinstance(ivs, list):
            return 0
        for rng in ivs:
            if (not isinstance(rng, list) or len(rng) != 2 or
                not isinstance(rng[0], int) or not isinstance(rng[1], int) or
                rng[0] < 0 or rng[1] < 0 or rng[0] > rng[1]):
                return 0
    for t in query_times:
        if not isinstance(t, int) or t < 0:
            return 0
    # Dependencies check
    for (ts, ti), deps in dependencies.items():
        if ts < 0 or ti < 0:
            return 0
        if not isinstance(deps, list):
            return 0
        for (ds, di) in deps:
            if ds < 0 or di < 0:
                return 0

    # 2) Collect all intervals in a global list, assign each (student, local_index) => global_id
    all_interval_info = []  # list of (student_id, start_time, end_time)
    map_to_global = {}      # (student_id, local_index) => global_id
    
    global_index = 0
    for s_id, iv_list in intervals.items():
        for i, (st, en) in enumerate(iv_list):
            map_to_global[(s_id, i)] = global_index
            all_interval_info.append((s_id, st, en))
            global_index += 1
    
    total_intervals = len(all_interval_info)
    if total_intervals == 0:
        # No intervals at all
        return 0
    
    # 3) Build adjacency list + in_degree.  Edges go: dep -> target
    adjacency = [[] for _ in range(total_intervals)]
    in_degree = [0] * total_intervals
    
    # Fill edges from dependencies
    for (t_s, t_i), dep_list in dependencies.items():
        # Target's global id
        if (t_s, t_i) not in map_to_global:
            # Invalid dependency reference. Return 0 as stated.
            return 0
        target_id = map_to_global[(t_s, t_i)]
        
        for (d_s, d_i) in dep_list:
            if (d_s, d_i) not in map_to_global:
                # Invalid reference => 0
                return 0
            dep_id = map_to_global[(d_s, d_i)]
            # Edge: dep_id -> target_id
            adjacency[dep_id].append(target_id)
            in_degree[target_id] += 1

    # <Issue>: Model violates constraint of no imports allowed. Moreover this is not the right place for imports.
    # 4) Topological sort to compute dep_end for each interval
    from collections import deque
    
    dep_end = [0] * total_intervals  # maximum end time of all dependencies
    queue = deque()
    # Initially push all intervals with in_degree=0
    for i in range(total_intervals):
        if in_degree[i] == 0:
            queue.append(i)
    
    visited_count = 0
    while queue:
        cur = queue.popleft()
        visited_count += 1
        
        # cur_end_time = all_interval_info[cur].end_time
        # but let's fetch from stored array
        _, _, cur_end_time = all_interval_info[cur]
        
        # For each neighbor, update dep_end
        for nxt in adjacency[cur]:
            dep_end[nxt] = max(dep_end[nxt], cur_end_time)
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                queue.append(nxt)
    
    # If we didn't visit all intervals, there's a cycle => return 0
    if visited_count < total_intervals:
        return 0
    
    # 5) Compute each interval's effective start
    #    effective_start[i] = max(original_start[i], dep_end[i])
    #    If effective_start[i] <= end_time[i], it is valid
    events = []  # (time, type, param)
    # type: 0=START, 1=QUERY, 2=END
    # param for START/END is student_id, for QUERY is index in query_times
    
    for i in range(total_intervals):
        s_id, st, en = all_interval_info[i]
        eff_start = max(st, dep_end[i])
        if eff_start <= en:
            # add start and end events
            # start: (eff_start, 0, s_id)
            # end:   (en+1, 2, s_id)
            # so that an interval that ends at 'en' is still active at time=en,
            # and becomes inactive exactly at en+1.
            events.append((eff_start, 0, s_id))
            events.append((en + 1, 2, s_id))
    
    # 6) Add query events
    #    For each query time q, we will store (q, 1, index_of_this_query)
    for idx, q in enumerate(query_times):
        events.append((q, 1, idx))
    
    # 7) Sort events by (time, type)
    #    We want START before QUERY before END if they share the same time
    #    => we can rely on sorting by "type" in ascending order
    events.sort(key=lambda x: (x[0], x[1]))
    
    # <Issue>: Model violates constraint of no imports allowed. Moreover this is not the right place for imports.
    # 8) Sweep line to answer queries
    from collections import defaultdict
    student_active_count = defaultdict(int)
    active_students = 0
    answers = [0] * len(query_times)
    
    # <Issue>: Model did not implement the overlapping intervals logic correctly and falsely double-counts the intervals of same student
    for (time, etype, param) in events:
        if etype == 0:
            # START
            st_id = param
            old_val = student_active_count[st_id]
            new_val = old_val + 1
            student_active_count[st_id] = new_val
            if old_val == 0:
                active_students += 1
        
        elif etype == 1:
            # QUERY
            q_index = param
            answers[q_index] = active_students
        
        else:
            # END
            st_id = param
            old_val = student_active_count[st_id]
            new_val = old_val - 1
            student_active_count[st_id] = new_val
            if new_val == 0:
                active_students -= 1
                
    # <Issue>: Model doesn't merge intervals per student accurately resulting in incorrect active_students count

    # 9) Return the maximum among all answers
    if not answers:  # in case query_times was empty, but we've handled that earlier
        return 0
    return max(answers)


# ------------------------------------------------------------------
#  Quick Examples Based on the Prompt
#
# Example 1
# intervals = {
#    0: [[1, 3], [5, 6]],
#    1: [[2, 4]],
#    2: [[3, 5], [6, 8]]
# }
# dependencies = {
#    (0, 1): [(0, 0)],
#    (2, 0): [(1, 0)],
#    (2, 1): [(2, 0)]
# }
# query_times = [2, 4, 6]
# print(max_active_students(intervals, dependencies, query_times))  # Expect 2
#
# Example 2
# intervals = {
#    0: [[1, 4], [6, 8]],
#    1: [[2, 5]],
#    2: [[5, 7]]
# }
# dependencies = {
#    (0, 1): [(0, 0)],
#    (2, 0): [(1, 0)]
# }
# query_times = [3, 5, 6, 7]
# print(max_active_students(intervals, dependencies, query_times))  # Expect 2
#
# You can test these examples by uncommenting and running.
```