# Metadata

**L1 Taxonomy** - Problem Solving & Algorithms

**L2 Taxonomy** - Searching Algorithms

**Subtopic** - Range Majority Query with Preprocessed Indices

**Use Case** - Implement a MajorityChecker class for an integer array that supports queries of the form query(left, right, threshold). Each query must return an element in arr[left…right] that appears at least threshold times (with 2×threshold > subarray length), or -1 if none exists. Preprocess the array to ensure queries run efficiently for arrays up to length 20,000 and up to 10,000 queries.

**Programming Language** - Python

**Target Model** - GPT-4o

# Model Breaking Hints


1) **What is the initial use case?**
   
   Implementing a `MajorityChecker` class for an integer array that supports efficient queries to find an element appearing at least a certain number of times within a subarray.

2) **Why is the initial use case easy?**
   
   The problem deals with a static array and allows preprocessing for efficient queries. Standard data structures like segment trees can handle the required operations efficiently due to the manageable size (arrays up to length 20,000 and 10,000 queries).

3) **How could we make it harder?**
   
   - **Introduce Dynamic Updates (Hint 1):** Allow insertions and deletions in the array, requiring the data structure to handle modifications efficiently.
   - **Transform into a Tree Structure (Hint 2):** Change the array into a tree, necessitating advanced algorithms like Heavy-Light Decomposition to manage path queries.
   - **Combine Multiple Constraints (Hint 4):** Add the condition that the element must be a prime number, introducing additional complexity.
   - **Incorporate Time-Dependent Changes (Hint 5):** Allow elements to change over time and handle historical queries, increasing the difficulty of maintaining past states efficiently.

4) **Which parameters can we change?**

   - **Data Structure:** Replace the array with a dynamic tree to increase complexity.
   - **Operations:** Include dynamic updates such as insertions and deletions of nodes.
   - **Constraints:** Require the returned element to be prime and appear at least **k** times on a path where **2×k >** path length.
   - **Scale:** Increase the number of nodes and operations to up to 100,000 to prevent naive solutions from being efficient.
   - **Additional Mechanics:** Incorporate multiple interacting constraints and require efficient handling of queries and updates, possibly using advanced data structures like persistent segment trees.

5) **What can be a final hard prompt?**

   "Design a data structure for a dynamic tree with up to 100,000 nodes, where each node holds an integer value. The tree must efficiently support the following operations:
   
   1. **Update(u, x)**: Update the value at node **u** to **x**.
   2. **Insert(p, u, x)**: Insert a new node **u** with value **x** as a child of node **p**.
   3. **Delete(u)**: Delete node **u** and its subtree.
   4. **Query(u, v, k)**: On the path from node **u** to node **v**, find an element that appears at least **k** times (with **2×k >** path length) and is a prime number. Return this element or **-1** if none exists.
   
   Handle up to 100,000 mixed operations of updates, insertions, deletions, and queries efficiently."

# Setup

```requirements.txt
```


# Prompt

You are given an integer array arr. Your task is to implement a class MajorityChecker that efficiently answers queries of the form query(l, r, t). Each query should return an element of the subarray arr[l..r] (both bounds inclusive) that appears at least t (threshold) times. If no such element exists, return -1. You can assume that 2 * t > r - l + 1; in other words, if an answer exists, it must be a strict majority of the queried subarray. Each query should run in sublinear time relative to the subarray size.

Here are a few examples:

### Example 1:
Input:
arr = [1, 1, 2, 2, 1, 1]
majorityChecker = MajorityChecker(arr)
majorityChecker.query(0, 5, 4)

Output:
1

Explanation:
Subarray = [1, 1, 2, 2, 1, 1]; 1 appears 4 times, which is >= threshold.

### Example 2:
Input:
arr = [1, 1, 2, 2, 1, 1]
majorityChecker = MajorityChecker(arr)
majorityChecker.query(0, 3, 3)

Output:
-1

Explanation:
Subarray = [1, 1, 2, 2]; No element appears 3 or more times.

### Example 3:
Input:
arr = [5, 5, 1, 5, 2, 5, 5, 3]
majorityChecker = MajorityChecker(arr)
majorityChecker.query(0, 7, 5)

Output:
5

Explanation:
Subarray = [5, 5, 1, 5, 2, 5, 5, 3]; 5 appears 5 times.


### Constraints:

1 <= arr.length <= 20,000

0 <= arr[i] <= 10^5

0 <= left <= right < arr.length

1 <= threshold <= right - left + 1

1 <= queries.length <= 10,000

# Requirements


### Explicit Requirements
- Implement the MajorityChecker class with the following two mandatory methods:
  - `__init__(arr: List[int])`
  - `query(left: int, right: int, threshold: int) -> int`


### Implicit Requirements
- Each query must run faster than linear time.
- Because 2 * threshold > subarray length, at most one element can qualify; return it if it exists.

### Function Signatures

```python
class MajorityChecker:
    def __init__(self, arr: List[int]): ...
    def query(self, left: int, right: int, threshold: int) -> int: ...
```

### Edge Case Handling
  - No element meets threshold -> return -1
  - Exactly threshold times -> check if 2 x threshold > subarray length
  - Multiple valid elements -> cannot occur while 2 x threshold > subarray length; if that assumption is violated, return any one
  - Single-element subarrays -> return the element if threshold == 1
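
The edge cases above can be checked against a brute-force reference. The sketch below is only an illustration: `brute_force_query` is a hypothetical helper, not part of the required interface, and it runs in linear time per query, so it could serve as a test oracle but not as a solution.

```python
from collections import Counter
from typing import List


def brute_force_query(arr: List[int], left: int, right: int, threshold: int) -> int:
    """Hypothetical linear-time oracle; both bounds are inclusive."""
    counts = Counter(arr[left:right + 1])
    # most_common(1) yields the single most frequent (value, count) pair.
    value, count = counts.most_common(1)[0]
    return value if count >= threshold else -1


arr = [1, 1, 2, 2, 1, 1]
print(brute_force_query(arr, 0, 5, 4))  # 1 appears 4 times
print(brute_force_query(arr, 0, 3, 3))  # no element appears 3 times
print(brute_force_query([9], 0, 0, 1))  # single-element subarray
```

Comparing a sublinear implementation against such an oracle on small random arrays is one way to exercise the cases listed here.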


In [None]:
# code
from collections import defaultdict
from bisect import bisect_left, bisect_right
from typing import List

class MajorityChecker:
    def __init__(self, arr: List[int]):
        self.arr = arr
        self.n = len(arr)
        self.segment_tree = [None] * (4 * self.n)
        self.build(0, 0, self.n - 1)
        self.pos = defaultdict(list)
        for idx, val in enumerate(arr):
            self.pos[val].append(idx)

    def build(self, node, start, end):
        if start == end:
            self.segment_tree[node] = (self.arr[start], 1)
        else:
            mid = (start + end) // 2
            left_child = 2 * node + 1
            right_child = 2 * node + 2
            self.build(left_child, start, mid)
            self.build(right_child, mid + 1, end)
            self.segment_tree[node] = self.merge(
                self.segment_tree[left_child],
                self.segment_tree[right_child]
            )

    def merge(self, left_info, right_info):
        if left_info[0] == right_info[0]:
            return (left_info[0], left_info[1] + right_info[1])
        elif left_info[1] > right_info[1]:
            return (left_info[0], left_info[1] - right_info[1])
        else:
            return (right_info[0], right_info[1] - left_info[1])

    def count_in_range(self, val, left, right):
        # Occurrences of val in arr[left..right], via binary search on its sorted index list.
        indices = self.pos[val]
        return bisect_right(indices, right) - bisect_left(indices, left)

    def query(self, left: int, right: int, threshold: int) -> int:
        maj = self.query_majority(0, 0, self.n - 1, left, right)
        if maj == -1:
            return -1

        count = self.count_in_range(maj, left, right)
        return maj if count >= threshold else -1

    def query_majority(self, node, start, end, left, right):
        if right < start or end < left:
            return -1
        if left <= start and end <= right:
            return self.segment_tree[node][0]
        mid = (start + end) // 2
        left_maj = self.query_majority(2 * node + 1, start, mid, left, right)
        right_maj = self.query_majority(2 * node + 2, mid + 1, end, left, right)

        if left_maj == -1:
            return right_maj
        if right_maj == -1:
            return left_maj

        if left_maj == right_maj:
            return left_maj

        # Count both candidates inside the overlap of this node's span and the
        # query range with binary search on the index lists, not a linear scan.
        lo, hi = max(left, start), min(right, end)
        left_count = self.count_in_range(left_maj, lo, hi)
        right_count = self.count_in_range(right_maj, lo, hi)

        return left_maj if left_count > right_count else right_maj

# Example usage:
arr = [1, 1, 2, 2, 1, 1]
majorityChecker = MajorityChecker(arr)
print(majorityChecker.query(0, 5, 4))  # Output: 1
print(majorityChecker.query(0, 3, 3))  # Output: -1
arr = [5, 5, 1, 5, 2, 5, 5, 3]
majorityChecker = MajorityChecker(arr)
print(majorityChecker.query(0, 7, 5))  # Output: 5
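
The `merge` method above appears to follow the Boyer-Moore voting idea: each segment keeps a (candidate, surplus) pair, and combining two segments either adds the surpluses or cancels them against each other. As a minimal standalone sketch (the names `merge_votes` and `candidate_of` are illustrative and not part of the solution), folding these pairs over a list leaves the only value that could be a strict majority. The candidate is not guaranteed to be one, which is why `query` still verifies its count with binary search.

```python
from functools import reduce
from typing import List, Tuple

Vote = Tuple[int, int]  # (candidate value, surplus votes)


def merge_votes(a: Vote, b: Vote) -> Vote:
    """Combine two (candidate, surplus) pairs, mirroring MajorityChecker.merge."""
    if a[0] == b[0]:
        return (a[0], a[1] + b[1])
    if a[1] > b[1]:
        return (a[0], a[1] - b[1])
    return (b[0], b[1] - a[1])


def candidate_of(values: List[int]) -> Vote:
    """Fold single-element votes left to right; values must be non-empty."""
    return reduce(merge_votes, [(v, 1) for v in values])


print(candidate_of([1, 1, 2, 2, 1, 1]))  # candidate 1 survives
print(candidate_of([1, 1, 2, 2]))        # surplus 0: still needs a count check
```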

In [None]:
# test

import unittest as ut
from main import MajorityChecker

class TestMajorityChecker(ut.TestCase):
    def test_case_1(self):
        nums = [1, 1, 2, 2, 1, 1]
        checker = MajorityChecker(nums)
        self.assertEqual(checker.query(0, 5, 4), 1)

    def test_case_2(self):
        nums = [1, 1, 2, 2, 1, 1]
        checker = MajorityChecker(nums)
        self.assertEqual(checker.query(0, 3, 3), -1)

    def test_case_3(self):
        nums = [5, 5, 1, 5, 2, 5, 5, 3]
        checker = MajorityChecker(nums)
        self.assertEqual(checker.query(0, 7, 5), 5)

    def test_case_4(self):
        nums = [2, 2, 2, 2, 2]
        checker = MajorityChecker(nums)
        self.assertEqual(checker.query(1, 3, 2), 2)

    def test_case_5(self):
        nums = [3, 3, 4, 4, 4, 3, 3]
        checker = MajorityChecker(nums)
        self.assertEqual(checker.query(0, 6, 4), 3)

    def test_case_6(self):
        nums = [1, 2, 3, 4, 5]
        checker = MajorityChecker(nums)
        self.assertEqual(checker.query(0, 4, 2), -1)

    def test_case_7(self):
        nums = [7, 7, 7, 7, 7]
        checker = MajorityChecker(nums)
        self.assertEqual(checker.query(0, 4, 5), 7)

    def test_case_8(self):
        nums = [9]
        checker = MajorityChecker(nums)
        self.assertEqual(checker.query(0, 0, 1), 9)

    def test_case_9(self):
        nums = [1, 2, 1, 2, 1]
        checker = MajorityChecker(nums)
        self.assertEqual(checker.query(0, 4, 3), 1)

    def test_case_10(self):
        nums = [1, 2, 3, 2, 2, 2, 4]
        checker = MajorityChecker(nums)
        self.assertEqual(checker.query(1, 5, 3), 2)

if __name__ == "__main__":
    ut.main()

# Model Breaking Proof

#### Model Breaking Task URL: https://rlhf-v3.turing.com/prompt/c7824db2-d808-4f94-a2a0-bfc5040b03e9?origin=https%3A%2F%2Flabeling-ms.turing.com&redirect_url=https%3A%2F%2Flabeling-ms.turing.com%2Fconversations%2F45426%2Fview

#### Model code:

```python
from collections import defaultdict

class MajorityChecker:
    def __init__(self, arr):
        self.arr = arr
        self.n = len(arr)
        self.segment_tree = [None] * (4 * self.n)
        self.build(0, 0, self.n - 1)

    def build(self, node, start, end):
        if start == end:
            self.segment_tree[node] = (self.arr[start], 1)
        else:
            mid = (start + end) // 2
            left_child = 2 * node + 1
            right_child = 2 * node + 2
            self.build(left_child, start, mid)
            self.build(right_child, mid + 1, end)
            self.segment_tree[node] = self.merge(
                self.segment_tree[left_child],
                self.segment_tree[right_child]
            )

    def merge(self, left_info, right_info):
        if left_info[0] == right_info[0]:
            return (left_info[0], left_info[1] + right_info[1])
        elif left_info[1] > right_info[1]:
            return (left_info[0], left_info[1] - right_info[1])
        else:
            return (right_info[0], right_info[1] - left_info[1])

    def query(self, left, right, threshold):
        maj = self.query_majority(0, 0, self.n - 1, left, right)
        if maj == -1:
            return -1
        # Check actual count of `maj` in the range [left, right]
        count = sum(1 for i in range(left, right + 1) if self.arr[i] == maj)
        return maj if count >= threshold else -1

    def query_majority(self, node, start, end, left, right):
        if right < start or end < left:
            return -1
        if left <= start and end <= right:
            return self.segment_tree[node][0]
        mid = (start + end) // 2
        left_maj = self.query_majority(2 * node + 1, start, mid, left, right)
        right_maj = self.query_majority(2 * node + 2, mid + 1, end, left, right)

        # If one returns -1, return the other
        if left_maj == -1:
            return right_maj
        if right_maj == -1:
            return left_maj

        # If both are the same, return it
        if left_maj == right_maj:
            return left_maj

        # Otherwise return the more frequent one in this range
        left_count = sum(1 for i in range(max(left, start), min(right + 1, end + 1)) if self.arr[i] == left_maj)
        right_count = sum(1 for i in range(max(left, start), min(right + 1, end + 1)) if self.arr[i] == right_maj)

        return left_maj if left_count > right_count else right_maj

# Example usage:
arr = [1, 1, 2, 2, 1, 1]
majorityChecker = MajorityChecker(arr)
print(majorityChecker.query(0, 5, 4))  # Output: 1
print(majorityChecker.query(0, 3, 3))  # Output: -1
print(majorityChecker.query(0, 7, 5))  # Output: 5

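# <Issue>: The model's example calls query(0, 7, 5) on the 6-element array, so the
# counting loop in query reads arr[6] and raises IndexError. Both query and
# query_majority also scan the range linearly, violating the sublinear-time requirement.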
```