# Metadata

**L1 Taxonomy** - Background Processes

**L2 Taxonomy** - Cron Jobs

**Subtopic** - Implementing a priority for scheduled tasks (some tasks can preempt others if scheduled at the same time)

**Use Case** - Develop a Python script that mimics a simplified cron job scheduler. This scheduler should be able to schedule tasks at different intervals and prioritize them based on a predefined priority level. If two tasks are scheduled at the same time, the task with the higher priority should preempt the other. The tasks can be simple Python functions that print a message to the console. The scheduler should keep running in a loop, checking for scheduled tasks and executing them as per their priority and scheduled time.

**Programming Language** - Python

**Target Model** - GPT-4o

# Setup

```requirements.txt
```


# Prompt
Problem Statement:
- Develop a Python script that mimics a simplified cron job scheduler.
- The scheduler should support task execution at custom intervals.
- Each task must have a predefined priority level.
- If multiple tasks are scheduled at the same time, the one with the higher priority should execute first.
- Tasks are simple Python functions that print a message to the console.
- The scheduler should continuously run in a loop, checking for and executing due tasks based on time and priority.

Function Signature:
```python
def start_scheduler(tasks: List[Dict[str, Any]]) -> None:
```

Input Format:
- A list of task dictionaries passed to the scheduler.
- Each task dictionary contains:
  - "name" (str): Unique name of the task.
  - "interval" (int): Time interval in seconds between executions.
  - "priority" (int): Priority level (higher number = higher priority).
  - "function" (Callable): Python function to execute.

Input Constraints:
- Each task must be defined with the following fields:
  - name (str): Unique, non-empty task identifier.
  - interval (int): Time interval in seconds (must be > 0).
  - priority (int): Integer priority (higher number = higher priority).
  - function (Callable): A Python function that takes no arguments and prints directly to the console.
- A maximum of 100 tasks can be scheduled.
- The scheduler runs in real time and waits between checks with time.sleep.
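
The constraints above can be checked up front, before any task is scheduled. The following is a minimal sketch, not the reference implementation: the helper name `validate_tasks`, the constant `MAX_TASKS`, and the choice to reject `bool` values (which Python treats as a subclass of `int`) are assumptions made for illustration.

```python
from typing import Any, Dict, List

MAX_TASKS = 100  # cap taken from the constraints above


def validate_tasks(tasks: List[Dict[str, Any]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems: List[str] = []
    if len(tasks) > MAX_TASKS:
        problems.append(f"too many tasks: {len(tasks)} > {MAX_TASKS}")
    seen = set()
    for index, task in enumerate(tasks):
        name = task.get("name")
        label = f"task #{index} ({name!r})"
        if not isinstance(name, str) or not name:
            problems.append(f"{label}: name must be a non-empty string")
        elif name in seen:
            problems.append(f"{label}: duplicate name")
        else:
            seen.add(name)
        interval = task.get("interval")
        # Assumption: bool is a subclass of int, so it is rejected explicitly.
        if (isinstance(interval, bool) or not isinstance(interval, int)
                or interval <= 0):
            problems.append(f"{label}: interval must be an integer > 0")
        priority = task.get("priority")
        if isinstance(priority, bool) or not isinstance(priority, int):
            problems.append(f"{label}: priority must be an integer")
        if not callable(task.get("function")):
            problems.append(f"{label}: function must be callable")
    return problems
```

Collecting every problem in one pass, instead of stopping at the first, lets the caller decide whether to skip the offending tasks or reject the whole configuration.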

Output Format:
- Console output showing task execution logs.
- Each log entry includes:
  - Timestamp of execution (from the scheduler's clock).
  - Task name.
  - Task priority.
  - Task message printed by the function.
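
A log line with these fields might be assembled as in the sketch below. The helper `format_log_line` is hypothetical; the layout mirrors the "Executing" line used by the reference solution later in this document, and the task's own message is printed separately by the task function.

```python
import datetime


def format_log_line(when: datetime.datetime, name: str, priority: int) -> str:
    """Build the scheduler's log line for one task execution."""
    stamp = when.strftime("%Y-%m-%d %H:%M:%S")
    return f"[{stamp}] Executing {name} (Priority: {priority})"


print(format_log_line(datetime.datetime(2024, 1, 1, 12, 0, 5), "task2", 2))
# [2024-01-01 12:00:05] Executing task2 (Priority: 2)
```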

Example:
```python
Input:
tasks = [
    {
        "name": "task1",
        "interval": 5,
        "priority": 1,
        "function": lambda: print("Running task1")
    },
    {
        "name": "task2",
        "interval": 5,
        "priority": 2,
        "function": lambda: print("Running task2")
    },
    {
        "name": "task3",
        "interval": 3,
        "priority": 0,
        "function": lambda: print("Running task3")
    }
]

Output:
[TIMESTAMP] Running task3 (Priority: 0)
[TIMESTAMP] Running task2 (Priority: 2)
[TIMESTAMP] Running task1 (Priority: 1)
[TIMESTAMP] Running task3 (Priority: 0)
[TIMESTAMP] Running task3 (Priority: 0)
[TIMESTAMP] Running task2 (Priority: 2)
[TIMESTAMP] Running task1 (Priority: 1)
... (continues in loop based on scheduling)
```
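
The firing order for the example tasks can be traced on a simulated clock, without sleeping. The sketch below assumes that each task first runs one interval after the scheduler starts; an implementation that runs every task immediately at start would add a round at t=0 in which all three tasks fire in priority order. The function `trace_schedule` is a hypothetical helper, not part of the required interface.

```python
from typing import Any, Dict, List, Tuple


def trace_schedule(tasks: List[Dict[str, Any]],
                   horizon: int) -> List[Tuple[int, str]]:
    """List (second, task name) pairs in firing order up to `horizon` seconds."""
    fired = []
    for second in range(1, horizon + 1):
        due = [t for t in tasks if second % t["interval"] == 0]
        # sorted() is stable, so equal priorities keep their list order.
        for task in sorted(due, key=lambda t: -t["priority"]):
            fired.append((second, task["name"]))
    return fired


example = [
    {"name": "task1", "interval": 5, "priority": 1},
    {"name": "task2", "interval": 5, "priority": 2},
    {"name": "task3", "interval": 3, "priority": 0},
]
for second, name in trace_schedule(example, 10):
    print(f"t={second:>2}s  {name}")
```

Under that assumption the first ten seconds fire task3 at 3 s, task2 then task1 at 5 s, task3 at 6 s and 9 s, and task2 then task1 again at 10 s.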

# Requirements
Explicit Requirements:
- The scheduler must run continuously in a loop.
- Each task has an interval (in seconds) that determines its execution frequency.
- Each task has a priority (integer) used to resolve conflicts when tasks overlap.
- When two or more tasks are scheduled at the same time, the one with higher priority runs first.
- Tasks are simple Python functions that print messages to the console.

Implicit Requirements:
- Lower interval values mean higher frequency of execution.
- Tasks that are due at the same time and have the same priority are executed in insertion order.
- Tasks never run concurrently; tasks that are due at the same time execute sequentially, in priority order.
- The task list is defined by the user before the scheduler starts and remains static during execution.
- Time is read from time.time() internally or from a simulated clock.
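
The insertion-order rule for equal priorities falls out of Python's stable sort, which keeps the original order of equal keys even with `reverse=True`. A small illustration, with made-up task names:

```python
due = [
    {"name": "backup", "priority": 1},
    {"name": "report", "priority": 3},
    {"name": "cleanup", "priority": 1},
    {"name": "alerts", "priority": 3},
]

# Highest priority first; ties stay in the order they appear in `due`.
ordered = sorted(due, key=lambda t: t["priority"], reverse=True)
print([t["name"] for t in ordered])
# ['report', 'alerts', 'backup', 'cleanup']
```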

Solution Expectations:
- Implement a loop that continuously checks the current time to determine which tasks should run.
- Maintain a schedule of tasks, each defined by a function, execution interval, and priority.
- Correctly calculate the next run time for each task based on its interval.
- When multiple tasks are due at the same time, execute them in descending priority order.
- Ensure that each executed task prints its message to the console as defined.
- Handle timing accurately without excessive CPU usage (e.g., use time.sleep).
- Ensure the scheduler does not crash unexpectedly due to unhandled exceptions in tasks.
- Design should be modular: separate task definition, scheduling, and execution logic.
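
One possible way to split task definition, scheduling, and execution is sketched below. It is an illustration under stated assumptions, not the reference solution: the `Task` dataclass and the functions `due_tasks`, `run_once`, and `run_for` are hypothetical names, and `run_for` is bounded so that the sketch terminates, whereas the required scheduler loops until interrupted.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    """Task definition: what to run, how often, and how urgently."""
    name: str
    interval: int
    priority: int
    function: Callable[[], None]
    next_run: float = 0.0


def due_tasks(tasks: List[Task], now: float) -> List[Task]:
    """Scheduling: pick tasks whose time has come, highest priority first."""
    ready = [t for t in tasks if now >= t.next_run]
    return sorted(ready, key=lambda t: t.priority, reverse=True)


def run_once(tasks: List[Task], now: float) -> List[str]:
    """Execution: run every due task once and reschedule it."""
    ran = []
    for task in due_tasks(tasks, now):
        task.next_run = now + task.interval  # reschedule even if the call fails
        try:
            task.function()
        except Exception as exc:
            print(f"Error executing task '{task.name}': {exc}")
        ran.append(task.name)
    return ran


def run_for(tasks: List[Task], seconds: float, tick: float = 0.1) -> None:
    """Loop: poll until `seconds` have elapsed, sleeping between polls."""
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        run_once(tasks, time.monotonic())
        time.sleep(tick)
```

Passing `now` into `run_once` keeps the scheduling logic testable with fixed timestamps, so only `run_for` depends on a real clock.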

Edge Cases and Behavior:
- If two or more tasks are scheduled at the exact same time:
  - Execute the task with the highest priority first (higher number = higher priority).
  - If priorities are equal, execute in the order they were added.
- If a task has a zero or negative interval:
  - Skip the task or raise an error during validation (invalid configuration).
- If a task raises an exception during execution:
  - Catch and log the exception, then continue running other tasks.
- If the system clock changes (e.g., jumps backward/forward):
  - The scheduler may misalign; optionally log a warning or recalculate all next run times.
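
For the clock-change case, one option is to keep due times on `time.monotonic()`, which the Python documentation describes as unaffected by system clock updates, and to use wall-clock time only for log timestamps. The sketch below also shows one way to recalculate a due time after a stall by skipping missed runs. The helper `next_due` and the skip-missed-runs policy are assumptions; the specification leaves this behaviour open.

```python
import time


def next_due(previous_due: float, interval: int, now: float) -> float:
    """Return the first due time after `now`, stepping in whole intervals."""
    steps = int((now - previous_due) // interval) + 1
    return previous_due + max(steps, 1) * interval


# Due times live on the monotonic clock; wall-clock time is only for logs.
start = time.monotonic()
due = start + 5                  # hypothetical task with a 5 s interval
late_by = 12.0                   # pretend the loop stalled for 12 s
due = next_due(due, 5, due + late_by)
print(round(due - start, 3))     # offset of the next run, in seconds
```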

Solution Constraints:
- Must not use the system's native cron scheduler or external schedulers.
- Task checking and execution must run in a continuous loop.
- Must prioritize and handle task collisions based on priority values.
- Task execution should not block the scheduler loop indefinitely.
- Task functions must be simple callables (no threads, subprocesses, or async required).
- Must enforce uniqueness of task names.
- Must validate that the task function is callable.
- Must enforce task cap at 100.

In [None]:
# code
"""
Simplified cron job scheduler.

Runs tasks based on their interval and priority.
"""

import time
from typing import List, Dict, Any
import datetime


def start_scheduler(tasks: List[Dict[str, Any]]) -> None:
    """Start a simplified cron job scheduler.

    This function runs a continuous loop that checks and executes scheduled
    tasks based on their configured interval and priority.

    Args:
        tasks (List[Dict[str, Any]]): A list of task dictionaries. Each task
        must contain "name", "interval", "priority", and "function" keys.
    """
    if not tasks:
        print("Scheduler started with no tasks.")
        return

    if len(tasks) > 100:
        print("Error: Maximum of 100 tasks can be scheduled.")
        return

    scheduled_tasks = {}
    task_names = set()

    for task in tasks:
        name = task.get("name")
        interval = task.get("interval")
        priority = task.get("priority")
        function = task.get("function")

        if not isinstance(name, str) or not name:
            print(f"Error: Task name '{name}' is invalid. Skipping task.")
            continue
        if name in task_names:
            print(f"Error: Duplicate task name '{name}'. Skipping task.")
            continue
        task_names.add(name)

        if not isinstance(interval, int) or interval <= 0:
            print(
                f"Error: Task '{name}' has an invalid interval "
                f"({interval}). Skipping task."
            )
            continue

        if not isinstance(priority, int):
            print(
                f"Error: Task '{name}' has an invalid priority "
                f"({priority}). Skipping task."
            )
            continue

        if not callable(function):
            print(f"Error: Task '{name}'"
                  f" has an invalid function. Skipping task.")
            continue

        scheduled_tasks[name] = {
            "interval": interval,
            "priority": priority,
            "function": function,
            "next_run_time": time.time()
        }

    if not scheduled_tasks:
        print("No valid tasks to schedule. Exiting scheduler.")
        return

    print("Scheduler started. Press Ctrl+C to stop.")

    try:
        while True:
            current_time = time.time()
            due_tasks = []

            for name, task_info in scheduled_tasks.items():
                if current_time >= task_info["next_run_time"]:
                    due_tasks.append((
                        task_info["priority"],
                        name,
                        task_info["function"]
                    ))

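            # Highest priority first. The sort is stable, so tasks with equal
            # priority keep their insertion order.
            due_tasks.sort(key=lambda x: x[0], reverse=True)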

            for priority, name, func in due_tasks:
                timestamp = datetime.datetime.fromtimestamp(
                    current_time
                ).strftime('%Y-%m-%d %H:%M:%S')
                # Reschedule before running so that a task that raises is
                # retried after its interval, not on every loop iteration.
                scheduled_tasks[name]["next_run_time"] = (
                    current_time + scheduled_tasks[name]["interval"]
                )
                try:
                    print(
                        f"[{timestamp}] Executing {name} "
                        f"(Priority: {priority})"
                    )
                    func()
                except Exception as e:
                    print(
                        f"[{timestamp}] Error executing task '{name}': {e}"
                    )

            time.sleep(0.1)

    except KeyboardInterrupt:
        print("\nScheduler stopped by user.")
    except Exception as e:
        print(f"\nAn unexpected error occurred in the scheduler: {e}")


In [None]:
# tests

import unittest
import threading
import time
import sys
from io import StringIO
from main import start_scheduler


class TestCronScheduler(unittest.TestCase):

    def setUp(self):
        self.execution_log = []
        self.scheduler_thread = None

        self.original_stdout = sys.stdout
        self.captured_output = StringIO()
        sys.stdout = self.captured_output

    def tearDown(self):
        sys.stdout = self.original_stdout

        if self.scheduler_thread and self.scheduler_thread.is_alive():
            self.scheduler_thread.join(timeout=0.05)

    def create_trackable_function(self, task_name):
        def tracked_function():
            self.execution_log.append(task_name)

        return tracked_function

    def run_scheduler_briefly(self, tasks, duration=0.05):
        def scheduler_wrapper():
            try:
                start_scheduler(tasks)
            except KeyboardInterrupt:
                pass
            except Exception:
                pass

        self.scheduler_thread = threading.Thread(target=scheduler_wrapper)
        self.scheduler_thread.daemon = True
        self.scheduler_thread.start()

        time.sleep(duration)

        return self.captured_output.getvalue()

    def test_single_task_execution(self):
        tasks = [
            {
                "name": "test_task",
                "interval": 1,
                "priority": 1,
                "function": self.create_trackable_function("test_task"),
            }
        ]

        output = self.run_scheduler_briefly(tasks)

        self.assertIn("Scheduler started", output)
        self.assertIn("Executing test_task", output)
        self.assertIn("test_task", self.execution_log)

    def test_multiple_tasks_priority_order(self):
        tasks = [
            {
                "name": "low_priority",
                "interval": 1,
                "priority": 1,
                "function": self.create_trackable_function("low_priority"),
            },
            {
                "name": "high_priority",
                "interval": 1,
                "priority": 5,
                "function": self.create_trackable_function("high_priority"),
            },
        ]

        output = self.run_scheduler_briefly(tasks, 0.08)

        self.assertIn("Executing low_priority", output)
        self.assertIn("Executing high_priority", output)

        # Compare the first "Executing" line of each task; the assertIn
        # checks above guarantee that both lines exist.
        lines = output.split("\n")
        high_priority_line = next(
            i for i, line in enumerate(lines)
            if "Executing high_priority" in line
        )
        low_priority_line = next(
            i for i, line in enumerate(lines)
            if "Executing low_priority" in line
        )
        self.assertLess(high_priority_line, low_priority_line)

    def test_task_exception_handling(self):
        def failing_function():
            raise ValueError("Task failed!")

        def working_function():
            self.execution_log.append("working_task")

        tasks = [
            {
                "name": "failing_task",
                "interval": 1,
                "priority": 1,
                "function": failing_function,
            },
            {
                "name": "working_task",
                "interval": 1,
                "priority": 2,
                "function": working_function,
            },
        ]

        output = self.run_scheduler_briefly(tasks, 0.08)

        self.assertIn("Error executing task 'failing_task'", output)
        self.assertIn("working_task", self.execution_log)

    def test_empty_task_list(self):
        tasks = []

        output = self.run_scheduler_briefly(tasks, 0.1)

        self.assertIn("Scheduler started with no tasks", output)

    def test_duplicate_task_names(self):
        tasks = [
            {
                "name": "duplicate",
                "interval": 1,
                "priority": 1,
                "function": lambda: None,
            },
            {
                "name": "duplicate",
                "interval": 2,
                "priority": 2,
                "function": lambda: None,
            },
        ]

        output = self.run_scheduler_briefly(tasks, 0.1)

        self.assertIn("Duplicate task name 'duplicate'", output)

    def test_invalid_interval(self):
        tasks = [
            {
                "name": "invalid_interval",
                "interval": 0,
                "priority": 1,
                "function": lambda: None,
            }
        ]

        output = self.run_scheduler_briefly(tasks, 0.1)

        self.assertIn("invalid interval (0)", output)
        self.assertIn("No valid tasks to schedule", output)

    def test_invalid_priority(self):
        tasks = [
            {
                "name": "invalid_priority",
                "interval": 1,
                "priority": "high",
                "function": lambda: None,
            }
        ]

        output = self.run_scheduler_briefly(tasks, 0.1)

        self.assertIn("invalid priority (high)", output)

    def test_non_callable_function(self):
        tasks = [
            {
                "name": "invalid_function",
                "interval": 1,
                "priority": 1,
                "function": "not_callable",
            }
        ]

        output = self.run_scheduler_briefly(tasks, 0.1)

        self.assertIn("invalid function", output)

    def test_too_many_tasks(self):
        tasks = []
        for i in range(101):
            tasks.append(
                {
                    "name": f"task_{i}",
                    "interval": 1,
                    "priority": 1,
                    "function": lambda: None,
                }
            )

        output = self.run_scheduler_briefly(tasks, 0.1)

        self.assertIn("Maximum of 100 tasks", output)

    def test_missing_required_fields(self):
        tasks = [{"interval": 1, "priority": 1, "function": lambda: None}]
        output = self.run_scheduler_briefly(tasks)
        self.assertIn("invalid", output.lower())

    def test_same_priority_insertion_order(self):
        tasks = [
            {
                "name": "first",
                "interval": 1,
                "priority": 1,
                "function": self.create_trackable_function("first"),
            },
            {
                "name": "second",
                "interval": 1,
                "priority": 1,
                "function": self.create_trackable_function("second"),
            },
        ]
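        self.run_scheduler_briefly(tasks, 0.08)
        self.assertIn("first", self.execution_log)
        self.assertIn("second", self.execution_log)
        # Equal priorities must run in insertion order.
        self.assertLess(
            self.execution_log.index("first"),
            self.execution_log.index("second"),
        )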

    def test_different_intervals_timing(self):
        tasks = [
            {
                "name": "fast",
                "interval": 1,
                "priority": 1,
                "function": self.create_trackable_function("fast"),
            },
            {
                "name": "slow",
                "interval": 3,
                "priority": 1,
                "function": self.create_trackable_function("slow"),
            },
        ]
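        # Within 0.12 s each task runs only once (intervals are whole
        # seconds), so this only checks that the faster task never runs
        # less often than the slower one.
        self.run_scheduler_briefly(tasks, 0.12)
        fast_count = self.execution_log.count("fast")
        slow_count = self.execution_log.count("slow")
        self.assertGreaterEqual(fast_count, slow_count)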


if __name__ == "__main__":
    unittest.main()


# Model Breaking Proof

#### Model Breaking Task URL: https://rlhf-v3.turing.com/prompt/a9c6e494-00cd-41c5-af10-cfddbec24e8a?origin=https%3A%2F%2Flabeling-ms.turing.com&redirect_url=https%3A%2F%2Flabeling-ms.turing.com%2Fconversations%2F74435%2Fview

#### Model code:

```python
import time
from typing import List, Dict, Any, Callable
import heapq


def start_scheduler(tasks: List[Dict[str, Any]]) -> None:
    if len(tasks) > 100:
        raise ValueError("Cannot schedule more than 100 tasks.")
    
    # <Issue>: Missing task name uniqueness check
    
    # Validate tasks and prepare them
    for task in tasks:
        if not isinstance(task['name'], str) or not task['name']:
            raise ValueError("Task name must be a non-empty string.")
        if not isinstance(task['interval'], int) or task['interval'] <= 0:
            raise ValueError("Task interval must be an integer greater than zero.")
        if not isinstance(task['priority'], int):
            raise ValueError("Task priority must be an integer.")
        if not callable(task['function']):
            raise ValueError("Task function must be callable.")

    # To track the next execution time of each task
    schedule = []
    current_time = time.time()

    # Initialize the schedule
    for task in tasks:
        next_execution_time = current_time + task['interval']
        # Use tuple (next_execution_time, priority, index, task) to prioritize correctly
        heapq.heappush(schedule, (next_execution_time, -task['priority'], task['name'], task))

    # Continuous loop for the scheduler
    while True:
        if not schedule:
            break  # Exit if no tasks are present

        current_time = time.time()

        # Collect tasks that should be executed now
        while schedule and schedule[0][0] <= current_time:
            _, _, _, task = heapq.heappop(schedule)
            try:
                print(f"[{time.ctime()}] Executing {task['name']} (Priority: {task['priority']})")
                task['function']()
            except Exception as e:
                print(f"Exception while executing task {task['name']}: {e}")

            # Re-calculate the next execution time and re-add to schedule
            next_execution_time = current_time + task['interval']
            heapq.heappush(schedule, (next_execution_time, -task['priority'], task['name'], task))

        # Sleep until the earliest next execution time or a short wait
        if schedule:
            next_run_time = schedule[0][0]
            sleep_time = max(0, next_run_time - current_time)
            time.sleep(min(sleep_time, 0.5))  # Give a small delay to reduce CPU usage
        else:
            time.sleep(0.5)

```