# Advanced Features - Priority & Fair Share

### Priority Queue
- **HIGH**: Urgent jobs, production deadlines
- **MEDIUM**: Standard training jobs (default)
- **LOW**: Experiments, development work
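
The ordering these levels imply can be sketched with a standard-library heap. This is a minimal illustration, not the scheduler's actual implementation: `SketchPriority` and `SketchQueue` are hypothetical names, and breaking ties by submission order is an assumption.

```python
import heapq
import itertools
from enum import IntEnum


class SketchPriority(IntEnum):
    """Hypothetical priority levels; a lower value runs first."""
    HIGH = 0
    MEDIUM = 1
    LOW = 2


class SketchQueue:
    """Toy priority queue: orders by priority, then by submission order."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal priorities

    def submit(self, job_id, priority=SketchPriority.MEDIUM):
        heapq.heappush(self._heap, (priority, next(self._counter), job_id))

    def pop(self):
        # Return only the job id of the next job to run
        return heapq.heappop(self._heap)[2]


queue = SketchQueue()
queue.submit("experiment", SketchPriority.LOW)
queue.submit("training")  # defaults to MEDIUM
queue.submit("prod-fix", SketchPriority.HIGH)

order = [queue.pop() for _ in range(3)]
print(order)  # ['prod-fix', 'training', 'experiment']
```

The counter keeps jobs of equal priority in first-come, first-served order and avoids comparing job ids directly.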

### Fair Share
Fair share prevents any single user from monopolizing the GPUs:
- Tracks GPU-hours per user
- Caps each user at 2× their fair share
- Example: with 3 users, each user's fair share is ~33% of resources, so the cap is ~67%
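
The rule above can be sketched as a small check. This assumes that a user's share is measured against the total recorded GPU-hours and that every user has the same weight; `fair_share_report` is a hypothetical helper, not part of the scheduler.

```python
def fair_share_report(gpu_hours_by_user, cap_multiplier=2.0):
    """Map each user to (share of total usage, whether it exceeds the cap)."""
    total = sum(gpu_hours_by_user.values())
    if total == 0:
        return {user: (0.0, False) for user in gpu_hours_by_user}

    fair_share = 1.0 / len(gpu_hours_by_user)  # equal split across users
    cap = cap_multiplier * fair_share          # e.g. 2 x 33.3% = 66.7%
    return {
        user: (hours / total, hours / total > cap)
        for user, hours in gpu_hours_by_user.items()
    }


# Hypothetical usage figures for three users
report = fair_share_report({"u1": 10.0, "u2": 10.0, "u3": 80.0})
for user, (share, over_cap) in report.items():
    print(f"{user}: {share:.1%} {'OVER CAP' if over_cap else 'ok'}")
```

With three users the cap is two thirds of total usage, so only `u3` at 80% is flagged.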

In [1]:
import sys
sys.path.insert(0, '..')

# Demo mode - these would connect to real Redis/Celery in production
print("⚠️  Demo Mode: Code examples shown for educational purposes")
print("📦 In production, this requires Redis + Celery workers\n")

from src.scheduler.job_queue import JobConfig, Priority
from src.monitoring.logger import setup_logging
import time

setup_logging(level="INFO")
print("✓ Setup complete!")

⚠️  Demo Mode: Code examples shown for educational purposes
📦 In production, this requires Redis + Celery workers



2025-10-09 20:46:35 - root - INFO - Logging initialized at level INFO [logger.py:202]
✓ Setup complete!


## 1. Priority Demonstration

Submit three jobs with different priorities, lowest first, then check how the queue orders them.

In [5]:
from src.scheduler.job_queue import get_job_queue
from src.scheduler.priority_manager import get_priority_manager

job_queue = get_job_queue()
priority_manager = get_priority_manager()

# Define base configuration with timeout
base_config = {
    "user_id": "user-A",
    "job_type": "fine_tuning",
    "num_gpus": 1,
    "pool_type": "development",
    "is_preemptible": True,
    "model_name": "distilbert-base-uncased",
    "dataset_path": "./data/sample.csv",
    "config": {
        "max_steps": 10,  # Reduced steps for faster completion
        "max_train_time": 60,  # 60 seconds (1 minute) timeout
        "early_stopping": True
    }
}

# Submit all jobs at once
priorities = [
    ("LOW", Priority.LOW),
    ("MEDIUM", Priority.MEDIUM),
    ("HIGH", Priority.HIGH)
]

for priority_name, priority_level in priorities:
    job = JobConfig(
        job_id=f"{priority_name.lower()}-priority-001",
        output_dir=f"./output/{priority_name.lower()}",
        priority=priority_name,
        **base_config
    )
    job_queue.submit_job(job, priority_level)
    print(f"✓ {priority_name} priority job submitted (max 60s)")

2025-10-09 20:51:33 - src.scheduler.job_queue - INFO - Submitted job low-priority-001 to queue 'low_priority' (priority=LOW) [job_queue.py:340]
✓ LOW priority job submitted (max 60s)
2025-10-09 20:51:33 - src.scheduler.job_queue - INFO - Submitted job medium-priority-001 to queue 'default' (priority=MEDIUM) [job_queue.py:340]
✓ MEDIUM priority job submitted (max 60s)


KeyboardInterrupt: 

In [7]:
# Check queue order
print("\nQueue Positions (lower = runs first):")
for job_id in ["low-priority-001", "medium-priority-001", "high-priority-001"]:
    position = priority_manager.get_queue_position(job_id)
    print(f"  {job_id}: Position {position}")

print("\n⚠️  Notice: HIGH priority job is position 1, even though submitted last!")


Queue Positions (lower = runs first):
  low-priority-001: Position 3
  medium-priority-001: Position 2
  high-priority-001: Position 1

⚠️  Notice: HIGH priority job is position 1, even though submitted last!


## 2. Fair Share Demo

Simulate 3 users to show fair share in action.

In [8]:
# Simulate GPU usage for 3 users
users = ["alice", "bob", "charlie"]

# Alice uses 10 GPU-hours
priority_manager.record_job_completion(
    job_id="alice-job-1",
    user_id="alice",
    num_gpus=2,
    duration=18000  # 5 hours × 2 GPUs = 10 GPU-hours
)

# Bob uses 10 GPU-hours
priority_manager.record_job_completion(
    job_id="bob-job-1",
    user_id="bob",
    num_gpus=1,
    duration=36000  # 10 hours × 1 GPU = 10 GPU-hours
)

# Charlie uses 50 GPU-hours (GREEDY!)
priority_manager.record_job_completion(
    job_id="charlie-job-1",
    user_id="charlie",
    num_gpus=4,
    duration=45000  # 12.5 hours × 4 GPUs = 50 GPU-hours
)

print("GPU Usage Recorded:")
for user in users:
    stats = priority_manager.get_user_stats(user)
    if stats:
        print(f"  {user}: {stats.total_gpu_hours:.1f} GPU-hours")

GPU Usage Recorded:
  alice: 10.0 GPU-hours
  bob: 10.0 GPU-hours
  charlie: 50.0 GPU-hours
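
The recorded totals follow from multiplying the GPU count by the duration in hours. A minimal sketch of that arithmetic, assuming `duration` is given in seconds (as the comments in the cell above suggest); `gpu_hours` is a hypothetical helper, not the `priority_manager` API:

```python
def gpu_hours(num_gpus, duration_seconds):
    """Convert a job's GPU count and wall-clock duration into GPU-hours."""
    return num_gpus * duration_seconds / 3600.0


# Same inputs as the three record_job_completion calls above
jobs = [("alice", 2, 18000), ("bob", 1, 36000), ("charlie", 4, 45000)]
usage = {user: gpu_hours(gpus, seconds) for user, gpus, seconds in jobs}

for user, hours in usage.items():
    print(f"{user}: {hours:.1f} GPU-hours")
```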


In [9]:
# Check fair share limits
total_hours = sum(
    priority_manager.get_user_stats(u).total_gpu_hours 
    for u in users if priority_manager.get_user_stats(u)
)

print(f"\nTotal GPU-hours used: {total_hours:.1f}")
print(f"Fair share per user: {total_hours/3:.1f} (33.3%)")
print(f"Max allowed (2× fair share): {total_hours/3*2:.1f} (66.7%)\n")

for user in users:
    stats = priority_manager.get_user_stats(user)
    if stats:
        share_pct = (stats.total_gpu_hours / total_hours) * 100
        status = "✓ OK" if share_pct < 66.7 else "⚠️ OVER QUOTA"
        print(f"  {user}: {share_pct:.1f}% {status}")

print("\n⚠️  Charlie exceeded fair share! Next job may be delayed.")


Total GPU-hours used: 70.0
Fair share per user: 23.3 (33.3%)
Max allowed (2× fair share): 46.7 (66.7%)

  alice: 14.3% ✓ OK
  bob: 14.3% ✓ OK
  charlie: 71.4% ⚠️ OVER QUOTA

⚠️  Charlie exceeded fair share! Next job may be delayed.


## 3. Starvation Prevention

Jobs that wait longer than the starvation timeout (1 hour by default) are boosted to HIGH priority.

In [10]:
# Check starvation timeout
starvation_timeout = 3600  # 1 hour default

print(f"Starvation Prevention:")
print(f"  Timeout: {starvation_timeout}s (1 hour)")
print(f"  Action: Boost priority to HIGH")
print(f"\n  Example:")
print(f"  - LOW priority job waits 65 minutes")
print(f"  - System auto-boosts to HIGH priority")
print(f"  - Job moves to front of queue")
print(f"  - Prevents indefinite waiting")

Starvation Prevention:
  Timeout: 3600s (1 hour)
  Action: Boost priority to HIGH

  Example:
  - LOW priority job waits 65 minutes
  - System auto-boosts to HIGH priority
  - Job moves to front of queue
  - Prevents indefinite waiting
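
The boost rule can be sketched as a pure function of wait time. This is an illustration under stated assumptions, not the scheduler's code: `effective_priority` is a hypothetical helper, priorities are plain strings, and the boost applies strictly after the timeout has elapsed.

```python
import time

STARVATION_TIMEOUT_S = 3600  # 1 hour, matching the default shown above


def effective_priority(priority, submitted_at, now=None,
                       timeout_s=STARVATION_TIMEOUT_S):
    """Return "HIGH" once a job has waited longer than timeout_s."""
    now = time.time() if now is None else now
    waited = now - submitted_at
    return "HIGH" if waited > timeout_s else priority


now = 1_000_000.0  # fixed timestamp so the example is deterministic
boosted = effective_priority("LOW", submitted_at=now - 65 * 60, now=now)
unchanged = effective_priority("LOW", submitted_at=now - 10 * 60, now=now)

print(boosted)    # HIGH (waited 65 minutes)
print(unchanged)  # LOW  (waited 10 minutes)
```

Passing `now` explicitly keeps the function easy to test; a real scheduler would presumably re-evaluate waiting jobs periodically.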


## 4. Queue Management

In [11]:
# Get queue summary
summary = priority_manager.get_queue_summary()

print("Queue Summary:")
print(f"  Total jobs: {summary['total_jobs']}")
print(f"  Unique users: {summary['unique_users']}")
print(f"  Total GPUs requested: {summary['total_requested_gpus']}")
print(f"  Oldest job wait time: {summary['oldest_job_wait_time']:.0f}s\n")

print("Priority Breakdown:")
for priority, count in summary['priority_breakdown'].items():
    print(f"  {priority}: {count} jobs")

Queue Summary:
  Total jobs: 3
  Unique users: 1
  Total GPUs requested: 3
  Oldest job wait time: 539s

Priority Breakdown:
  HIGH: 1 jobs
  MEDIUM: 1 jobs
  LOW: 1 jobs


In [12]:
# Cancel a job
cancelled = priority_manager.cancel_job("low-priority-001")

if cancelled:
    print("✓ Job cancelled successfully")
    print("  Job removed from queue")
    print("  Resources freed for other jobs")
else:
    print("✗ Job not found or already running")

2025-10-09 20:56:11 - src.scheduler.priority_manager - INFO - Cancelled job low-priority-001 [priority_manager.py:339]
✓ Job cancelled successfully
  Job removed from queue
  Resources freed for other jobs


## 5. Custom Fair Share Quotas

Set a custom quota per user or team to weight their fair share.

In [13]:
# Set custom quotas
# Default quota = 1.0 (equal share)

# Premium user gets 2× resources
priority_manager.set_user_fair_share("premium-user", quota=2.0)

# Research team gets 1.5× resources  
priority_manager.set_user_fair_share("research-team", quota=1.5)

# Free tier gets 0.5× resources
priority_manager.set_user_fair_share("free-tier", quota=0.5)

print("Custom Quotas Set:")
print(f"  premium-user: 2.0× (gets 2× normal share)")
print(f"  research-team: 1.5× (gets 1.5× normal share)")
print(f"  free-tier: 0.5× (gets half normal share)")
print(f"\n  Example with 100 GPU-hours available:")
print(f"  - premium-user can use up to 50 hours")
print(f"  - research-team can use up to 37.5 hours")
print(f"  - free-tier can use up to 12.5 hours")

2025-10-09 20:56:19 - src.scheduler.priority_manager - INFO - Set fair share quota for user premium-user: 2.0 [priority_manager.py:447]
2025-10-09 20:56:19 - src.scheduler.priority_manager - INFO - Set fair share quota for user research-team: 1.5 [priority_manager.py:447]
2025-10-09 20:56:19 - src.scheduler.priority_manager - INFO - Set fair share quota for user free-tier: 0.5 [priority_manager.py:447]
Custom Quotas Set:
  premium-user: 2.0× (gets 2× normal share)
  research-team: 1.5× (gets 1.5× normal share)
  free-tier: 0.5× (gets half normal share)

  Example with 100 GPU-hours available:
  - premium-user can use up to 50 hours
  - research-team can use up to 37.5 hours
  - free-tier can use up to 12.5 hours
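
The figures in the example are consistent with normalizing the quota weights across the three users (2.0 + 1.5 + 0.5 = 4.0). A minimal sketch of that calculation follows. It assumes these three are the only users sharing the pool; `weighted_allocation` is a hypothetical helper, and how `priority_manager` applies quotas internally is not shown here.

```python
def weighted_allocation(quotas, available_gpu_hours):
    """Split available GPU-hours in proportion to each user's quota weight."""
    total_weight = sum(quotas.values())
    if total_weight <= 0:
        raise ValueError("quota weights must sum to a positive number")
    return {
        user: available_gpu_hours * weight / total_weight
        for user, weight in quotas.items()
    }


quotas = {"premium-user": 2.0, "research-team": 1.5, "free-tier": 0.5}
allocation = weighted_allocation(quotas, available_gpu_hours=100.0)

for user, hours in allocation.items():
    print(f"{user}: up to {hours:.1f} GPU-hours")
```

Adding a fourth user with the default quota of 1.0 would raise the total weight to 5.0 and shrink every allocation accordingly.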


## 6. Real-World Scenarios

### Scenario 1: Production Deadline

```python
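# Urgent production model needs HIGH priority
urgent_job = JobConfig(
    job_id="prod-urgent-001",
    output_dir="./output/prod-urgent",  # example path, adjust as needed
    priority="HIGH",
    **{
        **base_config,            # reuse the shared fields from section 1
        "is_preemptible": False,  # don't let other jobs interrupt this one
        "pool_type": "production",
    },
)
job_queue.submit_job(urgent_job, Priority.HIGH)
# → Queued ahead of waiting MEDIUM and LOW jobs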
```