# <font color="#418FDE" size="6.5" uppercase>**Greedy In Python**</font>

>Last update: 20260102.
    
By the end of this lecture, you will be able to:
- Use Python sorting and heapq to implement efficient greedy selections. 
- Translate high-level greedy strategies into clear, maintainable Python code. 
- Measure and interpret the performance of greedy implementations on realistic input sizes. 


## **1. Sorting for Greedy**

### **1.1. Custom Key Sorting**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_07/Lecture_B/image_01_01.jpg?v=1767342376" width="250">



>* Use custom keys to express greedy priorities
>* Keys make choices ordered, explicit, and adjustable

>* Combine multiple attributes into one sorting key
>* Easily switch greedy priorities by changing key

>* Simple key functions keep large sorts fast
>* Balance key clarity with efficiency for scalability
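
The points about combining attributes and switching priorities can be illustrated with a small sketch. The task records and field names below are hypothetical; the idea is that changing the greedy rule only means changing the key function, and that negating a numeric field is one common way to sort it in descending order.

```python
from operator import itemgetter

# Hypothetical task records, used only for this illustration.
tasks = [
    {"name": "A", "deadline": 3, "profit": 40},
    {"name": "B", "deadline": 1, "profit": 10},
    {"name": "C", "deadline": 1, "profit": 70},
]

# Priority 1: earliest deadline first, ties broken by name.
by_deadline = sorted(tasks, key=itemgetter("deadline", "name"))

# Priority 2: highest profit first, then earliest deadline.
# Negating a numeric field sorts that field in descending order.
def profit_then_deadline(task):
    return (-task["profit"], task["deadline"])

by_profit = sorted(tasks, key=profit_then_deadline)

print([t["name"] for t in by_deadline])  # ['B', 'C', 'A']
print([t["name"] for t in by_profit])    # ['C', 'A', 'B']
```

`itemgetter` is a reasonable choice when the key is a plain field lookup; a named function reads better once the key involves any arithmetic.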



In [None]:
#@title Python Code - Custom Key Sorting

# Demonstrate custom key sorting for greedy style task selection.
# Show sorting by earliest deadline then shortest duration using key functions.
# Compare default sorting with custom priority based sorting clearly.
# No pip install is needed; this script uses only the standard library.

# Define a simple list containing small task dictionaries for sorting demonstration.
tasks = [
    {"name": "Task A", "deadline_days": 3, "duration_hours": 5},
    {"name": "Task B", "deadline_days": 1, "duration_hours": 4},
    {"name": "Task C", "deadline_days": 1, "duration_hours": 2},
    {"name": "Task D", "deadline_days": 5, "duration_hours": 1},
]

# Print original unsorted tasks to observe initial ordering before applying sorting.
print("Original tasks order:")
for task in tasks:
    print(task)

# Define a key function expressing greedy priority using deadline then duration attributes.
def greedy_priority(task):
    return (task["deadline_days"], task["duration_hours"])

# Sort tasks using custom key showing earliest deadline then shortest duration ordering.
sorted_tasks = sorted(tasks, key=greedy_priority)

# Print sorted tasks to see greedy priority order clearly and understandably.
print("\nSorted by deadline then duration:")
for task in sorted_tasks:
    print(task)



### **1.2. Choosing sorted or sort**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_07/Lecture_B/image_01_02.jpg?v=1767342394" width="250">



>* Choose between a new sorted list (`sorted`) and an in-place sort (`list.sort`)
>* Preserve original data to compare greedy strategies

>* Use in-place sort when original order unnecessary
>* Avoids extra lists, helpful for large datasets

>* New sorted lists fit pipelines and readability
>* In-place sort supports staged, stateful greedy flows
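
One practical consequence of the distinction above is a common pitfall: `list.sort()` returns `None`. The sketch below uses hypothetical `(title, salary)` pairs to show both behaviors side by side.

```python
# Hypothetical (title, salary) pairs for illustration.
offers = [("JobA", 90000), ("JobB", 75000), ("JobC", 120000)]

# sorted() returns a new list and leaves `offers` untouched.
ranked = sorted(offers, key=lambda offer: offer[1], reverse=True)

# list.sort() reorders in place and returns None, so never assign its result.
working_copy = list(offers)
result = working_copy.sort(key=lambda offer: offer[1], reverse=True)

print(result)                  # None
print(working_copy == ranked)  # True
print(offers[0])               # ('JobA', 90000)
```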



In [None]:
#@title Python Code - Choosing sorted or sort

# Demonstrate choosing sorted or sort for greedy style selections.
# Show preserving original list versus modifying list in place.
# Print results to compare behaviors clearly and concisely.

# pip install commands are unnecessary because script uses only standard library.

# Create original list of job offers with salary and commute minutes.
jobs = [
    {"title": "JobA", "salary": 90000, "commute_minutes": 20},
    {"title": "JobB", "salary": 75000, "commute_minutes": 10},
    {"title": "JobC", "salary": 120000, "commute_minutes": 45},
]

# Use sorted to create new list ordered by highest salary first.
jobs_by_salary = sorted(jobs, key=lambda job: job["salary"], reverse=True)

# Use sorted again to create another list ordered by shortest commute first.
jobs_by_commute = sorted(jobs, key=lambda job: job["commute_minutes"])

# Show that original jobs list remains unchanged after using sorted.
print("Original jobs list remains unchanged:", [job["title"] for job in jobs])

# Show greedy view that prefers highest salary using sorted result.
print("Greedy by salary using sorted view:", [job["title"] for job in jobs_by_salary])

# Show greedy view that prefers shortest commute using another sorted view.
print("Greedy by commute using sorted view:", [job["title"] for job in jobs_by_commute])

# Now demonstrate in place sort when original order is not needed anymore.
jobs.sort(key=lambda job: job["salary"], reverse=True)

# Show that jobs list is now permanently ordered by salary for greedy scanning.
print("Jobs list after in place sort by salary:", [job["title"] for job in jobs])



### **1.3. Stable Sort Behavior**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_07/Lecture_B/image_01_03.jpg?v=1767342412" width="250">



>* Stable sort keeps equal-key items in order
>* This built-in stability supports correct greedy tie-breaking

>* Stable sort supports layered multi-key ordering
>* Ensures consistent tie-breaking for complex greedy rules

>* Stable sort preserves context like arrival order
>* Helps greedy choices stay fair, predictable, maintainable
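
Layered sorting is most useful when the keys need different directions and one of them cannot simply be negated, such as a string. The following sketch assumes hypothetical `(name, deadline, owner)` records; it sorts by the secondary key first and the primary key last, relying on stability to keep the earlier ordering within ties.

```python
# Hypothetical (name, deadline, owner) records, listed in arrival order.
jobs = [
    ("J1", 2, "dana"),
    ("J2", 1, "bo"),
    ("J3", 1, "cy"),
    ("J4", 2, "al"),
]

# Pass 1 (secondary key): owner name, descending.
step = sorted(jobs, key=lambda job: job[2], reverse=True)

# Pass 2 (primary key): deadline, ascending.
# Stability keeps the pass-1 order among jobs with equal deadlines.
ordered = sorted(step, key=lambda job: job[1])

print([job[0] for job in ordered])  # ['J3', 'J2', 'J1', 'J4']
```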



In [None]:
#@title Python Code - Stable Sort Behavior

# Demonstrate stable sorting behavior with simple job scheduling example.
# Show how earlier jobs keep order when primary sort keys are equal.
# Illustrate two-pass sorting where second sort preserves previous ordering.

# pip install commands are unnecessary because script uses only standard library.

# Define a list of jobs with deadline and processing time minutes.
jobs = [
    {"name": "Job A", "deadline": 2, "time": 30},
    {"name": "Job B", "deadline": 1, "time": 20},
    {"name": "Job C", "deadline": 1, "time": 10},
    {"name": "Job D", "deadline": 2, "time": 15},
]

# Print original job order before any sorting operations.
print("Original jobs order:")
for job in jobs:
    print(job)

# First sort by processing time minutes ascending using stable sort behavior.
jobs_sorted_time = sorted(jobs, key=lambda job: job["time"])

# Second sort by deadline days ascending preserving previous time ordering ties.
jobs_sorted_deadline_then_time = sorted(jobs_sorted_time, key=lambda job: job["deadline"])

# Print final order showing stable tie breaking by processing time minutes.
print("\nSorted by deadline then time:")
for job in jobs_sorted_deadline_then_time:
    print(job)



## **2. Greedy Heaps and Priority**

### **2.1. Heap Push and Pop**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_07/Lecture_B/image_02_01.jpg?v=1767342429" width="250">



>* Heap push adds new candidates with priorities
>* Heap pop returns current best choice for progress

>* Make heap priorities explicit, meaningful, and visible
>* Use clear structures, names, and comments for priorities

>* Control when pushes and pops happen during solving
>* Match code's heap operations to plain-language strategy
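
Beyond push and pop, two other `heapq` operations often appear in greedy code: building a heap from an existing list and peeking at the current best item. A minimal sketch, using hypothetical `(deadline_hours, task)` pairs:

```python
import heapq

# Hypothetical (deadline_hours, task) pairs.
candidates = [(4, "report"), (2, "emails"), (6, "slides")]

# heapify rearranges the list in place into a valid min-heap in linear time.
heapq.heapify(candidates)

# The smallest item is always at index 0, so peeking does not require a pop.
print(candidates[0])  # (2, 'emails')

# heappushpop pushes the new item, then pops and returns the smallest overall.
best = heapq.heappushpop(candidates, (1, "server bug"))
print(best)             # (1, 'server bug')
print(len(candidates))  # 3
```

Only index 0 is guaranteed to hold the minimum; the rest of the list satisfies the heap property but is not fully sorted.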



In [None]:
#@title Python Code - Heap Push and Pop

# Demonstrate heap push and pop for greedy task scheduling decisions.
# Show how earliest deadline tasks are always popped and processed first.
# Keep code beginner friendly with clear comments and printed explanations.

# pip install commands are unnecessary because heapq exists in Python standard library.

# Import heapq module for heap based priority queue operations.
import heapq

# Define simple tasks with names and deadlines measured in hours from now.
tasks = [
    (4, "Write project report"),
    (2, "Reply to urgent emails"),
    (6, "Prepare meeting slides"),
]

# Create empty list that will store heap items as (deadline, name) pairs.
heap = []

# Push each task into heap so smallest deadline always stays at heap front.
for deadline, name in tasks:
    heapq.heappush(heap, (deadline, name))

# Print current heap content to show internal ordering after all pushes.
print("Heap after pushes (deadline, task):", heap)

# Simulate arrival of new urgent task with even earlier deadline than others.
new_task = (1, "Fix critical server bug")

# Push new urgent task into heap so greedy choice can immediately see it.
heapq.heappush(heap, new_task)

# Print heap again to observe how new urgent task changes internal ordering.
print("Heap after new urgent push:", heap)

# Pop tasks one by one always getting smallest deadline first for greedy scheduling.
while heap:
    deadline, name = heapq.heappop(heap)
    print("Next chosen task:", name, "with deadline in", deadline, "hours")



### **2.2. Priority Selection Loops**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_07/Lecture_B/image_02_02.jpg?v=1767342445" width="250">



>* Heap loop repeatedly picks current best option
>* Select, act, update, repeat to express strategy

>* Initialize heap, repeatedly pick best remaining candidate
>* Update state, push new items, keep logic separated

>* Real examples show heaps guiding greedy choices
>* Clear loops reveal how priorities drive decisions
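
The select, act, update, repeat pattern can be sketched on a classic problem that is not part of this lecture's examples: repeatedly merging the two shortest pieces, where each merge costs the combined length. The function name and inputs are illustrative assumptions.

```python
import heapq

def min_total_merge_cost(lengths):
    """Greedily merge the two shortest pieces until at most one remains."""
    heap = list(lengths)
    heapq.heapify(heap)
    total_cost = 0
    while len(heap) > 1:
        # Select: the two cheapest candidates.
        first = heapq.heappop(heap)
        second = heapq.heappop(heap)
        # Act: merging costs the combined length.
        merged = first + second
        total_cost += merged
        # Update: the merged piece becomes a new candidate.
        heapq.heappush(heap, merged)
    return total_cost

print(min_total_merge_cost([4, 3, 2, 6]))  # 29
```

The key point is that the loop pushes new candidates back onto the heap, which a single up-front sort cannot handle.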



In [None]:
#@title Python Code - Priority Selection Loops

# Demonstrates a greedy priority selection loop using a simple delivery scheduling example.
# Uses heapq to always pick the next most profitable feasible delivery first.
# Shows clear select act update repeat structure for maintainable greedy code.

# pip install commands are not required because heapq is included by default.

# Import heapq for priority queue style heap operations.
import heapq

# Each delivery is a (start, end, profit) tuple: it can be picked up from `start` onward
# and keeps the driver busy until `end`.
Delivery = tuple[int, int, int]

# Create a small list of deliveries with start, end, and profit values.
deliveries: list[Delivery] = [
    (1, 3, 50),
    (2, 5, 60),
    (4, 7, 120),
    (6, 9, 80),
]

# Sort deliveries by start time to easily find newly feasible deliveries.
deliveries.sort(key=lambda job: job[0])

# Initialize current time and total profit for the delivery route.
current_time: int = 0

total_profit: int = 0

# Initialize an empty heap that will store available deliveries by negative profit.
available_heap: list[tuple[int, int, int]] = []

# Index tracks which deliveries have become available based on current time.
index: int = 0

# Print header explaining the upcoming greedy selection steps clearly.
print("Step, chosen_delivery, current_time, total_profit")

# Loop while there are unprocessed deliveries or available candidates in the heap.
step: int = 1
while index < len(deliveries) or available_heap:

    # Push newly available deliveries whose start time is not greater than current time.
    while index < len(deliveries) and deliveries[index][0] <= current_time:
        start, end, profit = deliveries[index]
        heapq.heappush(available_heap, (-profit, end, start))
        index += 1

    # If no deliveries are available, advance time to next delivery start time.
    if not available_heap and index < len(deliveries):
        current_time = deliveries[index][0]
        continue

    # Pop the most profitable available delivery from the heap for greedy selection.
    profit_neg, end, start = heapq.heappop(available_heap)

    # If delivery end time is before current time, skip because it is infeasible.
    if end < current_time:
        continue

    # Update current time and total profit after accepting the chosen delivery.
    current_time = end
    total_profit += -profit_neg

    # Print the step details showing the greedy decision and updated state.
    print(f"{step}, ({start}-{end}, ${-profit_neg}), {current_time}, ${total_profit}")
    step += 1

# Print final summary of total profit earned by the greedy delivery schedule.
print(f"Finished route with total profit ${total_profit} and final time {current_time}.")



### **2.3. Simulating Priority Queues**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_07/Lecture_B/image_02_03.jpg?v=1767342463" width="250">



>* Greedy choices naturally map to priority queues
>* Separate priority-queue logic from low-level heap details

>* Decide what priority value each item uses
>* Store priority with related data in simple records

>* Priority queue loop drives all greedy decisions
>* This structure keeps code modular, clear, adaptable
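
One way to separate priority-queue logic from heap details is a small wrapper class. `MaxPriorityQueue` below is a hypothetical helper, not a standard library class; it assumes that a higher number means higher priority and that earlier arrivals should win ties.

```python
import heapq
import itertools

class MaxPriorityQueue:
    """Hypothetical wrapper: highest priority first, earlier arrivals win ties."""

    def __init__(self):
        self._heap = []
        self._arrivals = itertools.count()

    def push(self, item, priority):
        # Negate priority because heapq is a min-heap; the counter breaks ties
        # and prevents Python from ever comparing the items themselves.
        heapq.heappush(self._heap, (-priority, next(self._arrivals), item))

    def pop(self):
        neg_priority, _, item = heapq.heappop(self._heap)
        return item, -neg_priority

    def __len__(self):
        return len(self._heap)

queue = MaxPriorityQueue()
queue.push("Alice", 5)
queue.push("Bob", 8)
queue.push("Carlos", 8)

order = []
while queue:
    order.append(queue.pop())
print(order)  # [('Bob', 8), ('Carlos', 8), ('Alice', 5)]
```

Callers only see `push` and `pop`, so the negation trick and the tie-breaking counter stay in one place.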



In [None]:
#@title Python Code - Simulating Priority Queues

# Demonstrate simulating priority queues using heapq in a simple greedy scenario.
# Show patients entering triage with severity priorities and being treated in order.
# Emphasize separating priority meaning from underlying heap implementation details.
# No pip install is needed; this script uses only the standard library.

# Import heapq module for using a list as a binary heap.
import heapq

# Define a simple Patient tuple like structure using regular Python tuples.
# Each entry stores negative severity, arrival index, and patient name together.

# Create a list that will act as our priority queue heap.
priority_queue_heap = []

# Track arrival order index so earlier arrivals break ties consistently.
arrival_counter_index = 0

# Helper function pushes new patient into the simulated priority queue heap.
def add_patient_to_queue(name, severity):
    global arrival_counter_index
    # Use negative severity because heapq implements a min heap structure.
    entry = (-severity, arrival_counter_index, name)
    # Push the new entry into the heap based priority queue.
    heapq.heappush(priority_queue_heap, entry)
    # Increment arrival index so next patient has a larger arrival value.
    arrival_counter_index += 1

# Helper function pops next best patient based on highest severity priority.
def treat_next_patient_from_queue():
    # If heap is empty, return message indicating no patients remain.
    if not priority_queue_heap:
        return "No patients waiting now."
    # Pop best entry which has smallest tuple, meaning highest severity value.
    severity_negated, arrival_index, name = heapq.heappop(priority_queue_heap)
    # Convert negative severity back to positive for human friendly printing.
    severity_positive = -severity_negated
    # Return formatted string describing which patient is being treated now.
    return f"Treating {name} with severity {severity_positive}."

# Add several initial patients representing current triage room state.
add_patient_to_queue("Alice", 5)
add_patient_to_queue("Bob", 2)
add_patient_to_queue("Carlos", 8)

# Print current action showing first greedy choice from priority queue.
print(treat_next_patient_from_queue())

# New patient arrives with very high severity, simulate dynamic queue update.
add_patient_to_queue("Diana", 10)

# Another patient arrives with moderate severity, added after Diana arrival.
add_patient_to_queue("Evan", 4)

# Continue treating patients, always selecting best available from priority queue.
print(treat_next_patient_from_queue())
print(treat_next_patient_from_queue())
print(treat_next_patient_from_queue())

# Final call treats the last remaining patient (Bob); a further call would report an empty queue.
print(treat_next_patient_from_queue())



## **3. Greedy Performance Analysis**

### **3.1. Benchmarking Greedy Implementations**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_07/Lecture_B/image_03_01.jpg?v=1767342482" width="250">



>* Design experiments across realistic input sizes
>* Measure time and memory to study scaling

>* Control environment and record all benchmark settings
>* Repeat runs, handle warm-up, summarize with statistics

>* Plot runtime versus input size to spot scaling
>* Compare results to real-world constraints and requirements
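
The points about repeated runs, summary statistics, and memory can be sketched with the standard `timeit`, `statistics`, and `tracemalloc` modules. The workload function, seed, and sizes below are assumptions chosen for illustration, and the measured numbers will differ by machine.

```python
import random
import statistics
import timeit
import tracemalloc

def pick_non_overlapping(intervals):
    """Earliest-finish-first greedy; a stand-in workload for this sketch."""
    chosen, current_end = [], float("-inf")
    for start, end in sorted(intervals, key=lambda pair: pair[1]):
        if start >= current_end:
            chosen.append((start, end))
            current_end = end
    return chosen

random.seed(0)  # fixed seed so repeated benchmark runs see the same data
starts = [random.randint(0, 10_000) for _ in range(2_000)]
data = [(s, s + random.randint(1, 10)) for s in starts]

# Five timed repeats of three calls each; report per-call figures.
runs = timeit.repeat(lambda: pick_non_overlapping(data), repeat=5, number=3)
per_call = [total / 3 for total in runs]
print(f"min {min(per_call):.6f}s, median {statistics.median(per_call):.6f}s")

# Peak memory allocated by Python objects during one call.
tracemalloc.start()
pick_non_overlapping(data)
_, peak_bytes = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"peak traced memory: {peak_bytes / 1024:.1f} KiB")
```

Reporting the minimum alongside the median is a common convention: the minimum approximates the best case with the least interference, while the median shows typical behavior.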



In [None]:
#@title Python Code - Benchmarking Greedy Implementations

# Demonstrate benchmarking a simple greedy algorithm with increasing input sizes.
# Measure runtime for a greedy interval scheduling implementation using random intervals.
# Show how runtime grows as the number of intervals increases.
# Requires matplotlib for the plot (pip install matplotlib).

# Import required standard libraries for timing and random data generation.
import time
import random
import statistics

# Import matplotlib for simple runtime visualization using a line plot.
import matplotlib.pyplot as plt

# Define a simple greedy interval scheduling algorithm using sorting by end time.
def greedy_interval_scheduling(intervals):
    # Sort intervals by their end times to enable greedy earliest finish selection.
    intervals_sorted = sorted(intervals, key=lambda x: x[1])
    # Initialize selected intervals list and track the current end time.
    selected = []
    current_end = float('-inf')
    # Iterate through sorted intervals and greedily select non overlapping intervals.
    for start, end in intervals_sorted:
        if start >= current_end:
            selected.append((start, end))
            current_end = end
    return selected

# Generate random intervals with lengths between one and ten minutes for benchmarking.
def generate_random_intervals(count):
    # Create intervals with random start times and random durations in minutes.
    intervals = []
    for _ in range(count):
        start = random.randint(0, 10000)
        duration = random.randint(1, 10)
        intervals.append((start, start + duration))
    return intervals

# Benchmark the greedy algorithm for several input sizes with repeated runs for stability.
def benchmark_greedy(sizes, repeats):
    # Store average runtimes for each input size in a list for plotting.
    avg_times = []
    for n in sizes:
        times = []
        for _ in range(repeats):
            intervals = generate_random_intervals(n)
            start_time = time.perf_counter()
            greedy_interval_scheduling(intervals)
            end_time = time.perf_counter()
            times.append(end_time - start_time)
        avg_times.append(statistics.mean(times))
    return avg_times

# Define input sizes representing different realistic workloads for the greedy algorithm.
input_sizes = [1000, 3000, 5000, 7000, 9000]
# Run the benchmark with several repeats per size to smooth timing noise.
average_runtimes = benchmark_greedy(input_sizes, repeats=5)

# Print a concise summary table showing size and corresponding average runtime seconds.
print("Intervals\tAverage runtime seconds")
for n, t in zip(input_sizes, average_runtimes):
    print(f"{n}\t{t:.6f}")

# Plot runtime versus input size to visually inspect how performance scales.
plt.figure(figsize=(6, 4))
plt.plot(input_sizes, average_runtimes, marker='o')
plt.xlabel('Number of intervals scheduled')
plt.ylabel('Average runtime seconds')
plt.title('Greedy interval scheduling benchmark results')
plt.grid(True)
plt.tight_layout()
plt.show()



### **3.2. Profiling Greedy Hotspots**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_07/Lecture_B/image_03_02.jpg?v=1767342505" width="250">



>* Profiling measures where greedy code spends time
>* Helps find repeated operations that dominate runtime

>* Link profiler data to greedy loop structure
>* Watch for cheap operations that add up when run once per item

>* Use profiling to target the real bottlenecks
>* Iterate optimizations to build scalable, robust greedies
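
To link profiler output back to the loop, it can help to capture the report as text and sort by time spent inside each function itself. The sketch below assumes a deliberately inefficient, hypothetical function `scan_for_minimum`; the exact timings and row formatting depend on the Python version and machine.

```python
import cProfile
import io
import pstats

def scan_for_minimum(values):
    """Deliberately repeated linear scans: the expected hotspot."""
    remaining = list(values)
    ordered = []
    while remaining:
        smallest = min(remaining)
        remaining.remove(smallest)
        ordered.append(smallest)
    return ordered

profiler = cProfile.Profile()
profiler.enable()
scan_for_minimum(range(1500, 0, -1))
profiler.disable()

# Send the report to a string buffer, sorted by time spent inside each function.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer).strip_dirs().sort_stats("tottime")
stats.print_stats(5)
report = buffer.getvalue()
print(report)
```

In a report like this, the rows for `min` and `list.remove` should account for most of the time, which points at the per-iteration scans rather than the loop itself.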



In [None]:
#@title Python Code - Profiling Greedy Hotspots

# Demonstrate profiling greedy hotspots using a simple interval scheduling example.
# Compare slow linear search selection with faster heap based selection using cProfile.
# Show which functions become hotspots when processing many random intervals.

# Optional: pip install line_profiler memory_profiler for line-level and memory profiling (not used below).

# Import required standard library modules for random intervals and profiling.
import random
import cProfile
import pstats

# Define a function generating random intervals representing jobs with start and end times.
def generate_intervals(count, max_start, max_length):
    intervals = []
    for _ in range(count):
        start = random.randint(0, max_start)
        length = random.randint(1, max_length)
        end = start + length
        intervals.append((start, end))
    return intervals

# Define a slow greedy scheduler using repeated linear scans for next compatible interval.
def greedy_schedule_slow(intervals):
    selected = []
    current_end = -1
    remaining = sorted(intervals, key=lambda x: x[1])
    while remaining:
        best_index = None
        best_end = float('inf')
        for i, (start, end) in enumerate(remaining):
            if start >= current_end and end < best_end:
                best_end = end
                best_index = i
        if best_index is None:
            break
        chosen = remaining.pop(best_index)
        selected.append(chosen)
        current_end = chosen[1]
    return selected

# Import heapq for faster greedy selection using a priority queue based structure.
import heapq

# Define a faster greedy scheduler using a heap for selecting earliest finishing intervals.
def greedy_schedule_fast(intervals):
    heap = []
    for start, end in intervals:
        heapq.heappush(heap, (end, start))
    selected = []
    current_end = -1
    while heap:
        end, start = heapq.heappop(heap)
        if start >= current_end:
            selected.append((start, end))
            current_end = end
    return selected

# Define a wrapper running both greedy versions to create realistic profiling workload.
def run_schedulers():
    intervals = generate_intervals(count=2000, max_start=5000, max_length=200)
    slow_result = greedy_schedule_slow(intervals)
    fast_result = greedy_schedule_fast(intervals)
    assert len(slow_result) == len(fast_result)

# Profile the workload and print top functions by cumulative time to reveal hotspots.
profiler = cProfile.Profile()
profiler.enable()
run_schedulers()
profiler.disable()
stats = pstats.Stats(profiler).strip_dirs().sort_stats('cumulative')
stats.print_stats(8)



### **3.3. Comparing to naive solutions**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_07/Lecture_B/image_03_03.jpg?v=1767342524" width="250">



>* Compare greedy code to a simple baseline
>* Measure how runtimes diverge as input grows

>* Scale input sizes to reveal runtime growth
>* Compare timings to see real-world performance impact

>* Relate speed differences to algorithms and data
>* Use patterns to judge tradeoffs and refine implementations
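
A timing comparison is only meaningful if both implementations return the same answer. A minimal sketch of such a check, assuming hypothetical function names and using `heapq.nsmallest` as the reference:

```python
import heapq
import random

def k_smallest_by_scanning(values, k):
    """Baseline: remove the minimum up to k times with full scans."""
    pool = list(values)
    picked = []
    for _ in range(min(k, len(pool))):
        smallest = min(pool)
        pool.remove(smallest)
        picked.append(smallest)
    return picked

def k_smallest_by_heap(values, k):
    """Heap-based version: one heapify, then up to k pops."""
    pool = list(values)
    heapq.heapify(pool)
    return [heapq.heappop(pool) for _ in range(min(k, len(pool)))]

random.seed(42)  # fixed seed so the check is reproducible
for size in (0, 1, 10, 500):
    sample = [random.randint(0, 1_000) for _ in range(size)]
    expected = heapq.nsmallest(7, sample)
    assert k_smallest_by_scanning(sample, 7) == expected
    assert k_smallest_by_heap(sample, 7) == expected
print("Both implementations agree with heapq.nsmallest.")
```

Including edge sizes such as 0 and 1 catches cases where `k` exceeds the input length, which both sketches guard against with `min(k, len(pool))`.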



In [None]:
#@title Python Code - Comparing to naive solutions

# Demonstrate comparing greedy and naive solutions performance clearly.
# Show runtime difference for selecting smallest elements repeatedly.
# Use simple prints suitable for beginners in Google Colab.

# pip install commands are unnecessary because we use only standard libraries.

# Import required standard library modules for timing and random data.
import time
import random

# Define a naive function repeatedly scanning list for smallest element.
def naive_select_k_smallest(numbers, k):
    result = []
    data = numbers.copy()
    for _ in range(k):
        smallest_index = 0
        for i in range(1, len(data)):
            if data[i] < data[smallest_index]:
                smallest_index = i
        result.append(data.pop(smallest_index))
    return result

# Define a greedy function using heapq for efficient smallest selection.
import heapq

def greedy_select_k_smallest(numbers, k):
    data = numbers.copy()
    heapq.heapify(data)
    result = [heapq.heappop(data) for _ in range(k)]
    return result

# Define a helper function for timing another function with given arguments.
def time_function(func, numbers, k):
    start = time.perf_counter()
    func(numbers, k)
    end = time.perf_counter()
    return end - start

# Prepare different input sizes to compare growth of runtimes clearly.
input_sizes = [1_000, 5_000, 10_000]

# Choose how many smallest elements we want to select each time.
k = 50

# Print header describing what the following timing table represents.
print("Comparing naive and greedy runtimes for selecting", k, "smallest values.")

# Loop over sizes, generate data, and measure both implementations.
for n in input_sizes:
    numbers = [random.randint(0, 1_000_000) for _ in range(n)]
    naive_time = time_function(naive_select_k_smallest, numbers, k)
    greedy_time = time_function(greedy_select_k_smallest, numbers, k)
    print(f"Size {n:6d} items -> naive {naive_time:7.4f}s, greedy {greedy_time:7.4f}s")



# <font color="#418FDE" size="6.5" uppercase>**Greedy In Python**</font>


In this lecture, you learned to:
- Use Python sorting and heapq to implement efficient greedy selections. 
- Translate high-level greedy strategies into clear, maintainable Python code. 
- Measure and interpret the performance of greedy implementations on realistic input sizes. 

In the next module (Module 8), we will cover 'Dynamic Programming'.