
# DRF Fairness — Tiny Simulator (Notebook)
This notebook demonstrates **Dominant Resource Fairness (DRF)** inside a single pool (e.g., Batch).
You can tweak pool size, executor size, starting allocations, and ceilings, then simulate how the next N executors would be allocated.

**What you'll see:**
- Final allocation per user (CPU, Memory, dominant share, executor count)
- Step-by-step log of who received each executor (with dominant share before/after)



> **Notes:**  
> - Only the Python standard library and `pandas` are used.
> - DRF here operates on **actual CPU and Memory amounts**, not just executor counts.
> - This is illustrative; a production scheduler also considers queue order, job priority, and SLOs.
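
In symbols (the notation is introduced here for illustration only and is not used elsewhere in the notebook), a user $i$ holding $c_i$ vCPU and $m_i$ GB of memory in a pool with totals $C$ and $M$ has dominant share

$$
s_i = \max\left(\frac{c_i}{C},\ \frac{m_i}{M}\right),
$$

and each new executor goes to the eligible user with the smallest $s_i$.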


In [None]:

from dataclasses import dataclass, field
from typing import List, Dict, Tuple
import pandas as pd

@dataclass
class Pool:
    name: str
    total_cpu: float          # total vCPU in the pool
    total_mem_gb: float       # total GB RAM in the pool
    exec_cpu: float           # vCPU per executor
    exec_mem_gb: float        # GB RAM per executor

@dataclass
class UserState:
    name: str
    cpu: float                # current allocated CPU
    mem_gb: float             # current allocated Memory
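    max_executors: int        # ceiling on executors granted during the simulation
    executors: int = field(default=0)  # executors granted so far (starting cpu/mem_gb are not counted)

    def can_receive(self, pool: Pool) -> bool:
        # Only the per-user ceiling is checked; free capacity in the pool is not.
        return self.executors < self.max_executors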

    def allocate_one(self, pool: Pool):
        self.cpu += pool.exec_cpu
        self.mem_gb += pool.exec_mem_gb
        self.executors += 1

def dominant_share(user: UserState, pool: Pool) -> float:
    cpu_share = user.cpu / pool.total_cpu if pool.total_cpu else 0.0
    mem_share = user.mem_gb / pool.total_mem_gb if pool.total_mem_gb else 0.0
    return max(cpu_share, mem_share)

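def drf_allocate(pool: Pool, users: List[UserState], extra_executors: int, verbose: bool = False) -> Tuple[List[UserState], List[Dict]]:
    """Hand out `extra_executors` one at a time, each to the eligible user with the lowest dominant share.

    Mutates `users` in place and returns them together with a per-step history.
    """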
    history = []
    for step in range(extra_executors):
        eligible = [u for u in users if u.can_receive(pool)]
        if not eligible:
            if verbose:
                print(f"[step {step + 1}] No eligible users left (ceilings reached).")
            break

        shares = [(u, dominant_share(u, pool)) for u in eligible]
        shares.sort(key=lambda x: (round(x[1], 12), x[0].name))  # lowest dominant share wins (tie-break by name)
        chosen, share_before = shares[0]

        chosen.allocate_one(pool)
        share_after = dominant_share(chosen, pool)

        history.append({
            "step": step + 1,
            "allocated_to": chosen.name,
            "dominant_share_before": round(share_before, 4),
            "dominant_share_after": round(share_after, 4),
            "total_execs_for_user": chosen.executors
        })

        if verbose:
            print(f"[step {step+1}] -> {chosen.name}: {share_before:.4f} -> {share_after:.4f} (execs={chosen.executors})")

    return users, history

def results_dataframe(users: List[UserState], pool: Pool) -> pd.DataFrame:
    rows = []
    for u in users:
        rows.append({
            "user": u.name,
            "cpu_alloc": u.cpu,
            "mem_alloc_gb": u.mem_gb,
            "dominant_share": round(dominant_share(u, pool), 4),
            "executors": u.executors,
            "ceiling_execs": u.max_executors
        })
    df = pd.DataFrame(rows).sort_values(by=["dominant_share","user"]).reset_index(drop=True)
    return df
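
The sort key in `drf_allocate` rounds each dominant share to 12 decimal places before comparing. The stand-alone sketch below shows why that matters. It uses two hypothetical users (`userA`, `userB`) whose shares are mathematically equal, with one value carrying floating-point noise. Shares computed by division in the simulator may or may not show such noise, so treat this as an illustration of the guard rather than a reproduction of a specific run.

```python
# Two users whose dominant shares are mathematically equal (0.3),
# but one value picked up float noise: 0.1 + 0.2 == 0.30000000000000004.
tied_shares = [("userB", 0.3), ("userA", 0.1 + 0.2)]

# Without rounding, the noise decides the order.
naive = sorted(tied_shares, key=lambda x: (x[1], x[0]))

# With rounding (as in drf_allocate), the shares tie and the name decides.
rounded = sorted(tied_shares, key=lambda x: (round(x[1], 12), x[0]))

naive_winner = naive[0][0]      # "userB": wins only because of float noise
rounded_winner = rounded[0][0]  # "userA": a true tie, broken alphabetically
print(naive_winner, rounded_winner)
```

Rounding keeps the tie-break deterministic and tied to the user name rather than to the order of arithmetic operations.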



## Configure a scenario
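Edit the values below and re-run this cell to try different pool sizes, executor sizes, starting allocations, ceilings, and numbers of new executors to distribute. Because `drf_allocate` updates the `users` list in place, re-run this cell before each new allocation run so the simulation starts from the configured state.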


In [None]:

# --- Pool setup (example: Batch pool) ---
pool = Pool(
    name="batch",
    total_cpu=200,       # total vCPU in the pool
    total_mem_gb=800,    # total GB RAM in the pool
    exec_cpu=2,          # vCPU per executor
    exec_mem_gb=16       # GB RAM per executor
)

# --- Users starting state ---
users = [
    # name, starting CPU, starting MEM (GB), ceiling on executors
    UserState(name="TeamA_CPUheavy", cpu=60, mem_gb=120, max_executors=30),
    UserState(name="TeamB_MEMheavy", cpu=40, mem_gb=400, max_executors=30),
    UserState(name="TeamC_Light",    cpu=20, mem_gb=40,  max_executors=30),
]

# --- How many new executors to allocate using DRF ---
extra_executors = 15
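
The simulator enforces per-user ceilings but never checks whether the pool still has free CPU or memory, so it is worth confirming that `extra_executors` actually fits. The helper below is a minimal sketch of that check. `executors_that_fit` is a hypothetical name, not part of the notebook's code, and it assumes executor sizes are positive.

```python
def executors_that_fit(total_cpu, total_mem_gb, used_cpu, used_mem_gb, exec_cpu, exec_mem_gb):
    """Return how many more executors fit before CPU or memory runs out."""
    free_cpu = max(total_cpu - used_cpu, 0)
    free_mem = max(total_mem_gb - used_mem_gb, 0)
    # The scarcer resource limits the count.
    return int(min(free_cpu // exec_cpu, free_mem // exec_mem_gb))

# Values copied from the scenario above: users start at 60/40/20 vCPU and 120/400/40 GB.
fit = executors_that_fit(
    total_cpu=200, total_mem_gb=800,
    used_cpu=60 + 40 + 20, used_mem_gb=120 + 400 + 40,
    exec_cpu=2, exec_mem_gb=16,
)
print(fit)  # 15: memory binds (240 GB free / 16 GB each); CPU alone would allow 40
```

With the default scenario, `extra_executors = 15` fills the pool's memory exactly; a larger value would let the simulation allocate more than the pool holds.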



## Run the DRF allocation


In [None]:

users_after, history = drf_allocate(pool, users, extra_executors=extra_executors, verbose=False)
final_df = results_dataframe(users_after, pool)
hist_df = pd.DataFrame(history)

print("=== DRF Final Allocation per User ===")
display(final_df)

print("\n=== DRF Allocation Steps (who got each executor) ===")
display(hist_df)



### How to read the results
- **dominant_share** is the max of CPU share and Memory share per user (e.g., 0.30 = 30% of that resource in the pool).
- The **next executor** is always given to the user with the **lowest** dominant share (subject to their ceiling).
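- If you increase `extra_executors`, the users with the lowest dominant shares catch up first (ceilings permitting), so shares converge from below; a user who starts well above the others receives nothing until the rest reach that level.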
- Lower a user's `max_executors` to see how **ceilings** cap growth even if they have the lowest share.
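
The bullets above can be checked with a compact, stand-alone version of the same greedy rule. This is a sketch over plain dicts, not the notebook's own classes; `simulate`, `demo_pool`, `demo_users`, and the `ceilings` argument are names introduced for illustration. The numbers are the starting values from the configured scenario.

```python
def simulate(pool, users, n, ceilings=None):
    """Greedy DRF: give each of n executors to the eligible user with the lowest dominant share."""
    ceilings = ceilings or {}
    state = {name: dict(alloc) for name, alloc in users.items()}  # copy, so inputs stay untouched
    granted = {name: 0 for name in state}

    def dom(name):
        u = state[name]
        return max(u["cpu"] / pool["cpu"], u["mem"] / pool["mem"])

    for _ in range(n):
        # A user without an explicit ceiling may take up to all n executors.
        eligible = [name for name in state if granted[name] < ceilings.get(name, n)]
        if not eligible:
            break
        winner = min(eligible, key=lambda name: (round(dom(name), 12), name))
        state[winner]["cpu"] += pool["exec_cpu"]
        state[winner]["mem"] += pool["exec_mem"]
        granted[winner] += 1

    return granted, {name: round(dom(name), 4) for name in state}

demo_pool = {"cpu": 200, "mem": 800, "exec_cpu": 2, "exec_mem": 16}
demo_users = {
    "TeamA_CPUheavy": {"cpu": 60, "mem": 120},   # dominant share 0.30 (CPU)
    "TeamB_MEMheavy": {"cpu": 40, "mem": 400},   # dominant share 0.50 (memory)
    "TeamC_Light":    {"cpu": 20, "mem": 40},    # dominant share 0.10 (CPU)
}

granted, shares = simulate(demo_pool, demo_users, n=15)
print(granted)  # {'TeamA_CPUheavy': 2, 'TeamB_MEMheavy': 0, 'TeamC_Light': 13}
print(shares)   # {'TeamA_CPUheavy': 0.32, 'TeamB_MEMheavy': 0.5, 'TeamC_Light': 0.31}

# Capping TeamC at 5 executors redirects the remaining 10 to TeamA.
capped, capped_shares = simulate(demo_pool, demo_users, n=15, ceilings={"TeamC_Light": 5})
print(capped)   # {'TeamA_CPUheavy': 10, 'TeamB_MEMheavy': 0, 'TeamC_Light': 5}
```

In the uncapped run, TeamC receives the first 13 executors and ends just above TeamA's starting share, after which TeamA takes the last two. TeamB, already at 0.50, receives nothing. In the capped run, TeamC stops at its ceiling while still holding the lowest share (0.15).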
