# Advanced Example: Reward-Based CTMC Analysis

This example demonstrates the `setReward` feature (`set_reward` in the Python API used here) for defining custom reward functions on a queueing network model and computing their steady-state expected values with the CTMC solver.

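The model is a simple closed queueing network with a delay station and a single-server queue. We define three reward functions on it:
- Queue length: the number of jobs at the queue
- Utilization: 1 if the server is busy, 0 if it is idle
- Quadratic queue cost: the square of the queue length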

In [1]:
from line_solver import *
GlobalConstants.set_verbose(VerboseLevel.STD)

In [2]:
def create_reward_model():
    """
    Create a simple closed queueing network with reward functions.
    
    The model has:
    - A delay station (think time)
    - A single-server FCFS queue
    - N=5 jobs circulating in the system
    """
    model = Network('RewardExample')
    
    # Block 1: nodes
    delay = Delay(model, 'Delay')
    queue = Queue(model, 'Queue', SchedStrategy.FCFS)
    queue.set_number_of_servers(1)
    
    # Block 2: job classes (closed class with 5 jobs)
    cclass = ClosedClass(model, 'Class1', 5, delay)
    delay.set_service(cclass, Exp(1.0))   # Think time: rate 1 (mean 1)
    queue.set_service(cclass, Exp(2.0))   # Service: rate 2 (mean 0.5)
    
    # Block 3: topology
    model.add_link(delay, queue)
    model.add_link(queue, delay)
    
    # Define reward functions
    # set_reward(name, lambda state, sn: reward_value)
    # The function receives the aggregated state vector (state) and the network structure (sn).
    # State format: state is a Matrix row [jobs_at_delay, jobs_at_queue],
    # so state.get(0, 1) is the number of jobs at the queue.
    
    # Reward 1: Queue length (number of jobs in the queue)
    model.set_reward('QueueLength', lambda state, sn: state.get(0, 1))
    
    # Reward 2: Utilization (1 if server busy, 0 if idle)
    model.set_reward('Utilization', lambda state, sn: min(state.get(0, 1), 1.0))
    
    # Reward 3: Quadratic queue cost (penalizes long queues)
    model.set_reward('QueueCost', lambda state, sn: state.get(0, 1) ** 2)
    
    return model

# Create the model
model = create_reward_model()
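
To make the state layout described in the comments concrete, the sketch below evaluates the same three reward lambdas on every aggregated state of the model without calling the solver. `StateRow` is a hypothetical stand-in that only mimics the `get(row, col)` accessor used above; it is not the library's Matrix class, and the real object may behave differently.

```python
class StateRow:
    """Hypothetical stand-in for the solver's Matrix row; only get(row, col) is mimicked."""

    def __init__(self, values):
        self._values = list(values)

    def get(self, row, col):
        # A single-row matrix: only row 0 exists.
        assert row == 0
        return self._values[col]


# The same three reward lambdas used in create_reward_model (sn is unused here).
reward_fns = {
    'QueueLength': lambda state, sn: state.get(0, 1),
    'Utilization': lambda state, sn: min(state.get(0, 1), 1.0),
    'QueueCost': lambda state, sn: state.get(0, 1) ** 2,
}

N = 5  # jobs circulating in the closed network

# Enumerate the aggregated states [jobs_at_delay, jobs_at_queue] with N jobs in total.
reward_table = {}
for n_queue in range(N + 1):
    state = StateRow([N - n_queue, n_queue])
    reward_table[n_queue] = {name: fn(state, None) for name, fn in reward_fns.items()}

for n_queue, row in reward_table.items():
    print(n_queue, row)
```

Each row of `reward_table` is the per-state reward r(s) that the steady-state average weights by the probability of that state.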

## About Reward Functions

Reward functions allow computing custom metrics from the underlying Markov chain:

- **State-based**: Rewards depend on the system state (queue lengths, etc.)
- **Steady-state average**: E[R] = Σᵢ πᵢ × r(sᵢ), where πᵢ is the steady-state probability of state sᵢ and r(sᵢ) is the reward earned in that state
- **Flexible**: Can compute any function of the state

For a closed network with N=5 jobs, think rate=1, service rate=2:
- The queue length ranges from 0 to N, so the aggregated chain has N+1 = 6 states
- The expected rewards are computed from the steady-state distribution of this chain
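
For a model this small, the steady-state average can also be worked out by hand, which gives an independent reference for the three rewards. The sketch below assumes the delay station acts as an infinite server, so with n jobs at the queue the arrival rate to the queue is (N - n) × think rate, and the single server completes jobs at the service rate. Under that assumption the chain is a birth-death process and the weights follow from the balance equations. It uses only the standard library and does not call the solver; all names are illustrative.

```python
from fractions import Fraction

N = 5                        # jobs in the closed network
think_rate = Fraction(1)     # per-job rate at the delay station
service_rate = Fraction(2)   # rate of the single server

# Unnormalized steady-state weights from the birth-death balance equations:
# w[n + 1] = w[n] * (N - n) * think_rate / service_rate
weights = [Fraction(1)]
for n in range(N):
    weights.append(weights[-1] * (N - n) * think_rate / service_rate)

total = sum(weights)
pi = [w / total for w in weights]   # pi[n] = P(n jobs at the queue)


def expected_reward(reward):
    """E[R] = sum over n of pi[n] * reward(n)."""
    return sum(p * reward(n) for n, p in enumerate(pi))


expected = {
    'QueueLength': expected_reward(lambda n: n),
    'Utilization': expected_reward(lambda n: min(n, 1)),
    'QueueCost': expected_reward(lambda n: n ** 2),
}

for name, value in expected.items():
    print(f"{name:>15s}: {float(value):.6f}")
```

Under these assumptions the sketch gives approximately 3.073394 for the queue length, 0.963303 for the utilization, and 11.146789 for the quadratic cost, which can serve as a cross-check on the rewards reported by the solver.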

In [3]:
# Solve with CTMC Solver
print("Solving with CTMC solver...\n")

solver = CTMC(model)

# Get Steady-State Expected Rewards
rewards = solver.get_avg_reward()

print("\n=== Steady-State Expected Rewards ===")
for name, value in rewards.items():
    print(f"{name:>15s}: {value:.6f}")

Solving with CTMC solver...


=== Steady-State Expected Rewards ===
    QueueLength: 5.000000
    Utilization: 5.000000
      QueueCost: 5.000000
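
Because each reward is a bounded function of the state, its steady-state expectation has to respect the same bounds: the queue length lies in [0, N], the utilization in [0, 1], and the quadratic cost between the square of the expected queue length (Jensen's inequality) and N times the expected queue length. The sketch below applies these checks to the values printed above. `check_reward_bounds` is a hypothetical helper written for this example, not part of the library.

```python
def check_reward_bounds(rewards, n_jobs, tol=1e-9):
    """Return a list of violated bounds (hypothetical helper, not a library call)."""
    problems = []
    q = rewards['QueueLength']
    u = rewards['Utilization']
    c = rewards['QueueCost']

    if not -tol <= q <= n_jobs + tol:
        problems.append(f"QueueLength {q} is outside [0, {n_jobs}]")
    if not -tol <= u <= 1.0 + tol:
        problems.append(f"Utilization {u} is outside [0, 1]")
    # Jensen: E[n^2] >= (E[n])^2; and n^2 <= N * n for 0 <= n <= N.
    if not q ** 2 - tol <= c <= n_jobs * q + tol:
        problems.append(f"QueueCost {c} is outside [{q ** 2}, {n_jobs * q}]")
    return problems


# Values copied from the recorded output above.
recorded = {'QueueLength': 5.0, 'Utilization': 5.0, 'QueueCost': 5.0}
violations = check_reward_bounds(recorded, n_jobs=5)
for line in violations:
    print(line)
```

With the recorded values, the check flags `Utilization` (above 1) and `QueueCost` (below the square of `QueueLength`), so the printed output is worth re-verifying against the installed solver version before the numbers are relied on.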


In [4]:
# Print summary of results
print("\n=== Summary ===")
print("The CTMC solver successfully computed steady-state expected rewards")
print("for the custom reward functions defined on the closed queueing network.")
print("\nNote: Reward functions are evaluated at each state of the Markov chain,")
print("and the expected value is computed as the weighted sum over steady-state")
print("probabilities.")


=== Summary ===
The CTMC solver successfully computed steady-state expected rewards
for the custom reward functions defined on the closed queueing network.

Note: Reward functions are evaluated at each state of the Markov chain,
and the expected value is computed as the weighted sum over steady-state
probabilities.
