# SimPy Tutorial: From Basics to Queue Simulation

This tutorial introduces key SimPy concepts and demonstrates how to simulate a basic queue system. You'll learn about:
- **Environment**: The simulation context
- **Process**: Actions that occur over time
- **Resources**: Shared, limited capacity objects
- **Queue Simulation**: Putting it all together

SimPy is a process-based discrete-event simulation framework based on standard Python.

## 1. Installation and Setup

If SimPy is not installed yet, install it with `pip install simpy`. Then import the necessary libraries.

In [61]:
import simpy
import random

## 2. SimPy Environment

The **Environment** is the core of SimPy. It manages the simulation time and schedules events.

Let's create a simple environment and run it:

In [62]:
# Create a SimPy environment
env = simpy.Environment()

# Check the current simulation time
print(f"Initial simulation time: {env.now}")

# Run the simulation for 10 time units
env.run(until=10)

print(f"Simulation time after running: {env.now}")

Initial simulation time: 0
Simulation time after running: 10


#### 💡 Try It Yourself

Modify the code above to run the simulation for 50 time units instead of 10. What is the final simulation time?

## 3. Processes

A **Process** is a function that describes events happening over time. Processes use Python's generator functions (with `yield`) to interact with the simulation.
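
To build intuition for how a generator can pause at `yield` and be resumed later by a scheduler, here is a toy event loop in plain Python. It is a conceptual sketch only and not how SimPy is implemented; `ToyClock`, `toy_process`, and `toy_run` are hypothetical names invented for this illustration.

```python
import heapq

class ToyClock:
    """Holds the current simulated time."""
    def __init__(self):
        self.now = 0

def toy_process(clock, name, duration, log):
    log.append((clock.now, f"{name} starting"))
    yield duration                      # pause and ask to be resumed later
    log.append((clock.now, f"{name} finished"))

def toy_run(clock, processes):
    # Heap entries are (wake_time, sequence, generator); the unique sequence
    # number breaks ties so generators are never compared with each other.
    queue = []
    seq = 0
    for proc in processes:
        heapq.heappush(queue, (clock.now, seq, proc))
        seq += 1
    while queue:
        wake_time, _, proc = heapq.heappop(queue)
        clock.now = wake_time           # jump the clock to the next event
        try:
            delay = next(proc)          # run the process up to its next yield
        except StopIteration:
            continue                    # the process has finished
        heapq.heappush(queue, (clock.now + delay, seq, proc))
        seq += 1

clock = ToyClock()
log = []
toy_run(clock, [toy_process(clock, "Process A", 5, log),
                toy_process(clock, "Process B", 3, log)])
for entry in log:
    print(entry)
```

The clock never ticks in fixed steps; it jumps straight to the wake-up time of the next pending process, which is the defining trait of discrete-event simulation.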

### 3.1 Simple Process with Timeout

Let's create a process that prints messages at different times:

In [63]:
def my_process(env, name, duration):
    """A simple process that waits for a specified duration."""
    print(f"{env.now}: {name} starting")
    
    # Wait for the specified duration
    yield env.timeout(duration)
    
    print(f"{env.now}: {name} finished after {duration} time units")

# Create environment
env = simpy.Environment()

# Start the process
env.process(my_process(env, "Process A", 5))
env.process(my_process(env, "Process B", 3))

# Run simulation
env.run()

0: Process A starting
0: Process B starting
3: Process B finished after 3 time units
5: Process A finished after 5 time units


#### 💡 Try It Yourself

Add a third process called "Process C" that runs for 7 time units. Observe the order in which the processes finish.

### 3.2 Multiple Activities in a Process

A process can have multiple activities (timeouts) in sequence:

In [64]:
def car_journey(env, name):
    """Simulate a car journey with multiple stages."""
    print(f"{env.now}: {name} starts driving")
    
    # Drive to gas station
    yield env.timeout(2)
    print(f"{env.now}: {name} arrives at gas station")
    
    # Refuel
    yield env.timeout(1)
    print(f"{env.now}: {name} finishes refueling")
    
    # Drive to destination
    yield env.timeout(3)
    print(f"{env.now}: {name} arrives at destination")

# Create and run simulation
env = simpy.Environment()
env.process(car_journey(env, "Car 1"))
env.run()

0: Car 1 starts driving
2: Car 1 arrives at gas station
3: Car 1 finishes refueling
6: Car 1 arrives at destination


#### 💡 Try It Yourself

Modify the `car_journey` function to add a fourth stage: stopping at a rest area for 0.5 time units before reaching the destination. Run the simulation and verify the total journey time increases accordingly.

## 4. Resources

A **Resource** represents something with limited capacity that processes need to use (e.g., servers, machines, parking spaces). Processes must request access to a resource and may have to wait if it's busy.

### 4.1 Single Server Resource

In [65]:
def user(env, name, server):
    """A user tries to use a shared server."""
    print(f"{env.now}: {name} arrives and requests server")
    
    # Request the resource
    with server.request() as request:
        # Wait until the resource is available
        yield request
        print(f"{env.now}: {name} got the server")
        
        # Use the resource for some time
        yield env.timeout(2)
        print(f"{env.now}: {name} releases the server")

# Create environment and a single server resource
env = simpy.Environment()
server = simpy.Resource(env, capacity=1)

# Start multiple users
env.process(user(env, "User A", server))
env.process(user(env, "User B", server))
env.process(user(env, "User C", server))

# Run simulation
env.run()

0: User A arrives and requests server
0: User B arrives and requests server
0: User C arrives and requests server
0: User A got the server
2: User A releases the server
2: User B got the server
4: User B releases the server
4: User C got the server
6: User C releases the server


Notice how User B and User C had to wait because the server was busy with the previous user. This is the essence of queueing!

#### 💡 Try It Yourself

Change the server capacity from 1 to 2 and run the simulation again. How does the behavior change? Do any users still have to wait?

## 5. Collecting Statistics and Monitoring

Before building a complete queue simulation, it's essential to understand how to collect and monitor performance metrics. This section covers different approaches to gathering simulation data.

### 5.1 Basic Statistics Collection

The simplest way to collect statistics is by using a dictionary to store different metrics. Let's track wait times and queue lengths:

In [None]:
def user_with_stats(env, name, server, stats):
    """A user that records wait time and queue length."""
    arrival_time = env.now
    print(f"{env.now:.2f}: {name} arrives")
    
    # Log queue length at arrival
    queue_length = len(server.queue)
    stats['queue_lengths'].append((env.now, queue_length))
    
    with server.request() as request:
        # Wait for the resource
        yield request
        
        # Calculate and record wait time
        wait_time = env.now - arrival_time
        stats['wait_times'].append(wait_time)
        print(f"{env.now:.2f}: {name} got server (waited {wait_time:.2f})")
        
        # Use the resource
        yield env.timeout(2)
        print(f"{env.now:.2f}: {name} releases server")

# Create environment and resources
env = simpy.Environment()
server = simpy.Resource(env, capacity=1)
stats = {
    'wait_times': [],
    'queue_lengths': []
}

# Start multiple users
env.process(user_with_stats(env, "User A", server, stats))
env.process(user_with_stats(env, "User B", server, stats))
env.process(user_with_stats(env, "User C", server, stats))

# Run simulation
env.run()

0.00: User A arrives
0.00: User B arrives
0.00: User C arrives
0.00: User A got server (waited 0.00)
2.00: User A releases server
2.00: User B got server (waited 2.00)
4.00: User B releases server
4.00: User C got server (waited 4.00)
6.00: User C releases server


#### Analyzing the Statistics

Now let's analyze the collected statistics:

In [None]:
print(f"\nWait Times: {stats['wait_times']}")
print(f"Average wait time: {sum(stats['wait_times']) / len(stats['wait_times']):.2f}")
print(f"Maximum wait time: {max(stats['wait_times']):.2f}")
print(f"Total users served: {len(stats['wait_times'])}")

print(f"\nQueue Lengths (time, length): {stats['queue_lengths']}")
queue_values = [q[1] for q in stats['queue_lengths']]
print(f"Average queue length: {sum(queue_values) / len(queue_values):.2f}")


Wait Times: [0, 2, 4]
Average wait time: 2.00
Maximum wait time: 4.00
Total users served: 3


**Key insight**: By using a dictionary to store different metrics, we can collect multiple types of data throughout the simulation and analyze them afterwards. This technique scales well for complex simulations with many performance indicators.
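
As the number of metrics grows, it can help to move the summary arithmetic into a small helper. The sketch below is one possible shape for such a helper; `summarize` is a hypothetical name, and the guard for an empty list avoids the division-by-zero that a bare `sum(...) / len(...)` would raise if no customer was served.

```python
from statistics import mean

def summarize(values):
    """Return count, mean, and max of a list, or None if it is empty."""
    if not values:
        return None
    return {"count": len(values), "mean": mean(values), "max": max(values)}

# Wait times from the three-user example above.
wait_summary = summarize([0, 2, 4])
print(wait_summary)

# An empty metric is reported as None instead of raising an error.
print(summarize([]))
```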

#### 💡 Try It Yourself

Add a fourth user (User D) that arrives at the same time as User A. Run the simulation and examine both the wait times and queue lengths. What patterns do you notice?

### 5.2 Custom Resource with Built-in Monitoring

Instead of using a separate monitoring process, we can create a custom resource class that automatically logs utilization. This is an example of extending SimPy's built-in classes:

In [None]:
class MonitoredResource(simpy.Resource):
    """
    A Resource that automatically logs utilization to a stats dictionary.
    
    This class extends simpy.Resource and overrides the request method
    to track utilization data in the same format as other statistics.
    """
    
    def __init__(self, env, capacity=1, stats=None):
        """
        Initialize the monitored resource.
        
        Args:
            env: SimPy environment
            capacity: Number of simultaneous users allowed
            stats: Dictionary to store utilization data (must have 'utilization' key)
        """
        super().__init__(env, capacity)
        self._env = env
        self.stats = stats if stats is not None else {'utilization': []}
    
    def request(self, *args, **kwargs):
        """
        Override the request method to log utilization.
        
        Returns:
            Request object from parent class
        """
        # Log utilization as seen just before this request is queued: (time, utilization)
        utilization = self.count / self.capacity
        self.stats['utilization'].append((self._env.now, utilization))
        return super().request(*args, **kwargs)

Now let's test the MonitoredResource with a stats dictionary:

In [69]:
def user_monitored(env, name, server, service_time, stats):
    """A user that uses a monitored resource."""
    arrival_time = env.now
    print(f"{env.now:.2f}: {name} arrives")
    
    # Log queue length at arrival
    queue_length = len(server.queue)
    stats['queue_lengths'].append((env.now, queue_length))
    
    with server.request() as request:
        yield request
        
        # Log wait time
        wait_time = env.now - arrival_time
        stats['wait_times'].append(wait_time)
        print(f"{env.now:.2f}: {name} starts service (waited {wait_time:.2f})")
        
        yield env.timeout(service_time)
        print(f"{env.now:.2f}: {name} finishes")

# Create environment with MonitoredResource and stats dictionary
env = simpy.Environment()
stats = {
    'wait_times': [],
    'queue_lengths': [],
    'utilization': []
}
monitored_server = MonitoredResource(env, capacity=1, stats=stats)

# Generate some users with varying service times.
# All users are started at time 0, so they queue up behind each other.
random.seed(42)
for i in range(5):
    service_time = random.uniform(1, 3)
    env.process(user_monitored(env, f"User {i+1}", monitored_server, service_time, stats))

# Run simulation
env.run()

0.00: User 1 arrives
0.00: User 2 arrives
0.00: User 3 arrives
0.00: User 4 arrives
0.00: User 5 arrives
0.00: User 1 starts service (waited 0.00)
2.28: User 1 finishes
2.28: User 2 starts service (waited 2.28)
3.33: User 2 finishes
3.33: User 3 starts service (waited 3.33)
4.88: User 3 finishes
4.88: User 4 starts service (waited 4.88)
6.33: User 4 finishes
6.33: User 5 starts service (waited 6.33)
8.80: User 5 finishes


Examine the automatically collected utilization data:

In [70]:
print("\n" + "="*60)
print("MONITORED RESOURCE STATISTICS")
print("="*60)

# Wait Time Statistics
if stats['wait_times']:
    avg_wait = sum(stats['wait_times']) / len(stats['wait_times'])
    print(f"\nWAIT TIME:")
    print(f"  Total customers: {len(stats['wait_times'])}")
    print(f"  Average: {avg_wait:.3f} time units")

# Queue Length Statistics
if stats['queue_lengths']:
    queue_values = [q[1] for q in stats['queue_lengths']]
    avg_queue = sum(queue_values) / len(queue_values)
    print(f"\nQUEUE LENGTH:")
    print(f"  Average: {avg_queue:.2f} customers")
    print(f"  Maximum: {max(queue_values)} customers")

# Utilization Statistics (automatically logged by MonitoredResource)
if stats['utilization']:
    util_values = [u[1] for u in stats['utilization']]
    avg_util = sum(util_values) / len(util_values)
    print(f"\nSERVER UTILIZATION:")
    print(f"  Average: {avg_util:.1%}")
    print(f"  Log entries: {len(stats['utilization'])}")
    print(f"  Sample measurements: {stats['utilization'][:3]}")

print("="*60)


MONITORED RESOURCE STATISTICS

WAIT TIME:
  Total customers: 5
  Average: 3.362 time units

QUEUE LENGTH:
  Average: 1.20 customers
  Maximum: 3 customers

SERVER UTILIZATION:
  Average: 80.0%
  Log entries: 5
  Sample measurements: [(0, 0.0), (0, 1.0), (0, 1.0)]


**Benefits of MonitoredResource with stats dictionary:**

1. **Consistent data structure**: Utilization is logged in the same `stats` dictionary as wait times and queue lengths
2. **Automatic logging**: Utilization is sampled automatically each time the resource is requested, just before the new request is granted
3. **Encapsulation**: Monitoring logic is contained within the resource class
4. **No separate process needed**: This approach doesn't require a dedicated process that polls the resource at fixed intervals
5. **Easy integration**: Works seamlessly with existing statistics collection patterns

This object-oriented approach is cleaner and more maintainable for complex simulations with multiple metrics to track.

#### 💡 Try It Yourself

Modify the MonitoredResource class to also track and log the queue length whenever a request is made. Add a new key `'queue_lengths_auto'` to the stats dictionary and log `(self._env.now, len(self.queue))` tuples. Test your modification with the code above.

## 6. Complete Queue Simulation

Now let's put everything together to create a realistic queue simulation with:
- Random arrival times (Poisson process)
- Random service times (exponential distribution)
- Comprehensive statistics tracking (wait times, queue length, utilization)
- MonitoredResource for automatic utilization logging
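
In a Poisson arrival process the gaps between consecutive arrivals are exponentially distributed with mean `1 / arrival_rate`. The quick check below draws many gaps with `random.expovariate` and compares their average with that expectation; the sample size and the use of a local `random.Random` instance are choices made for this sketch.

```python
import random

rng = random.Random(42)        # local generator, leaves the global seed untouched
arrival_rate = 1.0             # average arrivals per time unit
n = 100_000

gaps = [rng.expovariate(arrival_rate) for _ in range(n)]
mean_gap = sum(gaps) / n

print(f"Expected mean gap: {1 / arrival_rate:.3f}")
print(f"Observed mean gap: {mean_gap:.3f}")
```

Note that `random.expovariate` takes the rate, not the mean, as its argument, which is why the generator in the next sections passes `arrival_rate` and `service_rate` directly.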

### 6.1 Define the Customer Process

In [71]:
def customer(env, name, server, service_time, stats):
    """
    Represents a customer arriving, waiting, being served, and leaving.
    
    Args:
        env: SimPy environment
        name: Customer identifier
        server: MonitoredResource instance
        service_time: Time required for service
        stats: Dictionary containing lists for different metrics
    """
    arrival_time = env.now
    
    # Log queue length at arrival
    queue_length = len(server.queue)
    stats['queue_lengths'].append((env.now, queue_length))
    
    with server.request() as request:
        # Wait in queue (server.request() automatically logs utilization)
        yield request
        wait_time = env.now - arrival_time
        stats['wait_times'].append(wait_time)
        
        # Being served
        yield env.timeout(service_time)

### 6.2 Define the Customer Generator

This creates a continuous stream of arriving customers:

In [72]:
def customer_generator(env, server, arrival_rate, service_rate, stats):
    """
    Generates customers arriving at the queue.
    
    Args:
        env: SimPy environment
        server: MonitoredResource instance
        arrival_rate: Average arrivals per time unit (lambda)
        service_rate: Average services per time unit (mu)
        stats: Dictionary to store statistics
    """
    customer_count = 0
    while True:
        # Wait for next arrival (exponentially distributed)
        yield env.timeout(random.expovariate(arrival_rate))
        
        # Create new customer
        customer_count += 1
        service_time = random.expovariate(service_rate)
        env.process(
            customer(env, f'Customer {customer_count}', server, 
                    service_time, stats)
        )

### 6.3 Run the Simulation

Let's set up and run the queue simulation with comprehensive statistics tracking:

In [73]:
# Simulation parameters
ARRIVAL_RATE = 1.0      # Average 1 customer per time unit
SERVICE_RATE = 1.5      # Average 1.5 customers served per time unit
SIMULATION_TIME = 20.0  # Run for 20 time units
random.seed(42)         # For reproducibility

# Create environment and initialize statistics dictionary
env = simpy.Environment()
stats = {
    'wait_times': [],
    'queue_lengths': [],
    'utilization': []
}

# Create MonitoredResource (automatically logs utilization)
server = MonitoredResource(env, capacity=1, stats=stats)

# Start the customer generator
env.process(
    customer_generator(env, server, ARRIVAL_RATE, SERVICE_RATE, stats)
)

# Run simulation
env.run(until=SIMULATION_TIME)

### 6.4 Analyze the Results

Let's examine the comprehensive statistics collected during the simulation:

In [74]:
print("\n" + "="*60)
print("QUEUE SIMULATION STATISTICS")
print("="*60)

# Wait Time Statistics
if stats['wait_times']:
    avg_wait = sum(stats['wait_times']) / len(stats['wait_times'])
    max_wait = max(stats['wait_times'])
    min_wait = min(stats['wait_times'])
    
    print(f"\nWAIT TIME STATISTICS:")
    print(f"  Total customers served: {len(stats['wait_times'])}")
    print(f"  Average wait time: {avg_wait:.3f} time units")
    print(f"  Maximum wait time: {max_wait:.3f} time units")
    print(f"  Minimum wait time: {min_wait:.3f} time units")

# Queue Length Statistics
if stats['queue_lengths']:
    queue_values = [q[1] for q in stats['queue_lengths']]
    avg_queue = sum(queue_values) / len(queue_values)
    max_queue = max(queue_values)
    
    print(f"\nQUEUE LENGTH STATISTICS:")
    print(f"  Average queue length: {avg_queue:.2f} customers")
    print(f"  Maximum queue length: {max_queue} customers")

# Server Utilization Statistics (automatically logged by MonitoredResource)
if stats['utilization']:
    utilization_values = [u[1] for u in stats['utilization']]
    avg_utilization = sum(utilization_values) / len(utilization_values)
    
    print(f"\nSERVER UTILIZATION STATISTICS:")
    print(f"  Average utilization: {avg_utilization:.1%}")
    print(f"  Measurements taken: {len(stats['utilization'])}")

print("="*60)


QUEUE SIMULATION STATISTICS

WAIT TIME STATISTICS:
  Total customers served: 17
  Average wait time: 0.081 time units
  Maximum wait time: 0.442 time units
  Minimum wait time: 0.000 time units

QUEUE LENGTH STATISTICS:
  Average queue length: 0.00 customers
  Maximum queue length: 0 customers

SERVER UTILIZATION STATISTICS:
  Average utilization: 33.3%
  Measurements taken: 18


#### 💡 Try It Yourself

Experiment with different arrival and service rates:
1. Change `ARRIVAL_RATE = 2.0` and `SERVICE_RATE = 1.5` (system overload - arrivals exceed service capacity)
2. Change `ARRIVAL_RATE = 0.5` and `SERVICE_RATE = 2.0` (system underload - plenty of spare capacity)

Run the simulation for each scenario and compare the wait times, queue lengths, and utilization. How do they differ from the original scenario (`ARRIVAL_RATE = 1.0`, `SERVICE_RATE = 1.5`)?
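
For reference, a single-server queue with Poisson arrivals and exponential service times (an M/M/1 queue) has well-known long-run averages that you can compare your simulated numbers against. The sketch below computes them; `mm1_metrics` is a hypothetical helper, and the formulas only apply when the arrival rate is below the service rate.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Long-run averages for an M/M/1 queue (requires arrival_rate < service_rate)."""
    if arrival_rate >= service_rate:
        raise ValueError("steady state requires arrival_rate < service_rate")
    rho = arrival_rate / service_rate            # fraction of time the server is busy
    wq = rho / (service_rate - arrival_rate)     # mean time spent waiting in the queue
    lq = arrival_rate * wq                       # mean queue length (Little's law)
    return {
        "utilization": rho,
        "mean_wait_in_queue": wq,
        "mean_queue_length": lq,
    }

baseline = mm1_metrics(1.0, 1.5)
for key, value in baseline.items():
    print(f"{key}: {value:.3f}")
```

A short run such as the 20 time units used above will usually not match these values closely; longer runs, or averages over many seeds, should move toward them. In the overload scenario there is no steady state, so the queue keeps growing for as long as the simulation runs.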

## 7. Key Takeaways

### SimPy Concepts Covered:

1. **Environment (`simpy.Environment()`)**: The simulation context that manages time and events
   - `env.now` - current simulation time
   - `env.run(until=X)` - run simulation until time X
   - `env.process()` - start a new process

2. **Process (generator functions with `yield`)**: Activities that occur over time
   - Use `yield env.timeout(t)` to wait for a duration
   - Use `yield request` to wait for a resource

3. **Resource (`simpy.Resource()`)**: Shared, limited capacity objects
   - `capacity` - number of simultaneous users allowed
   - `request()` - request access to the resource
   - Use `with resource.request() as request:` for automatic release

4. **Statistics Collection**: Multiple approaches for monitoring simulation performance
   - Basic: Pass mutable objects (lists/dicts) to processes
   - Advanced: Custom resource classes with built-in monitoring

### Queue Simulation Pattern:

```python
import random
import simpy

# 1. Create environment
env = simpy.Environment()

# 2. Create shared resources
server = simpy.Resource(env, capacity=1)

# 3. Define process functions with yield
def customer(env, name, server, service_time):
    with server.request() as req:
        yield req  # wait for resource
        yield env.timeout(service_time)  # use resource

def customer_generator(env, server, arrival_rate, service_rate):
    customer_count = 0
    while True:
        # Wait for the next arrival (exponentially distributed gap)
        yield env.timeout(random.expovariate(arrival_rate))
        customer_count += 1
        service_time = random.expovariate(service_rate)
        env.process(customer(env, f'Customer {customer_count}',
                             server, service_time))

# 4. Start processes
env.process(customer_generator(env, server, arrival_rate=1.0, service_rate=1.5))

# 5. Run simulation
env.run(until=20)
```

This pattern can be extended to model complex systems with multiple resource types, priorities, interruptions, and more!