
M/M/c model way too slow #1

Open
itsdfish opened this issue Jul 6, 2020 · 21 comments


itsdfish commented Jul 6, 2020

Hi pbayer-

I am trying to implement the queue model in DiscreteEvents. One problem I have run into is that I don't know how to enforce a first-come, first-served wait policy. In the example below, there are 10 customers and 2 servers, each of which processes one customer at a time. Customers arrive in succession and must wait if no server is available. As you can see, customer 3 waits for either 1 or 2 to finish. However, when customer 2 finishes first at 1.766, customer 3 continues to wait, and customer 4 eventually arrives and takes the available server. Customer 3 is not served until the very end.

Customer 1 arrived: 0.1229193244813443
Customer 1 starting service: 0.1229193244813443
Customer 2 arrived: 0.22607641035584877
Customer 2 starting service: 0.22607641035584877
Customer 3 arrived: 0.4570009029409502
Customer 3 is waiting: 0.4570009029409502
Customer 2 finishing service: 1.7657345101378559
Customer 1 finishing service: 2.154824561031012
Customer 4 arrived: 2.3661687470062995
Customer 4 starting service: 2.3661687470062995
.
.
.
Customer 10 finishing service: 9.64997419599946
Customer 3 starting service: 10.0
Customer 3 finishing service: 10.366274594583697

Is this a bug or am I doing something incorrectly?

using DiscreteEvents, DataStructures, Distributions, Random

mutable struct Server
    capacity::Int
    ids::Vector{Int}
end

Server(capacity) = Server(capacity, Int[])

is_full(s) = length(s.ids) == s.capacity

remove!(s, id) = filter!(x-> x != id, s.ids)

add!(s, id) = push!(s.ids, id)

Random.seed!(8710) # set random number seed for reproducibility
num_customers = 10 # total number of customers generated
num_servers = 2 # number of servers
mu = 1.0 / 2 # service rate
lam = 0.9 # arrival rate
arrival_dist = Exponential(1 / lam) # interarrival time distribution
service_dist = Exponential(1 / mu); # service time distribution

function enter_line(clock, server, id, service_dist, arrival_time)
    delay!(clock, arrival_time)
    now!(clock, fun(println, "Customer $id arrived: ", clock.time))
    if is_full(server)
        now!(clock, fun(println, "Customer $id is waiting: ", clock.time))
        wait!(clock, fun(()->!is_full(server)))
    end
    now!(clock, fun(println,"Customer $id starting service: ", clock.time))
    add!(server, id)
    tΔ = rand(service_dist)
    delay!(clock, tΔ)
    leave(clock, server, id)
    # now!(clock, fun(println, "servers available ", server.capacity-length(server.ids), " ", clock.time))
    # now!(clock, fun(println, "servers full ", is_full(server), " ", clock.time))
end

function leave(clock, server, id)
    now!(clock, fun(println, "Customer $id finishing service: ", clock.time))
    remove!(server, id)
end

function initialize!(clock, arrival_dist, service_dist, num_customers, server)
    arrival_time = 0.0
    for i in 1:num_customers
        arrival_time += rand(arrival_dist)
        process!(clock, Prc(i, enter_line, server, i, service_dist, arrival_time), 1)
    end
end

function benchmark(arrival_dist, service_dist, num_customers, num_servers)
    clock = Clock()
    server = Server(num_servers)
    initialize!(clock, arrival_dist, service_dist, num_customers, server)
    run!(clock, 10^4)
end

benchmark(arrival_dist, service_dist, num_customers, num_servers)

pbayer commented Jul 7, 2020

Hi,
thank you, it took me some time to figure that out.

The problem is the sampling rate. It is calculated in _register!(c::Clock, cond::DiscreteCond) as c.Δt = _scale(c.end_time - c.time)/100. In your case it comes out as roughly 10², where it should be about 0.01. Thus the resolution is way too coarse, and the condition established by your wait!(clock, fun(()->!is_full(server))) is not evaluated in time.
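To make the effect concrete, here is a standalone sketch of that heuristic. `_scale` is not shown in this thread; the definition below is a hypothetical order-of-magnitude scale chosen so the numbers match the ones quoted above:

```julia
# Hypothetical sketch of the Δt heuristic described above -- NOT the actual
# DiscreteEvents internals. The sampling step is the clock's remaining time
# span scaled down by 100.
_scale(t) = 10.0^floor(log10(t))   # assumed: order of magnitude of the span

sampling_step(end_time, time) = _scale(end_time - time) / 100

sampling_step(10^4, 0.0)  # 100.0: far too coarse for events ~0.1 apart
sampling_step(20, 0.0)    # 0.1: fine enough to catch the wait! condition
```

With `run!(clock, 10^4)` the condition is only checked every ~100 time units, which is why customer 3 sits in the queue until the end of the run.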

You can fix it for now by running run!(clock, 20) instead of 10⁴. Then the simulation comes out right:

Customer 1 arrived: 0.1229193244813443
Customer 1 starting service: 0.1229193244813443
Customer 2 arrived: 0.22607641035584877
Customer 2 starting service: 0.22607641035584877
Customer 3 arrived: 0.4570009029409502
Customer 3 is waiting: 0.4570009029409502
Customer 2 finishing service: 1.7657345101378559
Customer 3 starting service: 1.8000000000000005   # now served
Customer 1 finishing service: 2.154824561031012
Customer 3 finishing service: 2.3107941984759206
Customer 4 arrived: 2.3661687470062995
...

A more realistic approach would be to model the server as a process that takes the next customer immediately after becoming idle, without waiting. Furthermore, I would implement the queue as a Channel and let the server process wait implicitly with take! from the Channel.
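That suggestion can be sketched in plain Julia (hypothetical names, no simulated time): the queue is a Channel, and an idle server blocks implicitly inside take! until a customer arrives.

```julia
# Illustrative sketch only: a server task that waits implicitly on a Channel.
function serve(queue::Channel{Int}, served::Vector{Int})
    for id in queue          # implicit take!: blocks while the queue is empty
        push!(served, id)    # "serve" customer id; FCFS by channel order
    end
end

queue  = Channel{Int}(32)
served = Int[]
worker = @async serve(queue, served)
foreach(id -> put!(queue, id), 1:5)   # five customers arrive in order
close(queue)
wait(worker)
served                                # [1, 2, 3, 4, 5]
```

Because the Channel hands out items in insertion order, first-come, first-served falls out for free, with no polling of a wait condition.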

I will open an issue on that problem over at DiscreteEvents.jl and look for a better heuristic to calculate c.Δt.


itsdfish commented Jul 7, 2020

Thanks. I will see if I can figure out how to use a channel as you recommended. I might retain this model to see how it compares to the channel approach. The preliminary results (with the incorrect temporal resolution) suggest that this model is much slower (about 200 times slower) than the same model in SimJulia.

itsdfish closed this as completed Jul 7, 2020

pbayer commented Jul 7, 2020

… about 200 times slower … 🥺

This is interesting. There are several ways the queue_mmc problem can be implemented in DiscreteEvents.jl. It would be interesting to know how they compare (and what causes the slowdown).

I can confirm that it is slower, albeit not 200 times:
queue_mmc_SimJulia.jl takes on my machine

BenchmarkTools.Trial: 
  memory estimate:  75.86 KiB
  allocs estimate:  1249
  --------------
  minimum time:     237.364 μs (0.00% GC)
  median time:      250.990 μs (0.00% GC)
  mean time:        266.797 μs (2.84% GC)
  maximum time:     6.294 ms (95.37% GC)
  --------------
  samples:          10000
  evals/sample:     1

queue_mmc_0.jl (your code) takes:

BenchmarkTools.Trial: 
  memory estimate:  37.56 KiB
  allocs estimate:  813
  --------------
  minimum time:     1.206 ms (0.00% GC)
  median time:      3.803 ms (0.00% GC)
  mean time:        7.855 ms (0.00% GC)
  maximum time:     133.642 ms (0.00% GC)
  --------------
  samples:          636
  evals/sample:     1

the files are in ./benchmarks/compare.

There we can improve and compare the approaches.

pbayer reopened this Jul 7, 2020
pbayer added a commit that referenced this issue Jul 7, 2020

itsdfish commented Jul 7, 2020

I don't have the ability to check at the moment, unfortunately. It's possible that the discrepancy in benchmarks is due to the number of servers and customers I used. I think I used 100 and 10,000, respectively, before I decreased the values for validation.


itsdfish commented Jul 7, 2020

Ok. I had the opportunity to run a benchmark. The problem here is poor scalability.

num_customers = 200  # total number of customers generated
num_servers = 10     # number of servers
DiscreteEvents mean time:  71.594 ms (0.52% GC)
SimJulia mean time:         4.336 ms (3.56% GC)
ratio ≈ 18

num_customers = 2000  # total number of customers generated
num_servers = 10      # number of servers
DiscreteEvents mean time:  10.158 s (0.35% GC)
SimJulia mean time:        48.460 ms (3.51% GC)
ratio ≈ 207


itsdfish commented Jul 8, 2020

I think varying the number of customers is more informative. I'll do the same with the channel approach if I can figure it out.

[benchmark plots: SimJulia, SimJulia_Queue, DiscreteEvents, DiscreteEvents_Queue]


pbayer commented Jul 8, 2020

Now I have implemented the exact analog of the SimJulia example in DiscreteEvents with Julia channels. It looks very similar to the original:

using DiscreteEvents, Printf, Distributions, Random

# Define Constants
Random.seed!(8710)   # set random number seed for reproducibility
num_customers = 10   # total number of customers generated
num_servers = 2      # number of servers
μ = 1.0 / 2          # service rate
λ = 0.9              # arrival rate
arrival_dist = Exponential(1/λ)  # interarrival time distribution
service_dist = Exponential(1/μ); # service time distribution

# Define Customer Behavior
function customer(clk::Clock, server::Channel, id::Int, ta::Float64, ds::Distribution)
    delay!(clk, ta)       # customer arrives
    now!(clk, ()->@printf("%5.3f: customer %d arrived\n", tau(clk), id))
    s = take!(server)     # customer starts service
    now!(clk, ()->@printf("%5.3f: customer %d entered service\n", tau(clk), id))
    delay!(clk, rand(ds)) # server is busy
    put!(server, s)       # customer exits service
    now!(clk, ()->@printf("%5.3f: customer %d exited service\n", tau(clk), id))
end

# Setup and Run Simulation
clk = Clock() # initialize simulation environment
server = Channel{Int}(num_servers)  # initialize servers
for i in 1:num_servers
    put!(server, i)
end
arrival_time = 0.0
for i = 1:num_customers # initialize customers
    global arrival_time += rand(arrival_dist)
    process!(clk, Prc(i, customer, server, i, arrival_time, service_dist), 1)
end
run!(clk, 20) # run simulation

The file is ./examples/queue_mmc_chn.jl. Can you check how this compares?

While writing this I realised that the customers are created before running the simulation. That is not quite fair: customers (arrivals) should be created during a running simulation.

pbayer added a commit that referenced this issue Jul 8, 2020

itsdfish commented Jul 8, 2020

I adapted your code to run the same benchmark. Unfortunately, the results were very similar (although slightly better):

[benchmark plot: DiscreteEvents_Queue_Channel]


pbayer commented Jul 8, 2020

This is tremendous 😰. There are two problems for the DiscreteEvents script compared with `SimJulia`: speed and scaling. I guess

  1. speed is due to task creation or startup (e.g. @async being slower than @process)
  2. scaling is due to scheduling (very long task/customer queues)

I will test those hypotheses one by one tomorrow.

pbayer added a commit to JuliaDynamics/DiscreteEvents.jl that referenced this issue Jul 9, 2020
pbayer added a commit that referenced this issue Jul 9, 2020

pbayer commented Jul 9, 2020

Now I did the following:

  1. I changed the model to generate the arrivals during the simulation run, so that the queue of running tasks/processes stays shorter. In order to achieve this I had
  2. to improve the process startup in DiscreteEvents (so please update to the latest commit)

The benchmark #2 then comes out as follows:

[plot: benchmark #2 results]

As you see, there is a big improvement, so the slowdown above is likely due to the Julia scheduler. Still, the simulation is slower than the SimJulia model. To prove that the remaining performance lag is also due to the Julia scheduler (see Chain of lightweight tasks much slower on thread 1 than 2-4 #34875), I executed the body of the run_model function on thread 2. Benchmark #3 then comes out as:

[plot: benchmark #3 results]

Interestingly, I couldn't gain anything if I moved only the first model to thread 2 (benchmark #1); the task queues are still too long. I hope these problems with the Julia scheduler will be improved soon.

pbayer changed the title from "Execute process as soon as wait condition is satisfied" to "M/M/c model much slower in DiscreteEvents than SimJulia" on Jul 9, 2020
pbayer changed the title from "M/M/c model much slower in DiscreteEvents than SimJulia" to "M/M/c model way too slow" on Jul 9, 2020

itsdfish commented Jul 9, 2020

Great work! As far as I can tell, SimJulia uses a priority queue to manage a resource. Do you think this explains the performance difference and the fact that it can front-load all of the customers?


pbayer commented Jul 9, 2020

That is not the problem. In our example the servers are the resources, and there are only two of them. But in a sense you are right: SimJulia also handles its coroutines in a priority queue and executes them sequentially. The SimJulia model is only one user task to Julia; everything is handled internally.

In DiscreteEvents, processes are native Julia tasks, handled by the Julia scheduler. In our example we generate up to 4000 asynchronous tasks, which must be checked at every yield() to see whether they can run. This is expensive, and they must compete against all the other activity on thread 1 and be handled by the scheduler.
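The cost of those switches can be felt with a small base-Julia microbenchmark (illustrative only, not the package's benchmark): each round trip between two tasks over Channels forces two task switches.

```julia
# Two tasks bouncing a counter back and forth over Channels.
# Every loop iteration involves two task switches through the scheduler.
function ping_pong(n)
    c1 = Channel{Int}(1)
    c2 = Channel{Int}(1)
    t = @async for _ in 1:n
        put!(c2, take!(c1) + 1)   # echo task: increment and send back
    end
    x = 0
    for _ in 1:n
        put!(c1, x)               # hand the value to the other task
        x = take!(c2)             # block until it comes back
    end
    wait(t)
    return x                      # equals n: one increment per round trip
end

ping_pong(1000)
```

Timing e.g. `@time ping_pong(10_000)` gives a feel for the per-switch overhead that dominates when thousands of customer tasks are alive at once.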

But remaining in native Julia has multiple advantages: it is versatile for expressing simulation problems and, as you see, if you do the right thing, you can also be pretty fast.

I am rewriting the documentation and examples in DiscreteEventsCompanion and, if you don't mind, I will also use the M/M/c example there.

BTW and IMHO, creating the customer/process queue beforehand is not the best thing to do and is feasible only for simple examples.

  • Consider for example customer reneging (if the customer must wait too long he leaves the queue). This cannot be foreseen since queue length in reality depends on stochastic events.
  • In simulations often we are interested in queue lengths. This also can be modeled better with a discrete arrival process.
  • ...


itsdfish commented Jul 9, 2020

That seems like a reasonable course of action. In fact, it looks like benchmark 3 is faster than SimJulia. Perhaps the reason is that the arrivals are distributed throughout the simulation rather than all at the beginning.

Please feel free to use my implementation as an example of poorly written code. I think it will be instructive for others =)


pbayer commented Jul 10, 2020

My main point here is modeling approaches. There are various ways to express a simulation problem in DiscreteEvents. If the model is done right, simple models take only a fraction of a second to run (even over thousands of simulation steps); otherwise something is conceptually wrong. The conceptual problem here didn't come from your code but from the original example. Therefore this is a nice illustration.

For practical purposes, speed mostly matters

  • in big simulations or
  • in stochastic exploration like sensitivity or perturbation analyses, optimisation, machine learning … where you do lots of simulations.

Those are mostly uncharted territories for simulation systems, which I hope to open up further with DiscreteEvents. With further development, parallelism, optimisation and so on, the speed of simulations with Julia and DiscreteEvents is likely to improve. But the flexibility in modeling means that user choices can have a big influence on speed.

pbayer added a commit that referenced this issue Jul 10, 2020
pbayer added a commit that referenced this issue Jul 10, 2020
updating documentation 
see also issue #1

pbayer commented Jul 10, 2020

Now the next step: modeling the servers instead of the customers eliminates most of the task setup needed to generate a task for every customer. Instead, the next model (bench_queue_srv.jl) works with only three processes/tasks: one for the arrivals, two for the servers. Customers are simply handled as tokens flowing into an input channel, taken by the servers, and put into an output channel.
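The token-flow idea can be sketched in plain Julia (hypothetical names, no simulated time): one arrival loop feeds tokens into an input channel, and a fixed number of server tasks drain it into an output channel.

```julia
# Illustrative sketch of the server-as-process, customers-as-tokens model.
function token_flow(num_customers, num_servers)
    input  = Channel{Int}(num_customers)
    output = Channel{Int}(num_customers)
    servers = map(1:num_servers) do _
        @async for id in input     # each server blocks implicitly on take!
            put!(output, id)       # "service" is a no-op in this sketch
        end
    end
    foreach(id -> put!(input, id), 1:num_customers)  # the arrival "process"
    close(input)                   # no more customers
    foreach(wait, servers)         # let the servers drain the queue
    close(output)
    sort!(collect(output))         # ids of all served customers
end

token_flow(10, 2)                  # all 10 customers pass through 2 servers
```

Only `num_servers + 1` tasks exist regardless of the number of customers, which is why this model avoids the scheduling cost of one task per customer.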

The server benchmark comes out as:

[plot: server benchmark results]

It is faster still than benchmark #2 (the modified model). Can this be improved by moving it to another thread, as before?

The server benchmark on thread 2 (bench_queue_srv1.jl) gives the following results:

[plot: server benchmark results on thread 2]

It now takes about 30 ms for 4000 customers. Clearly the model matters a lot for speed.


pbayer commented Jul 11, 2020

Now it would be interesting to see whether you can speed up the SimJulia model similarly, and how it goes in SimPy.

itsdfish commented

This is interesting. Tomorrow I will have time to study your examples in more detail. I will try to make a similar model in SimJulia.

pbayer added a commit to JuliaDynamics/DiscreteEvents.jl that referenced this issue Jul 13, 2020
pbayer added a commit that referenced this issue Jul 13, 2020

pbayer commented Jul 13, 2020

Now I wanted to add yet another, activity-based solution. This model doesn't work with processes/tasks but only with event scheduling. Thus there is no task switching and no intervention by the Julia scheduler.

The activity based model bench_queue_mmc_act.jl gives the following results:

[plot: activity-based benchmark results]

It takes about 60 ms for 4000 customers, longer than the last model since it checks conditions with clock sampling. But it is still pretty fast.
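The event-scheduling style can be sketched in plain Julia with a priority queue from DataStructures (which the original script already imports). This is an illustrative standalone M/M/c core, not the bench_queue_mmc_act.jl model: every state change is an (time, kind, id) event popped in time order, and waiting customers go into a FIFO vector.

```julia
# Minimal event-scheduling M/M/c sketch (illustrative only).
using DataStructures, Random

function mmc_event_sim(num_customers, c, lam, mu; rng = MersenneTwister(8710))
    events = BinaryMinHeap{Tuple{Float64,Symbol,Int}}()  # (time, kind, id)
    t = 0.0
    for id in 1:num_customers          # pre-draw the arrival times
        t += randexp(rng) / lam
        push!(events, (t, :arrival, id))
    end
    busy, queue = 0, Int[]
    finish_times = Float64[]
    while !isempty(events)
        (now, kind, id) = pop!(events) # next event in time order
        if kind == :arrival
            if busy < c
                busy += 1
                push!(events, (now + randexp(rng)/mu, :departure, id))
            else
                push!(queue, id)       # FCFS: wait in arrival order
            end
        else                           # :departure
            push!(finish_times, now)
            if !isempty(queue)         # server takes the next waiting customer
                next = popfirst!(queue)
                push!(events, (now + randexp(rng)/mu, :departure, next))
            else
                busy -= 1
            end
        end
    end
    return length(finish_times), finish_times
end

served, _ = mmc_event_sim(10, 2, 0.9, 0.5)
```

Because the simulation jumps from event to event, there is no sampling interval at all, so the Δt resolution problem from the start of this thread cannot occur.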

itsdfish commented

I'm having trouble making an analog of bench_queue_srv.jl in SimJulia. I'll have to dig into the documentation to see whether it is possible.


pbayer commented Jul 14, 2020

I think you could implement the customer queue as a Store in SimJulia, with the arrival process doing a put and the server process a get on it. I can find no documentation over there (just a mention), but look at stores.jl to see how it is called.

Unfortunately, documentation doesn't seem to be one of the strengths there 🙁.

itsdfish commented

Thanks for the suggestions. Unfortunately, sometimes the code itself is the only documentation. I will have time to look into this periodically throughout the week.
