Implement select, seize_selected, release_selected #52

Closed
Enchufa2 opened this Issue May 23, 2016 · 12 comments

Member

Enchufa2 commented May 23, 2016

Suppose we have a pool of resources of the same kind, let's say 10 agents handling a stream of tickets, and we want to assign those tickets. Currently, it can be done with a branch, but this is slow and verbose. And since there are some typical interesting cases, we may implement them as activities. For instance,

agents <- c("agent1", "agent2", "agent3", ...)

t <- create_trajectory() %>%
  seize_rr(agents, ...)
  ...

would select agents following a Round-Robin policy. Or seize_shortest() for the shortest queue, and so on and so forth.
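
A minimal sketch of the two policies named above, written in plain R rather than as simmer activities. The helper names (`make_round_robin`, `shortest_queue`) and the `queue_lengths` vector are assumptions made for illustration; nothing here is simmer API.

```r
# Hypothetical helpers in plain R; none of these names exist in simmer.
agents <- c("agent1", "agent2", "agent3")

# Round-Robin: a closure remembers the last index handed out.
make_round_robin <- function(resources) {
  i <- 0
  function() {
    i <<- i %% length(resources) + 1
    resources[i]
  }
}

# Shortest queue: pick the resource with the fewest waiting arrivals.
# `queue_lengths` is an assumed named vector, one entry per resource.
shortest_queue <- function(queue_lengths) {
  names(queue_lengths)[which.min(queue_lengths)]
}

next_agent <- make_round_robin(agents)
rr_picks <- vapply(1:4, function(k) next_agent(), character(1))
# rr_picks cycles through agent1, agent2, agent3 and back to agent1

sq_pick <- shortest_queue(c(agent1 = 2, agent2 = 0, agent3 = 5))
# sq_pick is "agent2", the only agent with an empty queue
```

Note that Round-Robin needs state between calls while shortest queue only needs a snapshot of the pool, which is one reason the two are usually implemented differently.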

Member

Enchufa2 commented May 24, 2016

The problem with the approach above comes with the release part, because the modeller doesn't know which resource to release. So maybe something like this would be better:

agents <- c("agent1", "agent2", "agent3",  ...)

env <- simmer()
lapply(agents, function(i) env %>% add_resource(i, 1)) %>% invisible

t <- create_trajectory() %>%
  select(agents, policy="shortest-queue") %>%
  seize_selected(amount=1) %>%
  ...
  release_selected(amount=1)

And, of course, this select would allow for the following:

env <- simmer() %>%
  add_resource("dummy", 1)

t <- create_trajectory() %>%
  select("dummy") %>%
  ...
  seize_selected(1) %>%
  ...
  release_selected(1)

which would be equivalent to

t <- create_trajectory() %>%
  ...
  seize("dummy", 1) %>%
  ...
  release("dummy", 1)

but far more flexible. What do you think, @Bart6114?

Member

Bart6114 commented May 24, 2016

I think the suggested functionality is definitely more user friendly than immediately going the branching route for these types of problems.

What about simply leveraging the attributes for this? Let's say we make it possible for the resource argument of seize and release to be a function. We could then e.g. do:

env<-simmer()

shortest_queue_selector<-function(env, ...){
  resource_names = c(...)

  # sapply() returns a numeric vector, which is what which.min() expects
  sapply(resource_names, function(name) get_queue_count(env, name)) %>%
    which.min() %>%
    resource_names[.]
}


t0<-
  create_trajectory() %>%
  timeout(function(attrs){
    attrs[['resource_to_select']] <- 
      shortest_queue_selector(env, "dummy1", "dummy2", "dummy3")
    cat("I'm going to select", attrs[['resource_to_select']], "\n")

    0
  }) %>%
  seize(function(attrs) attrs[['resource_to_select']]) %>%
  timeout(5) %>%
  release(function(attrs) attrs[['resource_to_select']])


env %>%
  add_resource("dummy1") %>%
  add_resource("dummy2") %>%
  add_resource("dummy3") %>%
  add_generator("test", t0, at(1:20)) %>%
  run()

I think it will be a more flexible approach, which will make it easier to implement a number of useful convenience functions around it (a la shortest_queue_selector).

This makes me think that it might be more intuitive if we also had e.g. a set() trajectory function, which would basically be the same as a timeout() but would always return 0 as duration. I personally feel the name of the timeout() command makes it feel like a hack if you use it only for e.g. setting an attribute or implementing a custom monitor.

Member

Enchufa2 commented May 24, 2016

> This makes me think that it might be more intuitive if we would also have e.g. a set() trajectory function.

To set attributes, we have set_attribute(), remember? It was your idea! 😄😄😄😄

> What about simply leveraging the attributes for this?

There are several problems:

  • As you said, seize and release don't support a function as the name of the resource. And after priorities, preemption, etcetera, etcetera, I'm becoming more and more disinclined to complicate them further.
  • The performance of this approach is very poor, even worse than branching, because bringing data back and forth, from C++ to R, from R to C++, is expensive. It's nice to have such a flexible mechanism, but I would restrict its use to specific cases. They are great to keep track of arrivals' properties and they are handy to perform some tricks at a given moment, but no less, no more.
  • Attributes are numeric. Supporting strings would be possible, but even more expensive in terms of performance, and not so useful in general, I think.

My approach requires the implementation of the scheduling policies in the C++ core, yes, but there are not as many useful algorithms out there for this task, I mean...

  • Round-Robin.
  • Shortest queue.
  • First available (in order).
  • Random.
  • ...?
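
As a rough illustration of the last two concrete policies in the list (first available and random), here is a plain-R sketch. The `in_use` and `capacity` vectors are assumed data, the function names are hypothetical, and the fallback to the first resource when everything is busy is a choice made for this example, not something the thread specifies.

```r
# Assumed snapshot of the pool: servers in use vs. capacity.
in_use   <- c(agent1 = 1, agent2 = 0, agent3 = 0)
capacity <- c(agent1 = 1, agent2 = 1, agent3 = 1)

# First available (in order): first resource with a free server.
# Assumption: fall back to the first resource when all are busy.
first_available <- function(in_use, capacity) {
  free <- which(in_use < capacity)
  if (length(free) == 0) names(in_use)[1] else names(in_use)[free[1]]
}

# Random: every resource is equally likely.
random_choice <- function(resources) {
  resources[sample.int(length(resources), 1)]
}

fa_pick  <- first_available(in_use, capacity)   # "agent2": agent1 is busy
all_busy <- first_available(capacity, capacity) # "agent1": fallback case
rnd_pick <- random_choice(names(capacity))      # any of the three agents
```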

Member

Bart6114 commented May 25, 2016

> To set attributes, we have set_attribute(), remember? It was your idea! 😄😄😄😄

Pfoe... let's say it was a long day yesterday...

> There are several problems:
>
> As you said, seize and release don't support a function as the name of the resource. And after priorities, preemption, etcetera, etcetera, I'm becoming more and more disinclined to complicate them further.

Fully agree, the list of arguments and the values they can take is becoming long... However, given that we can already pass a function for amount, I don't think there are good arguments for disallowing it for the resource name (aside from the work related to it, of course).

> The performance of this approach is very poor, even worse than branching, because bringing data back and forth, from C++ to R, from R to C++, is expensive. It's nice to have such a flexible mechanism, but I would restrict its use to specific cases. They are great to keep track of arrivals' properties and they are handy to perform some tricks at a given moment, but no less, no more.

Did a few simple benchmarks:

library(simmer)

t0<-create_trajectory() %>%
  timeout(2)

t1<-create_trajectory() %>%
  timeout(function() 2)

t2<-create_trajectory() %>%
  timeout(function(attrs) 2)


ptm1 <- proc.time()
simmer() %>%
  add_generator("t0", t0, at(0:1e4)) %>%
  run(until = 1e9)

ptmr1 <- proc.time() - ptm1


ptm2 <- proc.time()
simmer() %>%
  add_generator("t1", t1, at(0:1e4)) %>%
  run(until = 1e9)

ptmr2 <- proc.time() - ptm2

ptm3 <- proc.time()
simmer() %>%
  add_generator("t2", t2, at(0:1e4)) %>%
  run(until = 1e9)

ptmr3 <- proc.time() - ptm3

> ptmr1
   user  system elapsed 
  0.176   0.001   0.176 
> ptmr2
   user  system elapsed 
  0.300   0.001   0.301 
> ptmr3
   user  system elapsed 
  0.324   0.002   0.325 

So roughly a 70% to 85% increase in elapsed time in this case (0.176 s vs. 0.301 s and 0.325 s), which is significant.
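
The relative overhead follows directly from the elapsed times printed above. A small worked calculation, using only those three numbers (the variable names are chosen here for illustration):

```r
# Elapsed seconds copied from the benchmark output above.
elapsed <- c(t0 = 0.176, t1 = 0.301, t2 = 0.325)

# Percentage increase relative to the constant timeout (t0).
increase_pct <- round(100 * (elapsed / elapsed[["t0"]] - 1))
# increase_pct: t0 = 0, t1 = 71, t2 = 85
```

These are single runs, so the figures are indicative only; repeating each run several times would give a steadier estimate.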

> Attributes are numeric. Supporting strings would be possible, but even more expensive in terms of performance, and not so useful in general, I think.

Don't think we should support strings either. But we can simply use e.g. indices instead of the name here.
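
A sketch of the index idea in plain R: the numeric position of a resource in a fixed vector stands in for its name, so it fits into a numeric attribute. The helpers `to_index` and `to_name` are hypothetical names, not simmer functions.

```r
# The pool must be defined once, in a fixed order, for indices to be stable.
resources <- c("dummy1", "dummy2", "dummy3")

to_index <- function(name) match(name, resources)  # character -> numeric
to_name  <- function(index) resources[index]       # numeric -> character

idx  <- to_index("dummy2")  # 2, storable as a numeric attribute
name <- to_name(idx)        # "dummy2", recovered when seizing or releasing
```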

> My approach requires the implementation of the scheduling policies in the C++ core, yes, but there are not as many useful algorithms out there for this task, I mean...

What it comes down to is a trade-off in terms of flexibility in favour of performance and vice versa. In the end it's a matter of personal preference. While I agree with your arguments, I tend to be in favour of going the flexibility route, but making sure we make it transparent what the performance impact is of using specific approaches. I feel this route would also make it easier to allow others to create plugins for simmer.

Member

Enchufa2 commented May 25, 2016

> What it comes down to is a trade-off in terms of flexibility in favour of performance and vice versa.

Well, my point is that that's not necessarily true. You could have maximum performance:

t <- create_trajectory() %>%
  select(agents, policy="shortest-queue") %>%
  seize_selected(amount=1) %>%
  ...
  release_selected(amount=1)

as well as maximum flexibility:

custom_policy <- function(options) {
  # select and return one option
}

t <- create_trajectory() %>%
  select(agents, policy=custom_policy) %>%
  seize_selected(amount=1) %>%
  ...
  release_selected(amount=1)

with the additional advantage of a pristine syntax. And with this approach, I'm sure that the latter example would be much more efficient than doing the same with attributes, because you are calling only one R function instead of three. ;-)
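
To make the `custom_policy` placeholder concrete, here is one possible policy written as an ordinary R function of the options: it picks the least recently selected option. The factory name `make_lru_policy` is hypothetical, and the snippet only exercises the function itself, since `select()` with a function policy is the interface being proposed in this thread.

```r
# Hypothetical custom policy: least recently selected option first.
make_lru_policy <- function() {
  last_used <- numeric(0)  # named numeric: option -> tick of last selection
  tick <- 0
  function(options) {
    tick <<- tick + 1
    seen <- last_used[options]  # NA for options never selected so far
    seen[is.na(seen)] <- 0
    choice <- options[which.min(seen)]
    last_used[choice] <<- tick
    choice
  }
}

agents <- c("agent1", "agent2", "agent3")
custom_policy <- make_lru_policy()
picks <- vapply(1:4, function(k) custom_policy(agents), character(1))
# picks: agent1, agent2, agent3, agent1
```

Because the policy receives the options on every call, it keeps working if the pool passed to it changes between calls.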

Evgeniy- commented May 25, 2016

Hi! Is it possible to implement it in a way that would help with the original request from the Google group ("10 agents...")? This would also enable modeling of semiconductor manufacturing networks, where multiple machines of the same kind (resources) work in parallel and lots follow a re-entrant flow. Every lot has to be processed on these machines multiple times (steps). For certain steps, the lot has to come back to exactly the same machine that processed it in the previous step. For the rest of the steps, the lot can go to any available machine. And yes, there are significant differences in waiting time between these two cases, especially when machines break down and lots have to wait for their preferred machine to be repaired. It would be nice to have a way to simulate this behavior, and to seize a resource (machine) either based on name/number, or just seize the first available resource. Thank you!
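
The two cases described here (return to the same machine vs. take any available one) can be sketched as a single decision in plain R. Everything below is an assumption made for illustration: the `idle` snapshot, the helper name `choose_machine`, and the fallback to the first machine when none is idle.

```r
# Assumed snapshot: which machines are idle right now.
idle <- c(machine1 = FALSE, machine2 = TRUE, machine3 = TRUE)

# choose_machine(): hypothetical helper, not a simmer function.
# `previous` is the machine used in the previous step (NA if none);
# `sticky` says whether this step must return to that machine.
choose_machine <- function(idle, previous = NA, sticky = FALSE) {
  # Sticky steps wait for the same machine, even if it is busy or broken.
  if (sticky && !is.na(previous)) return(previous)
  available <- names(idle)[idle]
  # Assumption: queue at the first machine when none is idle.
  if (length(available) > 0) available[1] else names(idle)[1]
}

step1 <- choose_machine(idle)                                        # "machine2"
step2 <- choose_machine(idle, previous = "machine1", sticky = TRUE)  # "machine1"
```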

Member

Enchufa2 commented May 26, 2016

Yeap, your problem would be automatically solved with this new feature this way:

engineers <- c("engineer1", "engineer2", ...)

cases <- create_trajectory("Resolve new Case") %>%
  select(engineers, policy="shortest-queue") %>%
  seize_selected(amount = 1) %>%
  branch(function() sample(1:2, 1, FALSE, c(0.4, 0.6)), merge = c(F, F),
         create_trajectory("first day close") %>%
           # agent can close this case
           timeout(function() runif(1, 30, 60)) %>%
           release_selected(amount = 1),
         create_trajectory("research/analyze") %>%
           # agent does research but will need more information from customer
           # do the research and send customer the request for information
           timeout(function() rexp(1, 0.0333)) %>%
           release_selected(amount = 1) %>%
           # the engineer is now free, but the case is not yet resolved: wait for the customer to respond
           timeout(function() rexp(1, 0.1666) ) %>%
           # repeat this cycle of steps a random number of times until the case is closed
           # roll back this trajectory and repeat with probability 30%
           rollback(amount = 4, check = function() sample(0:1, 1, FALSE, c(0.3, 0.7)))
  )

Note that you select an engineer at the beginning and then you seize the very same one in each rollback.

Member

Bart6114 commented May 26, 2016

Just challenging this a bit more 😁

What would you do in a nested structure? E.g.:
You seize a resource based on a policy (using seize_selected based on select). While still being seized, you seize another resource based on another policy (so a new seize_selected based on another select). We have now seized two resources, how easy will it be to only release the one that was seize_selected first or second?

Member

Enchufa2 commented May 26, 2016

Very good point! Your feedback is, as always, invaluable. 😉

How about using an optional identifier, so you can select as many as you need? Example:

t <- create_trajectory() %>%
  select(resource_pool_1, policy=policy_1, id=1) %>%
  select(resource_pool_2, policy=policy_2, id=2) %>%
  seize_selected(amount=1, id=1) %>%
  seize_selected(amount=1, id=2) %>%
  ...
  release_selected(amount=1, id=2) %>%
  release_selected(amount=1, id=1)
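
One way to picture the identifier is as a key into a small per-arrival store of selections. The sketch below is plain R with hypothetical names (`new_arrival`, `select`, `selected`, and the two toy policies); it does not describe how simmer stores selections internally.

```r
# Hypothetical per-arrival store of selections, keyed by id.
new_arrival <- function() {
  store <- list()
  list(
    # Remember the resource chosen by `policy` under the given id.
    select = function(resources, policy, id = 0) {
      store[[as.character(id)]] <<- policy(resources)
      invisible(NULL)
    },
    # Look up what a later seize or release with this id would act on.
    selected = function(id = 0) store[[as.character(id)]]
  )
}

# Two toy policies, just to make the selections differ.
first_in_order <- function(resources) resources[1]
last_in_order  <- function(resources) resources[length(resources)]

arrival <- new_arrival()
arrival$select(c("nurse1", "nurse2"), first_in_order, id = 1)
arrival$select(c("room1", "room2"), last_in_order, id = 2)

sel1 <- arrival$selected(1)  # "nurse1"
sel2 <- arrival$selected(2)  # "room2"
```

With a default id, the single-selection examples earlier in the thread keep working unchanged, while nested selections stay independent of each other.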

Member

Bart6114 commented May 26, 2016

Am I understanding correctly that select's policy argument would then accept either a policy string or a custom function? The string would then map to one of the more prominent policies implemented in the C++ backend, while the function would be evaluated on the R side (and thus incur a performance penalty)?

That would indeed be a very nice solution. Efficient resource select policies, custom select policies and the ability for nested usage! Looks great!
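
The string-or-function dispatch described in this comment could look roughly like the following in plain R. The table `builtin_policies`, the policy names in it, and `resolve_policy` are illustrative assumptions; in the proposal the named policies would live in the C++ core, and R functions merely stand in for them here.

```r
# Stand-ins for policies that the proposal would implement in C++.
builtin_policies <- list(
  "first-in-order" = function(options) options[1],
  "random"         = function(options) options[sample.int(length(options), 1)]
)

# Accept either a known policy name or a user-supplied function.
resolve_policy <- function(policy) {
  if (is.function(policy)) return(policy)
  if (is.character(policy) && length(policy) == 1 &&
      policy %in% names(builtin_policies)) {
    return(builtin_policies[[policy]])
  }
  stop("unknown policy: ", policy)
}

options <- c("agent1", "agent2", "agent3")
by_name <- resolve_policy("first-in-order")(options)                  # "agent1"
by_fun  <- resolve_policy(function(o) o[length(o)])(options)          # "agent3"
```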

Member

Enchufa2 commented May 26, 2016

Correct. 😄

Evgeniy- commented May 26, 2016

Iñaki, thank you for your reply! And Bart, thank you for your comment! Looking forward to this new feature!

Enchufa2 changed the title from "Implement seize_* activities, where * is a policy" to "Implement select, seize_selected, release_selected" on Jun 4, 2016

Enchufa2 added this to the v3.3.0 milestone on Jun 18, 2016

Enchufa2 closed this in ca577f9 on Jun 18, 2016
