Rollout algorithm #90
Conversation
This generalizes random, dqn, and rollout so we don't have to write a separate `run_xxx` for every strategy.
dqn needs to load a network model before starting the epochs. This is currently not possible in run_dispatch.
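One way the generalized entry point could look, as a minimal sketch (only the name `run_dispatch` comes from the PR; the strategy registry, the optional `setup` hook for loading the DQN model before the epochs, and the environment interface are all assumptions):

```python
# Hypothetical sketch (not the PR's actual code) of a generalized
# run_dispatch: each strategy is a plain callable, and strategies that
# need one-time setup (e.g. dqn loading a network model) register an
# optional `setup` hook that runs once before the epochs start.
STRATEGIES = {}

def register(name, func, setup=None):
    """Register a dispatching strategy with an optional setup hook."""
    STRATEGIES[name] = (func, setup)

def run_dispatch(name, env):
    func, setup = STRATEGIES[name]
    state = setup(env) if setup is not None else None  # e.g. load DQN weights
    obs, done = env.reset(), False
    while not done:
        obs, done = env.step(func(obs, state))
    return obs
```

With this shape, adding a new strategy is one `register` call instead of a new `run_xxx` script, and the dqn model-loading concern becomes its `setup` hook.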
Because of moving data between C++ and Python? I suspect we can shorten this a bit if rollout turns out to be promising.
I checked again: it's roughly 10-20 ms for the C++ and Python interaction due to …
In multiple places we now have …
I benchmarked 78408e3. Performance is retained, i.e., roughly a 10k improvement over greedy+.
I'll continue with this today, mostly by profiling stuff and finding ways to make it faster.
dynamic/rollout/simulate_instance.py
cust_idx = rng.integers(n_customers, size=n_samples) + 1
tw_idx = rng.integers(n_customers, size=n_samples) + 1
service_idx = rng.integers(n_customers, size=n_samples) + 1

# These are unnormalized time windows and release times, which are used to
# determine request feasibility. Will be clipped later.
sim_tw = tws[tw_idx]
sim_epochs = np.repeat(np.arange(1, max_lookahead + 1), EPOCH_N_REQUESTS)
sim_release = start_time + sim_epochs * EPOCH_DURATION
sim_service = static_inst["service_times"][service_idx]

# Earliest arrival is release time + drive time or earliest time window.
earliest_arrival = np.maximum(sim_release + dist[0, cust_idx], sim_tw[:, 0])
earliest_return = earliest_arrival + sim_service + dist[cust_idx, 0]
feas = (earliest_arrival <= sim_tw[:, 1]) & (earliest_return <= tws[0, 1])
It happens fairly often that `feas` holds for less than 30% of the initial customers. That's unfortunate, of course, but I do not see a good way around this, because we should replicate the way the controller generates customers (and they do this as well).
But it does mean we can postpone some work until after we know the (much smaller) subset of feasible customers.
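The suggested postponement can be sketched as follows (a toy NumPy example: the sizes, depot time window, and sampled data are all made up, and the array names only loosely mirror the diff above, with the separate `tw_idx` sampling simplified away):

```python
import numpy as np

# Made-up problem data; in the real code these come from the instance.
rng = np.random.default_rng(42)
n_samples, n_customers = 1_000, 50

dist = rng.integers(1, 100, size=(n_customers + 1, n_customers + 1))
tws = np.sort(rng.integers(0, 1_000, size=(n_customers + 1, 2)), axis=1)
tws[0] = (0, 2_000)  # depot time window (assumed)
service = rng.integers(1, 30, size=n_customers + 1)

cust_idx = rng.integers(n_customers, size=n_samples) + 1
release = rng.integers(0, 500, size=n_samples)

# Compute the cheap feasibility mask first...
earliest_arrival = np.maximum(release + dist[0, cust_idx], tws[cust_idx, 0])
earliest_return = earliest_arrival + service[cust_idx] + dist[cust_idx, 0]
feas = (earliest_arrival <= tws[cust_idx, 1]) & (earliest_return <= tws[0, 1])

# ...then restrict any expensive follow-up work to the feasible subset,
# which is often much smaller than n_samples.
feas_cust = cust_idx[feas]
feas_release = release[feas]
```

Since boolean-mask indexing copies only the selected rows, everything built after the mask scales with the feasible count rather than with `n_samples`.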
if n_new_customers == 0:  # this should not happen a lot
    return simulate_instance(info, obs, rng, n_lookahead)
This will probably never happen, but it might, and then the code below it crashes. So to avoid that we should just try again.
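The retry idea can also be written as a bounded loop, which avoids the (purely theoretical) unbounded recursion of calling `simulate_instance` again from inside itself. This is a hypothetical helper, not the PR's code; `sample_fn` stands in for the sampling step:

```python
def sample_until_nonempty(sample_fn, rng, max_retries=100):
    """Resample until at least one new customer is drawn.

    `sample_fn(rng)` is a stand-in for the sampling step inside
    simulate_instance and returns the number of new customers drawn.
    """
    for _ in range(max_retries):
        n_new = sample_fn(rng)
        if n_new > 0:  # the common case: proceed with the simulation
            return n_new
    raise RuntimeError("no new customers sampled after max_retries draws")
```

Either form works; the loop just puts an explicit bound on the retry behavior.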
@leonlan @jaspervd96 I think this is basically done for a first implementation. Shall we merge this?
LGTM. Two remaining (small) points are:
- Use the environment to get the constants `EPOCH_N_REQUESTS` and `EPOCH_DURATION`.
- Prevent the simulation from exceeding the time limit by too much.

Do you mind addressing these points as well?
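The first point could look something like this sketch (the attribute names on the environment and the fallback defaults are assumptions; the PR only names the two constants):

```python
def get_epoch_constants(env, default_n_requests=100, default_duration=3600):
    """Hypothetical accessor: read EPOCH_N_REQUESTS and EPOCH_DURATION from
    the environment instead of hard-coded module-level constants, falling
    back to (made-up) defaults if the env does not expose them."""
    n_requests = getattr(env, "EPOCH_N_REQUESTS", default_n_requests)
    duration = getattr(env, "EPOCH_DURATION", default_duration)
    return n_requests, duration
```

Centralizing the lookup in one accessor also makes it easy to later swap the fallbacks for a hard error once every environment provides the constants.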
Sure!
Turns out that removing these is pretty hard because the …
@jaspervd96 can you approve if you're OK with this PR? I'll wait for the CI to complete as well.
… reason. So we'll just use the (static) benchmark for now.
Will do. Only added 1 more small suggestion.
Co-authored-by: jaspervd96 <jasper.vandoorn@hotmail.com>
This PR adds:
- `run_dispatch`, which takes as input a dispatching strategy to solve the dynamic problem
- `solve_static` (not used yet)