This repository has been archived by the owner on Dec 7, 2022. It is now read-only.

Rollout algorithm #90

Merged (50 commits) on Sep 30, 2022

Conversation

leonlan
Collaborator

@leonlan leonlan commented Sep 26, 2022

This PR

  • introduces the rollout algorithm (#93), which is a dispatching policy based on Monte Carlo simulations
  • introduces run_dispatch, which takes as input a dispatching strategy to solve the dynamic problem
  • adds an initial solutions argument to solve_static (not used yet)
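The rollout idea described above can be illustrated with a small sketch. This is hypothetical: the names rollout_dispatch, simulate, and evaluate are made up for illustration, and the real interface in dynamic/run_dispatch.py differs in detail.

```python
import numpy as np

def rollout_dispatch(n_requests, simulate, evaluate, n_simulations=10,
                     threshold=0.5, rng=None):
    """Monte Carlo rollout sketch: score each current request by how often
    it gets dispatched across simulated future scenarios, then dispatch the
    requests chosen in at least `threshold` of the simulations."""
    rng = rng if rng is not None else np.random.default_rng()
    votes = np.zeros(n_requests)

    for _ in range(n_simulations):
        scenario = simulate(rng)         # sample future requests
        votes += evaluate(scenario)      # 0/1 decision per current request

    return votes / n_simulations >= threshold
```

A dispatching strategy with this shape is what run_dispatch would plug in to drive the dynamic problem.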

@N-Wouda
Owner

N-Wouda commented Sep 26, 2022

> There is a 75ms overhead for running simulations

Because of moving data between C++ and Python? I suspect we can shorten this a bit if rollout turns out to be promising.

@leonlan
Collaborator Author

leonlan commented Sep 26, 2022

> There is a 75ms overhead for running simulations
>
> Because of moving data between C++ and Python? I suspect we can shorten this a bit if rollout turns out to be promising.

I checked again: it's roughly 10-20 ms for the C++/Python interaction due to hgspy; the rest of the time is spent on everything in a simulation that isn't running the solver.

@N-Wouda
Owner

N-Wouda commented Sep 26, 2022

In multiple places we now have time_limit - 1 to ensure we stay within the contest's time limits. But our overhead is not that big, and there's a 2 second grace period. So maybe we should do time_limit + 1 instead? That still gives us 1s to wrap up, and 2s more time for computations (per epoch of 60s; that's almost 3.5% more).
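Using the numbers stated above (60s epochs, a 2s grace period), the trade-off works out as follows:

```python
EPOCH_TIME_LIMIT = 60  # seconds of compute budget per epoch
GRACE_PERIOD = 2       # extra seconds the contest tolerates

conservative = EPOCH_TIME_LIMIT - 1  # current approach: 59s of compute
proposed = EPOCH_TIME_LIMIT + 1      # suggested: 61s, still 1s inside grace

extra = proposed - conservative          # 2 extra seconds per epoch
relative_gain = extra / conservative     # about 3.4% more compute time
```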

@leonlan
Collaborator Author

leonlan commented Sep 30, 2022

I benchmarked 78408e3. Performance is retained, i.e., roughly a 10k improvement over greedy+.

@N-Wouda
Owner

N-Wouda commented Sep 30, 2022

I'll continue with this today, mostly by profiling stuff and finding ways to make it faster.

Comment on lines 26 to 41
cust_idx = rng.integers(n_customers, size=n_samples) + 1
tw_idx = rng.integers(n_customers, size=n_samples) + 1
service_idx = rng.integers(n_customers, size=n_samples) + 1

# These are unnormalized time windows and release times, which are used to
# determine request feasibility. Will be clipped later.
sim_tw = tws[tw_idx]
sim_epochs = np.repeat(np.arange(1, max_lookahead + 1), EPOCH_N_REQUESTS)
sim_release = start_time + sim_epochs * EPOCH_DURATION
sim_service = static_inst["service_times"][service_idx]

# Earliest arrival is release time + drive time or earliest time window.
earliest_arrival = np.maximum(sim_release + dist[0, cust_idx], sim_tw[:, 0])
earliest_return = earliest_arrival + sim_service + dist[cust_idx, 0]
feas = (earliest_arrival <= sim_tw[:, 1]) & (earliest_return <= tws[0, 1])
Owner


It happens fairly often that fewer than 30% of the sampled customers are marked feasible in feas. That's unfortunate, of course, but I do not see a good way around it, because we should replicate the way the controller generates customers (and they do this as well).

But it does mean we can postpone some work until after we know the (much smaller) subset of feasible customers.
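That postponement idea can be sketched on toy stand-in data (shapes only; the values are made up, and the time windows are indexed by the same customer draw here for brevity, unlike the snippet above): compute the cheap feasibility mask first, then restrict the remaining arrays to the feasible subset before doing anything expensive.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the instance data.
n_customers, n_samples = 50, 200
tws = np.sort(rng.integers(0, 200, (n_customers + 1, 2)), axis=1)
tws[0] = [0, 300]                                   # depot time window
dist = rng.integers(1, 30, (n_customers + 1, n_customers + 1))
service = rng.integers(1, 10, n_customers + 1)

cust_idx = rng.integers(n_customers, size=n_samples) + 1
sim_tw = tws[cust_idx]
sim_release = rng.integers(0, 150, n_samples)

# Cheap feasibility mask first...
earliest_arrival = np.maximum(sim_release + dist[0, cust_idx], sim_tw[:, 0])
earliest_return = earliest_arrival + service[cust_idx] + dist[cust_idx, 0]
feas = (earliest_arrival <= sim_tw[:, 1]) & (earliest_return <= tws[0, 1])

# ...then restrict everything to the (much smaller) feasible subset before
# doing more expensive per-request work.
cust_idx, sim_tw, sim_release = cust_idx[feas], sim_tw[feas], sim_release[feas]
```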

Comment on lines +46 to +47
if n_new_customers == 0:  # this should not happen a lot
    return simulate_instance(info, obs, rng, n_lookahead)
Owner

@N-Wouda N-Wouda Sep 30, 2022


This will probably never happen, but it might, and then the code below it crashes. So to avoid that we should just try again.

@N-Wouda
Owner

N-Wouda commented Sep 30, 2022

@leonlan @jaspervd96 I think this is basically done for a first implementation. Shall we merge this?

Collaborator Author

@leonlan leonlan left a comment


LGTM. Two remaining (small) points are:

  • Use the environment to get constants EPOCH_N_REQUESTS and EPOCH_DURATION.
  • Prevent simulation exceeding time limit by too much.

Do you mind addressing these points as well?
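For the second point, a common pattern is to stop launching simulations once the remaining budget is smaller than the longest simulation observed so far. The run_simulations helper below is a hypothetical sketch of that pattern, not what the PR actually implements:

```python
import time

def run_simulations(simulate_once, time_limit, safety_margin=1.0):
    """Run as many simulations as fit within time_limit (seconds), keeping
    a safety margin. Uses the longest simulation seen so far to predict
    whether starting another one would overrun the deadline."""
    deadline = time.perf_counter() + time_limit - safety_margin
    results, longest = [], 0.0

    while time.perf_counter() + longest < deadline:
        start = time.perf_counter()
        results.append(simulate_once())
        longest = max(longest, time.perf_counter() - start)

    return results
```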

@N-Wouda
Owner

N-Wouda commented Sep 30, 2022

Sure!

@N-Wouda
Owner

N-Wouda commented Sep 30, 2022

> Use the environment to get constants EPOCH_N_REQUESTS and EPOCH_DURATION.

Turns out that removing these is pretty hard because the run_dispatch function is in-between. Let's keep this as-is, and refactor if and when we get rid of other dynamic strategies.

@N-Wouda
Owner

N-Wouda commented Sep 30, 2022

@jaspervd96 can you approve if you're OK with this PR? I'll wait for the CI to complete as well.

@jmhvandoorn
Collaborator

> @jaspervd96 can you approve if you're OK with this PR? I'll wait for the CI to complete as well.

Will do. I only added one more small suggestion.

Co-authored-by: jaspervd96 <jasper.vandoorn@hotmail.com>
@N-Wouda N-Wouda merged commit 71500c0 into main Sep 30, 2022
@N-Wouda N-Wouda deleted the rollout branch September 30, 2022 14:39

4 participants