Impact of simulation-solution quality on rollout performance #100
The table below shows the average results over 10 seeds. Rollout-long is rollout where the simulation instances are solved using 500 HGS iterations and a slightly adjusted config. The number of simulations in rollout and rollout-long is kept similar, so that the only difference is the quality of the simulation solutions. There is no real epoch time limit for rollout-long; the only restriction is that the dispatch instance is solved in half the epoch time, just like rollout. Having better-quality simulation solutions gives a ~400 improvement. This is not a lot, given that the solutions from rollout-long are quite a bit better than rollout's. I don't have hard numbers on this, but these "long" solutions are on average 5% better for the first few epochs. In the later epochs this drops to 0.1-0.5%, because the simulation instances become smaller.
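For concreteness, the setup is roughly the following (a minimal sketch; `solve_hgs` and its argument names are placeholders, not the actual solver interface in the repo):

```python
EPOCH_TLIM = 60  # seconds per epoch (competition setting)


def solve_hgs(instance, *, time_limit=None, max_iterations=None):
    """Placeholder for the HGS solver call; not the actual interface."""
    raise NotImplementedError


def run_epoch_rollout_long(dispatch_instance, simulation_instances):
    # Each simulation instance is solved with 500 HGS iterations and no
    # per-simulation time limit; only the *number* of simulations is
    # matched to what regular rollout would have done.
    sim_solutions = [
        solve_hgs(inst, max_iterations=500) for inst in simulation_instances
    ]

    # The dispatch instance is solved in half the epoch time, exactly
    # like regular rollout.
    dispatch_solution = solve_hgs(dispatch_instance, time_limit=EPOCH_TLIM / 2)
    return dispatch_solution, sim_solutions
```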
@leonlan what am I looking at in the above results? I guess dynamic performance, but how did you increase the epoch time limit to allow for more iterations?
@N-Wouda I updated my earlier comment! There's some merit in improving the solution quality, but the current experiment shows that the effect is only very small. I'll try a few more experiments in the coming days.
I ran a similar experiment to the one mentioned earlier (10 seeds). Just like before, rollout-long improves on rollout, but only by a little (500-ish). The results are very similar to the previous experiment, even though we have a 120s epoch time limit and more lookahead periods. As said before, more simulations generally don't lead to better rollout performance (i.e., we already have enough simulations). Moreover, using more than 3 lookahead periods doesn't produce many more requests, and those extra requests are hard to combine (tight time windows and release dates).
In summary: all these observations have to be interpreted within the context of the threshold dispatching criterion. If we use some other dispatching criterion based on costs, this may turn out completely differently; see #101 for more on that. I'll leave this issue open for a while to discuss these experiments. Also let me know if you have any suggestions for numerical experiments to try out.
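For reference, a threshold criterion along these lines (a minimal sketch with an illustrative solution representation; this is not necessarily the exact rule in the repo):

```python
def threshold_dispatch(simulation_solutions, optional_requests, threshold=0.15):
    """
    Dispatch an optional request if it is dispatched (not postponed) in
    at least a `threshold` fraction of the simulation solutions.

    `simulation_solutions` is a list of solutions; each solution is a
    list of routes; each route is a list of request ids.
    """
    if not simulation_solutions:
        return set()

    counts = dict.fromkeys(optional_requests, 0)
    for solution in simulation_solutions:
        dispatched = {req for route in solution for req in route}
        for req in optional_requests:
            counts[req] += req in dispatched

    n_sims = len(simulation_solutions)
    return {req for req, count in counts.items() if count / n_sims >= threshold}
```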
Re. tuning: it's easiest to tune if we pass the settings in via function arguments, rather than the constants file we have now. For this I suggest we refactor fairly substantially, removing the constants file. I'll see if I can write out a tuning script with SMAC next week, both for static and dynamic/rollout.
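Something like this, assuming the SMAC3 2.x / ConfigSpace API; the parameter names, ranges, and the `run_rollout` evaluation function are placeholders for our own code:

```python
from ConfigSpace import ConfigurationSpace
from smac import HyperparameterOptimizationFacade, Scenario


def run_rollout(n_iterations: int, dispatch_threshold: float, seed: int) -> float:
    """Placeholder: run rollout on a validation set, return average cost."""
    raise NotImplementedError


def objective(config, seed: int = 0) -> float:
    # SMAC minimizes the returned value, so we return the average cost.
    return run_rollout(config["n_iterations"], config["dispatch_threshold"], seed)


# Integer tuples give integer ranges, float tuples give float ranges.
cs = ConfigurationSpace({
    "n_iterations": (100, 1000),
    "dispatch_threshold": (0.05, 0.50),
})

scenario = Scenario(cs, n_trials=200)
smac = HyperparameterOptimizationFacade(scenario, objective)
incumbent = smac.optimize()
print(incumbent)
```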
Sure. We should keep greedy as a baseline reference, but we can just make a module for that as well.
I changed the configuration (rollout_long30). The differences between rollout and rollout-long are now much larger: there's now a 1500 difference, up from 500 before. Interesting! What's also interesting: rollout with a 0.15 threshold vs. a 0.20 threshold is still about the same.
Very nice!! Could you share the exact configuration you are using for rollout_long30?
Here you can find it: Euro-NeurIPS-2022/dynamic/rollout/rollout_long.py, lines 52 to 58 at commit 5cc1067.
Note that it still uses an older rollout code version (but with the same performance). The solutions with rollout_long are 2-8% better than rollout's in the earlier epochs; in the later epochs this falls off because the instances become smaller. Actually, I even fall back on regular rollout if rollout-long does not improve the solutions. But these are all (insignificant) details, I hope 😅.
How is that possible? The time to run a single simulation is 20x as long, right? Or what am I missing?
Sorry, that's a bit confusing indeed. I extrapolate from a single "normal" rollout simulation how long one simulation takes. Then I calculate how many normal simulations that would have resulted in. This is precisely the number of "long" simulations that we are allowed to do in rollout-long.
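In code, the calculation is roughly this (a minimal sketch; `simulate` and `simulation_budget` are illustrative names):

```python
import time


def n_long_simulations(simulate, simulation_budget: float) -> int:
    """
    Time a single "normal" rollout simulation and extrapolate how many
    such simulations would fit in the budget. Rollout-long is then
    allowed exactly that many (much slower, higher-quality) simulations,
    so both variants use the same *number* of simulations.
    """
    start = time.perf_counter()
    simulate()  # one normal rollout simulation
    elapsed = time.perf_counter() - start

    return max(1, int(simulation_budget // elapsed))
```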
So do I understand correctly that "rollout long" is relevant as a test, but not possible within 60s per epoch?
That's correct. These experiments show that, if we can get "rollout-long quality" solutions within 60s, then we can improve our dynamic performance even further. At the moment this is not possible, but to be honest, achieving it is not a very big challenge (see my comment above with three suggestions on how to improve solution quality). I also think that tuning HGS will work, but I have a poor understanding of how to tune these parameters.
Unfortunately, the results for rollout_long30 were incomplete. A single instance set was missing, which made the comparison faulty, since the other methods did have results for all instance sets. The sole reason rollout_long30 had better results was this missing data. This means that my conclusion that improving the simulation-solution quality could lead to drastic improvements was incorrect :-( There are small gains (1-2K), which have already been merged in #132. I'll close this issue, but feel free to reopen it when relevant.