Darwin - %age and ETA's during backtesting, as well as mid-generation resuming #1250
scripts/genetic_backtester/darwin.js
// return periods.value;
// }
// return 1;
// },
Any reason for keeping this commented-out code (and the other blocks) in the PR?
This was part of an earlier attempt to measure time by first figuring out how many periods are being processed, and then measuring the time per period. I quickly found out that the time per period can vary based on lots of different things. However, since that part worked, I was hesitant to get rid of it entirely in case someone else here had an idea of how to leverage it.
Couple of thoughts:
9d660dc to 55c32ff
Both of these last two thoughts are now implemented. In generation 1 you can see it printing the best balance, mid-generation, of the runs that had completed so far. In generation 2 you can see it printing a progress report of the actively running sims. At the bottom, after I hit Ctrl+C, you can see it gives you a copy-and-pasteable command you can run to resume the previous backtest.
Problems I hoped to solve:
To solve resuming mid-generation:
I now generate a token for the "run" that looks like "backtest_201801300356", where the number is the timestamp. Inside it are the nested files and folders related to the data. The upside is that there are no longer millions of files scattered across a single simulations folder. Everything for the entire "run" is now stored in one place.
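As a rough illustration of the token described above, here is a minimal sketch. The function name `makeRunToken` is hypothetical, and both the `YYYYMMDDHHmm` layout and the use of UTC are assumptions read off the example token, not taken from the PR's code.

```javascript
// Hypothetical helper, not the PR's actual code: builds a run token such as
// "backtest_201801300356" from a Date.
// Assumptions: the timestamp layout is YYYYMMDDHHmm and the clock is UTC.
function makeRunToken(date) {
  const pad = (n) => String(n).padStart(2, '0');
  const stamp =
    String(date.getUTCFullYear()) +
    pad(date.getUTCMonth() + 1) + // getUTCMonth() is zero-based
    pad(date.getUTCDate()) +
    pad(date.getUTCHours()) +
    pad(date.getUTCMinutes());
  return 'backtest_' + stamp;
}

// Example: 30 January 2018, 03:56 UTC
console.log(makeRunToken(new Date(Date.UTC(2018, 0, 30, 3, 56))));
```

A sortable timestamp like this keeps run folders in chronological order when listed alphabetically, which may be part of the appeal of the format.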
To resume, you just add --population_data=backtest_201801300356 to your previous command and it'll restart on the correct generation and sim set. So if you were at 87/100 sims and you closed it and restarted, you'll still be at 87/100 sims remaining.
Another benefit is that as soon as a sim is done, you can view its info and results in its own file within its generation folder, including the HTML output.
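The resume behaviour described above could be sketched roughly as follows. This works on a hypothetical in-memory view of a run, not on the PR's real file layout: `findResumePoint`, the `sims` array, and the convention that a finished sim has a `result` field are all assumptions made for illustration.

```javascript
// Hypothetical sketch, not the PR's actual code: find the first generation
// that still has unfinished sims, so a restarted run can pick up there.
// Assumption: a sim counts as finished once it has a `result` field.
function findResumePoint(generations) {
  for (let g = 0; g < generations.length; g++) {
    const sims = generations[g].sims;
    const pending = sims.filter((sim) => sim.result === undefined);
    if (pending.length > 0) {
      return {
        generation: g + 1, // generations are numbered from 1 here
        done: sims.length - pending.length,
        total: sims.length,
        pending: pending.map((sim) => sim.id),
      };
    }
  }
  return null; // every stored generation is complete
}

// Example: generation 1 is finished, generation 2 has one of three sims done.
const run = [
  { sims: [{ id: 'a', result: 1.2 }, { id: 'b', result: 0.9 }] },
  { sims: [{ id: 'c', result: 1.1 }, { id: 'd' }, { id: 'e' }] },
];
console.log(findResumePoint(run));
```

Because each sim's outcome is stored as soon as it finishes, only the pending sims of the interrupted generation need to be run again.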
The only downside here is that this is not backward compatible with the old population_data param, so any old runs you may have stored will no longer be resumable. I could revisit this if people think it's a real problem, but I didn't want to clutter the codebase with multiple solutions to the same problem.
To add meaningful console output:
Instead of printing out the command along with the sim number, the command info is now written to the sim's file in the generation folder immediately, which frees the console output for something more useful.
Final thoughts:
I don't actually think the PR should simply be accepted as is. I'm hoping some people will really hammer at this and give some feedback and ideas on how to give it better polish. I was originally hoping that I could give an ETA for the entire run early on, but considering some sims may take much longer than others, I can't figure out a way that doesn't involve lots of guesswork with averages, and that'll just annoy people because it'll be wrong.
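To make the concern about averages concrete, here is a minimal sketch of the kind of naive estimate being described. The function `naiveEtaMs` and its parameters are hypothetical and not part of the PR. It assumes sims run one at a time and that remaining sims will take about as long as the mean of the completed ones, which is exactly the assumption that tends to fail.

```javascript
// Hypothetical sketch, not the PR's actual code: a naive run-wide ETA based
// on the mean duration of the sims completed so far.
// Assumptions: sims run sequentially and future sims resemble past ones.
function naiveEtaMs(completedDurationsMs, totalSims) {
  if (completedDurationsMs.length === 0) {
    return null; // nothing measured yet, so no estimate is possible
  }
  const sum = completedDurationsMs.reduce((acc, ms) => acc + ms, 0);
  const mean = sum / completedDurationsMs.length;
  const remaining = totalSims - completedDurationsMs.length;
  return mean * remaining;
}

// Two quick sims and one slow one, out of six in total.
console.log(naiveEtaMs([1000, 1000], 6));        // estimate before the slow sim
console.log(naiveEtaMs([1000, 1000, 10000], 6)); // estimate after the slow sim
```

In this example the estimate jumps from 4 seconds to 12 seconds after a single slow sim finishes, even though fewer sims remain, which illustrates why an early ETA for the whole run is likely to mislead.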
At any rate, I feel like there's more opportunity for useful output there but now that I've hit a solid stopping point I'm low on ideas. Let me know what you think!