.cv_results_ does not include info from first generation #27
Comments
@rsteca or even @ryanpeach - any thoughts on how to get the first generation included in |
I'll take a look. |
In the _fit function, where most of the work is done, the history object is calculated after the mate+mutate step. As such, there is a problem: either we keep it the way it is and lose the 1st generation information, or we move it above the mate+mutate step and lose the last generation. Maybe we should have a special "first run" condition which saves the 1st generation data. |
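To make the "first run" idea concrete, here is a rough sketch in plain Python. It is not the sklearn-deap code: `evaluate`, `vary`, `run`, and the `record_first` flag are all hypothetical names, selection is left out for brevity, and `vary` is only a stand-in for the mate+mutate step. The point is just that recording the initial population once, before the loop, keeps both the first and the last generation.

```python
import random


def evaluate(individual):
    # Hypothetical fitness function: count the 1 bits.
    return sum(individual)


def vary(population, rng):
    # Stand-in for mate+mutate: flip one random bit in a copy of each individual.
    offspring = []
    for ind in population:
        child = list(ind)
        i = rng.randrange(len(child))
        child[i] = 1 - child[i]
        offspring.append(tuple(child))
    return offspring


def run(generations, pop_size=4, n_bits=5, record_first=True, seed=0):
    rng = random.Random(seed)
    population = [
        tuple(rng.randint(0, 1) for _ in range(n_bits)) for _ in range(pop_size)
    ]
    results = []  # rows of (generation, individual, score)

    if record_first:
        # Special "first run" condition: record generation 0 before any variation.
        results.extend((0, ind, evaluate(ind)) for ind in population)

    for gen in range(1, generations + 1):
        population = vary(population, rng)
        # Recording happens after the variation step, as described above.
        results.extend((gen, ind, evaluate(ind)) for ind in population)

    return results


with_first = {gen for gen, _, _ in run(1, record_first=True)}
without_first = {gen for gen, _, _ in run(1, record_first=False)}
print(sorted(with_first))
print(sorted(without_first))
```

With `record_first=False` the rows for generation 0 never reach the results, which matches the symptom reported in this issue.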
That sounds good to me. Personally, I'm more interested in the 1st run than the last run (first run is where we try all the crazy ideas, and we'll see the most variance across different hyperparameter combinations, while last run is generally a bit safer and more boring combinations of things we've already tried before). But I like your idea- sounds like a pretty simple bit of code to get all the data people would expect. Thanks for finding that! |
Someone test this branch #29 |
I basically discovered that we just haven't included the evaluation step of the population in the history logger. I've now added both evaluation and selection steps but they need testing. |
Hey, so I think we have a misunderstanding. cv_results_ does not include "generation information"; it includes all generated individuals from all generations. It's a pretty big table... |
@ryanpeach Yeah, I understand that we're including individuals in I ran into this issue when I ran a pretty small search space that was only two generations, and the second generation was primarily just re-picking candidates from the first generation. Try setting Thanks for looking into this! It's a really cool project, and a pretty big improvement over gridsearch |
@ClimbsRocks Great, ok just being clear. Wasn't sure. I'm actually not super familiar with how DEAP works (which is the framework we use). I am following the code referenced here: http://deap.readthedocs.io/en/master/api/tools.html
Here:
And here
Just for the reference. I'm lost as to how the history object works, but I think it contains all individuals ever populated in pop, and then "decorates" those individuals by the decorator commands "such as, creating a graph of who was selected, or who mated with who." But the evaluation step is saved in the history automatically I think. I'll keep looking I guess, just thinking out loud. |
Hey, so I did what you said and I'm just not replicating the results. On the test.ipynb notebook (use my fork), if you set generations_number to 1 you still get some individuals. Note, there won't be the same number of individuals as population_size indicates, because if 2 individuals are functionally the same, they are treated as the same (so a population of 3 "111" individuals in history just shows up as a single individual "111"). You sure you aren't just miscounting? If you are sure this is still an issue, please provide an example jupyter notebook. Thanks! |
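A small illustration of why the row count can be lower than population_size, again in plain Python rather than the library's own code. `unique_individuals` is a hypothetical helper, and the assumption is only that individuals with identical genes are stored once:

```python
def unique_individuals(population):
    # Collapse individuals with identical genes into one entry,
    # keeping the order in which they were first seen.
    seen = {}
    for ind in population:
        seen.setdefault(tuple(ind), ind)
    return list(seen.values())


# Three identical "111" individuals plus one different individual.
population = [[1, 1, 1], [1, 1, 1], [1, 1, 1], [0, 1, 0]]

print(len(population))                      # 4
print(len(unique_individuals(population)))  # 2
```

So when comparing the number of rows against population_size, it is worth counting distinct individuals rather than raw population slots.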
And... now I'm seeing it. I swear it worked just a min ago... |
Nope, nvm, it works as expected. Here is a link to my notebook: https://github.com/ryanpeach/sklearn-deap/blob/test_issue27/test.ipynb |
I think there's a fenceposting/off-by-one error somewhere.
When I pass in generations_number = 1, it's actually 0-indexed and gives me 2 generations. Similarly, if I pass in 2 generations, I actually get 3.
Then, when I examine the cv_results_ property, I only get the results from the generations after the first one (the 0-indexed generation).
This is most apparent if you set generations_number = 1.
I looked through the code quickly, but didn't see any obvious source of it. Hopefully someone who knows the library can find it more easily!
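For anyone trying to reason about the counting, the behaviour described above is what you would expect if the initial population counts as generation 0 and the main loop then runs from 1 to generations_number inclusive. The snippet below only models that assumption; `generation_labels` and `log_initial` are hypothetical names, not part of the library:

```python
def generation_labels(generations_number, log_initial):
    # Return the generation indices that would end up in a results log.
    logged = []
    if log_initial:
        logged.append(0)  # generation 0: the initial population
    for gen in range(1, generations_number + 1):
        logged.append(gen)  # generations produced inside the main loop
    return logged


print(generation_labels(1, log_initial=True))   # [0, 1]
print(generation_labels(1, log_initial=False))  # [1]
print(generation_labels(2, log_initial=True))   # [0, 1, 2]
```

Under that assumption, generations_number = 1 evaluates two populations, and a log that is only written inside the loop would contain just the second one.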