Validator() #40
Sounds good! I'll try and think about this some more when I can and respond with anything I come up with 👍
Ok so there are a couple things I've been thinking about, and a few layers of computational complexity to worry about. I'll elaborate. We cannot really confidently choose the best HP data point based on Talos' current state. It certainly works but perhaps not to the degree some users would want. The reason is two-fold. Consider a data set and a train/validation/test split.
You see where I'm going with this! If Talos currently runs in O(N) time, where N is the number of HP permutations, then in a perfect world, to be sure about our choice of "best HP", we would need brute-force O(NKL) time, where K is the number of folds and L is the number of times you want to run statistical averages over the random initializations. So I guess my comment on this discussion would be: should we focus on a Validator() yet, or should we try to be smarter about directing the flow of Scan() so that it finds the optimal HP sooner? Perhaps one of the search methods mentioned in previous issues? I dunno, @mikkokotila, but let's discuss before we start doing more work. 😄 By the way, this is where I implement hardware accelerators since it is much too slow on CPUs. Google Colab is an amazing option for anyone who doesn't have access to an HPC cluster!
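To make the cost concrete, here is a minimal sketch of the brute-force O(NKL) evaluation described above: every one of N permutations is scored across K folds and L random initializations, and the averaged score decides the "best HP". All names here (`train_and_score`, `brute_force_best`) are hypothetical illustrations, not Talos API.

```python
import random
from statistics import mean

def train_and_score(params, fold, seed):
    # Stand-in for a real train/validate cycle; returns a dummy score
    # seeded deterministically so the sketch is reproducible.
    rng = random.Random(fold * 1000 + seed * 10 + params['units'])
    return rng.random()

def brute_force_best(permutations, k_folds=5, n_inits=3):
    """O(N*K*L): every permutation x every fold x every random init."""
    results = {}
    for i, params in enumerate(permutations):
        scores = [train_and_score(params, fold, seed)
                  for fold in range(k_folds)       # K folds
                  for seed in range(n_inits)]      # L random initializations
        results[i] = mean(scores)
    # The "best HP" is the permutation with the highest averaged score.
    return max(results, key=results.get)

perms = [{'lr': lr, 'units': u} for lr in (0.01, 0.1) for u in (8, 16)]
best = perms[brute_force_best(perms)]
```

With realistic K and L the inner list comprehension is exactly where the hardware-accelerator point above bites: each entry is a full training run.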
I agree with everything you say above. Let's figure out the optimization layer first, and then move to validating. With this in mind, I'm working on a major overhaul / refactoring of the codebase so that it is less anxiety-inducing to make major changes. Scan() is already completely cleaned, the param handling is completely rebuilt, as is reductions. These are the three things that play some role in the optimization aspect. As you may have noted, in the initial architecture I've assumed an approach where:
The idea is that we could have many different strategies, which all take as input the results from the previous rounds (from the experiment log) and, based on that input, reduce the complexity of the rest of the experiment. This in effect happens by removing select items from self.param_log. I've written an article, which should be ready to publish in the next few days, that goes a little deeper into the reasoning for this approach (as opposed to the approach where random/grid search is considered taxonomically parallel to something like Bayesian optimization). Google Colab seems amazing, will definitely try it! :)
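A toy illustration of that reduction idea: a strategy consumes the scores from earlier rounds and prunes the still-pending permutations accordingly. The function and data shapes here are hypothetical sketches, not the actual Talos internals; the only grounded detail is that pruning works by removing items from the pending param_log.

```python
def reduce_param_log(param_log, results, threshold=0.5):
    """Drop pending permutations that share a hyperparameter value with
    permutations that scored below `threshold` in earlier rounds."""
    bad_values = set()
    for params, score in results:
        if score < threshold:
            # Collect (key, value) pairs that were part of poor runs.
            bad_values.update(params.items())
    return [p for p in param_log
            if not any(item in bad_values for item in p.items())]

# Earlier rounds: lr=0.1 performed badly, so pending permutations
# that still use lr=0.1 get pruned before they are ever trained.
results = [({'lr': 0.1, 'units': 8}, 0.3), ({'lr': 0.01, 'units': 8}, 0.9)]
pending = [{'lr': 0.1, 'units': 16}, {'lr': 0.01, 'units': 16}]
remaining = reduce_param_log(pending, results)
# remaining == [{'lr': 0.01, 'units': 16}]
```

The appeal of this shape is that swapping strategies only changes the filtering function, never the Scan() loop around it.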
This is related to #17 where some additional comments can be found.
That is awesome news. I stuck a
Fantastic! We may want to begin linking things in the wiki or something 👍
Please do. It really lowers the barrier to entry into this kind of work, which I feel is incredibly important to the scientific community. Lots of smart people out there want to do ML but don't have the firepower to train deep networks. If you need any help figuring it out, feel free to email me or something. It took me a bit to figure it all out 😄, no need for both of us to waste time! In any case, after your refactoring I'll reread everything and help you clean up. Then we can move forward!
This is a nice article (with a comprehensive collection) on the metrics topic.
@x94carbone just a heads up that this is moving :) It seems that saving the model does not need any messing around with tf session / graph objects; we can just save the model as JSON inside a list in the Scan() object, and the model weights in a separate list in the Scan() object. Then we load the model from the JSON and set its weights from the corresponding weights list. This seems to be the same way one would do it from a file. Very clean. I will start testing this now. I will then move on to implementing a k-fold cross-validation for the "best model", which we can then use to build towards the discussion we've had, i.e. several best models being cross-validated and competed against each other in some meaningful way against various sampling methods.
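A dependency-free sketch of that store-and-restore flow, assuming the pattern described above. The `Model` class here is a stand-in; with Keras the same flow would be `model.to_json()` / `model_from_json()` for the architecture and `get_weights()` / `set_weights()` for the weights.

```python
import json

class Model:
    """Toy stand-in for a Keras model: a JSON-able architecture
    plus a separate weights list."""
    def __init__(self, units):
        self.units = units
        self.weights = [0.0] * units

    def to_json(self):
        return json.dumps({'units': self.units})

    @classmethod
    def from_json(cls, spec):
        return cls(json.loads(spec)['units'])

# Inside the Scan() object: one list of architectures, one of weights.
saved_models, saved_weights = [], []
for units in (4, 8):
    m = Model(units)
    m.weights = [float(units)] * units   # pretend training happened
    saved_models.append(m.to_json())
    saved_weights.append(m.weights)

# Later: rebuild the i-th model from its JSON and reattach its weights,
# with no file round-trip and no session/graph handling.
i = 1
restored = Model.from_json(saved_models[i])
restored.weights = saved_weights[i]
```

Keeping architecture and weights in parallel lists means any model from the experiment can be revived by index, which is exactly what a later cross-validation step needs.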
Well well. We have now implemented f1-score-based k-fold cross-validation. If you look at /utils/predict the whole thing becomes quite apparent. The workflow is very simple once you have concluded the experiment with Scan()
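The exact call is elided above (the real implementation lives in /utils/predict), but a dependency-free sketch of what such an f1-based k-fold evaluation does is below. `predict_fn` is a hypothetical stand-in for the best model's prediction function.

```python
def f1_macro(y_true, y_pred, classes):
    """Macro-averaged f1: per-class f1, then the unweighted mean."""
    f1s = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

def kfold_f1(x, y, predict_fn, folds=5, classes=(0, 1, 2)):
    """Split held-out data into `folds` chunks and average macro f1."""
    size = len(x) // folds
    scores = []
    for k in range(folds):
        xs = x[k * size:(k + 1) * size]
        ys = y[k * size:(k + 1) * size]
        scores.append(f1_macro(ys, [predict_fn(v) for v in xs], classes))
    return sum(scores) / folds

# Sanity check: a perfect predictor scores 1.0 on every fold.
y = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0] * 2
score = kfold_f1(list(y), y, predict_fn=lambda v: v, folds=5)
# score == 1.0
```

Macro averaging matches the multi-class case described below: each class contributes equally to the score regardless of its frequency.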
In the call, 's' is the Scan object, and x and y are the cross-validation data. In this case it's multi-class (i.e. y dims > 1) so average is set to 'macro'. TODO:
This is now available through Evaluate() and Autom8(). Closing here. |
Now that a lot of the issues are handled, I think the next big push is putting the pieces together for the Validator(), i.e. what happens after Scan() that leads to the information needed to finally train the production model.
I think that in #17 you had more or less nailed the outline of the approach, and I will follow that for now. We already have an objective measure for classification tasks in the form of score_model, so I will focus on that use-case (class predictions) first.