Getting Started #1
Getting my driver script ready. After thinking through versions that would make sense for the future, I'm just trying to get it done simply using a CSV. On one hand, I don't expect hyperparameter tuning to matter much for this one. On the other, every model will probably be mediocre, so a big ensemble might do a good job. Either way, here are some initial thoughts on things we'd want to experiment with. Feel free to add to or update this list directly.
Surely more. Also we can try predicting the outliers themselves. I don't have a lot of hope there, but it seems reasonable to give it a shot.
Also, I haven't added code to run these in parallel yet, but it's possible in R.
Added my thoughts to the main comment here.
Cool. Yes, some good additions.
The driver idea sounds cool. I would like to share my experience using Spearmint for parameter search. It works really well for Random Forest and Extra Trees. I can share how to use it if any of you are interested.
Oh, great. Yes, Bayesian optimization is often where people go. It's a great way to keep the computer busy, too. I'm familiar with it from here: http://fastml.com/tuning-hyperparams-automatically-with-spearmint/
Related is this paper that we've been looking over at H2O:
Cool stuff, Thakur. It would be great if you could use scikit-learn's GBM (or anything else in scikit-learn), since it supports MAE.
Also related: somebody interviewing at H2O used this, which lets you drive scikit-learn with a config file, similar to what I'm trying.
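For anyone following along, here is a rough sketch of what "driving with a config file" could look like. It is not the tool mentioned above and not the actual driver script: the config layout, the `expand_grid` helper, and the parameter values are all made up for illustration. It only turns a JSON config into one parameter set per run.

```python
import itertools
import json

# Hypothetical config: the model name and the grid values are illustrative only.
CONFIG_TEXT = """
{
  "model": "RandomForestRegressor",
  "grid": {
    "n_estimators": [100, 500],
    "max_features": [0.3, 0.5, 1.0]
  }
}
"""


def expand_grid(config):
    """Yield one parameter dict per combination listed in config['grid']."""
    names = sorted(config["grid"])
    value_lists = [config["grid"][name] for name in names]
    for values in itertools.product(*value_lists):
        yield dict(zip(names, values))


config = json.loads(CONFIG_TEXT)
runs = list(expand_grid(config))
for params in runs:
    # A real driver would fit the model here and record the score.
    print(config["model"], params)
```

Each dict in `runs` could then be passed to whatever model constructor the driver uses, which keeps the list of experiments in a file rather than in code.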
Actually, Spearmint needs two files by default. Attached are the example files from the Spearmint competition for Random Forest: config.txt --- config.json
You need to have MongoDB installed to use Spearmint. I will write up a short step-by-step guide to Spearmint and post it tomorrow.
Thanks for sharing. This is great to see while I'm trying to create my own; it makes the idea a lot more concrete. And I get the MongoDB part, too. I'm just writing out text files. Originally I had it continually updating the same CSV, but once I went parallel that was no longer a good idea. The reason I want everything together rather than spread across separate files is so I can analyze it efficiently. For me a CSV is easy. A Mongo query is surely easy, too, but it comes with a fairly large overhead, unfortunately. As I get closer to the final vision of my simple thing, I'll talk more about why I want what I want, and how that might differ from Spearmint.
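One way to get both (safe parallel writes and a single file to analyze) is to have each run write its own small file and merge them afterwards. This is only a sketch under that assumption, not the actual driver: `write_result`, `merge_results`, and the file naming are hypothetical.

```python
import csv
import json
import os
import tempfile
import uuid


def write_result(results_dir, params, score):
    """Write one run's result to its own JSON file, so parallel runs never share a file."""
    os.makedirs(results_dir, exist_ok=True)
    record = {"params": params, "score": score}
    path = os.path.join(results_dir, f"run_{uuid.uuid4().hex}.json")
    with open(path, "w") as f:
        json.dump(record, f)
    return path


def merge_results(results_dir, out_csv):
    """Collect every per-run file into a single CSV for analysis."""
    rows = []
    for name in sorted(os.listdir(results_dir)):
        if name.endswith(".json"):
            with open(os.path.join(results_dir, name)) as f:
                record = json.load(f)
            row = dict(record["params"])
            row["score"] = record["score"]
            rows.append(row)
    # Use the union of all column names so runs with different parameters still fit.
    fieldnames = sorted({key for row in rows for key in row})
    with open(out_csv, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return rows


if __name__ == "__main__":
    results_dir = tempfile.mkdtemp()
    # Placeholder parameters and scores, purely to show the file flow.
    write_result(results_dir, {"model": "rf", "n_trees": 100}, 0.5)
    write_result(results_dir, {"model": "gbm", "depth": 3}, 0.4)
    merge_results(results_dir, os.path.join(results_dir, "all_runs.csv"))
```

The merge step can be rerun at any time, so the combined CSV stays a derived artifact and no worker ever has to lock it.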
My first run's results:
As I mentioned, I use these issues mainly to communicate with richer formatting (Markdown: code, tables, etc.).
You'll probably get emails whenever I update it, but in case things render poorly, the issue itself on GitHub will likely have better formatting.