
Hyperparameter optimization #128

Merged
merged 33 commits into master on Jan 28, 2016

Conversation

acrellin
Member

Add estimator hyperparameter optimization capability; misc. performance improvements and bug fixes. Fixes #103

@acrellin
Member Author

Still need to improve the web UI, but this works for now.

The model/estimator whose hyperparameters are to be optimized.
model_params : dict or list of dict
Dictionary with parameter names as keys and lists of parameter values
to try as values, or a list of such dictionaries.
Contributor

When is this a list of dicts versus a dict? Trying to wrap my head around the different use cases...

Member Author

It never will be in the webapp usage, but package-API users who know a bit about hyperparameter optimization may want to utilize this - check out http://scikit-learn.org/stable/modules/generated/sklearn.grid_search.GridSearchCV.html
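For readers following along, here is an illustrative sketch of the distinction (the parameter names are made up, and the expansion is re-implemented in plain Python rather than calling scikit-learn): a single dict describes one grid, the cross-product of its value lists, while a list of dicts describes several independent grids searched in turn, which is useful when some parameters only make sense together.

```python
from itertools import product

def expand_param_grid(param_grid):
    """Expand a GridSearchCV-style parameter grid into candidate settings.

    A single dict yields the cross-product of its value lists; a list of
    dicts yields the concatenation of each sub-grid's candidates.
    """
    if isinstance(param_grid, dict):
        param_grid = [param_grid]
    candidates = []
    for grid in param_grid:
        keys = sorted(grid)
        for values in product(*(grid[k] for k in keys)):
            candidates.append(dict(zip(keys, values)))
    return candidates

# One grid: 2 x 2 = 4 candidate settings.
single = expand_param_grid(
    {"n_estimators": [10, 50], "max_features": ["sqrt", "log2"]})

# Two grids: 2 + 2 = 4 candidates, e.g. kernel-specific SVM parameters
# where gamma is only meaningful for the rbf kernel.
multi = expand_param_grid([
    {"kernel": ["linear"], "C": [1, 10]},
    {"kernel": ["rbf"], "C": [1], "gamma": [0.1, 1.0]},
])
```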

@acrellin
Member Author

Sphinx is suddenly failing in the Drone build... all tests are passing locally, though.

@bnaul
Contributor

bnaul commented Dec 22, 2015

👹

Did you rebase everything after the latest merge of doc stuff? The error I see in Drone shouldn't be happening with the new import trickery.

@acrellin
Member Author

I agree - but only the last few commits are failing, and I definitely didn't change anything Sphinx- or Drone-related... Take a look at
http://nipy.bic.berkeley.edu:8080/github.com/mltsp/mltsp/hyperparameter_optimization/5a5e0a30c3c181c27168930add729f417c44525c
- that build passed, and only after that did it start failing.

@bnaul
Contributor

bnaul commented Dec 22, 2015

Does anything happen if you git pull --rebase [your_upstream] master locally? Something with this branch being out-of-date seems by far the most likely culprit.

@acrellin
Member Author

Ok, just rebased again and pushed - looks happy now.

@acrellin
Member Author

What do you all think of the idea of the optimize-hyperparameters-and-fit-to-training-data function returning only the best-fitting estimator, as opposed to the GridSearchCV object, which contains the estimator as an attribute and implements the predict and predict_proba methods? I can see arguments for both sides, such as easier access to attributes like classes_ versus retaining potentially useful info like grid_scores_.

dest_type) or \
(params_to_optimize and k in params_to_optimize and \
isinstance(ast.literal_eval(model_params[k]), list) \
and (type(x) in dest_types_list for x in \
Contributor

Yikes...guess there's no way for this to be simple but maybe it would help a bit to define ast.literal_eval(model_params[k]) before the if block. Also, should there be an all around the last list comprehension?
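One way to sketch the suggested cleanup (variable names are taken from the snippet above, but the toy values are hypothetical and the surrounding function is omitted): evaluate the literal once before the check, and wrap the final generator expression in all() so every element is actually tested.

```python
import ast

# Hypothetical stand-ins for the real inputs, for illustration only.
model_params = {"n_estimators": "[10, 50, 100]"}
params_to_optimize = ["n_estimators"]
dest_types_list = [int]

for k in model_params:
    # Evaluate the stringified literal once, up front.
    value = ast.literal_eval(model_params[k])
    is_optimized_list = (
        params_to_optimize
        and k in params_to_optimize
        and isinstance(value, list)
        # all(...) instead of a bare generator, which is always truthy.
        and all(type(x) in dest_types_list for x in value)
    )
```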

@bnaul
Contributor

bnaul commented Dec 31, 2015

@acrellin I can also see arguments for both sides but I guess I'd vote for returning the GridSearchCV object, it's easy enough to extract the model but going the other direction would be impossible.
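A toy illustration of that asymmetry, using a made-up stand-in class rather than the real GridSearchCV: the fitted search object retains both the winning estimator and the per-candidate scores, so extracting the model is one attribute access, while a bare estimator carries no record of the search.

```python
class TinySearch:
    """Toy stand-in for a fitted GridSearchCV result (illustrative only)."""
    def __init__(self, candidates, score_fn):
        scored = [(score_fn(c), c) for c in candidates]
        self.grid_scores_ = scored                           # retained search info
        self.best_params_ = max(scored)[1]                   # winning setting
        self.best_estimator_ = ("model", self.best_params_)  # stand-in for the model

search = TinySearch([{"C": 1}, {"C": 10}], score_fn=lambda p: 1.0 / p["C"])
model = search.best_estimator_  # easy: peel the model off the search object
# The reverse is impossible: `model` alone has no grid_scores_ to recover.
```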

@bnaul
Contributor

bnaul commented Jan 21, 2016

This looks good to me 👍; wanna rebase so we can see that sweet green 💵?

@bnaul
Contributor

bnaul commented Jan 21, 2016

This failed the first time for some reason (said RethinkDB wasn't running) but re-building fixed it.

@acrellin
Member Author

I've seen that before, too... A bit troubling


"""
# To fit with fixed, non-optimized params, must be wrapped in list
if isinstance(model_params, dict):
Contributor

if isinstance(model_params, dict):
    model_params = [model_params]
for param_dict in model_params:
    ...

@acrellin
Member Author

All tests are passing on my machine (both with Python 3.5 and a clean 2.7 virtual env), and none of the relevant data files have changes between http://nipy.bic.berkeley.edu:8080/github.com/mltsp/mltsp/hyperparameter_optimization/6761809af2e20bafe7c3ec7b9b415047382f7134 and http://nipy.bic.berkeley.edu:8080/github.com/mltsp/mltsp/hyperparameter_optimization/fa3c460e84b35c9f2c28f80f80110efc2595a3af. Anyone have any ideas re: the sudden xray errors?

@acrellin
Member Author

@stefanv @bnaul Please take a look and let me know what you think when you have a chance.

@bnaul
Contributor

bnaul commented Jan 27, 2016

👍 @stefanv ?

@@ -199,3 +208,20 @@ def extract_data_archive(archive_path, extract_dir=None):
file_paths = [f for f in all_paths if not os.path.isdir(f)]
archive.close()
return file_paths


def robust_literal_eval_dict(input_dict):
Contributor

Here, how about:

def robust_literal_eval(value):
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return value

and then

d = {k: robust_literal_eval(v) for k, v in some_dict.items()} 
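For illustration, the suggested helper in action on a toy parameter dict (the parameter names here are invented, and the except clause is narrowed to the exceptions ast.literal_eval actually raises for non-literal input):

```python
import ast

def robust_literal_eval(value):
    """Evaluate a Python literal string, falling back to the raw value."""
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):  # non-literal strings pass through unchanged
        return value

# Stringified form-field values, as the webapp might submit them.
params = {"n_estimators": "[10, 50]", "max_features": "sqrt", "n_jobs": "-1"}
evaluated = {k: robust_literal_eval(v) for k, v in params.items()}
```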

@stefanv
Contributor

stefanv commented Jan 28, 2016

👍 looks really good--thanks for your patience with us, Ari :)

stefanv added a commit that referenced this pull request Jan 28, 2016
@stefanv stefanv merged commit 8aa727c into cesium-ml:master Jan 28, 2016
@acrellin acrellin deleted the hyperparameter_optimization branch November 22, 2016 19:54

Successfully merging this pull request may close these issues.

Add functionality for model hyperparameter optimization & add corresponding input fields in browser UI
3 participants