# Overview
- http://scikit-learn.org/stable/modules/grid_search.html
- **API**: http://scikit-learn.org/stable/modules/classes.html#module-sklearn.grid_search

|method|Description|
|---|---|
|grid_search.GridSearchCV(estimator, param_grid) | Exhaustive search over specified parameter values for an estimator.|
|grid_search.ParameterGrid(param_grid) | Grid of parameters with a discrete number of values for each.|
|grid_search.ParameterSampler(...[, random_state]) | Generator on parameters sampled from given distributions.|
|grid_search.RandomizedSearchCV(estimator, ...) | Randomized search on hyperparameters.|

## Load modules

In [52]:
import pandas as pd
import numpy as np
from sklearn import grid_search, cross_validation, svm
from pprint import pprint
from pandas import DataFrame as DF
from pandas import Series as SR
from tak.tak import myprint, pd_underscore, pd_setdiff

- To find the names and current values for all parameters for a given estimator, use: `estimator.get_params()`

In [2]:
DF(SR(svm.SVC().get_params())).T

Unnamed: 0,C,cache_size,class_weight,coef0,degree,gamma,kernel,max_iter,probability,random_state,shrinking,tol,verbose
0,1,200,,0,3,0,rbf,-1,False,,True,0.001,False


In [3]:
DF(SR(svm.LinearSVC().get_params())).T

Unnamed: 0,C,class_weight,dual,fit_intercept,intercept_scaling,loss,max_iter,multi_class,penalty,random_state,tol,verbose
0,1,,True,True,1,squared_hinge,1000,ovr,l2,,0.0001,0


# 3.2.1 Exhaustive Grid Search (GridSearchCV)
- [`GridSearchCV`](http://scikit-learn.org/stable/modules/generated/sklearn.grid_search.GridSearchCV.html) exhaustively generates candidates from a grid of parameter values specified with the `param_grid` parameter.

**methods**

|method|Description|
|---|---|
|decision_function(*args, **kwargs) | Call decision_function on the estimator with the best found parameters. |
|fit(X[, y]) |Run fit with all sets of parameters. |
|get_params([deep]) | Get parameters for this estimator. |
|inverse_transform(*args, **kwargs) | Call inverse_transform on the estimator with the best found parameters. |
|predict(*args, **kwargs) |   Call predict on the estimator with the best found parameters. |
|predict_log_proba(*args, **kwargs) | Call predict_log_proba on the estimator with the best found parameters. |
|predict_proba(*args, **kwargs) | Call predict_proba on the estimator with the best found parameters. |
|score(X[, y]) |  Returns the score on the given data, if the estimator has been refit |
|set_params(**params) |   Set the parameters of this estimator. |
|transform(*args, **kwargs) | Call transform on the estimator with the best found parameters. |

In [4]:
# param_grid is a list of dicts; each dict defines one sub-grid to search
param_grid = [
  {'C': [1, 10, 100, 1000], 'kernel': ['linear']},
  {'C': [1, 10, 100, 1000], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']},
 ]
pprint(param_grid)
DF(param_grid)

[{'C': [1, 10, 100, 1000], 'kernel': ['linear']},
 {'C': [1, 10, 100, 1000], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']}]


Unnamed: 0,C,gamma,kernel
0,"[1, 10, 100, 1000]",,[linear]
1,"[1, 10, 100, 1000]","[0.001, 0.0001]",[rbf]
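
To make the list-of-dicts form concrete, here is a minimal standard-library sketch of how such a grid expands into individual candidates. Each dict contributes the Cartesian product of its value lists, and the sub-grids are concatenated. `expand_grid` is a hypothetical helper written for illustration only; in the library, `grid_search.ParameterGrid(param_grid)` plays this role.

```python
from itertools import product

def expand_grid(param_grid):
    """Yield one {name: value} dict per candidate (hypothetical helper)."""
    for sub_grid in param_grid:
        names = sorted(sub_grid)  # fixed key order so the result is reproducible
        for values in product(*(sub_grid[name] for name in names)):
            yield dict(zip(names, values))

param_grid = [
    {'C': [1, 10, 100, 1000], 'kernel': ['linear']},
    {'C': [1, 10, 100, 1000], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']},
]

candidates = list(expand_grid(param_grid))
print(len(candidates))   # 4 linear settings + 4 * 2 rbf settings
print(candidates[0])
```

The count explains why a grid search gets expensive quickly: every extra value in a list multiplies the number of candidates in that sub-grid, and each candidate is fitted once per cross-validation fold.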


## Example
- http://scikit-learn.org/stable/auto_examples/model_selection/grid_search_digits.html

In [5]:
from sklearn import datasets

# Loading the Digits dataset
digits = datasets.load_digits()

# To apply a classifier to this data, we need to flatten each image,
# turning the data into a (samples, features) matrix:
n_samples = len(digits.images)
X = digits.images.reshape((n_samples, -1))
y = digits.target

# Split the dataset in two equal parts
Xtr, Xts, ytr, yts = cross_validation.train_test_split(X, y, test_size=0.5, random_state=0)

# Set the parameters by cross-validation
tuned_parameters = [{'kernel': ['rbf'], 'gamma': [1e-3, 1e-4],
                     'C': [1, 10, 100, 1000]},
                    {'kernel': ['linear'], 'C': [1, 10, 100, 1000]}]

# tune hyperparameters
clf = grid_search.GridSearchCV(svm.SVC(C=1), tuned_parameters, cv=5, scoring='accuracy')

# items prior to fitting
df_prefit = DF(dir(clf))

# fit
clf.fit(Xtr, ytr)

df_postfit = DF(dir(clf))


In [6]:
# pd.merge(df_prefit, df_postfit, left_on=['stuff'])
# df_prefit.join(df_postfit, on='stuff')
# DF(list(set(df_postfit[0]) - set(df_prefit[0])))
pd_setdiff( df_postfit, df_prefit)

Unnamed: 0,0
0,scorer_
1,grid_scores_
2,best_params_
3,best_estimator_
4,best_score_


In [7]:
# DF([x for x in list(dir(clf)) if not x.startswith('_') and x.endswith('_')])

# made a function for above
pd_underscore(clf)

Unnamed: 0,0
0,best_estimator_
1,best_params_
2,best_score_
3,grid_scores_
4,scorer_
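
`pd_setdiff` and `pd_underscore` come from the author's own `tak` module, whose source is not shown here. The sketch below assumes they behave like the commented-out one-liners in the two cells above, and reproduces that logic with plain Python on a toy object. `ToyEstimator`, `public_names` and `fitted_attributes` are hypothetical names introduced for illustration.

```python
def public_names(obj):
    """Names from dir(obj) that do not start with an underscore."""
    return {name for name in dir(obj) if not name.startswith('_')}

def fitted_attributes(obj):
    """Public names ending in '_', the convention for values learned in fit."""
    return sorted(name for name in public_names(obj) if name.endswith('_'))

class ToyEstimator(object):
    """Hypothetical stand-in for an estimator; fit() stores the mean of X."""
    def __init__(self, shift=0.0):
        self.shift = shift

    def fit(self, X):
        self.mean_ = sum(X) / float(len(X)) + self.shift
        return self

est = ToyEstimator()
before = public_names(est)
est.fit([1.0, 2.0, 3.0])
after = public_names(est)

print(sorted(after - before))   # attributes that appeared during fit
print(fitted_attributes(est))   # same idea, via the trailing-underscore convention
```

Both approaches should agree whenever an estimator follows the scikit-learn convention that attributes learned during `fit` end with an underscore.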


In [51]:
# myprint(clf.best_estimator_)
# myprint(clf.best_params_)
# df_postfit.iteritems
df_diff = pd_setdiff( df_postfit, df_prefit)
for _,row in df_diff.itertuples():
    print('{} ='.format(row))   # runs on both Python 2 and Python 3
    pprint(getattr(clf,row))
    print('')
#     print type(y)

scorer_ =
make_scorer(accuracy_score)

grid_scores_ =
[mean: 0.98552, std: 0.01038, params: {'kernel': 'rbf', 'C': 1, 'gamma': 0.001},
 mean: 0.95768, std: 0.01459, params: {'kernel': 'rbf', 'C': 1, 'gamma': 0.0001},
 mean: 0.98664, std: 0.01037, params: {'kernel': 'rbf', 'C': 10, 'gamma': 0.001},
 mean: 0.98107, std: 0.01442, params: {'kernel': 'rbf', 'C': 10, 'gamma': 0.0001},
 mean: 0.98664, std: 0.01037, params: {'kernel': 'rbf', 'C': 100, 'gamma': 0.001},
 mean: 0.98107, std: 0.01352, params: {'kernel': 'rbf', 'C': 100, 'gamma': 0.0001},
 mean: 0.98664, std: 0.01037, params: {'kernel': 'rbf', 'C': 1000, 'gamma': 0.001},
 mean: 0.98107, std: 0.01352, params: {'kernel': 'rbf', 'C': 1000, 'gamma': 0.0001},
 mean: 0.97327, std: 0.00737, params: {'kernel': 'linear', 'C': 1},
 mean: 0.97327, std: 0.00737, params: {'kernel': 'linear', 'C': 10},
 mean: 0.97327, std: 0.00737, params: {'kernel': 'linear', 'C': 100},
 mean: 0.97327, std: 0.00737, params: {'kernel': 'linear', 'C': 1000}]

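
The cells above use the `sklearn.grid_search` and `sklearn.cross_validation` modules. From scikit-learn 0.18 on, these were merged into `sklearn.model_selection`, and `grid_scores_` was replaced by the `cv_results_` dictionary. Below is a rough equivalent of the example under that newer API. It is a sketch: exact scores may differ from the output shown above, depending on the library version.

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV, train_test_split

digits = datasets.load_digits()
X = digits.images.reshape((len(digits.images), -1))
y = digits.target
Xtr, Xts, ytr, yts = train_test_split(X, y, test_size=0.5, random_state=0)

tuned_parameters = [
    {'kernel': ['rbf'], 'gamma': [1e-3, 1e-4], 'C': [1, 10, 100, 1000]},
    {'kernel': ['linear'], 'C': [1, 10, 100, 1000]},
]

clf = GridSearchCV(svm.SVC(), tuned_parameters, cv=5, scoring='accuracy')
clf.fit(Xtr, ytr)

# cv_results_ is a dict of parallel arrays with one entry per candidate
results = clf.cv_results_
for mean, std, params in zip(results['mean_test_score'],
                             results['std_test_score'],
                             results['params']):
    print('mean: %.5f, std: %.5f, params: %r' % (mean, std, params))

print(clf.best_params_)
print(clf.score(Xts, yts))   # accuracy of the refit best estimator on held-out data
```

Because `refit` defaults to `True`, `clf.predict` and `clf.score` delegate to `best_estimator_`, which has been retrained on all of `Xtr` with the best parameters found.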

# 3.2.2 Randomized Parameter Optimization (RandomizedSearchCV)
- [`RandomizedSearchCV`](http://scikit-learn.org/stable/modules/generated/sklearn.grid_search.RandomizedSearchCV.html#sklearn.grid_search.RandomizedSearchCV) implements a randomized search over parameters, where each setting is sampled from a distribution over possible parameter values.
- How each parameter is sampled is specified with a dictionary (just like in `GridSearchCV`): a list of values is sampled uniformly, and a distribution is sampled through its `rvs` method.
- `n_iter` sets the computational budget, i.e. the number of parameter settings that are sampled.

Here's an example:

```python
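import scipy.stats
# a single dict; newer scikit-learn releases use 'balanced' in place of 'auto'
param_distributions = {'C': scipy.stats.expon(scale=100), 'gamma': scipy.stats.expon(scale=.1),
                       'kernel': ['rbf'], 'class_weight': ['auto', None]}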
```

See http://scikit-learn.org/stable/auto_examples/model_selection/randomized_search.html#example-model-selection-randomized-search-py for a demo.
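
To show what "sampling a setting" means for a specification like the one above, here is a minimal standard-library sketch. `sample_parameters` is a hypothetical helper, not scikit-learn's sampler. It assumes list values are drawn uniformly and that continuous parameters are given as callables taking a random generator. `random.expovariate` with rate `1 / scale` stands in for `scipy.stats.expon(scale=...)`.

```python
import random

def sample_parameters(param_distributions, n_iter, seed=0):
    """Draw n_iter settings (hypothetical helper, not scikit-learn's sampler).

    A list value is sampled uniformly; a callable value is called with the
    random generator and must return one draw.
    """
    rng = random.Random(seed)
    settings = []
    for _ in range(n_iter):
        setting = {}
        for name in sorted(param_distributions):
            spec = param_distributions[name]
            setting[name] = spec(rng) if callable(spec) else rng.choice(spec)
        settings.append(setting)
    return settings

param_distributions = {
    # exponential draws with mean 100 and mean 0.1, mirroring expon(scale=...)
    'C': lambda rng: rng.expovariate(1.0 / 100),
    'gamma': lambda rng: rng.expovariate(1.0 / 0.1),
    'kernel': ['rbf'],
    'class_weight': ['auto', None],
}

for setting in sample_parameters(param_distributions, n_iter=5):
    print(setting)
```

Unlike a grid, the cost here is fixed by `n_iter` alone: adding another parameter or widening a distribution does not increase the number of fits.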

# 3.2.4.1 Model specific CV
- Some models can fit data for a range of values of some parameter almost as efficiently as fitting the estimator for a single value of that parameter.
- The most common parameter amenable to this strategy is the parameter encoding the strength of the regularizer. In this case we say that we compute the **regularization path** of the estimator.

|model|Description|
|---|---|
|linear_model.ElasticNetCV([l1_ratio, eps, ...]) | Elastic Net model with iterative fitting along a regularization path|
|linear_model.LarsCV([fit_intercept, ...]) |   Cross-validated Least Angle Regression model|
|linear_model.LassoCV([eps, n_alphas, ...]) |  Lasso linear model with iterative fitting along a regularization path|
|linear_model.LassoLarsCV([fit_intercept, ...]) |  Cross-validated Lasso, using the LARS algorithm|
|linear_model.LogisticRegressionCV([Cs, ...]) |    Logistic Regression CV (aka logit, MaxEnt) classifier.|
|linear_model.MultiTaskElasticNetCV([...]) |   Multi-task L1/L2 ElasticNet with built-in cross-validation.|
|linear_model.MultiTaskLassoCV([eps, ...]) |   Multi-task L1/L2 Lasso with built-in cross-validation.|
|linear_model.OrthogonalMatchingPursuitCV([...]) | Cross-validated Orthogonal Matching Pursuit model (OMP)|
|linear_model.RidgeCV([alphas, ...]) | Ridge regression with built-in cross-validation.|
|linear_model.RidgeClassifierCV([alphas, ...]) |   Ridge classifier with built-in cross-validation.|
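
The idea of reusing work across regularization strengths can be shown with a toy case. This is an illustration only, not the algorithm used by any estimator in the table. It assumes ridge regression with a single feature and no intercept, where the fitted weight has the closed form `w = Sxy / (Sxx + alpha)`. The per-fold sums are computed once, after which every additional `alpha` costs only a few arithmetic operations per fold. `fold_statistics` and `cv_squared_error` are hypothetical helpers and the data is made up.

```python
def fold_statistics(x, y, n_folds):
    """Per-fold sums (Sxx, Sxy, Syy), computed once and reused for every alpha."""
    stats = []
    for k in range(n_folds):
        xs, ys = x[k::n_folds], y[k::n_folds]  # fold k = every n_folds-th sample
        stats.append((sum(a * a for a in xs),
                      sum(a * b for a, b in zip(xs, ys)),
                      sum(b * b for b in ys)))
    return stats

def cv_squared_error(stats, alpha):
    """Summed held-out squared error of 1-D ridge (no intercept) for one alpha."""
    total_xx = sum(s[0] for s in stats)
    total_xy = sum(s[1] for s in stats)
    error = 0.0
    for sxx, sxy, syy in stats:
        # Fit on the other folds: w = Sxy_train / (Sxx_train + alpha)
        w = (total_xy - sxy) / (total_xx - sxx + alpha)
        # sum((y - w*x)**2) over the held-out fold, expanded into its sums
        error += syy - 2.0 * w * sxy + w * w * sxx
    return error

# Made-up data: y is roughly 2*x with a small alternating perturbation
x = [i / 10.0 for i in range(1, 41)]
y = [2.0 * a + (0.5 if i % 2 else -0.5) for i, a in enumerate(x)]

alphas = [0.01, 0.1, 1.0, 10.0, 100.0]
stats = fold_statistics(x, y, n_folds=5)
errors = {alpha: cv_squared_error(stats, alpha) for alpha in alphas}
best_alpha = min(errors, key=errors.get)
print(best_alpha, errors[best_alpha])
```

A generic `GridSearchCV` over `alpha` would refit the model from scratch for every candidate and fold, which is why a dedicated `...CV` estimator is usually the better choice when one exists for the model.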
