## Hyperopt is a Hyperparamter Optimization Library

It Minimizes a function f(x1,x2....,xn) when (x1,x2....,xn) belong to some Range

In [2]:
from hyperopt import hp, tpe, fmin
from hyperopt.mongoexp import MongoTrials
# hp is the function which will create a range for the function parameters
# tpe is the Tree-structured Parzen Estimator Algorithm, which we use instead of Grid/Random search as dicussed
# fmin is the function that will take in the function that is to be optimized, the range of parameters

In [9]:
def objective_function(args):  
    
    '''define the function which will take the argument                            
    and return the equation that is to be minimized.
    taking the arguments
    returning the equation
    '''
    
    x,y = args[0],args[0]  
    return x**2 - y**2       

In [4]:
# space = {'x':hp.uniform('x',-2,2),     
#          #defining the range for x
         
#          'y':hp.uniform('y',-3,1)}      
#         #defining the range for y

In [10]:
space = [hp.uniform('x',-2,2), hp.uniform('y',-3,1)]

In [11]:
best_param_60eval = fmin(objective_function ,space ,algo=tpe.suggest ,max_evals=60)
# using tpe to get the minimum value of f(x,y) when x <- (-2,2)
#                                               and y <- (-3,1)

In [12]:
print(best_param_60eval)

{'y': -2.1603749211456584, 'x': 1.3375448481033194}


In [7]:
#almost similar result for 100 evals
best_param_100eval = fmin(objective_function ,space ,algo=tpe.suggest ,max_evals=100)
print(best_param_100eval)

{'y': -2.9914118485184074, 'x': 0.22298709904380568}


## Understanding and Creating Search Space for hyperopt
Unlike sklearn's grid search hyperopt.fmin's search space must be an hp object. We'll look at some important range functions and how to use them

Fmin function allow us to pass only one argument to objective function as can be seen in above example, the we are supposed to extract the parameters from the argument.

we can use the above code like this as well:

```
def objective_function(args):
    x,y = args[0],args[1]  
    return x**2 - y**2
    
space = [hp.uniform('x',-2,2),hp.uniform('y',-3,1)]
#or
space = ([hp.uniform('x',-2,2),hp.uniform('y',-3,1)])

```

### Lets look at some hyperopt space functions on the official wiki page here:
https://github.com/hyperopt/hyperopt/wiki/FMin#21-parameter-expressions

## We'll now use some complex spaces in the following sklearn examples and Compare Grid Search and tpe for optimization of 

In [13]:
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import pandas as pd

df = pd.read_csv('iris.csv')

In [14]:
df.head()  #our data

Unnamed: 0,sepal length,sepal width,petal length,petal width,species
0,5.1,3.5,1.4,0.2,Iris-setosa
1,4.9,3.0,1.4,0.2,Iris-setosa
2,4.7,3.2,1.3,0.2,Iris-setosa
3,4.6,3.1,1.5,0.2,Iris-setosa
4,5.0,3.6,1.4,0.2,Iris-setosa


In [15]:
l = LabelEncoder()
df['species'] = l.fit_transform(df['species'])

x = df.drop('species',axis=1)
y = df['species']

x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.2,random_state=0)

### Using Grid Search

In [16]:
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC

df = pd.read_csv('iris.csv')

l = LabelEncoder()
df['species'] = l.fit_transform(df['species'])
x = df.drop('species',axis=1)
y = df['species']
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.2)


def Grid(model):
    
    
    '''
    the Grid function creates a dictionaries of choices of hyperparameters 
    to be used in the Grid Search
    '''
    
    
    if model == 'KNeighborsClassifier':
        parameters = {'n_neighbors':[3,11],
                  'algorithm':['ball_tree','kd_tree'],
                  'leaf_size':[1,50],
                  'metric':["euclidean","manhattan","chebyshev","minkowski"]}
   
    if model == 'SVC':
        parameters ={ 'kernel':['rbf','poly','rbf','sigmoid'],
            'degree':[1,15],
            'gamma':[0.001,10000]}
    
    return parameters



def GridSearch(model):
    
    '''
    GridSearch is a wrapper for sklearn's GridSearchCV as per model case
    '''
    
    if model == "KNeighboursClassifier":
        parameters = Grid("KNeighborsClassifier")
        knc=KNeighborsClassifier()
        global clf
        clf = GridSearchCV(knc,parameters)
    
    if model == "SVC":
        parameters = Grid("SVC")
        svc = SVC()
        clf = GridSearchCV(svc,parameters)

        
    clf.fit(x_train,y_train)
    
    return clf


clf1 = GridSearch("SVC")
clf2 = GridSearch("KNeighborsClassifier")

# y_pred_train = clf1.predict(x_train)    
# loss = mean_squared_error(y_train,y_pred_train)

# print("Test Score:",clf.score(x_test,y_test))
print("Train Score for SVC:",clf1.score(x_train,y_train))
print("Test Score for SVC ",clf1.score(x_test,y_test))
print("\n")
print("Train Score for KNeighborsClassifier:",clf2.score(x_train,y_train))
print("Test Score for KNeighborsClassifier",clf2.score(x_test,y_test))
print("\n=================")



('Train Score for SVC:', 0.9833333333333333)
('Test Score for SVC ', 0.9666666666666667)


('Train Score for KNeighborsClassifier:', 0.9833333333333333)
('Test Score for KNeighborsClassifier', 0.9666666666666667)





### Using Hyperopt

In [18]:
from sklearn import datasets
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from hyperopt import tpe, hp, fmin
from sklearn.metrics import mean_squared_error
iris = datasets.load_iris()
x = iris.data
y = iris.target
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.2)

def objective_func(args):
    
    
    '''
    the objective function that has to be minimized as per the target 'loss'
    '''
    
    if args['model']==KNeighborsClassifier:
        n_neighbors = args['param']['n_neighbors']
        algorithm = args['param']['algorithm']
        leaf_size = args['param']['leaf_size']
        metric = args['param']['metric']
        clf = KNeighborsClassifier(n_neighbors=n_neighbors,
                               algorithm=algorithm,
                               leaf_size=leaf_size,
                               metric=metric,
                               )
    elif args['model']==SVC:
        C = args['param']['C']
        kernel = args['param']['kernel']
        degree = args['param']['degree']
        gamma = args['param']['gamma']
        clf = SVC(C=C, kernel=kernel, degree=degree,gamma=gamma)
    
    
    clf.fit(x_train,y_train)
    
    y_pred_train = clf.predict(x_train)
    
    loss = mean_squared_error(y_train,y_pred_train)
    
    print(args)
    
    print("Test Score:",clf.score(x_test,y_test))
    print("Train Score:",clf.score(x_train,y_train))
    print("\n=================")
    
    return loss



#space is the choices of hyperparameters for the objective function
space = hp.choice('classifier',[
        {'model': KNeighborsClassifier,
        'param': {'n_neighbors': hp.choice('n_neighbors',range(3,11)),
                  'algorithm':hp.choice('algorithm',['ball_tree','kd_tree']),
                  'leaf_size':hp.choice('leaf_size',range(1,50)),
                  'metric':hp.choice('metric', ["euclidean","manhattan","chebyshev","minkowski"])}
        },
    
        {'model': SVC,
        'param':{'C':hp.lognormal('C',0,1),
        'kernel':hp.choice('kernel',['rbf','poly','rbf','sigmoid']),
        'degree':hp.choice('degree',range(1,15)),
        'gamma':hp.uniform('gamma',0.001,10000)}
        }
        ])


best_classifier = fmin(objective_func, space, algo=tpe.suggest, max_evals=100)
print(best_classifier)

{'model': <class 'sklearn.svm.classes.SVC'>, 'param': {'kernel': 'rbf', 'C': 0.83394525164318, 'degree': 11, 'gamma': 3682.1995052940583}}
('Test Score:', 0.3)
('Train Score:', 1.0)

{'model': <class 'sklearn.svm.classes.SVC'>, 'param': {'kernel': 'sigmoid', 'C': 0.37432705867293947, 'degree': 8, 'gamma': 3494.8518713642625}}
('Test Score:', 0.3)
('Train Score:', 0.3416666666666667)

{'model': <class 'sklearn.neighbors.classification.KNeighborsClassifier'>, 'param': {'n_neighbors': 10, 'metric': 'chebyshev', 'leaf_size': 7, 'algorithm': 'ball_tree'}}
('Test Score:', 1.0)
('Train Score:', 0.9666666666666667)

{'model': <class 'sklearn.svm.classes.SVC'>, 'param': {'kernel': 'rbf', 'C': 1.345345931635596, 'degree': 4, 'gamma': 8304.660527374122}}
('Test Score:', 0.3)
('Train Score:', 1.0)

{'model': <class 'sklearn.neighbors.classification.KNeighborsClassifier'>, 'param': {'n_neighbors': 7, 'metric': 'minkowski', 'leaf_size': 16, 'algorithm': 'ball_tree'}}
('Test Score:', 1.0)
('Train Sco

{'model': <class 'sklearn.svm.classes.SVC'>, 'param': {'kernel': 'rbf', 'C': 2.178043286217525, 'degree': 7, 'gamma': 3651.422649867919}}
('Test Score:', 0.3)
('Train Score:', 1.0)

{'model': <class 'sklearn.svm.classes.SVC'>, 'param': {'kernel': 'rbf', 'C': 0.9730979159631501, 'degree': 3, 'gamma': 5201.965808198092}}
('Test Score:', 0.3)
('Train Score:', 1.0)

{'model': <class 'sklearn.neighbors.classification.KNeighborsClassifier'>, 'param': {'n_neighbors': 3, 'metric': 'chebyshev', 'leaf_size': 2, 'algorithm': 'kd_tree'}}
('Test Score:', 1.0)
('Train Score:', 0.975)

{'model': <class 'sklearn.svm.classes.SVC'>, 'param': {'kernel': 'rbf', 'C': 1.0539849159767247, 'degree': 3, 'gamma': 4956.448839037919}}
('Test Score:', 0.3)
('Train Score:', 1.0)

{'model': <class 'sklearn.svm.classes.SVC'>, 'param': {'kernel': 'rbf', 'C': 0.827398433376136, 'degree': 3, 'gamma': 5549.493159850686}}
('Test Score:', 0.3)
('Train Score:', 1.0)

{'model': <class 'sklearn.neighbors.classification.KNeigh

{'model': <class 'sklearn.svm.classes.SVC'>, 'param': {'kernel': 'poly', 'C': 7.705031033134191, 'degree': 5, 'gamma': 3460.897972024736}}
('Test Score:', 1.0)
('Train Score:', 1.0)

{'model': <class 'sklearn.svm.classes.SVC'>, 'param': {'kernel': 'poly', 'C': 10.240636607738633, 'degree': 12, 'gamma': 321.006465764337}}
('Test Score:', 0.36666666666666664)
('Train Score:', 0.325)

{'model': <class 'sklearn.svm.classes.SVC'>, 'param': {'kernel': 'rbf', 'C': 2.7863363891517854, 'degree': 4, 'gamma': 8948.038864032076}}
('Test Score:', 0.3)
('Train Score:', 1.0)

{'model': <class 'sklearn.neighbors.classification.KNeighborsClassifier'>, 'param': {'n_neighbors': 10, 'metric': 'minkowski', 'leaf_size': 23, 'algorithm': 'ball_tree'}}
('Test Score:', 1.0)
('Train Score:', 0.9583333333333334)

{'model': <class 'sklearn.svm.classes.SVC'>, 'param': {'kernel': 'rbf', 'C': 4.155110243237879, 'degree': 4, 'gamma': 8712.186490286054}}
('Test Score:', 0.3)
('Train Score:', 1.0)

{'model': <class 'sk

## Evaluations using MongoDB

In the docker environment we have provided, MongoDB is already installed.
In order to parallelinzing evaluations with MongoDB we'll first start with creating the server

`service mongod start`

PS: note that in docker you have the root access, in ur system your command will start with sudo

Compile your python Script

In [13]:
import math
from hyperopt import tpe, hp, fmin
from hyperopt.mongoexp import MongoTrials


trials = MongoTrials('mongo://localhost:27017/foo_db/jobs', exp_key='exp4')

best = fmin(math.cos, hp.uniform('x', -2, 2), trials=trials, algo=tpe.suggest, max_evals=20)

print(best)

{u'x': -1.8495872997740905}


Now that we have compiled the above script, we'll start the worker, we'll open another terminal and use the command below


`hyperopt-mongo-worker --mongo=localhost:27017/foo_db --poll-interval=0.1`


this worker will connect the script with the server and will handle the tasks

## And now we'll show you the demo of a real life example that we solved using Hyperopt during our summer internship at MateLabs