# Overview

As stated earlier, the objective function evaluates the "goodness" or "performance" of a model or function for a particular parameter set. In machine learning it is sometimes referred to as a loss function. The evaluation is returned as a single number, so results can be compared across every parameter set that is tried.

```
metric = objective(parameter_set)
```

Defining this measure is outside the scope of this notebook. At a high level, the function might be defined as:

```
def objective(parameter_set):
    # Create a machine learning model
    # Train the model
    # Test the model
    # Return the test results metric
    ...
```


For educational purposes we will consider a trivial measure. Our objective function will return the sum of the parameters it was given from the search space. For example, consider the following:

```
metric = objective(0)
# metric == 0

metric = objective(1, 2, 3)
# metric == 6

```
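A minimal sketch of this trivial objective, assuming the sampled parameters arrive as positional arguments, as in the calls above. This signature is only an illustration; a real search space would more likely pass a single dictionary or tuple.

```python
def objective(*parameters):
    """Toy objective: return the sum of the sampled parameters."""
    return sum(parameters)


print(objective(0))        # 0
print(objective(1, 2, 3))  # 6
```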

As stated earlier, defining the objective function depends on the search space. If the schema of the search space changes, so must the objective function. As we go through our examples, we will see the objective function's definition change to fit our search space.

# Examples
TODO: Add examples of objective functions

# Considerations
## Cross Validation
Databricks has made some [comments](https://databricks.com/blog/2021/04/15/how-not-to-tune-your-model-with-hyperopt.html) regarding cross validation that are worth considering. 

Before getting into this, let's review the topic. In a nutshell, cross validation techniques such as k-fold split the data into k subsets (folds). The model is then trained k times, each time holding out a different fold for testing and training on the rest, to see how it performs across them. The idea is that overfitting can be detected if a model performs inexplicably better on one split than another.
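The splitting step can be sketched in plain Python. This is an illustrative version only: `k_fold_indices` and `cross_validate` are hypothetical helper names, `train_and_score` stands in for whatever trains and tests a model, and no shuffling is done, which a real implementation would usually add.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) once for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(train_and_score, n_samples, k):
    """Run train_and_score (hypothetical callable) on every fold and collect the scores."""
    return [train_and_score(train, test) for train, test in k_fold_indices(n_samples, k)]


folds = list(k_fold_indices(n_samples=6, k=3))
print(folds[0])  # ([2, 3, 4, 5], [0, 1])
```

A large spread among the scores returned by `cross_validate` is the signal described above: the model does noticeably better on some splits than on others.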

The argument from Databricks is that the k-1 extra training sessions could have been spent exploring new points within the hyperparameter space. In other words, Databricks is calling out the tradeoff between hyperparameter exploration and certainty about model accuracy. They argue that increasing `max_evals` by a factor of k is probably better than adding k-fold cross-validation, all else equal.
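The tradeoff is simple arithmetic, sketched here with made-up numbers. The function name and the budget of 100 model fits are assumptions for illustration, not figures from the Databricks post.

```python
def points_explored(training_budget, k=1):
    """Distinct hyperparameter sets evaluated when each one costs k model fits."""
    return training_budget // k


budget = 100  # total model fits we can afford (illustrative)
print(points_explored(budget))       # 100 points, one fit each
print(points_explored(budget, k=5))  # 20 points, five fits each
```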

I might offer a hybrid approach: perform cross validation only on the top-performing hyperparameter sets rather than on all of them.
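One way this hybrid idea might look, as a sketch under stated assumptions: `quick_score` and `cv_score` are hypothetical callables that take a parameter set and return a loss (lower is better), and the toy scores below reuse the sum-of-parameters objective rather than a real model.

```python
def hybrid_search(candidates, quick_score, cv_score, top_n):
    """Score every candidate once, then cross-validate only the best top_n."""
    ranked = sorted(candidates, key=quick_score)  # cheap pass: one fit per candidate
    finalists = ranked[:top_n]
    return min(finalists, key=cv_score)           # expensive pass: finalists only


candidates = [(1, 2, 3), (0, 0, 1), (5, 5, 5), (0, 1, 0)]
quick_score = sum                       # stand-in for a single train/test run
cv_score = lambda p: sum(p) + p[2]      # stand-in for an averaged k-fold score

best = hybrid_search(candidates, quick_score, cv_score, top_n=2)
print(best)  # (0, 1, 0)
```

With `top_n=2`, the expensive score is computed for two candidates instead of four, and it still breaks the tie that the cheap pass left between them.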