
logloss issue... #90

Closed
caprone opened this issue Nov 1, 2018 · 10 comments

@caprone

caprone commented Nov 1, 2018

Hi HunterMcGushion!

When I use 'log_loss' as the metric in Environment, is there a way to tell the optimizer to predict probabilities? It seems not... (maybe through "model_extra_params"??).

For example, even if I set 'logloss' as the metric in xgb, all of the optimizer's predictions are probably binary values, so the logloss scores become totally bogus.

thanks

@HunterMcGushion
Owner

Hi, @caprone, thanks for opening this issue! Have you tried providing the do_predict_proba kwarg on initialization of :class:environment.Environment? If that doesn't resolve it, can you provide a minimal code example to recreate the problem?
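For example, a minimal sketch of passing that kwarg (the dataset and the other Environment arguments here are only illustrative placeholders, not a full working setup):

```python
import pandas as pd
from hyperparameter_hunter import Environment
from sklearn.datasets import make_classification
from sklearn.metrics import log_loss

# Build a small toy dataset with a "target" column (placeholder data)
x, y = make_classification(n_samples=100, n_features=10, random_state=32)
train_df = pd.DataFrame(x, columns=[str(i) for i in range(x.shape[1])])
train_df["target"] = y

env = Environment(
    train_dataset=train_df,
    target_column="target",
    root_results_path="HyperparameterHunterAssets",
    metrics_map=dict(log_loss=lambda t, p: -log_loss(t, p)),
    do_predict_proba=True,  # ask models to predict probabilities via predict_proba
    cross_validation_type="KFold",
    cross_validation_params=dict(n_splits=5, shuffle=True, random_state=32),
)
```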

@caprone
Author

caprone commented Nov 2, 2018

Hi @HunterMcGushion, thanks for the answer.

Unfortunately this doesn't resolve the problem. In this toy script -- binary classification -- we compare the same algorithm (RF), but the scores come out at a completely different size/scale (the same issue occurs with other algorithms, like xgbclassifier, ...).

EDIT: maybe the issue is that we can't specify the positive class in Environment...?
```python
import os
os.environ['KERAS_BACKEND'] = 'tensorflow'

from hyperparameter_hunter import Environment, Integer, Real, ExtraTreesOptimization
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
from pandas import DataFrame

x, y = make_classification(n_samples=100, n_features=10)
x_train1 = DataFrame(dict(x=x[:, 0], y=x[:, 1], target=y))
x_train1.info()

hunter_path = '../HyperparameterHunterAssets'


def execute():
    env = Environment(
        train_dataset=x_train1,
        target_column="target",
        root_results_path=hunter_path,
        # metrics_map=["log_loss"],
        do_predict_proba=True,
        metrics_map=dict(log_loss=lambda t, p: -log_loss(t, p)),
        cross_validation_type='KFold',
        cross_validation_params=dict(n_splits=5, shuffle=True, random_state=32),
        runs=1
    )

    optimizer = ExtraTreesOptimization(iterations=10, read_experiments=True, random_state=None)

    optimizer.set_experiment_guidelines(
        model_initializer=RandomForestClassifier,
        model_init_params=dict(
            n_estimators=10,
            n_jobs=6,
            max_depth=Integer(4, 6)
        ),
    )

    optimizer.go()

    print()
    print()
    print("start second randomForest model with TRUE logloss values predicted:")

    label = x_train1['target'].values
    x_train = x_train1.drop('target', axis=1)
    X_train, X_test, y_train, y_test = train_test_split(x_train, label, test_size=0.30, random_state=0)

    def randomForest():
        clf = RandomForestClassifier(max_depth=4, n_estimators=10)
        clf.fit(X_train, y_train)
        y_pred = clf.predict_proba(X_test)
        # here we select the "1" probability column as the positive class -- >>>> probably the issue here??
        y_pred = y_pred[:, 1]

        print("logloss score is: {}".format(log_loss(y_test, y_pred)))

    randomForest()


if __name__ == '__main__':
    execute()
```

@HunterMcGushion
Owner

Sorry for the delayed response, @caprone!

In the provided example, the comparison being made isn’t valid: HyperparameterHunter automatically performs KFold cross-validation with the given parameters to produce the log_loss score, while the classifier (clf) trained without HyperparameterHunter doesn’t go through anything like the fitting process specified in env. We can’t really expect the results of those two tests to be similar, much less identical, since HyperparameterHunter is told to evaluate with 5 KFold splits, whereas clf is given a train_test_split of 0.30. Additionally, the random seeds differ, which pushes the results even further apart.

I’ve attempted to make a closer comparison; however, there is a lot that HyperparameterHunter handles behind the scenes, so we can’t make a true comparison. That said, I’ve modified your script and come up with the following to replicate a little bit of HyperparameterHunter’s core functionality:

```python
from hyperparameter_hunter import Environment, CrossValidationExperiment

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import KFold
from sklearn.datasets import make_classification

CV_PARAMS = dict(n_splits=3, shuffle=True, random_state=32)
MODEL_INIT_PARAMS = dict(n_estimators=10, max_depth=4, random_state=32)

INPUT, TARGET = make_classification(n_samples=100, n_features=10, n_classes=2, random_state=32)
TRAIN_DF = pd.DataFrame(data=INPUT, columns=range(INPUT.shape[1]))
TRAIN_DF["target"] = TARGET


def run_hyperparameter_hunter():
    env = Environment(
        train_dataset=TRAIN_DF.copy(),
        root_results_path="HyperparameterHunterAssets",
        do_predict_proba=False,
        metrics_map=dict(log_loss=lambda t, p: -log_loss(t, p)),
        cross_validation_type="KFold",
        cross_validation_params=CV_PARAMS,
    )

    experiment = CrossValidationExperiment(
        model_initializer=RandomForestClassifier,
        model_init_params=MODEL_INIT_PARAMS
    )

    return experiment


def run_normal(random_seeds):
    #################### Result Placeholders ####################
    oof_predictions = np.zeros_like(TARGET)
    # Use float dtype so probability values aren't truncated to integers
    oof_predictions_proba_0 = np.zeros_like(TARGET, dtype=float)
    oof_predictions_proba_1 = np.zeros_like(TARGET, dtype=float)
    oof_scores = []
    oof_scores_proba_0 = []
    oof_scores_proba_1 = []

    for fold, (train_index, validation_index) in enumerate(KFold(**CV_PARAMS).split(INPUT, TARGET)):
        np.random.seed(random_seeds[fold][0])

        #################### Split Data ####################
        train_input, validation_input = INPUT[train_index], INPUT[validation_index]
        train_target, validation_target = TARGET[train_index], TARGET[validation_index]

        #################### Fit Classifier ####################
        classifier = RandomForestClassifier(
            **dict(MODEL_INIT_PARAMS, **dict(random_state=random_seeds[fold][0]))
        )
        classifier.fit(train_input, train_target)

        #################### Make Predictions ####################
        validation_predictions = classifier.predict(validation_input)
        validation_predictions_proba = classifier.predict_proba(validation_input)

        #################### Calculate Score ####################
        validation_score = -log_loss(validation_target, validation_predictions)
        validation_score_proba_0 = -log_loss(validation_target, validation_predictions_proba[:, 0])
        validation_score_proba_1 = -log_loss(validation_target, validation_predictions_proba[:, 1])

        #################### Collect Results ####################
        oof_scores.append(validation_score)
        oof_scores_proba_0.append(validation_score_proba_0)
        oof_scores_proba_1.append(validation_score_proba_1)

        oof_predictions[validation_index] = validation_predictions
        oof_predictions_proba_0[validation_index] = validation_predictions_proba[:, 0]
        oof_predictions_proba_1[validation_index] = validation_predictions_proba[:, 1]

        print(" - F{}:     {}     {}     {}".format(
            fold, validation_score, validation_score_proba_0, validation_score_proba_1
        ))

    print("FINAL:     {}     {}     {}".format(
        np.average(oof_scores), np.average(oof_scores_proba_0), np.average(oof_scores_proba_1)
    ))


def execute():
    exp = run_hyperparameter_hunter()
    print("#" * 80)
    run_normal(exp.experiment_params["random_seeds"][0])


if __name__ == "__main__":
    execute()
```

If we run the above script as-is, with Environment.do_predict_proba=False, the HyperparameterHunter experiment produces validation negative log_loss scores of [-3.04754, -5.23322, -2.09326] over three folds, with a final score of -3.45390.

  • These scores match the scores produced by run_normal in its validation_score and oof_scores variables, which were calculated using classifier.predict (which reflects the do_predict_proba=False given earlier to Environment)

If, instead, we slightly modify the script and run it with Environment.do_predict_proba=True (the one-line change is sketched below), the HyperparameterHunter experiment produces validation negative log_loss scores of [-8.18999, -4.69224, -3.89328] over three folds, with a final score of -5.61782.

  • These scores match the scores produced by run_normal in its validation_score_proba_0 and oof_scores_proba_0 variables, which were calculated using classifier.predict_proba(…)[:, 0] (which reflects the do_predict_proba=True given earlier to Environment)
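For clarity, the modification mentioned above is just flipping do_predict_proba inside run_hyperparameter_hunter; a sketch of the changed Environment call (everything else in the script stays the same):

```python
# Inside run_hyperparameter_hunter(), only this argument changes:
env = Environment(
    train_dataset=TRAIN_DF.copy(),
    root_results_path="HyperparameterHunterAssets",
    do_predict_proba=True,  # was False in the first run
    metrics_map=dict(log_loss=lambda t, p: -log_loss(t, p)),
    cross_validation_type="KFold",
    cross_validation_params=CV_PARAMS,
)
```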

TL;DR:
So if I’m understanding your question correctly, there seem to be two problems:

  1. The original comparison in your first example script doesn’t evaluate the classifiers in the same way, so we can’t expect similar results, and
  2. Environment.do_predict_proba should be set according to however you would predict your values if you weren’t using HyperparameterHunter, as can be seen by the difference between the two script executions above

I hope this clears things up for you, but please let me know if you have any other questions, and thanks again for opening this issue!

@caprone
Author

caprone commented Nov 7, 2018

Hi @HunterMcGushion!
Your comparison is very helpful, THANKS!

@HunterMcGushion
"""Environment.do_predict_proba should be set according to however you would predict your values if you weren’t using HyperparameterHunter"""

Yes, so in the end the problem is that with "Environment.do_predict_proba"
it is not possible to set the "positive" class (for example 1 and not zero), so in binary classification
Hunter automatically computes the probability of the zero class (right?).

@HunterMcGushion
Owner

Glad to hear it!

You are correct. If a model's predictions are not one-dimensional, the default is to use the column at index 0. This takes place in hyperparameter_hunter.models.Model.predict:

```python
def predict(self, input_data):
    """Generate model predictions for `input_data`

    Parameters
    ----------
    input_data: Array-like
        Data containing the same number of features as were trained on, for which the model will
        predict output values"""
    if input_data is None:
        return None

    try:
        if self.do_predict_proba is True:
            prediction = self.model.predict_proba(input_data)
        else:
            prediction = self.model.predict(input_data)
    except Exception as _ex:
        raise _ex

    with suppress(IndexError):
        prediction = prediction[:, 0]

    return prediction
```

Specifically, I think you'll be interested in line 194 (the `prediction = prediction[:, 0]` inside the `suppress(IndexError)` block).

Do you think it would be helpful to be able to specify the column index selected when using do_predict_proba=True?

@caprone
Author

caprone commented Nov 8, 2018

Ok @HunterMcGushion, perfect!

Yes, I think it would be very helpful to be able to specify the column index in "proba_predictions", because algorithms often return the positive class's probability (usually 1 for binary classification) in the second column of the probability matrix.
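For anyone following along: the columns of predict_proba are ordered according to the classifier's classes_ attribute, which is why the second column is the positive class for 0/1 labels. A quick toy illustration (unrelated to the scripts above):

```python
from sklearn.ensemble import RandomForestClassifier

# Toy data only -- columns of predict_proba follow the order of `classes_`
clf = RandomForestClassifier(n_estimators=10, random_state=0)
clf.fit([[0], [1], [2], [3]], [0, 0, 1, 1])

print(clf.classes_)                      # [0 1] -- column order of predict_proba
print(clf.predict_proba([[2.5]]))        # [[P(class 0), P(class 1)]]
print(clf.predict_proba([[2.5]])[:, 1])  # probability of the positive class (1)
```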

thanks again!!

@HunterMcGushion
Owner

Great point! I'm thinking the easiest way to get this done would be to allow the do_predict_proba parameter to be an integer, as well as a boolean.

If it's a boolean and False, then the model's predict method is called (default). If it's a boolean and True, the model's predict_proba method is called and the default index of 0 is used to select the column.

The new part is that if do_predict_proba is an integer, it is interpreted as if it were True, but the integer passed is used as the column index. So in your example, you would pass do_predict_proba=1 to get the second column.
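A rough sketch of how that dispatch could look inside the predict method quoted earlier (an illustration of the idea only, not the actual implementation that ended up in the library):

```python
def predict(self, input_data):
    """Illustrative sketch of a bool-or-int `do_predict_proba` -- not library code"""
    if input_data is None:
        return None

    if self.do_predict_proba is False:
        # Default behavior: plain class predictions
        return self.model.predict(input_data)

    prediction = self.model.predict_proba(input_data)
    # `True` keeps the old default of column 0; an int selects that column instead
    column = 0 if self.do_predict_proba is True else self.do_predict_proba
    return prediction[:, column]
```

With that interpretation, passing do_predict_proba=1 would select the second predict_proba column, which is what you're after here.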

@HunterMcGushion
Owner

HunterMcGushion commented Nov 8, 2018

Although, there may be a bit of confusion since it is popular to pass truthy or falsey values, like 0 and 1, in place of actual booleans, which could produce unexpected results in this case.
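To illustrate with a hypothetical helper (not library code): someone passing 0 intending it as "False" would silently get predict_proba column 0 instead, unless booleans are distinguished from plain ints:

```python
def interpret_do_predict_proba(value):
    """Hypothetical helper showing why the bool check must come before the int check"""
    # `True` and `False` are also instances of `int` in Python, so check `bool` first
    if isinstance(value, bool):
        return ("predict_proba", 0) if value else ("predict", None)
    if isinstance(value, int):
        return ("predict_proba", value)
    raise TypeError("do_predict_proba must be a bool or an int")


print(interpret_do_predict_proba(False))  # ('predict', None)
print(interpret_do_predict_proba(1))      # ('predict_proba', 1)
print(interpret_do_predict_proba(0))      # ('predict_proba', 0) -- surprising if 0 was meant as False
```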

@caprone, could you check out #95 to see if it does what you need?

@caprone
Author

caprone commented Nov 9, 2018

Hi @HunterMcGushion!!
For me your solution is very effective, great!!

From the doc:
"""
do_predict_proba: Boolean, or int, default=False
* If False, :meth:.models.Model.fit will call :meth:models.Model.model.predict
* If True, it will call :meth:models.Model.model.predict_proba, and the values in the
first column (index 0) will be used as the actual prediction values
* If do_predict_proba is an int, :meth:.models.Model.fit will call
:meth:models.Model.model.predict_proba, as is the case when do_predict_proba is
True, but the int supplied as do_predict_proba declares the column index to use as
the actual prediction value
"""
I also agree that:
@HunterMcGushion
""" there may be a bit of confusion since it is popular to pass truthy or falsey values,
like 0 and 1, in place of actual booleans"""

especially if someone does not read the documentation or the comments;

One solution could be to make "model.do_predict_proba" boolean-only and add a new integer class attribute that specifies the column index of the positive class, e.g. self.model.do_pos_class=1;

however, I think your solution is very useful!!
thanks again!!

@HunterMcGushion
Owner

That's a great idea. For now, I'd prefer not to add too many more parameters, but if it looks like others are having problems, we should revisit your idea! Let me know if the problems persist after merging, and thanks again for opening the issue!
