It is currently not possible to reproduce the results of a regression performed with the TPOTRegressor class using the resulting pipeline.
Context of the issue
Currently, the accuracy score from the .score() method of a TPOTClassifier instance and the output of sklearn.metrics.accuracy_score on the best pipeline are identical. This is not the case with pipelines from TPOTRegressor instances.
Process to reproduce the issue
Classifier (correct/reproducible results)
The following code creates a TPOTClassifier, trains it on the iris dataset, and then prints the accuracy:
from tpot import TPOTClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target,
train_size=0.75, test_size=0.25)
X_train.shape, X_test.shape, y_train.shape, y_test.shape
tpot = TPOTClassifier(verbosity=2, max_time_mins=2)
tpot.fit(X_train, y_train)
print(tpot.score(X_test, y_test))
>>>Optimization Progress: 77% 154/200 [01:49<00:46, 1.00s/pipeline]
>>>2.01 minutes have elapsed. TPOT will close down.
>>>TPOT closed during evaluation in one generation.
>>>WARNING: TPOT may not provide a good pipeline if TPOT is stopped/interrupted in a early generation.
>>>TPOT closed prematurely. Will use the current best pipeline.
>>>Best pipeline: MLPClassifier(input_matrix, alpha=0.01, learning_rate_init=0.001)
>>>1.0
When the sklearn.metrics.accuracy_score function is called on the y_test data and the predictions from the best pipeline created by the TPOTClassifier instance, the result is identical:
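The snippet for that comparison was omitted above. The equality rests on the fact that a scikit-learn classifier's .score() is defined as accuracy, so it can be illustrated without TPOT at all. A minimal sketch with a plain scikit-learn classifier standing in for the evolved pipeline (LogisticRegression is an assumption, not what TPOT selected):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Any sklearn classifier illustrates the identity, because
# ClassifierMixin.score() is implemented via accuracy_score
X_train, X_test, y_train, y_test = train_test_split(
    *load_iris(return_X_y=True), train_size=0.75, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# .score() and accuracy_score on the predictions agree exactly
print(clf.score(X_test, y_test) ==
      accuracy_score(y_test, clf.predict(X_test)))  # True
```

With a fitted TPOT instance, the analogous comparison would go through its best pipeline (the .fitted_pipeline_ attribute).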
Regressor (incorrect/nonreproducible results)
With TPOTRegressor, the results are not identical.
from tpot import TPOTRegressor
#from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_diabetes
X, y = load_diabetes(return_X_y=True)
X = X[:1500]
y = y[:1500]
X_train, X_test, y_train, y_test = train_test_split(X,y,
train_size=0.75, test_size=0.25, random_state=42)
tpot = TPOTRegressor(generations=5, population_size=5, verbosity=2, random_state=42)
tpot.fit(X_train, y_train)
print(-1*tpot.score(X_test, y_test))
>>>Best pipeline: RandomForestRegressor(SelectFromModel(ElasticNetCV(input_matrix, l1_ratio=0.75, tol=0.01), max_features=0.15000000000000002, n_estimators=100, threshold=0.0), bootstrap=True, max_features=0.4, min_samples_leaf=7, min_samples_split=17, n_estimators=100)
>>>2572.133297426151
Unfortunately, rerunning .predict on the test data with the best pipeline from the TPOTRegressor object does not reproduce this result:
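A sketch of why the two numbers cannot match: TPOT's default regression scorer is neg_mean_squared_error, while the repro comparison used mean_absolute_error on the predictions. A plain scikit-learn regressor (Ridge here, purely a stand-in assumption for the evolved pipeline) on the same diabetes split shows the two metrics diverge:

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.75, test_size=0.25, random_state=42)

# Ridge is only a stand-in for the TPOT-evolved pipeline
reg = Ridge().fit(X_train, y_train)
pred = reg.predict(X_test)

print(mean_squared_error(y_test, pred))   # what -1 * .score() reports under the default scorer
print(mean_absolute_error(y_test, pred))  # a different metric, so a different number
```

The same predictions yield different numbers under the two metrics, so comparing -1*tpot.score(...) against a MAE computed from .predict() is not an apples-to-apples check.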
I would have expected the last line to return 2572.133297426151.
dlmolloy97 changed the title from "Lack of reproducibility between TPOTRegressor and exported pipeline" to "Lack of reproducibility between TPOTRegressor and .fitted_pipeline_ attribute" on Jun 22, 2023.
Succinctly, the problem is that you expected:
some_tpot_classifier.score(...) == sklearn.metrics.accuracy_score(...)
some_tpot_regressor.score(...) == sklearn.metrics.mean_absolute_error(...)
but instead got:
some_tpot_classifier.score(...) == sklearn.metrics.accuracy_score(...)
some_tpot_regressor.score(...) != sklearn.metrics.mean_absolute_error(...)
I doubt what you got is intended, at least for some datasets; the first step would be consolidating your demonstration code into an automated test.
The default scoring for TPOTRegressor is 'neg_mean_squared_error', so tpot.score will return the negative mean squared error. But you are comparing it to mean_absolute_error.
If you want to optimize mean absolute error, you can pass that in as a scorer.
If you change your estimator to the following, you get the same results in your example:
tpot = TPOTRegressor(generations=5, population_size=5, verbosity=2, random_state=42, scoring='neg_mean_absolute_error')
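The scoring string follows scikit-learn's scorer-name convention, where 'neg_mean_absolute_error' is simply the negated MAE. A quick sklearn-only sketch (synthetic data; not TPOT) of what that scorer computes:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import get_scorer, mean_absolute_error

# Synthetic regression data, purely for illustration
rng = np.random.default_rng(42)
X = rng.normal(size=(60, 4))
y = X @ np.array([2.0, -1.0, 0.5, 3.0]) + rng.normal(scale=0.5, size=60)

model = LinearRegression().fit(X, y)

# The 'neg_mean_absolute_error' scorer is -1 * mean_absolute_error
scorer = get_scorer("neg_mean_absolute_error")
print(scorer(model, X, y) ==
      -mean_absolute_error(y, model.predict(X)))  # True
```

So with scoring='neg_mean_absolute_error', negating tpot.score(X_test, y_test) yields exactly the MAE computed from the pipeline's predictions, which is the equality the original comparison assumed.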