Inconsistent prediction results after dumping and loading lightgbm.LGBMClassifier via pickle #2449
Comments
@slam3085 Please post the whole snippet. Also, attach the data you use if it's not sensitive, or try to reproduce this issue with random data. We do have a test where predictions are the same: LightGBM/tests/python_package_test/test_sklearn.py, lines 155 to 178 in be206a9.
Hi! I can't share the data, but the snippet looks like this:

```python
try:
    with open(path, "rb") as f:
        mdl = pickle.load(f)
    print(f'lightgbm_fitted_model_{today} loaded')
except OSError:
    print(f'Unable to open lightgbm_fitted_model_{today}; fitting...')
    X_actual_train, X_eval, y_actual_train, y_eval = train_test_split(
        X_train, y_train, test_size=0.1, random_state=42, stratify=y_train)
    hyperparams = {'n_estimators': 200, 'class_weight': 'balanced', 'random_state': 42}
    mdl = lightgbm.LGBMClassifier(**hyperparams)
    mdl.fit(X_actual_train, y_actual_train, eval_set=(X_eval, y_eval), eval_metric='roc_auc')
    with open(path, "wb") as f:
        pickle.dump(mdl, f)
pred_proba = mdl.predict_proba(X_test)[:, 1]
n_pred_proba_95 = sum(pred_proba >= 0.95)
print(f'stats: {n_pred_proba_95} out of {len(pred_proba)} are 95% objects of class 1')
return pred_proba
```

Basically, I either fit a new model and dump it via pickle, or I use an already trained one. I tried to reproduce it in a local environment only and couldn't, and now I understand why: my local environment is Python 3.7.3 and the production environment is Python 3.7.0. So, the actual steps to reproduce are:
I guess it's not really a bug then, or a very minor one, and the issue can be closed.
Makes sense! Thanks a lot for your investigation.
We are having the same problem. It happens when we try to deploy our models to production in a different system/environment than the one the model was trained in, and it is not easy to reproduce. We have observed this issue two times:

This happens with both the scikit-learn API and the classic Python API. Swapping the model in our pipeline for a typical scikit-learn RandomForestRegressor makes everything OK. Any idea about what is happening? It looks like we are missing something important, and this is driving us crazy.
What is the difference in the prediction results?
For example, predictions on the same machine and in the same environment the model was trained in:

Predictions when loading the model (joblib or txt) on another machine/environment, always installing LightGBM with conda or pip:
With a gap this large, I think it is not the same problem as this issue.
Yes, on the same machine and using the same environment there is no problem. On the same machine but using a different environment (for example, deploying the model with MLflow) the results are different, and the same happens when saving the model and loading it on other machines. The thing is, the models are supposed to be portable, at least when saved as a .txt file.
@dcanones, could you try the CLI version on the same machine? I think it could be an environment bug.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Basically, title.
Both models have the same parameters and metrics in their properties.
I'm using Python 3.7.3 and LightGBM 2.2.3.