Tabular: Fix error when loading with a different OS #2865

Merged
merged 10 commits into from
Jun 5, 2023

Conversation

Innixma
Contributor

@Innixma Innixma commented Feb 8, 2023

Issue #, if available:
#2208

Description of changes:
Fixes exception during load when TabularPredictor was trained on Windows, and then loaded on MacOS/Linux.
Should also fix exception when trained on MacOS/Linux and loaded on Windows.

Because this PR introduces a new check on load based on a new variable, prior trained predictors on earlier versions will not be able to load using this PR.
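
A minimal sketch of the idea described above, not the code in this PR: remember the path separator in use at training time and rewrite stored relative paths when the separator at load time differs. The attribute name `_og_os_path_sep` is taken from the traceback posted later in this thread; `TrainerSketch`, `convert_path_sep`, and `relocate` are hypothetical names used only for illustration.

```python
def convert_path_sep(path: str, saved_sep: str, current_sep: str) -> str:
    """Rewrite a relative path saved with one separator so it uses another."""
    if saved_sep == current_sep:
        return path
    return path.replace(saved_sep, current_sep)


class TrainerSketch:
    """Hypothetical stand-in for a trainer that stores relative model paths."""

    def __init__(self, model_paths: dict, path_sep: str):
        self.model_paths = dict(model_paths)
        # Separator of the OS the object was created on.
        self._og_os_path_sep = path_sep

    def relocate(self, current_sep: str) -> None:
        """Convert every stored path to the separator of the loading OS."""
        saved_sep = self._og_os_path_sep
        self.model_paths = {
            name: convert_path_sep(path, saved_sep, current_sep)
            for name, path in self.model_paths.items()
        }
        self._og_os_path_sep = current_sep


# Paths as they might be stored after training on Windows, then loaded on Linux.
trainer = TrainerSketch({"LightGBM": "models\\LightGBM\\"}, path_sep="\\")
trainer.relocate(current_sep="/")
print(trainer.model_paths)
```

In real code the load-time separator would normally come from `os.path.sep`; it is passed explicitly here so the sketch behaves the same on any OS.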

Note: I have not formally tested this yet as I don't have a Windows machine on hand. I would appreciate it if those affected could comment on whether this PR fixes their problem.

Follow-up: Add unit test, tracked in #2863

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@github-actions

github-actions bot commented Feb 8, 2023

Job PR-2865-17946e3 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2865/17946e3/index.html

@Innixma Innixma added this to the 0.7 Release milestone Feb 9, 2023
@Innixma Innixma added the bug (Something isn't working), OS: Windows (Impacting Windows OS), module: tabular, and priority: 0 (Maximum priority) labels Feb 9, 2023
@Innixma Innixma modified the milestones: 0.7 Release, 0.7 Fast-Follow Items Feb 15, 2023
@imrooki

imrooki commented Apr 11, 2023

Traceback (most recent call last):
  File "E:\桌面\My work_4\Models\Double D\1\AG3_RMSE\AG_pred3.py", line 19, in <module>
    predictor = TabularPredictor.load(save_path)
  File "F:\Anaconda\envs\ag1\lib\site-packages\autogluon\tabular\predictor\predictor.py", line 3198, in load
    predictor = cls._load(path=path)
  File "F:\Anaconda\envs\ag1\lib\site-packages\autogluon\tabular\predictor\predictor.py", line 3101, in _load
    predictor._set_post_fit_vars(learner=learner)
  File "F:\Anaconda\envs\ag1\lib\site-packages\autogluon\tabular\predictor\predictor.py", line 3033, in _set_post_fit_vars
    self._learner.persist_trainer(low_memory=True)
  File "F:\Anaconda\envs\ag1\lib\site-packages\autogluon\tabular\learner\abstract_learner.py", line 836, in persist_trainer
    self.trainer = self.load_trainer()
  File "F:\Anaconda\envs\ag1\lib\site-packages\autogluon\core\learner\abstract_learner.py", line 120, in load_trainer
    return self.trainer_type.load( # noqa
  File "F:\Anaconda\envs\ag1\lib\site-packages\autogluon\core\trainer\abstract_trainer.py", line 2671, in load
    obj.set_contexts(path)
  File "F:\Anaconda\envs\ag1\lib\site-packages\autogluon\core\trainer\abstract_trainer.py", line 232, in set_contexts
    self.path, model_paths = self.create_contexts(path_context)
  File "F:\Anaconda\envs\ag1\lib\site-packages\autogluon\core\trainer\abstract_trainer.py", line 244, in create_contexts
    if os.path.sep != self._og_os_path_sep:
AttributeError: 'AutoTrainer' object has no attribute '_og_os_path_sep'

This error is occurring.

@Innixma Innixma modified the milestones: 0.7 Fast-Follow Items, 0.8 Release Apr 11, 2023
@Innixma
Contributor Author

Innixma commented Apr 11, 2023

Thanks @imrooki, I plan to get hold of a Windows machine to ensure this fix works correctly and to have this fixed for the v0.8 release.

@imrooki

imrooki commented Apr 12, 2023

We look forward to the v0.8 release.

@eses-wk

eses-wk commented Apr 19, 2023

@Innixma I've tested this PR locally, and it does solve the cross-OS load issue. Yet, there are still issues when calling predict() or predict_proba() on any Ensemble Model trained on a Windows machine.

Python version: 3.10.10

Tests passed:

  • Trained on Linux (Ubuntu) => Load on Windows
  • Trained on Linux (Ubuntu) => Predict on Windows
  • Trained on Windows => Load on Linux (Ubuntu)

Tests failed:

  • Trained on Windows => Predict on Linux (Ubuntu)
    • Note: Predicting with non-ensemble models (e.g. LightGBM, XGBoost, NN) works; the following error occurred when predicting with an ensemble model:
File ~/Projects/autogluon/core/src/autogluon/core/trainer/abstract_trainer.py:1376, in AbstractTrainer.load_model(self, model_name, path, model_type)
   1374     model_type = self.get_model_attribute(model=model_name, attribute='type')
   1375     logging.warning(f'model type is: {model_type}')
-> 1376 return model_type.load(path=path, reset_paths=self.reset_paths)

File ~/Projects/autogluon/core/src/autogluon/core/models/ensemble/bagged_ensemble_model.py:887, in BaggedEnsembleModel.load(cls, path, reset_paths, low_memory, load_oof, verbose)
    885 @classmethod
    886 def load(cls, path: str, reset_paths=True, low_memory=True, load_oof=False, verbose=True):
--> 887     model = super().load(path=path, reset_paths=reset_paths, verbose=verbose)
    888     if not low_memory:
    889         model.persist_child_models(reset_paths=reset_paths)

File ~/Projects/autogluon/core/src/autogluon/core/models/abstract/abstract_model.py:926, in AbstractModel.load(cls, path, reset_paths, verbose)
    904 """
    905 Loads the model from disk to memory.
    906 
   (...)
    923     Loaded model object.
    924 """
    925 file_path = path + cls.model_file_name
--> 926 model = load_pkl.load(path=file_path, verbose=verbose)
    927 if reset_paths:
    928     model.set_contexts(path)

File ~/Projects/autogluon/common/src/autogluon/common/loaders/load_pkl.py:43, in load(path, format, verbose, **kwargs)
     41 if compression_fn in compression_fn_map:
     42     with compression_fn_map[compression_fn]['open'](validated_path, 'rb', **compression_fn_kwargs) as fin:
---> 43         object = pickle.load(fin)
     44 else:
     45     raise ValueError(
     46         f'compression_fn={compression_fn} or compression_fn_kwargs={compression_fn_kwargs} are not valid.'
     47         f' Valid function values: {compression_fn_map.keys()}')

TypeError: __randomstate_ctor() takes from 0 to 1 positional arguments but 2 were given

P.S. __randomstate_ctor() should be a function in the numpy.random library that is called when pickle.load restores a pickled RandomState object. NumPy version: 1.23.5
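
Since the pickle is rebuilt by whichever NumPy is installed at load time, one way to catch this kind of failure early is to record package versions next to the saved model and compare them before unpickling. The sketch below is only an illustration under that assumption; `installed_versions` and `find_version_mismatches` are hypothetical helpers, not AutoGluon functions, and every version number other than the 1.23.5 reported above is made up.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version or None} for the current environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def find_version_mismatches(saved: dict, current: dict) -> dict:
    """Map each package whose version differs to (saved, current)."""
    return {
        name: (version, current.get(name))
        for name, version in saved.items()
        if current.get(name) != version
    }


# Hypothetical versions recorded at training time vs. found at load time.
saved = {"numpy": "1.24.2", "pandas": "1.5.3"}
current = {"numpy": "1.23.5", "pandas": "1.5.3"}
for name, (trained, loaded) in find_version_mismatches(saved, current).items():
    print(f"{name}: trained with {trained}, loading with {loaded}")
```

At training time, `installed_versions(["numpy", "pandas"])` could be written to a small JSON file beside the model; at load time the same call supplies `current`.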

@eses-wk

eses-wk commented Apr 19, 2023

@imrooki your error is likely caused by loading an existing predictor that was trained on an AutoGluon version without this PR's fix. As the description says:

> prior trained predictors on earlier versions will not be able to load using this PR.

See if you want to try this PR #3161
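
To illustrate why an older predictor hits the AttributeError: unpickling restores the saved attribute dictionary without running `__init__`, so any attribute introduced after the object was saved is simply missing. A common backward-compatible pattern is to read such attributes with a default. The sketch below is an assumption-laden illustration, not the fix in this PR or in #3161; `Trainer`, `saved_path_sep`, and `restore_without_init` are hypothetical, and falling back to the current OS separator is an assumption about what a sensible default would be.

```python
import os


class Trainer:
    """Hypothetical trainer; newer versions set _og_os_path_sep in __init__."""

    def __init__(self, path: str):
        self.path = path
        self._og_os_path_sep = os.path.sep

    def saved_path_sep(self) -> str:
        # Objects saved before the attribute existed will not have it,
        # so fall back to the current separator (an assumed default).
        return getattr(self, "_og_os_path_sep", os.path.sep)


def restore_without_init(cls, state: dict):
    """Mimic how pickle rebuilds an object: no __init__, just restored state."""
    obj = cls.__new__(cls)
    obj.__dict__.update(state)
    return obj


# State as an older version might have saved it: no _og_os_path_sep key.
old_trainer = restore_without_init(Trainer, {"path": "models/"})
print(hasattr(old_trainer, "_og_os_path_sep"))
print(old_trainer.saved_path_sep() == os.path.sep)
```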

@yinweisu
Collaborator

@eses-wk Hi, I think the error you got was caused by a NumPy version discrepancy between the two environments. Also, I've updated the PR a bit and tested it myself across OSes. Would you mind verifying the changes too? The change should be backward-compatible.

@github-actions

Job PR-2865-f7b0146 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2865/f7b0146/index.html

@github-actions

Job PR-2865-709bc73 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2865/709bc73/index.html

@@ -0,0 +1,41 @@
from unittest.mock import patch
Contributor Author

We should add a unit test for absolute paths from both Linux and Windows.

https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats

For example, on windows it could be something like:

"C:\Documents\foo"
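
A rough sketch of what such a test could look like, assuming a hypothetical helper `is_absolute_any_os` that treats a path as absolute if either the Windows or the POSIX flavor considers it absolute. It uses `pathlib.PureWindowsPath` and `pathlib.PurePosixPath`, which can be evaluated on any OS, so the test does not need a Windows machine. The helper and the example paths other than "C:\Documents\foo" are illustrative, not part of this PR.

```python
import unittest
from pathlib import PurePosixPath, PureWindowsPath


def is_absolute_any_os(path: str) -> bool:
    """True if the path is absolute under Windows or POSIX path rules."""
    return PureWindowsPath(path).is_absolute() or PurePosixPath(path).is_absolute()


class TestAbsolutePaths(unittest.TestCase):
    def test_windows_absolute(self):
        self.assertTrue(is_absolute_any_os("C:\\Documents\\foo"))

    def test_linux_absolute(self):
        self.assertTrue(is_absolute_any_os("/home/user/foo"))

    def test_relative_paths(self):
        self.assertFalse(is_absolute_any_os("models\\LightGBM"))
        self.assertFalse(is_absolute_any_os("models/LightGBM"))
```

The tests can be run with `python -m unittest` against the file that contains them.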

@github-actions

Job PR-2865-55a256b is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2865/55a256b/index.html

@github-actions

github-actions bot commented Jun 1, 2023

Job PR-2865-1408fa0 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2865/1408fa0/index.html

@github-actions

github-actions bot commented Jun 1, 2023

Job PR-2865-4edb33e is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2865/4edb33e/index.html

Contributor Author

@Innixma Innixma left a comment

LGTM! @yinweisu feel free to approve and merge. I can't approve since I am the PR author.

Collaborator

@yinweisu yinweisu left a comment

Self approving :)

@yinweisu yinweisu merged commit b6dc989 into autogluon:master Jun 5, 2023
28 checks passed
@eses-wk

eses-wk commented Jun 6, 2023

> @eses-wk Hi, I think the error you got was caused by a NumPy version discrepancy between the two environments. Also, I've updated the PR a bit and tested it myself across OSes. Would you mind verifying the changes too? The change should be backward-compatible.

You were right, the issue was gone

Sorry for the super late reply. I tried loading an old Windows-trained model (AutoGluon version==0.7.0) on a Linux machine (AutoGluon version==0.7.1b20230606, installed from the latest master branch) and an AttributeError occurred, probably due to other new features implemented in the autogluon.features module.

For backward compatibility:
It might be good to mention in the docs that, if there is a cross-OS loading issue, it is recommended to train and deploy/infer with the same, latest AutoGluon version (0.8 coming soon?), since there might be other hidden feature incompatibilities with older versions.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[8], line 6
      4 test_data_nolab = test_data.drop(columns=[label])  # delete label column to prove we're not cheating
      5 test_data_nolab.head()
----> 6 y_pred_test = predictor_test.predict(test_data_nolab,model='WeightedEnsemble_L2')
      7 #y_pred_test = predictor_test.predict(test_data_nolab)
      8 #print("Predictions:  \n", y_pred)
      9 #perf_test = predictor_test.evaluate_predictions(y_true=y_test, y_pred=y_pred, auxiliary_metrics=True)

File ~/Projects/ag_os_test/autogluon/tabular/src/autogluon/tabular/predictor/predictor.py:1379, in TabularPredictor.predict(self, data, model, as_pandas, transform_features)
   1377 self._assert_is_fit('predict')
   1378 data = self.__get_dataset(data)
-> 1379 return self._learner.predict(X=data, model=model, as_pandas=as_pandas, transform_features=transform_features)

File ~/Projects/ag_os_test/autogluon/tabular/src/autogluon/tabular/learner/abstract_learner.py:160, in AbstractTabularLearner.predict(self, X, model, as_pandas, transform_features)
    158 else:
    159     X_index = None
--> 160 y_pred_proba = self.predict_proba(X=X, model=model, as_pandas=False, as_multiclass=False, inverse_transform=False, transform_features=transform_features)
    161 problem_type = self.label_cleaner.problem_type_transform or self.problem_type
    162 y_pred = get_pred_from_proba(y_pred_proba=y_pred_proba, problem_type=problem_type)

File ~/Projects/ag_os_test/autogluon/tabular/src/autogluon/tabular/learner/abstract_learner.py:140, in AbstractTabularLearner.predict_proba(self, X, model, as_pandas, as_multiclass, inverse_transform, transform_features)
    138 else:
    139     if transform_features:
--> 140         X = self.transform_features(X)
    141     y_pred_proba = self.load_trainer().predict_proba(X, model=model)
    142 if inverse_transform:

File ~/Projects/ag_os_test/autogluon/tabular/src/autogluon/tabular/learner/abstract_learner.py:384, in AbstractTabularLearner.transform_features(self, X)
    382 def transform_features(self, X):
    383     for feature_generator in self.feature_generators:
--> 384         X = feature_generator.transform(X)
    385     return X

File ~/Projects/ag_os_test/autogluon/features/src/autogluon/features/generators/abstract.py:341, in AbstractFeatureGenerator.transform(self, X)
    339 if self._pre_astype_generator:
    340     X = self._pre_astype_generator.transform(X)
--> 341 X_out = self._transform(X)
    342 if self._post_generators:
    343     X_out = self._transform_generators(X=X_out, generators=self._post_generators)

File ~/Projects/ag_os_test/autogluon/features/src/autogluon/features/generators/bulk.py:172, in BulkFeatureGenerator._transform(self, X)
    170 feature_df_list = []
    171 for generator in generator_group:
--> 172     feature_df_list.append(generator.transform(X))
    174 if not feature_df_list:
    175     X = DataFrame(index=X.index)

File ~/Projects/ag_os_test/autogluon/features/src/autogluon/features/generators/abstract.py:341, in AbstractFeatureGenerator.transform(self, X)
    339 if self._pre_astype_generator:
    340     X = self._pre_astype_generator.transform(X)
--> 341 X_out = self._transform(X)
    342 if self._post_generators:
    343     X_out = self._transform_generators(X=X_out, generators=self._post_generators)

File ~/Projects/ag_os_test/autogluon/features/src/autogluon/features/generators/astype.py:127, in AsTypeFeatureGenerator._transform(self, X)
    125 def _transform(self, X: DataFrame) -> DataFrame:
    126     if self._bool_features:
--> 127         X = self._convert_to_bool(X)
    128     # check if not same
    129     if self._type_map_real_opt != X.dtypes.to_dict():

File ~/Projects/ag_os_test/autogluon/features/src/autogluon/features/generators/astype.py:152, in AsTypeFeatureGenerator._convert_to_bool(self, X)
    151 def _convert_to_bool(self, X: DataFrame) -> DataFrame:
--> 152     if self._use_fast_bool_method:
    153         return self._convert_to_bool_fast(X)
    154     else:

AttributeError: 'AsTypeFeatureGenerator' object has no attribute '_use_fast_bool_method'

Labels
bug (Something isn't working), module: tabular, OS: Windows (Impacting Windows OS), priority: 0 (Maximum priority)

4 participants