[GBM lr_annealing] failed: water.exceptions.H2OIllegalArgumentException: Can only convert jobs producing a single Model or ModelContainer. #167

Open
algomaschine opened this issue Nov 8, 2022 · 2 comments


algomaschine commented Nov 8, 2022

Dear Developers,

I have 28 data sets in exactly the same format: numerical columns plus some categorical ones. AutoML has been running continuously (30 min max time per model) and has generated various types of models, including DeepLearning, GBM, and StackedEnsemble variations. All good. Again, each model corresponds to a different data set, but the format is EXACTLY the same.

Now, in one instance, it shows the error below. Unfortunately, it doesn't give me any more details. Could you please advise how I can find out what's happening under the hood and perhaps fix it?

Interestingly, it happens after model generation is done, just before the run finishes.

AutoML progress: |█████████████████████████████████████████████████████████████▌ | 97%
08:19:54.521: GBM_lr_annealing_selection_AutoML_26_20221108_75037 [GBM lr_annealing] failed: water.exceptions.H2OIllegalArgumentException: Can only convert jobs producing a single Model or ModelContainer.

AutoML progress: |███████████████████████████████████████████████████████████████ (done)| 100%
generated file C:\Users\Administrator\Desktop\snp ephem\h2o models\StackedEnsemble_BestOfFamily_6_AutoML_26_20221108_75037

Interestingly, it did eventually write the model file. But what is the difference? Is anything missing from it?
generated file C:\Users\Administrator\Desktop\h2o models\StackedEnsemble_BestOfFamily_6_AutoML_26_20221108_75037
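For anyone trying to see which AutoML steps failed, here is a minimal sketch that pulls the failure lines out of captured console output. It assumes the line format seen in the logs in this issue (`time: model_id [step] failed: error`); `find_failed_steps` and `print_event_log` are hypothetical helper names. The second helper relies on the `event_log` property of a trained `H2OAutoML` object, which should carry the same messages in tabular form.

```python
import re

# Format assumed from the console lines quoted in this issue:
#   08:19:54.521: <model_id> [<step name>] failed: <error text>
FAILED_LINE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d+): (?P<model_id>\S+) "
    r"\[(?P<step>.+?)\] failed: (?P<error>.*)$"
)


def find_failed_steps(console_text):
    """Return one dict per 'failed' line found in captured AutoML output."""
    failures = []
    for line in console_text.splitlines():
        match = FAILED_LINE.match(line.strip())
        if match:
            failures.append(match.groupdict())
    return failures


def print_event_log(aml):
    """Print the AutoML event log of a trained H2OAutoML object.

    Assumption: `aml` is an h2o.automl.H2OAutoML instance after train().
    """
    print(aml.event_log.as_data_frame())
```

Calling `find_failed_steps` on the saved console output gives the time, model id, step name, and error text for each failed step, which makes it easier to compare runs across data sets.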

Thank you!

PS: I've also noticed one bug. If the previous instance of the H2O server has not been killed and I start generating models from a different directory, that instance is reused, and the models are written to the directory associated with the previous H2O instance rather than the new directory the program was started from.
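A possible workaround, sketched on the assumption that a relative path passed to `h2o.save_model` is resolved by the already-running server and not by the calling script: always pass an absolute path built from the script's own location. `resolve_output_dir` and `save_leader` are hypothetical helper names, and the `"h2o models"` folder name is taken from the paths in the logs above.

```python
import os


def resolve_output_dir(script_path, subdir="h2o models"):
    """Return an absolute output directory located next to the given script."""
    base = os.path.dirname(os.path.abspath(script_path))
    return os.path.join(base, subdir)


def save_leader(aml, script_path):
    """Save the AutoML leader model under an absolute path.

    Assumption: `aml` is a trained H2OAutoML object and an H2O cluster is
    reachable. h2o is imported here so the path helper works without it.
    """
    import h2o

    out_dir = resolve_output_dir(script_path)
    os.makedirs(out_dir, exist_ok=True)
    return h2o.save_model(model=aml.leader, path=out_dir, force=True)
```

With an absolute path, the output location should no longer depend on which directory the reused server was originally started from.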

@algomaschine
Author

I also sometimes get an 'array out of bounds' exception, but the run continues and eventually generates a model.

@algomaschine
Author

AutoML progress: |████ | 6%
10:00:16.97: GLM_1_AutoML_4_20221109_95959 [GLM def_1] failed: java.lang.ArrayIndexOutOfBoundsException: 324

AutoML progress: |██████████████████████████████████████████████████████████████▎| 98%
10:29:38.439: GBM_lr_annealing_selection_AutoML_4_20221109_95959 [GBM lr_annealing] failed: water.exceptions.H2OIllegalArgumentException: Can only convert jobs producing a single Model or ModelContainer.

AutoML progress: |███████████████████████████████████████████████████████████████ (done)| 100%
Traceback (most recent call last):
File ".\auto_model_trainer.py", line 298, in
train_by_data(os.path.dirname(sys.argv[0])+"\train-test\","train_per_2022-11-.csv",os.path.dirname(sys.argv[0])+"\h2o models\", "AutoMLper_2022-11-*")
File ".\auto_model_trainer.py", line 110, in train_by_data
aml.train(y = y, training_frame = train, leaderboard_frame = test)
File "C:\Program Files\Python37\lib\site-packages\h2o\automl_estimator.py", line 683, in train
self._fetch()
File "C:\Program Files\Python37\lib\site-packages\h2o\automl_estimator.py", line 712, in _fetch
state = _fetch_state(self.key)
File "C:\Program Files\Python37\lib\site-packages\h2o\automl_base.py", line 354, in _fetch_state
event_log = _fetch_table(state_json['event_log_table'], key=project_name+"_eventlog", progress_bar=False)
File "C:\Program Files\Python37\lib\site-packages\h2o\automl_base.py", line 327, in _fetch_table
fr = h2o.H2OFrame(table.cell_values, destination_frame=key, column_names=table.col_header, column_types=table.col_types)
File "C:\Program Files\Python37\lib\site-packages\h2o\frame.py", line 114, in init
column_names, column_types, na_strings, skipped_columns)
File "C:\Program Files\Python37\lib\site-packages\h2o\frame.py", line 155, in _upload_python_object
os.remove(tmp_path) # delete the tmp file
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\Users\ADMINI~1\AppData\Local\Temp\2\tmphburp1m8.csv'
Closing connection _sid_9f5e at exit
H2O session _sid_9f5e closed.

I'm starting to get this often when generating model after model in the same console. The commands below help somewhat, but I still have to watch for every crash. What might the specific root cause be? Resources are fine: there is enough memory and everything else.
taskkill /F /IM "python.exe"
taskkill /F /IM "java.exe"
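As a gentler alternative to force-killing the processes, one option is to give each data set its own cluster lifetime and shut the cluster down cleanly afterwards. The sketch below is only an illustration: `train_each_in_fresh_cluster`, `h2o_start`, and `h2o_stop` are hypothetical helper names, and whether this avoids the `PermissionError` above has not been confirmed. It also records a failure and moves on to the next data set, so one crash does not stop the whole batch.

```python
def train_each_in_fresh_cluster(jobs, train_one, start_cluster, stop_cluster):
    """Run train_one(job) for every job, each inside its own cluster lifetime.

    Returns a dict mapping each job to its result, or to the exception raised.
    """
    results = {}
    for job in jobs:
        start_cluster()
        try:
            results[job] = train_one(job)
        except Exception as exc:  # record the failure and continue
            results[job] = exc
        finally:
            stop_cluster()  # always runs, even when training failed
    return results


def h2o_start():
    """Start (or connect to) a local H2O cluster."""
    import h2o

    h2o.init()


def h2o_stop():
    """Shut down the H2O cluster this session is connected to."""
    import h2o

    h2o.cluster().shutdown()
```

Usage would look like `train_each_in_fresh_cluster(csv_files, my_train_function, h2o_start, h2o_stop)`, where `csv_files` and `my_train_function` stand in for the list of data sets and the existing training routine.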
