OneHotEncoder Failure: Simple example failure #147

chclam · 2022-03-21T20:13:18Z

An error occurs when I'm trying to run the following simple example from the main page:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import log_loss, accuracy_score
from gama import GamaClassifier

if __name__ == '__main__':
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

    automl = GamaClassifier(max_total_time=180, store="nothing")
    print("Starting `fit` which will take roughly 3 minutes.")
    automl.fit(X_train, y_train)

    label_predictions = automl.predict(X_test)
    probability_predictions = automl.predict_proba(X_test)

    print('accuracy:', accuracy_score(y_test, label_predictions))
    print('log loss:', log_loss(y_test, probability_predictions))
    # the `score` function outputs the score on the metric optimized towards (by default, `log_loss`)
    print('log_loss', automl.score(X_test, y_test))

The error that I get:

Traceback (most recent call last):
  File "/Users/chris/Development/gradproject/issues/gama/gama/./test.py", line 13, in <module>
    automl.fit(X_train, y_train)
  File "/Users/chris/Development/gradproject/issues/gama/gama/gama/GamaClassifier.py", line 134, in fit
    super().fit(x, y, *args, **kwargs)
  File "/Users/chris/Development/gradproject/issues/gama/gama/gama/gama.py", line 502, in fit
    self._x, self._basic_encoding_pipeline = basic_encoding(
  File "/Users/chris/Development/gradproject/issues/gama/gama/gama/utilities/preprocessing.py", line 63, in basic_encoding
    x_enc = encoding_pipeline.fit_transform(x, y=None)  # Is this dangerous?
  File "/usr/local/lib/python3.9/site-packages/sklearn/pipeline.py", line 434, in fit_transform
    return last_step.fit_transform(Xt, y, **fit_params_last_step)
  File "/usr/local/lib/python3.9/site-packages/sklearn/base.py", line 847, in fit_transform
    return self.fit(X, **fit_params).transform(X)
  File "/usr/local/lib/python3.9/site-packages/category_encoders/one_hot.py", line 152, in fit
    oe_missing_strat = {
KeyError: 'ignore'

It seems to be caused by an invalid keyword being passed to the handle_missing parameter of OneHotEncoder in the category_encoders dependency.
According to the docs, the valid keywords are error, return_nan, value, and indicator, where value is the default.
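
To make the failure mode concrete, here is a minimal sketch of a guard that checks a handle_missing value against the four keywords listed above before an encoder is constructed. The helper name check_handle_missing is hypothetical and is not part of category_encoders or gama; the set of valid keywords is simply the one quoted from the docs above.

```python
# Hypothetical helper; not part of category_encoders or gama.
# Valid keywords as listed in the category_encoders docs quoted above.
VALID_HANDLE_MISSING = ("error", "return_nan", "value", "indicator")


def check_handle_missing(value: str) -> str:
    """Return `value` unchanged if it is a documented keyword, else raise."""
    if value not in VALID_HANDLE_MISSING:
        raise ValueError(
            f"handle_missing={value!r} is not supported; "
            f"expected one of {VALID_HANDLE_MISSING}"
        )
    return value


if __name__ == "__main__":
    print(check_handle_missing("value"))  # the documented default
    try:
        check_handle_missing("ignore")  # the keyword behind the KeyError
    except ValueError as exc:
        print(exc)
```

A check like this would turn the bare KeyError: 'ignore' into a message that names the offending parameter and the accepted values.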

PGijsbers · 2022-03-22T11:11:46Z

Hi, thanks for opening the issue and providing a solution 👍
It looks like this is specific to the latest category_encoders release (the change was undocumented and came without deprecation warnings) :)

For those who run into this issue before a new gama PyPI release is available: please downgrade category_encoders to 2.3:
pip install category-encoders==2.3
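
Until a fixed release is out, a script can check which category_encoders version is installed before calling fit. The sketch below assumes that every version newer than 2.3 is affected, which is an inference from this thread rather than something confirmed by the category_encoders changelog; the function names are made up for illustration.

```python
import re
from importlib import metadata


def parse_major_minor(version: str) -> tuple:
    """Extract (major, minor) from a version string such as '2.4.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_newer_than_2_3(version: str) -> bool:
    # Assumption: releases after 2.3 reject handle_missing="ignore".
    return parse_major_minor(version) > (2, 3)


if __name__ == "__main__":
    try:
        installed = metadata.version("category-encoders")
    except metadata.PackageNotFoundError:
        print("category-encoders is not installed")
    else:
        if is_newer_than_2_3(installed):
            print(f"category-encoders {installed} found; consider pinning to 2.3")
        else:
            print(f"category-encoders {installed} is 2.3 or older")
```

The comparison only looks at the major and minor components, so patch releases of 2.3 are treated the same as 2.3 itself.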

chclam · 2022-03-23T14:00:51Z

Hey, I'm glad to be of any help 👍

alanwilter · 2022-09-19T13:28:52Z

I'm hitting this same problem right now, and I need it to work in Docker.

If you don't have a Linux machine, you can reproduce it on gitpod.io with https://github.com/openml/automlbenchmark:

yes | python runbenchmark.py gama:latest example test -m docker -s force
...
Collecting liac-arff>=2.2.2
  Downloading liac-arff-2.5.0.tar.gz (13 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Requirement already satisfied: psutil in ./frameworks/GAMA/venv/lib/python3.7/site-packages (from gama==22.0.1.dev0) (5.8.0)
ERROR: Ignored the following versions that require a different python version: 1.1.0 Requires-Python >=3.8; 1.1.0rc1 Requires-Python >=3.8; 1.1.1 Requires-Python >=3.8; 1.1.2 Requires-Python >=3.8; 1.4.0 Requires-Python >=3.8; 1.4.0rc0 Requires-Python >=3.8; 1.4.1 Requires-Python >=3.8; 1.4.2 Requires-Python >=3.8; 1.4.3 Requires-Python >=3.8; 1.4.4 Requires-Python >=3.8; 1.5.0rc0 Requires-Python >=3.8; 1.8.0 Requires-Python >=3.8,<3.11; 1.8.0rc1 Requires-Python >=3.8,<3.11; 1.8.0rc2 Requires-Python >=3.8,<3.11; 1.8.0rc3 Requires-Python >=3.8,<3.11; 1.8.0rc4 Requires-Python >=3.8,<3.11; 1.8.1 Requires-Python >=3.8,<3.11; 1.9.0 Requires-Python >=3.8,<3.12; 1.9.0rc1 Requires-Python >=3.8,<3.12; 1.9.0rc2 Requires-Python >=3.8,<3.12; 1.9.0rc3 Requires-Python >=3.8,<3.12; 1.9.1 Requires-Python >=3.8,<3.12
ERROR: Could not find a version that satisfies the requirement scikit-learn>=1.1.0 (from gama) (from versions: 0.9, 0.10, 0.11, 0.12, 0.12.1, 0.13, 0.13.1, 0.14, 0.14.1, 0.15.0b1, 0.15.0b2, 0.15.0, 0.15.1, 0.15.2, 0.16b1, 0.16.0, 0.16.1, 0.17b1, 0.17, 0.17.1, 0.18, 0.18.1, 0.18.2, 0.19b2, 0.19.0, 0.19.1, 0.19.2, 0.20rc1, 0.20.0, 0.20.1, 0.20.2, 0.20.3, 0.20.4, 0.21rc2, 0.21.0, 0.21.1, 0.21.2, 0.21.3, 0.22rc2.post1, 0.22rc3, 0.22, 0.22.1, 0.22.2, 0.22.2.post1, 0.23.0rc1, 0.23.0, 0.23.1, 0.23.2, 0.24.dev0, 0.24.0rc1, 0.24.0, 0.24.1, 0.24.2, 1.0rc1, 1.0rc2, 1.0, 1.0.1, 1.0.2)
ERROR: No matching distribution found for scikit-learn>=1.1.0
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'gama'

Cloning into '/bench/frameworks/GAMA/lib/gama'...
ERROR: Ignored the following versions that require a different python version: 1.1.0 Requires-Python >=3.8; 1.1.0rc1 Requires-Python >=3.8; 1.1.1 Requires-Python >=3.8; 1.1.2 Requires-Python >=3.8; 1.4.0 Requires-Python >=3.8; 1.4.0rc0 Requires-Python >=3.8; 1.4.1 Requires-Python >=3.8; 1.4.2 Requires-Python >=3.8; 1.4.3 Requires-Python >=3.8; 1.4.4 Requires-Python >=3.8; 1.5.0rc0 Requires-Python >=3.8; 1.8.0 Requires-Python >=3.8,<3.11; 1.8.0rc1 Requires-Python >=3.8,<3.11; 1.8.0rc2 Requires-Python >=3.8,<3.11; 1.8.0rc3 Requires-Python >=3.8,<3.11; 1.8.0rc4 Requires-Python >=3.8,<3.11; 1.8.1 Requires-Python >=3.8,<3.11; 1.9.0 Requires-Python >=3.8,<3.12; 1.9.0rc1 Requires-Python >=3.8,<3.12; 1.9.0rc2 Requires-Python >=3.8,<3.12; 1.9.0rc3 Requires-Python >=3.8,<3.12; 1.9.1 Requires-Python >=3.8,<3.12
ERROR: Could not find a version that satisfies the requirement scikit-learn>=1.1.0 (from gama) (from versions: 0.9, 0.10, 0.11, 0.12, 0.12.1, 0.13, 0.13.1, 0.14, 0.14.1, 0.15.0b1, 0.15.0b2, 0.15.0, 0.15.1, 0.15.2, 0.16b1, 0.16.0, 0.16.1, 0.17b1, 0.17, 0.17.1, 0.18, 0.18.1, 0.18.2, 0.19b2, 0.19.0, 0.19.1, 0.19.2, 0.20rc1, 0.20.0, 0.20.1, 0.20.2, 0.20.3, 0.20.4, 0.21rc2, 0.21.0, 0.21.1, 0.21.2, 0.21.3, 0.22rc2.post1, 0.22rc3, 0.22, 0.22.1, 0.22.2, 0.22.2.post1, 0.23.0rc1, 0.23.0, 0.23.1, 0.23.2, 0.24.dev0, 0.24.0rc1, 0.24.0, 0.24.1, 0.24.2, 1.0rc1, 1.0rc2, 1.0, 1.0.1, 1.0.2)
ERROR: No matching distribution found for scikit-learn>=1.1.0
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'gama'

Command '['/bench/frameworks/GAMA/setup.sh', 'latest']' returned non-zero exit status 1.
The command '/bin/sh -c $PY runbenchmark.py gama:latest -s only' returned a non-zero code: 2
Traceback (most recent call last):
  File "runbenchmark.py", line 182, in <module>
    bench.setup(amlb.SetupMode[args.setup])
  File "/bench/amlb/benchmark.py", line 126, in setup
    _activity_timeout_=rconfig().setup.activity_timeout)
  File "/bench/frameworks/GAMA/__init__.py", line 7, in setup
    call_script_in_same_dir(__file__, "setup.sh", *args, **kwargs)
  File "/bench/amlb/utils/process.py", line 259, in call_script_in_same_dir
    return run_script(script_path, *args, **kwargs)
  File "/bench/amlb/utils/process.py", line 253, in run_script
    return run_cmd(script_path, *args, **kwargs)
  File "/bench/amlb/utils/process.py", line 247, in run_cmd
    raise e
  File "/bench/amlb/utils/process.py", line 234, in run_cmd
    preexec_fn=params.preexec_fn)
  File "/bench/amlb/utils/process.py", line 77, in run_subprocess
    raise subprocess.CalledProcessError(retcode, process.args, output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/bench/frameworks/GAMA/setup.sh', 'latest']' returned non-zero exit status 1.

The command '/bin/sh -c $PY runbenchmark.py gama:latest -s only' returned a non-zero code: 2

Command 'docker build --no-cache -t automlbenchmark/gama:latest-dev -f /workspace/automlbenchmark/frameworks/GAMA/.setup/Dockerfile .' returned non-zero exit status 2.
Traceback (most recent call last):
  File "runbenchmark.py", line 182, in <module>
    bench.setup(amlb.SetupMode[args.setup])
  File "/workspace/automlbenchmark/amlb/runners/container.py", line 80, in setup
    self.image = self._build_image(cache=(mode != SetupMode.force))
  File "/workspace/automlbenchmark/amlb/runners/container.py", line 194, in _build_image
    self._run_container_build_command(image, cache)
  File "/workspace/automlbenchmark/amlb/runners/docker.py", line 98, in _run_container_build_command
    run_cmd("docker build {options} -t {container} -f {script} .".format(
  File "/workspace/automlbenchmark/amlb/utils/process.py", line 247, in run_cmd
    raise e
  File "/workspace/automlbenchmark/amlb/utils/process.py", line 221, in run_cmd
    completed = run_subprocess(str_cmd if params.shell else full_cmd,
  File "/workspace/automlbenchmark/amlb/utils/process.py", line 77, in run_subprocess
    raise subprocess.CalledProcessError(retcode, process.args, output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'docker build --no-cache -t automlbenchmark/gama:latest-dev -f /workspace/automlbenchmark/frameworks/GAMA/.setup/Dockerfile .' returned non-zero exit status 2.

If I run it locally, it works, but only because my local Python is 3.8.

alanwilter · 2022-09-19T13:39:25Z

Never mind, I set python: 3.8 in resources/config.yaml and it worked.

PGijsbers · 2022-09-20T08:20:40Z

As expected: the latest GAMA release (22.0.0) is only available for Python 3.8+.
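
The pip log above shows that scikit-learn>=1.1.0, which gama requires, is only published for Python 3.8 and later. A quick pre-flight check along these lines can surface that before a long docker build; the function name and the default minimum are illustrative assumptions, not part of gama or automlbenchmark.

```python
import sys

MIN_PYTHON = (3, 8)  # from "Requires-Python >=3.8" in the pip output above


def python_is_supported(version_info=None, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if python_is_supported():
        print("Python version is new enough")
    else:
        found = ".".join(map(str, sys.version_info[:2]))
        print(f"Python {found} is too old; 3.8 or newer is needed")
```

Running it with the interpreter that the benchmark is configured to use (3.7 in the log above) would report the mismatch immediately.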

chclam mentioned this issue Mar 21, 2022

Fix for issue #147, provide valid function argument to onehotencoder #148

Merged

PGijsbers mentioned this issue Mar 22, 2022

Bump dependencies #142

Closed

chclam closed this as completed Mar 23, 2022

PGijsbers mentioned this issue Aug 16, 2022

[GAMA] Error: NoResultError:'ignore' openml/automlbenchmark#489

Closed
