Feature shape mismatch error in xgboost 1.5.0 for linux-s390x #7712

Closed
pradghos opened this issue Mar 1, 2022 · 3 comments

@pradghos
Contributor

pradghos commented Mar 1, 2022

We have observed a "Feature shape mismatch" error in xgboost 1.5.0 on linux-s390x. It is easily reproducible with the test case below.

Failure log:

pytest -v tests/python/test_with_sklearn.py::test_XGBClassifier_resume

tests/python/test_with_sklearn.py::test_XGBClassifier_resume FAILED                                                                                    [100%]

========================================================================== FAILURES ==========================================================================
_________________________________________________________________ test_XGBClassifier_resume __________________________________________________________________

    def test_XGBClassifier_resume():
        from sklearn.datasets import load_breast_cancer
        from sklearn.metrics import log_loss

        with TemporaryDirectory() as tempdir:
            model1_path = os.path.join(tempdir, 'test_XGBClassifier.model')
            model1_booster_path = os.path.join(tempdir, 'test_XGBClassifier.booster')

            X, Y = load_breast_cancer(return_X_y=True)

            model1 = xgb.XGBClassifier(
                learning_rate=0.3, random_state=0, n_estimators=8)
            model1.fit(X, Y)

>           pred1 = model1.predict(X)

.... 

            if not hasattr(data, "shape"):
                raise TypeError(
                    "`shape` attribute is required when `validate_features` is True."
                )
            if len(data.shape) != 1 and self.num_features() != data.shape[1]:
>               raise ValueError(
                    f"Feature shape mismatch, expected: {self.num_features()}, "
                    f"got {data.shape[1]}"
                )
E               ValueError: Feature shape mismatch, expected: 0, got 30

Other test cases also fail because of the same issue in xgboost 1.5.0. However, the test case above passes with xgboost 1.3.3 on linux-s390x.

pradghos added a commit to pradghos/xgboost that referenced this issue Mar 3, 2022
- Addressing dmlc#7712
- Corrected integer types
- also referred current implementation of `num_row()`, `num_col()`
@pradghos
Contributor Author

pradghos commented Mar 3, 2022

features.value is getting truncated to 0 in num_features() in the Python layer, even though the C API writes the correct value.

> /opt/miniconda3/envs/test_xgb_mster/lib/python3.9/site-packages/xgboost/core.py(2039)inplace_predict()
-> if not hasattr(data, "shape"):
(Pdb) n
> /opt/miniconda3/envs/test_xgb_mster/lib/python3.9/site-packages/xgboost/core.py(2043)inplace_predict()
-> if len(data.shape) != 1 and self.num_features() != data.shape[1]:
(Pdb) step num_features()
--Call--
> /opt/miniconda3/envs/test_xgb_mster/lib/python3.9/site-packages/xgboost/core.py(2243)num_features()
-> def num_features(self) -> int:
(Pdb) n
> /opt/miniconda3/envs/test_xgb_mster/lib/python3.9/site-packages/xgboost/core.py(2245)num_features()
-> features = ctypes.c_int()
(Pdb) n
> /opt/miniconda3/envs/test_xgb_mster/lib/python3.9/site-packages/xgboost/core.py(2246)num_features()
-> assert self.handle is not None
(Pdb) n
> /opt/miniconda3/envs/test_xgb_mster/lib/python3.9/site-packages/xgboost/core.py(2247)num_features()
-> _check_call(_LIB.XGBoosterGetNumFeature(self.handle, ctypes.byref(features)))
(Pdb)

...

Breakpoint 1, XGBoosterGetNumFeature (handle=0x2aa01807920, out=0x3ffaefe5588) at /usr/local/src/conda/xgboost-split-master/src/c_api/c_api.cc:626
626     /usr/local/src/conda/xgboost-split-master/src/c_api/c_api.cc: No such file or directory.
(gdb) n
627     in /usr/local/src/conda/xgboost-split-master/src/c_api/c_api.cc
(gdb) n
628     in /usr/local/src/conda/xgboost-split-master/src/c_api/c_api.cc
(gdb) p *(bst_ulong *)out
$1 = 10                           <-------- Correct value
(gdb) n
0x000003ffb9f07c60 in ffi_call_SYSV () from /opt/miniconda3/envs/test_xgb_mster/lib/python3.9/lib-dynload/../../libffi.so.7
(gdb) c
Continuing.
> /opt/miniconda3/envs/test_xgb_mster/lib/python3.9/site-packages/xgboost/core.py(2248)num_features()
-> return features.value
(Pdb) p features.value           <--------- incorrect value
0
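The mismatch between the two debuggers can be sketched without xgboost. This is a minimal illustration, assuming that bst_ulong is a 64-bit unsigned integer on the C side while ctypes.c_int is 32 bits wide, and that s390x is big-endian. The helper name read_as_int32 is hypothetical and only emulates the byte layout with the struct module; it does not call the real C API.

```python
import struct


def read_as_int32(value: int, byte_order: str) -> int:
    """Store `value` as an unsigned 64-bit integer, then read back only the
    first 4 bytes as a signed 32-bit integer, as a too-narrow c_int would.

    byte_order is "<" for little-endian or ">" for big-endian.
    """
    raw = struct.pack(byte_order + "Q", value)          # 8 bytes written by the C side
    return struct.unpack(byte_order + "i", raw[:4])[0]  # 4 bytes seen by the Python side


# Little-endian (e.g. x86_64): the low-order bytes come first, so the
# narrow read happens to return the right number and the bug stays hidden.
print(read_as_int32(10, "<"))

# Big-endian (e.g. s390x): the high-order bytes come first, and they are
# all zero for small values, so the narrow read returns 0.
print(read_as_int32(10, ">"))
```

Under these assumptions the first call prints 10 and the second prints 0, which matches the correct value seen in gdb and the 0 seen in pdb.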

After the fix -

> /opt/miniconda3/envs/test_xgb_master/lib/python3.9/site-packages/xgboost/core.py(2243)num_features()
-> def num_features(self) -> int:
(Pdb) n
> /opt/miniconda3/envs/test_xgb_master/lib/python3.9/site-packages/xgboost/core.py(2245)num_features()
-> features = c_bst_ulong()
(Pdb) n
> /opt/miniconda3/envs/test_xgb_master/lib/python3.9/site-packages/xgboost/core.py(2246)num_features()
-> assert self.handle is not None
(Pdb) n
> /opt/miniconda3/envs/test_xgb_master/lib/python3.9/site-packages/xgboost/core.py(2247)num_features()
-> _check_call(_LIB.XGBoosterGetNumFeature(self.handle, ctypes.byref(features)))
(Pdb)

...

Breakpoint 1, XGBoosterGetNumFeature (handle=0x2aa017fc5f0, out=0x3ffaf0d6e88) at /usr/local/src/conda/xgboost-split-master/src/c_api/c_api.cc:626
626     /usr/local/src/conda/xgboost-split-master/src/c_api/c_api.cc: No such file or directory.
(gdb) n
627     in /usr/local/src/conda/xgboost-split-master/src/c_api/c_api.cc
(gdb) n
628     in /usr/local/src/conda/xgboost-split-master/src/c_api/c_api.cc
(gdb) p *(bst_ulong *)out
$1 = 10       <---------- Correct value
(gdb) n
0x000003ffb9f07c60 in ffi_call_SYSV () from /opt/miniconda3/envs/test_xgb_master/lib/python3.9/lib-dynload/../../libffi.so.7
(gdb) c
Continuing.
> /opt/miniconda3/envs/test_xgb_master/lib/python3.9/site-packages/xgboost/core.py(2248)num_features()
-> return features.value
(Pdb) p features.value
10           <---------  Correct value
(Pdb)
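The pattern used after the fix can be sketched as follows, again without xgboost. This is only an illustration under stated assumptions: c_bst_ulong is taken to be an alias for ctypes.c_uint64, and fake_get_num_feature is a hypothetical stand-in for XGBoosterGetNumFeature that writes a native 64-bit value through the out pointer.

```python
import ctypes

# Assumption: bst_ulong is a 64-bit unsigned integer, mirrored in Python as c_uint64.
c_bst_ulong = ctypes.c_uint64


def fake_get_num_feature(out_address: int) -> int:
    """Hypothetical stand-in for the C function: writes a native 64-bit
    value to the out pointer and returns 0 for success."""
    ctypes.c_uint64.from_address(out_address).value = 10
    return 0


def num_features() -> int:
    # The out-parameter has the same width as the type the callee writes,
    # so the value is read back intact regardless of byte order.
    features = c_bst_ulong()
    status = fake_get_num_feature(ctypes.addressof(features))
    assert status == 0
    return features.value


print(num_features())
```

Because the buffer and the write are both 8 bytes wide, the result does not depend on whether the host is little-endian or big-endian.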

@trivialfis
Member

Closed by #7715.

@pradghos
Contributor Author

pradghos commented Mar 3, 2022

JFYI @potula-chandra @ravigumm
