XGBoost segmentation fault #9

Closed
tejas-kale opened this issue Aug 25, 2021 · 1 comment

@tejas-kale

Hi Tushar. Thanks for sharing the package. I am hitting a crash at the line self.temp1 = XGBClassifier().fit(self.X, self.y).feature_importances_ in the method base_tree() of models.py. The output does not tell me much; all I get is:

UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].
  warnings.warn(label_encoder_deprecation_msg, UserWarning)

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)
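
For reference, the second step the warning asks for (labels encoded as integers 0, 1, ..., num_class - 1) can be sketched in plain Python as below. This is only an illustration: encode_labels is a hypothetical helper, not part of XGBoost or XBNet, and the script in this report already does the same thing with LabelEncoder. The warning most likely appears because XGBClassifier() is constructed without use_label_encoder=False, and it is probably unrelated to the segmentation fault itself.

```python
def encode_labels(labels):
    """Map arbitrary labels to integers 0..num_class-1 (hypothetical helper).

    Classes are sorted first, so the mapping is deterministic.
    """
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    return [index[label] for label in labels], classes


species = ["Iris-setosa", "Iris-virginica", "Iris-versicolor", "Iris-setosa"]
encoded, classes = encode_labels(species)
print(encoded)  # [0, 2, 1, 0]
print(classes)  # ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']

# Per the warning text, the classifier would then be built as
# XGBClassifier(use_label_encoder=False) and fitted on the encoded labels.
```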

The surprising thing is that when I run XGBClassifier().fit(X, y) on the same data in an IPython console, it runs fine.
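
Since exit code 139 gives no Python traceback, one way to get more detail might be the standard-library faulthandler module, which prints the Python stack of each thread when the process receives SIGSEGV. The sketch below is only a diagnostic aid under that assumption; run_training is a hypothetical placeholder for the README script, and this does not fix the crash.

```python
import faulthandler
import sys

# Ask the interpreter to dump the Python traceback of all threads to the
# original stderr stream if a fatal signal such as SIGSEGV arrives.
faulthandler.enable(file=sys.__stderr__, all_threads=True)


def run_training():
    """Hypothetical placeholder: put the README script's body here."""
    # e.g. model = XBNETClassifier(X_train, y_train, 2)
    return "done"


if __name__ == "__main__":
    print(run_training())
```

The same handler can be switched on without editing the script by running it as python -X faulthandler script.py (script.py being whatever file holds the code above).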

I am using the same script as provided in the README:

import torch
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from XBNet.training_utils import training,predict
from XBNet.models import XBNETClassifier
from XBNet.run import run_XBNET

# Experiment with Iris data directly from sklearn.
# iris = load_iris()
# data = pd.DataFrame(iris.data)
# data.columns = iris.feature_names
# data.loc[:, 'type'] = iris.target
# X, y = data.iloc[:, :-1], data.iloc[:, -1]
# X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=123)

data = pd.read_csv('test/Iris.csv')
print(data.shape)
x_data = data[data.columns[:-1]]
print(x_data.shape)
y_data = data[data.columns[-1]]
le = LabelEncoder()
y_data = np.array(le.fit_transform(y_data))
print(le.classes_)

X_train,X_test,y_train,y_test = train_test_split(x_data.to_numpy(),y_data,test_size = 0.3,random_state = 0)
model = XBNETClassifier(X_train,y_train,2)

criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

m,acc, lo, val_ac, val_lo = run_XBNET(X_train,X_test,y_train,y_test,model,criterion,optimizer,32,300)
print(predict(m,x_data.to_numpy()[0,:]))

Any idea what could be going wrong here? Here are some details of my system that might be useful:

  • OS: macOS 11.5.2
  • Python 3.8.11
  • XGBoost version: 1.4.2
  • XBNet: 1.3.1
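
A small standard-library sketch for collecting the same details on another machine follows, in case it helps others reproduce the problem. The helper name describe_environment and the package names passed to it (in particular "XBNet" and "torch" as distribution names) are assumptions for illustration.

```python
import platform
import sys
from importlib import metadata


def describe_environment(packages=("xgboost", "XBNet", "torch", "scikit-learn")):
    """Return OS, Python, and installed package versions as a dict."""
    info = {"os": platform.platform(), "python": sys.version.split()[0]}
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```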
@tusharsarkar3
Owner

Hey! Sorry for the late reply. Another macOS user has reported the same problem, so we're looking into how it can be solved.
