
Does this cause a core dump? #16

Closed
chenglin opened this issue Mar 28, 2022 · 3 comments

Comments

@chenglin

Recently I found that one of my models causes a core dump when I use lleaves for prediction.

I am confused about the two functions below.

In codegen.py, a tree function's parameter type can be int* if the feature is categorical:

def make_tree(tree):
    # declare the function for this tree
    func_dtypes = (INT_CAT if f.is_categorical else DOUBLE for f in tree.features)
    scalar_func_t = ir.FunctionType(DOUBLE, func_dtypes)
    tree_func = ir.Function(module, scalar_func_t, name=str(tree))
    tree_func.linkage = "private"
    # populate function with IR
    gen_tree(tree, tree_func)
    return LTree(llvm_function=tree_func, class_id=tree.class_id)

But in data_processing.py, which predict uses, all feature parameters are converted to double*:

def ndarray_to_ptr(data: np.ndarray):
    """
    Takes a 2D numpy array, converts to float64 if necessary and returns a pointer

    :param data: 2D numpy array. Copying is avoided if possible.
    :return: pointer to 1D array of dtype float64.
    """
    # ravel makes sure we get a contiguous array in memory and not some strided View
    data = data.astype(np.float64, copy=False, casting="same_kind").ravel()
    ptr = data.ctypes.data_as(POINTER(c_double))
    return ptr
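For what it's worth, the copy behaviour of that conversion can be checked directly. This is a small standalone sketch, not lleaves code:

```python
import numpy as np

# Sketch: astype(np.float64, copy=False, casting="same_kind") passes a
# float64 array through without copying; ravel() on a C-contiguous
# array is likewise a view, so no data is duplicated in the common case.
a = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float64)
b = a.astype(np.float64, copy=False, casting="same_kind").ravel()

assert np.shares_memory(a, b)  # float64 input: no copy was made

# An int32 input is a "safe" cast (a subset of "same_kind"), so it is
# converted -- and therefore copied -- to float64.
c = np.array([[1, 2], [3, 4]], dtype=np.int32)
d = c.astype(np.float64, copy=False, casting="same_kind").ravel()

assert d.dtype == np.float64
assert not np.shares_memory(c, d)
```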

Is this effectively like the following C?

double predict(int* a, double* b);  /* the categorical parameter is declared int* */
double a = 1.1;
double b = 2.2;
predict(&a, &b);  /* but &a is a double*, not an int* */

Could this type mismatch happen in lleaves?

@siboehm
Owner

siboehm commented Mar 29, 2022

TLDR: It's possible that there's a bug that causes a segfault, though it's unlikely that this is happening in the parts of the code you're pointing to.

For diagnosing the segfault: could you run a minimal reproducing example under gdb to see which instruction triggers the segfault? There used to be an issue with overflows for very large datasets, but I fixed that a few months ago. If there's any way you can send me a self-contained, minimal reproducible sample (email is fine), I'd love to help you out.

Regarding the categorical data: The relevant function is actually this one:

def gen_forest(forest, module, fblocksize):

This is the function in the binary that lleaves calls from Python (using two double pointers). The categorical features are then cast to ints in the core loop here:
args.append(builder.fptosi(el, INT_CAT))

Most of the processing of the Pandas dataframes follows LightGBM very closely. This double-to-int casting is a bit strange, but I wanted to follow LightGBM as closely as possible. It works because LightGBM doesn't allow categorical values greater than 2^31 - 1 (max int32), while a double can represent any integer up to 2^53 without loss of precision.
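The 2^53 bound is easy to verify in plain Python (just float arithmetic, nothing lleaves-specific):

```python
# float64 has a 53-bit significand, so every integer with absolute
# value up to 2**53 is representable exactly. LightGBM caps categorical
# values at int32 max, which is far below that bound.
max_int32 = 2**31 - 1
assert int(float(max_int32)) == max_int32  # int -> double -> int is lossless

# Above 2**53, consecutive integers start to collide:
assert float(2**53) == 2**53
assert float(2**53 + 1) == float(2**53)  # 2**53 + 1 rounds to 2**53
```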

@chenglin
Author

I found that if the categorical features are numerical values, we can skip df[categorical_feature] = df[categorical_feature].astype('category') when preparing the training data, and instead just call the LightGBM train function with the parameter categorical_feature=categorical_feature. In a model file trained like this, pandas_categorical is null. Could this issue be related to that?

When I retrained a model whose pandas_categorical is not null, the core dump disappeared.

PR: return empty list if pandas_categorical is null in model file
BTW, I think we should keep pandas_categorical = None when pandas_categorical: null in the model file.
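A minimal sketch of the guard being discussed (the dict and field name mirror the model file's JSON metadata; the exact handling is illustrative, not lleaves' actual code):

```python
import json

# Hypothetical metadata from a model trained without pandas categorical
# dtypes: the pandas_categorical field is null in the model file.
meta = json.loads('{"pandas_categorical": null}')

# The PR's behaviour: fall back to an empty list so downstream code can
# iterate safely; the alternative suggested above is to keep it as None.
pandas_categorical = meta["pandas_categorical"]
if pandas_categorical is None:
    pandas_categorical = []

assert pandas_categorical == []
```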

@siboehm
Owner

siboehm commented Apr 3, 2022

I'm having trouble understanding this issue. Could you write up a minimally reproducible example of the core dump / send me the model.txt that causes it?

@siboehm siboehm closed this as completed Aug 21, 2022