Tree: node_value could be float instead of double #14747

Open
sdpython opened this issue Aug 24, 2019 · 2 comments
Comments

@sdpython
Contributor

Description

The predict function in _tree.pyx compares a float feature to a double threshold (node_value). The double threshold could be replaced by a float threshold, rounded down so that every float feature still falls on the same side of the split, using the following function:

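import numpy


def float_threshold(dy):
    """Return the largest float32 that is less than or equal to the double dy."""
    fy = numpy.float32(dy)
    # Compare as Python floats (doubles) so the test does not depend on
    # how NumPy promotes mixed float32/float64 scalars.
    if float(fy) <= float(dy):
        # Rounding kept the value or moved it below dy: fy is already valid.
        return fy
    # Rounding moved the value above dy: step down to the previous float32.
    return numpy.nextafter(fy, numpy.float32(-numpy.inf))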

More explanations: see "Tricky detail when converting a random forest from scikit-learn into ONNX".
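
The reason for rounding down rather than to nearest can be checked with a short sketch. It assumes the split test has the form `x <= threshold` with `x` stored as float32, and it uses a hypothetical threshold chosen so that plain `numpy.float32(dy)` rounds up. The helper `same_decisions` is illustrative only and is not part of scikit-learn.

```python
import numpy as np


def float_threshold(dy):
    """Largest float32 that is <= the double dy (same idea as the function above)."""
    fy = np.float32(dy)
    if float(fy) <= float(dy):
        return fy
    return np.nextafter(fy, np.float32(-np.inf))


def same_decisions(dy, ft, n_steps=4):
    """Check that x <= dy and x <= ft agree for the float32 values around dy."""
    x = np.float32(dy)
    # Walk a few representable float32 values below the rounded threshold...
    for _ in range(n_steps):
        x = np.nextafter(x, np.float32(-np.inf))
    # ...then sweep upward across it, comparing both decisions as doubles.
    for _ in range(2 * n_steps + 1):
        if (float(x) <= float(dy)) != (float(x) <= float(ft)):
            return False
        x = np.nextafter(x, np.float32(np.inf))
    return True


# Hypothetical threshold: a double just below 1.0 that float32 rounds up to 1.0.
dy = 1.0 - 2.0 ** -40
naive = np.float32(dy)       # round to nearest
safe = float_threshold(dy)   # round down

naive_ok = same_decisions(dy, naive)  # the feature value 1.0 changes side
safe_ok = same_decisions(dy, safe)    # all neighbouring values keep their side
```

With this threshold, the naive cast sends a feature equal to 1.0 to the other branch, while the rounded-down threshold keeps every neighbouring float32 value on its original side.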

Expected Results

Exactly the same results, with lower memory consumption and possibly a speed gain, since no cast to double would be needed when comparing the two values.
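
As a rough illustration of the memory side only, the sketch below compares the size of a threshold array stored as float64 and as float32. The node count is hypothetical, the arrays are stand-ins rather than scikit-learn's actual node storage, and any speed gain is not measured here.

```python
import numpy as np

n_nodes = 1_000_000  # hypothetical number of tree nodes

# Stand-in threshold arrays; the values are irrelevant, only the dtype matters.
thresholds64 = np.zeros(n_nodes, dtype=np.float64)
thresholds32 = thresholds64.astype(np.float32)

bytes64 = thresholds64.nbytes  # 8 bytes per threshold
bytes32 = thresholds32.nbytes  # 4 bytes per threshold
```

The saving applies to the threshold field alone, so the overall reduction for a fitted tree would be smaller than a factor of two.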

Versions

Any version >= 0.20.

@cmarmo
Member

cmarmo commented Aug 10, 2020

Hi @sdpython, thanks for reaching out, and sorry for the late answer. Are you interested in proposing a PR?

@NicolasHug
Member

@sdpython instead of node_value, do you mean node.threshold? I do not understand yet how the node's value (i.e. the prediction) relates to this threshold comparison.
