Skip to content

FIX Uses log2 in tree building#30557

Merged
jeremiedbb merged 6 commits into
scikit-learn:mainfrom
thomasjpfan:thomasjpfan/fix_log_tree
Jan 2, 2025
Merged

FIX Uses log2 in tree building#30557
jeremiedbb merged 6 commits into
scikit-learn:mainfrom
thomasjpfan:thomasjpfan/fix_log_tree

Conversation

@thomasjpfan

Copy link
Copy Markdown
Member

Reference Issues/PRs

Fixes #30554

What does this implement/fix? Explain your changes.

This PR replaces log with log2 and explicitly uses that in sklearn/tree/_partitioner.pyx, so it's more clear that we are using base 2.

@thomasjpfan thomasjpfan changed the title Adds whats new BUG: Uses log2 in tree building Dec 30, 2024
@thomasjpfan thomasjpfan changed the title BUG: Uses log2 in tree building FIX Uses log2 in tree building Dec 30, 2024
@github-actions

github-actions Bot commented Dec 30, 2024

Copy link
Copy Markdown

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 07c1b23. Link to the linter CI: here

@OmarManzoor OmarManzoor left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @thomasjpfan

@OmarManzoor OmarManzoor added Waiting for Second Reviewer First reviewer is done, need a second one! Quick Review For PRs that are quick to review labels Dec 30, 2024
@jeremiedbb

Copy link
Copy Markdown
Member

The name of the fragment should be 30557.fix.rst. Otherwise LGTM

@glemaitre

Copy link
Copy Markdown
Member

Do you think that it is worth a minimal non-regression test to acknowledge that we use the log2?

@ogrisel

ogrisel commented Dec 30, 2024

Copy link
Copy Markdown
Member

We might also want to delete the log function of https://github.com/scikit-learn/scikit-learn/blob/6385b7f7f4395f050cb4c85449fa3987031598c1/sklearn/tree/_utils.pyx#L65C1-L66C27 and update any remaining calls such as sklearn/tree/_criterion.pyx to cimport log2 from libc.math instead.

This change would better be done in another MAINT PR that would not be part of the 1.6.1 released fix.

@ogrisel

ogrisel commented Dec 30, 2024

Copy link
Copy Markdown
Member

Do you think that it is worth a minimal non-regression test to acknowledge that we use the log2?

I approved the PR but forgot about this point. I am not sure how to do that, but it would be great to find a way.

@thomasjpfan thomasjpfan left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a non-regression test that uses an example that fails on main and passes with this PR.

best.pos += best.n_missing


def _py_sort(float32_t[::1] feature_values, intp_t[::1] samples, intp_t n):

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this function to test sort directly.

@jeremiedbb jeremiedbb merged commit 19ef479 into scikit-learn:main Jan 2, 2025
jeremiedbb pushed a commit to jeremiedbb/scikit-learn that referenced this pull request Jan 8, 2025
jeremiedbb pushed a commit that referenced this pull request Jan 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cython module:tree Quick Review For PRs that are quick to review Waiting for Second Reviewer First reviewer is done, need a second one!

Projects

None yet

Development

Successfully merging this pull request may close these issues.

scikit-learn 1.6 changed behavior of growing trees

5 participants