Skip to content

FEA Support missing-values in ExtraTrees*#28268

Merged
OmarManzoor merged 171 commits intoscikit-learn:mainfrom
adam2392:extraforest
Jul 10, 2024
Merged

FEA Support missing-values in ExtraTrees*#28268
OmarManzoor merged 171 commits intoscikit-learn:mainfrom
adam2392:extraforest

Conversation

@adam2392
Copy link
Copy Markdown
Member

@adam2392 adam2392 commented Jan 25, 2024

Reference Issues/PRs

Follow-up to: #27966

Closes: #27931

What does this implement/fix? Explain your changes.

  • Adds support and tests for missing-values in the ExtraTrees*

Any other comments?

This should be a very easy review as this just activates the usage of missing value support and adds unit-test parameterizations for the ExtraTree Forest. Just requesting reviews from @thomasjpfan, @glemaitre and @OmarManzoor since you all were able to review the code in #27966. Hope this is easy :)

Waiting for #27966 to be merged first. This just shows the relevant code changes that introduces the missing-value support overall for ExtraTrees. Note the bulk of the work for going from tree->forest missing-value support was added in #26391. The main change here is to introduce certain unit tests.

  • benchmark script ran to demonstrate ExtraTrees* perform much better on data with missing values compared to naive imputation on main
  • benchmark script to compare ExtraTree* compared to naive imputation on main
Benchmarks on ExtraTrees demonstrating there is no runtime regression

results_image

Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
@github-actions
Copy link
Copy Markdown

github-actions bot commented Jan 25, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 963de46. Link to the linter CI: here

adam2392 added 13 commits July 6, 2024 11:40
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Copy link
Copy Markdown
Contributor

@OmarManzoor OmarManzoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @adam2392. Would it be suitable for ExtraTreesRegressor to be added to test_non_supported_criterion_raises_error_with_missing_values?

Comment thread sklearn/tree/_classes.py Outdated
Comment thread sklearn/tree/_classes.py Outdated
adam2392 and others added 4 commits July 9, 2024 08:09
Co-authored-by: Omar Salman <omar.salman2007@gmail.com>
Co-authored-by: Omar Salman <omar.salman2007@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
@adam2392
Copy link
Copy Markdown
Member Author

adam2392 commented Jul 9, 2024

Good point @OmarManzoor!

Added in 1441925

Signed-off-by: Adam Li <adam2392@gmail.com>
Copy link
Copy Markdown
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise, LGTM

Comment thread doc/whats_new/v1.6.rst Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ENH support for missing values in ExtraTrees

4 participants