Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH Add dtype preservation to FeatureAgglomeration #24346

Draft
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

IvanLauLinTiong
Copy link
Contributor

Reference Issues/PRs

In scope of #11000

What does this implement/fix? Explain your changes.

Implement dtype preservation to FeatureAgglomeration and its relevant tests

Any other comments?

@jeremiedbb
Copy link
Member

Thanks for the PR @IvanLauLinTiong ! looks good. I just removed one of the tests because it will be covered by a common test.

test_feature_agglomeration_numerical_consistency
@glemaitre glemaitre self-requested a review September 5, 2022 15:18
@glemaitre
Copy link
Member

@jeremiedbb In this case, we don't modify the tree builder algorithm to perform the 32 bits operations?

@IvanLauLinTiong
Copy link
Contributor Author

Thanks for the PR @IvanLauLinTiong ! looks good. I just removed one of the tests because it will be covered by a common test.

Okay thanks

@glemaitre glemaitre removed their request for review September 9, 2022 09:24
@IvanLauLinTiong
Copy link
Contributor Author

@jeremiedbb In this case, we don't modify the tree builder algorithm to perform the 32 bits operations?

@glemaitre @jeremiedbb
Hi, do I need to modify the tree builder algorithm to perform 32 bits operations ?

@IvanLauLinTiong
Copy link
Contributor Author

@glemaitre @jeremiedbb
Hello, any chances to help me review this? Thanks

@IvanLauLinTiong
Copy link
Contributor Author

IvanLauLinTiong commented Oct 25, 2022

Hi @jeremiedbb, could you advise me if there's any mistakes/or improvement that I still need to do for this PR? I'm happy to learn from mistakes and work more on this if there's a room for improvement.
Thanks
Ivan

@glemaitre
Copy link
Member

I discussed the problem with @ogrisel IRL. We agreed that we should also make sure that fit preserve dtype and that every computation should be done in float32.

It means that we need to check that every algorithms in TREE_BUILDERS needs to be modified.
It potentially leads to a large diff. Therefore, it could be easier to open a PR by algorithm and do it step by step. It will also be either to review.

@glemaitre glemaitre marked this pull request as draft November 3, 2022 10:54
@glemaitre
Copy link
Member

Converting into draft until it is ready to be reviewed.

@IvanLauLinTiong
Copy link
Contributor Author

I discussed the problem with @ogrisel IRL. We agreed that we should also make sure that fit preserve dtype and that every computation should be done in float32.

It means that we need to check that every algorithms in TREE_BUILDERS needs to be modified. It potentially leads to a large diff. Therefore, it could be easier to open a PR by algorithm and do it step by step. It will also be either to review.

Okay noted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants