-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add 2D matrices support for GapEncoder #185
Add 2D matrices support for GapEncoder #185
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the super fast implementation of this feature. I did a first full review (sorry, not as thorough as I would like, but it's late). I added some inline comments, and here are a few:
We are changing the public interface of GapEncoder in a backward incompatible way: before it would accept 1D array, and now it will throw an error. While I think that this is absolutely the right thing to do, we need to indicate this in a very clear way in our changelog, for instance by writing in bold in the beginning "Backward incompatible change to GapEncoder" and describing the change.
I think that we need to test all the public methods with 2D inputs: score, get_feature_names, transform. Just to make sure that we have no hidden problem.
Finally, I would suggest to change the default of SuperVectorizer to now use the GapEncoder (which is the motivation for this change). It will stress-test a lot this implementation, and will help us convince ourselves that it is functional.
I have made a pull request on your branch @LilianBoulard with many updates on the PR:
|
Absolutely awesome. @LilianBoulard , when you have time, can you merge @alexis-cvetkov 's branch in yours. We'll need the note in the CHANGES.rst (unless Alexis's branch implements it), and we might be good to go. |
Update PR Gap encoder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me !
OK, I commited a few changes directly in the PR and will merge once CI has ran. Thanks a lot!! |
Merging! Thanks!! |
Here it should be (n_samples, ) because we are in the GapEncoderColoumn class. The GapEncoder expects 2D arrays, but the GapEncoderColumn expect 1D arrays.
Thank you for your review. I will fix my mistake
|
Fixes #165