Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Repeating Basis Functions #171

Merged
merged 14 commits into from Aug 22, 2019
Merged

Conversation

RensDimmendaal
Copy link
Contributor

I worked on this a while ago in PR #147, but I started from a fresh branch because I decided to limit the scope only to repeating basis functions (#20), excluding the spanning basis functions (#29).

Feedback adapted so far:

  • Write a transformer when X contains only one column
  • Write a wrapper which selects one column, when X has more than one column
  • Added basic tests
  • Added docstring

To Do:

  • Write documentation

Could someone review the code?
I'll work on the documentation in the meantime as well.

Tagging people involved in PR #147: @kayhoogland @koaning @MBrouns

sklego/preprocessing.py Outdated Show resolved Hide resolved
sklego/preprocessing.py Outdated Show resolved Hide resolved
sklego/preprocessing.py Outdated Show resolved Hide resolved
@koaning
Copy link
Owner

koaning commented Jul 31, 2019

I've taken a few minutes to have a glance, will try to find some more time to look at the code later.

@RensDimmendaal
Copy link
Contributor Author

  • Adopted the docstring suggestions
  • Rename the floor/ceil parameters to input_range to clarify what it does
  • Added documentation as a notebook

Not sure why travis fails, seems to me like the issue is with other notebooks which i have not touched link.

@koaning
Copy link
Owner

koaning commented Aug 12, 2019

So there's a bug in a part of the code that has to do with the grouped-estimator. If that part of the library is causing this then I'll still gladly merge the work here.

@koaning
Copy link
Owner

koaning commented Aug 12, 2019

image

but this seems to be formatting on your side?

@koaning
Copy link
Owner

koaning commented Aug 12, 2019

But this you may safely ignore, this is something that I am picking up with @pim-hoeven.

image

@RensDimmendaal
Copy link
Contributor Author

Thanks. My bad on the formatting issue. Fixed it now.

image

If that other issue can be ignored then I think it is ready.

@koaning
Copy link
Owner

koaning commented Aug 12, 2019

Great. I'll have a look at the end of the day.

@koaning
Copy link
Owner

koaning commented Aug 13, 2019

image

Great that you took the effort of moving the preprocessing documentation into a notebook but is there a reason why only your code is evaluated?

@koaning
Copy link
Owner

koaning commented Aug 13, 2019

I am playing with the notebook and i see something odd.

image

image

When remainder should be dropping ... it isn't?

@koaning
Copy link
Owner

koaning commented Aug 13, 2019

I am having worries about the remainder behaviour. To me it feels like the base behaviour should be that it always drops the column. In my mind it would be used similar to this:

image

You would have a feature union split the steps and pass the correct column to the model. I think having passthrough behaviour can still be valid but I would argue that it goes against a base behaviour.

@MBrouns thoughts?

@RensDimmendaal
Copy link
Contributor Author

Docs:
Had to do a slight rewrite of the code snippet you screenshotted above, but I've now converted all snippets in the preprocessing docs to working code.

Regarding faulty "drop" behavior:
Shame on me, that's a bug. I've fixed it and added a test for it now as well.

Default behavior "drop"/"passthrough":
That's fair. It makes more sense to have "drop" as the default. That's also the default of the ColumnTransformer which I use to accomplish the behavior.

@MBrouns
Copy link
Collaborator

MBrouns commented Aug 14, 2019

regarding the drop / passthrough I think we should go for the principal of least surprise and to me it feels like the transformer is most like the one hot encoder. The OHE keeps all the non-categorical columns and simply adds the encoded columns so for me it would make most sense to have the same behaviour here.

On the other hand, the categorical_features attribute of the OHE is getting deprecated and users are told to use a ColumnTransformer instead, so I don't know..

@koaning
Copy link
Owner

koaning commented Aug 22, 2019

Oh yeah. I forgot to click the merge button. @RensDimmendaal well done!

@koaning koaning merged commit 4470ad0 into koaning:master Aug 22, 2019
@RensDimmendaal RensDimmendaal deleted the add-rbf branch August 23, 2019 06:21
@RensDimmendaal
Copy link
Contributor Author

RensDimmendaal commented Aug 23, 2019

Whoo! :-) Thanks for the help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants