Requested changes made #17

Merged 8 commits on Nov 24, 2017

chriswbartley
Collaborator

Christoph, I finally got around to making all those changes. I have done everything as requested, except for adding the 'Cs' parameter to fit(): I put it in the RuleFit constructor instead, to match the sklearn standard. Also, FYI, LassoCV uses alphas and n_alphas, so I had to convert Cs to alphas=1/Cs and set n_alphas as needed.
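
For readers following along, here is a minimal sketch of the Cs-to-alphas conversion described above. It assumes Cs follows the LogisticRegressionCV-style convention (an int means "how many values to try", a list means explicit inverse regularisation strengths). The helper name `cs_to_lasso_kwargs` and the fallback of 100 alphas are illustrative assumptions, not the code in this PR.

```python
def cs_to_lasso_kwargs(Cs=None):
    """Hypothetical helper: map a `Cs` setting onto LassoCV-style arguments.

    Returns a dict with `alphas` and `n_alphas` keys. `alphas` is None
    when the estimator should pick its own grid of `n_alphas` values.
    """
    if Cs is None:
        # Assumed fallback: let the estimator choose 100 alphas itself.
        return {"alphas": None, "n_alphas": 100}
    if isinstance(Cs, int):
        # An int is read as the number of regularisation values to try.
        return {"alphas": None, "n_alphas": Cs}
    # A sequence is read as explicit inverse strengths, so alpha = 1 / C.
    alphas = [1.0 / c for c in Cs]
    return {"alphas": alphas, "n_alphas": len(alphas)}


print(cs_to_lasso_kwargs([0.5, 1, 4]))
```

The returned dict could then be unpacked into the estimator's constructor, which keeps the conversion in one place.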

Hopefully I haven't missed anything.

Cheers, Chris

chriswbartley and others added 8 commits September 1, 2017 14:00
- Fix: uses binomial (log) loss for classification
- Added: use of Friedman standardisation on linear variables (Winsorised
and scaled by 0.4/stdev)
- Added: use of Friedman randomisation of number of terminal nodes using
exponential distribution
- Fixed: use of set for rules sometimes caused coefficients to be
associated with the wrong rules! Rules are now stored as a list (i.e.
ordered)
- Added ability for certain features to be constrained monotone
(upcoming paper!)
- Improved: sped up prediction by not evaluating rules with zero
coefficients
- Added: Max rules parameter like Friedman
- Added: Invisible use of BoostingRegressor/Classifier (created
according to constructor parameters)
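
The "Friedman standardisation" item in the commit message above (Winsorise, then scale by 0.4/stdev) can be sketched roughly as follows. This is an illustration only: the function name `winsorise_and_scale`, the 0.025 trimming quantile, and the simple nearest-rank quantile rule are assumptions, and the PR's actual implementation may differ.

```python
import statistics


def winsorise_and_scale(values, trim_quantile=0.025):
    """Hypothetical sketch: clip a feature to its lower/upper quantiles,
    then rescale so its standard deviation becomes 0.4."""
    ordered = sorted(values)
    n = len(ordered)
    # Nearest-rank quantiles, kept symmetric around the middle (a simplification).
    lower_idx = int(trim_quantile * (n - 1))
    upper_idx = (n - 1) - lower_idx
    lower, upper = ordered[lower_idx], ordered[upper_idx]

    # Winsorise: pull extreme values in to the quantile limits.
    clipped = [min(upper, max(lower, v)) for v in values]

    # Scale by 0.4 / stdev of the Winsorised values (skip constant features).
    sd = statistics.pstdev(clipped)
    scale = 0.4 / sd if sd > 0 else 1.0
    return [scale * v for v in clipped]


feature = list(range(100)) + [10_000]  # one gross outlier
scaled = winsorise_and_scale(feature)
print(round(statistics.pstdev(scaled), 6))
```

Clipping first means the single outlier no longer dominates the standard deviation used for scaling, which is the point of Winsorising before putting linear terms on a scale comparable to the rules.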
Added a lot of features to make it more like the original paper, and the
interface like Friedman's R implementation
(http://statweb.stanford.edu/~jhf/r-rulefit/RuleFit_help.html)
- Added: binomial (log) loss for classification (using glmnet_py)
- Added: use of Friedman standardisation on linear variables (Winsorised
and scaled by 0.4/stdev)
- Added: use of Friedman randomisation of number of terminal nodes using
exponential distribution
- Fixed: use of a set (i.e. unordered) for rules sometimes caused
coefficients to be
associated with the wrong rules! Rules are now stored as a list (i.e.
ordered)
- Improved: sped up prediction by not evaluating rules with zero
coefficients
- Added: Max rules parameter like Friedman
- Added: Invisible use of BoostingRegressor/Classifier (created
according to constructor parameters, like Friedman's R implementation)
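
As a rough illustration of the "randomisation of number of terminal nodes using exponential distribution" item above: each tree's size can be drawn as 2 plus the floor of an exponential variate whose mean is the target average size minus 2. The function name `sample_terminal_nodes` and its parameters are assumptions made for this sketch, not names taken from the PR.

```python
import math
import random


def sample_terminal_nodes(mean_nodes, rng):
    """Hypothetical sketch: draw a tree size of 2 + floor(gamma), where
    gamma is exponential with mean (mean_nodes - 2)."""
    if mean_nodes <= 2:
        # A tree needs at least two terminal nodes (a single split).
        return 2
    # random.expovariate takes the rate, i.e. 1 / mean.
    gamma = rng.expovariate(1.0 / (mean_nodes - 2))
    return 2 + math.floor(gamma)


rng = random.Random(0)  # seeded so repeated runs draw the same sizes
sizes = [sample_terminal_nodes(4, rng) for _ in range(10)]
print(sizes)
```

Most draws give small trees, with an occasional larger one, so the ensemble mixes simple rules with a few higher-order interactions.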
Updated comments to describe rulefit constructor. Wherever possible it
now matches Friedman's R library
(http://statweb.stanford.edu/~jhf/r-rulefit/RuleFit_help.html).
Removed some testing guff
@chriswbartley
Collaborator Author

Oops, I'm new to GitHub. I just realised I didn't need to close the last pull request for it to update...

@christophM
Owner

Great! Thanks a lot for this contribution.

@christophM christophM merged commit 20350e9 into christophM:master Nov 24, 2017