Bugfixes and change to default behavior of get_rules #18

dchristle · 2018-03-02T19:14:02Z

I fixed two bugs I ran into:

If a feature column is constant, its standard deviation is zero, and the Friedman scaling step fails with a divide-by-zero error. I added a small constant to prevent the division by zero.
If Cs is passed to RuleFit when instantiating the object, it is not forwarded properly to LogisticRegressionCV: the code references Cs where it should reference self.Cs.
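
To make the two fixes concrete, here is a minimal sketch of both failure modes. It is not the library's code: friedman_scale, RuleFitSketch, the method names, and the value of EPS are all hypothetical, and the 0.4 target is an assumption based on the usual Friedman and Popescu scaling of linear terms.

```python
from statistics import pstdev

EPS = 1e-12  # hypothetical value; the PR only says "a small constant"


def friedman_scale(column, eps=EPS):
    """Rescale one feature column to a standard deviation of about 0.4."""
    std = pstdev(column)
    # With eps=0, a constant column gives std == 0 and this line raises
    # ZeroDivisionError.
    multiplier = 0.4 / (std + eps)
    return [multiplier * x for x in column]


class RuleFitSketch:
    """Toy stand-in for the estimator; it only stores and reads Cs."""

    def __init__(self, Cs=None):
        self.Cs = Cs

    def regularization_grid_buggy(self):
        # A bare `Cs` is not the instance attribute. Inside this method it is
        # an undefined name, so the call raises NameError.
        return Cs  # noqa: F821

    def regularization_grid(self):
        # Fixed version: read the value stored on the instance.
        return self.Cs


varying = friedman_scale([1.0, 2.0, 3.0])
constant = friedman_scale([5.0, 5.0, 5.0])  # finite, but very large values
model = RuleFitSketch(Cs=[0.1, 1.0, 10.0])
grid = model.regularization_grid()
```

Note that with the small constant in place a constant column no longer raises, but its scaled values become very large, so dropping constant features before fitting may still be worth considering.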

I also ran into inconsistent behavior between get_rules and transform. transform returned a matrix with 116 columns, corresponding to 116 transformed features, but get_rules returned only 115 rules. This was pretty frustrating, but I tracked the issue down to exclude_zero_coef being set to True, which eliminated one of the rules. I think get_rules and transform should behave identically: variables with a zero coefficient should be eliminated from both or from neither. This change at least makes the two consistent.
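
The mismatch can be reproduced with a toy model. The sketch below is a simplified stand-in, not the library's implementation: the rule names, predicates, and coefficients are made up, and it assumes the new default is exclude_zero_coef=False.

```python
# Hypothetical rules: (description, predicate over one input row).
RULES = [
    ("x0 > 1", lambda row: row[0] > 1),
    ("x1 <= 3", lambda row: row[1] <= 3),
    ("x0 > 1 & x1 > 3", lambda row: row[0] > 1 and row[1] > 3),
]
COEFS = [0.7, 0.0, -1.2]  # the second rule was zeroed out by the L1 penalty


def transform_row(row, rules=RULES):
    """One output column per rule, regardless of its coefficient."""
    return [int(predicate(row)) for _, predicate in rules]


def get_rules(rules=RULES, coefs=COEFS, exclude_zero_coef=False):
    """Return (description, coefficient) pairs, optionally dropping zeros."""
    pairs = [(name, coef) for (name, _), coef in zip(rules, coefs)]
    if exclude_zero_coef:
        pairs = [(name, coef) for name, coef in pairs if coef != 0]
    return pairs


columns = transform_row([2, 5])
all_rules = get_rules()                           # lines up with the columns
nonzero_rules = get_rules(exclude_zero_coef=True)  # one entry shorter
```

With exclude_zero_coef=False the i-th entry of get_rules describes the i-th transformed column; with True the indices shift after the first dropped rule, which is the off-by-one described above.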

…set to true, the behavior between transform (which does not appear to drop coefficients with 0) and the behavior of get_rules are different, which was very confusing to me. I could not understand why the transformed output had one more row than the total number of decision tree rules I retrieved using the get_rules method.

christophM · 2018-03-25T08:51:09Z

Thanks for the fixes!

chriswbartley · 2018-03-26T01:03:36Z

Seconded, thank you David.

David Christle and others added 4 commits March 2, 2018 10:56

Added a small constant to prevent division by zero if the input feature has zero variance.

cfe986b

Wrong variable passed to LogisticRegressionCV. If self.CS is set when creating a RuleFit object, this fails since self.Cs exists and Cs is null.

b6d14ee

cleanup of spacing

5f9ff45

christophM merged commit 0b65f7a into christophM:master Mar 25, 2018

danallison mentioned this pull request Jun 18, 2018

Fix default Cs value #21

Merged
