Enabling higher orders feature importance for F filter and LR filter #509

zhenyuz0500 · 2022-05-04T23:31:18Z

Proposed changes

Enable higher-order feature importance for the F filter and LR filter.

Previously, the F filter and LR filter could only detect uplift feature importance based on a linear pattern.
This diff adds 2nd- and 3rd-order feature transformations to the feature selection model.
A new argument, 'order', is added to the F filter and LR filter methods to control the highest order of feature terms included in the evaluation; it takes a value of 1, 2, or 3.
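
As a rough illustration of what 'order' controls, here is a minimal sketch of how the regressors could be laid out for a single feature x and a binary treatment indicator. The helper name `build_design_matrix` and the toy data are hypothetical and are not the code in filters.py; the column ordering (intercept, treatment, then a main-effect and interaction pair for each power of x) is an assumption.

```python
import numpy as np


def build_design_matrix(x, treatment, order=1):
    """Hypothetical helper: stack [1, T, x, x*T, x^2, x^2*T, ...] column-wise."""
    if order not in (1, 2, 3):
        raise ValueError("order must be 1, 2, or 3")
    x = np.asarray(x, dtype=float)
    treatment = np.asarray(treatment, dtype=float)
    columns = [np.ones_like(x), treatment]
    for power in range(1, order + 1):
        x_pow = x ** power
        columns.append(x_pow)              # main effect of x^power
        columns.append(x_pow * treatment)  # interaction term carrying the uplift signal
    return np.column_stack(columns)


# Toy data (made up for illustration only).
x = np.array([1.0, 2.0, 3.0])
treatment = np.array([0, 1, 1])
X = build_design_matrix(x, treatment, order=2)
print(X.shape)  # (3, 6): 2 base columns plus 2 columns per order
```

With order=1 only the x and x * treatment columns are added, which corresponds to the previous linear-only behavior.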

The example notebook for feature selection is updated to show how to use this new feature.

Types of changes

What types of changes does your code introduce to CausalML?
Put an x in the boxes that apply

Bugfix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation Update (if none of the other choices apply)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

I have read the CONTRIBUTING doc
I have signed the CLA
Lint and unit tests pass locally with my changes
I have added tests that prove my fix is effective or that my feature works
I have added necessary documentation (if appropriate)
Any dependent changes have been merged and published in downstream modules

Further comments

If this is a relatively large or complex change, kick off the discussion by explaining why you chose the solution you did and what alternatives you considered, etc. This PR template is adopted from appium.

paullo0106 · 2022-05-06T19:13:51Z

causalml/feature_selection/filters.py

+        elif order == 2:
+            F_test = result.f_test(np.array([[0, 0, 0, 1, 0, 0], [0, 0, 0, 0, 0, 1]]))
+        elif order == 3:
+            F_test = result.f_test(


@zhenyuz0500 can you remind me why the r_matrix is configured this way? Maybe we can add a comment here too.

Each row of the r_matrix defines a linear combination of the coefficients, and the null hypothesis of the F-test is that every such combination equals 0.
For example, when order=2, the linear model is: a + b1 * I_treatment + b2 * x + b3 * x * I_treatment + b4 * x^2 + b5 * x^2 * I_treatment
We want to test H0: b3 == 0 and b5 == 0 vs. H1: b3 != 0 or b5 != 0,
which translates to testing H0: [0, 0, 0, 1, 0, 0] * [a, b1, b2, b3, b4, b5]' = 0 and [0, 0, 0, 0, 0, 1] * [a, b1, b2, b3, b4, b5]' = 0.

reference: https://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.RegressionResults.f_test.html

I see, thanks!

thanks for the review!

paullo0106

LGTM with one question, thanks Zhenyu!

zhenyuz0500 added 2 commits May 5, 2022 07:27

Enabling higher orders feature importance for F filter and LR filter

598d25e

reformatting filters file

77a0f37

zhenyuz0500 requested review from paullo0106 and t-tte May 6, 2022 19:01

paullo0106 reviewed May 6, 2022

paullo0106 approved these changes May 6, 2022

zhenyuz0500 merged commit 5eca506 into master May 9, 2022


zhenyuz0500 commented May 4, 2022

paullo0106 May 6, 2022

zhenyuz0500 May 6, 2022 •

edited

paullo0106 May 7, 2022

zhenyuz0500 May 7, 2022

paullo0106 left a comment
