minimal fix to resolve #707 #720

ras44 · 2023-11-29T23:42:18Z

Proposed changes

This is a minimal fix for the issue described in #707. It adds assertions to get_cumlift, get_qini, get_tmlegain, and get_tmleqini that check that the required columns of the input dataframe contain no null values. An example test for get_cumlift is included.
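For readers who want to see the kind of check involved, here is a minimal sketch in plain pandas. The helper name `assert_no_nulls` and the column names `y` and `w` are hypothetical and exist only for illustration. The actual assertion in causalml/metrics/visualize.py combines the outcome, treatment, and treatment-effect columns differently, so treat this as an approximation of the idea rather than the merged code.

```python
import numpy as np
import pandas as pd


def assert_no_nulls(df, required_cols):
    """Hypothetical helper: fail fast if a required column is missing or has NaNs."""
    for col in required_cols:
        assert col in df.columns, f"missing required column: {col}"
        # Series.notnull().all() reduces the column to a single boolean.
        assert df[col].notnull().all(), f"column {col!r} contains null values"


clean = pd.DataFrame({"y": [1.0, 0.0, 1.0], "w": [1, 0, 1]})
dirty = pd.DataFrame({"y": [1.0, np.nan, 1.0], "w": [1, 0, 1]})

assert_no_nulls(clean, ["y", "w"])  # passes silently

# A test in the spirit of the one added for get_cumlift: NaNs must be rejected.
try:
    assert_no_nulls(dirty, ["y", "w"])
    raised = False
except AssertionError:
    raised = True
```

Failing with an explicit assertion surfaces the bad input at the call site, instead of letting NaNs propagate silently into the cumulative metrics.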

Types of changes

What types of changes does your code introduce to CausalML?
Put an x in the boxes that apply

Bugfix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation Update (if none of the other choices apply)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

I have read the CONTRIBUTING doc
I have signed the CLA
Lint and unit tests pass locally with my changes
I have added tests that prove my fix is effective or that my feature works
I have added necessary documentation (if appropriate)
Any dependent changes have been merged and published in downstream modules

Further comments

None

jeongyoonlee · 2023-12-01T17:33:22Z

causalml/metrics/visualize.py

+        (outcome_col in df.columns and df[[outcome_col]].notnull().all().bool())
+        and (treatment_col in df.columns and df[[treatment_col]].notnull().all().bool())
+        or (
+            treatment_effect_col in df.columns
+            and df[[treatment_effect_col]].notnull().all().bool()


The return type of all() is already bool. It doesn't need bool().
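As a side note on this point, whether `.all()` returns a plain boolean depends on what it is called on. In the diff above it is called on `df[[col]]`, a one-column DataFrame, where `.all()` returns a Series with one entry per column. Selecting with `df[col]` yields a Series, and there `.all()` does return a single boolean. The sketch below uses a made-up column `y` to show the difference, which is consistent with the later commits "remove .bool()" and "operate on series not df".

```python
import pandas as pd

df = pd.DataFrame({"y": [1.0, 0.0, 1.0]})  # "y" is a hypothetical column name

# Double brackets select a one-column DataFrame: .all() reduces per column
# and returns a Series indexed by column name.
frame_result = df[["y"]].notnull().all()

# Single brackets select a Series: .all() reduces to one boolean value.
series_result = df["y"].notnull().all()

is_series = isinstance(frame_result, pd.Series)
is_scalar = not isinstance(series_result, pd.Series)
```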

What about adding the skipna=False as an input argument so users can explicitly choose to allow NaNs?

jeongyoonlee

Let's merge this PR as is and open a new PR for adding an option for skipna=False. Thanks for the contribution.

ras44 · 2023-12-05T15:44:26Z

> Let's merge this PR as is and open a new PR for adding an option for skipna=False. Thanks for the contribution.

I took a quick look at this and am not sure how the skipna=False argument should work:

- the current assertions ensure that there are no NaNs in the required columns
- passing skipna to .cumsum() therefore seems redundant, since there will never be any NaNs

https://github.com/uber/causalml/blob/d26e0891f3aa6279e359fe6abe1dd77e1f125c8a/causalml/metrics/visualize.py#L115C5-L115C5

Is the idea for skipna to allow NaNs in the dataframe? So:

if skipna == True then
- skip the assertion check and calculate .cumsum(skipna=True)
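One possible reading of that proposal, sketched in plain pandas: the function name `cumulative_outcome`, its parameters, and the column `y` are hypothetical and not part of CausalML. The sketch assumes skipna=False keeps the strict assertion and skipna=True skips it and lets `Series.cumsum` handle the NaNs.

```python
import numpy as np
import pandas as pd


def cumulative_outcome(df, outcome_col="y", skipna=False):
    """Hypothetical sketch: strict by default, NaN-tolerant when skipna=True."""
    if not skipna:
        # Strict mode: same spirit as the assertions added in this PR.
        assert df[outcome_col].notnull().all(), f"{outcome_col} contains null values"
    # With skipna=True, pandas leaves NaN at the missing position but keeps
    # accumulating over the remaining values.
    return df[outcome_col].cumsum(skipna=skipna)


df = pd.DataFrame({"y": [1.0, np.nan, 2.0]})

lenient = cumulative_outcome(df, skipna=True)

try:
    cumulative_outcome(df, skipna=False)
    strict_raised = False
except AssertionError:
    strict_raised = True
```

Under this reading the assertion and the skipna flag are not redundant: the flag decides whether the assertion runs at all.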

rolandrmgservices added 2 commits November 29, 2023 23:20

minimal fix to resolve uber#707

1616f3a

lint

1354106

ras44 requested review from jeongyoonlee and vincewu51 November 29, 2023 23:42

jeongyoonlee reviewed Dec 1, 2023


rolandrmgservices added 2 commits December 1, 2023 18:59

remove .bool()

53adee1

operate on series not df

d610793

jeongyoonlee approved these changes Dec 1, 2023


jeongyoonlee merged commit ae6c28e into uber:master Dec 1, 2023
6 checks passed

jpansnap mentioned this pull request Dec 1, 2023

Fix missing value when calculate cumulative lift #707

Closed

