minimal fix to resolve #707 #720

Merged
4 commits merged into uber:master on Dec 1, 2023

Conversation

ras44
Collaborator

@ras44 ras44 commented Nov 29, 2023

Proposed changes

This is a minimal fix for the issue described in #707. It adds assertions to get_cumlift, get_qini, get_tmlegain, and get_tmleqini that check that the required columns of the input dataframes contain no null values. An example test for get_cumlift is included.
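
A minimal sketch of the kind of null check described above, assuming pandas. The helper name `assert_no_nulls` and the column names `y` and `w` are hypothetical and are not taken from the CausalML source; the actual assertions in the PR may be written differently.

```python
import numpy as np
import pandas as pd


def assert_no_nulls(df, cols):
    """Hypothetical helper: fail fast if a required column is missing or has nulls."""
    for col in cols:
        assert col in df.columns, f"missing required column: {col}"
        assert df[col].notnull().all(), f"column {col!r} contains null values"


clean = pd.DataFrame({"y": [1.0, 0.0, 1.0], "w": [1, 0, 1]})
dirty = pd.DataFrame({"y": [1.0, np.nan, 1.0], "w": [1, 0, 1]})

assert_no_nulls(clean, ["y", "w"])  # passes silently

# The same check on a frame with a NaN in a required column raises.
try:
    assert_no_nulls(dirty, ["y", "w"])
    null_check_failed = False
except AssertionError:
    null_check_failed = True
```

A test along these lines would assert that the clean frame passes and that the frame with a NaN raises an AssertionError.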

Types of changes

What types of changes does your code introduce to CausalML?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING doc
  • I have signed the CLA
  • Lint and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules

Further comments

None

Comment on lines 81 to 85
(outcome_col in df.columns and df[[outcome_col]].notnull().all().bool())
and (treatment_col in df.columns and df[[treatment_col]].notnull().all().bool())
or (
treatment_effect_col in df.columns
and df[[treatment_effect_col]].notnull().all().bool()
Collaborator


The return type of all() is already bool. It doesn't need bool().
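
As a small illustration of this point, assuming pandas: whether all() already yields a single boolean seems to depend on how the column is selected. With double brackets, as in the snippet under review, the selection is a one-column DataFrame and all() returns a one-element boolean Series; with single brackets it reduces to a scalar. The column name `y` is made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"y": [1.0, 0.0, 1.0]})

# Double brackets select a one-column DataFrame, so .all() reduces per column
# and returns a boolean Series with a single entry.
per_column = df[["y"]].notnull().all()

# Single brackets select a Series, so .all() reduces to one boolean value.
scalar = df["y"].notnull().all()

is_series = isinstance(per_column, pd.Series)
is_scalar = isinstance(scalar, (bool, np.bool_))
```

Selecting with single brackets would make the trailing bool() unnecessary, which also sidesteps Series.bool(), a method deprecated in recent pandas releases.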

Collaborator


What about adding the skipna=False as an input argument so users can explicitly choose to allow NaNs?

Collaborator

@jeongyoonlee jeongyoonlee left a comment


Let's merge this PR as is and open a new PR for adding an option for skipna=False. Thanks for the contribution.

@jeongyoonlee jeongyoonlee merged commit ae6c28e into uber:master Dec 1, 2023
6 checks passed
Collaborator Author

ras44 commented Dec 5, 2023

> Let's merge this PR as is and open a new PR for adding an option for skipna=False. Thanks for the contribution.

I took a quick look at this and am not sure how the skipna=False argument should work:

  • the current assertions ensure that there are no NaNs in the required columns
  • therefore passing skipna to .cumsum() seems redundant, since there will never be NaNs

https://github.com/uber/causalml/blob/d26e0891f3aa6279e359fe6abe1dd77e1f125c8a/causalml/metrics/visualize.py#L115C5-L115C5

Is the idea for skipna to allow NaNs in the dataframe? So:

  • if skipna==True then
    • skip assertion check and calculate .cumsum(skipna)
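
To make the question concrete, here is a rough sketch, assuming pandas, of how cumsum treats NaNs under each skipna setting, followed by one possible shape for the option being discussed. The function `cumulative_outcome` and its `allow_nan` flag are hypothetical illustrations, not part of CausalML.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 2.0])

# skipna=True (the pandas default): the NaN position stays NaN,
# but the running sum continues past it.
skipping = s.cumsum(skipna=True)

# skipna=False: every value from the first NaN onward becomes NaN.
propagating = s.cumsum(skipna=False)


def cumulative_outcome(df, col, allow_nan=False):
    """Hypothetical wrapper: `allow_nan` is an illustrative flag only."""
    if not allow_nan:
        # Mirror the strict behaviour: refuse input with nulls.
        assert df[col].notnull().all(), f"column {col!r} contains null values"
    # When NaNs are allowed, skip them while accumulating.
    return df[col].cumsum(skipna=True)


lenient = cumulative_outcome(pd.DataFrame({"y": s}), "y", allow_nan=True)
```

Under this reading, the assertion and the skipna argument are alternatives: the strict path never sees NaNs, so skipna only matters once the check is bypassed.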


3 participants