WeibullAFTFitter does not ignore the index of the DataFrame and errors (NaNs were detected in the dataset.) #1215

Closed
joshlk opened this issue Feb 4, 2021 · 2 comments · Fixed by #1216

joshlk commented Feb 4, 2021

Show by example:

from lifelines import WeibullAFTFitter
import numpy as np
import pandas as pd

data = pd.DataFrame({'duration': np.arange(1, 11), 'event': np.ones(10), 'x0': np.arange(1, 11)})

regr = WeibullAFTFitter()
regr.fit(data, duration_col='duration', event_col='event')

That works fine. But if I take a row subset of the data, so that the DataFrame index is no longer a contiguous 0..n-1 range:

sub_data = data[[True, False]*5]
regr = WeibullAFTFitter()
regr.fit(sub_data, duration_col='duration', event_col='event')

It produces the following error, even though there are no inf or NaN values in the data:

TypeError: NaNs were detected in the dataset. Try using pd.isnull to find the problematic values.

It seems like the function is not ignoring the DataFrame index.
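The error message points to pd.isnull, so a quick check along the following lines can confirm that the subset itself is clean and that only its index differs. This is a minimal sketch using pandas and NumPy only; the variable names index_labels, has_nan and has_inf are illustrative and not part of lifelines.

```python
import numpy as np
import pandas as pd

# Same toy data as in the report above.
data = pd.DataFrame({
    "duration": np.arange(1, 11),
    "event": np.ones(10),
    "x0": np.arange(1, 11),
})

# Boolean row selection keeps the original labels of the selected rows.
sub_data = data[[True, False] * 5]

index_labels = sub_data.index.tolist()
has_nan = bool(pd.isnull(sub_data).any().any())
has_inf = bool(np.isinf(sub_data.to_numpy(dtype=float)).any())

print(index_labels)  # the index now has gaps
print(has_nan, has_inf)
```

If both flags come back False while the index has gaps, the values themselves are not the problem, which is consistent with the index being the trigger.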

Versions:

  • Lifelines: 0.25.8
  • Pandas: 1.1.5

CamDavidsonPilon (Owner) commented

Hi @joshlk, yup, looks like a bug. I think the source is formulaic, however. Ex:

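import numpy as np
import pandas as pd
from formulaic import Formula

# Intercept-only formula
design_info = Formula("1")

# Index with gaps, like the one left behind by a row subset
df = pd.DataFrame(np.arange(5), index=[0, 2, 4, 6, 8])

print(design_info.get_model_matrix(df))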
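For readers wondering how an index with gaps can turn into NaNs at all: one general pandas mechanism is label alignment, where combining a default-indexed column with a column that kept its original labels fills the non-matching labels with NaN. The sketch below shows that mechanism with plain pandas. Whether this is what happens inside formulaic is an assumption; the thread does not confirm it, and the names intercept and combined are illustrative.

```python
import numpy as np
import pandas as pd

# A frame whose index has gaps, as in the example above.
df = pd.DataFrame({"x": np.arange(5)}, index=[0, 2, 4, 6, 8])

# A freshly built column gets the default index 0..4.
# (Assumption: this only mimics how an intercept column *might* be created.)
intercept = pd.Series(np.ones(len(df)), name="Intercept")

# concat along columns aligns on index labels, not on position,
# so labels present on only one side are filled with NaN.
combined = pd.concat([intercept, df["x"]], axis=1)

nan_count = int(combined.isnull().sum().sum())
print(combined)
print(nan_count)
```

The result has one row per label in the union of both indexes, and every label missing from one side shows up as a NaN even though neither input contained any.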

I'll open an issue with them to resolve this. For now, you can either downgrade lifelines to an earlier version, say 0.25.7 (we only recently swapped to Formulaic), or call .reset_index(drop=True) prior to fitting:

sub_data = data[[True, False]*5].reset_index(drop=True)
regr = WeibullAFTFitter()
regr.fit(sub_data, duration_col='duration', event_col='event')
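As a quick sanity check on the workaround, the sketch below confirms that reset_index(drop=True) only relabels the rows and leaves the values untouched. It uses pandas and NumPy only, so it does not exercise lifelines itself; the names reset, index_is_default and values_unchanged are illustrative.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    "duration": np.arange(1, 11),
    "event": np.ones(10),
    "x0": np.arange(1, 11),
})
sub_data = data[[True, False] * 5]

# drop=True discards the old labels instead of adding them as a column.
reset = sub_data.reset_index(drop=True)

index_is_default = reset.index.equals(pd.RangeIndex(len(reset)))
values_unchanged = np.array_equal(reset.to_numpy(), sub_data.to_numpy())
columns_unchanged = list(reset.columns) == list(sub_data.columns)

print(index_is_default, values_unchanged, columns_unchanged)
```

Without drop=True the old labels would be kept as an extra column named index, which would then be treated as a covariate when fitting.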

CamDavidsonPilon commented Feb 5, 2021

👋 Formulaic just released 0.2.2. Try upgrading locally with pip install formulaic==0.2.2 to fix this.
