Unique baseline hazard for each strata #268
cp = CoxPHFitter(normalize=False)
cp.fit(rossi, 'week', 'arrest', strata=['race', 'paro', 'mar', 'wexp'], include_likelihood=True)
npt.assert_almost_equal(cp.baseline_cumulative_hazard_[(0, 0, 0, 0)].ix[[14, 35, 37, 43, 52]].values, [0.28665890, 0.63524149, 1.01822603, 1.48403930, 1.48403930], decimal=2)
npt.assert_almost_equal(cp.baseline_cumulative_hazard_[(0, 0, 0, 1)].ix[[27, 43, 48, 52]].values, [0.35738173, 0.76415714, 1.26635373, 1.26635373], decimal=2)
The baseline hazards are only slightly off, so the errors accumulate in the cumulative hazard. I'd like to understand why my values are slightly different. @IVANBARRIENTOS, is there a way I can access the non-cumulative hazards?
Also: are there any other tests you would recommend?
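One way to look at the non-cumulative hazards is to difference the cumulative series. The sketch below assumes the cumulative baseline hazard for one stratum is available as a pandas Series indexed by time; the values are the R reference numbers quoted in the test above, and since they cover only selected time points, each increment spans the gap between listed times rather than a single event time.

```python
import pandas as pd

# Cumulative baseline hazard for one stratum at a few selected times.
# Values are the R reference numbers from the test; the Series layout is an assumption.
cumulative = pd.Series(
    [0.28665890, 0.63524149, 1.01822603, 1.48403930, 1.48403930],
    index=[14, 35, 37, 43, 52],
    name="baseline_cumulative_hazard",
)

# Each increment is the difference from the previous row;
# the first increment is the first cumulative value itself.
increments = cumulative.diff().fillna(cumulative.iloc[0])

# Summing the increments recovers the cumulative hazard.
recovered = increments.cumsum()
```

Comparing increments rather than cumulative values makes it easier to see whether a discrepancy appears at one time point or is spread evenly across all of them.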
s_0 = self.baseline_survival_
col = _get_index(X)
return pd.DataFrame(-np.dot(np.log(s_0), v.T), index=self.baseline_survival_.index, columns=col)
if self.strata:
I've moved this logic out of the predict_survival function and into the "higher up" function.
LGTM!
return pd.DataFrame(-np.dot(np.log(s_0), v.T), index=self.baseline_survival_.index, columns=col)
if self.strata:
cumulative_hazard_ = pd.DataFrame()
for stratum, stratified_X in X.groupby(self.strata):
cute use of groupby here
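For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of grouping the covariates by stratum and scaling each stratum's own baseline by the partial hazard. It is not the PR's implementation: the strata columns, coefficients, baseline values, and the function name `predict_cumulative_hazard_sketch` are all made up for illustration.

```python
import numpy as np
import pandas as pd

strata = ["race", "wexp"]                        # hypothetical strata columns
beta = pd.Series({"age": -0.05, "prio": 0.10})   # made-up coefficients

# Made-up cumulative baseline hazards, one Series per stratum, indexed by time.
times = pd.Index([10, 20], name="week")
baseline = {
    (0, 0): pd.Series([0.1, 0.3], index=times),
    (0, 1): pd.Series([0.2, 0.5], index=times),
}

X = pd.DataFrame(
    {"race": [0, 0, 0], "wexp": [0, 1, 0], "age": [25, 30, 40], "prio": [1, 0, 3]},
    index=["a", "b", "c"],
)

def predict_cumulative_hazard_sketch(X):
    pieces = []
    # groupby yields one (stratum key, sub-frame) pair per stratum present in X.
    for stratum, stratified_X in X.groupby(strata):
        # exp(x . beta) for every subject in this stratum.
        partial_hazard = np.exp(stratified_X[beta.index].values @ beta.values)
        h0 = baseline[stratum]
        # Rows are times, columns are subjects.
        pieces.append(
            pd.DataFrame(
                np.outer(h0.values, partial_hazard),
                index=h0.index,
                columns=stratified_X.index,
            )
        )
    # Put the subjects back in their original order.
    return pd.concat(pieces, axis=1)[X.index]

result = predict_cumulative_hazard_sketch(X)
```

Each subject's column is computed from the baseline of that subject's own stratum only, which is the point of stratification.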
@@ -306,7 +308,7 @@ def test_both_concordance_index_function_deal_with_ties_the_same_way():
actual_times = np.array([1, 1, 2])
predicted_times = np.array([1, 2, 3])
obs = np.ones(3)
assert fast_cindex(actual_times, predicted_times, obs) == slow_cindex(actual_times, predicted_times, obs) == 1.0
assert fast_cindex(actual_times, predicted_times, obs) == slow_cindex(actual_times, predicted_times, obs) == 1.0
all white space changes in this file
cp = CoxPHFitter(normalize=False)
cp.fit(rossi, 'week', 'arrest', strata=['race', 'paro', 'mar', 'wexp'])
npt.assert_almost_equal(cp.baseline_cumulative_hazard_[(0, 0, 0, 0)].ix[[14, 35, 37, 43, 52]].values, [0.28665890, 0.63524149, 1.01822603, 1.48403930, 1.48403930], decimal=2)
npt.assert_almost_equal(cp.baseline_cumulative_hazard_[(0, 0, 0, 1)].ix[[27, 43, 48, 52]].values, [0.35738173, 0.76415714, 1.26635373, 1.26635373], decimal=2)
The baseline hazards are only slightly off, so the errors accumulate in the cumulative hazard. I'd like to understand why my values are slightly different. @IVANBARRIENTOS, is there a way I can access the non-cumulative hazards?
Also: are there any other tests you would recommend?
Ah, I think it's because my estimates of beta are slightly off, and this is just a manifestation of that.
Opened an issue here: #272
…nate into nan values in the strata in which you're estimating the survival curve
33f680d to 74b865d
Thanks for the updates! Yes, it looks like it's in the betas. One potential cause: lifelines normalizes by default, whereas R does not. I'll dig into this this weekend.
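If normalization does turn out to matter, the arithmetic behind it can be sketched as follows. This is a generic illustration with made-up data and coefficients, not lifelines' internal code: coefficients fitted on standardized covariates are in units of standard deviations, and centering shifts the linear predictor by a constant that ends up in the baseline hazard.

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up covariates (two columns) and made-up coefficients on the original scale.
X = rng.normal(loc=[25.0, 3.0], scale=[6.0, 2.0], size=(100, 2))
beta = np.array([-0.06, 0.10])

mean, std = X.mean(axis=0), X.std(axis=0)
X_norm = (X - mean) / std

# Equivalent coefficients on the normalized scale.
beta_norm = beta * std

lp_raw = X @ beta             # linear predictor on the original scale
lp_norm = X_norm @ beta_norm  # linear predictor on the normalized scale
shift = mean @ beta           # constant offset introduced by centering

# Dividing by std maps normalized coefficients back to the original scale.
beta_back = beta_norm / std

# The offset multiplies the baseline hazard by exp(shift) on the normalized scale.
baseline_scale = np.exp(shift)
```

Under these assumptions, coefficients would need rescaling by `std` before comparison with R, and baseline hazards from centered and uncentered fits would differ by the factor `exp(shift)`, which is why the R call above passes `centered=FALSE`.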
…On Dec 28, 2016 21:59, "Cameron Davidson-Pilon" ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In tests/test_estimation.py
<#268>:
> @@ -928,6 +929,19 @@ def test_strata_against_r_output(self, rossi):
npt.assert_almost_equal(cp.summary['coef'].values, [-0.335, -0.059, 0.100], decimal=3)
assert abs(cp._log_likelihood - -436.9339) / 436.9339 < 0.01
+ def test_hazard_works_as_intended_with_strata_against_R_output(self, rossi):
+ """
+ > library(survival)
+ > rossi = read.csv('rossi.csv')
+ > r = coxph(formula = Surv(week, arrest) ~ fin + age + strata(race,
+ paro, mar, wexp) + prio, data = rossi)
+ > basehaz(r, centered=FALSE)
+ """
+ cp = CoxPHFitter(normalize=False)
+ cp.fit(rossi, 'week', 'arrest', strata=['race', 'paro', 'mar', 'wexp'])
+ npt.assert_almost_equal(cp.baseline_cumulative_hazard_[(0, 0, 0, 0)].ix[[14, 35, 37, 43, 52]].values, [0.28665890, 0.63524149, 1.01822603, 1.48403930, 1.48403930], decimal=2)
+ npt.assert_almost_equal(cp.baseline_cumulative_hazard_[(0, 0, 0, 1)].ix[[27, 43, 48, 52]].values, [0.35738173, 0.76415714, 1.26635373, 1.26635373], decimal=2)
Ah, I think it's because my estimates of beta are slightly off, and this
is just a manifestation of that.
New PR to replace #196
cc @jstoxrocky and @IVANBARRIENTOS