ENH - Implement Cox with Efron estimate #159

Badr-MOUFAD · 2023-05-30T11:45:35Z

A follow-up to #157.

Tied data can be handled with the Efron estimate by slightly modifying the Cox datafit as follows

$$ l(\beta) = -\langle s, \mathbf{X}\beta \rangle + \langle s, \log(\mathbf{B}e^{\mathbf{X}\beta} - \mathbf{A}e^{\mathbf{X}\beta}) \rangle $$

where $\mathbf{A}$ is a matrix chosen to account for the additional term that the Efron estimate adds inside the Breslow $\log$ ($s$ and $\mathbf{B}$ are as in the original Cox datafit).
Evaluating $\mathbf{A} v$ or $\mathbf{A}^\top v$ is also cheap: both can be computed in linear time.
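To make the formula concrete, here is a minimal NumPy sketch. It assumes the standard Efron correction: within a group of `d` events tied at the same time, the `k`-th one (`k = 0, ..., d - 1`) subtracts a fraction `k / d` of the tied group's mass. The names `efron_matrices`, `efron_datafit` and `efron_reference` are hypothetical and are not the functions added in this PR. The matrices are built densely for readability only, so this does not reflect the linear-time products mentioned above, and no normalization by the number of samples is applied.

```python
import numpy as np


def efron_matrices(tm, s):
    """Build dense B and A (illustration only, O(n^2) memory)."""
    n = len(tm)
    # B[i, j] = 1 if sample j is still at risk at time tm[i]
    B = (tm[None, :] >= tm[:, None]).astype(float)
    A = np.zeros((n, n))
    for t in np.unique(tm[s == 1]):
        tied = np.flatnonzero((tm == t) & (s == 1))  # events sharing time t
        d = len(tied)
        for rank, i in enumerate(tied):
            # the rank-th tied event removes a fraction rank / d of the tied mass
            A[i, tied] = rank / d
    return B, A


def efron_datafit(beta, X, tm, s):
    """Value of -<s, X beta> + <s, log(B e^{X beta} - A e^{X beta})>."""
    B, A = efron_matrices(tm, s)
    exp_Xb = np.exp(X @ beta)
    return -s @ (X @ beta) + s @ np.log(B @ exp_Xb - A @ exp_Xb)


def efron_reference(beta, X, tm, s):
    """Same quantity written as explicit sums over distinct event times."""
    exp_Xb = np.exp(X @ beta)
    val = 0.0
    for t in np.unique(tm[s == 1]):
        tied = (tm == t) & (s == 1)
        d = int(tied.sum())
        risk_mass = exp_Xb[tm >= t].sum()
        tied_mass = exp_Xb[tied].sum()
        val -= (X[tied] @ beta).sum()
        for k in range(d):
            val += np.log(risk_mass - k / d * tied_mass)
    return val


# toy data: two events tied at time 2, plus one censored sample at time 2
rng = np.random.default_rng(0)
tm = np.array([1., 2., 2., 2., 3., 4.])
s = np.array([1, 1, 1, 0, 1, 0])
X = rng.standard_normal((6, 3))
beta = rng.standard_normal(3)

val_matrix = efron_datafit(beta, X, tm, s)
val_loops = efron_reference(beta, X, tm, s)
```

Under these assumptions, `val_matrix` and `val_loops` agree up to floating-point error, and when there are no ties `A` is all zeros, so the expression reduces to the Breslow form.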

Link to the maths behind

mathurinm · 2023-06-08T13:23:26Z

Maybe you can add a small section at the end of the doc example with lifelines to show that we handle ties the same way lifelines does? no need for another speed benchmark, just showing that we have this functionality too

Badr-MOUFAD · 2023-06-08T13:27:56Z

> Maybe you can add a small section at the end of the doc example with lifelines to show that we handle ties the same way lifelines does? no need for another speed benchmark, just showing that we have this functionality too

Yes, I agree with the idea!

examples/plot_survival_analysis.py

mathurinm · 2023-06-08T16:24:13Z

The edited example and benchmark in this PR may interest @ogrisel too :)

ogrisel · 2023-06-08T18:04:00Z

Haha the benchmark plot! 😅

ogrisel · 2023-06-09T11:14:08Z

BTW do you see a similar perf improvement for smooth penalties?

Badr-MOUFAD · 2023-06-21T15:40:15Z

@ogrisel we have also added support for L2 regularization (with SciPy's L-BFGS and our Prox Newton methods).
Check the bench figure

Also, here is the link to the complete benchmark and the repo to reproduce it.

Badr-MOUFAD and others added 30 commits May 17, 2023 16:27

init commit

f2b2ae7

implem datafit

5e4fa49

fix numba errors && unittest datafit

2d72326

normalize df with n_samples

1e8ea39

unittest Cox Estimator against lifeline

85d3766

debug script

6a37aaa

avoid ties

7a95dca

finding 0 or very small solution even for reg < 1

1e7083e

fix unittest: agree with lifeline

98999d1

require lifelines in CI tests

2541596

Cox docs

8e44a48

more on docs cox

24c20e2

normalize as external param

22f8461

dummy survival data docs

e8b1c1f

fix pydoctest

676e3ab

faster matmul && fix lifelines install in CI

8fb0063

preserve support of numba v0.56

77dfa65

make script debug reproducible

ce0631f

illustrate convergence failure of lifelines

088818b

fix

30e47e3

clean up

4fe40fa

add support of sparse data

1fc5e4f

Merge branch 'main' of https://github.com/scikit-learn-contrib/skglm …

8c3b657

…into cox-estimator

use Weibull for tm

308790b

unittest Cox sparse data

2aac108

clean ups

9d47207

setups efron

329ef37

compute val

547c566

implement grad and Hessian

ca12fb4

implement A and A.T dot ops

0670baa

Badr-MOUFAD added 3 commits May 31, 2023 11:25

Efron for sparse data

233ec7f

add argument with_ties in dummy data

7d22b44

sample data from weibull

5fb9486

mathurinm requested review from PABannier and QB3 June 6, 2023 07:15

mathurinm mentioned this pull request Jun 7, 2023

ENH improve speed of fitting Cox model by relying on the fast skglm solver CamDavidsonPilon/lifelines#1531

Open

Badr-MOUFAD added 4 commits June 7, 2023 17:25

update docs

e97b724

Merge branch 'main' of https://github.com/scikit-learn-contrib/skglm …

9e0a0cc

…into cox-efron

Merge branch 'main' of https://github.com/scikit-learn-contrib/skglm …

2efe6df

…into cox-efron

fix links to docs

7149ac6

typos

3bd41d7

example lifelines: data and compare sols

70bb89c

mathurinm reviewed Jun 8, 2023

examples/plot_survival_analysis.py (resolved)

mathurinm reviewed Jun 8, 2023

examples/plot_survival_analysis.py (outdated, resolved)

mathurinm reviewed Jun 8, 2023

examples/plot_survival_analysis.py (outdated, resolved)

Badr-MOUFAD added 5 commits June 8, 2023 16:30

example lifelines: speed up ratio

8a8eaab

example lifelines: check ties

e2147d0

example lifelines: typos and reformulations

57d46ce

example lifelines: fix heading

5bb4313

fix format

8e095c6

Badr-MOUFAD added Ready for review and removed Work In Progress labels Jun 8, 2023

mathurinm approved these changes Jun 8, 2023

mathurinm merged commit 395af5e into scikit-learn-contrib:main Jun 8, 2023
4 checks passed
