
"print_summary" stalled by calculating concordance index #139

Closed
jingjian opened this issue Mar 26, 2015 · 3 comments

@jingjian

When N = 1.5 million, it takes an indefinite amount of time to calculate the concordance index, and at first I didn't know that was what was going on, until I interrupted the kernel and saw the partially printed summary. Maybe we could use a progress bar like you did with the fit function from AalenAdditiveFitter?

@CamDavidsonPilon (Owner)

Hmm, I like the progress bar idea. Anyway, here are a few options:

  1. Try print_summary(self, c_index=False) (see the sketch below).
  2. Can you confirm you are using the Fortran extensions to compute the c-index? @spacecowboy, how might one check that?

Also, N=1.5 million?! How long did the Cox estimation take to run?
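
For option 1, something like this sketch, assuming a CoxPHFitter model and that print_summary accepts the c_index flag suggested above (the toy DataFrame and column names are made up):

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

# Toy data standing in for the real 1.5M-row dataset.
df = pd.DataFrame({
    'T': np.random.exponential(10, size=200),   # durations
    'E': np.random.binomial(1, 0.7, size=200),  # event indicators
    'x': np.random.randn(200),                  # a covariate
})

cph = CoxPHFitter()
cph.fit(df, duration_col='T', event_col='E')

# Skip the O(N^2) concordance-index computation when printing the summary.
cph.print_summary(c_index=False)
```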

@spacecowboy (Collaborator)

@CamDavidsonPilon When the concordance index is imported, a warning is printed if it's falling back to the pure Python version.

See: https://github.com/CamDavidsonPilon/lifelines/blob/master/lifelines/_utils/__init__.py
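
One rough way to check, then, is to see whether the compiled extension can be imported at all; the module name below is an assumption for illustration, not a documented API:

```python
# If the compiled extension cannot be imported, lifelines falls back to the
# pure Python concordance index (and prints the warning mentioned above).
try:
    import lifelines._statistics  # hypothetical name of the Fortran extension module
    print("Compiled (Fortran) concordance index available.")
except ImportError:
    print("Compiled extension not found; the pure Python fallback will be used.")
```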

Did some quick benchmarking on my machine (with the Fortran version):

N=1000: 6.29ms
N=10000: 571ms
N=100000: 49.4s

Because the concordance index is an O(N^2) operation, N = 1 million would theoretically require about 5,000 s (roughly 1.4 hours) on my machine, and N = 1.5 million about 11,000 s (roughly 3 hours).

So 1.5 million is not something one wants to calculate the concordance index for more than once...
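
For reference, a small benchmark sketch in the spirit of the numbers above; the import path and random data are assumptions, and absolute timings will of course vary by machine:

```python
import time
import numpy as np
from lifelines.utils import concordance_index

for n in (1000, 10000, 100000):
    durations = np.random.exponential(10, size=n)   # observed times
    predicted = np.random.exponential(10, size=n)   # model scores
    observed = np.random.binomial(1, 0.7, size=n)   # event indicators

    start = time.time()
    concordance_index(durations, predicted, observed)
    print("N=%d: %.2fs" % (n, time.time() - start))
```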

@spacecowboy (Collaborator)

Since the implementation is in Fortran, it is about as fast as it can be on a single core. The code could easily be tweaked to use multiple cores with OpenMP, but that would add another layer of complexity to the compilation: first try to compile with OpenMP; if that fails, try without it (as now); and if that fails, fall back to the pure Python version. A build-time check along those lines is sketched below.

It would mean a speed-up of 2-6x on many CPUs, though.
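
A rough sketch of how setup.py could probe for OpenMP before choosing the build flags; this is an assumption about how the fallback might be wired up, not the project's actual build script:

```python
import os
import tempfile
from distutils.ccompiler import new_compiler
from distutils.errors import CompileError, LinkError
from distutils.sysconfig import customize_compiler

def have_openmp():
    """Return True if a trivial OpenMP C program compiles and links."""
    source = "#include <omp.h>\nint main(void) { omp_get_max_threads(); return 0; }\n"
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "check_openmp.c")
        with open(src, "w") as f:
            f.write(source)
        compiler = new_compiler()
        customize_compiler(compiler)
        try:
            objects = compiler.compile([src], output_dir=tmp,
                                       extra_postargs=["-fopenmp"])
            compiler.link_executable(objects, os.path.join(tmp, "check_openmp"),
                                     extra_postargs=["-fopenmp"])
        except (CompileError, LinkError):
            return False
    return True

# In setup.py, pick the extension's compile/link flags accordingly;
# if even the plain build fails, fall back to the pure Python version as now.
openmp_args = ["-fopenmp"] if have_openmp() else []
```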
