Skip to content

Cox models estimated quantities to be changed in future release #636

@CamDavidsonPilon

Description

@CamDavidsonPilon

As of lifelines 0.18.6, and since it's inception, Cox models have returned a dataframe for specific estimated quantities. Example:

cp.hazards_
"""
          var1      var2      var3
coef  0.222207  0.050957  0.218314
"""

cp.confidence_intervals_
"""
                 var1      var2      var3
lower-bound  0.076603 -0.111503  0.069747
upper-bound  0.367812  0.213417  0.366882
"""


cp.standard_errors_
"""
       var1      var2      var3
se  0.07429  0.082889  0.075801
"""

This is an odd choice - my intuition suggests that:

  1. for estimates that are one dimensional, like standard_errors_ and hazards_, these should be pandas Series.

  2. for estimates that are two dimensional, like confidence_intervals_, they results should be still a dataframe, but transposed.

Generally, I feel tables should be tall and skinny, not short and wide.


So my proposal is to change to the following for a lifelines 0.19.0 release.

cp.hazards
"""
var1    0.222207
var2    0.050957
var3    0.218314
Name: coef, dtype: float64
"""

cp.confidence_intervals_
"""
      lower-bound  upper-bound
var1     0.076603     0.367812
var2    -0.111503     0.213417
var3     0.069747     0.366882
"""

This will be the API that the upcoming AFT models will have too (and it generalizes well to multi-parameter regressions).


Concerns

  • breaks a very consistent API. Users' code will break
    • in 0.19.x, when a user accesses the values above, I can print a message saying "hey btw this recently changed"
  • docs need to be updated
  • examples need to be updated

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions