Code for ICML 2019 paper "Nearest Neighbor and Kernel Survival Analysis: Nonasymptotic Error Bounds and Strong Consistency Rates"

Author: George H. Chen (georgechen [at symbol]

Paper: arXiv

Code requirements:

  • Anaconda Python 3.6
  • Additional packages: joblib, lifelines
  • Cython compilation is required:
python build_ext --inplace
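
Before compiling, it can help to confirm that the packages this README mentions are importable. The following is only a minimal sketch and not part of the distributed code: the helper name missing_packages is hypothetical, and the import names in the list (including Cython and statsmodels, which are mentioned elsewhere in this README) are assumptions about how each package is imported.

```python
import importlib.util

# Assumption: these import names match the packages named in this README.
REQUIRED_PACKAGES = ["joblib", "lifelines", "Cython", "statsmodels"]


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the names that cannot be located on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages were found.")
```

Any package reported as missing would need to be installed into the Anaconda environment before running the build step.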

The main code implementing all the different nonparametric survival methods from the paper is in Cython helper code for random survival forests is in random_survival_forest_cython.pyx. There are two main utility files: deals with loading datasets (the "pbc" dataset is loaded from the statsmodels Python package; the "gbsg2" and "recid" datasets are loaded from the "data/"), and has some helper calculation functions. Note: the "kidney" dataset is not public, so I have removed it from this distribution. The Python files just mentioned should not be run directly. Instead, the files that should be run are the demo_*.py files (e.g., python config_tiny.ini, which saves results to the directory output_tiny). In particular, to generate all the experimental results for the "pbc", "gbsg2", and "recid" datasets (and save their results to csv files in the directory output), run ./ (warning: this takes a while to run).

After running, a simple way to display all the tabulated outputs is to run python config.ini. To produce the plots (excluding the "kidney" dataset) in the main part of the paper (i.e., not the extended results), run python config.ini. To produce the plots in the appendix (the extended results, excluding the "kidney" dataset), run python config.ini. Note that these display/plot scripts require the auxiliary text files survival_estimator_names.txt and survival_estimator_names_short.txt.

Important: If you do not want to re-run all the methods but still want to produce plots (excluding those for the "kidney" dataset), I have included precomputed csv tables in the folder precomputed. Please move the csv files in this folder into the output directory (as specified in the configuration file used; by default, if using the provided config.ini file, the output directory is output) and run the plotting code to regenerate the plots.
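
The step of placing the precomputed tables into the output directory could be sketched in Python as follows. This is an illustration under stated assumptions, not part of the distributed code: the function name copy_precomputed_tables is hypothetical, the default directory names are taken from the text above, and the sketch copies rather than moves the files so that the originals in precomputed remain available.

```python
import shutil
from pathlib import Path


def copy_precomputed_tables(src_dir="precomputed", dst_dir="output"):
    """Copy every csv file in src_dir into dst_dir and return the copied names.

    Assumption: dst_dir matches the output directory set in the
    configuration file being used (output for the provided config.ini).
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)  # create the output directory if absent
    copied = []
    for csv_path in sorted(src.glob("*.csv")):
        shutil.copy2(csv_path, dst / csv_path.name)  # overwrites a same-named file
        copied.append(csv_path.name)
    return copied
```

Calling copy_precomputed_tables() from the top-level directory of the distribution would use the default directory names; if the configuration file specifies a different output directory, pass it as dst_dir.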


