data
precomputed
README.md
config.ini
demo.sh
demo_cdfreg_knn_weighted.py
demo_coxph.py
demo_kernel.py
demo_knn_weighted.py
demo_rsf.py
demo_rsfann.py
npsurvival_models.py
random_survival_forest_cython.pyx
setup_random_survival_forest_cython.py
survival_datasets.py
survival_estimator_names.txt
survival_estimator_names_short.txt
table_aggregator.py
table_aggregator_plot.py
table_aggregator_plot_short.py
util.py

README.md

Code for ICML 2019 paper "Nearest Neighbor and Kernel Survival Analysis: Nonasymptotic Error Bounds and Strong Consistency Rates"

Author: George H. Chen (georgechen [at symbol] cmu.edu)

Paper: arXiv

Code requirements:

  • Anaconda Python 3.6
  • Additional packages: joblib, lifelines
  • Cython compilation is required:
      python setup_random_survival_forest_cython.py build_ext --inplace
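
Before building, it can help to confirm that the packages listed above are importable. The following is a minimal sketch and is not part of the repository: the helper name missing_modules is hypothetical, and the import names (in particular the capitalized Cython) are assumed from the requirements list.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of `names` that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names assumed from the requirements list (not verified against the repo)
    required = ["joblib", "lifelines", "Cython"]
    missing = missing_modules(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

The check only looks up top-level modules without importing them, so it says nothing about installed versions.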

The main code implementing the nonparametric survival methods from the paper is in npsurvival_models.py. Cython helper code for random survival forests is in random_survival_forest_cython.pyx. There are two main utility files: survival_datasets.py handles dataset loading (the "pbc" dataset is loaded from the statsmodels Python package; the "gbsg2" and "recid" datasets are loaded from the "data/" directory), and util.py contains helper calculation functions. Note: the "kidney" dataset is not public, so I have removed it from this distribution.

The Python files just mentioned should not be run directly. Instead, run the demo_*.py files (e.g., python demo_rsfann.py config_tiny.ini, which saves results to the directory output_tiny). In particular, to generate all the experimental results for the "pbc", "gbsg2", and "recid" datasets (and save them to csv files in the directory output), run ./demo.sh (warning: this takes a while to run).
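
To run only a few demos rather than the full ./demo.sh, the invocation pattern above (python, then a demo script, then a configuration file) can be scripted. The sketch below is illustrative only and does not reproduce demo.sh, whose contents are not shown here; the function names build_demo_commands and run_demos are hypothetical.

```python
import subprocess
import sys


def build_demo_commands(demo_scripts, config_path):
    """Build one `python <script> <config>` command per demo script."""
    return [[sys.executable, script, config_path] for script in demo_scripts]


def run_demos(demo_scripts, config_path):
    """Run each demo in turn, stopping at the first one that fails."""
    for command in build_demo_commands(demo_scripts, config_path):
        # check=True raises CalledProcessError on a non-zero exit code
        subprocess.run(command, check=True)


# Example call (script names come from the repository listing; the selection is arbitrary):
# run_demos(["demo_knn_weighted.py", "demo_kernel.py"], "config_tiny.ini")
```

Using sys.executable keeps the demos in the same Python environment as the calling script, which matters when the Cython extension was built for a specific interpreter.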

After running demo.sh, a simple way to display all the tabulated outputs is to run python table_aggregator.py config.ini. To produce the plots (excluding the "kidney" dataset) in the main part of the paper (i.e., not the extended results), run python table_aggregator_plot_short.py config.ini. To produce the plots in the appendix (the extended results, excluding the "kidney" dataset), run python table_aggregator_plot.py config.ini. Note that these display/plot scripts require the auxiliary text files survival_estimator_names.txt and survival_estimator_names_short.txt.

Important: If you do not want to re-run all the methods but still want to produce the plots (excluding those for the "kidney" dataset), I have included precomputed csv tables in the folder precomputed. Move the csv files in this folder into the output directory (as specified in the configuration file used; with the provided config.ini, the output directory is output) and run the plotting code to regenerate the plots.
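
The file transfer described above can be sketched in a few lines of Python. This is a hedged example, not code from the repository: the function name copy_precomputed_tables is hypothetical, it assumes the csv files sit directly inside precomputed (not in subfolders), and it copies rather than moves so the originals stay in place.

```python
import shutil
from pathlib import Path


def copy_precomputed_tables(precomputed_dir="precomputed", output_dir="output"):
    """Copy every csv file from `precomputed_dir` into `output_dir`.

    Returns the sorted list of copied file names.
    """
    source = Path(precomputed_dir)
    destination = Path(output_dir)
    # Create the output directory if it does not exist yet
    destination.mkdir(parents=True, exist_ok=True)
    copied = []
    for csv_path in sorted(source.glob("*.csv")):
        shutil.copy2(csv_path, destination / csv_path.name)
        copied.append(csv_path.name)
    return copied
```

If a different configuration file is used, pass its output directory as output_dir; the default of "output" matches what the text states for the provided config.ini.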
