Fast Training of Support Vector Machines for Survival Analysis
This repository contains an efficient implementation of Survival Support Vector Machines as proposed in
Pölsterl, S., Navab, N., and Katouzian, A., Fast Training of Support Vector Machines for Survival Analysis, Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2015, Porto, Portugal, Lecture Notes in Computer Science, vol. 9285, pp. 243-259 (2015)
Pölsterl, S., Navab, N., and Katouzian, A., An Efficient Training Algorithm for Kernel Survival Support Vector Machines, 4th Workshop on Machine Learning in Life Sciences, 23 September 2016, Riva del Garda, Italy
‼️ This repository is not actively maintained. Please use sebp/scikit-survival instead. ‼️
- Python 3.4 or later
- numpy 1.9 or later
- pandas 0.18
- scikit-learn 0.17
- scipy 0.16 or later
- C/C++ compiler
- ipyparallel (optional)
- seaborn (optional)
The easiest way to get started is to install Anaconda and set up an environment.
conda install -c sebp survival-svm
Installation from source
First, create a new environment named ssvm:
conda create -n ssvm python=3 --file requirements.txt
To work in this environment,
activate it as follows:
source activate ssvm
If you are on Windows, run the above command without the leading source.
Once you have set up your build environment, compile the C/C++ extensions and install the package by running:
python setup.py install
Alternatively, if you want to use the package without installing it, you can compile the extensions in place by running:
python setup.py build_ext --inplace
To check that everything is set up correctly, run the test suite by executing:
A more elaborate script that can be used to reproduce the results in the paper is grid_search_parallel.py in the examples directory.
When running it, you need to specify the algorithm (--method) and dataset (--dataset) to use:
# Start IPython cluster to run grid search in parallel
ipcluster start &
# Run cross-validation. Results are stored in results-veteran-l2_ranking.csv
python examples/grid_search_parallel.py --dataset veteran --method l2_ranking
# Find best hyper-parameter configuration and visualize the results
python examples/plot-performance.py -o results.pdf results-veteran-l2_ranking.csv
The script runs cross-validation with 200 randomly selected 50/50 splits of the dataset. This is repeated for each possible configuration of hyper-parameters (see Methods section below). Each time the following performance measures are computed:
- Harrell's concordance index, and
- root mean squared error (RMSE) with respect to uncensored records.
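The two measures listed above can be sketched in plain Python. This is a minimal illustration, not the code used by the script: the function names `harrell_c_index` and `rmse_uncensored` are hypothetical, `risk` is assumed to be a score where higher values mean shorter expected survival, and pairs with tied survival times are simply skipped.

```python
import math


def harrell_c_index(event, time, risk):
    """Fraction of comparable pairs that the risk scores order correctly.

    A pair (i, j) is comparable when record i had an observed event
    and its time is strictly shorter than the time of record j.
    """
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a censored record cannot be the earlier one of a pair
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def rmse_uncensored(event, time, predicted_time):
    """Root mean squared error computed over uncensored records only."""
    squared = [(p - t) ** 2
               for e, t, p in zip(event, time, predicted_time) if e]
    if not squared:
        raise ValueError("no uncensored records")
    return math.sqrt(sum(squared) / len(squared))


event = [True, True, False, True]
time = [2.0, 4.0, 5.0, 7.0]
print(harrell_c_index(event, time, risk=[3.0, 2.0, 1.5, 0.5]))
print(rmse_uncensored(event, time, predicted_time=[3.0, 4.0, 9.0, 5.0]))
```

The censored third record contributes to the concordance index only as the later member of a pair, and it is ignored entirely by the RMSE.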
The output is a CSV file that contains the performance on the test set for
each fold and hyper-parameter configuration. Additional options of the script
are listed in its command-line help.
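The resampling scheme described above, 200 random 50/50 splits, could be reproduced roughly as follows. This is only a sketch: the helper `random_half_splits` is hypothetical, and it assumes plain unstratified shuffling, which may differ from what grid_search_parallel.py actually does.

```python
import random


def random_half_splits(n_samples, n_splits=200, seed=0):
    """Yield (train, test) index lists, each covering about half the data."""
    rng = random.Random(seed)  # fixed seed so the splits are reproducible
    indices = list(range(n_samples))
    half = n_samples // 2
    for _ in range(n_splits):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        yield sorted(shuffled[:half]), sorted(shuffled[half:])


# Example with 137 records, the size of the veteran dataset.
splits = list(random_half_splits(137))
train, test = splits[0]
print(len(splits), len(train), len(test))
```

Every hyper-parameter configuration would then be fitted on `train` and scored on `test` for each of the splits.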
The grid search for all methods contains 13 configurations for the regularization parameter alpha: 2^i, for i from -12 to 12 in steps of 2. When using the hybrid ranking-regression loss, an additional 19 configurations for the ratio between the two losses are considered: 0.05 to 0.95 in steps of 0.05.
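Enumerating the grid makes its size easy to check. The variable names below are illustrative and are not taken from the repository.

```python
# Regularization strengths: 2^i for i = -12, -10, ..., 12
alphas = [2.0 ** i for i in range(-12, 13, 2)]

# Ratios between ranking and regression loss: 0.05, 0.10, ..., 0.95
ratios = [round(0.05 * k, 2) for k in range(1, 20)]

# The hybrid loss is evaluated for every (alpha, ratio) combination.
hybrid_grid = [(a, r) for a in alphas for r in ratios]

print(len(alphas), len(ratios), len(hybrid_grid))
```

Methods without a ratio parameter only iterate over `alphas`, while the hybrid method iterates over the full product.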
| Method | Description | Ratio of ranking loss |
|---|---|---|
| l1 | Naive implementation of Survival SVM using hinge loss. | - |
| l2_ranking | Fast implementation of Survival SVM using squared hinge loss (ranking objective only). | 1.0 |
| l2_regression | Fast implementation of Survival SVM using squared loss (regression objective only). | 0.0 |
| l2_ranking_regression | Fast implementation of Survival SVM using hybrid of squared hinge loss for ranking and squared loss for regression. | 0.05, 0.10, …, 0.95 |
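To illustrate how a single ratio can move between the ranking and regression objectives, here is a rough sketch of a hybrid loss. It is not the repository's implementation: the function `hybrid_loss` is hypothetical, the regularization term and alpha are left out, and two details are assumptions, namely that the ratio weights the ranking term (with one minus the ratio on the regression term) and that censored records are penalized only when the prediction falls below the censoring time.

```python
def hybrid_loss(pred, time, event, rank_ratio):
    """Weighted sum of a squared hinge ranking loss and a squared regression loss.

    pred: predicted survival times (higher means longer survival).
    """
    n = len(pred)

    # Ranking part: record i outlived record j, and j had an observed event,
    # so pred[i] should exceed pred[j] by a margin of at least 1.
    ranking = 0.0
    for i in range(n):
        for j in range(n):
            if event[j] and time[i] > time[j]:
                ranking += max(0.0, 1.0 - (pred[i] - pred[j])) ** 2

    # Regression part: squared residual, one-sided for censored records.
    regression = 0.0
    for p, t, e in zip(pred, time, event):
        residual = t - p
        if not e:
            residual = max(0.0, residual)  # assumed: no penalty for over-prediction
        regression += residual ** 2

    return rank_ratio * ranking + (1.0 - rank_ratio) * regression


time = [1.0, 3.0]
event = [True, False]
print(hybrid_loss([2.0, 2.0], time, event, rank_ratio=1.0))  # ranking only
print(hybrid_loss([2.0, 2.0], time, event, rank_ratio=0.0))  # regression only
print(hybrid_loss([2.0, 2.0], time, event, rank_ratio=0.5))  # hybrid
```

Under these assumptions, a ratio of 1.0 reduces to the pure ranking objective and 0.0 to the pure regression objective, matching the last column of the table.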
The repository contains four datasets that are freely available and can be used to reproduce the results in the paper.
| Dataset | Description | Samples | Features | Events | Outcome |
|---|---|---|---|---|---|
| actg320_aids or actg320_death | AIDS study | 1,151 | 13 | 96 (8.3%) | AIDS defining event or death |
| breast-cancer | Breast cancer | 198 | 80 | 62 (31.3%) | Distant metastases |
| veteran | Veteran's Lung Cancer | 137 | 6 | 128 (93.4%) | Death |
| whas500 | Worcester Heart Attack Study | 500 | 14 | 215 (43.0%) | Death |
The source code is thoroughly documented, and an HTML version of the API documentation is available at https://tum-camp.github.io/survival-support-vector-machine/
You can generate the documentation yourself using Sphinx 1.4 or later.
cd doc
PYTHONPATH="..:sphinxext" sphinx-autogen api.rst
make html
xdg-open _build/html/index.html