Reproducibility results for the IAL2023 workshop paper. We are excited to share our findings and discuss more ideas!

Coming soon: a detailed guide on how to run this messy code.

How to run

Step 0: Make sure you have a Python environment with the basic libraries installed. Check requirements.txt for more details.

Step 1: Generate the raw data by importing it from the SurvSet library. Use the pip install SurvSet==0.2.6 command to install the same version of the library.
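A minimal sketch of pulling one dataset from SurvSet is shown below. It assumes the `SurvLoader` interface shown in the SurvSet documentation, where `load_dataset(ds_name=...)` returns the data frame and its reference. The helper name `load_raw_dataset` and the dataset name in the usage comment are illustrative and are not taken from this repository.

```python
def load_raw_dataset(ds_name):
    """Return (df, ref) for one SurvSet dataset.

    Assumption: SurvSet exposes SurvLoader as in its documentation.
    The import is lazy so this module can be loaded without SurvSet.
    """
    from SurvSet.data import SurvLoader  # needs: pip install SurvSet==0.2.6

    loader = SurvLoader()
    # load_dataset returns a dict holding the data frame and its reference
    df, ref = loader.load_dataset(ds_name=ds_name).values()
    return df, ref


# Example usage (the dataset name is only an example):
#   df, ref = load_raw_dataset("ova")
#   print(df.head())
```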

Step 2: Process the datasets with the script to get a 'clean' version of the data. This also sets up a 5-fold cross-validation and imputes missing values (this claim still needs to be double-checked). Imputation is necessary since the underlying RSF model (as of scikit-survival==0.21) cannot handle missing values.
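The idea behind this step can be sketched in plain Python as follows. This is not the repository's script: the median imputation strategy, the function names `kfold_indices` and `median_impute`, and the seed are all assumptions made for illustration.

```python
import random
from statistics import median


def kfold_indices(n_samples, n_splits=5, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every n_splits-th shuffled index, starting at position i
    folds = [indices[i::n_splits] for i in range(n_splits)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train_idx, test_idx


def median_impute(train_col, test_col):
    """Fill None values using the median of the observed training values.

    Assumption: median imputation; the actual strategy may differ.
    """
    fill = median(v for v in train_col if v is not None)

    def fix(col):
        return [fill if v is None else v for v in col]

    return fix(train_col), fix(test_col)
```

Computing the fill value on the training fold only, and then applying it to the test fold, keeps information from the held-out samples out of the preprocessing.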

Step 3: Select the relevant datasets for the analysis (see the paper for more details) using the script for the task. The current parameters in the script are the ones used in the paper, and represent a trade-off between having only 'good' datasets and having 'enough' datasets.
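To illustrate the kind of trade-off involved, here is a hedged sketch of a threshold-based filter. The criteria (minimum sample size and minimum event rate), the function name `select_datasets`, and the toy summaries are hypothetical; the actual selection parameters are the ones in the script and the paper.

```python
def select_datasets(summaries, min_samples, min_event_rate):
    """Keep dataset names whose size and event rate meet both thresholds.

    Hypothetical criteria, for illustration only.
    """
    selected = []
    for name, info in summaries.items():
        n = info["n_samples"]
        if n == 0 or n < min_samples:
            continue  # too small to be useful
        if info["n_events"] / n >= min_event_rate:
            selected.append(name)
    return selected


# Toy summaries, invented for illustration only
toy = {
    "toy_a": {"n_samples": 1000, "n_events": 400},
    "toy_b": {"n_samples": 80, "n_events": 60},
    "toy_c": {"n_samples": 500, "n_events": 10},
}
strict = select_datasets(toy, min_samples=100, min_event_rate=0.1)
loose = select_datasets(toy, min_samples=50, min_event_rate=0.01)
print(strict, loose)
```

Stricter thresholds keep fewer but 'better' datasets, while looser ones keep more of them.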

Step 4: Run the script for the whole active learning procedure. This will call functions from other files and populate the query-info-data and the plot-and-perfomr-info folders.

Step 5: Data analysis and plotting were performed with two separate files.

⚠️ We initially generated 4 simulated datasets and later dropped two. The paper reports the results for the 1st and 3rd datasets, renamed there as simul_data_1 and simul_data_2 respectively. In the project folder, however, they are generated under the names my_simul_data_1 and my_simul_data_3.
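When matching files in the project folder against the paper, a small lookup like the one below may help. The two name pairs come from the note above; the helper name `to_paper_name` and the choice to return unknown names unchanged are assumptions.

```python
# Project-folder names mapped to the names used in the paper
PAPER_NAME = {
    "my_simul_data_1": "simul_data_1",
    "my_simul_data_3": "simul_data_2",
}


def to_paper_name(project_name):
    """Translate a project dataset name to the name used in the paper.

    Assumption: names without an entry are returned unchanged.
    """
    return PAPER_NAME.get(project_name, project_name)
```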