SRF
SRF (Subtyping through Ranked Factorisation) is an algorithm for finding cancer subtypes and subtype-specific features by integrating mutation data, expression data and biological networks.
Requirements
- Linux/Unix
- Scala (2.10)
- Java (1.7 or above)
- Gurobi solver (http://www.gurobi.com/)
- OscaR (https://bitbucket.org/oscarlib/oscar/wiki/Home)
Support
For support using SRF, please contact thanh.levan@cs.kuleuven.be
Setup
Compilation
First, use the Scala IDE for Eclipse (http://scala-ide.org/) to import the two projects, SRF and SRFWrapper, from the current folder. Then use the IDE's export tool to package SRFWrapper/src/ParallelSubtypingUsingRMF.java into a jar file named ParallelSubtypingUsingRMF.jar.
Instructions for compiling the project with sbt will be added soon.
Run diffusion
To be added soon.
Convert to ranked data
To be added soon.
Hierarchical clustering
To be added soon.
Run
Set up environment variables for Gurobi
SRF uses Gurobi to solve the optimisation problem. Before running SRF, make sure that Gurobi is installed and that its environment variables are set up correctly.
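As a minimal sketch of such a setup on Linux, the lines below set the environment variables described in Gurobi's installation guide (GUROBI_HOME, PATH, LD_LIBRARY_PATH and GRB_LICENSE_FILE). The install directory /opt/gurobi/linux64 and the licence file location are assumptions; replace them with the paths used on your machine.

```shell
# Assumed install location; adjust to where Gurobi is unpacked on your machine.
export GUROBI_HOME="/opt/gurobi/linux64"

# Make the Gurobi command-line tools and shared libraries visible.
export PATH="${PATH}:${GUROBI_HOME}/bin"
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${GUROBI_HOME}/lib"

# Assumed licence file location; only needed if the licence is not in a default place.
export GRB_LICENSE_FILE="${HOME}/gurobi.lic"
```

Adding these lines to a shell start-up file such as ~/.bashrc makes them available in every new session.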
Parameters
=============================================================================================================
| PARAMETER NAME | DEFAULT | DESCRIPTION |
=============================================================================================================
|-df | None |Absolute path to the tab-separated ranked diffusion file |
| | |where each row contains |
-------------------------------------------------------------------------------------------------------------
|-ef | None |Absolute path to the tab-separated ranked expression file |
| | | |
-------------------------------------------------------------------------------------------------------------
|-if | None |Initial matrix F, obtained by hierarchical |
| | |clustering. |
-------------------------------------------------------------------------------------------------------------
|-k | 5 |Number of ranked factors |
| | | |
-------------------------------------------------------------------------------------------------------------
|-etheta | 0.65 |Rank expression threshold (\theta_2 in the paper) |
| | |Must be a number in the range [0, 1]. |
| | |The actual integer threshold used in the program is |
| | |calculated as follows: \theta_2 * n, where n is the number |
| | |of columns of the rank matrix. |
-------------------------------------------------------------------------------------------------------------
|-dtheta | 0.86 |Rank diffusion threshold (\theta_1 in the paper) |
| | |Must be a number in the range [0, 1]. |
| | |The actual integer threshold used in the program is |
| | |calculated as follows: \theta_1 * maxD, where maxD is the |
| | |user-supplied value specifying the maximum value of ranked |
| | |diffusion. |
-------------------------------------------------------------------------------------------------------------
|-beta | 1 |The rank imbalance threshold used to specify the relative |
| | |importance of mutation data versus expression data. |
| | | |
-------------------------------------------------------------------------------------------------------------
|-nReqMut | 2 |Number of required mutations |
| | | |
-------------------------------------------------------------------------------------------------------------
|-maxD | 0 |Maximum value of ranked diffusion. This should be equal to |
| | |the number of vertices in the graph used to run the |
| | |diffusion. Note that if maxD = 0, the program |
| | |uses the number of rows of the ranked diffusion matrix |
| | |as the maximum value of ranked diffusion. |
-------------------------------------------------------------------------------------------------------------
|-maxE | 0 |Maximum value of ranked expression. This should be equal to |
| | |the number of the columns of the ranked expression |
| | |matrix. Note that if maxE = 0 or is not specified, |
| | |the program automatically uses the number of columns |
| | |of the ranked expression matrix as the maximum value of |
| | |ranked expression. |
-------------------------------------------------------------------------------------------------------------
|-dir | ./ |Working directory which will be used to store the results |
| | | |
-------------------------------------------------------------------------------------------------------------
|-log | false |Log intermediate results into files |
| | | |
-------------------------------------------------------------------------------------------------------------
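To illustrate how the relative thresholds given for -etheta and -dtheta map to integer thresholds, here is a small worked example. The matrix width (100 columns) and the maxD value (1000) are made-up numbers, and rounding to the nearest integer is an assumption, since the rounding rule used by SRF is not stated here.

```shell
etheta=0.65   # default -etheta (theta_2)
dtheta=0.86   # default -dtheta (theta_1)
n=100         # hypothetical number of columns of the ranked expression matrix
maxD=1000     # hypothetical -maxD value

# theta * size, rounded to the nearest integer (the rounding rule is an assumption).
e_threshold=$(awk -v t="$etheta" -v n="$n" 'BEGIN { print int(t * n + 0.5) }')
d_threshold=$(awk -v t="$dtheta" -v m="$maxD" 'BEGIN { print int(t * m + 0.5) }')

echo "expression threshold: $e_threshold"
echo "diffusion threshold:  $d_threshold"
```

With these example values the integer thresholds come out as 65 for expression and 860 for diffusion.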
Command line
java -Xmx4096m -jar path_to_the_jar_file/ParallelSubtypingUsingRMF.jar
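The parameters from the table are passed on this command line. The sketch below shows one possible invocation. It assumes that each option is followed by its value, separated by a space, which is not confirmed in this document, and the input paths are placeholders. The -if option is left out because its expected format is not described here. The script prints the command instead of running it, so the arguments can be checked first.

```shell
# Placeholder paths: replace with the absolute paths to your own files.
SRF_JAR="path_to_the_jar_file/ParallelSubtypingUsingRMF.jar"
DIFFUSION="/abs/path/ranked_diffusion.tsv"
EXPRESSION="/abs/path/ranked_expression.tsv"

# Assumed "option value" syntax; option names and default values come from the table.
SRF_CMD="java -Xmx4096m -jar $SRF_JAR -df $DIFFUSION -ef $EXPRESSION -k 5 -etheta 0.65 -dtheta 0.86 -dir ./"

# Dry run: print the command. To execute it, run: eval "$SRF_CMD"
echo "$SRF_CMD"
```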
Citation
If you use SRF in your work, please cite:
T. Le Van, M. van Leeuwen, A. Fierro, D. De Maeyer, J. Van den Eynden, L. Verbeke, L. De Raedt, K. Marchal, S. Nijssen. (2016) Simultaneous discovery of cancer subtypes and subtype features by molecular data integration. Bioinformatics, ?, (2016).