TiRank is a comprehensive tool for integrating and analyzing RNA-seq and scRNA-seq data. It enables researchers to identify phenotype-associated spots by integrating spatial transcriptomics or single-cell RNA sequencing data with bulk RNA sequencing data. TiRank supports various analysis modes, including survival analysis (Cox), classification, and regression.
- Integration of Bulk and Single-cell Data: Seamlessly integrates bulk RNA-seq data with single-cell or spatial transcriptomics data.
- Multiple Analysis Modes: Supports Cox survival analysis, classification, and regression modes.
- Visualization Tools: Provides functions for visualizing results, including UMAP plots and spatial maps.
- Customizable Hyperparameters: Offers flexibility in tuning hyperparameters to optimize results.
TiRank can be installed using one of the following methods. We recommend creating a new conda environment for TiRank to ensure compatibility and isolation from other Python packages.
- Anaconda or Miniconda: For managing Python environments.
- Python 3.9: TiRank requires Python version 3.9.
- Set up a new conda environment:
    conda create -n TiRank python=3.9 -y
    conda activate TiRank
- Clone the TiRank repository from GitHub:
    git clone git@github.com:LenisLin/TiRank.git
- Install TiRank via pip:
    pip install TiRank
- Install required dependencies:
  TiRank depends on the timm==0.5.4 package from TransPath. Follow these steps to install it:
  - Install the package:
      pip install ./TiRank/timm-0.5.4.tar # Replace with your actual path
  - Reference Link.
- Prepare Example Data:
  - Download the example data from Google Drive.
- (Optional, for Spatial Transcriptomics): Download the pre-trained CTransPath model weights. (Instructions to be provided)
- Install TiRank: Follow the installation steps described in Method 1.
- Activate the Web Server:
  - Navigate to the Web directory:
      cd TiRank/Web
  - Set up data directories:
    - Create a data directory:
        mkdir data
    - Inside the data directory, create an ExampleData folder and download the sample data:
        cd data
        mkdir ExampleData
      Download the sample data from Google Drive into the ExampleData directory.
    - Return to the Web directory:
        cd ../
  - Verify the directory structure:
      Web/
      ├── assets/
      ├── components/
      ├── img/
      ├── layout/
      ├── data/
      │   ├── ExampleData
      │   │   ├── CRC_ST_Prog/
      │   │   └── SKCM_SC_Res/
      ├── tiRankWeb/
      └── app.py
  - Run the web application:
      python app.py
  Note: If you encounter any issues with image loading, ensure that you are running the program from the Web directory.
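The directory layout listed above can be checked before launching the app. The following is a minimal sketch, not part of TiRank: the helper name `find_missing_entries` is hypothetical, and the list of expected entries is copied from the directory tree in this section.

```python
from pathlib import Path

# Entries taken from the directory tree above, relative to the Web directory.
EXPECTED_ENTRIES = [
    "assets",
    "components",
    "img",
    "layout",
    "data/ExampleData/CRC_ST_Prog",
    "data/ExampleData/SKCM_SC_Res",
    "tiRankWeb",
    "app.py",
]


def find_missing_entries(web_dir, expected=EXPECTED_ENTRIES):
    """Return the expected entries that do not exist under web_dir.

    Hypothetical helper for illustration; TiRank does not ship this function.
    """
    root = Path(web_dir)
    return [entry for entry in expected if not (root / entry).exists()]


if __name__ == "__main__":
    # Run from inside the Web directory, as the note above recommends.
    missing = find_missing_entries(".")
    if missing:
        print("Missing entries:", ", ".join(missing))
    else:
        print("Directory structure looks complete.")
```

Running the script from the Web directory prints any missing folders, which is usually enough to spot a sample dataset that was unpacked into the wrong place.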
For more tutorials on using the web interface, please refer to the "Tutorials" section within the web application.
Please choose the installation method that best suits your setup. If you encounter any issues, feel free to open an issue on the TiRank GitHub Issues page.
After running TiRank, you can find the results in the savePath/3_Analysis/ directory. The key output file is spot_predict_score.csv, where the Rank_Label column represents the TiRank prediction results.
- For Cox mode:
  - Rank+ spots are associated with worse survival.
  - Rank- spots are associated with better survival.
- For Classification mode:
  - Rank+ spots are associated with the phenotype of the group encoded as 1.
  - Rank- spots are associated with the phenotype of the group encoded as 0.
- For Regression mode:
  - Rank+ spots are associated with high phenotype label scores.
  - Rank- spots are associated with low phenotype label scores.
  - For example, if the input is the IC50 values of different cell lines, Rank+ spots are associated with drug resistance, and Rank- spots are associated with drug sensitivity.
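As a quick way to inspect the output, the labels in spot_predict_score.csv can be tallied with the Python standard library. This is a minimal sketch that assumes only what is stated above: the file is a CSV with a header row containing a Rank_Label column. The function name `count_rank_labels` is hypothetical, and no assumption is made about which label values appear beyond Rank+ and Rank-.

```python
import csv
from collections import Counter


def count_rank_labels(csv_path, label_column="Rank_Label"):
    """Count how many rows carry each value of the label column.

    Hypothetical helper for illustration; it is not part of TiRank.
    """
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames is None or label_column not in reader.fieldnames:
            raise KeyError(f"Column {label_column!r} not found in {csv_path}")
        # The Counter consumes the rows while the file is still open.
        return Counter(row[label_column] for row in reader)
```

For example, `count_rank_labels("savePath/3_Analysis/spot_predict_score.csv")` returns a mapping from each label to its number of spots, where savePath stands for the output directory chosen for the run.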
TiRank provides several hyperparameters that can be adjusted to optimize the analysis. The first three hyperparameters are crucial for feature selection in bulk transcriptomics, while the latter three are used for training the multilayer perceptron network. TiRank automatically selects suitable combinations for the training hyperparameters within a predefined range.
- top_var_genes:
  - Description: The number of top variable genes to select from the bulk RNA-seq data.
  - Default: 2000
  - Recommendation: If you find that the number of filtered genes is low, consider increasing top_var_genes.
- p_value_threshold:
  - Description: The p-value threshold for selecting genes significantly associated with the phenotype.
  - Default: 0.05
  - Recommendation: If too few genes are selected, consider increasing p_value_threshold.
- top_gene_pairs:
  - Description: The number of top gene pairs to select based on variability.
  - Default: 2000
- alphas:
  - Description: Weights of different components in the total loss computation.
  - Details: Adjusts the influence of each loss component during training.
- n_epochs:
  - Description: The number of training epochs.
  - Recommendation: Increase if the model has not converged.
- lr (Learning Rate):
  - Description: Controls the step size during parameter updates.
  - Recommendation: A lower lr leads to slower but more stable convergence. A higher lr may speed up convergence but can cause the model to overshoot optimal solutions.
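To make the effect of the first two hyperparameters concrete, the sketch below shows a generic two-step gene filter: keep the most variable genes, then keep those whose p-value falls below a threshold. It is an illustration only and does not reproduce TiRank's actual implementation. The function names, the use of population variance, and the strict `<` comparison are all assumptions; only the parameter names and defaults come from the list above.

```python
from statistics import pvariance


def select_top_variable_genes(expression, top_var_genes=2000):
    """Return the names of the most variable genes.

    expression maps gene name -> list of expression values across samples.
    Illustrative only: TiRank's own variability measure may differ.
    """
    variances = {gene: pvariance(values) for gene, values in expression.items()}
    ranked = sorted(variances, key=variances.get, reverse=True)
    return ranked[:top_var_genes]


def filter_by_p_value(p_values, p_value_threshold=0.05):
    """Keep genes whose p-value is below the threshold.

    p_values maps gene name -> p-value for association with the phenotype.
    The strict comparison is an assumption made for this sketch.
    """
    return [gene for gene, p in p_values.items() if p < p_value_threshold]
```

In this sketch, raising either `top_var_genes` or `p_value_threshold` can only keep the same genes or more, which is why the recommendations above suggest increasing them when too few genes survive filtering.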