This repository contains the R script "ScaledLengthTPM.R", which computes length-scaled transcripts per million (TPM) values from Salmon quantification output using the tximport and GenomicFeatures packages in R.
Before using the code, ensure that you have the following:
- R programming language installed (version 3.5 or higher)
- Required R packages installed: readr, tidyr, tximport, GenomicFeatures
- Clone the repository or download the "ScaledLengthTPM.R" file to your local machine.
- Set the working directory:
  setwd("path/to/your/directory")
  Replace "path/to/your/directory" with the path to the directory that contains your RNA-Seq data and the "transcriptome.gtf" file.
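- Install the required R packages. readr and tidyr are on CRAN; tximport and GenomicFeatures are Bioconductor packages and are installed through BiocManager rather than install.packages():
  install.packages(c("readr", "tidyr", "BiocManager"))
  BiocManager::install(c("tximport", "GenomicFeatures"))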
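A minimal sketch of a pre-flight check for the four packages listed above. The helper name `missing_packages` is hypothetical and not part of "ScaledLengthTPM.R". It only reports what is absent; the install calls are left commented out so nothing is installed unintentionally.

```r
# Hypothetical helper: return the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

cran_pkgs <- c("readr", "tidyr")
bioc_pkgs <- c("tximport", "GenomicFeatures")

missing_cran <- missing_packages(cran_pkgs)
missing_bioc <- missing_packages(bioc_pkgs)

# Uncomment to install whatever is missing.
# if (length(missing_cran) > 0) install.packages(missing_cran)
# if (length(missing_bioc) > 0) {
#   if (!requireNamespace("BiocManager", quietly = TRUE)) {
#     install.packages("BiocManager")
#   }
#   BiocManager::install(missing_bioc)
# }
```

If both `missing_cran` and `missing_bioc` are empty, the environment should have everything the script needs.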
- Prepare your data:
  - Place your Salmon output directories in the "siva_SALMON_OUT/WT" directory.
  - Ensure that each quantification file generated by Salmon is named "quant.sf" (Salmon's default output file name) and is located in the directory for its sample.
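One way to confirm the layout before running the script is to list the quantification files from R. This is a sketch under an assumption: each sample has its own subdirectory containing one "quant.sf", which is how Salmon writes its output by default. The helper name `find_quant_files` is hypothetical.

```r
# Hypothetical helper: find every quant.sf below `salmon_dir` and name each
# path after the directory that contains it (assumed to be the sample name).
find_quant_files <- function(salmon_dir) {
  files <- list.files(salmon_dir, pattern = "^quant\\.sf$",
                      recursive = TRUE, full.names = TRUE)
  if (length(files) == 0) {
    stop("No quant.sf files found under: ", salmon_dir)
  }
  names(files) <- basename(dirname(files))
  files
}

# Example call, using the directory named in this README:
# files <- find_quant_files("siva_SALMON_OUT/WT")
```

A named vector of this shape is the form of input that tximport accepts for its `files` argument.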
- Modify the code if necessary:
  - If your "transcriptome.gtf" file has a different name or location, update the "gtf_file" variable in the code.
  - Adjust any other paths or parameters to match your data and analysis requirements.
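For orientation, the following sketch shows one common way to turn a GTF file into a transcript-to-gene table with GenomicFeatures. It is not taken from "ScaledLengthTPM.R"; the helper name `build_tx2gene` is hypothetical. In recent Bioconductor releases `makeTxDbFromGFF()` is provided by the txdbmaker package, so the namespace prefix may need adjusting for your installation.

```r
# Hypothetical helper: build a transcript-to-gene table from a GTF file.
# Requires GenomicFeatures (and its dependency AnnotationDbi) at call time.
build_tx2gene <- function(gtf_file = "transcriptome.gtf") {
  if (!file.exists(gtf_file)) {
    stop("GTF file not found: ", gtf_file)
  }
  txdb <- GenomicFeatures::makeTxDbFromGFF(gtf_file, format = "gtf")
  tx_names <- AnnotationDbi::keys(txdb, keytype = "TXNAME")
  # Returns a data frame with columns TXNAME and GENEID.
  AnnotationDbi::select(txdb, keys = tx_names,
                        columns = "GENEID", keytype = "TXNAME")
}

# Example call (path is the default file name used in this README):
# tx2gene <- build_tx2gene("transcriptome.gtf")
```

Checking for the file first gives a clearer error than the one raised deep inside the import when the "gtf_file" path is wrong.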
- Execute the code:
  source("ScaledLengthTPM.R")
  This runs the script from start to finish and writes the output files described in the next step.
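The core of such an analysis is typically a single tximport call. The sketch below is an assumption about what that step looks like, not a copy of the script: the wrapper name `import_length_scaled` is hypothetical, `files` is taken to be a named vector of "quant.sf" paths, and `tx2gene` a two-column data frame of transcript and gene IDs.

```r
# Hypothetical wrapper around tximport; requires the tximport package at call time.
import_length_scaled <- function(files, tx2gene) {
  missing <- files[!file.exists(files)]
  if (length(missing) > 0) {
    stop("Missing quantification files: ", paste(missing, collapse = ", "))
  }
  tximport::tximport(files, type = "salmon", tx2gene = tx2gene,
                     countsFromAbundance = "lengthScaledTPM")
}

# Example usage; writing txi$counts to this file name is an assumption
# about which element the script saves.
# txi <- import_length_scaled(files, tx2gene)
# write.csv(txi$counts, "txi_lengthscaledTPM_WT.csv")
```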
- Review the outputs. The code generates two CSV files:
  - "tx2gene-WT.csv" contains the mapping of transcript IDs to gene IDs.
  - "txi_lengthscaledTPM_WT.csv" contains the calculated length-scaled TPM values.
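To make "length-scaled TPM" concrete, here is a small base-R sketch of the scaling as the tximport documentation describes it for `countsFromAbundance = "lengthScaledTPM"`: abundances are scaled by the average transcript length across samples, then rescaled to each sample's library size. All numbers are made up for illustration and the function name `length_scaled_tpm` is hypothetical.

```r
# Toy inputs: 3 transcripts (rows) x 2 samples (columns). Values are made up.
abundance <- matrix(c(10, 20, 70,  30, 30, 40), nrow = 3)          # TPM
eff_len   <- matrix(c(1000, 500, 2000,  1200, 500, 1800), nrow = 3) # effective lengths
counts    <- matrix(c(100, 80, 900,  350, 120, 500), nrow = 3)      # estimated counts

length_scaled_tpm <- function(counts, abundance, eff_len) {
  # Scale each transcript's TPM by its average effective length across samples.
  scaled <- abundance * rowMeans(eff_len)
  # Rescale each sample so its total equals the original estimated-count total.
  t(t(scaled) * (colSums(counts) / colSums(scaled)))
}

ls_counts <- length_scaled_tpm(counts, abundance, eff_len)
```

The resulting values sit on the scale of counts, which is why they are often passed to count-based downstream tools.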
- Interpret the results and use them for downstream analysis or visualization as needed.
- Note: the code assumes that the input files (the Salmon output and "transcriptome.gtf") are organized in the directories described above. Double-check that the paths and file names match your setup.
- For more information on the tools and packages used in the code, refer to the official documentation:
  - Salmon: https://salmon.readthedocs.io
  - GenomicFeatures: https://bioconductor.org/packages/GenomicFeatures
- If you encounter any issues or have questions, feel free to open an issue in this repository.