This project contains all code used during the construction of HDncRNA.
The NCBI E-utilities are called to run the search and fetch steps. This folder provides xx Perl scripts and xx list files. Each file is described below.
This script combines disease keywords with ncRNA names.
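The combination step can be illustrated with a minimal Python sketch. It is not the project's Perl script: the function name `build_queries`, the `"disease" AND "ncRNA"` query form, and the example entries are all assumptions made for illustration.

```python
from itertools import product


def build_queries(diseases, ncrnas):
    """Pair every disease keyword with every ncRNA name as one query string."""
    queries = []
    for disease, ncrna in product(diseases, ncrnas):
        disease, ncrna = disease.strip(), ncrna.strip()
        if disease and ncrna:  # skip blank entries, e.g. empty lines in a list file
            # Assumed query form; the project's actual query syntax is not documented here.
            queries.append(f'"{disease}" AND "{ncrna}"')
    return queries


if __name__ == "__main__":
    # Hypothetical example entries, not taken from the project's list files.
    for query in build_queries(["heart failure", "cardiomyopathy"], ["lncRNA", "miRNA"]):
        print(query)
```

Every disease is paired with every ncRNA name, so the number of queries is the product of the two list sizes.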
This script searches PubMed and fetches the matching PubMed IDs.
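As a rough illustration of the search step, the Python sketch below builds an ESearch request for the NCBI E-utilities and extracts the PubMed IDs from a JSON response. It is not the project's Perl script: the helper names, the `retmax` default, and the choice of JSON output are assumptions.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(query, retmax=100):
    """Return an ESearch URL that searches PubMed for the given query."""
    params = {"db": "pubmed", "term": query, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


def parse_pmids(response_text):
    """Extract the list of PubMed IDs from an ESearch JSON response."""
    data = json.loads(response_text)
    return data.get("esearchresult", {}).get("idlist", [])


def fetch_pmids(query, retmax=100):
    """Run the search over the network and return the matching PubMed IDs."""
    with urlopen(build_esearch_url(query, retmax)) as response:
        return parse_pmids(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical query; printing the URL avoids a network call in this demo.
    print(build_esearch_url('"heart failure" AND "lncRNA"'))
```

When many queries are sent in a row, NCBI's usage guidelines on request rate apply, so a pause between calls may be needed.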
This file lists all heart diseases, which are used as search keywords.
This folder contains the shell scripts used to identify lncRNAs by sequence length, ORF length, and coding potential.
This script takes the .diff file produced by Cuffdiff as input and writes "length_over_200_po_lnc.diff", which contains only transcripts longer than 200 bp. This output file is the input to ORFpredictor.
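The length filter can be sketched in Python as follows. This is not the project's shell script. It assumes the .diff file is tab-delimited with a header row and a `test_id` column. How the original script obtains each transcript's length is not stated here, so the sketch takes the lengths as a separate, hypothetical mapping.

```python
import csv
import io

MIN_LENGTH = 200  # bp threshold named in this README


def filter_by_length(diff_text, lengths, min_length=MIN_LENGTH):
    """Keep rows of a tab-delimited .diff table whose transcript exceeds min_length.

    `lengths` is a hypothetical {test_id: length_in_bp} mapping supplied by the caller.
    Rows whose ID is missing from the mapping are dropped.
    """
    reader = csv.DictReader(io.StringIO(diff_text), delimiter="\t")
    kept = [row for row in reader if lengths.get(row["test_id"], 0) > min_length]
    return reader.fieldnames, kept


if __name__ == "__main__":
    # Hypothetical two-row table and lengths, for illustration only.
    text = "test_id\tgene_id\nTCONS_1\tXLOC_1\nTCONS_2\tXLOC_2\n"
    header, rows = filter_by_length(text, {"TCONS_1": 150, "TCONS_2": 950})
    print(header, [row["test_id"] for row in rows])
```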
This file takes the output file of ORFpredictor as input and writes a list of the transcripts whose predicted ORF is shorter than 300 bp. This list is the input to the Coding Potential Calculator.
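A minimal Python sketch of the ORF-length filter is given below. It is not the project's own code, and it assumes the predicted ORFs are available as FASTA text in which the first word of each header is the transcript ID. The actual layout of the ORFpredictor output may differ.

```python
MAX_ORF_LENGTH = 300  # bp threshold named in this README


def short_orf_ids(fasta_text, max_orf_length=MAX_ORF_LENGTH):
    """Return IDs of records whose sequence is shorter than max_orf_length."""
    lengths = {}
    current = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            current = parts[0] if parts else ""  # ID is the first word of the header
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)  # sequences may span several lines
    return [seq_id for seq_id, length in lengths.items() if length < max_orf_length]


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    demo = ">t1 frame=+1\nATGAAATGA\n>t2\n" + "A" * 300 + "\n"
    print(short_orf_ids(demo))
```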