Skip to content

carnegie/QTG_Finder

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

81 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

QTG_Finder (version 2.0)

QTG-Finder is a machine-learning pipeline to prioritize causal genes for QTLs identified by linkage mapping. We trained QTG-Finder models for Arabidopsis, rice, sorghum, Setaria viridis based on known causal genes and orthologs of known causal genes, respectively. By utilizing additional information like polymorphisms, function annotation, co-function network, paralog copy number, the models can prioritize causal genes for QTLs identified by QTL mapping.

Authors: Fan Lin, February 2020
Environment: Python 3.7.3

For prediction

The source code and input files can be found in the 'QTG2_prediction' folder. Running the 'QTG_Finder_predict.py' will require a QTL gene list provided by the user.

  1. Users can prepare the QTL gene list as a single column table (.csv). See "SV_height_QTL_example.csv" or "AT_Seedsize_QTL_example.csv" for a example.
//
QTL1 name
Gene1 in QTL1
Gene2 in QTL1
Gene3 in QTL1
//
QTL2 name
Gene1 in QTL2
Gene2 in QTL2
Gene3 in QTL2
  1. The pre-calculated models can be downloaded from the following links:
    Arabidopsis: https://carnegiedpb.s3.amazonaws.com/software/QTG2_prediction/AT_model.dat.zip
    Rice: https://carnegiedpb.s3.amazonaws.com/software/QTG2_prediction/OS_model.dat.zip
    Sorghum: https://carnegiedpb.s3.amazonaws.com/software/QTG2_prediction/SB_model.dat.zip
    Setaria: https://carnegiedpb.s3.amazonaws.com/software/QTG2_prediction/SV_model.dat.zip

  2. Unzip the pre-calculated models in working directory: ./QTG2_prediction
    Example:

jar xvf AT_model.dat.zip
  1. Usage: “QTG_Finder_predict.py -gl QTL_gene_list -sp species_abbreviation"
    QTL_gene_list: this is the list of QTL genes to be ranked. See "SV_height_QTL_example.csv" for a example
    species_abbreviation: "AT" for Arabidopsis; "OS" for rice; "SB" for sorghum;"SV" for Setaria viridis
    As a example,
python QTG_Finder_predict.py -gl SV_height_QTL_example.csv -sp 'SV'

For help,

python QTG_Finder_predict.py -h
  1. “QTL_gene_rank.csv” will be the output file.

For analyses and replications

The source code and input files for cross-validation, feature importance analysis, literature validation and category analysis can be found in the 'QTG2_analysis' folder. The usage of each scripts (.py) is described at the beginning of them.

About

QTG_Finder, a QTL causal gene prioritization tool

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published