Implementation algorithm

Alex Warwick Vesztrocy, Christophe Dessimoz, Henning Redestig, Prioritising candidate genes causing QTL using hierarchical orthologous groups, *Bioinformatics*, Volume 34, Issue 17, 01 September 2018, Pages i612–i619, https://doi.org/10.1093/bioinformatics/bty615

In [1]:
import qtlsearch
import pandas as pd
from IPython.display import Image, SVG, display

# SPARQL endpoints, in order: pbg-ld, OMA and UniProt
search = qtlsearch.SEARCH(
    "http://pbg-ld.candygene-nlesc.surf-hosted.nl:8890/sparql",
    "http://sparql.omabrowser.org/sparql",
    "https://sparql.uniprot.org/sparql")

## Brix, Soluble Solids, Sugars

GO-terms: `GO:0006094` `GO:0046370` `GO:0046369` `GO:0005985` `GO:0015770`

QTL from: Chromosome `9`, around `3474710`

Candidate: `Lin5` (`Solyc09g010080`)

Define the QTL interval as a window of `d` positions on either side of the QTL position, then retrieve the genes that lie within it.

In [2]:
d = 100000  # half-width of the window around the QTL position
intervalT = search.make_interval(
    "http://localhost:8890/genome/Solanum_lycopersicum/chromosome/9",
    3474710 - d,
    3474710 + d)

# genes located within the interval
genesT = search.interval_genes(intervalT)
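
The interval is the QTL position widened by `d` on each side. As a minimal, library-independent sketch of that arithmetic: the helper `window_around` is hypothetical and not part of `qtlsearch`, and the 1-based coordinates and the optional clamping to the chromosome length are assumptions made for illustration.

```python
def window_around(position, half_width, chromosome_length=None):
    """Return (start, end) of a window centred on `position`.

    Hypothetical helper, not part of qtlsearch. Coordinates are assumed
    to be 1-based; the end is clamped only if a chromosome length is given.
    """
    if half_width < 0:
        raise ValueError("half_width must be non-negative")
    start = max(1, position - half_width)
    end = position + half_width
    if chromosome_length is not None:
        end = min(end, chromosome_length)
    return start, end


start, end = window_around(3474710, 100000)
print(start, end)  # 3374710 3574710
```

With the values used here the window does not touch either chromosome end, so the clamping has no effect and the bounds match `3474710-d` and `3474710+d`.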

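Collect the GO annotations: for each GO term listed above, fetch the term together with its child terms and combine the results into one table.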

In [3]:
qtls = [genesT.index]
go_terms = ["GO:0006094", "GO:0046370", "GO:0046369", "GO:0005985", "GO:0015770"]
go_annotations = pd.concat([search.get_child_annotations(term) for term in go_terms])
print(go_annotations)

                                                                                       label
go_annotation                                                                               
http://purl.obolibrary.org/obo/GO_0006094                                    gluconeogenesis
http://purl.obolibrary.org/obo/GO_0046370                      fructose biosynthetic process
http://purl.obolibrary.org/obo/GO_1901358        beta-D-galactofuranose biosynthetic process
http://purl.obolibrary.org/obo/GO_0046369                     galactose biosynthetic process
http://purl.obolibrary.org/obo/GO_0019574       sucrose catabolic process via 3'-ketosucrose
http://purl.obolibrary.org/obo/GO_0061705  sucrose catabolic process to fructose-6-phosph...
http://purl.obolibrary.org/obo/GO_0005987                          sucrose catabolic process
http://purl.obolibrary.org/obo/GO_0061704                    glycolytic process from sucrose
http://purl.obolibrary.org/obo/GO_0036008  sucrose catabolic process t
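
The index of this table holds OBO PURLs rather than the `GO:` identifiers passed in. The two forms differ only in prefix and separator, which the following sketch makes explicit. The function names `go_id_to_iri` and `iri_to_go_id` are hypothetical helpers written for illustration; they are not part of `qtlsearch`.

```python
import re

OBO_PREFIX = "http://purl.obolibrary.org/obo/"
GO_ID_RE = re.compile(r"GO:([0-9]{7})")


def go_id_to_iri(go_id):
    """Convert a GO identifier such as 'GO:0006094' to its OBO PURL."""
    match = GO_ID_RE.fullmatch(go_id)
    if match is None:
        raise ValueError(f"not a GO identifier: {go_id!r}")
    return f"{OBO_PREFIX}GO_{match.group(1)}"


def iri_to_go_id(iri):
    """Convert an OBO PURL back to the 'GO:nnnnnnn' form."""
    if not iri.startswith(OBO_PREFIX + "GO_"):
        raise ValueError(f"not a GO PURL: {iri!r}")
    return "GO:" + iri[len(OBO_PREFIX) + len("GO_"):]


for term in ["GO:0006094", "GO:0046370", "GO:0046369", "GO:0005985", "GO:0015770"]:
    print(term, "->", go_id_to_iri(term))
```

Mapping the index back with `iri_to_go_id` can make it easier to see which rows are child terms (for example `GO:0005987`) rather than one of the five terms requested.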

Get data and do computations

In [4]:
result = qtlsearch.QTLSEARCH(search, qtls, go_annotations)

=== GET DATA ===
Search for Solyc09g009900.2
- root is http://omabrowser.org/ontology/oma#GROUP_185572
- tree of groups fetched: 59
- proteins within tree fetched: 71
- uniprot proteins within tree fetched: 65
  * check proteins (1-50): 0 with required annotation
  * check proteins (51-71): 0 with required annotation
- checked 65 uniprot proteins: 0 with required annotation
- checked 71 proteins: 0 linked to uniprot with required annotation
Search for Solyc09g009910.2
- root is http://omabrowser.org/ontology/oma#GROUP_433737
- tree of groups fetched: 8197
- proteins within tree fetched: 10000
- uniprot proteins within tree fetched: 9759
  * check proteins (1-50): 0 with required annotation
  * check proteins (51-100): 0 with required annotation
  * check proteins (101-150): 0 with required annotation
  * check proteins (151-200): 0 with required annotation
  * check proteins (201-250): 0 with required annotation
  * check proteins (251-300): 0 with required anno

  * check proteins (6451-6500): 0 with required annotation
  * check proteins (6501-6550): 0 with required annotation
  * check proteins (6551-6600): 0 with required annotation
  * check proteins (6601-6650): 0 with required annotation
  * check proteins (6651-6700): 0 with required annotation
  * check proteins (6701-6750): 0 with required annotation
  * check proteins (6751-6800): 0 with required annotation
  * check proteins (6801-6850): 0 with required annotation
  * check proteins (6851-6900): 0 with required annotation
  * check proteins (6901-6950): 0 with required annotation
  * check proteins (6951-7000): 0 with required annotation
  * check proteins (7001-7050): 0 with required annotation
  * check proteins (7051-7100): 0 with required annotation
  * check proteins (7101-7150): 0 with required annotation
  * check proteins (7151-7200): 0 with required annotation
  * check proteins (7201-7250): 0 with required annotation
  * check proteins (7251-7300): 0 with required annotati

  * check proteins (1-3): 0 with required annotation
- checked 3 uniprot proteins: 0 with required annotation
- checked 3 proteins: 0 linked to uniprot with required annotation
Search for Solyc09g009980.1
- root is http://omabrowser.org/ontology/oma#GROUP_174450
- tree of groups fetched: 33
- proteins within tree fetched: 53
- uniprot proteins within tree fetched: 43
  * check proteins (1-50): 0 with required annotation
- checked 43 uniprot proteins: 0 with required annotation
- checked 53 proteins: 0 linked to uniprot with required annotation
Search for Solyc09g009990.2
- root is http://omabrowser.org/ontology/oma#GROUP_181480
- tree of groups fetched: 48
- proteins within tree fetched: 72
- uniprot proteins within tree fetched: 55
  * check proteins (1-50): 0 with required annotation
  * check proteins (51-72): 0 with required annotation
- checked 55 uniprot proteins: 0 with required annotation
- checked 72 proteins: 0 linked to uniprot with required annotation
Se

  * check proteins (1351-1400): 0 with required annotation
  * check proteins (1401-1450): 0 with required annotation
  * check proteins (1451-1500): 0 with required annotation
  * check proteins (1501-1550): 0 with required annotation
  * check proteins (1551-1600): 0 with required annotation
  * check proteins (1601-1650): 0 with required annotation
  * check proteins (1651-1700): 0 with required annotation
  * check proteins (1701-1750): 0 with required annotation
  * check proteins (1751-1800): 0 with required annotation
  * check proteins (1801-1850): 0 with required annotation
  * check proteins (1851-1900): 0 with required annotation
  * check proteins (1901-1950): 0 with required annotation
  * check proteins (1951-2000): 0 with required annotation
  * check proteins (2001-2050): 0 with required annotation
- checked 2021 uniprot proteins: 0 with required annotation
- checked 2453 proteins: 0 linked to uniprot with required annotation
Search for Solyc09g010110.2
- root i

  * check proteins (2951-3000): 0 with required annotation
  * check proteins (3001-3050): 0 with required annotation
  * check proteins (3051-3100): 0 with required annotation
  * check proteins (3101-3150): 0 with required annotation
  * check proteins (3151-3200): 0 with required annotation
  * check proteins (3201-3250): 0 with required annotation
  * check proteins (3251-3300): 0 with required annotation
  * check proteins (3301-3350): 0 with required annotation
  * check proteins (3351-3400): 0 with required annotation
  * check proteins (3401-3450): 0 with required annotation
  * check proteins (3451-3500): 0 with required annotation
  * check proteins (3501-3550): 0 with required annotation
  * check proteins (3551-3600): 0 with required annotation
  * check proteins (3601-3650): 0 with required annotation
  * check proteins (3651-3700): 0 with required annotation
  * check proteins (3701-3750): 0 with required annotation
  * check proteins (3751-3800): 0 with required annotati

  * check proteins (9901-9950): 0 with required annotation
  * check proteins (9951-10000): 0 with required annotation
- checked 9993 uniprot proteins: 0 with required annotation
- checked 10001 proteins: 0 linked to uniprot with required annotation
=== COMPUTATIONS ===
Compute for Solyc09g009900.2
Compute for Solyc09g009910.2
Compute for Solyc09g009920.1
Compute for Solyc09g009930.1
Compute for Solyc09g009940.2
Compute for Solyc09g009950.2
Compute for Solyc09g009960.2
Compute for Solyc09g009970.2
Compute for Solyc09g009980.1
Compute for Solyc09g009990.2
Compute for Solyc09g010000.2
Compute for Solyc09g010010.1
Compute for Solyc09g010020.2
Compute for Solyc09g010030.1
Compute for Solyc09g010040.1
Compute for Solyc09g010050.1
Compute for Solyc09g010060.2
Compute for Solyc09g010070.1
Compute for Solyc09g010080.2
Compute for Solyc09g010090.2
Compute for Solyc09g010100.2
Compute for Solyc09g010110.2
Compute for Solyc09g010120.2
Compute for Solyc09g010130.2
Compute for Solyc09g01014
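
The log is long because every batch of proteins gets its own line, but the outcome per gene is on the `- checked N uniprot proteins` line. A minimal sketch for condensing captured output into one entry per gene follows. It assumes the log text has been captured as a list of lines and that the wording matches the excerpt above; `summarise_log` is a hypothetical helper, not a `qtlsearch` function.

```python
import re

# Patterns follow the wording of the log excerpt; they are assumptions
# about its format, not a documented interface.
SEARCH_RE = re.compile(r"Search for ([\w.]+)")
CHECKED_RE = re.compile(
    r"- checked (\d+) uniprot proteins: (\d+) with required annotation")


def summarise_log(lines):
    """Return {gene: (uniprot_proteins_checked, proteins_with_annotation)}."""
    summary = {}
    gene = None
    for line in lines:
        match = SEARCH_RE.search(line)
        if match:
            gene = match.group(1)  # start of a new per-gene section
            continue
        match = CHECKED_RE.search(line)
        if match and gene is not None:
            summary[gene] = (int(match.group(1)), int(match.group(2)))
    return summary


log = """\
Search for Solyc09g009900.2
- root is http://omabrowser.org/ontology/oma#GROUP_185572
- uniprot proteins within tree fetched: 65
  * check proteins (1-50): 0 with required annotation
  * check proteins (51-71): 0 with required annotation
- checked 65 uniprot proteins: 0 with required annotation
- checked 71 proteins: 0 linked to uniprot with required annotation
Search for Solyc09g009980.1
- checked 43 uniprot proteins: 0 with required annotation
""".splitlines()

for gene, (checked, annotated) in summarise_log(log).items():
    print(f"{gene}: {annotated} of {checked} UniProt proteins annotated")
```

Genes whose section is cut off before the `- checked` line are simply absent from the summary.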

Create report

In [5]:
report_list = result.report()
for report in report_list:
    display(report)

gene              p_initial  p_final   protein
Solyc09g010090.2  0.034483   0.193381  https://omabrowser.org/oma/info/SOLLC31606
Solyc09g010080.2  0.034483   0.164374  https://omabrowser.org/oma/info/SOLLC31605
Solyc09g010020.2  0.034483   0.043606  https://omabrowser.org/oma/info/SOLLC31599
Solyc09g010040.1  0.034483   0.043606  https://omabrowser.org/oma/info/SOLLC31601
Solyc09g010000.2  0.034483   0.037065  https://omabrowser.org/oma/info/SOLLC31597
Solyc09g009900.2  0.034483   0.034483  https://omabrowser.org/oma/info/SOLLC31587
Solyc09g009910.2  0.034483   0.034483  https://omabrowser.org/oma/info/SOLLC31588
Solyc09g009920.1  0.034483   0.034483  https://omabrowser.org/oma/info/SOLLC31589
Solyc09g009930.1  0.034483   0.034483  https://omabrowser.org/oma/info/SOLLC31590
Solyc09g009940.2  0.034483   0.034483  https://omabrowser.org/oma/info/SOLLC31591
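
To read the report, it helps to compare each `p_final` with the shared `p_initial`. The sketch below copies a few rows from the report by hand and ranks them. Two points are interpretation rather than documented behaviour: that the identical `p_initial` values reflect a uniform prior over the genes in the interval (1/29 ≈ 0.034483), and that the ratio `p_final / p_initial` is a useful way to express the change.

```python
# Values copied from the report above (subset of rows).
p_initial = 0.034483
p_final = {
    "Solyc09g010090.2": 0.193381,
    "Solyc09g010080.2": 0.164374,  # Lin5, the known candidate
    "Solyc09g010020.2": 0.043606,
    "Solyc09g010000.2": 0.037065,
    "Solyc09g009900.2": 0.034483,
}

# Assumption: a uniform prior, so 1 / p_initial estimates the gene count.
n_genes = round(1 / p_initial)
print("genes implied by a uniform prior:", n_genes)

# Rank genes by final probability, highest first.
ranked = sorted(p_final.items(), key=lambda item: item[1], reverse=True)
for rank, (gene, p) in enumerate(ranked, start=1):
    print(f"{rank}. {gene}  p_final={p:.6f}  ratio={p / p_initial:.2f}")

candidate_rank = [gene for gene, _ in ranked].index("Solyc09g010080.2") + 1
print("rank of Lin5 (Solyc09g010080.2):", candidate_rank)
```

In this report the known candidate `Lin5` ranks second, just behind the neighbouring gene `Solyc09g010090.2`, while genes with no supporting annotation keep their initial value.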
