# Cross-Species Gene Symbol Mapping

The majority of prior knowledge in OmniPath {cite:p}`omnipath`
(and similar databases) is based on human data and thus uses
human gene symbols.

However, gene homology can be used to convert these symbols to those
of other organisms using `decoupler`.
To achieve this, it uses orthology tables extracted from the HCOP
database {cite:p}`hcop`.
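
Conceptually, such a conversion amounts to joining the resource with an orthology table on the human symbol. The sketch below illustrates the idea with `pandas` on a tiny hand-written table. The `orthologs` frame, its column names, the placeholder gene `FAKE1`, and the `map_symbols` helper are all hypothetical; they do not reflect how `decoupler` stores or queries HCOP data.

```python
import pandas as pd

# Hypothetical toy resource that uses human gene symbols.
# FAKE1 is a made-up placeholder with no ortholog in the toy table.
net = pd.DataFrame(
    {
        "genesymbol": ["ABL1", "ACTB", "ZAP70", "FAKE1"],
        "pathway": ["Cell cycle", "Axon guidance", "T cell activation", "Example"],
    }
)

# Hypothetical human-to-mouse orthology table (assumed layout, not HCOP's).
orthologs = pd.DataFrame(
    {
        "human_symbol": ["ABL1", "ACTB", "ZAP70"],
        "ortholog_symbol": ["Abl1", "Actb", "Zap70"],
    }
)


def map_symbols(net, orthologs, column="genesymbol"):
    """Replace human symbols in `column` with orthologs using an inner join."""
    merged = net.merge(
        orthologs, left_on=column, right_on="human_symbol", how="inner"
    )
    # Overwrite the human symbol with the ortholog, then drop helper columns.
    merged[column] = merged["ortholog_symbol"]
    merged = merged.drop(columns=["human_symbol", "ortholog_symbol"])
    return merged.reset_index(drop=True)


mapped = map_symbols(net, orthologs)
print(mapped)
```

Because the join is an inner join, rows whose gene has no entry in the orthology table are dropped, which is one way genes can be lost during conversion.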

The following organisms are currently supported.

In [1]:
import decoupler as dc

dc.op.show_organisms()

['anole_lizard',
 'c.elegans',
 'cat',
 'cattle',
 'chicken',
 'chimpanzee',
 'dog',
 'fruitfly',
 'horse',
 'macaque',
 'mouse',
 'opossum',
 'pig',
 'platypus',
 'rat',
 's.cerevisiae',
 's.pombe',
 'xenopus',
 'zebrafish']

To demonstrate this functionality, the SIGNOR database,
which uses human gene symbols, is loaded first.

In [2]:
net = dc.op.resource("SIGNOR")
net

Unnamed: 0,genesymbol,pathway
0,ABL1,Cell cycle: G2/M phase transition
1,ACE,Focal segmental glomerulosclerosis
2,ACTB,Axon guidance
3,ACTN1,Axon guidance
4,ACTN1,Glutamatergic synapse
...,...,...
2943,YY1,NOTCH Signaling
2944,ZAP70,T cell activation
2945,ZAP70,P38 Signaling


The obtained resource can easily be converted to mouse gene symbols.

In [3]:
m_net = dc.op.translate(
    net,
    target_organism="mouse",
)
m_net

Unnamed: 0,genesymbol,pathway
0,Abl1,Cell cycle: G2/M phase transition
1,Ace,Focal segmental glomerulosclerosis
2,Actb,Axon guidance
3,Actg1,Axon guidance
4,Actn1,Axon guidance
...,...,...
1436,Yy1,NOTCH Signaling
1437,Zap70,T cell activation
1438,Zap70,P38 Signaling
1439,Zbtb16,Acute Myeloid Leukemia


<div class="alert alert-info">
    
**Note**

Homology conversion may gain or lose genes when mapping between organisms,
because a human gene can have several orthologs in the target organism or none at all.
Adjust the `one_to_many` parameter to make the behavior more or less strict.

</div>
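
To make the effect of such a threshold concrete, the following sketch filters a toy orthology table by the number of orthologs per human gene. It assumes the threshold acts as a cap on how many orthologs a single human gene may have, which may differ from the exact semantics of `one_to_many` in `decoupler`. The `filter_one_to_many` helper, the `max_targets` argument, and the table layout are hypothetical.

```python
import pandas as pd

# Hypothetical human-to-fruit-fly orthology pairs, for illustration only.
orthologs = pd.DataFrame(
    {
        "human_symbol": ["ABL1", "ACE", "ACE", "ACE", "ACE", "YY1", "YY1"],
        "ortholog_symbol": [
            "Abl", "Ance", "Acer", "Ance-2", "Ance-3", "pho", "phol",
        ],
    }
)


def filter_one_to_many(orthologs, max_targets):
    """Keep only human genes that map to at most `max_targets` orthologs."""
    # Count distinct orthologs per human gene, aligned to the original rows.
    n_targets = orthologs.groupby("human_symbol")["ortholog_symbol"].transform(
        "nunique"
    )
    return orthologs[n_targets <= max_targets].reset_index(drop=True)


# Strict: only unambiguous one-to-one mappings survive.
strict = filter_one_to_many(orthologs, max_targets=1)
# Lenient: genes with up to five orthologs are kept and expanded.
lenient = filter_one_to_many(orthologs, max_targets=5)

print(strict)
print(lenient)
```

A strict cap loses genes with ambiguous mappings, while a lenient cap keeps them at the cost of duplicating rows for every ortholog.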

Next, the conversion is performed for the fruit fly.

In [4]:
f_net = dc.op.translate(
    net,
    target_organism="fruitfly",
)
f_net

Unnamed: 0,genesymbol,pathway
0,Abl,Cell cycle: G2/M phase transition
1,Ance,Focal segmental glomerulosclerosis
2,Acer,Focal segmental glomerulosclerosis
3,Ance-2,Focal segmental glomerulosclerosis
4,Ance-3,Focal segmental glomerulosclerosis
...,...,...
1024,yki,Hippo Signaling
1025,14-3-3zeta,SAPK/JNK Signaling
1026,pho,NOTCH Signaling
1027,phol,NOTCH Signaling


Additionally, all database functions in `decoupler` directly accept the `organism` parameter,
which runs `decoupler.op.translate` under the hood.

In [5]:
dc.op.resource("SIGNOR", organism="zebrafish")

Unnamed: 0,genesymbol,pathway
0,abl1,Cell cycle: G2/M phase transition
1,ace,Focal segmental glomerulosclerosis
2,actb2,Axon guidance
3,actb1,Axon guidance
4,actn1,Axon guidance
...,...,...
1981,zap70,T cell activation
1982,zap70,P38 Signaling
1983,zbtb16a,Acute Myeloid Leukemia
1984,zbtb16b,Acute Myeloid Leukemia


In [6]:
dc.op.progeny(organism="anole_lizard")

Unnamed: 0,source,target,weight,padj
0,Androgen,tmprss2,11.490631,2.384806e-47
1,Androgen,nkx3-1,10.622551,2.205102e-44
2,Androgen,mboat2,10.472733,4.632376e-44
3,Androgen,SLC38A4,7.363805,1.253071e-39
4,Androgen,mtmr9,6.130646,2.534403e-38
...,...,...,...,...
43409,p53,enpp2,2.771405,4.993215e-02
43410,p53,arrdc4,3.494328,4.996747e-02
43411,p53,myo1b,-1.148057,4.997905e-02
43412,p53,ctsc,-1.784693,4.998864e-02


In [7]:
dc.op.collectri(organism="s.cerevisiae")

Unnamed: 0,source,target,weight,resources,references,sign_decision
0,SPT15,TOA1,1.0,ExTRI,10078202;10523649;10581267;10617594;10675336;1...,default activation
1,TOA1,SPT15,1.0,TRRUST,12818428,default activation
2,MOT1,SPT15,-1.0,TRRUST,14988402;15509807;16858867;20627952,PMID
3,SPT15,MOT1,1.0,ExTRI,10082549,default activation
4,MCM1,MCM1,1.0,ExTRI;NTNU.Curated,10330138;10602487;15531578;17629633;8663310;92...,PMID
...,...,...,...,...,...,...
362,COY1,RDI1,1.0,Pavlidis2021,19635798,PMID
363,COY1,POL1,1.0,Pavlidis2021,12438259;12665598;18347061,PMID
364,HAP2,SSA4,1.0,Pavlidis2021,24041570,PMID
365,HAP3,SSA4,1.0,Pavlidis2021,24041570,PMID
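
The delegation pattern behind the `organism` argument can be sketched as a loader that translates its result only when a non-human organism is requested. This is a minimal illustration under stated assumptions, not the actual `decoupler` implementation: `TOY_ORTHOLOGS`, `toy_translate`, and `toy_resource` are hypothetical names, and the data is a small hand-written subset.

```python
import pandas as pd

# Hypothetical ortholog lookups keyed by organism name (toy data only).
TOY_ORTHOLOGS = {
    "mouse": {"ABL1": ["Abl1"], "ZAP70": ["Zap70"], "ZBTB16": ["Zbtb16"]},
    "zebrafish": {
        "ABL1": ["abl1"],
        "ZAP70": ["zap70"],
        "ZBTB16": ["zbtb16a", "zbtb16b"],
    },
}


def toy_translate(net, target_organism, column="genesymbol"):
    """Map `column` through a toy lookup, expanding one-to-many orthologs."""
    table = TOY_ORTHOLOGS[target_organism]
    out = net.copy()
    # Each gene becomes a list of orthologs (empty if none is known).
    out[column] = out[column].map(lambda gene: table.get(gene, []))
    # One row per ortholog; genes without orthologs become NaN and are dropped.
    out = out.explode(column).dropna(subset=[column])
    return out.reset_index(drop=True)


def toy_resource(organism="human"):
    """Load a toy human resource, translating it if another organism is requested."""
    net = pd.DataFrame(
        {
            "genesymbol": ["ABL1", "ZAP70", "ZBTB16"],
            "pathway": [
                "Cell cycle", "T cell activation", "Acute Myeloid Leukemia",
            ],
        }
    )
    if organism == "human":
        return net
    return toy_translate(net, target_organism=organism)


print(toy_resource(organism="zebrafish"))
```

Keeping translation in a single helper means every loader behaves the same way for a given organism, and the explicit two-step call remains available when the translation settings need adjusting.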
