# pyJASPAR Notebook

Once you have installed pyJASPAR, you can load the module and connect to the latest release of JASPAR.

In [78]:
from pyjaspar import jaspardb

Connect to the version of JASPAR you're interested in. This returns a `jaspardb` class object.
For example, here we connect to JASPAR2020.

In [54]:
jdb_obj = jaspardb(release='JASPAR2020')

You can also check which JASPAR release you are connected to:

In [55]:
print(jdb_obj.release)

JASPAR2020


By default, the connection uses the latest release of the JASPAR database. For example:

In [71]:
jdb_obj = jaspardb()
print(jdb_obj.release)

JASPAR2022


### Get available releases
You can list the available JASPAR releases using:

In [56]:
print(jdb_obj.get_releases())

['JASPAR2022', 'JASPAR2020', 'JASPAR2018', 'JASPAR2016', 'JASPAR2014']
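
Since the releases come back as plain strings, a requested release can be checked against them before connecting. The following is a minimal sketch, not part of pyJASPAR: `resolve_release` is a hypothetical helper, the list is copied from the output above, and the "newest" fallback assumes every name is `JASPAR` followed by a year.

```python
# Release names copied from the get_releases() output shown above.
AVAILABLE = ['JASPAR2022', 'JASPAR2020', 'JASPAR2018', 'JASPAR2016', 'JASPAR2014']

def resolve_release(requested, available):
    """Hypothetical helper: validate a release name before connecting.

    With requested=None, fall back to the newest release, assuming
    every name has the form 'JASPAR<year>'.
    """
    if requested is None:
        return max(available, key=lambda name: int(name[len('JASPAR'):]))
    if requested not in available:
        raise ValueError(f"{requested!r} is not one of {available}")
    return requested

print(resolve_release('JASPAR2020', AVAILABLE))
print(resolve_release(None, AVAILABLE))
```

The returned string could then be passed as the `release` argument when creating the `jaspardb` object.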


### Get motif by using JASPAR ID
To get the motif details for a specific TF, use its JASPAR ID. If you omit the version of the motif, the latest version is returned.

In [57]:
motif = jdb_obj.fetch_motif_by_id('MA0006.1')
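
A JASPAR ID such as `MA0006.1` combines a base ID with a version after the dot. As a rough illustration of that structure, the sketch below splits the two parts; `split_matrix_id` is a hypothetical helper written for this note, not a pyJASPAR function.

```python
def split_matrix_id(matrix_id):
    """Split 'MA0006.1' into ('MA0006', 1).

    The version is None when the ID is given without one.
    """
    base_id, sep, version = matrix_id.partition('.')
    return base_id, (int(version) if sep else None)

print(split_matrix_id('MA0006.1'))
print(split_matrix_id('MA0006'))
```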

Printing the motif shows all the associated meta-information stored in the JASPAR database, including the matrix counts.

In [58]:
print(motif)

TF name	Ahr::Arnt
Matrix ID	MA0006.1
Collection	CORE
TF class	['Basic helix-loop-helix factors (bHLH)', 'Basic helix-loop-helix factors (bHLH)']
TF family	['PAS domain factors', 'PAS domain factors']
Species	10090
Taxonomic group	vertebrates
Accession	['P30561', 'P53762']
Data type used	SELEX
Medline	7592839
Comments	dimer
Matrix:
        0      1      2      3      4      5
A:   3.00   0.00   0.00   0.00   0.00   0.00
C:   8.00   0.00  23.00   0.00   0.00   0.00
G:   2.00  23.00   0.00  23.00   0.00  24.00
T:  11.00   1.00   1.00   1.00  24.00   0.00





Get the count matrix using `.counts`. Indexing it by nucleotide returns the counts for that base at each position:

In [59]:
print(motif.counts['A'])

[3.0, 0.0, 0.0, 0.0, 0.0, 0.0]
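
To make the count matrix more concrete, the sketch below copies the four rows of MA0006.1 from the printed output into a plain dictionary and derives per-position frequencies and a simple consensus from them. The helper names `to_frequencies` and `consensus` are illustrative only, and the code assumes every column has a non-zero total.

```python
# Count rows for MA0006.1, copied from the matrix printed above.
counts = {
    'A': [3.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    'C': [8.0, 0.0, 23.0, 0.0, 0.0, 0.0],
    'G': [2.0, 23.0, 0.0, 23.0, 0.0, 24.0],
    'T': [11.0, 1.0, 1.0, 1.0, 24.0, 0.0],
}

def to_frequencies(counts):
    """Divide each count by its column total (assumes totals are non-zero)."""
    length = len(counts['A'])
    totals = [sum(counts[base][i] for base in 'ACGT') for i in range(length)]
    return {base: [counts[base][i] / totals[i] for i in range(length)]
            for base in 'ACGT'}

def consensus(counts):
    """Pick the base with the highest count at each position."""
    length = len(counts['A'])
    return ''.join(max('ACGT', key=lambda base: counts[base][i])
                   for i in range(length))

freqs = to_frequencies(counts)
print(freqs['A'][0])      # fraction of A at position 0
print(consensus(counts))
```

Ties between bases are resolved in `ACGT` order here, which is a simplification.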


### Search motifs by TF name
You can use the `fetch_motifs_by_name` function to find motifs by TF name. This method returns a list of all motifs with the same TF name across taxonomic groups. For example, the search below returns two CTCF motifs: one for vertebrates (MA0139.1) and one for insects (MA0531.1).

In [60]:
motifs = jdb_obj.fetch_motifs_by_name("CTCF")

In [61]:
print(len(motifs))

2
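
When a name search returns motifs from several taxonomic groups, it can help to sort them by group. The sketch below does this on stand-in objects so it runs without a database connection. It assumes each motif exposes `matrix_id` and `tax_group` attributes; `tax_group` is not shown in this notebook, and the second stand-in's group is a placeholder.

```python
from collections import defaultdict
from types import SimpleNamespace

def group_by_tax_group(motifs):
    """Map each taxonomic group to the matrix IDs found in it."""
    groups = defaultdict(list)
    for motif in motifs:
        # Assumption: motifs carry `tax_group` and `matrix_id` attributes.
        groups[motif.tax_group].append(motif.matrix_id)
    return dict(groups)

# Stand-ins for the two CTCF results; the second group is a placeholder.
stand_ins = [
    SimpleNamespace(matrix_id='MA0139.1', tax_group='vertebrates'),
    SimpleNamespace(matrix_id='MA0531.1', tax_group='placeholder-group'),
]
print(group_by_tax_group(stand_ins))
```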


In [62]:
print(motifs)

TF name	CTCF
Matrix ID	MA0139.1
Collection	CORE
TF class	['C2H2 zinc finger factors']
TF family	['More than 3 adjacent zinc finger factors']
Species	9606
Taxonomic group	vertebrates
Accession	['P49711']
Data type used	ChIP-seq
Medline	17512414
Matrix:
        0      1      2      3      4      5      6      7      8      9     10     11     12     13     14     15     16     17     18
A:  87.00 167.00 281.00  56.00   8.00 744.00  40.00 107.00 851.00   5.00 333.00  54.00  12.00  56.00 104.00 372.00  82.00 117.00 402.00
C: 291.00 145.00  49.00 800.00 903.00  13.00 528.00 433.00  11.00   0.00   3.00  12.00   0.00   8.00 733.00  13.00 482.00 322.00 181.00
G:  76.00 414.00 449.00  21.00   0.00  65.00 334.00  48.00  32.00 903.00 566.00 504.00 890.00 775.00   5.00 507.00 307.00  73.00 266.00
T: 459.00 187.00 134.00  36.00   2.00  91.00  11.00 324.00  18.00   3.00   9.00 341.00   8.00  71.00  67.00  17.00  37.00 396.00  59.00



TF name	CTCF
Matrix ID	MA0531.1
Collection	CORE
TF class	['C2H2 z

### Search motifs with specific criteria
A more commonly used function is `fetch_motifs`, which returns the motifs that match a specified set of criteria.
You can query the database using any of the meta-information it stores.

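For example, here we are getting the widely used CORE collection. The commented-out `tax_group` and `data_type` arguments show how the query could be narrowed further, for instance to vertebrates. It returns a list of non-redundant motifs.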

In [76]:
motifs = jdb_obj.fetch_motifs(
    collection=['CORE'],
    # tax_group=['vertebrates', 'plants'],
    # data_type=['ChIP-seq'],
)

In [77]:
print(len(motifs))

1915
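
When the criteria come from user input or a config file, one option is to collect only the filters that are actually set and pass them on as keyword arguments. This is a minimal sketch: `build_criteria` is a hypothetical helper, limited to the three argument names that appear in the call above, and the final call is left as a comment so the block runs without a database connection.

```python
def build_criteria(collection=None, tax_group=None, data_type=None):
    """Collect the filters that are set into a keyword dictionary.

    Single strings are wrapped in a list, matching the list-valued
    arguments used in the call above.
    """
    raw = {'collection': collection, 'tax_group': tax_group,
           'data_type': data_type}
    criteria = {}
    for key, value in raw.items():
        if value is None:
            continue  # leave unset filters out entirely
        criteria[key] = [value] if isinstance(value, str) else list(value)
    return criteria

criteria = build_criteria(collection='CORE', tax_group=['vertebrates'])
print(criteria)
# With a live connection this could be passed on as:
# motifs = jdb_obj.fetch_motifs(**criteria)
```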


In [66]:
for motif in motifs:
    print(motif.matrix_id)  # do something with the motif

MA0139.1
MA0142.1
MA0149.1
MA0138.2
MA0002.2
MA0065.2
MA0146.2
MA0527.1
MA0523.1
MA0521.1
MA0520.1
MA0519.1
MA0518.1
MA0467.1
MA0468.1
MA0476.1
MA0478.1
MA0479.1
MA0480.1
MA0483.1
MA0488.1
MA0489.1
MA0492.1
MA0493.1
MA0494.1
MA0497.1
MA0501.1
MA0503.1
MA0504.1
MA0505.1
MA0506.1
MA0507.1
MA0513.1
MA0514.1
MA0515.1
MA0517.1
MA0076.2
MA0258.2
MA0050.2
MA0150.2
MA0137.3
MA0144.2
MA0140.2
MA0095.2
MA0591.1
MA0593.1
MA0595.1
MA0596.1
MA0597.1
MA0599.1
MA0852.2
MA0036.3
MA1106.1
MA0147.3
MA0100.3
MA0104.4
MA1109.1
MA0161.2
MA0060.3
MA1110.1
MA1111.1
MA0014.3
MA1114.1
MA1115.1
MA1116.1
MA1117.1
MA1118.1
MA1119.1
MA0442.2
MA1120.1
MA1121.1
MA1122.1
MA0750.2
MA0103.3
MA1124.1
MA1125.1
MA1508.1
MA1513.1
MA1522.1
MA1573.1
MA1579.1
MA1581.1
MA1583.1
MA1585.1
MA1587.1
MA1588.1
MA1589.1
MA1593.1
MA1594.1
MA1596.1
MA1597.1
MA1599.1
MA1601.1
MA1602.1
MA1603.1
MA1604.1
MA1606.1
MA1607.1
MA1608.1
MA1615.1
MA1616.1
MA1618.1
MA1619.1
MA1620.1
MA1621.1
MA1622.1
MA1623.1
MA1624.1
MA1625.1
MA1627.1
MA1628.1
M