# pyJASPAR Notebook

Once you have installed pyJASPAR, you can load the module and connect to the latest release of JASPAR.

In [22]:
from pyjaspar import jaspardb

Connect to the release of JASPAR you're interested in. This returns a `jaspardb` class object.
For example, here we connect to JASPAR2024.

In [23]:
jdb_obj = jaspardb(release='JASPAR2024')

You can also check which JASPAR release you are connected to using:

In [24]:
print(jdb_obj.release)

JASPAR2024


By default, it is set to the latest release of the JASPAR database. For example:

In [25]:
jdb_obj = jaspardb()
print(jdb_obj.release)

JASPAR2024


### Get available releases
You can list the available JASPAR releases using:

In [26]:
print(jdb_obj.get_releases())

['JASPAR2024', 'JASPAR2022', 'JASPAR2020', 'JASPAR2018', 'JASPAR2016', 'JASPAR2014']
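
Because the release names follow a `JASPAR<year>` pattern, a small amount of plain Python is enough to pick the newest release or to validate a name before connecting. The sketch below works on the list printed above; `release_year` and `check_release` are hypothetical helpers written for this illustration and are not part of pyJASPAR.

```python
# Release names copied from the output above.
releases = ['JASPAR2024', 'JASPAR2022', 'JASPAR2020',
            'JASPAR2018', 'JASPAR2016', 'JASPAR2014']


def release_year(release_name):
    """Return the year encoded in a name such as 'JASPAR2024' (hypothetical helper)."""
    prefix = "JASPAR"
    if not release_name.startswith(prefix):
        raise ValueError(f"unexpected release name: {release_name!r}")
    return int(release_name[len(prefix):])


def check_release(release_name, available):
    """Raise a readable error if the requested release is not in the list (hypothetical helper)."""
    if release_name not in available:
        raise ValueError(
            f"{release_name!r} is not available; choose one of {sorted(available)}"
        )
    return release_name


# Pick the newest release by its year rather than relying on list order.
newest = max(releases, key=release_year)
print(newest)
print(check_release('JASPAR2020', releases))
```

In a live session the list would come from `jdb_obj.get_releases()` and the checked name would be passed on as `jaspardb(release=...)`.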


### Get motif by using JASPAR ID
To get the motif details for a specific TF, use its JASPAR ID. If you omit the version of the motif, the latest version is returned.

In [27]:
motif = jdb_obj.fetch_motif_by_id('MA0006.1')
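
A JASPAR ID such as `MA0006.1` combines a base ID and a version number separated by a dot. As a rough illustration of that structure, the sketch below splits an ID into its two parts; `split_matrix_id` is a hypothetical helper written for this example, not a pyJASPAR function.

```python
def split_matrix_id(matrix_id):
    """Split 'MA0006.1' into ('MA0006', 1); the version is None if absent."""
    base_id, sep, version = matrix_id.partition('.')
    if not base_id:
        raise ValueError("empty matrix ID")
    if not sep:
        # No dot in the ID: only the base ID was given.
        return base_id, None
    return base_id, int(version)


with_version = split_matrix_id('MA0006.1')
without_version = split_matrix_id('MA0006')
print(with_version)
print(without_version)
```

A check like this can help decide whether a stored ID pins a specific version or will resolve to whatever version is the latest in the connected release.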

Printing the motif shows all the associated meta-information stored in the JASPAR database, including the matrix counts.

In [28]:
print(motif)

TF name	Ahr::Arnt
Matrix ID	MA0006.1
Collection	CORE
TF class	['Basic helix-loop-helix factors (bHLH)', 'Basic helix-loop-helix factors (bHLH)']
TF family	['PAS domain factors', 'PAS domain factors']
Species	10090
Taxonomic group	vertebrates
Accession	['P30561', 'P53762']
Data type used	SELEX
Medline	7592839
Comments	dimer
Matrix:
        0      1      2      3      4      5
A:   3.00   0.00   0.00   0.00   0.00   0.00
C:   8.00   0.00  23.00   0.00   0.00   0.00
G:   2.00  23.00   0.00  23.00   0.00  24.00
T:  11.00   1.00   1.00   1.00  24.00   0.00





Access the count matrix through `.counts`; indexing it by nucleotide returns that row of counts.

In [29]:
print(motif.counts['A'])

[3.0, 0.0, 0.0, 0.0, 0.0, 0.0]
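
Counts are often converted to per-position frequencies or summarized as a consensus sequence. The sketch below does both with plain Python on the MA0006.1 counts printed above. The dictionary is a stand-in for `motif.counts`, the helper names `to_frequencies` and `consensus` are made up for this example, and the code assumes every position has a non-zero total.

```python
# Count matrix for MA0006.1, copied from the printed motif above.
counts = {
    'A': [3.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    'C': [8.0, 0.0, 23.0, 0.0, 0.0, 0.0],
    'G': [2.0, 23.0, 0.0, 23.0, 0.0, 24.0],
    'T': [11.0, 1.0, 1.0, 1.0, 24.0, 0.0],
}


def to_frequencies(count_matrix):
    """Divide each count by its column total (assumes no all-zero column)."""
    bases = sorted(count_matrix)
    length = len(count_matrix[bases[0]])
    totals = [sum(count_matrix[b][i] for b in bases) for i in range(length)]
    return {
        b: [count_matrix[b][i] / totals[i] for i in range(length)]
        for b in bases
    }


def consensus(count_matrix):
    """Take the most frequent base at each position; ties go to the first base alphabetically."""
    bases = sorted(count_matrix)
    length = len(count_matrix[bases[0]])
    return ''.join(
        max(bases, key=lambda b: count_matrix[b][i]) for i in range(length)
    )


frequencies = to_frequencies(counts)
print(frequencies['A'])
print(consensus(counts))
```

This simple majority rule ignores IUPAC ambiguity codes, so a position with two nearly equal bases still reports only one of them.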


### Search motifs by TF name
You can use the `fetch_motifs_by_name` function to find motifs by TF name. This method returns a list of motifs with the same TF name across taxonomic groups. For example, the search below returns four CTCF motifs, including one from insects and one from vertebrates.

In [12]:
motifs = jdb_obj.fetch_motifs_by_name("CTCF")

In [13]:
print(len(motifs))

4


In [14]:
print(motifs)

TF name	CTCF
Matrix ID	MA0531.2
Collection	CORE
TF class	['C2H2 zinc finger factors']
TF family	['More than 3 adjacent zinc fingers']
Species	7227
Taxonomic group	insects
Accession	['Q9VS55']
Data type used	ChIP-chip
Medline	17616980
Matrix:
        0      1      2      3      4      5      6      7      8      9
A: 257.00 1534.00 202.00 987.00   2.00   0.00   2.00 124.00   1.00  79.00
C: 714.00   1.00   0.00   0.00   4.00   0.00   0.00 1645.00   0.00 1514.00
G:  87.00 192.00 1700.00 912.00 311.00 1902.00 1652.00   3.00 1807.00   8.00
T: 844.00 175.00   0.00   3.00 1585.00   0.00 248.00 130.00  94.00 301.00



TF name	CTCF
Matrix ID	MA0139.2
Collection	CORE
TF class	['C2H2 zinc finger factors']
TF family	['More than 3 adjacent zinc fingers']
Species	9606
Taxonomic group	vertebrates
Accession	['P49711']
Data type used	ChIP-seq
Medline	17512414
Comments	TF has several motif variants.
Matrix:
        0      1      2      3      4      5      6      7      8      9     10     11     12    

### Search motifs with 
A more commonly used function is `fetch_motifs`, which returns motifs that match a specified set of criteria.
You can query the database using any of the available meta-information.

For example, here we get the widely used CORE collection for vertebrates. It returns a list of non-redundant motifs.

In [15]:
motifs = jdb_obj.fetch_motifs(
    collection = ['CORE'],
    tax_group = ['Vertebrates'],
    all_versions = False,
)

In [16]:
print(len(motifs))

879


In [25]:
for motif in motifs:
    #print(motif.matrix_id)
    pass # do something with the motif
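
One common thing to do inside such a loop is to build lookup tables. The sketch below indexes motifs by matrix ID and groups them by taxonomic group. It uses `SimpleNamespace` stand-ins filled with values printed earlier in this notebook so that it runs without a database connection. The attribute names `name` and `tax_group` are assumptions about the motif objects; only `matrix_id` appears in the code above.

```python
from collections import defaultdict
from types import SimpleNamespace

# Stand-in objects; in a live session these would be the fetched motifs.
# Attribute names other than matrix_id are assumptions for illustration.
motifs = [
    SimpleNamespace(matrix_id='MA0531.2', name='CTCF', tax_group='insects'),
    SimpleNamespace(matrix_id='MA0139.2', name='CTCF', tax_group='vertebrates'),
    SimpleNamespace(matrix_id='MA0006.1', name='Ahr::Arnt', tax_group='vertebrates'),
]

# Direct lookup of a motif by its matrix ID.
by_id = {motif.matrix_id: motif for motif in motifs}

# Matrix IDs collected per taxonomic group.
by_tax_group = defaultdict(list)
for motif in motifs:
    by_tax_group[motif.tax_group].append(motif.matrix_id)

print(by_id['MA0139.2'].name)
print(dict(by_tax_group))
```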

Get the number of non-redundant motifs in the CORE collection for each release.

In [17]:
for release in jdb_obj.get_releases():
    print(release)
    # Use a separate connection so jdb_obj stays on its current release.
    release_db = jaspardb(release=release)
    motifs = release_db.fetch_motifs(
        collection = ["CORE"],
        all_versions = False,
        #species = '10090' # this is the mouse tax ID
    )
    print(len(motifs))

JASPAR2024
2346
JASPAR2022
1956
JASPAR2020
1646
JASPAR2018
1404
JASPAR2016
1082
JASPAR2014
593
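
To see how the CORE collection has changed from one release to the next, the counts printed above can be differenced directly. This is a small arithmetic sketch on the numbers copied from that output; the values are net changes in the non-redundant count, not the number of newly added profiles.

```python
# Non-redundant CORE motif counts, copied from the output above.
core_counts = {
    'JASPAR2014': 593,
    'JASPAR2016': 1082,
    'JASPAR2018': 1404,
    'JASPAR2020': 1646,
    'JASPAR2022': 1956,
    'JASPAR2024': 2346,
}

# Names share a prefix and a four-digit year, so sorting them is chronological.
ordered = sorted(core_counts)

# Net change in the count relative to the previous release.
growth = {}
for previous, current in zip(ordered, ordered[1:]):
    growth[current] = core_counts[current] - core_counts[previous]

for release, delta in growth.items():
    print(f"{release}: {delta:+d}")
```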
