# DNA Motif

A sequence motif is a recurring nucleotide or amino-acid sequence pattern. In proteins, a motif can be formed by the three-dimensional arrangement of amino acids that are not necessarily adjacent in the sequence. Biopython provides a separate module, `Bio.motifs`, for working with sequence motifs.

In [1]:
from Bio import motifs

## Creating Simple DNA Motifs

In [4]:
from Bio import motifs
from Bio.Seq import Seq
DNA_Motif = [
    Seq("AGCT"),
    Seq("TCGA"),
    Seq("AACT")
]
seq = motifs.create(DNA_Motif)
print(seq)

AGCT
TCGA
AACT


In [5]:
# Show how often each nucleotide occurs at each position
print(seq.counts)

        0      1      2      3
A:   2.00   1.00   0.00   1.00
C:   0.00   1.00   2.00   0.00
G:   0.00   1.00   1.00   0.00
T:   1.00   0.00   0.00   2.00



In [6]:
# Counts of A at each position (one row of the matrix)
seq.counts["A", :]

(2.0, 1.0, 0.0, 1.0)

In [None]:
# To access a column of counts
seq.counts[:, 3]

{'A': 1.0, 'C': 0.0, 'G': 0.0, 'T': 2.0}
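To make the counts matrix concrete, here is a minimal plain-Python sketch that rebuilds the same numbers without Biopython. The helper name `count_matrix` is hypothetical and is not part of `Bio.motifs`; it only illustrates the idea of counting each letter column by column, assuming all sequences have the same length.

```python
from collections import Counter


def count_matrix(sequences, alphabet="ACGT"):
    """Count how often each letter occurs at each position (illustrative helper)."""
    lengths = {len(s) for s in sequences}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    # zip(*sequences) yields one tuple of letters per motif position
    columns = [Counter(column) for column in zip(*sequences)]
    return {
        letter: tuple(float(column[letter]) for column in columns)
        for letter in alphabet
    }


counts = count_matrix(["AGCT", "TCGA", "AACT"])

# One row: counts of A at every position
row_a = counts["A"]
print(row_a)

# One column: counts of every letter at position 3
column_3 = {letter: row[3] for letter, row in counts.items()}
print(column_3)
```

The row and column printed here correspond to `seq.counts["A", :]` and `seq.counts[:, 3]` in the cells above.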

## JASPAR Database

JASPAR is an open-access database of transcription factor binding profiles and one of the most widely used motif databases. `Bio.motifs` can read and write motifs in the JASPAR formats and use them to scan sequences. JASPAR also stores meta-information for each motif, so `Bio.motifs` contains a specialized class, `jaspar.Motif`, that represents this meta-information as attributes.

Its attributes include:
- `matrix_id`: Unique JASPAR motif ID
- `name`: Name of the motif
- `tf_family`: Family to which the motif belongs, e.g. Helix-Loop-Helix
- `data_type`: Type of data used to construct the motif
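As a rough sketch of how these attributes fit together, a `jaspar.Motif` can be built by hand and inspected. The matrix ID `MA0000.0`, the name `ToyMotif`, and the `tf_family` and `data_type` values below are placeholders chosen for illustration, not a real JASPAR record; the counts reuse the toy motif from the previous section. The keyword arguments assume the constructor signature of recent Biopython releases.

```python
from Bio.motifs import jaspar

# Position counts of the toy motif built from AGCT, TCGA and AACT
toy_counts = {
    "A": [2, 1, 0, 1],
    "C": [0, 1, 2, 0],
    "G": [0, 1, 1, 0],
    "T": [1, 0, 0, 2],
}

# Placeholder metadata: none of these values come from the JASPAR database
m = jaspar.Motif(
    matrix_id="MA0000.0",
    name="ToyMotif",
    counts=toy_counts,
    tf_family="Helix-Loop-Helix",
    data_type="SELEX",
)

print(m.matrix_id)
print(m.name)
print(m.tf_family)
print(m.data_type)
print(m.counts["A"])
```

In practice these attributes are usually filled in when motifs are loaded from JASPAR rather than typed by hand; fields that the source does not provide stay `None`.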