In [1]:
from alignment import HSP, HSPVertex, Alignment, Transcript

1. Examples of HSP usage: init, other methods
2. Explanation of HSPVertex
3. Alignment: parsing, making, methods
4. Transcript: connection to an alignment, methods
5. Real-world case: BLAST (fetching example sequences, output formatting, plotting)

## HSP

HSP stands for high-scoring segment pair in BLAST terminology. It represents an ungapped alignment block of two sequences (i.e. no gaps in both sequences at the same time). A simple `HSP` instance can be created from the coordinates of that block in both sequences and the raw alignment score.

In [2]:
HSP(qstart=10, qend=20, sstart=15, send=27, score=20)

HSP(10, 20, 15, 27, 20, **{})

Additional information can be passed as keyword arguments (`kwargs`).

In [3]:
HSP(qstart=10, qend=20, sstart=15, send=27, score=20, gaps=5, gapopens=2, mismatch=7)

HSP(10, 20, 15, 27, 20, **{'gaps': 5, 'gapopens': 2, 'mismatch': 7})

### HSP attributes

In [4]:
hsp = HSP(qstart=10, qend=20, sstart=15, send=27, score=20, gaps=5, gapopens=2, mismatch=7)

The attributes passed at initialization can be accessed directly.

In [5]:
hsp.qstart, hsp.qend, hsp.sstart, hsp.send, hsp.score

(10, 20, 15, 27, 20)

Extra keyword arguments are stored in the `kwargs` dict.

In [6]:
hsp.kwargs

{'gaps': 5, 'gapopens': 2, 'mismatch': 7}

Both sequences in an HSP have a strand: + or -. The query sequence is always on the + strand. The subject is on the + strand if the subject end is greater than the subject start, and on the - strand otherwise. The strands are stored as `qstrand` and `sstrand`, which are `True` for + and `False` for -.

In [7]:
hsp.qstrand, hsp.sstrand

(True, True)

An HSP has an orientation: it is direct if both the query and subject strands are +, and reverse otherwise.

In [8]:
hsp.orientation

'direct'
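The strand and orientation rules above can be sketched in plain Python. This is only an illustration under stated assumptions: `strands_and_orientation` is a hypothetical helper, not part of the `alignment` module, and it derives both strands from a strict "end greater than start" comparison.

```python
def strands_and_orientation(qstart, qend, sstart, send):
    """Hypothetical helper illustrating the rules above (not the library code)."""
    # Assumption: a sequence is + stranded when its end is greater than its start.
    qstrand = qend > qstart
    sstrand = send > sstart
    # Direct only if both strands are +, reverse otherwise.
    orientation = 'direct' if (qstrand and sstrand) else 'reverse'
    return qstrand, sstrand, orientation


print(strands_and_orientation(10, 20, 15, 27))  # (True, True, 'direct')
print(strands_and_orientation(22, 25, 36, 30))  # (True, False, 'reverse')
```

The second call swaps the subject start and end, which is enough to flip the subject strand and the orientation.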

### HSP representation

The `repr` of an HSP is a string that can be used to restore the original object.

In [9]:
repr(hsp)

"HSP(10, 20, 15, 27, 20, **{'gaps': 5, 'gapopens': 2, 'mismatch': 7})"
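To show why a `repr` of this shape can restore an object, here is a minimal sketch with a stand-in class. `MiniHSP` is hypothetical and only mimics the constructor signature shown above. It is not the real `HSP` implementation, and the `__eq__` method is added purely so the round trip can be checked.

```python
class MiniHSP:
    """Hypothetical stand-in for HSP, used only to illustrate the repr round trip."""

    def __init__(self, qstart, qend, sstart, send, score, **kwargs):
        self.qstart, self.qend = qstart, qend
        self.sstart, self.send = sstart, send
        self.score = score
        self.kwargs = kwargs

    def __repr__(self):
        # Positional coordinates and score, then the extra keyword arguments as a dict.
        return (f"MiniHSP({self.qstart}, {self.qend}, {self.sstart}, "
                f"{self.send}, {self.score}, **{self.kwargs!r})")

    def __eq__(self, other):
        # Assumption: two instances are equal when all their attributes match.
        return vars(self) == vars(other)


original = MiniHSP(10, 20, 15, 27, 20, gaps=5, gapopens=2, mismatch=7)
restored = eval(repr(original))  # evaluate the repr string as a constructor call
print(restored == original)      # True
```

Using `eval` is fine for a demonstration like this one, but it should not be applied to strings from untrusted sources.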

The string representation of an HSP is a human-readable summary with the query and subject coordinates, score and orientation.

In [10]:
str(hsp)

'q 10:20 s 15:27 score 20 direct'

### HSP methods

#### Precedence

Precedence is defined for two HSPs that have the same orientation and do not intersect on the query sequence or on the subject sequence. Take `hsp1` and `hsp2` as an example. `hsp1` precedes `hsp2` on the query sequence if the end of `hsp1` is less than the start of `hsp2`. On the subject sequence the same rule applies when both HSPs are direct; when both are reverse, the rule is the opposite. If the orientations are inconsistent, precedence is undefined. `hsp1` precedes `hsp2` only if it does so on both the query and the subject sequences.

In [11]:
hsp1 = HSP(qstart=10, qend=20, sstart=15, send=27, score=20)
hsp2 = HSP(qstart=22, qend=25, sstart=30, send=36, score=10)
hsp1.precede(hsp2)

True

The method does not distinguish between succeeding and intersecting: it returns `False` in both cases.

In [12]:
hsp3 = HSP(qstart=17, qend=25, sstart=30, send=36, score=11)

In [13]:
hsp1.precede(hsp3), hsp2.precede(hsp1)

(False, False)

A `ValueError` is raised if the orientations are inconsistent.

In [14]:
hsp4 = HSP(qstart=22, qend=25, sstart=36, send=30, score=10)
hsp1.precede(hsp4)

ValueError: HSPs have inconsistent orientations: direct for q 10:20 s 15:27 score 20 direct and reverse for q 22:25 s 36:30 score 10 reverse.
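The precedence rule can be sketched on plain coordinate tuples. This is a hypothetical illustration, not the library's `precede` method: `precedes` and `_orientation` are made-up names, and the reverse-orientation branch encodes one reading of "the rule is opposite" (subject coordinates decrease, so the first HSP's subject end must be greater than the second's subject start).

```python
def _orientation(hsp):
    """Orientation of a (qstart, qend, sstart, send) tuple; hypothetical helper."""
    qstart, qend, sstart, send = hsp
    return 'direct' if (qend > qstart and send > sstart) else 'reverse'


def precedes(a, b):
    """Return True if HSP tuple `a` precedes HSP tuple `b` on both sequences."""
    if _orientation(a) != _orientation(b):
        raise ValueError('HSPs have inconsistent orientations')
    # Query: the end of a must be less than the start of b.
    on_query = a[1] < b[0]
    if _orientation(a) == 'direct':
        on_subject = a[3] < b[2]
    else:
        # Assumption: for reverse HSPs the subject comparison is flipped.
        on_subject = a[3] > b[2]
    return on_query and on_subject


hsp1 = (10, 20, 15, 27)
hsp2 = (22, 25, 30, 36)
hsp3 = (17, 25, 30, 36)
print(precedes(hsp1, hsp2))                        # True
print(precedes(hsp1, hsp3), precedes(hsp2, hsp1))  # False False
```

The tuples reuse the coordinates from the cells above, so the printed values line up with the outputs of `hsp1.precede(hsp2)`, `hsp1.precede(hsp3)` and `hsp2.precede(hsp1)`.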

#### Copy

An HSP can be copied to a new instance.

In [15]:
hsp.copy()

HSP(10, 20, 15, 27, 20, **{'gaps': 5, 'gapopens': 2, 'mismatch': 7})

#### Interplay with dictionary representation

An HSP can be converted to a dictionary representation, which is useful for saving to JSON. The full state of the HSP is dumped, so HSPs can be handled either as Python objects or as regular dictionaries.

In [16]:
hsp.to_dict()

{'qstart': 10,
 'qend': 20,
 'sstart': 15,
 'send': 27,
 'score': 20,
 'kwargs': {'gaps': 5, 'gapopens': 2, 'mismatch': 7},
 'qstrand': True,
 'sstrand': True,
 'orientation': 'direct'}

HSP can be restored from dictionary representation.

In [17]:
HSP.from_dict(hsp.to_dict())

HSP(10, 20, 15, 27, 20, **{'gaps': 5, 'gapopens': 2, 'mismatch': 7})
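As a sketch of the JSON use case, the dictionary form can be passed through the standard `json` module and back. The dictionary below is copied by hand from the `to_dict()` output above so that the example does not depend on the `alignment` module; the variable names are illustrative only.

```python
import json

# Dictionary copied from the to_dict() output shown above.
hsp_dict = {
    'qstart': 10,
    'qend': 20,
    'sstart': 15,
    'send': 27,
    'score': 20,
    'kwargs': {'gaps': 5, 'gapopens': 2, 'mismatch': 7},
    'qstrand': True,
    'sstrand': True,
    'orientation': 'direct',
}

dumped = json.dumps(hsp_dict)   # serialize to a JSON string
restored = json.loads(dumped)   # parse it back into a dict
print(restored == hsp_dict)     # True
```

Because the dictionary holds only integers, booleans, strings and a nested dict with string keys, it survives the JSON round trip unchanged and could then be handed to `HSP.from_dict`.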

#### Distance