### Get sequence context around a site
SHEPHARD can extract the local sequence context around a site automatically, and it handles end effects for sites that lie close to either terminus of the sequence.

Here, we use `find_string_positions()` to find every arginine residue in a sequence, add a site at each of those positions, and then extract the sequence context around each site.

In the demo below, we examine the sequence context around each arginine in a fragment of the protein [TDP-43](https://www.uniprot.org/uniprotkb/Q13148/entry).

In [4]:
import shephard
from shephard.proteome import Proteome
from shephard.tools import sequence_tools

In [9]:
# define empty proteome
new_prot = Proteome([])

# add a protein
seq = 'KGISVHISNAEPKHNSNRQLERSGRFGGNPGGFGNQGGFGNSRGGGAGLGNNQGSNMGGGMNFGAFSINPAMMAAAQAALQSSWGMMGMLASQQNQSGPSGNNQNQGNMQREPNQAFGSGNNSYSGSNSGAAIGWGSASNAGSGSGFNGGFGSSMDSKSSGWGM'
new_prot.add_protein(seq, 'fragment_TDP43', 'seq_001')

# save protein object as variable
p = new_prot.protein('seq_001')

# add a site to that protein for every arginine residue
for i in sequence_tools.find_string_positions('R', p.sequence):
    p.add_site(i, 'interest_site_R', value=0.23)

In [12]:
# get sequence around all sites
print('Local sequence context of R sites:')
[s.get_local_sequence_context(offset=4) for s in p.sites]

Local sequence context of R sites:


['HNSNRQLER', 'RQLERSGRF', 'ERSGRFGGN', 'FGNSRGGGA', 'GNMQREPNQ']
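
To make the end-effect handling concrete, here is a minimal pure-Python sketch of the idea. It is not SHEPHARD's implementation: the helper `local_context()` and the toy sequence are hypothetical, and the sketch assumes 1-indexed positions and that the window is simply clipped at either terminus.

```python
def local_context(sequence, position, offset=5):
    """Return the residues within `offset` of a 1-indexed `position`.

    Hypothetical helper for illustration only; the window is clipped
    when it would run past either end of the sequence.
    """
    if not 1 <= position <= len(sequence):
        raise ValueError("position must lie within the sequence")

    start = max(position - offset, 1)             # clip at the N-terminus
    end = min(position + offset, len(sequence))   # clip at the C-terminus

    # convert the 1-indexed, inclusive window into a Python slice
    return sequence[start - 1:end]


# toy sequence with an arginine at each end and one in the middle
toy = "RAAAARAAAAR"

# 1-indexed positions of every arginine
positions = [i + 1 for i, aa in enumerate(toy) if aa == "R"]

contexts = [local_context(toy, pos, offset=2) for pos in positions]
print(contexts)
```

In this sketch the middle arginine yields a full window of `2 * offset + 1` residues, while the two terminal arginines yield shorter, one-sided windows instead of raising an error.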