# sgrna_designer

> Python library to design sgRNAs for CRISPR tiling screens

The primary function of this package is `design_sgrna_tiling_library`, which takes a list of
Ensembl transcript IDs and a region of interest (e.g. `three_prime_UTR`) and returns all sgRNAs
tiling those transcript regions.

## Install

`pip install git+https://github.com/gpp-rnd/sgrna_designer.git#egg=sgrna_designer`

## An example

In this example we'll design sgRNAs tiling the 3' UTR of PDL1 (CD274) and BRAF.

**Note**: You must also have [pandas installed](https://pandas.pydata.org/pandas-docs/stable/getting_started/install.html)
to run this tutorial.

```python
from sgrna_designer.design import design_sgrna_tiling_library

target_transcripts = ['ENST00000381577', 'ENST00000644969']  # [PDL1, BRAF]
```

Note that the design function is agnostic to CRISPR enzyme and PAM preferences, so you must specify the
following parameters in a design run:
* region_parent: broad region you are trying to target (e.g. UTR)
* region: more specific region you are trying to target (e.g. three_prime_UTR)
* expand_3prime: amount to expand region in 3' direction
* expand_5prime: amount to expand region in 5' direction
* context_len: length of context sequence
* pam_start: position of PAM start relative to the context sequence
* pam_len: length of PAM
* sgrna_start: position of sgRNA relative to context sequence
* sgrna_len: length of sgRNA sequence
* pams: PAMs to target
* sg_positions: positions within the sgRNA to annotate and target
(e.g. [4,8] for nucleotides 4 and 8 of the sgRNA for a base editing window)
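As a rough illustration of how the sequence parameters relate to one another, the sketch below slices the first context sequence from the example output using the values passed in this tutorial. The indexing convention (a 0-based `sgrna_start` and a negative `pam_start` counted from the end of the context) is inferred from that output and is an assumption, not a description of the library's internals.

```python
# Values taken from the example design run in this tutorial.
context = 'CATTGGAACTTCTGATCTTCAAGCAGGGAT'  # first context_sequence in the output
context_len = 30
pam_start, pam_len = -6, 3
sgrna_start, sgrna_len = 4, 20

# Assumption: a negative pam_start counts back from the end of the context.
pam_begin = context_len + pam_start if pam_start < 0 else pam_start
pam = context[pam_begin:pam_begin + pam_len]

# Assumption: sgrna_start is a 0-based offset into the context.
sgrna = context[sgrna_start:sgrna_start + sgrna_len]

print(pam)    # AGG
print(sgrna)  # GGAACTTCTGATCTTCAAGC
```

With these settings the sgRNA is immediately followed by the PAM, which matches an NGG-style design with four nucleotides of upstream context and three nucleotides of downstream context.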

```python
sgrna_designs = design_sgrna_tiling_library(target_transcripts, region_parent='UTR',
                                            region='three_prime_UTR', expand_3prime=30,
                                            expand_5prime=30, context_len=30, pam_start=-6,
                                            pam_len=3, sgrna_start=4, sgrna_len=20,
                                            pams=['AGG', 'CGG', 'TGG', 'GGG'],
                                            sg_positions=[4, 8],
                                            flag_seqs=['TTTT', 'CGTCTC', 'GAGACG'],
                                            flag_seqs_start=['TCTC', 'AGACG'],
                                            flag_seqs_end=['GAGAC'])
sgrna_designs
```

|     | context_sequence | pam_sequence | sgrna_sequence | sgrna_global_start | sgrna_global_4 | sgrna_global_8 | sgrna_strand | object_type | transcript_strand | transcript_id | chromosome | region_id | region_start | region_end |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | CATTGGAACTTCTGATCTTCAAGCAGGGAT | AGG | GGAACTTCTGATCTTCAAGC | 5467872 | 5467875 | 5467879 | 1 | three_prime_UTR | 1 | ENST00000381577 | 9 | ENST00000381577 | 5467863 | 5470554 |
| 1 | ATTGGAACTTCTGATCTTCAAGCAGGGATT | GGG | GAACTTCTGATCTTCAAGCA | 5467873 | 5467876 | 5467880 | 1 | three_prime_UTR | 1 | ENST00000381577 | 9 | ENST00000381577 | 5467863 | 5470554 |
| 2 | CTTCAAGCAGGGATTCTCAACCTGTGGTTT | TGG | AAGCAGGGATTCTCAACCTG | 5467888 | 5467891 | 5467895 | 1 | three_prime_UTR | 1 | ENST00000381577 | 9 | ENST00000381577 | 5467863 | 5470554 |
| 3 | GCAGGGATTCTCAACCTGTGGTTTAGGGGT | AGG | GGATTCTCAACCTGTGGTTT | 5467894 | 5467897 | 5467901 | 1 | three_prime_UTR | 1 | ENST00000381577 | 9 | ENST00000381577 | 5467863 | 5470554 |
| 4 | CAGGGATTCTCAACCTGTGGTTTAGGGGTT | GGG | GATTCTCAACCTGTGGTTTA | 5467895 | 5467898 | 5467902 | 1 | three_prime_UTR | 1 | ENST00000381577 | 9 | ENST00000381577 | 5467863 | 5470554 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 845 | GCTCAGGTCCCTTCATTTGTACTTTGGAGT | TGG | AGGTCCCTTCATTTGTACTT | 140719570 | 140719567 | 140719563 | -1 | three_prime_UTR | -1 | ENST00000644969 | 7 | ENST00000644969 | 140719337 | 140726493 |
| 846 | TATAACAGAAAATATTGTTCAGTTTGGATA | TGG | ACAGAAAATATTGTTCAGTT | 140719522 | 140719519 | 140719515 | -1 | three_prime_UTR | -1 | ENST00000644969 | 7 | ENST00000644969 | 140719337 | 140726493 |
| 847 | ATTGTTCAGTTTGGATAGAAAGCATGGAGA | TGG | TTCAGTTTGGATAGAAAGCA | 140719509 | 140719506 | 140719502 | -1 | three_prime_UTR | -1 | ENST00000644969 | 7 | ENST00000644969 | 140719337 | 140726493 |
| 848 | TATTTAAAAACTGTATTATATAAAAGGCAA | AGG | TAAAAACTGTATTATATAAA | 140719426 | 140719423 | 140719419 | -1 | three_prime_UTR | -1 | ENST00000644969 | 7 | ENST00000644969 | 140719337 | 140726493 |
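
The `sgrna_global_4` and `sgrna_global_8` columns appear to follow from `sgrna_global_start`, `sgrna_strand`, and the 1-based entries of `sg_positions`. The sketch below reproduces that relationship for one plus-strand and one minus-strand row of the output above. The helper `global_position` is hypothetical and written only for illustration; it is not part of `sgrna_designer`, and the formula is inferred from the rows shown rather than taken from the library's source.

```python
def global_position(sgrna_global_start, sgrna_strand, sg_position):
    """Hypothetical helper: genomic coordinate of a 1-based sgRNA position.

    Assumes coordinates increase along the plus strand (strand = 1) and
    decrease along the minus strand (strand = -1).
    """
    return sgrna_global_start + sgrna_strand * (sg_position - 1)


# Row 0 of the example output (plus strand, PDL1).
plus_4 = global_position(5467872, 1, 4)
plus_8 = global_position(5467872, 1, 8)

# Row 845 of the example output (minus strand, BRAF).
minus_4 = global_position(140719570, -1, 4)
minus_8 = global_position(140719570, -1, 8)

print(plus_4, plus_8)    # 5467875 5467879
print(minus_4, minus_8)  # 140719567 140719563
```

These values match the `sgrna_global_4` and `sgrna_global_8` entries reported for those two rows.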
