# **Bioinformatics with Jupyter Notebooks for WormBase:**
## **Analyses 3 - ePCR**
Welcome to the seventh Jupyter notebook in the WormBase tutorial series. Over this series of tutorials, we write Python code that retrieves data available on the WormBase sites and performs simple analyses with it.

This tutorial covers running In-Silico PCR (ePCR) on your data locally. (The binaries used here run on Linux only.)
Let's get started!

We start by importing the required Python libraries and installing one program locally: BLAT (from the UCSC Genome Browser), whose download also includes In-Silico PCR.

In-Silico PCR searches a sequence database with a pair of PCR primers, using an indexing strategy for fast performance. We can download the command-line tools gfPcr and isPcr, which behave the same as the web version of In-Silico PCR.
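
To make the idea of searching a sequence with a primer pair concrete, here is a toy sketch in plain Python. It is not the isPcr algorithm: it assumes exact matches, searches one strand only, and uses no index. The function names and the short template sequence are made up for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def naive_in_silico_pcr(template, fwd_primer, rev_primer, max_size=5000):
    """Return (start, end, amplicon) for each exact-match product.

    Toy illustration only: exact matches, forward orientation, no indexing.
    Coordinates are 0-based, end-exclusive Python slice positions.
    """
    template_uc = template.upper()
    fwd = fwd_primer.upper()
    # The reverse primer anneals to the opposite strand, so on the forward
    # strand we look for its reverse complement.
    rev_rc = reverse_complement(rev_primer.upper())
    products = []
    start = template_uc.find(fwd)
    while start != -1:
        rev_pos = template_uc.find(rev_rc, start + len(fwd))
        while rev_pos != -1:
            end = rev_pos + len(rev_rc)
            if end - start > max_size:
                break  # any later site would give an even longer product
            products.append((start, end, template[start:end]))
            rev_pos = template_uc.find(rev_rc, rev_pos + 1)
        start = template_uc.find(fwd, start + 1)
    return products


# Made-up template: flank + forward site + insert + reverse site + flank.
template = "TTTT" + "ACGTACGGAT" + "ccccgggg" + "TTAGGCATCA" + "AAAA"
fwd_primer = "ACGTACGGAT"
rev_primer = reverse_complement("TTAGGCATCA")  # primers are written 5'->3'
products = naive_in_silico_pcr(template, fwd_primer, rev_primer)
for start, end, amplicon in products:
    print(start, end, len(amplicon), amplicon)
```

A real tool additionally tolerates mismatches away from the 3' end, checks both orientations, and indexes the genome so that it does not have to be scanned once per primer.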

In [1]:
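import os
import itertools
import Bio
from Bio import SeqIO
import math
import pandas as pd

# Route the notebook's "!" shell commands through os.system.
# Each shell cell then displays the exit status of its last command (0 means success).
get_ipython().system = os.system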

#### Install BLAT, start the server so it is ready to receive queries, and use isPcr to send an In-Silico PCR query

Download the binaries for the BLAT program, which also contain In-Silico PCR

In [2]:
!mkdir -p BLAT
!rsync -a rsync://hgdownload.soe.ucsc.edu/genome/admin/exe/linux.x86_64/blat/ BLAT/
!chmod +x BLAT/isPcr BLAT/gfPcr BLAT/blat

0

Download the .2bit genome file for C. elegans (assembly ce11)

In [3]:
!wget https://hgdownload.soe.ucsc.edu/goldenPath/ce11/bigZips/ce11.2bit

0

Start the gfServer program on the downloaded .2bit file so that it can answer PCR queries on port 1234

In [4]:
!BLAT/gfServer start 127.0.0.1 1234 -stepSize=5 ce11.2bit &

0
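
Because the server is started in the background, the cell returns immediately, possibly before the genome has been indexed and the port is open. The sketch below is one way to wait for it, using only the standard library. The helper name `wait_for_port` and its timeout values are assumptions for illustration; the host and port are the ones used in the command above.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Poll until a TCP connection to host:port succeeds or timeout expires.

    Hypothetical helper: it only checks that something is listening on the
    port, not that the listener is the BLAT server or that it is healthy.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


# Short timeout for demonstration; indexing a genome may need much longer.
ready = wait_for_port("127.0.0.1", 1234, timeout=3.0, interval=0.5)
print("server is listening" if ready else "no server on port 1234 yet")
```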

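Run an ePCR query for the example file ePCR_example.txt, which lists one primer pair per line (name, forward primer, reverse primer). Here isPcr searches the ce11.2bit file directly and writes the products to ePCR.fasta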

In [5]:
!BLAT/isPcr -maxSize=5000 -minPerfect=5 -minGood=10 ce11.2bit ePCR_example.txt ePCR.fasta

0
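
The contents of ePCR_example.txt are not shown in this notebook. As a sketch of how such a query file could be built, the code below writes one primer pair per line as name, forward primer, and reverse primer. The three pairs are copied from the isPcr output shown later in this notebook. The tab separator, the helper name `write_primer_file`, and the file name `my_primers.txt` are assumptions.

```python
from pathlib import Path

# Primer pairs copied from the isPcr output headers shown in this notebook.
primer_pairs = [
    ("assay_1", "CGATAAACAATCAACGGCATAAT", "TTTGAAACTGATATAGAGGGGCA"),
    ("assay_2", "AAGGTTATTTATGCGGTGGAAAT", "AGCACTTTGAGCTTGATGAAATC"),
    ("assay_3", "AGATTGGAACGATAACGCAGATA", "TTTGCCAATTTGCATTTTATTTT"),
]


def write_primer_file(pairs, path):
    """Write one 'name<TAB>forward<TAB>reverse' line per primer pair."""
    lines = [f"{name}\t{fwd}\t{rev}" for name, fwd, rev in pairs]
    Path(path).write_text("\n".join(lines) + "\n")
    return path


# Hypothetical file name, chosen so the original example file is not overwritten.
query_file = write_primer_file(primer_pairs, "my_primers.txt")
print(Path(query_file).read_text())
```

The resulting file can then be passed to isPcr in place of ePCR_example.txt.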

#### Parse the output of the In-Silico program and generate readable output

In [6]:
# Read all records up front; passing the file name lets Biopython open and close the file.
fasta_sequences = list(SeqIO.parse('ePCR.fasta', 'fasta'))

In [7]:
for sequence in fasta_sequences:
    print(sequence)
    print('\n')

ID: chrX:10782951+10784138
Name: chrX:10782951+10784138
Description: chrX:10782951+10784138 assay_1 1188bp CGATAAACAATCAACGGCATAAT TTTGAAACTGATATAGAGGGGCA
Number of features: 0
Seq('CGATAAACAATCAACGGCATAATtatcagaattgagctcgaaagcttaaactcc...AAA')


ID: chrV:10296804+10298994
Name: chrV:10296804+10298994
Description: chrV:10296804+10298994 assay_2 2191bp AAGGTTATTTATGCGGTGGAAAT AGCACTTTGAGCTTGATGAAATC
Number of features: 0
Seq('AAGGTTATTTATGCGGTGGAAATgacacgggaaaggtggcaaaatgtaatacat...GCT')


ID: chrII:11338204+11339806
Name: chrII:11338204+11339806
Description: chrII:11338204+11339806 assay_3 1603bp AGATTGGAACGATAACGCAGATA TTTGCCAATTTGCATTTTATTTT
Number of features: 0
Seq('AGATTGGAACGATAACGCAGATAcaaaagataattaatctgaaattgttaaata...AAA')
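
Each description line above packs the location, assay name, product size, and both primers into one string. As a sketch of how to make this output easier to read, the code below splits that string into named fields with a regular expression. The pattern is inferred from the three records shown here, not from a format specification, and the function name `parse_ispcr_description` is made up for illustration.

```python
import re

# Pattern inferred from the descriptions printed above (an assumption).
HEADER_RE = re.compile(
    r"^(?P<chrom>\S+):(?P<start>\d+)(?P<strand>[+-])(?P<end>\d+)\s+"
    r"(?P<name>\S+)\s+(?P<size>\d+)bp\s+(?P<fwd>\S+)\s+(?P<rev>\S+)$"
)


def parse_ispcr_description(description):
    """Split one isPcr FASTA description into a dict of named fields."""
    match = HEADER_RE.match(description.strip())
    if match is None:
        raise ValueError(f"unrecognised description: {description!r}")
    row = match.groupdict()
    for key in ("start", "end", "size"):
        row[key] = int(row[key])
    return row


# Descriptions copied from the output above.
descriptions = [
    "chrX:10782951+10784138 assay_1 1188bp CGATAAACAATCAACGGCATAAT TTTGAAACTGATATAGAGGGGCA",
    "chrV:10296804+10298994 assay_2 2191bp AAGGTTATTTATGCGGTGGAAAT AGCACTTTGAGCTTGATGAAATC",
    "chrII:11338204+11339806 assay_3 1603bp AGATTGGAACGATAACGCAGATA TTTGCCAATTTGCATTTTATTTT",
]
rows = [parse_ispcr_description(d) for d in descriptions]
for row in rows:
    print(row["name"], row["chrom"], row["start"], row["end"], row["strand"], row["size"])
```

With the Biopython records from the loop above, the same function can be applied to `sequence.description`, and `pd.DataFrame(rows)` turns the result into a table. For these three records the product size equals end minus start plus one.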




This is the end of the third tutorial for WormBase data analysis! This tutorial covered running In-Silico PCR locally with worm primer pairs.

In the next tutorial, we will use multiple nucleotide and protein aligners on different WormBase data!