forked from dib-lab/2013-caltech-workshop
-
Notifications
You must be signed in to change notification settings - Fork 0
/
some-exercises.txt
46 lines (32 loc) · 1.92 KB
/
some-exercises.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
Some exercises to try
=====================
0. Finish or re-run one of the tutorials that you're particularly interested
in.
1. Download and assemble some reads from the Short Read Archive (SRA)
or elsewhere.
Remember that `the ENA at EBI <http://www.ebi.ac.uk/ena/>`__ offers
FASTQ downloads of SRA sequence.
(see these tutorials: :doc:`reads_and_qc` and `assembly-lab`)
2. Browse the `NCBI Bacterial genomes site
<http://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/>`__, download a proteome
of interest (you'll want the '.faa' files), and run the various BLASTs
on 'em (e.g. find reciprocal best hits).
See the instructions here: :doc:`tuesday`.
3. Map raw reads for the E. coli 0104 genome back to your assembly --
e.g. take the assembly at http://athyra.idyll.org/~t/ecoli-v41.fa
and map the QC reads to it (see :doc:`bwa_mapping`).
Bonus: load the resulting read mapping into Tablet.
Bonus x 2: use the Prokka annotation .gff file to define features
in Tablet. (If you didn't get all the way through the Prokka run,
you can download a zip file of the results here:
http://athyra.idyll.org/~t/ecoli0104-prokka.zip)
4. Assemble the provided ecoli reads *without* digital normalization.
5. Install BLAST and the ngs-scripts, and then try comparing an assembly
with a genome by using the following commands::
blastall -i assembly.fa -d genome.fa -p blastn -e 1e-20 -o assembly.x.genome.blastn
blastall -d assembly.fa -i genome.fa -p blastn -e 1e-20 -o genome.x.assembly.blastn
python /usr/local/share/ngs-scripts/blast/calc-blast-cover.py genome.fa assembly.x.genome.blastn 300 assembly.fa
python /usr/local/share/ngs-scripts/blast/calc-blast-cover.py assembly.fa genome.x.assembly.blastn 300 genome.fa
6. Try taking one of the assemblies
(e.g. http://athyra.idyll.org/~t/ecoli-v41.fa) and mapping the QC
reads used to assemble it back to it, to assess assembly completeness.