Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
paper edits, removing lost of materials to shift to sup materials
- Loading branch information
Showing
2 changed files
with
78 additions
and
40 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Original file line | Diff line number | Diff line change |
---|---|---|---|
@@ -0,0 +1,34 @@ | |||
#+title: Scythe Supplementary Materials | |||
#+author: Vince Buffalo | |||
#+email: vsbuffalo@ucdavis.edu | |||
#+date: | |||
#+babel: :results output :exports both :session :comments org | |||
|
|||
* 3'-end Quality | |||
|
|||
#+caption: 3'-end quality deterioration. | |||
#+label: fig:qual_plot | |||
#+attr_latex: width=12cm | |||
[[./qual_plot.png]] | |||
|
|||
|
|||
* Method | |||
|
|||
** Assumptions | |||
|
|||
1. All contaminants sequences are known /a priori/ and are reliable. | |||
2. A contaminant sequence with length $l_c$ in a read of length $l_r$ | |||
will only be found between $l_r - l_c$ and $l_r$. | |||
3. If the read is contaminated, the number of bases contaminating the | |||
read $n_c$ is limited to $1 \le n_c \le l_c$, and always beginning from | |||
the 5'-end of the contaminant sequence. While this seems limiting, | |||
it is the mode by which 3'-end adapter and barcode contamination | |||
occurs.[fn:: We have encountered Illumina data in which adapters | |||
contaminate the read and are present past their length in the | |||
3'-end. The sequence after the adapter was all poly-A | |||
sequence. These extreme cases can be removed by appending poly-A | |||
sequence to the end of the adapters in the adapters file.] | |||
4. Contaminants will only have mismatching bases; no contaminants with | |||
insertions, deletions, or transpositions will be addressed by Scythe. | |||
5. The quality line of the FASTQ file accurately estimates the | |||
probability of bases being called correctly. |