intro

commit c8e81ad03aff23c4f90459dea47d4dd3edf3ca29 1 parent a7b38e1
@vreinharz vreinharz authored
Showing with 15 additions and 16 deletions.
  1. +15 −16 Recomb/introduction_RECOMB.tex
31 Recomb/introduction_RECOMB.tex
@@ -2,7 +2,18 @@
\section{Introduction}
\label{sec:introduction}
-Ribonucleic acids (RNAs) are now an ubiquitous class of molecules, being
+
+In recent years, studies such as the \emph{Human Microbiome Project}~\cite{Turnbaugh2007},
+which leverage NGS techniques to sequence as many new organisms
+as possible, have been producing a wealth of new information. Although
+these techniques have a very high throughput, they yield a sequencing error rate of around
+$4\%$~\cite{Huse2007}. This error rate can be greatly reduced when highly
+redundant multiple sequence alignments
+ are available, but in studies of new or poorly characterized organisms, there is not
+ enough similarity to distinguish sequencing errors from the natural
+ polymorphisms that we want to observe, which often inflates diversity estimates~\cite{Kunin2010}.
+
+ Ribonucleic acids (RNAs) are now a ubiquitous class of molecules, being
found in every living organism and having a broad range of functions, from catalyzing
chemical reactions, as RNase P or the group II introns do,
hybridizing messenger RNA to regulate gene expression,
@@ -16,21 +27,9 @@ \section{Introduction}
evolution~\cite{Zuckerkandl1965}, and with all their characteristics, rRNAs have
become a prime candidate for phylogenetic studies~\cite{Olsen1986, Olsen1993}.
-In recent years, studies as the \emph{Human Microbiome Project}~\cite{Turnbaugh2007},
-leveraging the NGS techniques to sequence as many new organisms
-as possible, are producing a wealth of new information. Although
-those techniques have a huge throughput, they yield a sequencing error rate of around
-$4\%$~\cite{Huse2007}. This error can be highly reduced when highly
-redundant multiple sequence allignments
- are available, but in studies of new or not well known organisms, there is not
- enough similarity to differentiate between the sequencing errors and the natural
- polymorphisms that we want to observe, often inflating the diversity estimates~\cite{Kunin2010}.
- In rRNAs, on top of multiple sequence alignments, we have as additional
- information the conserved secondary structure, and we want to use it to identify
-highly probable sequencing errors.
-Given a new rRNA sequence, a first challenge to find the sequencing errors
- is to efficiently explore the mutant space,
-which grows exponentially, guiding ourselves with the alignment and the consensus structure.
+In this paper, we tackle the problem of exploring the mutant space of an rRNA sequence to
+identify positions that are likely sequencing errors, given the sequences of the rRNA family
+and its consensus structure.
Leveraging the techniques in \texttt{RNAmutants}~\cite{Waldispuhl2008}, and building on top
of the \emph{Inside-Outside algorithm}, we define here a new method called \texttt{RNApyro}