Main areas. Processing
To include further information about the experimental design of our data set, we have to edit our data. This step is required before using some of the Babelomics tools, such as differential expression or class prediction, which need at least two clearly differentiated classes. For specific information about how to edit your data, please visit Edit.
##Microarray Normalization
DNA microarray technologies allow for the simultaneous measurement of thousands of genomic features, such as gene expression, copy number variation or SNP variants. The accuracy and reproducibility of microarray measurements have been extensively validated in recent years. Even so, the measurements of any microarray set always contain technological artifacts that may hide the true biological signal. These distortions can produce signal effects within each array as well as across the different arrays in the set.
Causes of non-biological variation in microarray measurements include differences in sample preparation and in the hybridization process, as well as [dye bias](Dye bias), [cross-hybridization](Cross hybridization) and scanner differences.
The goal of normalization is to adjust for the effects that are due to variations in the technology rather than the biology.
- [More information.](Preprocessing for microarrays)
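To illustrate what adjusting for across-array effects can look like, the following is a minimal sketch of quantile normalization, one widely used microarray normalization technique. It is not necessarily the method Babelomics applies; the function name `quantile_normalize`, the features-by-arrays layout, and the example values are assumptions made for illustration.

```python
def quantile_normalize(matrix):
    """Quantile-normalize a features x arrays matrix given as a list of rows.

    Illustrative sketch only (hypothetical helper, not a Babelomics API).
    Every column (array) is forced onto the same reference distribution:
    the mean, across arrays, of the k-th smallest value.
    Ties are broken by row order, which is a simplification.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]

    # Reference distribution: average of the sorted columns, rank by rank.
    sorted_cols = [sorted(col) for col in columns]
    reference = [sum(col[k] for col in sorted_cols) / n_cols for k in range(n_rows)]

    normalized = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        # Row indices of this array ordered from smallest to largest value.
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            normalized[i][j] = reference[rank]
    return normalized, reference


# Made-up intensities: 4 features (rows) measured on 3 arrays (columns).
example = [
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 4.0, 6.0],
    [4.0, 2.0, 8.0],
]
normalized, reference = quantile_normalize(example)
```

After this step every array has exactly the same set of values, so differences in overall intensity between arrays no longer dominate the comparison, while the rank of each feature within its array is preserved.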
RNA sequencing, or RNA-Seq, has become increasingly popular over the last few years as a genome-wide transcriptome profiling method because of its power to detect transcripts that were not previously established and its high reproducibility. However, this kind of data is also sensitive to different biases, so once the count data matrix has been created, and before any further analysis, a normalization process may be necessary. Babelomics allows us to correct the different kinds of bias that could be present in our data, and also suggests the best normalization method for our particular data set.
For normalization of RNA-Seq data intended for differential expression analysis, please see the section Differential Expression for RNA-Seq.
- [More information.](Preprocessing for RNA-Seq)
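As an illustration of one such bias, sequencing depth differs between samples, so raw counts are not directly comparable. The sketch below scales counts to counts per million (CPM), a simple library-size correction. It is an assumed example, not the specific set of methods Babelomics offers; `counts_per_million` and the example counts are hypothetical.

```python
def counts_per_million(counts):
    """Scale a genes x samples count matrix (list of rows) to counts per million.

    Illustrative sketch only (hypothetical helper, not a Babelomics API).
    Each count is divided by its sample's total count (library size)
    and multiplied by one million.
    """
    n_samples = len(counts[0])
    library_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    if any(size == 0 for size in library_sizes):
        raise ValueError("a sample has no reads; CPM is undefined for it")
    return [
        [row[j] * 1_000_000 / library_sizes[j] for j in range(n_samples)]
        for row in counts
    ]


# Made-up counts: sample 2 was sequenced twice as deeply as sample 1.
raw_counts = [
    [10, 20],
    [30, 60],
    [60, 120],
]
cpm = counts_per_million(raw_counts)
```

In this toy data the two samples have identical relative expression, so their CPM values coincide even though the raw counts differ by a factor of two. Library-size scaling does not address other biases, such as gene length or composition effects.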
##Data Matrix
Before any meaningful analysis can be done with your genomic data, the data need to undergo a thorough cleaning process. Normalization is the typical example of such data reshaping, but it is not the only one: mathematical transformations such as taking logarithms, and missing data imputation, are also part of it. In general, the purpose of this step is to reshape your data into a distribution that is suitable for the later steps of the analysis. The Babelomics preprocessing tools allow you to perform these data transformations.
- [More information.](Preprocessing for data matrix)
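The two transformations mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: missing values are represented as `None`, the logarithm is base 2 with a pseudocount of 1, and imputation uses the row mean. The options available in Babelomics may differ, and the function names are hypothetical.

```python
import math


def log2_transform(matrix, pseudocount=1.0):
    """Apply log2(value + pseudocount) to every observed value.

    Illustrative sketch only (hypothetical helper, not a Babelomics API).
    Missing values (None) are left untouched. The pseudocount is an
    assumption that keeps zeros defined.
    """
    return [
        [None if v is None else math.log2(v + pseudocount) for v in row]
        for row in matrix
    ]


def impute_row_mean(matrix):
    """Replace each missing value (None) with the mean of its row."""
    imputed = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        if not observed:
            raise ValueError("cannot impute a row with no observed values")
        row_mean = sum(observed) / len(observed)
        imputed.append([row_mean if v is None else v for v in row])
    return imputed


# Made-up data matrix: 2 features x 3 samples, with one missing value.
data = [
    [1.0, None, 3.0],
    [7.0, 15.0, 31.0],
]
logged = log2_transform(data)
cleaned = impute_row_mean(logged)
```

Here the transformation is applied before imputation, so the missing value is filled with the mean of the log-scale values in its row; doing the steps in the opposite order would give a different result.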