Tutorial Differential Expression for RNA Seq

Ralonso edited this page Mar 22, 2015 · 17 revisions

INPUT

Input data should be a raw counts matrix uploaded with the data type expression data matrix (see Data Types). It must have been edited with the Edit tool to include at least one categorical attribute with two different classes.
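As a rough illustration of these requirements, the following Python sketch checks a matrix before upload. It is not part of Babelomics: the function name `check_raw_counts` and the in-memory layout (a dict of feature name to per-sample counts, plus a list of class labels) are assumptions made only for this example.

```python
from collections import Counter


def check_raw_counts(matrix, sample_classes):
    """Return a list of problems found; an empty list means the input looks usable.

    matrix: dict mapping feature name -> list of counts, one per sample
            (hypothetical layout, not the Babelomics file format).
    sample_classes: class label of each sample, in column order.
    """
    problems = []
    n_samples = len(sample_classes)

    for feature, counts in matrix.items():
        if len(counts) != n_samples:
            problems.append(f"{feature}: expected {n_samples} values, got {len(counts)}")
            continue
        for value in counts:
            # Raw counts are non-negative whole numbers; fractional values
            # suggest the matrix was already normalized.
            if value < 0 or value != int(value):
                problems.append(f"{feature}: {value!r} is not a raw count")
                break

    # The comparison needs at least two distinct classes.
    if len(Counter(sample_classes)) < 2:
        problems.append("categorical attribute has fewer than two classes")

    return problems
```

For example, `check_raw_counts({"g1": [10, 0, 3, 7]}, ["Healthy", "Healthy", "Tumor", "Tumor"])` returns an empty list, while a matrix containing `1.5` is reported as not being raw counts.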


STEPS

  1. Examples: Prepared example data for testing the tool. For loading the example, click on Differential expression example and then on the Launch job button at the bottom of the page.
  2. Select your data: Choose a data set of raw counts (integer numbers) among the data sets you have already uploaded to your personal user folder. Data should NOT have been normalized.
  3. Select the class to analyze: Select the variable (categorical attribute) and the two classes you want to compare. The data set should have been edited to include at least one variable with two different classes. For example, the variable Case could indicate which samples belong to the class Healthy and which belong to the class Tumor, and these two classes would then be compared. To see how to edit your data please visit Edit your data.
  4. Normalization method: If desired, select a normalization method. Available normalization methods can be reviewed here. We recommend using a normalization method.
  5. Select multiple test-correction: Select the multiple test-correction method, which is the method used to adjust the p-values.
  6. Select adjusted p-value: Select the adjusted p-value cutoff used to decide which features are reported as significant.
  7. Job information: Give information about the job you are creating.
    • Select the output folder. Babelomics will create a new folder for the job inside the specified folder.
    • Choose job name and specify a description for the job if desired.
  8. Press the Launch job button.
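To make steps 5 and 6 more concrete, here is a minimal Python sketch of one widely used multiple test-correction, the Benjamini-Hochberg procedure, followed by filtering on an adjusted p-value cutoff. This page does not state which correction methods Babelomics offers or how the cutoff is compared, so the choice of Benjamini-Hochberg, the strict `<` comparison, the default of 0.05 and both function names are assumptions for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, index in enumerate(order):
        rank = m - position  # 1-based rank of this p-value in ascending order
        candidate = p_values[index] * m / rank
        # Enforce monotonicity: an adjusted value never exceeds
        # the adjusted value of any larger p-value.
        running_min = min(running_min, candidate)
        adjusted[index] = running_min
    return adjusted


def significant_features(names, p_values, cutoff=0.05):
    """Names whose adjusted p-value falls below the chosen cutoff."""
    adjusted = benjamini_hochberg(p_values)
    return [name for name, q in zip(names, adjusted) if q < cutoff]
```

Lowering the cutoff makes the selection stricter, so fewer features are reported as significant.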


OUTPUT

  • Job information: Gives information about the job.
  • Input parameters: Gives information about the parameters used as input.
  • Voom graphics: MDS plot and mean-variance relation output by the voom function. For detailed information please refer to the limma package user's guide.
  • Significative file: Text file named diffexp_Sig_results.txt containing the name, statistic, p-value and adjusted p-value of each significant differentially expressed feature.
  • Heatmap: Heatmap of the significant differentially expressed features. A maximum of 100 features are included in the heatmap. If there are more than 100 differentially expressed genes, the 100 most significant are selected, preserving the proportion of up- and down-regulated genes present in the set of differentially expressed genes.
  • Network viewer: Cell Maps visualization of the protein network of significant results. You can choose the number of significant UP- or DOWN-regulated genes to show in the Select number of nodes in the top (resp. bottom) list of the differential expression result box. Colored nodes represent the significant results, whereas uncolored nodes are those directly connected to them. You can choose different visualization options in the tool bar of the embedded application. For further information about how to use Cell Maps, visit the Cell Maps User Manual.
  • Continue processing: You can redirect the output data to other Babelomics tools to continue with your specific analysis pipeline. Specifically, you can:
    • Redirect files to the Single enrichment analysis. For more information about the Single enrichment tool please visit Single Enrichment Tool. For specific information about how to use the tool, see the Single Enrichment page of the tutorial. You can redirect the file of the most UP-regulated genetic features vs. the whole genome, the file of the most DOWN-regulated genetic features vs. the whole genome and the files of the most UP-regulated and DOWN-regulated genetic features.
    • Redirect the file with the t-statistics to the Gene set enrichment tool. For further information on the Gene set enrichment tool, see Gene Set Enrichment Tool. For specific information about how to use the tool, see the Gene Set Enrichment page of the tutorial.
    • Redirect files to the Network enrichment analysis. For more information about the Network enrichment tool please visit Network Enrichment. For specific information about how to use the tool, see the Network Enrichment (SNOW) page of the tutorial. You can redirect the file of the most UP-regulated genetic features vs. the whole genome, the file of the most DOWN-regulated genetic features vs. the whole genome and the files of the most UP-regulated and DOWN-regulated genetic features.
    • Redirect the file with the t-statistics to the Gene set network enrichment tool. For further information on the Gene set network enrichment tool, see Gene Set Network Enrichment. For specific information about how to use the tool, see the Functional Gene Set Network Enrichment page of the tutorial.
    • Redirect the truncated data matrix of the significant genetic features to the Clustering tool. For further information on the Clustering tool, see Clustering. For specific information about how to use the tool, see the Clustering page of the tutorial.
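The heatmap rule described above (at most 100 features, keeping the proportion of up- and down-regulated genes) can be sketched in Python as follows. This is only one possible reading of that rule, not the Babelomics implementation: the function name `select_for_heatmap`, the tuple layout, the use of the sign of the statistic to separate up from down, and the rounding of the split are all assumptions.

```python
def select_for_heatmap(results, limit=100):
    """Pick at most `limit` features, keeping the up/down proportion.

    results: list of (name, statistic, adjusted_p) tuples for the
             significant features (hypothetical layout).
    A positive statistic is treated as up-regulated (assumption of this sketch).
    """
    if len(results) <= limit:
        return [name for name, _, _ in results]

    # Within each direction, order features from most to least significant.
    up = sorted((r for r in results if r[1] > 0), key=lambda r: r[2])
    down = sorted((r for r in results if r[1] <= 0), key=lambda r: r[2])

    # Share the available slots according to the up/down proportion.
    n_up = round(limit * len(up) / len(results))
    n_down = limit - n_up

    chosen = up[:n_up] + down[:n_down]
    return [name for name, _, _ in chosen]
```

With 200 up-regulated and 100 down-regulated significant features, this sketch keeps 67 up-regulated and 33 down-regulated features, each group ordered by adjusted p-value.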