Main areas. Functional
Tools for the functional interpretation of genomic data:
Single Enrichment
The functional interpretation of genomic data is usually performed by studying the enrichment of any type of biologically relevant annotation in the genes or proteins selected by the experiment with respect to the corresponding distribution of the annotation in the background, typically the rest of genes or proteins in the genome.
Single enrichment analysis is less sensitive than gene set analysis and is recommended only when the genes are selected in the experiment in a categorical way (for example, because they lie in amplified or deleted regions, or because they are targets of regulatory factors). In many cases this selection of genes is performed by multiple individual, gene-wise tests. This testing strategy is quite conservative and ultimately reduces the statistical power of the whole procedure, because a large number of false negatives is accepted in order to keep the proportion of false positives low.
The GO Enrichment method (Al-Shahrour et al., 2004) was the first proposal for functional enrichment that took into account the multiple testing problem. GO Enrichment works as follows:
- GO Enrichment takes two lists of genes: ideally a group of interest and the rest of the genes in the experiment, although any two groups, formed in any way, can be tested against each other.
- These two lists are converted into two lists of functional terms using the corresponding gene-term or protein-term annotation table.
- Then a Fisher's exact test for 2×2 contingency tables is used to check for significant over-representation of functional terms in one of the lists with respect to the other one.
- A multiple testing correction is applied to account for the multiple hypotheses tested (one for each functional term). GO Enrichment uses the Benjamini and Hochberg (B&H) FDR method.
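The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Babelomics implementation: the function names, the toy gene and term identifiers, and the choice of a two-sided test are all hypothetical. It uses only the standard library.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a, b: genes in list 1 with / without the term
    c, d: genes in list 2 with / without the term
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x annotated genes falling in list 1.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is as likely as, or less likely than, the observed one.
    p = sum(prob(x) for x in range(lowest, highest + 1)
            if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p)


def benjamini_hochberg(pvalues):
    """Return B&H FDR-adjusted p-values, in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def enrichment(list1, list2, annotation):
    """Test every term annotated to either list; annotation maps gene -> set of terms."""
    terms = set()
    for gene in list(list1) + list(list2):
        terms |= annotation.get(gene, set())
    rows = []
    for term in sorted(terms):
        a = sum(term in annotation.get(g, ()) for g in list1)
        c = sum(term in annotation.get(g, ()) for g in list2)
        p = fisher_exact_two_sided(a, len(list1) - a, c, len(list2) - c)
        rows.append((term, a, c, p))
    adjusted = benjamini_hochberg([row[3] for row in rows])
    return [row + (q,) for row, q in zip(rows, adjusted)]


# Toy example with made-up gene and term identifiers.
toy_annotation = {
    "g1": {"TERM_A", "TERM_C"}, "g2": {"TERM_A", "TERM_C"}, "g3": {"TERM_A", "TERM_C"},
    "g4": {"TERM_B", "TERM_C"}, "g5": {"TERM_B", "TERM_C"}, "g6": {"TERM_B", "TERM_C"},
}
for term, n1, n2, p, q in enrichment(["g1", "g2", "g3"], ["g4", "g5", "g6"], toy_annotation):
    print(term, n1, n2, round(p, 4), round(q, 4))
```

Each output row gives the term, its counts in the two lists, the raw p-value and the adjusted p-value; terms would normally be reported when the adjusted value falls below a chosen threshold.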
References
- Al-Shahrour, F., Minguez, P., Tárraga, J., Medina, I., Alloza, E., Montaner, D., & Dopazo, J. (2007). FatiGO+: a functional profiling tool for genomic data. Integration of functional annotation, regulatory motifs and interaction data with microarray experiments. Nucleic Acids Research 35 (Web Server issue): W91-96
- Al-Shahrour, F., Díaz-Uriarte, R. & Dopazo, J. (2004). FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes. Bioinformatics 20: 578-580
Gene Set Enrichment
Gene set methods are much more sensitive than single enrichment methods in detecting gene sets (defined as sets of genes with a common annotation) that show a collective behaviour in a genomic experiment. These methods efficiently detect gene sets (annotations) that are consistently associated with high or low values in a ranked list of genes.
Here a logistic regression method has been implemented, which detects asymmetrical distributions of annotations within ranked lists of genes.
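As a rough illustration of the idea, the sketch below regresses gene set membership (1 or 0) on the ranking statistic of each gene and applies a Wald test to the slope. It is a minimal sketch under assumptions, not necessarily the model used in Babelomics: the function names and the toy data are hypothetical, the fit is a plain Newton-Raphson on two parameters, and it assumes the data are not perfectly separable.

```python
import math


def fit_logistic(x, y, max_iterations=50):
    """Fit logit(P(y = 1)) = b0 + b1 * x; return (b0, b1, standard error of b1)."""
    b0 = b1 = 0.0
    for _ in range(max_iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient with respect to b0
            g1 += (yi - p) * xi     # gradient with respect to b1
            h00 += w                # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    se_b1 = math.sqrt(h00 / det)    # from the inverse information matrix
    return b0, b1, se_b1


def gene_set_logistic(ranking, gene_set):
    """ranking maps gene -> ranking statistic; returns (slope, two-sided Wald p-value)."""
    genes = sorted(ranking)
    x = [ranking[g] for g in genes]
    y = [1.0 if g in gene_set else 0.0 for g in genes]
    _, slope, se = fit_logistic(x, y)
    z = slope / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return slope, p_value


# Toy example: 11 made-up genes with statistics from -5 to 5.
toy_ranking = {f"g{i}": float(i - 5) for i in range(11)}
toy_set = {"g3", "g6", "g8", "g9", "g10"}
slope, p_value = gene_set_logistic(toy_ranking, toy_set)
print(slope, p_value)
```

A positive slope means the annotated genes tend to sit at the high end of the ranking, and a negative slope means the low end. When many gene sets are tested, the resulting p-values would still need a multiple testing correction.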
References
- Montaner D, Dopazo J (2010). Multidimensional gene set analysis of genomic data. PLoS One. 2010 Apr 27;5(4):e10348. doi: 10.1371/journal.pone.0010348.
- Al-Shahrour F, Arbiza L, Dopazo H, Huerta J, Minguez P, Montaner D, & Dopazo J (2007). From genes to functional classes in the study of biological systems. BMC Bioinformatics 8: 114
- Al-Shahrour, F., Díaz-Uriarte, R. & Dopazo, J. (2005). Discovering molecular functions significantly related to phenotypes by combining gene expression data and biological information. Bioinformatics 21: 2988-2993
Network Enrichment
The Network Enrichment tool (also known as SNOW) extracts and evaluates the cooperative behaviour of lists of proteins/genes in terms of protein-protein interactions. It thus complements other Babelomics tools such as FatiGO by introducing a new dimension in the functional profiling of results from high-throughput experiments: protein-protein interaction data.
Network Enrichment performs two different and complementary types of analysis on the submitted list of proteins/genes:
- Evaluates the role of the list within the interactome. Network Enrichment studies the topological role of your genes within the protein interactome. It evaluates the global degree of connections, centrality and clustering by comparing the distributions of these parameters for the nodes in the list with their distributions over the complete interactome. The topological parameters calculated here account for different biological properties. For example, essential genes tend to code for proteins with high betweenness centrality and connection degree but a low clustering coefficient, whereas genes coding for protein complexes show a high clustering coefficient but lower betweenness centrality.
- Evaluates the list's cooperative behaviour as a functional module. Network Enrichment calculates the MCN, the minimum network that connects the proteins/genes in the list, either with or without using an external node (a non-listed protein) to connect nodes in the list. The topology of this network is evaluated by comparing the distributions of node parameters of this MCN against a set of random MCNs in the same size range. This approach is similar to that of other tools for functional enrichment analysis, such as GO Enrichment, with the difference that there are no pre-annotated functional modules to evaluate; instead, Network Enrichment builds them using the protein interactome.
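To make the two analyses more concrete, here is a small standard-library sketch of two of the node parameters mentioned above (connection degree and clustering coefficient) and of a simplified MCN. Everything in it is an assumption for illustration: the function names, the toy interactome, and the simplification that two listed proteins are joined either directly or through a single shared non-listed neighbour. It does not reproduce the SNOW implementation or its comparison against random networks.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree(graph, node):
    """Number of interaction partners of a node."""
    return len(graph.get(node, ()))


def clustering_coefficient(graph, node):
    """Fraction of pairs of neighbours that interact with each other."""
    neighbours = graph.get(node, set())
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    return 2.0 * links / (k * (k - 1))


def simple_mcn(graph, genes, allow_external=True):
    """Edges joining listed genes directly or, optionally, via one non-listed node."""
    listed = set(genes) & set(graph)
    edges = set()
    for u, v in combinations(sorted(listed), 2):
        if v in graph[u]:
            edges.add((u, v))
        elif allow_external:
            for bridge in (graph[u] & graph[v]) - listed:
                # Store each edge with sorted endpoints to avoid duplicates.
                edges.add(tuple(sorted((u, bridge))))
                edges.add(tuple(sorted((bridge, v))))
    return edges


# Toy interactome with made-up protein names.
toy_graph = build_graph([
    ("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"),
    ("D", "E"), ("X", "E"), ("X", "A"),
])
print(degree(toy_graph, "A"), clustering_coefficient(toy_graph, "A"))
print(sorted(simple_mcn(toy_graph, ["A", "B", "E"], allow_external=False)))
print(sorted(simple_mcn(toy_graph, ["A", "B", "E"], allow_external=True)))
```

In a full analysis, the parameters of the listed nodes (or of the MCN) would then be compared with the same parameters computed on the whole interactome or on MCNs built from random lists of similar size.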
References
- Minguez P, Götz S, Montaner D, Al-Shahrour F, Dopazo J. (2009). SNOW, a web-based tool for the statistical analysis of protein-protein interaction networks. Nucleic Acids Res. 37(Web Server issue):W109-14.
Gene Set Network Enrichment
Suppose you have obtained a ranked list of proteins or genes from a particular experiment (e.g. a differential expression analysis or a GWAS) and want to use protein-protein interaction data to find out their possible role as a protein complex, a signalling pathway, etc. The Gene Set Network Enrichment tool (also known as Network Miner) looks for significant subnetworks of protein-protein interactions within the ranked list of genes/proteins.
The program also offers the option of defining seed genes. Seed genes are genes known to be associated with the phenotype of interest; you may want to include them in your analysis to see how strongly the genes in your ranked list are associated with them.
References
- García-Alonso L, Alonso R, Vidal E, Amadoz A, de María A, Minguez P, Medina I, Dopazo J. (2012). Discovering the hidden sub-network component in a ranked list of genes or proteins derived from genomic experiments. Nucleic Acids Res. 40(20):e158.
Find the Babelomics suite at http://babelomics.org