Clustering. Worked examples and exercises
Worked Examples
Example 1. Fibroblasts K-means clustering
- Go to the Babelomics page and select the Clustering option from the //Expression// menu.
- Press //Online Examples// and select example number 1; the parameters and form fields are filled in automatically. This example performs a clustering analysis on genes (rows) and conditions (columns) using the K-means algorithm with 5 sample clusters and 15 gene clusters. The selected distance is Euclidean (square).
- Press run, and wait for your job to be finished.
- When the process finishes, a new //green job// is shown at the right side of the web page. Press it to check your results.
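To make the parameters of this example more concrete, here is a minimal K-means sketch in plain Python. It is not the Babelomics implementation: the toy matrix, the function names and the deterministic farthest-first initialisation are all assumptions made for illustration. It only shows how the same routine can cluster genes (rows) or, after transposing the matrix, conditions (columns), using the squared Euclidean distance.

```python
def sq_euclidean(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(rows, k):
    """Deterministic initialisation (an assumption of this sketch):
    start from the first row, then repeatedly add the row that is
    farthest from all centroids chosen so far."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        nxt = max(rows, key=lambda r: min(sq_euclidean(r, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids

def kmeans(rows, k, iterations=50):
    """Plain K-means; returns one cluster label per row."""
    centroids = farthest_first(rows, k)
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: each row goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_euclidean(r, centroids[c]))
                  for r in rows]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical toy matrix: 6 genes (rows) x 4 conditions (columns).
matrix = [
    [5.0, 5.1, 0.1, 0.0],
    [4.9, 5.2, 0.2, 0.1],
    [5.1, 4.8, 0.0, 0.2],
    [0.1, 0.0, 5.0, 5.2],
    [0.2, 0.1, 4.9, 5.1],
    [0.0, 0.2, 5.1, 4.9],
]

gene_labels = kmeans(matrix, k=2)
# Transposing the matrix clusters the conditions instead of the genes.
sample_labels = kmeans([list(col) for col in zip(*matrix)], k=2)
print("gene clusters:  ", gene_labels)
print("sample clusters:", sample_labels)
```

Note that the number of clusters `k` is chosen before the algorithm runs, just as the 5 sample clusters and 15 gene clusters are fixed in the form for this example.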
** Questions **
These are some questions that you should be able to answer about the previous example:
- Do you think that the clustering was able to differentiate any group of coexpressed genes?
- How many sample clusters are there? And how many gene clusters?
Launch online examples number 3 (Fibroblasts SOTA clustering) and number 4 (Fibroblasts UPGMA clustering). Compare the results.
- Do you obtain the same result?
- What are the differences between the results of these three examples?
- Why are they different?
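When comparing the three examples, it may help to see how a hierarchical method such as UPGMA builds its result. The sketch below is a minimal average-linkage implementation in plain Python on hypothetical profiles; it is an illustration under those assumptions, not the code that Babelomics runs. Instead of a fixed number of clusters, it returns the full merge history, which is the information a dendrogram displays.

```python
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def upgma(profiles):
    """Average-linkage agglomerative clustering.

    Returns the merge history as (members_a, members_b, distance) tuples,
    where members are tuples of row indices.
    """
    def avg_dist(pair):
        # UPGMA distance: mean of all pairwise distances between members.
        a, b = pair
        total = sum(euclidean(profiles[i], profiles[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    clusters = [(i,) for i in range(len(profiles))]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        a, b = min(combinations(clusters, 2), key=avg_dist)
        merges.append((a, b, avg_dist((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Hypothetical profiles: two tight pairs that are far from each other.
profiles = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
merges = upgma(profiles)
for a, b, d in merges:
    print(a, "+", b, "at distance", round(d, 3))
```

Cutting the merge history at different distances yields different numbers of clusters, whereas K-means requires the number of clusters up front.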
Example 2. Rheumatoid SOTA clustering
- Go to the Babelomics page and select the Clustering option from the //Expression// menu.
- Press //Online Examples// and select example number 2; the parameters and form fields are filled in automatically. This example performs a clustering analysis on genes (rows) and conditions (columns) using the SOTA algorithm and the Euclidean (square) distance.
- Press run, and wait for your job to be finished.
- When the process finishes, a new //green job// is shown at the right side of the web page. Press it to check your results.
** Questions **
These are some questions that you should be able to answer about the previous example:
- Do you think that the clustering was able to differentiate any group of coexpressed genes?
- How many groups/clusters?
- What is your answer based on?
- Do your selected groups represent different functional classes?
Try to use other distance and clustering methods by selecting different options from the Babelomics interface. Compare the results.
- Do you obtain the same result?
- What is the main difference between the hierarchical and non-hierarchical results?
- Does the distance method affect your results?
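As a hint for the last question, the sketch below compares the squared Euclidean distance with a correlation-based distance on three hypothetical gene profiles. The profiles and function names are made up for illustration, and the correlation distance is only an assumed alternative; the options offered in the Babelomics interface may differ.

```python
import math

def sq_euclidean(a, b):
    """Squared Euclidean distance: sensitive to expression magnitude."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pearson_distance(a, b):
    """1 - Pearson correlation: 0 for identical shapes, 2 for opposite ones."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return 1 - cov / (sd_a * sd_b)

gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [10.0, 20.0, 30.0, 40.0]  # same shape as gene_a, ten times larger
gene_c = [4.0, 3.0, 2.0, 1.0]      # similar magnitude, opposite trend

for name, other in (("gene_b", gene_b), ("gene_c", gene_c)):
    print(name,
          "squared Euclidean:", sq_euclidean(gene_a, other),
          "Pearson distance:", round(pearson_distance(gene_a, other), 3))
```

With these profiles, the Euclidean distance places `gene_a` closer to `gene_c`, while the correlation distance places it closer to `gene_b`, so the two measures would group the genes differently.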
Exercises
Exercise 1. Random dataset
Download this {{example_data:clustering:random_array.txt|random dataset}} and perform a clustering analysis.
- What would we obtain for an analysis of data with no structure?
- Do you obtain a result?
- What can you say about this result?
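One way to think about this exercise is to compare how well the best possible partition explains structured and unstructured data. The sketch below is a simplified, one-dimensional illustration with simulated values (not the random dataset linked above); all names and numbers are hypothetical. For one-dimensional values the best two-cluster partition is a cut of the sorted list, so it can be found exhaustively.

```python
import random

def sum_sq(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_two_split_ratio(values):
    """Within-cluster / total sum of squares for the best 2-cluster split.

    Values near 0 mean the two clusters explain almost all the variation.
    """
    ordered = sorted(values)
    total = sum_sq(ordered)
    best = min(sum_sq(ordered[:i]) + sum_sq(ordered[i:])
               for i in range(1, len(ordered)))
    return best / total

rng = random.Random(42)
# Unstructured data: values spread uniformly, with no groups.
random_values = [rng.uniform(0, 10) for _ in range(200)]
# Structured data: two well-separated groups.
structured_values = ([rng.gauss(2, 0.3) for _ in range(100)]
                     + [rng.gauss(8, 0.3) for _ in range(100)])

ratio_random = best_two_split_ratio(random_values)
ratio_structured = best_two_split_ratio(structured_values)
print("random data ratio:    ", round(ratio_random, 3))
print("structured data ratio:", round(ratio_structured, 3))
```

A partition is returned in both cases; only the quality measure suggests whether the clusters reflect real structure.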
Exercise 2. Response of human fibroblasts to serum
Download {{example_data:clustering:fibro.txt|this dataset}} and perform a clustering analysis.
This dataset was explored in detail at http://genome-www.stanford.edu/serum/clusters.html, where the detected clusters were functionally validated. You can perform the same clustering analysis and take a look at the biological interpretation that was made of the different clusters.
Select a cluster and click on the highlighted region. Continue your analysis by sending the cluster to enrichment_analysis to compare its genes against the rest of the genome.
- Do you obtain any interesting results? If not, you can try with another cluster of genes.
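To give an idea of what comparing a cluster against the rest of the genome involves, here is a minimal sketch of a one-sided hypergeometric enrichment test. The text above does not state which test the enrichment tool applies, so this test is an assumption chosen because it is a common approach; the function name and all counts are hypothetical.

```python
from math import comb

def enrichment_p_value(genome_size, annotated_in_genome,
                       cluster_size, annotated_in_cluster):
    """One-sided hypergeometric tail.

    Probability of seeing at least `annotated_in_cluster` annotated genes
    when `cluster_size` genes are drawn at random from the genome.
    """
    total = comb(genome_size, cluster_size)
    upper = min(cluster_size, annotated_in_genome)
    tail = sum(
        comb(annotated_in_genome, i)
        * comb(genome_size - annotated_in_genome, cluster_size - i)
        for i in range(annotated_in_cluster, upper + 1)
    )
    return tail / total

# Hypothetical counts: 1000 genes in total, 50 carry a given annotation,
# and 5 of the 20 genes in the selected cluster carry it.
p = enrichment_p_value(1000, 50, 20, 5)
print("enrichment p-value:", p)
```

In practice many annotations are tested at once, so the resulting p-values need to be adjusted for multiple testing.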
Exercise 3. Zebrafish embryogenesis data
Download {{:example_data:clustering:zebrafish_embryo.txt|this file}} and perform a hierarchical clustering analysis of its genes. This example file contains the first 999 genes of the 3,657 genes that showed significant levels of differential expression in http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.0010029.
- Do you see any patterns of gene expression between different developmental stages?
- Are gene clusters of different developmental stages functionally enriched?
Exercise 4. Golub data
- Run a differential expression analysis with the Golub data (train dataset from the Predictor exercise):
{{:images:clustering:diff_expr_golub.jpg|}}
- Then, redirect the differentially expressed genes to clustering:
{{:images:clustering:golub_redirect_clustering.jpg|}}
- Do the sample clusters make sense? Are samples of the same condition clustered together? Do the gene clusters make sense? Are the genes in each cluster functionally related?
Find the Babelomics suite at http://babelomics.org