Tutorial SNP Association Analysis
#### INPUT
#### STEPS
1. [Select your data](tutorial-snp-association-analysis#select-your-data)
2. [Select the association test](tutorial-snp-association-analysis#select-the-association-test)
3. [Choose the MAF (Minor Allele Frequency)](tutorial-snp-association-analysis#choose-the-maf)
4. [Fill information job](tutorial-snp-association-analysis#fill-information-job)
5. [Press *Launch job* button](tutorial-snp-association-analysis#press-launch-job-button)

#### OUTPUT
- [Input parameters](tutorial-snp-association-analysis#input-parameters)
- [Output files](tutorial-snp-association-analysis#output-files)
### INPUT
#####Input data
- Input data should be a zip file containing the PED and MAP files (plain text). See data types [here](Data Types).
- PED files contain genotype information (one person per row) and MAP files contain the name and position of each marker in the PED file. Babelomics can read .zip, .gz and .tar.gz files, but the archive must decompress directly to the data files, with no folder structure inside.
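
The "no folder structure" requirement can be checked locally before uploading. The following is a minimal sketch using only the Python standard library; the function names, file names and sample rows are illustrative assumptions, and this is not the validation that Babelomics itself performs.

```python
import io
import zipfile
from pathlib import PurePosixPath


def check_flat_ped_map_zip(data: bytes) -> list[str]:
    """Return the problems found in a zipped PED/MAP archive (empty list if none)."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = zf.namelist()
    for name in names:
        # A "/" in a member name means the archive contains a folder structure.
        if "/" in name:
            problems.append(f"nested entry: {name}")
    top_level_suffixes = {
        PurePosixPath(n).suffix.lower() for n in names if "/" not in n
    }
    for required in (".ped", ".map"):
        if required not in top_level_suffixes:
            problems.append(f"missing top-level {required} file")
    return problems


def make_zip(members: dict[str, str]) -> bytes:
    """Build an in-memory zip from {member name: text content}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, text in members.items():
            zf.writestr(name, text)
    return buf.getvalue()


# Hypothetical sample content. PED columns: family, individual, father,
# mother, sex, phenotype, then two alleles per SNP. MAP columns: chromosome,
# SNP id, genetic distance, base-pair position.
ped_text = "FAM1 IND1 0 0 1 2 A A G T\n"
map_text = "1 snp_a 0 1000\n1 snp_b 0 2000\n"

flat = make_zip({"study.ped": ped_text, "study.map": map_text})
nested = make_zip({"data/study.ped": ped_text, "data/study.map": map_text})

flat_problems = check_flat_ped_map_zip(flat)
nested_problems = check_flat_ped_map_zip(nested)
print(flat_problems)
print(nested_problems)
```

An archive created by zipping a folder rather than the two files themselves is the usual cause of nested entries.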
#####Online example
- Association analysis: chi-square test. The zip file contains the PED and MAP files. This analysis tests whether SNPs are associated with case/control status using the chi-square test.
### STEPS
#####Select your data
The first step is to select the data to analyze.
#####Select the association test
Select one of the following tests:
- Chi-square case/control: tests whether there is an association between the two classification variables (phenotype and genotype), that is, whether to reject the null hypothesis of no association between them.
- Fisher's exact: similar to the chi-square test, but preferable when the sample size is small.
- Linear: allows multiple covariates when testing for quantitative-trait SNP association, and interactions with those covariates.
- Logistic: similar to the linear test, but for disease-trait SNP association instead of a quantitative trait.
- TDT: used only for family-based association testing (e.g. trios) of disease traits.
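
To make the chi-square case/control option concrete, here is a minimal sketch of a Pearson chi-square test (1 degree of freedom) on a 2x2 table of allele counts. The counts are invented for illustration and the function name is an assumption; the numerical details of the tool's own computation may differ.

```python
import math


def allelic_chi_square(case_a1, case_a2, ctrl_a1, ctrl_a2):
    """Pearson chi-square (1 df) on a 2x2 table of allele counts.

    Rows are cases and controls, columns are the two alleles.
    Assumes every cell is non-zero (otherwise the odds ratio is undefined).
    """
    table = [[case_a1, case_a2], [ctrl_a1, ctrl_a2]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under the null hypothesis of no association.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_a1 * ctrl_a2) / (case_a2 * ctrl_a1)
    return chi2, p_value, odds_ratio


# Invented counts: 60/40 alleles in cases, 40/60 in controls.
chi2, p_value, odds_ratio = allelic_chi_square(60, 40, 40, 60)
print(f"chi2={chi2:.3f} p={p_value:.6f} OR={odds_ratio:.2f}")
```

When some expected counts are small, the chi-square approximation becomes unreliable, which is the situation in which Fisher's exact test is the better choice.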
#####Choose the MAF
- The minor allele frequency (MAF) value is used to filter SNPs: only SNPs with MAF >= "MAF value" are included. The default value is 0.02.
- This quantity is computed from founders only (i.e. individuals whose paternal and maternal individual codes are both 0).
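
The founder-only MAF filter can be sketched as follows. This is an illustrative sketch, not the tool's implementation: it assumes biallelic SNPs, the standard PED layout (six leading columns, then two alleles per SNP), `0` as the missing-allele code, and invented sample rows.

```python
from collections import Counter


def founder_maf(ped_lines, snp_index):
    """Minor allele frequency of one SNP, counting alleles from founders only."""
    counts = Counter()
    for line in ped_lines:
        fields = line.split()
        if not fields:
            continue
        paternal_id, maternal_id = fields[2], fields[3]
        if paternal_id != "0" or maternal_id != "0":
            continue  # not a founder: at least one parent is in the file
        start = 6 + 2 * snp_index  # skip the six leading PED columns
        for allele in fields[start:start + 2]:
            if allele != "0":  # assumed missing-genotype code
                counts[allele] += 1
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # no data, or monomorphic among founders
    return min(counts.values()) / total


# Invented rows: three founders and one child (C1), two SNPs each.
ped = [
    "F1 P1 0 0 1 2 A A G G",
    "F1 P2 0 0 2 1 A G G G",
    "F1 C1 P1 P2 1 2 G G G G",
    "F2 P3 0 0 1 1 A A G G",
]

maf_threshold = 0.02  # the default value mentioned above
kept = [i for i in range(2) if founder_maf(ped, i) >= maf_threshold]
print(founder_maf(ped, 0), founder_maf(ped, 1), kept)
```

In this sample the child's genotype is ignored, so the first SNP has one minor allele out of six founder alleles, and the second SNP is monomorphic and is filtered out.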
#####Fill information job
- Select the output folder
- Choose a job name
- Specify a description for the job if desired.
#####Press Launch job button
Press the *Launch job* button and wait until the job has finished. A typical job takes a few minutes, but the time varies with the size of the data. To check the state of your job, click the jobs button at the top right of the panel menu. A box listing all your jobs will appear on the right of the browser window. When the analysis has finished, the job shows the label "Ready"; click it to go to the results page.

### OUTPUT
#### Input parameters
In this section you will find a reminder of the parameters or settings you have used to run the analysis.
#### Output files
Here you will find the results generated by the association tool as plain-text, space-delimited data files. After running one of the five methods (Chi-square, Fisher, Linear, Logistic or TDT) you get three links to different files; the first link holds the main results. Most files have the same number of fields per line and the field names in the first line, which makes it easy to view and process the results in a spreadsheet.
- Association result file (plink.assoc, plink.fisher, plink.assoc.linear, plink.assoc.logistic or plink.tdt, depending on the test): this file contains the statistics obtained for all SNPs (one test is run per SNP). With the linear or logistic test, a model is obtained for each SNP. The results page then shows:
  - A Manhattan plot, which represents the association for each SNP. Values between 3 and 5 (-log10 p-value) represent strong association.
  - top hits.txt, which lists the significant SNPs. This table includes detailed information about each position: p-value, odds ratio, ...
  - When you select a SNP, you can view its position in Genome Maps.
- List of heterozygous haploid genotypes (plink.hh): as the name indicates, this file lists the heterozygous haploid genotypes.
- Log file from PLINK (plink.log): this file records the steps carried out by the association process. It is useful when a run fails, to see at which step it stopped.
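
Because the result files are whitespace-delimited with a header line, they are easy to post-process outside a spreadsheet as well. The sketch below parses a plink.assoc-style table and ranks SNPs by -log10(p-value). The rows are made up for illustration, the helper names are assumptions, and the cutoff of 3 is taken from the Manhattan plot description above; it is not necessarily the rule used to build top hits.txt.

```python
import math

# Made-up rows in the column layout of a plink.assoc file.
SAMPLE_ASSOC = """\
 CHR    SNP      BP   A1    F_A    F_U   A2   CHISQ          P      OR
   1  snp_a    1000    A   0.60   0.40    G   8.000   0.004678   2.250
   1  snp_b    2000    C   0.31   0.30    T   0.024     0.8776   1.048
   2  snp_c    1500    T   0.55   0.30    C   12.89     0.0003   2.852
"""


def read_plink_table(text):
    """Parse a whitespace-delimited table with a header line into dict rows."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def top_hits(rows, min_neg_log10_p=3.0):
    """Return (SNP, -log10 p) pairs at or above the cutoff, strongest first."""
    hits = []
    for row in rows:
        try:
            p = float(row["P"])
        except ValueError:
            continue  # non-numeric entries such as "NA"
        if p <= 0:
            continue  # -log10 is undefined for non-positive values
        score = -math.log10(p)
        if score >= min_neg_log10_p:
            hits.append((row["SNP"], score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


rows = read_plink_table(SAMPLE_ASSOC)
hits = top_hits(rows)
for snp, score in hits:
    print(f"{snp}\t{score:.2f}")
```

To use this on a real result, read the downloaded file with `open(path).read()` and pass the text to `read_plink_table`.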
Find the Babelomics suite at http://babelomics.org