Data analysis pipeline for the Methyl-MAPS method
epigenomics/methylmaps
Methyl-Analyzer is a Python package that analyzes genome-wide DNA
methylation data from the methylation mapping analysis by paired-end
sequencing (Methyl-MAPS) method.

1. Requirements
===============
* Python
  - python 2.7
  - numpy
  - pyfasta
  - pysam

One easy way to install the above packages is to use 'pip'. For example,
    $ sudo pip install numpy

2. Installation
===============
2.1 Download Methyl-Analyzer from:
    http://github.com/epigenomics/methylmaps
    or clone the repository via git:
    $ git clone git://github.com/epigenomics/methylmaps.git

2.2 In the methylmaps directory, run the following commands:
    $ python setup.py build
    $ sudo python setup.py install

All data analysis scripts are stored under the Python bin directory.

3. Pre-analysis data preparation
================================
Compile the annotation files required by the data analysis pipeline.

3.1 CpG/RE/McrBC sites
* Input:
  1) Chromosome name
  2) FASTA sequence for the chromosome
* Example:
    $ parse_sites.py chr11 chr11.fa

4. Data Analysis Pipeline
=========================
4.1 Parsing paired-end reads
* Input:
  1) Paired-end reads sequenced by the SOLiD platform.
     Note: the current parser works only for the SOLiD mate-pair format.
  2) Chromosome mapping file for mapping the chromosome IDs used in the
     mate-pair format file to the formal chromosome names.
* Example:
    $ parse_mates.py --out_dir /path/to/frag_dir human.cmap \
      re_reads.mates

4.2 Filtering methylated/unmethylated fragments
* Input:
  1) Parameters used in the filtering process. See the example parameter
     file at data/filter_para.
  2) CpG/RE/McrBC annotation files (generated by 'parse_sites.py')
  3) Parsed RE fragments (by 'parse_mates.py')
  4) Parsed McrBC fragments (by 'parse_mates.py')
* Example:
    $ filter.py --out_dir /path/to/filter_dir --para filter_para chr1 \
      /path/to/anno_dir chr1_re chr1_mcrbc

4.3 Estimating CpG methylation probability
* Input:
  1) Global methylation level obtained by LUMA assay
  2) Methylated/unmethylated information collected by 'filter.py'
* Example:
    $ score.py --out_dir /path/to/meth_dir 0.71 chr1 247249719 \
      /path/to/filter_dir/methdata_chr1

4.4 Alternative procedure
Run the all-in-one script to perform the data analysis from beginning
to end.
* Input:
  1) Parameters required for the pipeline. See the example at
     data/pipeline_paras_hg18.
  2) RE reads in the mate-pair format
  3) McrBC reads in the mate-pair format
  4) Output directory
* Example:
    $ run_pipeline.py --format mates pipeline_paras_hg18 re_reads.mates \
      mcrbc_reads.mates /path/to/out_dir --run

5. Visualization
================
Create BED, Microarray, and Wiggle format tracks to visually check DNA
methylation profiles.
* BED tracks for CpG/RE/McrBC sites: use 'create_cpg_track.py'
* BED tracks for RE/McrBC fragments: use 'create_frag_track.py'
* Microarray tracks for DNA methylation profiles: use
  'create_microarray_track.py'
* Wiggle tracks for DNA methylation profiles: use
  'create_wiggle_track.py'

6. Citation
================
Xin Y, Ge Y, Haghighi F. Methyl-Analyzer - Whole Genome DNA Methylation
Profiling. Bioinformatics. 2011 Jun 17. [Epub ahead of print]
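
As an illustration of how the steps in sections 4.2 and 4.3 chain together for one chromosome, the following Python sketch builds the two command lines from the examples above without running them. The function name and its directory arguments are hypothetical. Two points are assumptions rather than documented behavior: that the parsed fragment files sit under the parse_mates.py output directory as <chrom>_re and <chrom>_mcrbc, and that the number passed to score.py after the chromosome name is the chromosome length.

```python
from __future__ import print_function

import os


def build_chromosome_commands(chrom, chrom_length, luma_level,
                              anno_dir, frag_dir, filter_dir, meth_dir,
                              para_file="filter_para"):
    """Return the filter.py and score.py command lines for one chromosome.

    The argument order mirrors the examples in sections 4.2 and 4.3.
    Nothing is executed here; the commands are returned as lists.
    """
    # ASSUMPTION: parse_mates.py output is found at <frag_dir>/<chrom>_re and
    # <frag_dir>/<chrom>_mcrbc. The README example shows only the bare names.
    re_frags = os.path.join(frag_dir, chrom + "_re")
    mcrbc_frags = os.path.join(frag_dir, chrom + "_mcrbc")

    filter_cmd = ["filter.py", "--out_dir", filter_dir, "--para", para_file,
                  chrom, anno_dir, re_frags, mcrbc_frags]

    # The methdata_<chrom> name follows the path in the section 4.3 example.
    methdata = os.path.join(filter_dir, "methdata_" + chrom)

    # ASSUMPTION: the value after the chromosome name is its length in bases.
    score_cmd = ["score.py", "--out_dir", meth_dir, str(luma_level), chrom,
                 str(chrom_length), methdata]

    return [filter_cmd, score_cmd]


if __name__ == "__main__":
    commands = build_chromosome_commands(
        "chr1", 247249719, 0.71,
        "/path/to/anno_dir", "/path/to/frag_dir",
        "/path/to/filter_dir", "/path/to/meth_dir")
    for cmd in commands:
        # Print each command for inspection instead of running it.
        print(" ".join(cmd))
```

Each returned list can be passed to subprocess.check_call once the installed scripts are on the PATH. Building the commands separately from running them makes it easy to review the arguments for every chromosome first.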