Methyl-Analyzer is a Python package that analyzes genome-wide DNA methylation
data produced by the methylation mapping analysis by paired-end sequencing
(Methyl-MAPS) method.


1. Requirements
===============

* Python
     - Python 2.7
     - numpy
     - pyfasta
     - pysam

One easy way to install the packages above is with 'pip'. For example:
    $ sudo pip install numpy
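
As a quick check after installing, the following sketch tries to import each
required package and reports any that are missing. The helper name
`missing_modules` is an illustration and is not part of Methyl-Analyzer.

```python
import importlib


def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Package names taken from the requirements list above.
    required = ["numpy", "pyfasta", "pysam"]
    absent = missing_modules(required)
    if absent:
        print("Missing packages: " + ", ".join(absent))
    else:
        print("All required packages are importable.")
```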


2. Installation
===============

2.1 Download Methyl-Analyzer from:
     http://github.com/epigenomics/methylmaps
or clone the repository via git (GitHub no longer serves the git:// protocol,
so use https):
     $ git clone https://github.com/epigenomics/methylmaps.git

2.2 In the methylmaps directory, run the following commands:
     $ python setup.py build
     $ sudo python setup.py install

After installation, all data analysis scripts are located in the Python bin
directory.



3. Pre-analysis data preparation
================================

Compile the annotation files required by the data analysis pipeline.

3.1 CpG/RE/McrBC sites
   * Input:
     1) Chromosome name
     2) FASTA sequence file for the chromosome

   * Example:
        $ parse_sites.py chr11 chr11.fa
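
To make the idea of a site annotation concrete, the sketch below scans a
sequence for CpG dinucleotides and for one restriction-enzyme motif. It is not
the implementation of parse_sites.py, whose output format is not described
here. The function name `find_sites` and the use of CCGG (the HpaII recognition
site) as the example motif are assumptions made for illustration, and a short
literal string stands in for a chromosome read from a FASTA file.

```python
def find_sites(sequence, motif):
    """Return 0-based start positions of every occurrence of `motif`.

    Matching is case-insensitive and overlapping occurrences are reported.
    """
    seq = sequence.upper()
    motif = motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one base so overlapping matches are not skipped.
        start = seq.find(motif, start + 1)
    return positions


# Toy sequence standing in for a chromosome read from a FASTA file.
toy = "ttCGaaCCGGtaCGCG"
cpg_sites = find_sites(toy, "CG")
re_sites = find_sites(toy, "CCGG")  # CCGG is only an example motif
```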


4. Data Analysis Pipeline
=========================

4.1 Parsing paired-end reads
   * Input:
     1) Paired-end reads sequenced on the SOLiD platform.
        Note: the current parser works only with the SOLiD mate-pair format.
     2) Chromosome mapping file that maps the chromosome IDs used in the
        mate-pair file to the formal chromosome names.

   * Example:
        $ parse_mates.py --out_dir /path/to/frag_dir human.cmap \
          re_reads.mates 

4.2 Filtering methylated/unmethylated fragments
   * Input:
     1) Parameters used in the filtering process. See the example parameter file
        at data/filter_para.
     2) CpG/RE/McrBC annotation files (generated by 'parse_sites.py')
     3) Parsed RE fragments (by 'parse_mates.py')
     4) Parsed McrBC fragments (by 'parse_mates.py')
   
   * Example:
        $ filter.py --out_dir /path/to/filter_dir --para filter_para chr1 \
          /path/to/anno_dir chr1_re chr1_mcrbc
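
One building block that a filtering step of this kind needs is looking up which
annotated sites fall inside a fragment. The sketch below does this with a
binary search over sorted site positions. It does not reproduce the rules in
filter.py or the parameters in data/filter_para; the function name
`sites_in_fragment`, the half-open coordinate convention, and the sample
positions are assumptions.

```python
from bisect import bisect_left


def sites_in_fragment(sorted_sites, frag_start, frag_end):
    """Return the sites s with frag_start <= s < frag_end.

    `sorted_sites` must be sorted in ascending order.
    """
    lo = bisect_left(sorted_sites, frag_start)
    hi = bisect_left(sorted_sites, frag_end)
    return sorted_sites[lo:hi]


# Hypothetical CpG positions and fragments, for illustration only.
cpg_positions = [105, 230, 231, 480, 900]
fragments = [(100, 250), (400, 480), (850, 1000)]
covered = [sites_in_fragment(cpg_positions, s, e) for s, e in fragments]
```

Because the site list is sorted once, each lookup costs two binary searches
rather than a scan over every site on the chromosome.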

4.3 Estimating CpG methylation probability
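   * Input:
     1) Global methylation level obtained by LUMA assay (0.71 in the example)
     2) Methylated/unmethylated information collected by 'filter.py'
     The example also passes the chromosome name (chr1) and the number
     247249719, which matches the length of chr1 in the hg18 assembly.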
     
   * Example:
        $ score.py --out_dir /path/to/meth_dir 0.71 chr1 247249719 \
          /path/to/filter_dir/methdata_chr1 
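
As an illustration of how a global methylation level can be combined with
per-site evidence, the sketch below shrinks a raw methylated/unmethylated ratio
toward the global level using pseudo-counts. This is a generic estimator chosen
for illustration, not the model implemented in score.py; the function name
`methylation_estimate` and the `prior_weight` parameter are assumptions.

```python
def methylation_estimate(n_meth, n_unmeth, global_level, prior_weight=2.0):
    """Pseudo-count estimate of the methylation level at one CpG.

    n_meth / n_unmeth: counts of fragments supporting each state.
    global_level: genome-wide methylation fraction (for example from LUMA).
    prior_weight: number of pseudo-observations given to the global level.
    """
    if not 0.0 <= global_level <= 1.0:
        raise ValueError("global_level must be between 0 and 1")
    if prior_weight <= 0:
        raise ValueError("prior_weight must be positive")
    total = n_meth + n_unmeth + prior_weight
    return (n_meth + prior_weight * global_level) / total


# With no fragments the estimate falls back to the global level.
no_data = methylation_estimate(0, 0, 0.71)
# With more evidence the observed ratio dominates.
mostly_meth = methylation_estimate(8, 2, 0.71)
```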

4.4 Alternative procedure
   Run the all-in-one script to perform the data analysis from beginning to
   end.
   * Input:
     1) Parameters required for the pipeline. See the example at
        data/pipeline_paras_hg18
     2) RE reads in the mate-pair format
     3) McrBC reads in the mate-pair format
     4) Output directory

   * Example:
        $ run_pipeline.py --format mates pipeline_paras_hg18 re_reads.mates \
          mcrbc_reads.mates /path/to/out_dir --run
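
When the pipeline has to be launched for several samples, it can be convenient
to assemble the command from Python. The sketch below only builds the argument
list shown in the example above; the helper name `pipeline_command` is
hypothetical, and no options beyond those in the example are assumed.

```python
import subprocess


def pipeline_command(para_file, re_mates, mcrbc_mates, out_dir):
    """Build the run_pipeline.py argument list used in the example above."""
    return [
        "run_pipeline.py",
        "--format", "mates",
        para_file,
        re_mates,
        mcrbc_mates,
        out_dir,
        "--run",
    ]


cmd = pipeline_command("pipeline_paras_hg18", "re_reads.mates",
                       "mcrbc_reads.mates", "/path/to/out_dir")

# Uncomment to launch the pipeline; check_call raises on a non-zero exit.
# subprocess.check_call(cmd)
```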


5. Visualization
================

Create BED, microarray, and Wiggle format tracks to visually inspect DNA
methylation profiles.

* BED tracks for CpG/RE/McrBC sites: use 'create_cpg_track.py'

* BED tracks for RE/McrBC fragments: use 'create_frag_track.py'

* Microarray tracks for DNA methylation profiles: use 'create_microarray_track.py'

* Wiggle tracks for DNA methylation profiles: use 'create_wiggle_track.py'
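
For reference, the Wiggle format mentioned above is plain text. The sketch
below formats per-CpG values as a UCSC variableStep Wiggle track, in which
positions are 1-based. It is not the code of create_wiggle_track.py; the
function name `wiggle_lines`, the track name, and the sample values are
assumptions.

```python
def wiggle_lines(chrom, values_by_position, track_name="methylation"):
    """Return the lines of a variableStep Wiggle track.

    values_by_position maps 1-based chromosome positions to scores.
    """
    lines = [
        'track type=wiggle_0 name="%s"' % track_name,
        "variableStep chrom=%s" % chrom,
    ]
    # Wiggle data lines must be sorted by position.
    for position in sorted(values_by_position):
        lines.append("%d %.3f" % (position, values_by_position[position]))
    return lines


# Hypothetical methylation probabilities, for illustration only.
example = wiggle_lines("chr1", {1500: 0.9, 1200: 0.25})
print("\n".join(example))
```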


6. Citation
===========

Xin Y, Ge Y, Haghighi F. Methyl-Analyzer - Whole Genome DNA Methylation
Profiling. Bioinformatics. 2011 Jun 17. [Epub ahead of print]
