This term research project aims to develop a benchmark prototype for comprehensive tool comparison that is easy to extend with new tools and tests.
ChIP-seq has become a widely adopted genomic assay for determining transcription factor binding sites and enrichment of specific histone modifications. Many different tools have been developed and published in recent years. However, a comprehensive comparison and review of these tools is still missing.
- clone this repo
- install ChIP-seq tools for comparison (see supported tools below)
- prepare datasets
- create config files in YAML format (see config/example___...Config.yaml)
The idea of this project is based on recent research by Sebastian Steinhauser, Nils Kurzawa, Roland Eils and Carl Herrmann. See the paper "A comprehensive comparison of tools for differential ChIP-seq analysis".
- Base
- epigenetics
- ChIP-Seq
- peak calling
- Wrap tools for ChIP-seq analysis
- Develop a benchmark prototype
- Build a service for benchmarking tools (optional)
- GSM1534712
- GSM1534713
- GSM1534714
- GSM1534715
- GSM1534736
- GSM1534737
- GSM1534738
- GSM1534739
- Zinbra
- Chipdiff
- MACS2
- SICER
- MAnorm
    [term2-research]$ python src/Main.py -h
    usage: Main.py [-h] [-t TOOLCONFIG] [-d DATACONFIG]

    optional arguments:
      -h, --help            show this help message and exit
      -t TOOLCONFIG, --toolconfig TOOLCONFIG
                            YAML file with tool config. See example in <config> folder
      -d DATACONFIG, --dataconfig DATACONFIG
                            YAML file with data config. See example in <config> folder
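
For orientation, the command-line interface in the help text above could be produced by an argparse setup along these lines. This is a minimal sketch reconstructed from the help text only: the function name build_parser and the file names tools.yaml and data.yaml are assumptions, and the actual code in src/Main.py may differ.

```python
import argparse


def build_parser():
    """Build a parser matching the help text (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(prog="Main.py")
    parser.add_argument(
        "-t", "--toolconfig",
        help="YAML file with tool config. See example in <config> folder",
    )
    parser.add_argument(
        "-d", "--dataconfig",
        help="YAML file with data config. See example in <config> folder",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch
# can be run anywhere; both file names are placeholders.
args = build_parser().parse_args(["-t", "tools.yaml", "-d", "data.yaml"])
print(args.toolconfig, args.dataconfig)
```

Both options are optional in the help text, so a run without them leaves the corresponding attribute set to None.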
To start comparing a new tool, follow these steps:
- Create a tool_name.py file in src/tools which implements the AbstractTool interface. Precisely, you need to implement only three methods: configure_data, configure_run_params and run. In the configuration methods you are free to pass any arguments you want to be saved as tool state.
- Add a running method in src/tools/running.py to be able to start your tool with the given params from outside.
- Add a method with the signature ___extract_newtoolname(self) inside the class in src/dataprocessing/dr_extractor.py, which has access to the data you work with and your configuration params.
- Append a YAML configuration block for your tool which contains any keys you need (see config/example__toolConfig).
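
The first step above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the AbstractTool class here is a stand-in with only the three method names mentioned above, and NewTool, build_command, the executable name newtool, the -t/-c flags and the file names are all hypothetical.

```python
import subprocess
from abc import ABC, abstractmethod


class AbstractTool(ABC):
    """Stand-in for the project's AbstractTool; the real definition may differ."""

    @abstractmethod
    def configure_data(self, *args, **kwargs): ...

    @abstractmethod
    def configure_run_params(self, *args, **kwargs): ...

    @abstractmethod
    def run(self): ...


class NewTool(AbstractTool):
    """Hypothetical wrapper around a command-line ChIP-seq tool."""

    def configure_data(self, treatment, control):
        # Any arguments can be stored here as tool state.
        self.treatment = treatment
        self.control = control

    def configure_run_params(self, executable="newtool", extra_args=()):
        self.executable = executable
        self.extra_args = list(extra_args)

    def build_command(self):
        # Helper (not part of the interface): assemble the command line.
        # The -t and -c flags are invented for this example.
        return [self.executable, "-t", self.treatment,
                "-c", self.control, *self.extra_args]

    def run(self):
        # Launch the external tool; raises CalledProcessError on failure.
        return subprocess.run(self.build_command(), check=True)


tool = NewTool()
tool.configure_data("treatment.bam", "control.bam")
tool.configure_run_params(extra_args=["--verbose"])
print(tool.build_command())
```

Keeping command assembly in a separate helper makes it possible to check the generated command without having the external tool installed.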
You just have to provide another method of the class Benchmark, which is located in src/bench/benchmarking.py, with the name __show_smth_you_want(self). This method will be started automatically with the other tests after peak extraction.
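
One way such automatic discovery can work in Python is sketched below. It is not taken from src/bench/benchmarking.py: the constructor, the peaks and results attributes, the run_all method and the toy coordinates are assumptions made for illustration. The sketch relies on the fact that Python rewrites a method name like __show_x inside class Benchmark to _Benchmark__show_x (name mangling), so the methods can be found by that prefix.

```python
class Benchmark:
    """Stand-in for the project's Benchmark class (assumed structure)."""

    def __init__(self, peaks):
        # Hypothetical input: {tool name: list of (start, end) regions}.
        self.peaks = peaks
        self.results = {}

    def __show_number_of_drs(self):
        self.results["number_of_drs"] = {
            tool: len(regions) for tool, regions in self.peaks.items()
        }

    def __show_smth_you_want(self):
        # A newly added test only needs to follow the naming pattern.
        self.results["smth_you_want"] = "custom test ran"

    def run_all(self):
        # Inside the class, __show_* names are stored in mangled form.
        prefix = "_Benchmark__show_"
        for name in sorted(dir(self)):
            if name.startswith(prefix):
                getattr(self, name)()
        return self.results


# Toy data, invented for the example.
bench = Benchmark({"MACS2": [(100, 200), (300, 450)], "SICER": [(120, 220)]})
results = bench.run_all()
print(results)
```

With this pattern, adding a test requires no registration step: any method whose name starts with __show_ is picked up on the next run.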
- Peak length distribution
- Number of DRs
- Consistency diagrams (Venn and scatterplots)
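
The quantities behind these tests can be computed with a few lines of plain Python. The sketch below is illustrative only and does not reproduce the project's implementation: it assumes regions are (start, end) tuples on a single chromosome with an exclusive end, and the function names and toy coordinates are invented for the example.

```python
from collections import Counter


def peak_lengths(peaks):
    """Length of each region, given (start, end) tuples."""
    return [end - start for start, end in peaks]


def length_histogram(peaks, bin_size=100):
    """Count regions per length bin; keys are the lower edge of each bin."""
    return Counter((length // bin_size) * bin_size
                   for length in peak_lengths(peaks))


def overlaps(a, b):
    """True if two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def venn_counts(peaks_a, peaks_b):
    """Counts for a two-set Venn diagram, from each tool's point of view."""
    shared_a = sum(any(overlaps(p, q) for q in peaks_b) for p in peaks_a)
    shared_b = sum(any(overlaps(q, p) for p in peaks_a) for q in peaks_b)
    return {
        "only_a": len(peaks_a) - shared_a,
        "shared_a": shared_a,
        "shared_b": shared_b,
        "only_b": len(peaks_b) - shared_b,
    }


# Toy regions for two hypothetical tools.
tool_a = [(100, 250), (400, 520), (900, 1000)]
tool_b = [(200, 300), (950, 1200), (2000, 2100)]

print(len(tool_a), len(tool_b))      # number of DRs per tool
print(length_histogram(tool_a))      # peak length distribution
print(venn_counts(tool_a, tool_b))   # consistency between tools
```

The shared count is reported separately for each tool because one region of tool A can overlap several regions of tool B, so the two numbers need not be equal. The pairwise overlap check is quadratic; for real peak sets a sorted sweep or an interval tree would be a better fit.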
- How many tools will be compared after start?
- As many as you describe in the configuration file. If you want to compare just a subset of tools, you may comment out the rest.