Pipeline of Epigenome data analysis

Overview

This workshop documents the complete processing steps of epigenome data analysis, covering CUT&Tag-seq (ChIP-seq) and ATAC-seq data. The pipeline was written in the CC-LY Lab by Shawn (Xiangyu) Pan and Xuelan Chen. The page is meant to be easy to read and follow, especially for newcomers to bioinformatics. We will try to keep Pipeline-of-Epigenome updated. The pipeline is flexible: you can extend it with further analysis steps and tools, such as TF enrichment or bulk ATAC-seq data deconvolution. Comments and requests that help improve this page are welcome. We hope it helps you get a good grip on basic epigenome data analysis quickly and smoothly.

The analysis pipeline includes

1. CUT&Tag-seq (ChIP-seq)

2. ATAC-seq

For CUT&Tag-seq data (ChIP-seq)

You can learn the standard analysis pipeline from the official documentation by clicking here. However, that pipeline is designed for single-index 25x25 PE Illumina sequencing data. In our lab, we prefer to sequence libraries on the NovaSeq 6000 platform with PE150. Hence, we modified the analysis pipeline to suit our data better.

Besides, you can gain more skills and knowledge by reading this post, which comprehensively collects the algorithms and tools used in CUT&Tag-seq (ChIP-seq) analysis.

On this page, fastp is used for quality control of the raw fastq files, and bowtie2 is used to align the reads to the reference genome. After PCR duplicates are removed and peaks are called in each sample, chromVAR and GenomicRanges are used to quantify the counts of the CUT&Tag-seq data. In the latest version, DESeq2-normalized data, which better reduce the effects of peak size and library size, are used to identify differential peaks/genes.
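The per-sample steps described above can be sketched as a dry run. This is a minimal illustration, not the lab's actual commands: the sample name, index path, thread count, output file names, and every parameter value are hypothetical, MACS2 is shown as one of the two peak callers listed on this page, and `-g hs` assumes a human genome. The script only prints each command so that the order of the steps can be inspected without the tools being installed.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the per-sample steps: QC, alignment, sorting,
# duplicate removal, peak calling. It only PRINTS the commands.
# All file names, paths, and parameter values below are hypothetical.

sample="sample1"                 # hypothetical sample name
index="/path/to/bowtie2_index"   # hypothetical bowtie2 index prefix
threads=8                        # hypothetical thread count

# Print a command instead of executing it.
# Replace the body with "$@" to actually run each step.
run() { printf '%s\n' "$*"; }

cuttag_steps() {
    # 1. Quality control of the raw paired-end fastq files with fastp
    run fastp -w "$threads" \
        -i "${sample}_R1.fastq.gz" -I "${sample}_R2.fastq.gz" \
        -o "${sample}_R1.clean.fastq.gz" -O "${sample}_R2.clean.fastq.gz"

    # 2. Alignment to the reference with bowtie2
    run bowtie2 -p "$threads" -x "$index" \
        -1 "${sample}_R1.clean.fastq.gz" -2 "${sample}_R2.clean.fastq.gz" \
        -S "${sample}.sam"

    # 3. Coordinate sorting with samtools
    run samtools sort -@ "$threads" -o "${sample}.sorted.bam" "${sample}.sam"

    # 4. PCR duplicate removal with GATK MarkDuplicates, then indexing
    run gatk MarkDuplicates -I "${sample}.sorted.bam" -O "${sample}.dedup.bam" \
        -M "${sample}.dup_metrics.txt" --REMOVE_DUPLICATES true
    run samtools index "${sample}.dedup.bam"

    # 5. Peak calling with MACS2 in paired-end mode
    run macs2 callpeak -t "${sample}.dedup.bam" -f BAMPE -g hs \
        -n "$sample" --outdir peaks
}

cuttag_steps
```

Printing the commands first makes it easy to review paths and parameters before committing compute time. The exact settings used in this pipeline are given in Part 1.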

1. Introduction to the tools on the `Linux` system

Before working through this page, first install the following tools on your Linux system:

# Quality control of raw data with multiple threads.
# https://github.com/OpenGene/fastp
> fastp

# Alignment tool for CUT&Tag-seq analysis.
# https://github.com/BenLangmead/bowtie2
> bowtie2

# Processing of .bam files, including indexing, alignment summary, and read extraction (filtering).
# https://github.com/samtools/samtools
> samtools

# Removal of PCR duplicates.
# https://github.com/broadinstitute/gatk/releases
> gatk

# Conversion from .bam files to .bedgraph.
# https://github.com/arq5x/bedtools2
> bedtools

# Peak calling in CUT&Tag-seq data.
# https://github.com/FredHutch/SEACR
> SEACR_1.3.sh

# Peak calling in CUT&Tag-seq data.
# https://github.com/macs3-project/MACS
> macs2/macs3

# Visualization of .bam files and peak distribution summary, including generation of bw files.
# https://github.com/deeptools/deepTools
> deeptools
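Before starting, it can help to confirm that each tool is reachable on your `PATH`. The following is a minimal sketch and not part of the original pipeline: the function name `check_tools` is hypothetical, and the executable names assume default installations (deepTools is checked through its `bamCoverage` command, MACS through `macs2`, and SEACR through `SEACR_1.3.sh`).

```shell
#!/usr/bin/env bash
# Report which of the required tools can be found on PATH.
# check_tools is a hypothetical helper name used only for illustration.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    # Return non-zero when at least one tool is missing.
    [ "$missing" -eq 0 ]
}

# Executable names are assumptions; adjust them to your installation
# (for example, macs3 instead of macs2).
check_tools fastp bowtie2 samtools gatk bedtools SEACR_1.3.sh macs2 bamCoverage || true
```

Any tool reported as missing can be installed from the repository linked in its entry above.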

After installing the software on your Ubuntu system, you can begin working through the processing steps of the analysis.

The code for CUT&Tag-seq (ChIP-seq) data analysis is divided into two parts: pre-processing on the Linux system and statistical calculation in the R environment.

You can visit them by clicking:

Part 1. Pre-processing of each CUT&Tag-seq sample on the `Linux` system

Part 2. Statistical calculation for CUT&Tag-seq data analysis in the `R` environment

For ATAC-seq data

[ ] To be updated.

pangxueyu233/Pipeline-of-Epigenome
