Modified HiCUP (Hi-C User Pipeline)

Overview

This is a modified version of HiCUP, a bioinformatics pipeline for processing Hi-C data. It is designed specifically to improve the di-tag yield when 4-cutter restriction enzymes are used to create Hi-C junctions.

The rationale for this pipeline is that the standard view, in which a sequenced read pair contains just one Hi-C junction, is an oversimplification: in reality a read pair may contain several Hi-C junctions. The pipeline identifies each of the putative component sequences and then generates all pairwise combinations of interactions between them. For example, a read pair containing DNA derived from genomic regions A, B and C is converted into the pairwise interactions A-B, A-C and B-C.
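The A, B, C example can be sketched in a few lines of Python. This only illustrates the combinatorics; it is not code from the pipeline (which is written in Perl), and the function name `pairwise_interactions` is hypothetical.

```python
from itertools import combinations


def pairwise_interactions(regions):
    """Return every unordered pair of regions found in one read pair."""
    return [f"{a}-{b}" for a, b in combinations(regions, 2)]


# Three regions in one read pair give three pairwise interactions.
print(pairwise_interactions(["A", "B", "C"]))
```

In general, a read pair containing n regions yields n(n-1)/2 pairwise interactions, so four regions would give six.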

Additional scripts

This modified version of HiCUP incorporates the following additional Perl scripts:

hicup_combiner:

This script cuts reads at each occurrence of a Hi-C ligation junction and retains all the resulting "sub-reads". Sub-reads from the forward read are labelled F1, F2, ...; those from the reverse read are labelled R1, R2, ... Read pairs that do not contain the Hi-C ligation sequence are given the tag ORIGINAL. Please note that when a new “reconstructed Hi-C read” is built from two sub-reads that both come from the forward read, or both from the reverse read, one of the sub-reads is reverse-complemented. For example, F1 and F2 become F1-FRC2.

Tags using this naming system will be appended to the FASTQ read ID headers.
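The cutting and labelling step might look roughly like the Python sketch below. It is not taken from hicup_combiner. It assumes the DpnII junction is GATCGATC (a filled-in and re-ligated GATC site), that each sub-read keeps one copy of the site, and that a pair is tagged ORIGINAL only when neither read contains the junction. All function names are hypothetical.

```python
JUNCTION = "GATCGATC"  # assumed DpnII ligation junction
SITE = "GATC"          # DpnII recognition site


def split_at_junctions(seq):
    """Cut seq at every junction; each piece keeps one copy of the site (assumption)."""
    pieces = []
    start = 0
    while True:
        hit = seq.find(JUNCTION, start)
        if hit == -1:
            break
        cut = hit + len(SITE)  # cut between the two copies of the site
        pieces.append(seq[start:cut])
        start = cut
    pieces.append(seq[start:])
    return pieces


def label_read_pair(forward, reverse):
    """Label sub-reads F1, F2, ... and R1, R2, ..., or ORIGINAL if no junction is found."""
    f_parts = split_at_junctions(forward)
    r_parts = split_at_junctions(reverse)
    if len(f_parts) == 1 and len(r_parts) == 1:
        return {"ORIGINAL": (forward, reverse)}
    labelled = {f"F{i}": s for i, s in enumerate(f_parts, 1)}
    labelled.update({f"R{i}": s for i, s in enumerate(r_parts, 1)})
    return labelled


def reverse_complement(seq):
    """Reverse-complement a DNA sequence (used for intra-read pairs such as F1-FRC2)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


sub_reads = label_read_pair("AAAAGATCGATCTTTT", "CCCCCCCC")
print(sub_reads)
print(reverse_complement(sub_reads["F2"]))
```

Here the forward read is split into F1 and F2, and F2 would be reverse-complemented before being paired with F1 as F1-FRC2.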

We can now identify Hi-C interaction “groups” in our reconstructed datasets (i.e. reconstructed Hi-C reads from the same group will all be derived from a single original read pair).

The read number is also appended to the header. For example, 7.F2-FRC3 denotes an intra-read interaction between sub-reads 2 and 3 of the 7th forward read in the FASTQ file.
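For downstream analysis it can help to decode these tags. The Python sketch below parses a tag of the form shown in the example (7.F2-FRC3). Only that form is handled; how other tags (such as ORIGINAL) are combined with the read number is not described here, so they are rejected. The function name and the returned structure are hypothetical.

```python
import re

# One sub-read label: F or R, an optional RC marker, then the sub-read index.
SUBREAD = re.compile(r"^([FR])(RC)?(\d+)$")


def parse_tag(tag):
    """Split a tag such as '7.F2-FRC3' into the read number and its two sub-reads."""
    read_number, _, pair = tag.partition(".")
    ends = []
    for part in pair.split("-"):
        match = SUBREAD.match(part)
        if match is None:
            raise ValueError(f"unrecognised sub-read label: {part!r}")
        ends.append({
            "read": match.group(1),
            "reverse_complemented": match.group(2) is not None,
            "index": int(match.group(3)),
        })
    return int(read_number), ends


number, ends = parse_tag("7.F2-FRC3")
intra_read = ends[0]["read"] == ends[1]["read"]
print(number, ends, intra_read)
```

Tags sharing the same read number belong to the same interaction group, so the first returned value can serve as a grouping key.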

The script also uses R to generate graphs.

Note: the DpnII ligation junction is hard-coded into the script, irrespective of the digest file used.

hicup_allocater:

This Perl script allocates each read to the restriction fragment from which it was derived (using the mapping results). This additional information is incorporated into the read header.
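One common way to look up the fragment containing a mapped position is a binary search over sorted fragment start coordinates. The Python sketch below shows that idea; it is not the method confirmed to be used by hicup_allocater, and the per-chromosome list of start positions and the function name are hypothetical.

```python
from bisect import bisect_right


def allocate_fragment(fragment_starts, position):
    """Return the index of the fragment containing position.

    fragment_starts is a sorted list of 1-based fragment start coordinates
    for a single chromosome (hypothetical input format).
    """
    index = bisect_right(fragment_starts, position) - 1
    if index < 0:
        raise ValueError("position lies before the first fragment")
    return index


starts = [1, 401, 950, 1300]  # illustrative fragment starts on one chromosome
print(allocate_fragment(starts, 400))
print(allocate_fragment(starts, 401))
```

Position 400 falls in the first fragment (index 0) and position 401 starts the second (index 1).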

hicup_prefilter:

This script removes identical fragment-fragment interactions from the same “di-tag group” (generated from a conventional read pair by hicup_combiner). It also removes novel intra-fragment interactions generated by hicup_combiner (but retains those that may have been generated as part of the conventional HiCUP pipeline).
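The two filtering rules can be sketched in Python as follows. This is an illustration only, not the hicup_prefilter code. It assumes each di-tag is a tuple of group number, tag and two fragment identifiers, and it treats a di-tag as "conventional" when its tag is ORIGINAL; both the record layout and that interpretation are assumptions.

```python
def prefilter(ditags):
    """Filter di-tags given as (group_id, tag, fragment_a, fragment_b) tuples."""
    seen = set()
    kept = []
    for group_id, tag, frag_a, frag_b in ditags:
        # Rule 1: drop intra-fragment di-tags newly created by the combiner.
        if tag != "ORIGINAL" and frag_a == frag_b:
            continue
        # Rule 2: keep one copy of each fragment pair per group, ignoring order.
        key = (group_id, frozenset((frag_a, frag_b)))
        if key in seen:
            continue
        seen.add(key)
        kept.append((group_id, tag, frag_a, frag_b))
    return kept


example = [
    (7, "F1-R1", 10, 25),
    (7, "F1-FRC2", 10, 10),  # novel intra-fragment interaction: removed
    (7, "F2-R1", 25, 10),    # same fragment pair as the first di-tag: removed
    (8, "ORIGINAL", 3, 3),   # conventional di-tag: retained
]
print(prefilter(example))
```

Only the first and last di-tags survive in this example.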

The HiCUP master script runs the pipeline scripts in the following order:

hicup_combiner
hicup_truncater
hicup_mapper
hicup_allocater
hicup_prefilter
hicup_filter
hicup_deduplicator

Usage notes

As for the standard pipeline, the HiCUP master script executes each step in turn. With this modified version of HiCUP, no HTML summary file is generated. However, a pipeline summary file named "hicup_combinations_pipeline_summary_report.txt" is generated.

When running the pipeline, create a configuration file and run HiCUP with the command:

hicup -c [configuration_file]

This is a development version of HiCUP and may only be used for protocols in which DpnII was used to generate the Hi-C ligation junctions. Furthermore, process only one FASTQ file pair at a time, and keep the input and output for that pair together in a single directory, separate from other datasets. Make sure the FASTQ input files are in your current working directory and let HiCUP write the output to your current working directory (this is the HiCUP default, i.e. do not specify the --outdir option).

The original HiCUP pipeline homepage is on the Babraham Bioinformatics website.
