Merge branch 'master' into develop
joachimwolff committed May 8, 2018
2 parents 2b8c74c + 8f747ee commit 1d9200c
Showing 23 changed files with 673 additions and 99 deletions.
51 changes: 37 additions & 14 deletions README.rst
Expand Up @@ -18,52 +18,61 @@ Set of programs to process, analyze and visualize Hi-C data
Sequencing techniques that probe the 3D organization of the genome generate large amounts of data whose processing,
analysis and visualization is challenging. Here, we present HiCExplorer, a set of tools for the analysis and
visualization of chromosome conformation data. HiCExplorer facilitates the creation of contact matrices, correction
of contacts, TAD detection, A/B compartments, merging, reordering of chromosomes, conversion from different formats including
`cooler <https://github.com/mirnylab/cooler>`_ and detection of long-range contacts. Moreover, it allows the visualization of
multiple contact matrices along with other types of data like genes, compartments, ChIP-seq coverage tracks (and in general
of contacts, TAD detection, A/B compartments, merging, reordering of chromosomes, conversion from different formats including
`cooler <https://github.com/mirnylab/cooler>`_ and detection of long-range contacts. Moreover, it allows the visualization of
multiple contact matrices along with other types of data like genes, compartments, ChIP-seq coverage tracks (and in general
any type of genomic scores), long range contacts and the visualization of viewpoints.


Citation:
^^^^^^^^^


Fidel Ramirez, Vivek Bhardwaj, Jose Villaveces, Laura Arrigoni, Bjoern A Gruening, Kin Chung Lam, Bianca Habermann, Asifa Akhtar, Thomas Manke.
Fidel Ramirez, Vivek Bhardwaj, Jose Villaveces, Laura Arrigoni, Bjoern A Gruening, Kin Chung Lam, Bianca Habermann, Asifa Akhtar, Thomas Manke.
**"High-resolution TADs reveal DNA sequences underlying genome organization in flies". Nature Communications**, Volume 9, Article number: 189 (2018), doi: https://doi.org/10.1038/s41467-017-02525-w


.. image:: ./docs/images/hicex2.png

Availability
^^^^^^^^^^^^

HiCExplorer is available as a **command line suite of tools** on this very GitHub repository and also on other platforms (detailed in *Installation* below).

A **Galaxy HiCExplorer version** is directly available to users at http://hicexplorer.usegalaxy.eu. Training material is available at the `Galaxy Training Network <http://galaxyproject.github.io/training-material/topics/epigenetics/tutorials/hicexplorer/tutorial.html>`_,
while a Galaxy Tour is available `here <https://hicexplorer.usegalaxy.eu/tours/hixexplorer>`_ for users not familiar with this platform. Galaxy HiCExplorer is also available as a Docker image at the `Docker Galaxy HiCExplorer GitHub repository <https://github.com/deeptools/docker-galaxy-hicexplorer>`_. Finally, this Galaxy version is available on the `Galaxy Tool Shed <https://toolshed.g2.bx.psu.edu/>`_ and on the corresponding `GitHub repository <https://github.com/galaxyproject/tools-iuc>`_.


Installation
^^^^^^^^^^^^

With version 2.0 HiCExplorer is available for Python 2 and Python 3:
With version 2.0, HiCExplorer is available for Python 2 and Python 3 and can be installed via:

- Command line usage (via pip/anaconda/github)
- Integration into Galaxy servers (via toolshed/API/web-browser)
- Pip, Anaconda and GitHub for command line usage.
- Toolshed and Docker image for its integration on Galaxy servers.

There are many easy ways to install HiCExplorer. Details can be found
`here <https://hicexplorer.readthedocs.io/en/latest/content/installation.html>`__
`here <https://hicexplorer.readthedocs.io/en/latest/content/installation.html>`_.

Command line version
++++++++++++++++++++

Install with conda
++++++++++++++++++
__________________

The easiest way to install HiCExplorer is using `BioConda <http://bioconda.github.io/>`_
::

$ conda install hicexplorer -c bioconda -c conda-forge



Install with pip
++++++++++++++++
________________
::

$ pip install HiCExplorer

Install by cloning this repository
++++++++++++++++++++++++++++++++++
__________________________________

You can install any of the HiCExplorer branches on the command line
(Linux/macOS) by cloning this git repository:
Expand All @@ -80,7 +89,21 @@ If you don't have root permission, you can set a specific folder using the ``--p

$ python setup.py install --prefix /User/Tools/hicexplorer

Galaxy version
++++++++++++++

Install with Docker
___________________

Instructions for installing the Docker image are available at https://github.com/deeptools/docker-galaxy-hicexplorer.


Install with Tool Shed
______________________

Galaxy HiCExplorer is part of the `Galaxy Tool Shed <https://toolshed.g2.bx.psu.edu/>`_ and can be installed from there to any Galaxy server following `this link <https://toolshed.g2.bx.psu.edu/repository/browse_repository?id=f1554978eeb3da8b>`_.


Documentation:
^^^^^^^^^^^^^^
Please visit our complete documentation `Here <http://hicexplorer.readthedocs.org/>`_
Please visit our complete documentation `Here <http://hicexplorer.readthedocs.org/>`_. This documentation is also available directly within `Galaxy <http://hicexplorer.usegalaxy.eu/>`_.
2 changes: 1 addition & 1 deletion docs/conf.py
Expand Up @@ -28,7 +28,7 @@
'mpl_toolkits', 'mpl_toolkits.axisartist', 'mpl_toolkits.mplot3d', 'mpl_toolkits.axes_grid1',
'Bio', 'Bio.Seq', 'Bio.Alphabet',
'pyBigWig', 'tables', 'pytables', 'future', 'past', 'builtins', 'past.builtins',
'future.utils', 'cooler', '__future__', 'logging']
'future.utils', 'cooler', '__future__', 'logging', 'unidecode']

for mod_name in MOCK_MODULES:
sys.modules[mod_name] = mock.Mock()
Expand Down
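
The `MOCK_MODULES` loop above lets the documentation build import the package without its heavy dependencies being installed. A minimal sketch of the mechanism follows. It uses `unittest.mock` from the standard library rather than the `mock` package imported in `docs/conf.py`, and the module names are hypothetical placeholders, not real dependencies.

```python
import sys
from unittest import mock

# Hypothetical module names, used only for this illustration.
MOCK_MODULES = ['fake_heavy_dep', 'fake_heavy_dep.submodule']

for mod_name in MOCK_MODULES:
    # Register a stand-in object; a later `import` finds it in sys.modules
    # and never searches the file system for the real package.
    sys.modules[mod_name] = mock.Mock()

import fake_heavy_dep  # succeeds although no such package is installed


def is_mocked(name):
    """Return True if `name` currently resolves to a Mock stand-in."""
    return isinstance(sys.modules.get(name), mock.Mock)
```

Any attribute access on the stand-in returns another `Mock`, which is why module-level code that only references the dependency can still be imported while the docs are built.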
32 changes: 32 additions & 0 deletions docs/content/News.rst
@@ -1,6 +1,38 @@
News and Developments
=====================

Release 2.1.1
-------------
**27 March 2018**

This release fixes a Python 3 problem in which chromosome names were of type bytes.

Release 2.1
-----------
**5 March 2018**

The 2.1 version of HiCExplorer comes with new features and bugfixes.

- Adding the new feature `hicAggregateContacts`: A tool that allows plotting of aggregated Hi-C sub-matrices of a specified list of positions.
- Many improvements to the documentation and the help text. Thanks to @GinaRe and @gtrichard.
- hicPlotMatrix:

    - Supports only bigwig files for an additional data track.
    - The argument `--pca` was renamed to `--bigwig`.
    - Smooths the bigwig values to neighboring bins if no data is present there.
    - Fixes a crash of `tight_layout`.
    - Adds the possibility to flip the sign of the values of the bigwig track.
    - Adds the possibility to scale the values of the bigwig track.

- hicPlotViewpoint: Adds a feature to plot multiple matrices in one image.
- cooler file format:

    - Supports mcool files.
    - Applies correction factors if present.
    - Optionally reads `bin['weight']`.

- Fixes:

    - A crash in hicPlotTads if `horizontal lines` were used.
    - Titles are checked for non-ASCII characters; any that are found are converted to the closest-looking ASCII character.

- Updated and pinned the version numbers of the dependencies.
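
The cooler items above mention correction factors and `bin['weight']`. As a rough illustration only, the sketch below assumes the cooler convention in which a balanced value is the raw count multiplied by the weights of its two bins. How HiCExplorer applies the factors internally is not shown in this changelog, and the function name is hypothetical.

```python
def apply_bin_weights(counts, weights):
    """Scale each count by the weights of its row bin and its column bin.

    `counts` is a square matrix given as nested lists and `weights` holds one
    value per bin. A weight of None marks a masked bin, whose entries are
    returned as None instead of a number.
    """
    n = len(counts)
    balanced = []
    for i in range(n):
        row = []
        for j in range(n):
            w_i, w_j = weights[i], weights[j]
            if w_i is None or w_j is None:
                row.append(None)
            else:
                row.append(counts[i][j] * w_i * w_j)
        balanced.append(row)
    return balanced


raw = [[4, 2], [2, 8]]
weights = [0.5, 0.25]
balanced = apply_bin_weights(raw, weights)
```

With real data the same multiplication would normally be done on a sparse matrix; nested lists are used here only to keep the example free of dependencies.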


Release 2.0
-----------

Expand Down
2 changes: 1 addition & 1 deletion docs/content/installation.rst
Expand Up @@ -82,7 +82,7 @@ a specific folder using the ``--prefix`` option)
Galaxy installation
--------------------

HiCExplorer can be easily integrated into a local `Galaxy <http://galaxyproject.org>`_.
HiCExplorer can be easily integrated into a local `Galaxy <http://galaxyproject.org>`_; the wrappers are provided at the `Galaxy tool shed <https://toolshed.g2.bx.psu.edu/>`_.

Installation with Docker
^^^^^^^^^^^^^^^^^^^^^^^^
Expand Down
Binary file modified docs/images/hicex2.png
Binary file added docs/images/hicex2_backup.png
23 changes: 15 additions & 8 deletions docs/index.rst
Expand Up @@ -4,11 +4,20 @@ HiCExplorer
Set of programs to process, normalize, analyze and visualize Hi-C data
----------------------------------------------------------------------

HiCexplorer addresses the common tasks of Hi-C analysis from processing to visualization.
HiCExplorer addresses the common tasks of Hi-C data analysis from processing to visualization.

.. image:: ./images/hicex2.png


Availability
------------

HiCExplorer is available as a **command line suite of tools** on this `GitHub repository <https://github.com/deeptools/HiCExplorer>`_.

A **Galaxy HiCExplorer version** is directly available to users at http://hicexplorer.usegalaxy.eu. Training material is available at the `Galaxy Training Network <http://galaxyproject.github.io/training-material/topics/epigenetics/tutorials/hicexplorer/tutorial.html>`_,
while a Galaxy Tour is available `here <https://hicexplorer.usegalaxy.eu/tours/hixexplorer>`_ for users not familiar with this platform. Galaxy HiCExplorer is also available as a Docker image at the `Docker Galaxy HiCExplorer GitHub repository <https://github.com/deeptools/docker-galaxy-hicexplorer>`_. Finally, this Galaxy version is available on the `Galaxy Tool Shed <https://toolshed.g2.bx.psu.edu/>`_ and on the corresponding `GitHub repository <https://github.com/galaxyproject/tools-iuc>`_.


The following is the list of tools available in HiCExplorer
-----------------------------------------------------------

Expand All @@ -23,21 +32,21 @@ tool description
:ref:`hicFindEnrichedContacts` Identifies enriched Hi-C contacts
:ref:`hicCorrelate` Computes and visualises the correlation of Hi-C matrices
:ref:`hicFindTADs` Identifies Topologically Associating Domains (TADs)
:ref:`hicPCA` Computes the eigenvectors for A/B compartments
:ref:`hicPCA` Computes the eigenvectors for A/B compartments
:ref:`hicTransform` Computes an obs_exp matrix like Lieberman-Aiden (2009), a Pearson correlation matrix and/or a covariance matrix. These matrices can be used for plotting.
:ref:`hicMergeMatrixBins` Merges consecutive bins on a Hi-C matrix to reduce resolution
:ref:`hicMergeTADbins` Uses a BED file of domains or TAD boundaries to merge the bin counts of a Hi-C matrix.
:ref:`hicPlotDistVsCounts` Plot the decay in interaction frequency with distance
:ref:`hicPlotMatrix` Plots a Hi-C matrix as a heatmap
:ref:`hicPlotTADs` Plots TADs as a track that can be combined with other tracks (genes, signal, interactions)
:ref:`hicPlotViewpoint` A plot with the interactions around a reference point or region.
:ref:`hicAggregateContacts` A tool that allows plotting of aggregated Hi-C sub-matrices of a specified list of positions.
:ref:`hicPlotViewpoint` A plot with the interactions around a reference point or region.
:ref:`hicAggregateContacts` A tool that allows plotting of aggregated Hi-C sub-matrices of a specified list of positions.
:ref:`hicSumMatrices` Adds Hi-C matrices of the same size
:ref:`hicPlotDistVsCounts` Plots distance vs. Hi-C counts of corrected data
:ref:`hicExport` Export matrix to text formats
:ref:`hicInfo` Shows information about a Hi-C matrix file (no. of bins, bin length, sum, max, min, etc)
:ref:`hicCompareMatrices` Computes difference or ratio between two matrices
:ref:`hicLog2Ratio` Computes the log2 ratio between two matrices.
:ref:`hicLog2Ratio` Computes the log2 ratio between two matrices.
=============================== ==========================================================================================================================================================


Expand All @@ -54,14 +63,12 @@ Contents:

.. toctree::
:maxdepth: 2

content/installation
content/list-of-tools
content/example_usage
content/News




Citation
---------
Expand Down
1 change: 1 addition & 0 deletions docs/requirements.txt
@@ -1,2 +1,3 @@
sphinx==1.5.6
mock
sphinx-argparse
2 changes: 1 addition & 1 deletion hicexplorer/_version.py
Expand Up @@ -2,4 +2,4 @@
# This file is originally generated from Git information by running 'setup.py
# version'. Distribution tarballs contain a pre-generated copy of this file.

__version__ = '2.0'
__version__ = '2.1.3'
19 changes: 10 additions & 9 deletions hicexplorer/hicBuildMatrix.py
Expand Up @@ -181,14 +181,15 @@ def parse_arguments(args=None):
help='Sequence of the restriction site.')

parserOpt.add_argument('--danglingSequence',
help='Dangling end sequence left by the restriction enzyme. For DpnII for example, the '
'dangling end is the same restriction sequence. This is used '
'to discard reads that end/start with such sequence '
'and that are considered un-ligated fragments or '
'"dangling-ends". If not given, such statistics will '
'not be available. Dangling-ends usually represent a significant proportion '
'of Hi-C libraries and might lead to erronous Hi-C matrices if they are not discarded. '
'This parameter must be taken into account.')
help='Sequence left by the restriction enzyme after cutting. Each restriction enzyme '
'recognizes a different DNA sequence and, after cutting, leaves behind a specific '
'"sticky" end or dangling end sequence. For example, for HindIII the restriction site '
'is AAGCTT and the dangling end is AGCT. For DpnII, the restriction site and dangling '
'end sequence are the same: GATC. This information is easily found in the description '
'of the restriction enzyme. The dangling sequence is used to classify and report reads '
'whose 5\' end starts with this sequence as dangling-end reads. A significant portion '
'of dangling-end reads in a sample is indicative of a problem with the re-ligation '
'step of the protocol.')

parserOpt.add_argument('--region', '-r',
help='Region of the genome to limit the operation to. '
Expand All @@ -198,7 +199,7 @@ def parse_arguments(args=None):
required=False,
type=genomicRegion
)
# # curently not implemented
# currently not implemented
parserOpt.add_argument('--removeSelfLigation',
# help='If set, inward facing reads less than 1000 bp apart and having a restriction'
# 'site in between are removed. Although this reads do not contribute to '
Expand Down
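
The new `--danglingSequence` help text describes classifying reads whose 5' end starts with the dangling sequence. A minimal sketch of that check is given below. The function names and the example reads are hypothetical, and the real implementation in `hicBuildMatrix.py` may differ, for instance in how it handles read orientation.

```python
def starts_with_dangling_end(read_sequence, dangling_sequence):
    """Return True if the 5' end of the read begins with the dangling sequence."""
    # Case-insensitive comparison, since soft-masked reads may be lower case.
    return read_sequence.upper().startswith(dangling_sequence.upper())


def count_dangling_ends(reads, dangling_sequence):
    """Count the reads that would be reported as dangling-end reads."""
    return sum(1 for read in reads
               if starts_with_dangling_end(read, dangling_sequence))


# Made-up reads for a DpnII library, where the dangling end is GATC.
example_reads = ["GATCTTAGGC", "AAGATCTTAG", "gatcACGT"]
n_dangling = count_dangling_ends(example_reads, "GATC")
```

Only the first and third reads start with `GATC`; the second contains the site internally and is not counted.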
19 changes: 8 additions & 11 deletions hicexplorer/hicPlotDistVsCounts.py
Expand Up @@ -2,6 +2,7 @@

import os.path
import numpy as np
import pandas as pd
import argparse

import hicexplorer.HiCMatrix as HiCMatrix
Expand Down Expand Up @@ -384,17 +385,13 @@ def main(args=None):
axs[row, col] = ax
idx += 1
if args.outFileData is not None:
if args.perchr and len(args.matrices) > 1:
label = labels[matrix_file]
args.outFileData.write("#{}\n".format(chrom))

elif args.perchr:
label = chrom
else:
label = labels[matrix_file]
args.outFileData.write("#{}\n".format(label))
args.outFileData.write("\t".join(map(str, x)) + "\n")
args.outFileData.write("\t".join(map(str, y)) + "\n")
x_vals = np.stack(x).T
y_vals = np.stack(y).T
table_to_export = pd.DataFrame({'Matrix': labels[matrix_file],
'Chromosome': chrom,
'Distance': x_vals,
'Contacts': y_vals})
table_to_export.to_csv(args.outFileData, sep='\t')

for ax in axs.reshape(-1):
if ax is None:
Expand Down
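
The hunk above replaces the commented, row-oriented output with a tab-separated table with the columns Matrix, Chromosome, Distance and Contacts. The sketch below shows that long-format layout with the standard-library `csv` module instead of pandas, so it is not the code from the commit. The function name and sample values are hypothetical, and writing the header only once is a choice made for this example.

```python
import csv
import io


def write_distance_table(handle, records):
    """Write one row per (matrix, chromosome, distance) combination.

    `records` is an iterable of (label, chrom, distances, contacts) tuples,
    where `distances` and `contacts` are equally long sequences.
    """
    writer = csv.writer(handle, delimiter='\t', lineterminator='\n')
    writer.writerow(['Matrix', 'Chromosome', 'Distance', 'Contacts'])
    for label, chrom, distances, contacts in records:
        for distance, contact in zip(distances, contacts):
            writer.writerow([label, chrom, distance, contact])


buffer = io.StringIO()
write_distance_table(buffer, [('wt', 'chr1', [10000, 20000], [5.0, 2.5])])
table_text = buffer.getvalue()
```

A long table of this kind can be filtered by matrix or chromosome after loading, which the previous two-line-per-label layout did not allow directly.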
13 changes: 9 additions & 4 deletions hicexplorer/hicPlotMatrix.py
Expand Up @@ -57,7 +57,7 @@ def parse_arguments(args=None):
help='Plot title.')

parserOpt.add_argument('--scoreName', '-s',
help='Score name.')
help='Score name label for the heatmap legend.')

parserOpt.add_argument('--perChromosome',
help='Instead of plotting the whole matrix, '
Expand All @@ -67,7 +67,8 @@ def parse_arguments(args=None):

parserOpt.add_argument('--clearMaskedBins',
help='If set, masked bins are removed from the matrix '
'and not shown as black lines.',
'and the nearest bins are extended to cover the empty space '
'instead of plotting black lines.',
action='store_true')

parserOpt.add_argument('--chromosomeOrder',
Expand All @@ -80,7 +81,7 @@ def parse_arguments(args=None):
help='Plot only this region. The format is '
'chr:start-end The plotted region contains '
'the main diagonal and is symmetric unless '
' --region2 is given.'
'--region2 is given.'
)

parserOpt.add_argument('--region2',
Expand Down Expand Up @@ -457,7 +458,11 @@ def main(args=None):
# if args.matrix.endswith('.cool') or cooler.io.is_cooler(args.matrix) or'.mcool' in args.matrix:
is_cooler = check_cooler(args.matrix)
log.debug("Cooler or no cooler: {}".format(is_cooler))
if is_cooler and not args.region2:
open_cooler_chromosome_order = True
if args.chromosomeOrder is not None and len(args.chromosomeOrder) > 1:
open_cooler_chromosome_order = False

if is_cooler and not args.region2 and open_cooler_chromosome_order:
log.debug("Retrieve data from cooler format and use its benefits.")
regionsToRetrieve = None
if args.region:
Expand Down
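
The condition added in this hunk can be read as a single predicate: the cooler-specific loading path is taken only when the file is a cooler, no `--region2` is given, and `--chromosomeOrder` does not list more than one chromosome. A small sketch of that logic follows; the function name is hypothetical and does not exist in `hicPlotMatrix.py`.

```python
def use_cooler_fast_path(is_cooler, region2=None, chromosome_order=None):
    """Mirror the decision made before loading a matrix for plotting."""
    # Listing several chromosomes disables the cooler-specific path.
    reorders_several = (chromosome_order is not None
                        and len(chromosome_order) > 1)
    return bool(is_cooler and not region2 and not reorders_several)
```

For example, `use_cooler_fast_path(True, chromosome_order=['chr1'])` still takes the cooler path, while passing `['chr1', 'chr2']` does not.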
2 changes: 1 addition & 1 deletion hicexplorer/hicPlotTADs.py
Expand Up @@ -257,7 +257,7 @@ def parse_arguments(args=None):
required=False)

parserOpt.add_argument('--scoreName', '-s',
help='Score name.',
help='Score name label for the heatmap legend.',
required=False)

parserOpt.add_argument('--outFileName', '-out',
Expand Down
