Merge pull request #511 from deeptools/develop
Develop
joachimwolff committed Mar 9, 2020
2 parents 047f2a1 + 30391a5 commit 66f3a66
Showing 143 changed files with 1,254 additions and 9,550 deletions.
25 changes: 25 additions & 0 deletions docs/content/News.rst
@@ -1,6 +1,31 @@
News and Developments
=====================

Release 3.4.2
-------------
**7 March 2020**

- This release fixes the incorrect naming scheme that was used in the chicModules. Most .bed files are now .txt files.
- hicDetectLoops got a parallelization within each chromosome to decrease the compute time.
- hicPlotMatrix got three new parameters: rotationX, rotationY and fontSize to adjust the position and font size of the labels. We hope this leads to better readability in certain cases.
- hicPlotMatrix: fixed a bug that occurred if the list of chromosomes was given and the last chromosome appeared as an additional label.
- Improving and updating the documentation.


Preprint
--------
**6 March 2020**
The preprint of the loop detection algorithm is online via bioRxiv: `<https://www.biorxiv.org/content/10.1101/2020.03.05.979096v1>`_



Release 3.4.1
-------------
**3 February 2020**

- This release fixes a bug in chicViewpoint that caused a crash if the data to be averaged is less than the window size.

Release 3.4
-----------
**28 January 2020**
44 changes: 22 additions & 22 deletions docs/content/capture-Hi-C.rst
@@ -129,7 +129,7 @@ compute a p-value per relative distance in each sample, which is used to make th

.. code:: bash
chicViewpointBackgroundModel -m FL-E13-5.cool MB-E10-5.cool --fixateRange 500000 -t 20 -rp reference_points.bed -o background_model.bed
chicViewpointBackgroundModel -m FL-E13-5.cool MB-E10-5.cool --fixateRange 500000 -t 20 -rp reference_points.bed -o background_model.txt
The background model looks like this:

@@ -154,11 +154,11 @@ For each viewpoint one viewpoint file is created and stored in the folder given

.. code:: bash
chicViewpoint -m FL-E13-5.cool MB-E10-5.cool --averageContactBin 5 --range 1000000 1000000 -rp referencePoints.bed -bmf background_model.bed --writeFileNamesToFile interactionFiles.txt --outputFolder interactionFilesFolder --fixateRange 500000 --threads 20
chicViewpoint -m FL-E13-5.cool MB-E10-5.cool --averageContactBin 5 --range 1000000 1000000 -rp referencePoints.bed -bmf background_model.txt --writeFileNamesToFile interactionFiles.txt --outputFolder interactionFilesFolder --fixateRange 500000 --threads 20
The name of each viewpoint file starts with its sample name (given by the name of the matrix), the
exact location and the gene / promoter name. For example, the viewpoint `chr1 4487435 4487435 Sox17` from `MB-E10-5.cool` matrix will be called `MB-E10-5_chr1_4487435_4487435_Sox17.bed` and looks like the following:
exact location and the gene / promoter name. For example, the viewpoint `chr1 4487435 4487435 Sox17` from `MB-E10-5.cool` matrix will be called `MB-E10-5_chr1_4487435_4487435_Sox17.txt` and looks like the following:

.. code:: text
@@ -185,12 +185,12 @@ for the differential analysis. Example: matrices `FL-E13-5.cool` and `MB-E10-5.

.. code:: bash
FL-E13-5_chr1_4487435_4487435_Sox17.bed
MB-E10-5_chr1_4487435_4487435_Sox17.bed
FL-E13-5_chr1_14300280_14300280_Eya1.bed
MB-E10-5_chr1_14300280_14300280_Eya1.bed
FL-E13-5_chr1_19093103_19093103_Tfap2d.bed
MB-E10-5_chr1_19093103_19093103_Tfap2d.bed
FL-E13-5_chr1_4487435_4487435_Sox17.txt
MB-E10-5_chr1_4487435_4487435_Sox17.txt
FL-E13-5_chr1_14300280_14300280_Eya1.txt
MB-E10-5_chr1_14300280_14300280_Eya1.txt
FL-E13-5_chr1_19093103_19093103_Tfap2d.txt
MB-E10-5_chr1_19093103_19093103_Tfap2d.txt
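
The naming scheme described above (sample name, exact location, gene or promoter name) can be reproduced with a small helper. This is a minimal Python sketch and not part of HiCExplorer: the function names ``viewpoint_file_name`` and ``viewpoint_file_list`` are hypothetical, and the order (all samples of one viewpoint, then the next viewpoint) only mirrors the listing above.

```python
from pathlib import Path


def viewpoint_file_name(matrix, chrom, start, end, gene):
    """Return the expected viewpoint file name for one matrix and one reference point.

    Hypothetical helper: it only mirrors the naming scheme described in the text.
    """
    sample = Path(matrix).stem  # 'MB-E10-5.cool' -> 'MB-E10-5'
    return f"{sample}_{chrom}_{start}_{end}_{gene}.txt"


def viewpoint_file_list(matrices, reference_points):
    """List the file names viewpoint by viewpoint, one entry per matrix."""
    return [
        viewpoint_file_name(matrix, *point)
        for point in reference_points
        for matrix in matrices
    ]


if __name__ == "__main__":
    points = [
        ("chr1", 4487435, 4487435, "Sox17"),
        ("chr1", 14300280, 14300280, "Eya1"),
    ]
    for name in viewpoint_file_list(["FL-E13-5.cool", "MB-E10-5.cool"], points):
        print(name)
```

Such a list can be useful to check that every expected viewpoint file exists before starting a batch run.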
@@ -201,7 +201,7 @@ Significant interactions detection

To detect significant interactions and to prepare a target file for each viewpoint which will be used for the differential analysis, the script `chicSignificantInteractions` is used. It offers two modes: either the user can specify
an x-fold value or a loose p-value. In the first mode, all interactions with a minimum x-fold over the average background for their relative distribution are considered as candidates; in the second mode, all interactions with a p-value at or below the loose p-value are considered.
These are preselection steps to be able to detect wider peaks in the same way as sharp ones. All detected candidates are merged to one peak if they are direct neighbors, and the sum of all interaction values of this neighborhood
These are pre-selection steps to be able to detect wider peaks in the same way as sharp ones. All detected candidates are merged to one peak if they are direct neighbors, and the sum of all interaction values of this neighborhood
is used to compute a new p-value. The p-value is computed based on the relative distance negative binomial distribution of the interaction with the original highest interaction value. All peaks considered are accepted as significant interactions if
their p-value is less than or equal to the threshold `--pValue`.
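
The merging step described above can be illustrated as follows. This is a simplified Python sketch under stated assumptions, not the implementation used by `chicSignificantInteractions`: candidates are assumed to be given as unique ``(bin_index, interaction_value)`` pairs that already passed the pre-selection, the function name ``merge_neighboring_candidates`` is hypothetical, and the p-value computation itself is left out.

```python
def merge_neighboring_candidates(candidates):
    """Merge candidate bins that are direct neighbors into peaks.

    `candidates` is an iterable of (bin_index, interaction_value) pairs that
    already passed the x-fold or loose p-value pre-selection (assumed input
    format, bin indices are assumed to be unique).
    """
    peaks = []
    for bin_index, value in sorted(candidates):
        if peaks and bin_index == peaks[-1]["end"] + 1:
            # Direct neighbor of the current peak: extend it and add the value.
            peak = peaks[-1]
            peak["end"] = bin_index
            peak["sum"] += value
            # Remember the bin with the highest original interaction value.
            if value > peak["max_value"]:
                peak["max_value"] = value
                peak["max_bin"] = bin_index
        else:
            # Gap to the previous candidate: start a new peak.
            peaks.append({
                "start": bin_index,
                "end": bin_index,
                "sum": value,
                "max_value": value,
                "max_bin": bin_index,
            })
    return peaks


if __name__ == "__main__":
    for peak in merge_neighboring_candidates([(10, 2.0), (11, 5.0), (12, 1.0), (20, 3.0)]):
        print(peak)
```

In this sketch, ``sum`` would be the value tested for significance, and ``max_bin`` marks the bin whose relative distance selects the background distribution.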

@@ -211,15 +211,15 @@ In this example we use the reference point Mstn at location chr1 53118507, a loo

.. code:: bash
chicSignificantInteractions --interactionFile interactionFilesFolder/FL-E13-5_chr1_53118507_53118507_Mstn.bed -bmf background_model.bed --range 1000000 1000000 --pValue 0.01 --loosePValue 0.1
chicSignificantInteractions --interactionFile interactionFilesFolder/FL-E13-5_chr1_53118507_53118507_Mstn.txt -bmf background_model.txt --range 1000000 1000000 --pValue 0.01 --loosePValue 0.1
This creates two files:

.. code:: bash
FL-E13-5_chr1_53118507_53118507_Mstn_target.bed
FL-E13-5_chr1_53118507_53118507_Mstn__significant_interactions.bed
FL-E13-5_chr1_53118507_53118507_Mstn_target.txt
FL-E13-5_chr1_53118507_53118507_Mstn__significant_interactions.txt
These files are stored in the folders given by the parameters `--targetFolder` and `--outputFolder`.

@@ -242,7 +242,7 @@ The target file looks like:
.. code:: bash
# Significant interactions result file of HiCExplorer's chicSignificantInteractions version 3.2-dev
# targetFolder/FL-E13-5_chr1_53118507_53118507_Mstn_target.bed
# targetFolder/FL-E13-5_chr1_53118507_53118507_Mstn_target.txt
# Mode: loose p-value with 0.1
# Used p-value: 0.01
#
@@ -263,7 +263,7 @@ two samples and one target file is supported.

.. code:: bash
chicSignificantInteractions --interactionFile interactionFiles.txt --interactionFileFolder interactionFilesFolder/ -bmf background_model.bed --range 1000000 1000000 --pValue 0.01 --loosePValue 0.1 --batchMode
chicSignificantInteractions --interactionFile interactionFiles.txt --interactionFileFolder interactionFilesFolder/ -bmf background_model.txt --range 1000000 1000000 --pValue 0.01 --loosePValue 0.1 --batchMode
The output is:

@@ -282,7 +282,7 @@ or one target file which applies for all viewpoints.

.. code:: bash
chicAggregateStatistic --interactionFile interactionFilesFolder/FL-E13-5_chr1_53118507_53118507_Mstn.bed interactionFilesFolder/MB-E10-5_chr1_53118507_53118507_Mstn.bed --targetFile targetFolder/FL-E13-5_MB-E10-5_chr1_53118507_53118507_Mstn_target.bed
chicAggregateStatistic --interactionFile interactionFilesFolder/FL-E13-5_chr1_53118507_53118507_Mstn.txt interactionFilesFolder/MB-E10-5_chr1_53118507_53118507_Mstn.txt --targetFile targetFolder/FL-E13-5_MB-E10-5_chr1_53118507_53118507_Mstn_target.txt
It selects the original data based on the target locations and returns one file per sample which is used for the differential test.

@@ -292,7 +292,7 @@ Batch mode
In the batch mode the interaction file is the file containing the viewpoint file names, the folder needs to be defined by `--interactionFileFolder`, the same applies to the target file and folder.
Two viewpoint files are matched with one target file created by `chicSignificantInteractions` or with one target file for all viewpoints. Please note that the output files are written to the folder
defined by `--outputFolder` (default: `aggregatedFiles`). It is recommended to write the file names to the file given by `--writeFileNamesToFile` for further batch processing with `chicDifferentialTest`. All output files
get the suffix defined by `--outFileNameSuffix`, default value is `_aggregate_target.bed`.
get the suffix defined by `--outFileNameSuffix`, default value is `_aggregate_target.txt`.

.. code:: bash
@@ -311,7 +311,7 @@ This can be computed per sample:

.. code:: bash
chicDifferentialTest --interactionFile aggregatedFiles/FL-E13-5_chr1_53118507_53118507_Mstn__aggregate_target.bed aggregatedFiles/MB-E10-5_chr1_53118507_53118507_Mstn__aggregate_target.bed --alpha 0.05 --statisticTest fisher
chicDifferentialTest --interactionFile aggregatedFiles/FL-E13-5_chr1_53118507_53118507_Mstn__aggregate_target.txt aggregatedFiles/MB-E10-5_chr1_53118507_53118507_Mstn__aggregate_target.txt --alpha 0.05 --statisticTest fisher
Or via batch mode:

@@ -400,15 +400,15 @@ One viewpoint:

.. code:: bash
chicPlotViewpoint --interactionFile interactionFilesFolder/FL-E13-5_chr1_53118507_53118507_Mstn.bed --range 200000 200000 -o single_plot.png
chicPlotViewpoint --interactionFile interactionFilesFolder/FL-E13-5_chr1_53118507_53118507_Mstn.txt --range 200000 200000 -o single_plot.png
.. image:: ../images/chic/single_plot.png

Two viewpoints, background, differential expression and p-values:

.. code:: bash
chicPlotViewpoint --interactionFile interactionFilesFolder/FL-E13-5_chr1_53118507_53118507_Mstn.bed interactionFilesFolder/MB-E10-5_chr1_53118507_53118507_Mstn.bed --range 300000 300000 --pValue --differentialTestResult differentialResults/FL-E13-5_MB-E10-5_chr1_53118507_53118507_Mstn__H0_rejected.bed --backgroundModelFile background_model.bed -o differential_background_pvalue.png
chicPlotViewpoint --interactionFile interactionFilesFolder/FL-E13-5_chr1_53118507_53118507_Mstn.txt interactionFilesFolder/MB-E10-5_chr1_53118507_53118507_Mstn.txt --range 300000 300000 --pValue --differentialTestResult differentialResults/FL-E13-5_MB-E10-5_chr1_53118507_53118507_Mstn__H0_rejected.txt --backgroundModelFile background_model.txt -o differential_background_pvalue.png
.. image:: ../images/chic/differential_background_pvalue.png
@@ -417,7 +417,7 @@ Two viewpoints, background, significant interactions and p-values:

.. code:: bash
chicPlotViewpoint --interactionFile interactionFilesFolder/FL-E13-5_chr1_53118507_53118507_Mstn.bed interactionFilesFolder/MB-E10-5_chr1_53118507_53118507_Mstn.bed --range 300000 300000 --pValue --significantInteractions significantFiles/FL-E13-5_chr1_53118507_53118507_Mstn__significant_interactions.bed significantFiles/MB-E10-5_chr1_53118507_53118507_Mstn__significant_interactions.bed --backgroundModelFile background_model.bed -o significant_background_pvalue.png
chicPlotViewpoint --interactionFile interactionFilesFolder/FL-E13-5_chr1_53118507_53118507_Mstn.txt interactionFilesFolder/MB-E10-5_chr1_53118507_53118507_Mstn.txt --range 300000 300000 --pValue --significantInteractions significantFiles/FL-E13-5_chr1_53118507_53118507_Mstn__significant_interactions.txt significantFiles/MB-E10-5_chr1_53118507_53118507_Mstn__significant_interactions.txt --backgroundModelFile background_model.txt -o significant_background_pvalue.png
.. image:: ../images/chic/significant_background_pvalue.png

@@ -432,4 +432,4 @@ For all modes the principle of a file containing the file names and a folder con

.. code:: bash
chicPlotViewpoint --interactionFile interactionFiles.txt --interactionFileFolder interactionFilesFolder/ --range 300000 300000 --pValue --significantInteractions significantFilesBatch.txt --significantInteractionFileFolder significantFiles --backgroundModelFile background_model.bed --outputFolder plots --threads 20 --batchMode
chicPlotViewpoint --interactionFile interactionFiles.txt --interactionFileFolder interactionFilesFolder/ --range 300000 300000 --pValue --significantInteractions significantFilesBatch.txt --significantInteractionFileFolder significantFiles --backgroundModelFile background_model.txt --outputFolder plots --threads 20 --batchMode
4 changes: 2 additions & 2 deletions docs/content/example_usage.rst
@@ -31,7 +31,7 @@ Reads mapping
Mates have to be mapped individually to avoid mapper specific heuristics designed
for standard paired-end libraries.

We have used the HiCExplorer sucessfuly with `bwa`, `bowtie2` and `hisat2`. However, it is important to:
We have used the HiCExplorer successfully with `bwa`, `bowtie2` and `hisat2`. However, it is important to:

* for either `bowtie2` or `hisat2`, use the `--reorder` parameter, which tells bowtie2 or hisat2 to output
the *sam* files in the **exact** same order as in the *.fastq* files.
@@ -167,7 +167,7 @@ plot the counts using the `--log1p` option.
Quality control of Hi-C data and biological replicates comparison
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

HiCExplorer integrates multiple tools that allow the evualuation of the quality of Hi-C libraries and matrices.
HiCExplorer integrates multiple tools that allow the evaluation of the quality of Hi-C libraries and matrices.

- hicQC on the log files produced by hicBuildMatrix and control of the pdf file produced.

4 changes: 2 additions & 2 deletions docs/content/list-of-tools.rst
@@ -14,13 +14,13 @@ HiCExplorer tools
+--------------------------------------+------------------+-----------------------------------+---------------------------------------------+-----------------------------------------------------------------------------------+
|:ref:`hicCorrectMatrix` | preprocessing | hicMatrix object | normalized hicMatrix object | Uses iterative correction or Knight-Ruiz to remove biases from a Hi-C matrix |
+--------------------------------------+------------------+-----------------------------------+---------------------------------------------+-----------------------------------------------------------------------------------+
|:ref:`hicMergeMatrixBins` | preprocessing | hicMatrix object | hicMatrix object | Merges consecutives bins on a Hi-C matrix to reduce resolution |
|:ref:`hicMergeMatrixBins` | preprocessing | hicMatrix object | hicMatrix object | Merges consecutive bins on a Hi-C matrix to reduce resolution |
+--------------------------------------+------------------+-----------------------------------+---------------------------------------------+-----------------------------------------------------------------------------------+
|:ref:`hicSumMatrices` | preprocessing | 2 or more hicMatrix objects | hicMatrix object | Adds Hi-C matrices of the same size |
+--------------------------------------+------------------+-----------------------------------+---------------------------------------------+-----------------------------------------------------------------------------------+
|:ref:`hicNormalize` | preprocessing | multiple Hi-C matrices | multiple Hi-C matrices | Normalize data to 0 to 1 range or to smallest total read count |
+--------------------------------------+------------------+-----------------------------------+---------------------------------------------+-----------------------------------------------------------------------------------+
|:ref:`hicCorrelate` | analysis | 2 or more hicMatrix objects | a heatmap/scatterplot | Computes and visualises the correlation of Hi-C matrices |
|:ref:`hicCorrelate` | analysis | 2 or more hicMatrix objects | a heatmap/scatter plot | Computes and visualizes the correlation of Hi-C matrices |
+--------------------------------------+------------------+-----------------------------------+---------------------------------------------+-----------------------------------------------------------------------------------+
|:ref:`hicFindTADs` | analysis | hicMatrix object | bedGraph file (TAD score), a boundaries.bed | Identifies Topologically Associating Domains (TADs) |
| | | | file, a domains.bed file (TADs) | |
21 changes: 9 additions & 12 deletions docs/content/mES-HiC_analysis.rst
@@ -446,27 +446,24 @@ In following plot we will use the listed track file. Please store it as track.in
depth = 2000000
height = 7
transform = log1p
x labels = yes
type = interaction
file_type = hic_matrix
[tads]
file = TADs/marks_et-al_TADs_20kb-Bins_domains.bed
file_type = domains
border color = black
border_color = black
color = none
height = 5
line width = 1.5
overlay previous = share-y
show data range = no
line_width = 1.5
overlay_previous = share-y
show_data_range = no
[x-axis]
fontsize=16
where=top
fontsize = 16
where = top
[tad score]
file = TADs/marks_et-al_TADs_20kb-Bins_score.bm
title = "TAD separation score"
title = TAD separation score
height = 4
file_type = bedgraph_matrix
@@ -475,8 +472,8 @@ In following plot we will use the listed track file. Please store it as track.in
[gene track]
file = mm10_genes_sorted.bed
height = 10
title = "mm10 genes"
labels = off
title = mm10 genes
labels = false
We used as a gene track `mm10 genes <https://github.com/lucapinello/Haystack/blob/master/gene_annotations/mm10_genes.bed>`__ and
2 changes: 1 addition & 1 deletion docs/content/tools/hicBuildMatrix.rst
@@ -12,7 +12,7 @@ Building multicooler matrices
------------------------------

``hicBuildMatrix`` supports building multicooler matrices which are for example needed for visualization with `HiGlass <https://higlass.io/>`__.
To do so, use as outfile format either .cool or .mcool and define the desired resolutions as `--binSize`.
To do so, use as out file format either .cool or .mcool and define the desired resolutions as `--binSize`.
``hicBuildMatrix`` builds the interaction matrix for the highest resolution and merges the bins for the lower resolutions.
The lower resolutions need to be an integer multiple of the highest resolution.
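
The constraint on the resolutions can be checked before starting a long run. The following Python sketch is only an illustration and not part of ``hicBuildMatrix``; the function name ``check_resolutions`` is hypothetical, and the bin sizes are assumed to be given in base pairs, as for `--binSize`.

```python
def check_resolutions(bin_sizes):
    """Return the bin sizes sorted from highest to lowest resolution.

    The highest resolution is the smallest bin size. A ValueError is raised if
    any other bin size is not an integer multiple of it.
    """
    if not bin_sizes:
        raise ValueError("at least one bin size is required")
    ordered = sorted(bin_sizes)
    base = ordered[0]
    if base <= 0:
        raise ValueError("bin sizes must be positive")
    invalid = [size for size in ordered if size % base != 0]
    if invalid:
        raise ValueError(
            f"bin sizes {invalid} are not integer multiples of {base}"
        )
    return ordered


if __name__ == "__main__":
    print(check_resolutions([20000, 5000, 10000]))
```

For example, 5000, 10000 and 20000 are accepted, while 5000 together with 12000 is rejected.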

2 changes: 1 addition & 1 deletion docs/content/tools/hicConvertFormat.rst
@@ -66,7 +66,7 @@ The cool data format allows to use the following options:
- correction_name: In case correction factors are not stored in 'weight' the correct column name can be defined using this parameter and the resulting matrix will store the values in 'weight'.
- correction_division: Correction factors can be applied by a multiplication or a division. The default behaviour is to use the multiplication, in case the correction factors are inverted, set this parameter.
- store_applied_correction: Set this parameter if correction factors should be applied on the data and should be written back to column 'counts' in the corrected form and not as raw. Default: not set.
- chromosomes: Define a list of chromosomes which should be included in the output matrix. All chromosomes which are not defined are not part of the new matrix. This parameter can speed up the processing especiallly if only one chromosome is used.
- chromosomes: Define a list of chromosomes which should be included in the output matrix. All chromosomes which are not defined are not part of the new matrix. This parameter can speed up the processing especially if only one chromosome is used.
- enforce_integer: Raw interaction data is stored as integers; after the correction is applied the data is a float. Set this parameter to enforce integer values in the new matrix.
- load_raw_values: Set this parameter if the interaction data should not be loaded with the correction factors.
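
How the options ``correction_division`` and ``enforce_integer`` interact can be sketched for a single matrix entry. This is an illustrative Python sketch under assumptions and not the code used by ``hicConvertFormat``: it assumes one correction factor per bin, applied as the product of the row and column factors, and it assumes that enforcing integers means rounding to the nearest integer. The function name ``apply_correction`` is hypothetical.

```python
def apply_correction(count, factor_row, factor_col, division=False, enforce_integer=False):
    """Apply per-bin correction factors to one raw interaction count.

    Assumptions (not confirmed by the documentation): the factors of the row
    bin and the column bin are combined as a product, and integer enforcement
    rounds to the nearest integer.
    """
    scale = factor_row * factor_col
    # Default: multiply by the factors; with inverted factors, divide instead.
    corrected = count / scale if division else count * scale
    if enforce_integer:
        return int(round(corrected))
    return corrected


if __name__ == "__main__":
    print(apply_correction(10, 0.5, 0.5))
    print(apply_correction(10, 0.5, 0.5, division=True))
    print(apply_correction(7, 0.5, 0.5, enforce_integer=True))
```

The sketch shows why the corrected data is a float by default: the raw count is an integer, but the factors usually are not.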
