Documentation and News (#968)
* News and docs

---------

Co-authored-by: katarzyna.otylia.sikora@gmail.com <sikora@maximus.ie-freiburg.mpg.de>
Co-authored-by: WardDeb <ward@deboutte.be>
3 people committed Jan 23, 2024
1 parent d430504 commit 457d7ec
Showing 3 changed files with 10 additions and 2 deletions.
1 change: 1 addition & 0 deletions docs/content/News.rst
@@ -8,6 +8,7 @@ snakePipes x.x.x
* Changed the behaviour of snakePipes createEnvs - it is no longer possible to set condaEnvDir with this function. Instead, it must be set with snakePipes config beforehand. To ignore what's in the defaults.yaml and overwrite the condaEnvDir value with the default system conda prefix, use '--autodetectCondaEnvDir'.
* Snakemake options in the defaults.yaml are now an empty string. The required arguments '--use-conda --conda-prefix' have been directly added to the command string. condaEnvDir is parsed from defaults.yaml, requiring running snakePipes config first.
* Added a 'three-prime-seq' mode to mRNAseq (David Koppstein and Katarzyna Sikora).
* Added DESeq2 run on PAS clusters to the 'three-prime-seq' mode of mRNAseq.
* Added support for multiple comparison groups to ChIPseq and ATAC-seq.
* Added SEACR as an optional peak caller to ChIPseq.
* Fixes #819
2 changes: 2 additions & 0 deletions docs/content/workflows/ATAC-seq.rst
@@ -175,6 +175,8 @@ For more information on the contents of the **CSAW_MACS2_sampleSheet** folder, s

.. note:: The ``_sampleSheet`` suffix for the ``CSAW_MACS2_sampleSheet`` is drawn from the name of the sample sheet you use. So if you instead named the sample sheet ``mySampleSheet.txt`` then the folder would be named ``CSAW_mySampleSheet``. This facilitates using multiple sample sheets. Similarly, the ``_MACS2`` portion will be different if you use HMMRATAC or Genrich for peak calling.

.. note:: If you provide a sampleSheet with name, condition and group columns, "multiple comparison mode" will be detected. The original sampleSheet will be split on the group column, and multiple pairwise comparisons will be run with CSAW, one per group.
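
The splitting step described in the note above can be pictured with a minimal Python sketch. This is an illustration only, not the snakePipes implementation: the helper ``split_by_group``, the tab-separated layout and the sample names are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict


def split_by_group(sheet_text):
    """Split a sample sheet into one list of rows per 'group' value.

    Hypothetical helper: assumes a tab-separated sheet with
    'name', 'condition' and 'group' columns.
    """
    reader = csv.DictReader(io.StringIO(sheet_text), delimiter="\t")
    per_group = defaultdict(list)
    for row in reader:
        # Each sub-sheet keeps only the columns a pairwise comparison needs.
        per_group[row["group"]].append(
            {"name": row["name"], "condition": row["condition"]}
        )
    return dict(per_group)


# Made-up sample sheet with two groups, each holding one pairwise comparison.
example_sheet = "\n".join(
    "\t".join(fields)
    for fields in [
        ("name", "condition", "group"),
        ("ctrl_1", "Control", "g1"),
        ("treatA_1", "Treatment", "g1"),
        ("ctrl_2", "Control", "g2"),
        ("treatB_1", "Treatment", "g2"),
    ]
)

groups = split_by_group(example_sheet)
for group, rows in groups.items():
    print(group, [r["name"] for r in rows])
```

Each resulting sub-sheet corresponds to one pairwise comparison, matching the "one per group" behaviour described in the note.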

.. note:: The output from Genrich will be peaks called per-group if you specify a sample sheet. This is because Genrich is capable of directly using replicates during peak calling.


9 changes: 7 additions & 2 deletions docs/content/workflows/ChIP-seq.rst
@@ -210,14 +210,19 @@ Following up on the DNA-mapping module results (see :doc:`DNA-mapping`), the wor

* **histoneHMM**: This folder contains the output of `histoneHMM <https://github.com/matthiasheinig/histoneHMM>`__. This folder will only exist if you have broad marks.

* **CSAW_sampleSheet**: This folder is created optionally, if you provide a sample sheet for differential binding analysis. (see :ref:`diffBinding`)
* **CSAW_sampleSheet**: This folder is created optionally, if you provide a sample sheet for differential binding analysis (see :ref:`diffBinding`). CSAW will be run using peaks called by the chosen peak caller, and the output folder will be named accordingly.
* **AnnotatedResults_sampleSheet**: This folder is created optionally, if you provide a sample sheet for differential binding analysis. (see :ref:`diffBinding`). Differentially bound regions annotated with distance to nearest gene are stored here.

.. note:: Although in case of broad marks, we also perform the MACS2 `broadpeak` analysis (output available as ``MACS2/<sample>.filtered.BAM_peaks.broadPeak``), we would recommend using the histoneHMM outputs in these cases, since histoneHMM produces better results than MACS2 for broad peaks.

.. note:: For narrow marks, the user may choose the peak caller from MACS2 (default), Genrich or `SEACR <https://github.com/FredHutch/SEACR>`__. By default, SEACR is run in stringent mode, applying normalization to counts over bed files. If invoked together with ``--useSpikeInForNorm``, SEACR will be run in stringent mode, using spikein-normalized counts. FDR can be set by the user (default 0.05).

.. note:: The ``_sampleSheet`` suffix for the ``CSAW_sampleSheet`` is drawn from the name of the sample sheet you use. So if you instead named the sample sheet ``mySampleSheet.txt`` then the folder would be named ``CSAW_mySampleSheet``. This facilitates using multiple sample sheets.
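
The naming rule in the note above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code snakePipes uses: ``csaw_folder_name`` is a hypothetical helper, and the optional ``peak_caller`` argument mirrors the ``CSAW_MACS2_sampleSheet`` pattern shown in the ATAC-seq documentation.

```python
from pathlib import Path


def csaw_folder_name(sample_sheet, peak_caller=None):
    """Derive the CSAW output folder name from a sample sheet path.

    Hypothetical helper: the folder suffix is the sample sheet file name
    without its extension; an optional peak caller name is placed between
    the 'CSAW' prefix and that suffix.
    """
    stem = Path(sample_sheet).stem  # 'mySampleSheet.txt' -> 'mySampleSheet'
    parts = ["CSAW"]
    if peak_caller:
        parts.append(peak_caller)
    parts.append(stem)
    return "_".join(parts)


print(csaw_folder_name("mySampleSheet.txt"))
print(csaw_folder_name("sampleSheet.tsv", peak_caller="MACS2"))
```

Because the suffix comes from the file name, running the workflow with two differently named sample sheets yields two separate output folders.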

.. note:: At the moment Genrich is NOT jointly calling peaks within a group since it's not aware of which samples contain which antibody. It is utilizing the input control if one exists.
.. note:: If you provide a sampleSheet with name, condition and group columns, "multiple comparison mode" will be detected. The original sampleSheet will be split on the group column, and multiple pairwise comparisons will be run with CSAW, one per group.


.. note:: If provided with a sampleSheet, Genrich will be used to jointly call peaks within a condition group. It will utilize the input controls if they exist.


Command line options