Reworked Citations
dstrib committed Sep 30, 2021
1 parent 7549ce9 commit f33b19a
Showing 3 changed files with 72 additions and 37 deletions.
31 changes: 20 additions & 11 deletions README.rst
@@ -13,14 +13,16 @@ hybkit
.. image:: https://img.shields.io/pypi/pyversions/hybkit?logo=python&logoColor=white
:target: https://pypi.org/project/hybkit/
:alt: PyPI - Python Version

.. image:: https://img.shields.io/badge/License-GPLv3+-blue?logo=GNU
:target: https://www.gnu.org/licenses/gpl-3.0.en.html
:alt: GNU GPLv3+ License

| Welcome to *hybkit*, a toolkit for analysis of ".hyb" format chimeric (hybrid) RNA sequence data
generated from ribonomics techniques such as Crosslinking, Ligation, and
Sequencing of Hybrids (CLASH) and Quick CLASH (qCLASH).
defined with the Hyb software package by |Travis2014|.
This genomic data-type is generated from ribonomics techniques such as Crosslinking, Ligation, and
Sequencing of Hybrids (CLASH; |Helwak2013|) and Quick CLASH (qCLASH; |Gay2018|).
| This software is available via GitHub, at http://www.github.com/RenneLab/hybkit .
| Full project documentation is available at
`hybkit's ReadTheDocs <https://hybkit.readthedocs.io/>`_.
| Full project documentation is available at |docs_link|_.
This project contains multiple components:
#. The hybkit toolkit of command-line utilities for manipulating,
@@ -32,23 +34,27 @@ This project contains multiple components:
Hybkit Toolkit:
hybkit includes command-line utilities for the manipulation of ".hyb" format data:

=================================== =========================================================
=================================== ==========================================================
Utility Description
=================================== =========================================================
=================================== ==========================================================
hyb_check Read a ".hyb" file and check for errors
hyb_analyze Analyze and set details for hyb records, such as segtypes
hyb_analyze Analyze and set details for hyb records, such as seg types
hyb_filter Filter a ".hyb" file to a specific subset of sequences
hyb_type_analysis (pending) Perform a type analysis on a prepared "hyb" file
hyb_mirna_count_analysis (pending) Perform a miRNA_count analysis on a prepared "hyb" file
hyb_summary_analysis (pending) Perform a summary analysis on a prepared "hyb" file
hyb_mirna_target_analysis (pending) Perform a mirna_target analysis on a prepared "hyb" file
hyb_fold_analysis (pending) Perform a fold analysis on a prepared "hyb" file
=================================== =========================================================
=================================== ==========================================================

These scripts are used on the command line with hyb-format files. For example, to filter a
hyb file to contain only hybrids with a sequence identifier containing the string "kshv"::
hyb file to contain only hybrids with a sequence identifier containing the string "kshv".

Example:

::

$ hyb_filter -i my_hyb_file.hyb --filter seg_contains kshv
$ hyb_filter -i my_hyb_file.hyb --filter seg_contains kshv
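
The filtering idea in the command above can be sketched in plain Python. This is a minimal illustration and not hybkit's implementation: it assumes the two segment identifiers sit in columns 4 and 10 (as in the example line in the ``HybRecord`` docstring later in this commit) and that the match is a case-sensitive substring test. The function name, the constants, and the placeholder rows are all hypothetical.

```python
# Zero-based positions of the two segment identifiers, following the
# 15-column example line in the HybRecord docstring (assumed layout).
SEG1_NAME_COL = 3
SEG2_NAME_COL = 9


def filter_seg_contains(lines, text):
    """Yield hyb lines where either segment identifier contains `text`."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 15:
            continue  # skip blank or malformed lines in this sketch
        if text in fields[SEG1_NAME_COL] or text in fields[SEG2_NAME_COL]:
            yield line


def make_row(read_id, seg1_name, seg2_name):
    """Build a 15-column placeholder row; only the identifiers matter here."""
    return "\t".join([read_id, "ACGT", ".",
                      seg1_name, "1", "2", "1", "2", ".",
                      seg2_name, "3", "4", "1", "2", "."])


# Invented rows for illustration only; they are not real hyb records.
rows = [
    make_row("read_1", "kshv-example_microRNA", "GENE_A_mRNA"),
    make_row("read_2", "other_microRNA", "GENE_B_mRNA"),
]
kept = list(filter_seg_contains(rows, "kshv"))
print(len(kept), "of", len(rows), "rows kept")
```

For real data, the ``hyb_filter`` utility shown above remains the supported route; the sketch only shows what "a sequence identifier containing the string" means at the column level.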

Further detail on the usage of each script is provided in
the |hybkit Toolkit| section of |docs_link|_.
@@ -144,6 +150,9 @@ Further documentation on hybkit usage can be found in |docs_link|_.
.. |hybkit API| replace:: *hybkit API*
.. |docs_link| replace:: hybkit's ReadTheDocs
.. _docs_link: https://hybkit.readthedocs.io#
.. |Travis2014| replace:: *Travis et al. (Methods 2014)*
.. |Helwak2013| replace:: *Helwak et al. (Cell 2013)*
.. |Gay2018| replace:: *Gay et al. (J. Virol. 2018)*
.. |sample_01_image| image:: sample_01_summary_analysis/example_output/combined_analysis_type_hybrids.png

.. include:: docs_readme_format.rst
38 changes: 30 additions & 8 deletions docs/source/about.rst
@@ -1,14 +1,36 @@

References
==========

#. "Travis, Anthony J., et al. "Hyb: a bioinformatics pipeline for the analysis of CLASH
(crosslinking, ligation and sequencing of hybrids) data."
Methods 65.3 (2014): 263-273."
#. Gay, Lauren A., et al. "Modified cross-linking, ligation, and sequencing of
hybrids (qCLASH) identifies Kaposi's Sarcoma-associated herpesvirus microRNA
targets in endothelial cells." Journal of virology 92.8 (2018): e02138-17.
#. The Vienna File Format: http://unafold.rna.albany.edu/doc/formats.php#VIENNA
.. [ViennaFormat] `ViennaRNA Vienna File Format Description <https://www.tbi.univie.ac.at/RNA/tutorial/#sec2_7>`_
`UNAFold Vienna File Format Description <https://www.tbi.univie.ac.at/RNA/tutorial/#sec2_7>`_
.. [CTFormat] `UNAFold CT Format Description <http://www.unafold.org/doc/formats.php#CT>`_
`RNAStructure CT Format Description <https://rna.urmc.rochester.edu/Text/File_Formats.html#CT>`_
.. [Zuker2003] Zuker M. Mfold web server for nucleic acid folding and hybridization
prediction. Nucleic Acids Res. 2003 Jul 1;31(13):3406-15. doi: 10.1093/nar/gkg595.
PMID: 12824337; PMCID: PMC169194.
.. [Hunter2007] J. Hunter, "Matplotlib: A 2D Graphics Environment" in Computing in
Science & Engineering, vol. 9, no. 03, pp. 90-95, 2007.
doi: 10.1109/MCSE.2007.55
.. [Cock2009] Cock PJ, Antao T, Chang JT, Chapman BA, Cox CJ, Dalke A, Friedberg I,
Hamelryck T, Kauff F, Wilczynski B, de Hoon MJ. Biopython: freely available
Python tools for computational molecular biology and bioinformatics. Bioinformatics.
2009 Jun 1;25(11):1422-3. doi: 10.1093/bioinformatics/btp163. Epub 2009 Mar 20.
PMID: 19304878; PMCID: PMC2682512.
.. [Lorenz2011] Lorenz, R., Bernhart, S.H., Höner zu Siederdissen, C. et al.
ViennaRNA Package 2.0. Algorithms Mol Biol 6, 26 (2011).
doi: 10.1186/1748-7188-6-26
.. [Helwak2013] Helwak A, Kudla G, Dudnakova T, Tollervey D. Mapping the human miRNA
interactome by CLASH reveals frequent noncanonical binding. Cell. 2013
Apr 25;153(3):654-65. doi: 10.1016/j.cell.2013.03.043. PMID: 23622248; PMCID: PMC3650559.
.. [Travis2014] Travis AJ, et al. Hyb: a bioinformatics pipeline for the analysis of
CLASH (crosslinking, ligation and sequencing of hybrids) data.
Methods. 2014 Feb;65(3):263-73. doi: 10.1016/j.ymeth.2013.10.015.
.. [Gay2018] Gay LA, Sethuraman S, Thomas M, Turner PC, Renne R. Modified Cross-Linking,
Ligation, and Sequencing of Hybrids (qCLASH) Identifies Kaposi's
Sarcoma-Associated Herpesvirus MicroRNA Targets in Endothelial Cells.
J Virol. 2018 Mar 28;92(8):e02138-17. doi: 10.1128/JVI.02138-17.
PMID: 29386283; PMCID: PMC5874430.
About
40 changes: 22 additions & 18 deletions hybkit/__init__.py
@@ -5,17 +5,21 @@

"""
This module contains classes and methods for reading, writing, and manipulating data
in the ".hyb" genomic sequence format.
in the ".hyb" genomic sequence format (|Travis2014|).
This is primarily based on two classes for storage of
This is primarily based on three classes for storage of
chimeric sequence information and associated fold-information:
+---------------------+----------------------------------------------------------------------+
| :class:`HybRecord` | Class for storage of hybrid sequence records |
+---------------------+----------------------------------------------------------------------+
| :class:`FoldRecord` | Minimal class for storage of predicted RNA |
| | secondary structure information for chimeric sequence reads |
+---------------------+----------------------------------------------------------------------+
+----------------------------+---------------------------------------------------------------+
| :class:`HybRecord` | Class for storage of hybrid sequence records |
+----------------------------+---------------------------------------------------------------+
| :class:`FoldRecord` | Minimal class for storage of predicted RNA |
| | secondary structure information for chimeric sequence reads |
+----------------------------+---------------------------------------------------------------+
| :class:`DynamicFoldRecord` | Minimal class for storage of predicted RNA |
| | secondary structure information for sequence constructed from |
| | aligned portions of chimeric sequence reads |
+----------------------------+---------------------------------------------------------------+
It also includes classes for reading, writing, and iterating over files containing that
information:
@@ -41,8 +45,6 @@
Todo:
Add HybRecord.to_csv_header()
Create hyb-format database download script.
Add user-friendly individual scripts.
Implement all sample analyses with bash workflows and individual scripts.
Implement all sample analyses with nextflow workflows and individual scripts.
Make decision and clean "extra" scripts.
@@ -80,12 +82,12 @@ class HybRecord(object):
"""
Class for storing and analyzing chimeric (hybrid) RNA-seq reads in ".hyb" format.
Hyb format entries are a GFF-related file format described by Travis, et al.
Hyb format entries are a GFF-related file format described by |Travis2014|
(see :ref:`References`)
that contain information about a genomic sequence read identified to be a chimera by
analysis software. Each line contains 15 or 16 columns separated by tabs ("\\\\t") and provides
annotations on each component. An example .hyb format line
from Gay et al. (See :ref:`References`)::
from |Gay2018|::
2407_718\tATCACATTGCCAGGGATTTCCAATCCCCAACAATGTGAAAACGGCTGTC\t.\tMIMAT0000078_MirBase_miR-23a_microRNA\t1\t21\t1\t21\t0.0027\tENSG00000188229_ENST00000340384_TUBB2C_mRNA\t23\t49\t1181\t1207\t1.2e-06
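
As a rough illustration of the column layout, the example line above can be split into labeled fields with plain Python. The labels below are assumptions made for this sketch (a read identifier, the sequence, an energy column, then two blocks of six columns, one per segment); they are not hybkit attribute names, and ``split_hyb_line`` is a hypothetical helper rather than part of the hybkit API.

```python
HYB_LINE = (
    "2407_718\tATCACATTGCCAGGGATTTCCAATCCCCAACAATGTGAAAACGGCTGTC\t.\t"
    "MIMAT0000078_MirBase_miR-23a_microRNA\t1\t21\t1\t21\t0.0027\t"
    "ENSG00000188229_ENST00000340384_TUBB2C_mRNA\t23\t49\t1181\t1207\t1.2e-06"
)

# Hypothetical labels chosen for this sketch; not hybkit attribute names.
COLUMNS = [
    "id", "seq", "energy",
    "seg1_ref", "seg1_read_start", "seg1_read_end",
    "seg1_ref_start", "seg1_ref_end", "seg1_score",
    "seg2_ref", "seg2_read_start", "seg2_read_end",
    "seg2_ref_start", "seg2_ref_end", "seg2_score",
]


def split_hyb_line(line):
    """Split one tab-separated hyb line into a dict of labeled strings."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) not in (15, 16):
        raise ValueError("expected 15 or 16 columns, got %d" % len(fields))
    record = dict(zip(COLUMNS, fields[:15]))
    # The optional 16th column is kept as a raw string in this sketch.
    record["flags"] = fields[15] if len(fields) == 16 else None
    return record


record = split_hyb_line(HYB_LINE)
print(record["id"], record["seg1_ref"], record["seg2_ref"])
```

All values are left as strings so that placeholder entries such as "." in the energy column pass through unchanged.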
@@ -1448,11 +1450,12 @@ class FoldRecord(object):
Class for storing secondary structure (folding) information for a nucleotide sequence.
This class supports the following file types:
(Data courtesy of Gay et al. [see :ref:`References`])
(Data courtesy of |Gay2018|)
.. _vienna_file_format:
* | The .vienna file format used by the RNAStructure package (see :ref:`References`):
* | The .vienna file format used by the ViennaRNA package (see :ref:`References`;
|ViennaFormat|; |Lorenz2011|):
Example:
::
@@ -1461,7 +1464,8 @@ class FoldRecord(object):
TAGCTTATCAGACTGATGTTAGCTTATCAGACTGATG
.....((((((.((((((......)))))).)))))) (-11.1)
* | The .ct file format utilized by the UNAFold Software Package:
* | The .ct file format used by UNAFold and other packages (see :ref:`References`;
|CTFormat|, |Zuker2003|):
Example:
::
@@ -1548,7 +1552,7 @@ def to_vienna_string(self, newline=False):
suffix = ''
return ('\n'.join(self.to_vienna_lines(newline=False)) + suffix)

# DynamicFoldRecord : Public Methods : HybRecord Comparison
# FoldRecord : Public Methods : HybRecord Comparison
def count_hyb_record_mismatches(self, hyb_record):
"""
Count mismatches between dynamic hyb_record seq and fold_record.seq
@@ -1622,7 +1626,7 @@ def from_vienna_lines(cls,
error_mode='raise',
):
"""
Construct instance from a list of 3 strings of Vienna-format lines.
Construct instance from a list of 3 strings of Vienna-format (|ViennaFormat|) lines.
Args:
record_lines (str or tuple): Iterable of 3 strings corresponding to lines of a
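
To make the three-line Vienna layout concrete, the following is a standalone sketch of how such a record could be parsed. It is not the body of ``from_vienna_lines``: the helper name, the returned dict, and the ``>example_record`` header are hypothetical (the header line of the example record is not visible in this diff), while the sequence and structure lines are taken from the ``FoldRecord`` docstring example above.

```python
import re

# Third line: dot-bracket structure followed by the energy in parentheses.
FOLD_LINE = re.compile(
    r"^(?P<fold>[.()]+)\s+\(\s*(?P<energy>[-+]?\d+(?:\.\d+)?)\s*\)$"
)


def parse_vienna_lines(record_lines):
    """Parse 3 Vienna-format lines into a dict (illustrative helper only)."""
    if len(record_lines) != 3:
        raise ValueError("expected exactly 3 lines")
    name_line, seq, fold_line = (line.strip() for line in record_lines)
    match = FOLD_LINE.match(fold_line)
    if match is None:
        raise ValueError("unrecognized structure line: %r" % fold_line)
    fold = match.group("fold")
    if len(fold) != len(seq):
        raise ValueError("sequence and structure lengths differ")
    # Check that every opening bracket has a matching closing bracket.
    depth = 0
    for char in fold:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
        if depth < 0:
            raise ValueError("unbalanced structure")
    if depth != 0:
        raise ValueError("unbalanced structure")
    return {
        "name": name_line.lstrip(">"),
        "seq": seq,
        "fold": fold,
        "energy": float(match.group("energy")),
    }


example = parse_vienna_lines([
    ">example_record",  # hypothetical header; not shown in the diff
    "TAGCTTATCAGACTGATGTTAGCTTATCAGACTGATG",
    ".....((((((.((((((......)))))).)))))) (-11.1)",
])
print(example["name"], example["energy"])
```

The length and bracket-balance checks are design choices of this sketch; how hybkit itself validates records is governed by its ``error_mode`` argument and is not reproduced here.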
@@ -1704,7 +1708,7 @@ def from_vienna_string(cls, record_string, hybformat_file=False):
def from_ct_lines(cls, record_lines, error_mode=None):
"""
Create a FoldRecord entry from a list of an arbitrary number of strings
corresponding to lines in the ".ct" file format.
corresponding to lines in the ".ct" file format (|CTFormat|).
Args:
error_mode (str, optional): String representing the error mode.