Update docs
nvictus committed Aug 26, 2016
1 parent ace61ed commit 4a8e8b1
Showing 2 changed files with 45 additions and 7 deletions.
15 changes: 13 additions & 2 deletions docs/glossary.rst
@@ -7,10 +7,21 @@ HDF5 is a general purpose binary container format for large scientific datasets.

h5py is a Python library providing low-level bindings to the libhdf5 C-library and a high-level, numpy-aware API to interact with HDF5 files on disk.

Cooler is a Python package providing an API to create Cooler-HDF5 files and to interact with them both as data frames and sparse matrices.
Cooler is a Python package providing an API to create COOL files and to interact with them both as data frames and sparse matrices.

The Cooler Schema is a flexible binary schema for Hi-C data based on a two-table sparse data model.

COOL is also the name of an implementation of the Cooler schema using HDF5 as the container format.

Nomenclature subject to change... :)
Chrom table

Bin table

Pixel table

Contact list

Contact matrix
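
The glossary terms above can be tied together with a small sketch of the two-table sparse data model. This is an illustration in plain Python, not the COOL on-disk layout or the ``cooler`` API: the names ``bins``, ``pixels``, ``to_dense`` and ``annotate`` are hypothetical, and storing only the upper triangle of a symmetric matrix is an assumption made for the example.

```python
# Illustrative stand-ins for the bin and pixel tables (not the cooler API).

# Bin table: one row per genomic bin (chrom, start, end).
# The position of a row in the list serves as its bin ID.
bins = [
    ("chr1", 0, 10000),
    ("chr1", 10000, 20000),
    ("chr2", 0, 10000),
]

# Pixel table: one row per nonzero matrix entry (bin1_id, bin2_id, count).
# Assumption: only the upper triangle (bin1_id <= bin2_id) is stored.
pixels = [
    (0, 0, 5),
    (0, 1, 2),
    (1, 2, 1),
]


def to_dense(bins, pixels):
    """Expand the sparse pixel rows into a dense symmetric contact matrix."""
    n = len(bins)
    mat = [[0] * n for _ in range(n)]
    for i, j, count in pixels:
        mat[i][j] = count
        mat[j][i] = count  # mirror the entry across the diagonal
    return mat


def annotate(bins, pixels):
    """Join each pixel to its two bins: (chrom1, start1, end1, chrom2, start2, end2, count)."""
    return [bins[i] + bins[j] + (count,) for i, j, count in pixels]


dense = to_dense(bins, pixels)
rows = annotate(bins, pixels)
```

The annotated rows have the same general shape as the tabular CLI output shown in the quickstart (two bin intervals followed by a count), while the dense form corresponds to the contact matrix.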


Some nomenclature subject to change... :)
37 changes: 32 additions & 5 deletions docs/quickstart.rst
@@ -7,22 +7,36 @@ Installation

Requirements:

- Python 2.7/3.3+
- libhdf5 and Python packages ``numpy``, ``scipy``, ``pandas``, ``h5py``. These packages have heavy binary dependencies, so if you don't have them installed already, we recommend you use the `conda <http://conda.pydata.org/miniconda.html>`_ package manager to manage them instead of pip.
- All other Python package dependencies are easily handled by pip.
- Python 2.7 or 3.3+
- libhdf5
- Python packages ``numpy``, ``scipy``, ``pandas``, ``h5py``.

We highly recommend using the conda package manager to install scientific packages like these. To get ``conda``, you can download either the full `Anaconda <https://www.continuum.io/downloads>`_ Python distribution which comes with lots of data science software or the minimal `Miniconda <http://conda.pydata.org/miniconda.html>`_ distribution which is just the standalone package manager plus Python. In the latter case, you can install the packages as follows:

Install from PyPI using pip.
::

$ conda install numpy scipy pandas h5py


Install Cooler from PyPI using pip.

::

$ pip install cooler

All other Python package dependencies are automatically handled by pip.


Additionally, the following tools are required for building ``cool`` files from contact lists:

- Parallel gzip ``pigz``. Install using your system package manager.
- Tabix/bgzf. These come with `Samtools <http://www.htslib.org/download/>`_ but are also available on system package managers like ``brew`` (Mac OS) and ``apt`` (Ubuntu). Alternatively, if you are using ``conda``, consider adding the `bioconda <https://bioconda.github.io/>`_ channel to get access to many more bioinformatics packages.
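
Before building ``cool`` files, it may help to confirm that the external tools listed above are on your ``PATH``. The following is a minimal sketch using only the Python standard library. The helper ``find_missing_tools`` is hypothetical, and the executable names ``pigz``, ``tabix`` and ``bgzip`` are assumptions about how these tools are installed on your system.

```python
import shutil


def find_missing_tools(tools):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [name for name in tools if shutil.which(name) is None]


# Assumed executable names; adjust them to match your installation.
REQUIRED_TOOLS = ("pigz", "tabix", "bgzip")

missing = find_missing_tools(REQUIRED_TOOLS)
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("All required tools were found on the PATH.")
```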


Command line interface
----------------------

The ``cooler`` library includes utilities for creating and querying `cool` files and for performing out-of-core contact matrix balancing on a cooler file of any resolution. See the :ref:`cli-reference` for more information.
The ``cooler`` library includes utilities for creating and querying ``cool`` files and for performing out-of-core contact matrix balancing on a cooler file of any resolution.

::

@@ -47,6 +61,12 @@ Output:
chr3 10000000 10010000 chr17 1800000 1810000 1 0.745982


See also:

- `CLI Reference <http://cooler.readthedocs.io/en/latest/cli.html>`_ for more information.
- Jupyter Notebook `walkthrough <https://github.com/mirnylab/cooler-binder>`_.


Python API
----------

@@ -78,3 +98,10 @@ The ``cooler`` library provides a thin wrapper over the excellent NumPy-aware `h
>>> resolution = c.info['bin-size']
>>> mat = c.matrix(balance=True).fetch('chr5:10,000,000-15,000,000')
>>> plt.matshow(np.log10(mat.toarray()), cmap='YlOrRd')
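
The ``fetch`` call above takes a genomic region string such as ``'chr5:10,000,000-15,000,000'``. As a rough sketch of how a string of that shape can be broken into a chromosome name and integer coordinates, the hypothetical helper ``parse_region`` below strips the thousands separators and validates the range. It is not ``cooler``'s own parser, and the exact syntax the library accepts may differ.

```python
import re

# Hypothetical pattern: "chrom" alone, or "chrom:start-end" with optional commas.
_REGION = re.compile(r"^(?P<chrom>[^:]+)(?::(?P<start>[\d,]+)-(?P<end>[\d,]+))?$")


def parse_region(region):
    """Split a region string into (chrom, start, end).

    start and end are None when only a chromosome name is given.
    """
    match = _REGION.match(region)
    if match is None:
        raise ValueError("Malformed region string: %r" % region)
    chrom = match.group("chrom")
    if match.group("start") is None:
        return chrom, None, None
    # Remove thousands separators before converting to integers.
    start = int(match.group("start").replace(",", ""))
    end = int(match.group("end").replace(",", ""))
    if start > end:
        raise ValueError("Region start exceeds end: %r" % region)
    return chrom, start, end
```

For example, ``parse_region('chr5:10,000,000-15,000,000')`` yields ``('chr5', 10000000, 15000000)``, and a bare ``'chr5'`` yields ``('chr5', None, None)``.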


See also:

- `API Reference <http://cooler.readthedocs.io/en/latest/api.html>`_ for more information.
- Jupyter Notebook `walkthrough <https://github.com/mirnylab/cooler-binder>`_.
