Merge branch 'master' of https://github.com/dereneaton/ipyrad
isaacovercast committed Nov 15, 2015
2 parents 0548616 + 759daa9 commit 085aa04
Showing 5 changed files with 141 additions and 61 deletions.
36 changes: 36 additions & 0 deletions docs/Ethos.rst
@@ -0,0 +1,36 @@

.. _Ethos:


How is it different from *pyrad*?
---------------------------------
ipyrad_ is a complete re-write of pyrad_ with an expanded focus on speed and flexibility.
While we continue in the minimalist ethos of pyrad_ which emphasized a simple
installation procedure and ease-of-use, ipyrad_ differs in offering an additional
interactive interface through which to access data and results with simple Python scripts.
We continue to support a command line interface (CLI_) that will be familiar
to legacy pyrad_ users, but the real power of ipyrad_ comes from its
implementation as a Python module which allows users to design complex
assemblies that construct multiple data sets under multiple sets
of parameter settings; to directly access assembly statistics; to plot assembly results;
and to perform interactive downstream analyses.


Features
--------
Major new features and improvements include:

- New assembly methods: *de novo*, reference alignment,
  or hybrid (*de novo* & reference).
- Parallel implementation using ipyparallel_ which utilizes MPI
  allowing greater use of HPC clusters.
- Better checkpointing. If your job is ever interrupted you should
  be able to simply restart the
  script and continue from where it left off.
- Faster code (speed comparisons forthcoming with publication).
- Write highly reproducible documented code with Jupyter Notebooks (see Notebook_workflow_).
- No external installations: vsearch, muscle and all other
  dependencies are installed with ipyrad_ using conda (see Installation_).


.. include:: global.rst
40 changes: 40 additions & 0 deletions docs/Features.rst
@@ -0,0 +1,40 @@

.. _Features:


What does *ipyrad* do?
----------------------
ipyrad_ can be used to assemble RADseq data sets using `*de novo* assembly`_,
`reference mapping assembly`_, or `*hybrid assembly*`_ -- a combination
of the two approaches. Assembled data sets can be output in a large variety of
`formats`_, facilitating downstream genomic analyses for both population
genetic and phylogenetic studies. It also includes methods for visualizing
data and results, inferring population genetic statistics, and inferring genomic introgression.


How is it different from *pyrad*?
---------------------------------

*ipyrad* is a complete re-write of pyrad_ with an expanded focus on speed and flexibility.
While we continue in the minimalist ethos of pyrad_ which emphasized a simple
installation procedure and ease-of-use, ipyrad_ offers an additional interactive
interface with which to access data and results through simple Python scripts.
We continue to support a command line interface (CLI_) that will be familiar
to legacy pyrad_ users, but the real power of ipyrad_ comes from its
implementation as a Python module which allows users to design complex
assemblies that construct multiple data sets under multiple sets
of parameter settings; to directly access assembly statistics; to plot assembly results;
and to perform interactive downstream analyses.
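The idea of building several data sets under several parameter settings can be sketched in plain Python. This is only an illustration of enumerating parameter combinations, not the ipyrad API: the parameter names, values, and both helper functions below are hypothetical.

```python
from itertools import product

# Hypothetical parameter names and values, for illustration only.
param_grid = {
    "clust_threshold": [0.85, 0.90],
    "min_samples_locus": [4, 10],
}


def expand_grid(grid):
    """Return one settings dict per combination of parameter values."""
    names = sorted(grid)
    combos = product(*(grid[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


def assembly_name(prefix, settings):
    """Build a readable, unique name for one parameter combination."""
    parts = [f"{key}-{value}" for key, value in sorted(settings.items())]
    return "_".join([prefix] + parts)


# Each combination would correspond to one assembled data set.
for settings in expand_grid(param_grid):
    print(assembly_name("test", settings), settings)
```

Naming each combination from its settings keeps the resulting data sets distinguishable when their statistics are compared later.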


Major new features and improvements include:

- New assembly methods: *de novo*, reference alignment, or hybrid (*de novo* & reference).
- Parallel implementation using ipyparallel_ which utilizes MPI allowing greater use of HPC clusters.
- Better checkpointing. If your job is ever interrupted you should be able to simply restart the
  script and continue from where it left off.
- Faster code (speed comparisons forthcoming with publication).
- Write highly reproducible documented code with Jupyter Notebooks (see Notebook_workflow_).
- No external installations: vsearch, muscle and all other dependencies are installed with ipyrad_
  using conda (see Installation_).
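The checkpointing behaviour described in the list above can be illustrated with a generic sketch. This is not how ipyrad stores its state; it only shows the general pattern of recording finished steps so that a restarted script skips them. The file name ``checkpoint.json`` and the functions ``load_done`` and ``run_steps`` are assumptions made for this example.

```python
import json
import os
import tempfile


def load_done(path):
    """Return the set of step names recorded in the checkpoint file."""
    if not os.path.exists(path):
        return set()
    with open(path) as handle:
        return set(json.load(handle))


def run_steps(steps, path):
    """Run each (name, function) pair once, skipping recorded steps."""
    done = load_done(path)
    ran = []
    for name, func in steps:
        if name in done:
            continue
        func()
        done.add(name)
        # Record progress after every step so an interruption loses little.
        with open(path, "w") as handle:
            json.dump(sorted(done), handle)
        ran.append(name)
    return ran


checkpoint = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
steps = [("step1", lambda: None), ("step2", lambda: None)]
first = run_steps(steps, checkpoint)   # nothing recorded yet, runs both steps
second = run_steps(steps, checkpoint)  # both steps recorded, runs nothing
print(first, second)
```

Writing the checkpoint after each step, rather than once at the end, is what allows a restart to continue from the last completed step.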

76 changes: 46 additions & 30 deletions docs/Installation.rst
@@ -1,52 +1,68 @@

.. _installation:

Installation
============

.. toctree::
:maxdepth: 2
Installation with conda
-----------------------
The easiest way to install *ipyrad* and all of its dependencies is
to use conda_, which is a command line program for installing Python
packages. If you do not have *conda* installed, follow these
instructions_ to install either *Anaconda* or *Miniconda*
for Python2.7. If you're working on an HPC system you can install
*conda* in your home directory even without administrative privileges.

macports-installation.rst
Either installation will create a command line program called *conda*
which can be used to install Python packages. The main difference between the
two is that *Anaconda* will also install a large suite of commonly used Python
packages along with it, whereas *Miniconda* is a bare bones version that
includes only the framework to install new packages. Unless you're really hard
up for disk space I recommend installing *Anaconda*.

Using Pip / Easy Install
------------------------
To install *ipyrad* using *conda* simply type the following into a terminal ::

The easiest way to install *ipyrad* and all of its dependencies is
to use *conda*, which is the current standard for installing Python
packages. Follow the very simple instructions to install *Anaconda*
or *Miniconda* for Python2.7 here_. Once installed you can use *conda*
to install additional Python packages including *ipyrad* with commands
like below.
    $ conda upgrade ## updates conda
    $ conda install ipyrad ## installs the latest release

.. _here: http://conda.pydata.org/docs/install/quick.html
If you wish to install a specific version of ipyrad, or to upgrade to the
latest release from an older version, you could use one of the following commands::

$ conda update
$ conda install ipyrad
    $ conda install ipyrad=0.7.0 ## install ipyrad v.0.7.0
    $ conda update ipyrad ## update to the latest
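After installing or updating, it can be useful to confirm which version ended up in the active environment. The sketch below uses the standard library's ``importlib.metadata``, which assumes Python 3.8 or later (the text above targets Python 2.7, so treat this as an illustration for newer interpreters). It also assumes the package registers standard metadata; the helper name ``installed_version`` is made up for this example.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if ipyrad is installed here, otherwise None.
print(installed_version("ipyrad"))
```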


In contrast to its predecessor (*pyrad*), *ipyrad* makes use of many more
Python resources which are easily bundled with the installation when *conda*
is used. These include the following:
Dependencies
------------
All required dependencies for *ipyrad* should be installed along with
it when using *conda*. This will include the following Python packages,
some of which install additional dependencies of their own.

- Numpy -- Scientific processing
- Scipy -- Scientific processing
- Pandas -- Used for manipulating data frames
- Sphinx -- Used for building documentation
- IPython -- Interactivity
- IPython2 -- Interactive version of Python 2.7
- ipyparallel -- Parallel, threading, MPI support
- jupyter -- Creating reproducible notebooks
- H5py -- Database structure for large data sets
- Cython -- C bindings for Python
- H5py -- Database and HDF5 headers
- Dill -- Store pickle objects of complex Classes
- Toyplot -- [optional]...

Installing on HPC
-----------------
One of the greatest strengths of using *conda* for installation is that it
creates a Python package directory in your home directory called ``~/anaconda/``
where new packages are installed, and because they are not stored in a
system-wide directory you do not need administrator privileges to install
new packages.
- Toyplot -- [optional].
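A quick way to check that the packages listed above are available is to ask Python whether each module can be located. The import names below are assumptions (they can differ from the package names shown in the list), and ``missing_modules`` is a hypothetical helper written for this sketch.

```python
from importlib.util import find_spec

# Assumed import names for some of the packages listed above.
DEPENDENCIES = [
    "numpy", "scipy", "pandas", "IPython",
    "ipyparallel", "h5py", "dill", "toyplot",
]


def missing_modules(names):
    """Return the module names that cannot be found by the import system."""
    return [name for name in names if find_spec(name) is None]


# An empty list means every module in DEPENDENCIES was found.
print(missing_modules(DEPENDENCIES))
```

Because ``find_spec`` only locates a module without importing it, the check is fast and does not execute package code.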


HPC installation
----------------
One of the benefits of using *conda* for installation is that it
creates a Python package directory in your home directory called
``~/anaconda/`` (or ``~/miniconda/``) where new packages are installed.
Make sure you follow the installation instructions_ so that Python
scripts will look in this directory by default.
Because these Python packages are not stored in a system-wide
directory you will not need administrator privileges to install
new packages, nor will you have to load these modules from the
system before using them.
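To confirm that Python is running from the user-level installation rather than a system-wide one, you can check whether the interpreter's prefix lies inside your home directory. This is a minimal sketch; ``is_inside`` is a hypothetical helper, and the exact directory name depends on how *conda* was installed.

```python
import sys
from pathlib import Path


def is_inside(path, parent):
    """Return True if path is parent itself or lies somewhere below it."""
    try:
        Path(path).resolve().relative_to(Path(parent).resolve())
        return True
    except ValueError:
        return False


# sys.prefix is the root directory of the running Python installation.
print(sys.prefix, is_inside(sys.prefix, Path.home()))
```

If the second value printed is ``False``, the shell is probably still picking up a system Python ahead of the *conda* one.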

TODO: is MPI an exception to this?


.. include:: global.rst

2 changes: 1 addition & 1 deletion docs/global.rst
@@ -4,4 +4,4 @@
.. _muscle: http://www.drive5.com/muscle/
.. _anaconda: http://docs.continuum.io/anaconda/install.html
.. _miniconda: http://repo.continuum.io/miniconda/

.. _`simple_instructions`: http://conda.pydata.org/docs/install/quick.html
48 changes: 18 additions & 30 deletions docs/index.rst
@@ -6,32 +6,23 @@
.. include:: global.rst


ipyrad: assembly and analysis of RADseq data sets
*ipyrad*: interactive assembly and analysis of RADseq data sets
---------------------------------------------------------------
Welcome to *ipyrad*, an interactive toolkit for assembly and analysis of
restriction-site associated genomic data sets (e.g., RAD, ddRAD, GBS) for
population genetic or phylogenetic studies, with the following goals:

Welcome! ipyrad_ is an interactive toolkit for assembly and analysis of genomic RADseq data sets.
Our goal is to support all restriction-site associated data types (e.g., RAD, ddRAD, GBS;
see Data_types_), and to offer simple but powerful methods for assembling data into
output files for downstream genomic analyses.
- Provide an easy-to-use and intuitive workflow to convert raw data to formatted output files.
- Offer a range of fast and parallelized assembly methods.
- Create a `reproducible framework`_ for designing complex assembly procedures.
- Allow visualization and checks on the quality of data assemblies.
- Enable interactive_ access to assembled data and statistics.

Read more about the broader goals behind *ipyrad* here_.

How is it different from pyrad?
-------------------------------

ipyrad_ is a complete re-write of pyrad_ built with a very different philosophy in mind.
While it retains the easy-to-use command line interface (CLI_) that will be familiar to pyrad_ users,
the real power of ipyrad_ comes from its implementation through a Python API, which allows users to
write scripts that detail complex assemblies able to construct multiple data sets under multiple
parameter settings. Other improvements include:

- 3 modes of assembly: *de novo*, reference alignment, or hybrid (*de novo* & reference).
- Parallel implementation using ipyparallel_ which utilizes MPI allowing greater use of HPC clusters.
- Better checkpointing. If your job is ever interrupted you should be able to simply restart the
script and continue from where it left off.
- Faster code (speed comparisons forthcoming with publication).
- Write highly reproducible documented code with Jupyter Notebooks (see Notebook_workflow_).
- No external installations: vsearch, muscle and all other dependencies are installed with ipyrad_
using conda (see Installation_).
.. _here: Ethos.rst
.. _`reproducible framework`: Notebooks.rst
.. _interactive: interactive.rst
Documentation
@@ -41,17 +32,14 @@ Documentation
:maxdepth: 2

Ethos.rst
Installation
Command_line_interface.rst
ipyrad_scripts.rst
Notebook_workflow.rst
Tutorials
test_rad
Data_types.rst
Features.rst
Installation.rst
Quick-guide.rst
Assembly.rst
Tutorials.rst
Citing.rst
License.rst
Contributions.rst
Dependencies.rst


Indices and tables