Merge branch '1.0.x' into 'master'
remram44 committed Jun 19, 2018
2 parents a33ddc6 + ec6c9c4 commit 5e70c6e
Showing 22 changed files with 526 additions and 450 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -4,7 +4,7 @@
[![Matrix](https://img.shields.io/badge/chat-matrix.org-blue.svg)](https://riot.im/app/#/room/#reprozip:matrix.org)
[![Say Thanks!](https://img.shields.io/badge/Say%20Thanks-!-1EAEDB.svg)](https://saythanks.io/to/remram44)
[![status](https://img.shields.io/badge/JOSS-10.21105%2Fjoss.00107-green.svg)](http://joss.theoj.org/papers/b578b171263c73f64dfb9d040ca80fe0)
[![DOI](https://img.shields.io/badge/DOI-10.5281%2Fzenodo.1210301-green.svg)](https://doi.org/10.5281/zenodo.1210301)
[![DOI](https://img.shields.io/badge/DOI/10.5281%2Fzenodo.1247557-green.svg)](https://doi.org/10.5281/zenodo.1247557)

ReproZip
========
@@ -32,7 +32,7 @@ To run it with reprozip, you just need to use the prefix *reprozip trace*:

$ reprozip trace ./myexperiment -my --options inputs/somefile.csv other_file_here.bin

This command creates a *.reprozip* directory, in which you'll find the configuration file, named *config.yml*. You can edit the command line and environment variables, and choose which files to pack.
This command creates a *.reprozip-trace* directory, in which you'll find the configuration file, named *config.yml*. You can edit the command line and environment variables, and choose which files to pack.

If you are using Debian or Ubuntu, most of these files (library dependencies) are organized by package. You can add or remove files, or choose not to include a package by changing option *packfiles* from true to false. In this way, smaller packs can be created with reprozip (if space is an issue), and reprounzip can download these files from the package manager; however, note this is only available for Debian and Ubuntu for now, and also be aware that package versions might differ. Choosing which files to pack is also important to remove sensitive information and third-party software that is not open source and should not be distributed.
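To illustrate the *packfiles* option described above, here is a minimal sketch using plain Python data structures instead of the real *config.yml*. Only the option name *packfiles* comes from the text; the package names, the `files` key, the paths, and both helper functions are assumptions made up for this example.

```python
# Hypothetical, simplified stand-in for the package list in config.yml.
# Only the "packfiles" option is taken from the README; everything else
# (package names, "files" key, paths) is illustrative.
packages = [
    {"name": "libc6", "packfiles": True, "files": ["/lib/libc.so.6"]},
    {"name": "python2.7", "packfiles": True, "files": ["/usr/bin/python2.7"]},
]


def exclude_package(packages, name):
    """Return a copy of the list with packfiles set to False for one package."""
    return [
        dict(pkg, packfiles=False) if pkg["name"] == name else dict(pkg)
        for pkg in packages
    ]


def files_to_pack(packages):
    """List the files of every package that is still marked for packing."""
    return [path for pkg in packages if pkg["packfiles"] for path in pkg["files"]]


edited = exclude_package(packages, "python2.7")
print(files_to_pack(edited))
```

Excluding a package this way corresponds to the smaller pack the README mentions: its files are left out and would have to come from the package manager at unpack time.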

4 changes: 2 additions & 2 deletions docs/developerguide.rst
@@ -17,7 +17,7 @@ Writing Unpackers

ReproZip is divided into two steps. The first is packing, which gives a generic package containing the trace SQLite database, the YAML configuration file (which lists the paths, packages, and metadata such as command line, environment variables, and input/output files), and actual files. In the second step, a package can be run using *reprounzip*. This decoupling allows the reproducer to select the unpacker of his/her desire, and also means that when a new unpacker is released, users will be able to use it on their old packages.

Currently, different unpackers are maintained: the defaults ones (``directory`` and ``chroot``), ``vagrant`` (distributed as `reprounzip-vagrant <https://pypi.python.org/pypi/reprounzip-vagrant>`__) and ``docker`` (distributed as `reprounzip-docker <https://pypi.python.org/pypi/reprounzip-docker>`__). However, the interface is such that new unpackers can be easily added. While taking a look at the "official" unpackers' source is probably a good idea, this page gives some useful information about how they work.
Currently, different unpackers are maintained: the defaults ones (``directory`` and ``chroot``), ``vagrant`` (distributed as `reprounzip-vagrant <https://pypi.org/project/reprounzip-vagrant>`__) and ``docker`` (distributed as `reprounzip-docker <https://pypi.org/project/reprounzip-docker>`__). However, the interface is such that new unpackers can be easily added. While taking a look at the "official" unpackers' source is probably a good idea, this page gives some useful information about how they work.

ReproZip Pack Format (``.rpz``)
'''''''''''''''''''''''''''''''
@@ -33,7 +33,7 @@ The ``METADATA/trace.sqlite3`` file is the original trace generated by the C tra
Structure
'''''''''

An unpacker is a Python module. It can be distributed separately or be a part of a bigger distribution, given that it is declared in that distribution's ``setup.py`` as an `entry_point` to be registered with `pkg_resources` (see `setuptools' dynamic discovery of services and plugins <https://pythonhosted.org/setuptools/setuptools.html#dynamic-discovery-of-services-and-plugins>`__ section). You should declare a function as `entry_point` ``reprounzip.unpackers``. The name of the entry_point (before ``=``) will be the *reprounzip* subcommand, and the value is a callable that will get called with the :class:`argparse.ArgumentParser` object for that subcommand.
An unpacker is a Python module. It can be distributed separately or be a part of a bigger distribution, given that it is declared in that distribution's ``setup.py`` as an `entry_point` to be registered with `pkg_resources` (see `setuptools' dynamic discovery of services and plugins <https://setuptools.readthedocs.io/en/latest/setuptools.html#dynamic-discovery-of-services-and-plugins>`__ section). You should declare a function as `entry_point` ``reprounzip.unpackers``. The name of the entry_point (before ``=``) will be the *reprounzip* subcommand, and the value is a callable that will get called with the :class:`argparse.ArgumentParser` object for that subcommand.

The package :mod:`reprounzip.unpackers` is a namespace package, so you should be able to add your own unpackers there if you want to. Please remember to put the correct code in the ``__init__.py`` file (which you can copy from `here <https://github.com/ViDA-NYU/reprozip/blob/master/reprounzip/reprounzip/unpackers/__init__.py>`__) so that namespace packages work correctly.
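The entry-point mechanism described in this hunk can be sketched as follows. This is a minimal illustration, not code from the repository: the unpacker name `myunpacker`, the functions `setup_myunpacker` and `run`, and the module path in `ENTRY_POINTS` are all hypothetical. The last few lines only imitate how a callable might receive the subcommand's parser; the real loader relies on `pkg_resources` and is not shown here.

```python
import argparse


def run(args):
    """Handler for the subcommand; a real unpacker would set up and run the pack."""
    return "would unpack %s" % args.pack


def setup_myunpacker(parser):
    """Callable named by the entry point; receives the subcommand's ArgumentParser."""
    parser.add_argument('pack', help="path to the .rpz file")
    parser.set_defaults(func=run)


# What a distribution's setup.py might pass as entry_points=... (hypothetical
# module path). The name before "=" becomes the reprounzip subcommand.
ENTRY_POINTS = {
    'reprounzip.unpackers': [
        'myunpacker = reprounzip.unpackers.myunpacker:setup_myunpacker',
    ],
}

# Rough imitation of the caller: create a subcommand parser and hand it over.
top = argparse.ArgumentParser(prog='reprounzip')
subparsers = top.add_subparsers(dest='command')
setup_myunpacker(subparsers.add_parser('myunpacker'))

args = top.parse_args(['myunpacker', 'experiment.rpz'])
print(args.func(args))
```

Keeping the entry-point callable limited to parser setup, with the actual work behind `set_defaults(func=...)`, is one possible layout; the official unpackers' source remains the reference for how this is really done.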

2 changes: 1 addition & 1 deletion docs/faq.rst
@@ -37,7 +37,7 @@ Can ReproZip pack graphical tools?

Yes!
On Linux, graphical display is handled by the X server. Applications can connect to it as clients to display their windows and components, and to get user input.
Most unpackers now support forwarding the X connection from the experiment to the X server running on the unpacking machine. You will need a running X server to make this work, such as `Xming <http://sourceforge.net/projects/xming/>`__ for Windows or `XQuartz <http://xquartz.macosforge.org/>`__ for Mac OS X. If you are running Linux, chances are that an X server is already configured and running.
Most unpackers now support forwarding the X connection from the experiment to the X server running on the unpacking machine. You will need a running X server to make this work, such as `Xming <https://sourceforge.net/projects/xming/>`__ for Windows or `XQuartz <https://www.xquartz.org/>`__ for Mac OS X. If you are running Linux, chances are that an X server is already configured and running.

X support is **not** enabled by default; to enable it, use the flag ``--enable-x11`` in the ``run`` command of your preferred unpacker.

Binary file removed docs/figures/reprounzip-qt-defaultApp-1.png
Binary file not shown.
Binary file removed docs/figures/reprounzip-qt-defaultApp.png
Binary file not shown.
4 changes: 2 additions & 2 deletions docs/graph.rst
@@ -3,7 +3,7 @@
Visualizing the Provenance Graph
********************************

.. note:: If you are using a Python version older than 2.7.3, this feature will not be available due to `Python bug 13676 <http://bugs.python.org/issue13676>`__ related to sqlite3.
.. note:: If you are using a Python version older than 2.7.3, this feature will not be available due to `Python bug 13676 <https://bugs.python.org/issue13676>`__ related to sqlite3.

To generate a *provenance graph* related to the experiment execution, the ``reprounzip graph`` command should be used::

@@ -15,7 +15,7 @@ Alternatively, you can generate the graph after running ``reprozip trace`` witho

$ reprounzip graph [-d tracedirectory] graphfile.dot

The graph is outputted in the `DOT <http://en.wikipedia.org/wiki/DOT_(graph_description_language)>`__ language. You can use `Graphviz <http://www.graphviz.org/>`__ to load and visualize the graph::
The graph is outputted in the `DOT <https://en.wikipedia.org/wiki/DOT_(graph_description_language)>`__ language. You can use `Graphviz <http://www.graphviz.org/>`__ to load and visualize the graph::

$ dot -Tpng graphfile.dot -o graph.png

8 changes: 0 additions & 8 deletions docs/gui.rst
@@ -18,14 +18,6 @@ If you are using Anaconda2, you can install *reprounzip-qt* from anaconda.org::

Otherwise, you will need to `install PyQt4 <https://www.riverbankcomputing.com/software/pyqt/download>`__ before you can install *reprounzip-qt* from pip (on Debian or Ubuntu, you can use ``apt-get install python-qt4``).

On Linux, setting it as the default to open ``.rpz`` files is a bit more involved. Once the application is setup, run `this script <https://gist.github.com/remram44/0092c0b27269cfd0e5530428612d9309>`__ by opening up a terminal window and entering ``bash register-linux.sh``. From there, you can back-click on a ``.rpz`` file, and set the default application to ReproUnzip. The following is an example of this process on Ubuntu 16.04:

.. image:: figures/reprounzip-qt-defaultApp.png

If ReproUnzip doesn't appear right away, simply click "Other Application" and find it in the menu, alphabetically.

.. image:: figures/reprounzip-qt-defaultApp-1.png

Usage
============

26 changes: 13 additions & 13 deletions docs/install.rst
@@ -18,14 +18,14 @@ For Linux distributions, both *reprozip* and *reprounzip* components are availab
Required Software Packages
--------------------------

Python 2.7.3 or greater, or 3.3 or greater is required to run ReproZip [#bug]_. If you don't have Python on your machine, you can get it from `python.org <https://www.python.org/>`__. You will also need the `pip <https://pip.pypa.io/en/latest/installing.html>`__ installer.
Python 2.7.3 or greater, or 3.3 or greater is required to run ReproZip [#bug]_. If you don't have Python on your machine, you can get it from `python.org <https://www.python.org/>`__. You will also need the `pip <https://pip.pypa.io/en/latest/installing/>`__ installer.
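The version requirement stated above (2.7.3 or greater on the 2.x line, 3.3 or greater on the 3.x line) can be checked with a short sketch like the one below. The function name `python_is_supported` is made up for this example and is not part of ReproZip.

```python
import sys


def python_is_supported(version_info=sys.version_info):
    """Return True if the interpreter version satisfies '2.7.3+ or 3.3+'."""
    version = tuple(version_info[:3])
    if version[0] == 2:
        # Python 2 needs at least 2.7.3 (see the sqlite3 bug footnote).
        return version >= (2, 7, 3)
    # Any other major version must be at least 3.3.
    return version >= (3, 3)


print(python_is_supported())
```

Comparing tuples avoids the pitfalls of comparing version strings, where for instance "2.7.10" would sort before "2.7.3".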

Besides Python and pip, each component or plugin to be used may have additional dependencies that you need to install (if you do not have them already installed in your environment), as described below:

+------------------------+-------------------------------------------------+
| Component / Plugin | Required Software Packages |
+========================+=================================================+
| *reprozip* | `SQLite <http://www.sqlite.org/>`__, |
| *reprozip* | `SQLite <https://www.sqlite.org/>`__, |
| | Python headers, |
| | a working C compiler |
+------------------------+-------------------------------------------------+
@@ -56,9 +56,9 @@ You can get the dependencies using the Yum packaging manager::

yum install python python-devel gcc sqlite-devel openssl-devel libffi-devel

.. [#bug] ``reprozip`` and ``reprounzip graph`` will not work before 2.7.3 due to `Python bug 13676 <http://bugs.python.org/issue13676>`__ related to sqlite3. Python 2.6 is ancient and unsupported.
.. [#bug] ``reprozip`` and ``reprounzip graph`` will not work before 2.7.3 due to `Python bug 13676 <https://bugs.python.org/issue13676>`__ related to sqlite3. Python 2.6 is ancient and unsupported.
.. [#pycrypto] Required to build `PyCrypto <https://www.dlitz.net/software/pycrypto/>`__.
.. [#vis1] `VisTrails v2.2.3+ <http://www.vistrails.org/>`__ is required to run the workflow generated by the plugin.
.. [#vis1] `VisTrails v2.2.3+ <https://www.vistrails.org/>`__ is required to run the workflow generated by the plugin.
Installing *reprozip*
---------------------
@@ -94,7 +94,7 @@ An installer containing Python 2.7, *reprounzip*, and all the plugins can be `do
Required Software Packages
--------------------------

Python 2.7.3 or greater, or 3.3 or greater is required to run ReproZip [#bug2]_. If you don't have Python on your machine, you can get it from `python.org <https://www.python.org/>`__; you should prefer a 2.x release to a 3.x one. You will also need the `pip <https://pip.pypa.io/en/latest/installing.html>`__ installer.
Python 2.7.3 or greater, or 3.3 or greater is required to run ReproZip [#bug2]_. If you don't have Python on your machine, you can get it from `python.org <https://www.python.org/>`__; you should prefer a 2.x release to a 3.x one. You will also need the `pip <https://pip.pypa.io/en/latest/installing/>`__ installer.

Besides Python and pip, each component or plugin to be used may have additional dependencies that you need to install (if you do not have them already installed in your environment), as described below:

@@ -113,13 +113,13 @@ Besides Python and pip, each component or plugin to be used may have additional
| *reprounzip-vistrails* | None [#vis2]_ |
+------------------------+-------------------------------------------------+

You will need Xcode installed, which you can get from the Mac App Store, and the Command Line Developer Tools; instrucions on installing the latter may depend on your Mac OS X version (some information on StackOverflow `here <http://stackoverflow.com/questions/9329243/xcode-4-4-and-later-install-command-line-tools?answertab=active#tab-top>`__).
You will need Xcode installed, which you can get from the Mac App Store, and the Command Line Developer Tools; instrucions on installing the latter may depend on your Mac OS X version (some information on StackOverflow `here <https://stackoverflow.com/questions/9329243/xcode-install-command-line-tools?answertab=active#tab-top>`__).

.. seealso:: :ref:`Why does reprounzip-vagrant installation fail with error "unknown argument: -mno-fused-madd" on Mac OS X? <compiler_mac>`

.. [#bug2] ``reprozip`` and ``reprounzip graph`` will not work before 2.7.3 due to `Python bug 13676 <http://bugs.python.org/issue13676>`__ related to sqlite3. Python 2.6 is ancient and unsupported.
.. [#bug2] ``reprozip`` and ``reprounzip graph`` will not work before 2.7.3 due to `Python bug 13676 <https://bugs.python.org/issue13676>`__ related to sqlite3. Python 2.6 is ancient and unsupported.
.. [#pycrypto2] Required to build `PyCrypto <https://www.dlitz.net/software/pycrypto/>`__.
.. [#vis2] `VisTrails v2.2.3+ <http://www.vistrails.org/>`__ is required to run the workflow generated by the plugin.
.. [#vis2] `VisTrails v2.2.3+ <https://www.vistrails.org/>`__ is required to run the workflow generated by the plugin.
Installing *reprounzip*
-----------------------
@@ -152,7 +152,7 @@ A 32-bit installer containing Python 2.7, *reprounzip*, and all the plugins can
Required Software Packages
--------------------------

Python 2.7.3 or greater, or 3.3 or greater is required to run ReproZip [#bug3]_. If you don't have Python on your machine, you can get it from `python.org <https://www.python.org/>`__; you should prefer a 2.x release to a 3.x one. You will also need the `pip <https://pip.pypa.io/en/latest/installing.html>`__ installer.
Python 2.7.3 or greater, or 3.3 or greater is required to run ReproZip [#bug3]_. If you don't have Python on your machine, you can get it from `python.org <https://www.python.org/>`__; you should prefer a 2.x release to a 3.x one. You will also need the `pip <https://pip.pypa.io/en/latest/installing/>`__ installer.

Besides Python and pip, each component or plugin to be used may have additional dependencies that you need to install (if you do not have them already installed in your environment), as described below:

@@ -172,9 +172,9 @@ Besides Python and pip, each component or plugin to be used may have additional

.. seealso:: :ref:`Why does reprounzip-vagrant installation fail with error "Unable to find vcvarsall.bat" on Windows? <pycrypto_windows>`

.. [#bug3] ``reprozip`` and ``reprounzip graph`` will not work before 2.7.3 due to `Python bug 13676 <http://bugs.python.org/issue13676>`__ related to sqlite3. Python 2.6 is ancient and unsupported.
.. [#pycrypto3] A working C compiler is required to build PyCrypto. For installation without building from source, please see `this page <http://stackoverflow.com/questions/11405549/how-do-i-install-pycrypto-on-windows>`__.
.. [#vis3] `VisTrails v2.2.3+ <http://www.vistrails.org/>`__ is required to run the workflow generated by the plugin.
.. [#bug3] ``reprozip`` and ``reprounzip graph`` will not work before 2.7.3 due to `Python bug 13676 <https://bugs.python.org/issue13676>`__ related to sqlite3. Python 2.6 is ancient and unsupported.
.. [#pycrypto3] A working C compiler is required to build PyCrypto. For installation without building from source, please see `this page <https://stackoverflow.com/questions/11405549/how-do-i-install-pycrypto-on-windows>`__.
.. [#vis3] `VisTrails v2.2.3+ <https://www.vistrails.org/>`__ is required to run the workflow generated by the plugin.
Installing *reprounzip*
-----------------------
@@ -193,7 +193,7 @@ Or you can install *reprounzip* and choose components manually::
Anaconda
========

*reprozip* and *reprounzip* can also be installed on the `Anaconda <https://store.continuum.io/cshop/anaconda>`__ Python distribution, from anaconda.org::
*reprozip* and *reprounzip* can also be installed on the `Anaconda <https://www.anaconda.com/download/>`__ Python distribution, from anaconda.org::

$ conda install --channel vida-nyu reprozip reprounzip reprounzip-docker reprounzip-vagrant reprounzip-vistrails

2 changes: 1 addition & 1 deletion docs/reprozip.rst
@@ -5,6 +5,6 @@ Reproducibility is a core component of the scientific process: it helps research

The truth is computational reproducibility can be very painful to achieve for a number of reasons. Take the author-reviewer scenario of a scientific paper as an example. Authors must generate a compendium that encapsulates all the inputs needed to correctly reproduce their experiments: the data, a complete specification of the experiment and its steps, and information about the originating computational environment (OS, hardware architecture, and library dependencies). Keeping track of this information manually is rarely feasible: it is both time-consuming and error-prone. First, computational environments are complex, consisting of many layers of hardware and software, and the configuration of the OS is often hidden. Second, tracking library dependencies is challenging, especially for large experiments. If authors did not plan for reproducibility since the beginning of the project, reproducibility is drastically hampered.

For reviewers, even with a compendium in their hands, it may be hard to reproduce the results. There may be no instructions about how to execute the code and explore it further; the experiment may not run on his operating system; there may be missing libraries; library versions may be different; and several issues may arise while trying to install all the required dependencies, a problem colloquially known as `dependency hell <http://en.wikipedia.org/wiki/Dependency_hell>`__.
For reviewers, even with a compendium in their hands, it may be hard to reproduce the results. There may be no instructions about how to execute the code and explore it further; the experiment may not run on his operating system; there may be missing libraries; library versions may be different; and several issues may arise while trying to install all the required dependencies, a problem colloquially known as `dependency hell <https://en.wikipedia.org/wiki/Dependency_hell>`__.

ReproZip helps alleviate these problems by allowing the user to easily capture all the necessary components in a single, distributable package. Also, the tool makes it easier to reproduce an experiment by providing different unpacking methods and interfaces that avoids the need to install all the required dependencies and that makes it possible to run the experiment under different inputs.
