diff --git a/.gitignore b/.gitignore
index 749e9fdf..c10e49de 100644
--- a/.gitignore
+++ b/.gitignore
@@ -56,9 +56,6 @@ coverage.xml
 # Django stuff:
 *.log
 
-# Sphinx documentation
-docs/_build/
-
 # PyBuilder
 target/
 
diff --git a/CONTRIBUTING.rst b/CONTRIBUTING.rst
index c8314c4c..f466bad2 100644
--- a/CONTRIBUTING.rst
+++ b/CONTRIBUTING.rst
@@ -4,7 +4,7 @@ Contributing
 Issues
 ------
 
-Bug reports, feature requests, and other contributions are welcome. If you find
+Bug reports, issues, feature requests, and other contributions are welcome. If you find
 a demonstrable problem that is caused by the REANA code, please:
 
 1. Search for `already reported problems
@@ -13,11 +13,8 @@ a demonstrable problem that is caused by the REANA code, please:
    latest `master` branch.
 3. Create an issue, ideally with **a test case**.
 
-Pull requests
--------------
-
-If you create a feature branch, you can run the tests to ensure that everything
-is operating correctly:
+If you create a pull request fixing a bug or implementing a feature, you can run
+the tests to ensure that everything is operating correctly:
 
 .. code-block:: console
 
@@ -25,9 +22,3 @@ is operating correctly:
 
 Each pull request should preserve or increase code coverage.
 
-Kanban
-------
-
-We are using Kanban technique for keeping track of ongoing tasks. Please see our
-`Kanban board `_ and look for issues that are
-labelled as "ready for work".
diff --git a/MANIFEST.in b/MANIFEST.in
index ea0f7dd2..d8925733 100644
--- a/MANIFEST.in
+++ b/MANIFEST.in
@@ -11,12 +11,6 @@ include *.md
 include *.html
 include *.sh
 include pytest.ini
-prune docs/_build
-recursive-include reana *.py
-recursive-include docs *.py
-recursive-include docs *.png
-recursive-include docs *.rst
-recursive-include docs *.txt
 recursive-include tests *.py
 recursive-include helm *.yaml
 recursive-include helm *.txt
@@ -24,6 +18,7 @@ recursive-include helm *.helmignore
 recursive-include helm *.lock
 recursive-include benchmark *.py
 recursive-include scripts *.sh
+recursive-include images *.png
 
 # Helm
 recursive-include helm *.md
diff --git a/Makefile b/Makefile
index 858bcdd2..70eb0107 100644
--- a/Makefile
+++ b/Makefile
@@ -235,9 +235,7 @@ test: # Run unit tests on the REANA package.
 	pydocstyle reana
 	black --check .
 	check-manifest --ignore ".travis-*"
-	sphinx-build -qnNW docs docs/_build/html
 	python setup.py test
-	sphinx-build -qnNW -b doctest docs docs/_build/doctest
 	helm lint helm/reana
 
 # end of file
diff --git a/README.rst b/README.rst
index 066a5621..c82f934c 100644
--- a/README.rst
+++ b/README.rst
@@ -1,3 +1,7 @@
+.. image:: images/logo-reana.png
+   :target: http://docs.reana.io
+   :align: center
+
 ===========================
  REANA - Reusable Analyses
 ===========================
@@ -44,11 +48,24 @@ Features
 - support for several shared storage systems (Ceph)
 - support for several container technologies (Docker)
 
+Getting started
+---------------
+
+You can `install REANA locally `_, `deploy it at scale on premises
+`_ (in about 10 minutes) or use https://reana.cern.ch. Once the system
+is ready, you can follow the guide to run `your first example `_.
+For more in depth information visit the `official REANA documentation `_.
+ +Community +--------- + +- Discuss `on Forum `_ +- Chat `on Gitter `_ +- Follow us `on Twitter `_ + Useful links ------------ - `REANA home page `_ +- `REANA documentation `_ - `REANA on DockerHub `_ -- `REANA on GitHub `_ -- `REANA on ReadTheDocs `_ -- `REANA on Twitter `_ diff --git a/docs/_static/reana-architecture.png b/docs/_static/reana-architecture.png deleted file mode 100644 index 64019b8f..00000000 Binary files a/docs/_static/reana-architecture.png and /dev/null differ diff --git a/docs/_static/setting-the-breakpoint.png b/docs/_static/setting-the-breakpoint.png deleted file mode 100644 index 9cb3d1a1..00000000 Binary files a/docs/_static/setting-the-breakpoint.png and /dev/null differ diff --git a/docs/_static/wdb-active-sessions.png b/docs/_static/wdb-active-sessions.png deleted file mode 100644 index 0a92d7bd..00000000 Binary files a/docs/_static/wdb-active-sessions.png and /dev/null differ diff --git a/docs/_static/wdb-debugging-ui.png b/docs/_static/wdb-debugging-ui.png deleted file mode 100644 index c431384e..00000000 Binary files a/docs/_static/wdb-debugging-ui.png and /dev/null differ diff --git a/docs/administratorguide.rst b/docs/administratorguide.rst deleted file mode 100644 index 7968de2a..00000000 --- a/docs/administratorguide.rst +++ /dev/null @@ -1,179 +0,0 @@ -.. _administratorguide: - -Administrator guide -=================== - -This administrator guide is meant for people who would like to deploy and manage -REANA clusters. (The researchers are probably interested in reading the -:ref:`userguide` instead.) - -Architecture ------------- - -REANA system is composed of multiple separated components that permit to define -and manage computing cloud resources that run computational workflows on the -cloud. - -.. 
image:: /_static/reana-architecture.png - -REANA uses the following technologies: - -- `Python `_ -- `Flask `_ -- `Docker `_ -- `Kubernetes `_ -- `RabbitMQ `_ -- `Yadage `_ -- `CWL `_ -- `EOS `_ - -Components ----------- - -REANA system is composed of multiple separated components that are developed -independently. The components are usually published as Python packages (user -client, administrator cluster management) or as Docker images (internal REANA -components). - -reana-client -~~~~~~~~~~~~ - -REANA command line client for end users. - -- source code: ``_ -- release notes: ``_ -- known issues: ``_ -- documentation: ``_ - -reana-commons -~~~~~~~~~~~~~ - -Shared utilities for REANA components. - -- source code: ``_ -- release notes: ``_ -- known issues: ``_ -- documentation: ``_ - -reana-db -~~~~~~~~ - -REANA component containing database models and utilities. - -- source code: ``_ -- release notes: ``_ -- known issues: ``_ -- documentation: ``_ - -reana-job-controller -~~~~~~~~~~~~~~~~~~~~ - -REANA component for running and managing jobs. - -- source code: ``_ -- release notes: ``_ -- known issues: ``_ -- documentation: ``_ - -reana-message-broker -~~~~~~~~~~~~~~~~~~~~ - -REANA component for messaging needs. - -- source code: ``_ -- release notes: ``_ -- known issues: ``_ -- documentation: ``_ - -pytest-reana -~~~~~~~~~~~~ - -Shared pytest fixtures and other common testing utilities. - -- source code: ``_ -- release notes: ``_ -- known issues: ``_ -- documentation: ``_ - -reana-server -~~~~~~~~~~~~ - -REANA component providing API server replying to client queries. - -- source code: ``_ -- release notes: ``_ -- known issues: ``_ -- documentation: ``_ - -reana-ui -~~~~~~~~ - -REANA UI frontend. - -- source code: ``_ -- release notes: ``_ -- known issues: ``_ -- documentation: ``_ - -reana-workflow-controller -~~~~~~~~~~~~~~~~~~~~~~~~~ - -REANA component for running and managing workflows. 
- -- source code: ``_ -- release notes: ``_ -- known issues: ``_ -- documentation: ``_ - -reana-workflow-engine-cwl -~~~~~~~~~~~~~~~~~~~~~~~~~ - -REANA component for running CWL types of workflows. - -- source code: ``_ -- release notes: ``_ -- known issues: ``_ -- documentation: ``_ - -reana-workflow-engine-serial -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -REANA component for running simple sequential workflows. - -- source code: ``_ -- release notes: ``_ -- known issues: ``_ -- documentation: ``_ - -reana-workflow-engine-yadage -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -REANA component for running Yadage types of workflows. - -- source code: ``_ -- release notes: ``_ -- known issues: ``_ -- documentation: ``_ - -Deployment ----------- - -Local deployment using Minikube -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -REANA cloud uses `Kubernetes `_ container orchestration -system. The best way to try it out locally is to set up `Minikube -`_ (minikube -version 1.5.2 is known to work the best). - -The minikube can be started as follows: - -.. code-block:: console - - $ minikube start --vm-driver=virtualbox --feature-gates="TTLAfterFinished=true" - -REANA cluster can be easily deployed using `Helm `_: - -.. code-block:: console - - $ helm install reana helm/reana diff --git a/docs/authors.rst b/docs/authors.rst deleted file mode 100644 index a887eee3..00000000 --- a/docs/authors.rst +++ /dev/null @@ -1,3 +0,0 @@ -.. _authors: - -.. include:: ../AUTHORS.rst diff --git a/docs/changes.rst b/docs/changes.rst deleted file mode 100644 index 58b3592c..00000000 --- a/docs/changes.rst +++ /dev/null @@ -1,3 +0,0 @@ -.. _changes: - -.. include:: ../CHANGES.rst diff --git a/docs/conf.py b/docs/conf.py deleted file mode 100644 index 0480bc27..00000000 --- a/docs/conf.py +++ /dev/null @@ -1,204 +0,0 @@ -#!/usr/bin/env python3 -# -*- coding: utf-8 -*- -# -# reana documentation build configuration file, created by -# sphinx-quickstart on Mon Jan 23 14:17:34 2017. 
-# -# This file is execfile()d with the current directory set to its -# containing dir. -# -# Note that not all possible configuration values are present in this -# autogenerated file. -# -# All configuration values have a default; values that are commented out -# serve to show the default. - -# If extensions (or modules to document with autodoc) are in another directory, -# add these directories to sys.path here. If the directory is relative to the -# documentation root, use os.path.abspath to make it absolute, like shown here. -# -# import os -# import sys -# sys.path.insert(0, os.path.abspath('.')) - -from __future__ import print_function - -import os - -import sphinx.environment - -# -- General configuration ------------------------------------------------ - -# If your documentation needs a minimal Sphinx version, state it here. -# -# needs_sphinx = '1.0' - -# Do not warn on external images. -suppress_warnings = ["image.nonlocal_uri"] - -# Add any Sphinx extension module names here, as strings. They can be -# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom -# ones. -extensions = [ - "sphinx.ext.autodoc", - "sphinx.ext.coverage", - "sphinx.ext.doctest", - "sphinx.ext.graphviz", - "sphinx.ext.intersphinx", - "sphinx.ext.viewcode", - "sphinx_click.ext", - "sphinxcontrib.programoutput", -] - -# Add any paths that contain templates here, relative to this directory. -templates_path = ["_templates"] - -# The suffix(es) of source filenames. -# You can specify multiple suffix as a list of string: -# -# source_suffix = ['.rst', '.md'] -source_suffix = ".rst" - -# The master toctree document. -master_doc = "index" - -# General information about the project. -project = "reana" -copyright = "2017, 2018, 2019, info@reana.io" -author = "info@reana.io" - -# The version info for the project you're documenting, acts as replacement for -# |version| and |release|, also used in various other places throughout the -# built documents. -# -# The short X.Y version. 
- -# Get the version string. Cannot be done with import! -g = {} -with open(os.path.join("..", "reana", "version.py"), "rt") as fp: - exec(fp.read(), g) - version = g["__version__"] - -# The full version, including alpha/beta/rc tags. -release = version - -# The language for content autogenerated by Sphinx. Refer to documentation -# for a list of supported languages. -# -# This is also used if you do content translation via gettext catalogs. -# Usually you set "language" from the command line for these cases. -language = None - -# List of patterns, relative to source directory, that match files and -# directories to ignore when looking for source files. -# This patterns also effect to html_static_path and html_extra_path -exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"] - -# The name of the Pygments (syntax highlighting) style to use. -pygments_style = "sphinx" - -# If true, `todo` and `todoList` produce output, else they produce nothing. -todo_include_todos = False - - -# -- Options for HTML output ---------------------------------------------- - -# The theme to use for HTML and HTML Help pages. See the documentation for -# a list of builtin themes. -# -html_theme = "alabaster" - -# Theme options are theme-specific and customize the look and feel of a theme -# further. For a list of options available for each theme, see the -# documentation. -# -html_theme_options = { - "logo": "logo-reana.png", - "description": """

REANA is a reusable and reproducible - research data analysis platform.

Structure your - analysis inputs, code, environments, workflows and run - your analysis on remote containerised compute - clouds.

""", - "github_user": "reanahub", - "github_repo": "reana", - "github_button": False, - "github_banner": True, - "show_powered_by": False, - "extra_nav_links": { - "REANA@DockerHub": "https://hub.docker.com/u/reanahub/", - "REANA@GitHub": "https://github.com/reanahub", - "REANA@Twitter": "https://twitter.com/reanahub", - "REANA@Web": "http://www.reana.io", - }, -} - -# Add any paths that contain custom static files (such as style sheets) here, -# relative to this directory. They are copied after the builtin static files, -# so a file named "default.css" will overwrite the builtin "default.css". -html_static_path = ["_static"] - -# Custom sidebar templates, maps document names to template names. -html_sidebars = { - "**": [ - "about.html", - "navigation.html", - "relations.html", - "searchbox.html", - "donate.html", - ] -} - -# -- Options for HTMLHelp output ------------------------------------------ - -# Output file base name for HTML help builder. -htmlhelp_basename = "reanadoc" - - -# -- Options for LaTeX output --------------------------------------------- - -latex_elements = { - # The paper size ('letterpaper' or 'a4paper'). - # - # 'papersize': 'letterpaper', - # The font size ('10pt', '11pt' or '12pt'). - # - # 'pointsize': '10pt', - # Additional stuff for the LaTeX preamble. - # - # 'preamble': '', - # Latex figure (float) alignment - # - # 'figure_align': 'htbp', -} - -# Grouping the document tree into LaTeX files. List of tuples -# (source start file, target name, title, -# author, documentclass [howto, manual, or own class]). -latex_documents = [ - (master_doc, "reana.tex", "reana Documentation", "info@reana.io", "manual"), -] - - -# -- Options for manual page output --------------------------------------- - -# One entry per manual page. List of tuples -# (source start file, name, description, authors, manual section). 
-man_pages = [(master_doc, "reana", "reana Documentation", [author], 1)] - - -# -- Options for Texinfo output ------------------------------------------- - -# Grouping the document tree into Texinfo files. List of tuples -# (source start file, target name, title, author, -# dir menu entry, description, category) -texinfo_documents = [ - ( - master_doc, - "reana", - "reana Documentation", - author, - "reana", - "One line description of project.", - "Miscellaneous", - ), -] diff --git a/docs/contributing.rst b/docs/contributing.rst deleted file mode 100644 index 46281822..00000000 --- a/docs/contributing.rst +++ /dev/null @@ -1,3 +0,0 @@ -.. _contributing: - -.. include:: ../CONTRIBUTING.rst diff --git a/docs/developerguide.rst b/docs/developerguide.rst deleted file mode 100644 index 904fe8c5..00000000 --- a/docs/developerguide.rst +++ /dev/null @@ -1,97 +0,0 @@ -.. _developerguide: - -Developer guide -=============== - -This developer guide is meant for software developers who would like to -understand REANA source code and contribute to it. - - -Local development workflow --------------------------- - -REANA cluster is composed of several micro-services with multiple independent -source code repositories. - -The main source code repository contains a ``Makefile`` which allows you to -quickly clone all the necessary repositories and kick-start your REANA platform -developments locally. - -You can simply type ``make`` to see the available options and usage scenarios. - -.. program-output:: cd .. && make help - :shell: - -In addition, REANA comes with a ``reana-dev`` helper development script that -simplifies working with multiple repositories during local development and -integration testing. You can use ``--help`` option to see the detailed usage -instructions. - -.. 
click:: reana.reana_dev.cli:reana_dev - :prog: reana-dev - -Debugging ---------- - -In order to debug a REANA component, you first have to mount REANA's source -code into Minikube's `/code` directory and then build and re-deploy in -development mode: - -.. code-block:: console - - $ minikube mount $(pwd)/..:/code - $ CLUSTER_FLAGS=debug.enabled=true make build deploy - -Let us now introduce `wdb` breakpoint as the first instruction of the -first instruction of the `get_workflows()` function located in -`reana_server/rest/workflows.py`: - -.. image:: /_static/setting-the-breakpoint.png - -We can check that the code has been in fact updated and make a request to the -component: - -.. code-block:: console - - $ kubectl logs --selector=app="server" -c server - - DB Created. - * Serving Flask app "/code/reana_server/app.py" (lazy loading) - * Environment: production - WARNING: Do not use the development server in a production environment. - Use a production WSGI server instead. - * Debug mode: on - * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit) - * Restarting with stat - * Debugger is active! - * Debugger PIN: 221-564-335 - - $ curl $REANA_SERVER_URL/api/workflows?access_token=$REANA_ACCESS_TOKEN - -After doing that we can go to the `wdb` dashboard: - -.. code-block:: console - - $ firefox http://`minikube ip`:31984 - - -.. image:: /_static/wdb-active-sessions.png - -And finally select the debugging session. - -.. image:: /_static/wdb-debugging-ui.png - -Port forwarding ---------------- - -If you ever need to access one specific microservice via HTTP there is a Kubernetes -command that can help. The ``port-forward`` command connects a local port on the -machine to a port on a Kubernetes pod. It directs the traffic reaching the local -port to the pod port through an HTTP connection. Example: - -.. code-block:: console - - $ kubectl port-forward --address 0.0.0.0 : - -The ``--address`` flag defines the local IP address to listen on. 
Using ``0.0.0.0`` -makes the connection listen to all local IP addresses. diff --git a/docs/gettingstarted.rst b/docs/gettingstarted.rst deleted file mode 100644 index 1d417f82..00000000 --- a/docs/gettingstarted.rst +++ /dev/null @@ -1,102 +0,0 @@ -.. _gettingstarted: - -Getting started -=============== - -Get started with the REANA reusable analysis platform by exploring the following -three steps. - -Step One: Structure your analysis ---------------------------------- - -Structure your research data analysis repository into input "data" and -"parameters", runtime "code", computing "environments", and computational -"workflows", following the model of the :ref:`fourquestions`. Create -``reana.yaml`` describing your structure: - -.. code-block:: yaml - - version: 0.4.0 - inputs: - files: - - code/mycode.py - - data/mydata.csv - parameters: - myparameter: myvalue - workflow: - type: cwl - file: workflow/myworkflow.cwl - outputs: - files: - - results/myplot.png - -See and run some :ref:`examples`. - -Step Two: Install REANA cluster -------------------------------- - -You can use an existing REANA cloud deployment (if you have access to one) by -setting the ``REANA_SERVER_URL`` environment variable and providing a valid -token: - -.. code-block:: console - - $ export REANA_SERVER_URL=https://reana.cern.ch/ - $ export REANA_ACCESS_TOKEN=XXXXXXX - -You can also easily deploy your own REANA cloud instance by using -`Helm `_: - -.. code-block:: console - - $ # install kubectl 1.16.3 and minikube 1.5.2 - $ sudo dpkg -i kubectl*.deb minikube*.deb - $ minikube start --feature-gates="TTLAfterFinished=true" - $ helm install reana helm/reana - -Step Three: Run REANA client ----------------------------- - -You can run your analysis on the REANA cloud by using the ``reana-client`` -command line client: - -.. 
code-block:: console - - $ # create new virtual environment - $ virtualenv ~/.virtualenvs/myreana - $ source ~/.virtualenvs/myreana/bin/activate - $ # install REANA client - $ pip install reana-client - $ # create new workflow - $ reana-client create -n my-analysis - $ export REANA_WORKON=my-analysis - $ # upload input code and data to the workspace - $ reana-client upload - $ # start computational workflow - $ reana-client start - $ # check its progress - $ reana-client status - $ # list workspace files - $ reana-client ls - $ # download output results - $ reana-client download - -See `REANA-Client's Getting started guide -`_ for more -information. - -Next steps ----------- - -For more information, please see: - -- Are you a researcher who would like to run a reusable analysis on REANA cloud? - You can install and use `reana-client `_ - utility that provides interface to both local and remote REANA cloud - installations. For more information, please see the :ref:`userguide`. You may - also be interested in checking out some existing :ref:`examples`. - -- Are you a software developer who would like to contribute to REANA? You may be - interested in trying out REANA both from the user point of view and the - administrator point of view first. Follow by reading the :ref:`developerguide` - afterwards. diff --git a/docs/index.rst b/docs/index.rst deleted file mode 100644 index 4d542245..00000000 --- a/docs/index.rst +++ /dev/null @@ -1,20 +0,0 @@ -.. include:: ../README.rst - :end-before: About - -.. include:: ../README.rst - :start-after: ----- - :end-before: Features - -.. toctree:: - :numbered: - :maxdepth: 2 - - introduction - gettingstarted - userguide - administratorguide - developerguide - contributing - changes - license - authors diff --git a/docs/introduction.rst b/docs/introduction.rst deleted file mode 100644 index 27571930..00000000 --- a/docs/introduction.rst +++ /dev/null @@ -1,7 +0,0 @@ -.. _introduction: - -Introduction -============ - -.. 
include:: ../README.rst - :start-line: 22 diff --git a/docs/license.rst b/docs/license.rst deleted file mode 100644 index ec63ef70..00000000 --- a/docs/license.rst +++ /dev/null @@ -1,8 +0,0 @@ -License -======= - -.. include:: ../LICENSE - -In applying this license, CERN does not waive the privileges and immunities -granted to it by virtue of its status as an Intergovernmental Organization or -submit itself to any jurisdiction. diff --git a/docs/requirements.txt b/docs/requirements.txt deleted file mode 100644 index 830145b4..00000000 --- a/docs/requirements.txt +++ /dev/null @@ -1,7 +0,0 @@ -# This file is part of REANA. -# Copyright (C) 2017, 2018 CERN. -# -# REANA is free software; you can redistribute it and/or modify it -# under the terms of the MIT License; see LICENSE file for more details. - --e .[all] diff --git a/docs/userguide.rst b/docs/userguide.rst deleted file mode 100644 index b8d193ee..00000000 --- a/docs/userguide.rst +++ /dev/null @@ -1,640 +0,0 @@ -.. _userguide: - -User guide -========== - -This user guide is meant for researchers who would like to structure their data -analysis and run them on REANA cloud. - -Reusable analyses ------------------ - -Making a research data analysis reproducible basically means to provide -structured "runnable recipes" addressing (1) where is the input data, (2) what -software was used to analyse the data, (3) which computing environments were -used to run the software and (4) which computational steps were taken to run the -analysis. This will permit to instantiate the analysis on the computational -cloud and run the analysis to obtain its (5) output results. - -.. _fourquestions: - -Four questions --------------- - -REANA helps to make the research analysis reproducible by providing a structure -helping to answer the "Four Questions": - -1. What is your input data? - - - input data files - - input parameters - - live database calls - -2. Which code analyses it? 
- - - custom analysis code - - analysis frameworks - - Jupyter notebooks - -3. What is your environment? - - - operating system - - software packages and libraries - - CPU and memory resources - -4. Which steps did you take? - - - simple shell commands - - complex computational workflows - - local and/or remote task execution - -Let us see step by step on how we could go about making an analysis reproducible -and run it on the REANA platform. - -Structure your analysis ------------------------ - -It is advised to structure your research data analysis sources to clearly -declare and separate your analysis inputs, code, and outputs. A simple -hypothetical example: - -.. code-block:: console - - $ find . - data/mydata.csv - code/mycode.py - docs/mynotes.txt - results/myplot.png - -Note how we put the input data file in the ``data`` directory, the runtime code -that analyses it in the ``code`` directory, the documentation in the ``docs`` -directory, and the produced output plots in the ``results`` directory. - -Note that this structure is fully optional and you can use any you prefer, or -simply store everything in the same working directory. You can also take -inspiration by looking at several real-life examples in the :ref:`examples` -section of the documentation. - -Capture your workflows ----------------------- - -Now that we have structured our analysis data and code, we have to provide -recipe how to produce final plots. - -**Simple analyses** - -Let us assume that our analysis is run in two stages, firstly a data filtering -stage and secondly a data plotting stage. A hypothetical example: - -.. code-block:: console - - $ python ./code/mycode.py \ - < ./data/mydata.csv > ./workspace/mydata.tmp - $ python ./code/mycode.py --plot myparameter=myvalue \ - < ./workspace/mydata.tmp > ./results/myplot.png - -Note how we call a given sequence of commands to produce our desired output -plots. 
In order to capture this sequence of commands in a "runnable" or -"actionable" manner, we can write a short shell script ``run.sh`` and make it -parametrisable: - -.. code-block:: console - - $ ./run.sh --myparameter myvalue - -In this case you will want to use the `Serial -`_ workflow engine of -REANA. The engine permits to express the workflow as a sequence of commands: - -.. code-block:: console - - START - | - | - V - +--------+ - | filter | <-- mydata.csv - +--------+ - | - | mydata.tmp - | - V - +--------+ - | plot | <-- myparameter=myvalue - +--------+ - | - | plot.png - V - STOP - -Note that you can run different commands in different computing environments, -but they must be run in a linear sequential manner. - -The sequential workflow pattern will usually cover only simple computational -workflow needs. - -**Complex analyses** - -For advanced workflow needs we may want to run certain commands in parallel in a -sort of map-reduce fashion. There are `many workflow systems -`_ -that are dedicated to expressing complex computational schemata in a structured -manner. REANA supports several, such as `CWL `_ and -`Yadage `_. - -The workflow systems enable to express the computational steps in the form of -`Directed Acyclic Graph (DAG) -`_ permitting advanced -computational scenarios. - -.. code-block:: console - - START - | - | - +------+----------+ - / | \ - / V \ - +--------+ +--------+ +--------+ - | filter | | filter | | filter | <-- mydata - +--------+ +--------+ +--------+ - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - +-------+ - | merge | - +-------+ - | - | mydata.tmp - | - V - +--------+ - | plot | <-- myparameter=myvalue - +--------+ - | - | plot.png - V - STOP - - - -We pick for example the CWL standard to express our computational steps. We -store the workflow specification in the ``workflow`` directory: - -.. 
code-block:: console - - $ find workflow - workflow/myinput.yaml - workflow/myworkflow.cwl - workflow/step-filter.cwl - workflow/step-plot.cwl - -You will again be able to take inspiration from some real-life examples later in -the :ref:`examples` section of the documentation. - - -**To pick a workflow engine** - -For simple needs, the ``Serial`` workflow engine is the quickest to start with. -For regular needs, ``CWL`` or ``Yadage`` would be more appropriate. - -Note that the level of REANA platform support for a particular workflow engine -can differ: - - +----------------+---------------+---------------------+-------------+ - | Engine | Parametrised? | Parallel execution? | Caching? | - +================+===============+=====================+=============+ - | CWL | yes | yes | no(1) | - +----------------+---------------+---------------------+-------------+ - | Serial | yes | no | yes | - +----------------+---------------+---------------------+-------------+ - | Yadage | yes | yes | no(1) | - +----------------+---------------+---------------------+-------------+ - - (1) The vanilla workflow system may support the feature, but not when run - via REANA environment. - -**Develop workflow locally** - -Now that we have declared our analysis input data and code, as well as captured -the computational steps in a structured manner, we can see whether our analysis -runs in the original computing environment. We can use the helper wrapper -scripts: - -.. code-block:: console - - $ run.sh - -or use workflow-specific commands, such as ``cwltool`` in case of CWL workflows: - -.. code-block:: console - - $ cwltool --quiet --outdir="./results" \ - ./workflow/myworkflow.cwl ./workflow/myinput.yaml - -This completes the first step in the parametrisation of our analysis in a -reproducible manner. 
- -Containerise your environment ------------------------------ - -Now that we have fully described our inputs and code and the steps to run the -analysis and produce our results, we need to make sure we shall be running the -commands in the same environment. Capturing the environment specifics is -essential to ensure reproducibility, for example the same version of Python we -are using and the same set of pre-installed libraries that are needed for our -analysis. - -The environment is encapsulated by means of "containers" such as Docker or -Singularity. - -**Using an existing environment** - -Sometimes you can use an already-existing container environment prepared by -others. For example ``python:2.7`` for Python programs or -``clelange/cmssw:5_3_32`` for CMS Offline Software framework. In this case you -simply specify the container name and the version number in your workflow -specification and you are good to go. This is usually the case when your code -does not have to be compiled, for example Python scripts or ROOT macros. - -Note also REANA offers a set of containers that can server as examples about how -to containerise popular analysis environments such as ROOT (see `reana-env-root6 -`_), Jupyter (see -`reana-env-jupyter `_) or an -analysis framework such as AliPhysics (see `reana-env-aliphysics -`_). - -**Building your own environment** - -Other times you may need to build your own container, for example to add a -certain library on top of Python 2.7. This is the most typical use case that -we'll address below. - -This is usually the case when your code needs to be compiled, for example C++ -analysis. - -If you need to create your own environment, this can be achieved by means of -providing a particular ``Dockerfile``: - -.. 
code-block:: console - - $ find environment - environment/myenv/Dockerfile - - $ less environment/Dockerfile - # Start from the Python 2.7 base image: - FROM python:2.7 - - # Install HFtools: - RUN apt-get -y update && \ - apt-get -y install \ - python-pip \ - zip && \ - apt-get autoremove -y && \ - apt-get clean -y - RUN pip install hftools - - # Mount our code: - ADD code /code - WORKDIR /code - -You can build this customised analysis environment image and give it some name, -for example ``johndoe/myenv``: - -.. code-block:: console - - $ docker build -f environment/myenv/Dockerfile -t johndoe/myenv . - -and push the created image to the DockerHub image registry: - -.. code-block:: console - - $ docker push johndoe/myenv - -**Supporting arbitrary user IDs** - -In the Docker container ecosystem, the processes run in the containers by -default use the ``root`` user identity. However, this may not be secure. If -you want to improve the security in your environment you can set up your own -user under which identity the processes will run. - -In order for processes to run under any user identity and still be able to -write to shared workspaces, we use a GID=0 technique -`as used by OpenShift `_: - -- UID: you can use any user ID you want; -- GID: your should add your user to group with GID=0 (the root group) - -This will ensure the writable access to workspace directories managed by the -REANA platform. - -For example, you can create the user ``johndoe`` with UID=501 and add the user -to GID=0 by adding the following commands at the end of the previous -``Dockerfile``: - -.. code-block:: console - - # Setup user and permissions - RUN adduser johndoe -u 501 --disabled-password --gecos "" - RUN usermod -a -G 0 johndoe - USER johndoe - -**Testing the environment** - -We now have a containerised image representing our computational environment -that we can use to run our analysis in another replicated environment. 
- -We should test the containerised environment to ensure it works properly, for -example whether all the necessary libraries are present: - -.. code-block:: console - - $ docker run -i -t --rm johndoe/myenv /bin/bash - container> python -V - Python 2.7.15 - container> python mycode.py < mydata.csv > /tmp/mydata.tmp - -**Multiple environments** - -Note that different steps of the analysis can run in different environments: for example, the -data filtering step on a big cloud having data selection libraries installed, and -the data plotting step in a local environment containing only the -graphing system of choice. You can prepare several different environments for -your analysis if needed. - -Write your ``reana.yaml`` -------------------------- - -We are now ready to tie all the above reproducible elements together. Our -analysis example becomes: - -.. code-block:: console - - $ find . - code/mycode.py - data/mydata.csv - docs/mynotes.txt - environment/myenv/Dockerfile - workflow/myinput.yaml - workflow/myworkflow.cwl - workflow/step-filtering.cwl - workflow/step-plotting.cwl - results/myplot.png - -There is only one thing that remains in order to make it runnable on the REANA -cloud; we need to capture the above structure by means of a ``reana.yaml`` file: - -.. code-block:: yaml - - version: 0.4.0 - inputs: - files: - - code/mycode.py - - data/mydata.csv - parameters: - myparameter: myvalue - workflow: - type: cwl - file: workflow/myworkflow.cwl - outputs: - files: - - results/myplot.png - -This file is used by REANA to instantiate and run the analysis on the cloud. - -Declare necessary resources ---------------------------- - -You can declare additional runtime dependencies that your workflow needs -for successful operation. - -**CVMFS** - -If your workflow needs to access the `CVMFS `_ -filesystem, you should provide a ``cvmfs`` sub-clause of the ``resources`` clause that -lists all the CVMFS volumes to be mounted for the workflow execution. -For example: - -.. 
code-block:: yaml - - workflow: - type: serial - resources: - cvmfs: - - fcc.cern.ch - specification: - steps: - - environment: 'cern/slc6-base' - commands: - - ls -l /cvmfs/fcc.cern.ch/sw/views/releases/ - - -**Kerberos** - -If your workflow requires Kerberos authentication, you should add ``kerberos: true`` -to the steps that need it. Please note that the step's Docker image -(e.g. ``environment: 'cern/slc6-base'``) should have a Kerberos client installed and -you should `upload keytab `_ -file for the Kerberos authentication to work. - -Serial example: - -.. code-block:: yaml - - workflow: - type: serial - resources: - cvmfs: - - fcc.cern.ch - specification: - steps: - - environment: 'cern/slc6-base' - kerberos: true - commands: - - ls -l /cvmfs/fcc.cern.ch/sw/views/releases/ - -CWL example: - -.. code-block:: yaml - - steps: - first: - hints: - reana: - kerberos: true - run: helloworld.tool - in: - helloworld: helloworld - - inputfile: inputfile - sleeptime: sleeptime - outputfile: outputfile - out: [result] - -Yadage example: - -.. code-block:: yaml - - step: - process: - process_type: 'string-interpolated-cmd' - cmd: 'python "{helloworld}" --sleeptime {sleeptime} --inputfile "{inputfile}" --outputfile "{outputfile}"' - publisher: - publisher_type: 'frompar-pub' - outputmap: - outputfile: outputfile - environment: - environment_type: 'docker-encapsulated' - image: 'python' - imagetag: '2.7-slim' - resources: - - kerberos: true - -Compute backends -~~~~~~~~~~~~~~~~ - -REANA supports Kubernetes as its primary compute backend, alongside HTCondor and -Slurm. - -In order to use HTCondor or Slurm clusters, users should upload their CERN username -and keytab secrets using: - -.. code-block:: console - - $ reana-client secrets-add --env CERN_USER=johndoe \ - --env CERN_KEYTAB=.keytab \ - --file ~/.keytab - - .. note:: - Please note that CERN Slurm cluster access is not granted by - `default `_. - -**Kubernetes** - -Kubernetes is the default REANA job compute backend. 
If a step does not contain a -``compute_backend`` specification, it will be executed on the default backend. - -.. code-block:: yaml - - # Serial example - ... - steps: - - name: reana_demo_helloworld_htcondorcern - environment: 'python:2.7-slim' - compute_backend: kubernetes - commands: - - python "${helloworld}" - ... - -**HTCondor** - -In order to execute a job on the HTCondor cluster, the user must specify ``htcondorcern`` -as the step's execution backend in the workflow specification. - -.. code-block:: yaml - - # Serial example - ... - steps: - - name: reana_demo_helloworld_htcondorcern - environment: 'python:2.7-slim' - compute_backend: htcondorcern - commands: - - python "${helloworld}" - ... - -Examples for CWL and Yadage can be found in -`REANA example - "hello world" `_. - -**Slurm** - -In order to execute a job on the Slurm cluster, the user must specify ``slurmcern`` -as the step's execution backend in the workflow specification. - -.. code-block:: yaml - - # Serial example - ... - steps: - - name: reana_demo_helloworld_htcondorcern - environment: 'python:2.7-slim' - compute_backend: slurmcern - commands: - - python "${helloworld}" - ... - -.. note:: - Please note that REANA copies the workflow workspace (with all its files) into - the user's scratch space on the Slurm cluster and it is the user's responsibility to - delete it afterwards. - - -Run your analysis on REANA cloud --------------------------------- - -We can now download the ``reana-client`` command-line utility, configure access to -the remote REANA cloud where we shall run the analysis, and launch it as -follows: - -.. 
code-block:: console - - $ # create new virtual environment - $ virtualenv ~/.virtualenvs/myreana - $ source ~/.virtualenvs/myreana/bin/activate - $ # install REANA client - $ pip install reana-client - $ # connect to some REANA cloud instance - $ export REANA_SERVER_URL=https://reana.cern.ch/ - $ export REANA_ACCESS_TOKEN=XXXXXXX - $ # create new workflow - $ reana-client create -n my-analysis - $ export REANA_WORKON=my-analysis - $ # upload input code and data to the workspace - $ reana-client upload ./code ./data - $ # start computational workflow - $ reana-client start - $ # ... should be finished in about a minute - $ reana-client status - $ # list workspace files - $ reana-client ls - $ # download output results - $ reana-client download results/myplot.png - -We are done! Our output plot should be located in the ``results`` directory. - -Note that you can inspect your analysis workspace by opening `Jupyter notebook interactive sessions -`_. - -For more information on how to use ``reana-client``, please see `REANA-Client's -Getting started guide -`_. - -.. _examples: - -Examples --------- - -This section lists several REANA-compatible research data analysis examples that -illustrate how a typical research data analysis can be packaged in a -REANA-compatible manner to facilitate its future reuse. 
- -- `reana-demo-helloworld `_ - a simple "hello world" example -- `reana-demo-worldpopulation `_ - a parametrised Jupyter notebook example -- `reana-demo-root6-roofit `_ - a simplified ROOT RooFit physics analysis example -- `reana-demo-alice-lego-train-test-run `_ - ALICE experiment analysis train test run and validation -- `reana-demo-alice-pt-analysis `_ - a simple ALICE Pt analysis demonstrator -- `reana-demo-atlas-recast `_ - ATLAS collaboration production software stack example recasting an analysis -- `reana-demo-bsm-search `_ - a typical BSM search example with complex particle physics workflows -- `reana-demo-cms-h4l `_ - CMS Higgs-to-four-leptons open data analysis example -- `reana-demo-cms-reco `_ - CMS RAW-to-AOD reconstruction example -- `reana-demo-lhcb-d2pimumu `_ - LHCb rare charm decay search example - -Next steps ----------- - -For more information on how to use ``reana-client``, you can explore -`REANA-Client documentation `_. diff --git a/docs/_static/logo-reana.png b/images/logo-reana.png similarity index 100% rename from docs/_static/logo-reana.png rename to images/logo-reana.png diff --git a/pytest.ini b/pytest.ini index 1c71d8e4..c14b544c 100644 --- a/pytest.ini +++ b/pytest.ini @@ -5,4 +5,4 @@ # under the terms of the MIT License; see LICENSE file for more details. [pytest] -addopts = --ignore=docs --cov=reana --cov-report=term-missing +addopts = --cov=reana --cov-report=term-missing diff --git a/setup.cfg b/setup.cfg index 3bc9a32b..d493e74e 100644 --- a/setup.cfg +++ b/setup.cfg @@ -7,10 +7,5 @@ [aliases] test = pytest -[build_sphinx] -source-dir = docs/ -build-dir = docs/_build -all_files = 1 - [bdist_wheel] universal = 1 diff --git a/setup.py b/setup.py index 43cd507b..8d23ddf6 100644 --- a/setup.py +++ b/setup.py @@ -23,12 +23,6 @@ ] extras_require = { - "docs": [ - "Sphinx>=1.4.4", - "sphinx-click>=1.0.4", - "sphinxcontrib-programoutput>=0.13", - "sphinx-rtd-theme>=0.1.9", - ], "tests": tests_require, }