Commit

Merge pull request #65 from epigen/dev
Update master with features_0.5
vreuter committed Feb 28, 2017
2 parents 3c3015f + c7a1fb3 commit d396d7b
Showing 20 changed files with 2,066 additions and 912 deletions.
5 changes: 5 additions & 0 deletions .gitignore
@@ -63,3 +63,8 @@ open_pipelines/
# Reserved files for comparison
*RESERVE*

# Build-related stuff
build/
dist/
looper.egg-info/

16 changes: 16 additions & 0 deletions .travis.yml
@@ -0,0 +1,16 @@
language: python
python:
  - "2.7"
before_install:
  - pip install -U pip
install:
  - pip install -U -r requirements.txt
  - pip install -U -r requirements-dev.txt
  - pip install -U -r requirements-test.txt
script: pytest
branches:
  only:
    - features_0.5
    - dev
    - master

1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
# Looper

[![Documentation Status](http://readthedocs.org/projects/looper/badge/?version=latest)](http://looper.readthedocs.io/en/latest/?badge=latest)
[![Build Status](https://travis-ci.org/vreuter/looper.svg?branch=master)](https://travis-ci.org/vreuter/looper)

__`Looper`__ is a pipeline submission engine that parses sample inputs and submits pipelines for each sample. Looper was conceived to use [pypiper](https://github.com/epigen/pypiper/) pipelines, but does not require this.

1 change: 0 additions & 1 deletion VERSION

This file was deleted.

3 changes: 0 additions & 3 deletions dev-requirements.txt

This file was deleted.

16 changes: 13 additions & 3 deletions doc/source/advanced.rst
@@ -97,14 +97,24 @@ In `Templates <https://github.com/epigen/looper/tree/master/templates>`__ are ex
Handling multiple input files with a merge table
****************************************

Sometimes you have multiple input files that you want to merge for one sample. Rather than putting multiple lines in your sample annotation sheet, which causes conceptual and analytical challenges, we introduce a *merge table* which maps input files to samples for samples with more than one input file.
Sometimes you have multiple input files that you want to merge for one sample. For example, the primary use case is a single library that was spread across multiple sequencing lanes; these need to be first merged, and then run through the pipeline. Rather than putting multiple lines in your sample annotation sheet, which causes conceptual and analytical challenges, we introduce two ways to merge these:

Just provide a merge table in the *metadata* section of your project config:
1. Use shell expansion characters (like '*' or '[]') in your `data_source` definition or filename (good for simple merges)
2. Specify a *merge table* which maps input files to samples for samples with more than one input file (infinitely customizable for more complicated merges).

To do the first option, just change your data source specifications, like this:

.. code-block:: yaml

  data_R1: "${DATA}/{id}_S{nexseq_num}_L00*_R1_001.fastq.gz"
  data_R2: "${DATA}/{id}_S{nexseq_num}_L00*_R2_001.fastq.gz"

To do the second option, just provide a merge table in the *metadata* section of your project config:

  metadata:
    merge_table: mergetable.csv

Make sure the ``sample_name`` column of this table matches, and then include any columns you need to point to the data. ``Looper`` will automatically include all of these files as input passed to the pipelines.
Make sure the ``sample_name`` column of this table matches, and then include any columns you need to point to the data. ``Looper`` will automatically include all of these files as input passed to the pipelines. Warning: do not use both of these options simultaneously for the same sample, it will lead to multiple merges.

Note: to handle different *classes* of input files, like read1 and read2, these are *not* merged and should be handled as different derived columns in the main sample annotation sheet (and therefore different arguments to the pipeline).
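
As a rough illustration of the first option, the sketch below shows how a wildcard in a data source template can resolve to several lane files for one sample. It is not looper's implementation: `expand_data_source`, the sample values (`frog`, `3`) and the file names are hypothetical, and the standard-library `glob` module stands in for whatever expansion looper performs internally.

```python
import glob
import os
import tempfile


def expand_data_source(template, **attributes):
    """Fill a data source template and expand shell wildcards.

    Hypothetical helper for illustration; not part of looper's API.
    """
    # Environment variables such as ${DATA} must be expanded first,
    # otherwise str.format would treat {DATA} as a sample attribute.
    pattern = os.path.expandvars(template).format(**attributes)
    # Sort so that lane files are listed in a reproducible order.
    return sorted(glob.glob(pattern))


with tempfile.TemporaryDirectory() as data_dir:
    os.environ["DATA"] = data_dir
    # Two R1 lane files and one R2 file for the same (made-up) sample.
    for name in ("frog_S3_L001_R1_001.fastq.gz",
                 "frog_S3_L002_R1_001.fastq.gz",
                 "frog_S3_L001_R2_001.fastq.gz"):
        open(os.path.join(data_dir, name), "w").close()

    files = expand_data_source(
        "${DATA}/{id}_S{nexseq_num}_L00*_R1_001.fastq.gz",
        id="frog", nexseq_num=3)
    lane_files = [os.path.basename(f) for f in files]

print(lane_files)
```

Only the two R1 files match the `data_R1` pattern; the R2 file belongs to a different class of input and would be picked up by its own `data_R2` pattern.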

35 changes: 25 additions & 10 deletions looper/__init__.py
@@ -9,7 +9,8 @@

import logging
import os
from sys import stderr
from sys import stdout
from _version import __version__


LOOPERENV_VARNAME = "LOOPERENV"
@@ -18,28 +19,42 @@
DEFAULT_LOOPERENV_CONFIG_RELATIVE = os.path.join(SUBMISSION_TEMPLATES_FOLDER,
                                                 DEFAULT_LOOPERENV_FILENAME)

LOGGING_LEVEL = logging.INFO
LOGGING_LOCATIONS = (stderr, )
DEFAULT_LOGGING_FMT = "%(asctime)s %(name)s %(module)s : %(lineno)d - [%(levelname)s] > %(message)s"
LOGGING_LEVEL = "INFO"
LOGGING_LOCATIONS = (stdout, )

# Default user logging format is simple
DEFAULT_LOGGING_FMT = "%(message)s"
# Developer logger format is more information-rich
DEV_LOGGING_FMT = "%(module)s:%(lineno)d [%(levelname)s] > %(message)s "

def setup_looper_logger(level, additional_locations=None,
                        fmt=None, datefmt=None):


def setup_looper_logger(level, additional_locations=None, devmode=False):
    """
    Called by test configuration via `pytest`'s `conftest`.
    All arguments are optional and have suitable defaults.
    :param int | str level: logging level
    :param tuple(str | FileIO[str]) additional_locations: supplementary
        destination(s) to which to ship logs
    :param str fmt: message format string for log message, optional
    :param str datefmt: datetime format string for log message time, optional
    :param bool devmode: whether to use developer logging config
    :return logging.Logger: project-root logger
    """

    fmt = DEV_LOGGING_FMT if devmode else DEFAULT_LOGGING_FMT

    # Establish the logger.
    LOOPER_LOGGER = logging.getLogger("looper")
    # First remove any previously-added handlers
    LOOPER_LOGGER.handlers = []
    LOOPER_LOGGER.propagate = False

    # Handle int- or text-specific logging level.
    try:
        level = int(level)
    except ValueError:
        level = level.upper()

    try:
        LOOPER_LOGGER.setLevel(level)
    except Exception:
@@ -64,7 +79,7 @@ def setup_looper_logger(level, additional_locations=None,
                format(additional_locations, LOGGING_LOCATIONS))

    # Add the handlers.
    formatter = logging.Formatter(fmt or DEFAULT_LOGGING_FMT, datefmt)
    formatter = logging.Formatter(fmt=(fmt or DEFAULT_LOGGING_FMT))
    for loc in where:
        if isinstance(loc, str):
            # File destination
@@ -77,7 +92,7 @@ def setup_looper_logger(level, additional_locations=None,
            handler_type = logging.StreamHandler
        else:
            # Strange supplementary destination
            logging.warn("{} as logs destination appears to be neither "
            logging.info("{} as logs destination appears to be neither "
                         "a filepath nor a stream.".format(loc))
            continue
        handler = handler_type(loc)
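
The logging changes in this file (a plain user format, a richer developer format, and acceptance of either numeric or textual levels) can be sketched in isolation as follows. This is a minimal Python 3 illustration, not the project's code: `normalize_level`, `make_logger` and the logger name `sketch` are hypothetical, and only the two format strings are copied from the diff.

```python
import io
import logging
import sys

# Format strings as they appear in the diff above.
DEFAULT_LOGGING_FMT = "%(message)s"
DEV_LOGGING_FMT = "%(module)s:%(lineno)d [%(levelname)s] > %(message)s "


def normalize_level(level):
    """Accept an int, a numeric string, or a level name in any case."""
    try:
        return int(level)
    except ValueError:
        return level.upper()


def make_logger(name, level="INFO", devmode=False, stream=sys.stdout):
    """Hypothetical, trimmed-down analogue of a logger setup function."""
    logger = logging.getLogger(name)
    # Drop handlers from earlier calls so messages are not duplicated.
    logger.handlers = []
    logger.propagate = False
    logger.setLevel(normalize_level(level))
    handler = logging.StreamHandler(stream)
    fmt = DEV_LOGGING_FMT if devmode else DEFAULT_LOGGING_FMT
    handler.setFormatter(logging.Formatter(fmt))
    logger.addHandler(handler)
    return logger


# Capture output in memory to show what a user-mode message looks like.
buffer = io.StringIO()
log = make_logger("sketch", level="debug", stream=buffer)
log.debug("hello")
user_output = buffer.getvalue()
print(repr(user_output))
```

In user mode the record is reduced to the bare message, while passing `devmode=True` would prefix it with module, line number and level.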
1 change: 1 addition & 0 deletions looper/_version.py
@@ -0,0 +1 @@
__version__ = "0.5.0"
78 changes: 72 additions & 6 deletions looper/exceptions.py
@@ -1,13 +1,66 @@
""" Exceptions for specific looper issues. """


# Simplify imports, especially for models.
__all__ = ["LooperConstructionException", "ProjectConstructionException",
           "DefaultLooperenvException", "ComputeEstablishmentException"]
# Simplify imports by permitting '*', especially for models.
__all__ = ["ComputeEstablishmentException", "DefaultLooperenvException",
           "MetadataOperationException", "MissingConfigEntryException",
           "ModelConstructionException", "PipelinesException",
           "ProjectConstructionException"]


class LooperConstructionException(Exception):

class MetadataOperationException(Exception):
    """ Illegal/unsupported operation, motivated by `AttributeDict`. """

    def __init__(self, obj, meta_item):
        """
        Instance with which the access attempt was made, along with the
        name of the reserved/privileged metadata item, define the exception.
        :param object obj: instance with which
            offending operation was attempted
        :param str meta_item: name of the reserved metadata item
        """
        try:
            classname = obj.__class__.__name__
        except AttributeError:
            # Maybe we were given a class or function not an instance?
            classname = obj.__name__
        explanation = "Attempted unsupported operation on {} item '{}'". \
                format(classname, meta_item)
        super(MetadataOperationException, self). \
                __init__(explanation)



class MissingConfigEntryException(Exception):
    """ Represent case in which Project config is missing required entry. """
    def __init__(self, entry_name, section_name="", classname="",
                 alleged_collection=None):
        """
        Define the exception via message, with name of the missing entry
        the only requirement. Provide section name and classname for
        additional context.
        :param str entry_name: name of required entry
        :param str section_name: name of section where entry is required
        :param str classname: name of class giving rise to this exception
        """
        explanation = "Missing required entry '{}'".format(entry_name)
        if section_name:
            explanation += " in '{}'".format(section_name)
        if classname:
            explanation += " of {}".format(classname)
        if alleged_collection:
            explanation += ": {}".format(alleged_collection)
        super(MissingConfigEntryException, self).__init__(explanation)




class ModelConstructionException(Exception):
    """ Error during construction of a looper ADT instance. """

    def __init__(self, datatype, stage="", context=""):
        """
        Explain failure during creation of `datatype` instance, with
@@ -29,11 +82,20 @@ def __init__(self, datatype, stage="", context=""):
            typename = str(datatype)
        explanation = "Error creating {dt}; stage: {s}; context: {c}".\
            format(dt=typename, s=stage or filler, c=context or filler)
        super(LooperConstructionException, self).__init__(explanation)
        super(ModelConstructionException, self).__init__(explanation)



class PipelinesException(Exception):
    """ Oh no, no pipelines for a project. """
    def __init__(self):
        super(PipelinesException, self).__init__()


class ProjectConstructionException(LooperConstructionException):

class ProjectConstructionException(ModelConstructionException):
    """ An error occurred during attempt to instantiate `Project`. """

    def __init__(self, reason, stage=""):
        """
        Explain exception during `looper` `Project` construction.
@@ -46,14 +108,18 @@ def __init__(self, reason, stage=""):
            datatype="Project", stage=stage, context=reason)



class DefaultLooperenvException(ProjectConstructionException):
    """ Default looperenv setup call failed to
    set relevant `Project` attributes. """

    def __init__(self, reason="Could not establish default looperenv"):
        super(DefaultLooperenvException, self).__init__(reason=reason)



class ComputeEstablishmentException(ProjectConstructionException):
    """ Failure to establish `Project` `compute` setting(s). """

    def __init__(self, reason="Could not establish Project compute env."):
        super(ComputeEstablishmentException, self).__init__(reason=reason)
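
To make the message-building behaviour of `MissingConfigEntryException` concrete, here is a small Python 3 restatement with a usage example. The class body mirrors the diff above; the `require` helper, the config keys (`output_dir`, `pipelines_dir`) and the section name are hypothetical and exist only for the demonstration.

```python
class MissingConfigEntryException(Exception):
    """Python 3 restatement of the exception added in this commit."""

    def __init__(self, entry_name, section_name="", classname="",
                 alleged_collection=None):
        # Start with the one required piece, then append optional context.
        explanation = "Missing required entry '{}'".format(entry_name)
        if section_name:
            explanation += " in '{}'".format(section_name)
        if classname:
            explanation += " of {}".format(classname)
        if alleged_collection:
            explanation += ": {}".format(alleged_collection)
        super().__init__(explanation)


def require(config, entry, section):
    """Hypothetical helper: fetch an entry or raise with full context."""
    if entry not in config:
        raise MissingConfigEntryException(
            entry, section_name=section, classname="Project",
            alleged_collection=sorted(config))
    return config[entry]


try:
    require({"output_dir": "/tmp/out"}, "pipelines_dir", "metadata")
except MissingConfigEntryException as err:
    message = str(err)

print(message)
```

Each optional argument adds one clause to the message, so a caller that knows only the entry name still gets a usable error.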
