# RNA sequencing workflow example

In this notebook, we'll walk through an RNA sequencing workflow in Ovation for Service Labs. Although the activities in the workflow can be completed in the web app (https://lab.ovation.io) by downloading and uploading files, this notebook shows the API calls that complete the same workflow using existing bioinformatics tools.

## Setup

In [1]:
import urllib
import texttable
import os
import glob

import ovation.lab.workflows as workflows
import ovation.lab.download as download

from ovation.session import connect

from tqdm import tqdm_notebook as tqdm
from pprint import pprint

In [2]:
cwd = os.getcwd()

## Connection

This interactive notebook starts with an interactive `Session` connection. If you already have a (long-lived) API token, you can create a session with:

    s = ovation.session.Session(token, api='https://services.ovation.io', token='/api/v1/sessions')
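
A long-lived token is best kept out of the notebook itself. As a minimal sketch, assuming the token is stored in an environment variable, it could be read like this before creating the session. The variable name `OVATION_API_TOKEN` and the helper `load_token` are hypothetical and not part of the `ovation` package.

```python
import os


def load_token(env_var="OVATION_API_TOKEN"):
    """Return the API token stored in the given environment variable.

    The variable name is an assumption made for this sketch; use whatever
    name your own environment defines.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError("Set {} before starting the notebook".format(env_var))
    return token
```

The returned string can then be supplied as the token when creating the `Session`.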

In [3]:
s = connect(input('Email: '), api='https://services-staging.ovation.io', token='/api/v1/sessions')

Email: barry@ovation.io
Ovation password: ········


## Workflow

We'll need to know which workflow to post data to.

In [4]:
workflow_id = input('Workflow ID: ')

Workflow ID: 132


In [5]:
r = s.get(s.entity_path('workflows', workflow_id))
workflow = r.workflow

Here's the full workflow: 
![title](workflow.png)

The burnt-orange activities are most easily accomplished in the web app, so we'll assume that they're completed there. The sections below show the API calls for the light-orange activities.

What samples are in the pool?

In [22]:
samples = s.get(workflow.links.samples)

table = texttable.Texttable()
table.set_deco(texttable.Texttable.HEADER)
# table.set_cols_align(["l", "r", "r", "r", "l"])
# Use `sample` as the loop variable so it does not shadow the session `s`
table.add_rows([["Identifier", "Date received"]] + [[sample.identifier, sample.date_received] for sample in samples])
print(table.draw())

Identifier      Date received    
Lib-Sample   2016-09-06 00:00:00 


In [25]:
pprint(list(workflow.relationships.keys()))

['bam_qc',
 'alignment',
 'sequencing',
 'sequencing_qc',
 'sequencing_qc_prep_sortmerna',
 'sequencing_qc_prep_fastqc',
 'batch_creation',
 'demultiplex',
 'pool_sample',
 'alignment_prep_sortmerna',
 'alignment_prep_fastqc',
 'alignment_prep_trimmomatic',
 'bam_qc_prep']


In [40]:
workflow.relationships.sequencing_qc_prep_sortmerna

{'self': '/api/v1/activities?organization_id=122&workflow_activity_id=1244'}
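
The relationship link is an ordinary URL path with query parameters. If you need the identifiers it carries (for logging, say), the standard library can pull them apart. This sketch uses only `urllib.parse` and the link value shown above; the helper name `link_params` is our own.

```python
from urllib.parse import parse_qs, urlparse


def link_params(link):
    """Split a relationship link into its path and single-valued query parameters."""
    parsed = urlparse(link)
    # parse_qs returns a list per key; each key appears once here, so keep the first value
    params = {key: values[0] for key, values in parse_qs(parsed.query).items()}
    return parsed.path, params


path, params = link_params('/api/v1/activities?organization_id=122&workflow_activity_id=1244')
print(path)    # /api/v1/activities
print(params)  # {'organization_id': '122', 'workflow_activity_id': '1244'}
```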

### Downloading files

In many activities, you'll want to download files from previous activities (e.g. the `fastq` files from demultiplexing in the Sequencing QC Prep activities). You can use `ovation.lab.download.download_resources` to get the resources from a labeled activity. For example:

    # Download the `xml-file` from the sequencing activity to the current working directory
    download.download_resources(s, workflow, 'sequencing', 'xml-file', output=cwd, progress=tqdm)
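
When an activity needs inputs from more than one earlier step, the same call can be repeated in a loop. The sketch below is an illustration only: `download_all` is a hypothetical helper, and it takes the download function as a parameter so that it assumes nothing beyond the call signature shown above. Giving each request its own subdirectory is a choice made for this sketch.

```python
import os


def download_all(download_fn, session, workflow, requests, output, **kwargs):
    """Call download_fn once per (activity_label, resource_label) pair.

    download_fn is expected to accept the same arguments as the
    download_resources call shown above. Each pair gets its own
    subdirectory of `output` so files from different activities do not collide.
    Extra keyword arguments (e.g. progress=tqdm) are passed through unchanged.
    """
    targets = []
    for activity_label, resource_label in requests:
        target = os.path.join(output, activity_label, resource_label)
        os.makedirs(target, exist_ok=True)
        download_fn(session, workflow, activity_label, resource_label,
                    output=target, **kwargs)
        targets.append(target)
    return targets


# In this notebook the call would look like:
# download_all(download.download_resources, s, workflow,
#              [('sequencing', 'xml-file'), ('demultiplex', 'fastq-file')],
#              cwd, progress=tqdm)
```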

### Create batch, pool, sequencing (with QC)

*Complete in web app*

### Demultiplex

In [32]:
# Download the `xml-file` from the sequencing activity to the current working directory
download.download_resources(s, workflow, 'sequencing', 'xml-file', output=cwd, progress=tqdm)




In [33]:
ls *.xml && rm sequencing.xml

sequencing.xml


In [7]:
activity_label = 'demultiplex'
metadata = {}
resources = {'sample-sheet': [os.path.join(cwd, 'files/sample-sheet.txt')],
            'fastq-file': glob.glob(os.path.join(cwd, "files/*.fastq"), recursive=False),
            'xml-file': [os.path.join(cwd, 'files/demultiplex.xml')]}
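
A mistyped path or a `glob.glob` pattern that matched nothing (it returns an empty list rather than raising) is easier to diagnose before the upload starts. The following is a sketch of a pre-flight check; `check_resources` is a hypothetical helper and not part of the `ovation` package.

```python
import os


def check_resources(resources):
    """Return a list of problems found in a {label: [paths]} mapping.

    Reports labels with no files (e.g. a glob that matched nothing)
    and paths that do not exist on disk.
    """
    problems = []
    for label, paths in resources.items():
        if not paths:
            problems.append("{}: no files listed".format(label))
        for path in paths:
            if not os.path.isfile(path):
                problems.append("{}: missing file {}".format(label, path))
    return problems
```

Calling `check_resources(resources)` on the dictionary above returns an empty list when every label has at least one file and every file exists.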

In [8]:
demultiplex = workflows.create_activity(s, 
                                        workflow_id, 
                                        activity_label, 
                                        activity=metadata, 
                                        resources=resources, 
                                        progress=tqdm)






### Sequencing QC Prep: SortMeRNA

In [26]:
activity_label = 'sequencing_qc_prep_sortmerna'
metadata = {'singleRead': False}  # set to True for single-read (not paired-end) data
resources = {'sortmerna-file': [os.path.join(cwd, 'files/sortmerna_single_end.xls')]}

In [27]:
seq_qc_prep_sortmerna = workflows.create_activity(s, 
                                                  workflow_id, 
                                                  activity_label, 
                                                  activity=metadata, 
                                                  resources=resources,
                                                  progress=tqdm)




### Sequencing QC Prep: FastQC

In [36]:
activity_label = 'sequencing_qc_prep_fastqc'
metadata = {'singleRead': False}  # set to True for single-read (not paired-end) data
resources = {'fastqc-file': [os.path.join(cwd, 'files/fastqc_single_end.xls')]}
resource_groups = {'fastqc-output': [os.path.join(cwd, 'files/fastqc')]}

In [37]:
seq_qc_prep_fastqc = workflows.create_activity(s,
                                               workflow_id,
                                               activity_label,
                                               activity=metadata,
                                               resources=resources,
                                               resource_groups=resource_groups,  # defined above; otherwise unused
                                               progress=tqdm)




### Sequencing QC

*Complete in web app*

### Trimmomatic

In [38]:
activity_label = 'alignment_prep_trimmomatic'
metadata = {}
resources = {'trimmomatic-file': []}

In [39]:
trimmomatic = workflows.create_activity(s,
                                        workflow_id, 
                                        activity_label, 
                                        activity=metadata, 
                                        resources=resources,
                                        progress=tqdm)

### SortMeRNA

In [47]:
activity_label = 'alignment_prep_sortmerna'
metadata = {}
resources = {'sortmerna-file': ['files/sortmerna_single_end.xls']}

In [48]:
# Store the result under its own name so it does not overwrite `trimmomatic`
sortmerna = workflows.create_activity(s,
                                      workflow_id,
                                      activity_label,
                                      activity=metadata,
                                      resources=resources,
                                      progress=tqdm)




### FastQC

In [49]:
activity_label = 'alignment_prep_fastqc'
metadata = {}
resources = {'fastqc-file': ['files/fastqc_single_end.xls']}

In [50]:
fastqc = workflows.create_activity(s,
                                   workflow_id,
                                   activity_label,
                                   activity=metadata,
                                   resources=resources,
                                   progress=tqdm)




### Alignment

In [51]:
activity_label = 'alignment'
metadata = {}
resources = {'bam-file': ['files/seq1.bam']}

In [52]:
alignment = workflows.create_activity(s,
                                      workflow_id, 
                                      activity_label, 
                                      activity=metadata, 
                                      resources=resources,
                                      progress=tqdm)




### BAM QC Prep

In [53]:
activity_label = 'bam_qc_prep'
metadata = {}
resources = {'rna-seqc-file': ['files/rnaseqc.xls']}

In [55]:
bamqc_prep = workflows.create_activity(s,
                                       workflow_id,
                                       activity_label,
                                       activity=metadata,
                                       resources=resources,
                                       progress=tqdm)
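
The API steps above all follow the same pattern: pick an activity label, build `metadata` and `resources`, and call `create_activity`. As an alternative to running the cells one by one, the steps can be expressed as data and run in a loop. This is a sketch only: `run_steps` is a hypothetical helper, it takes the create function as a parameter, and it assumes the activities can be created back to back in the order listed.

```python
def run_steps(create_fn, session, workflow_id, steps, **kwargs):
    """Create one activity per step, in order, and return them keyed by label.

    Each step is (activity_label, metadata, resources). create_fn is expected
    to accept the same arguments as the create_activity calls above; extra
    keyword arguments (e.g. progress=tqdm) are passed through unchanged.
    """
    created = {}
    for activity_label, metadata, resources in steps:
        created[activity_label] = create_fn(session, workflow_id, activity_label,
                                            activity=metadata, resources=resources,
                                            **kwargs)
    return created


# The alignment-stage steps from this notebook, expressed as data
steps = [
    ('alignment_prep_sortmerna', {}, {'sortmerna-file': ['files/sortmerna_single_end.xls']}),
    ('alignment_prep_fastqc', {}, {'fastqc-file': ['files/fastqc_single_end.xls']}),
    ('alignment', {}, {'bam-file': ['files/seq1.bam']}),
    ('bam_qc_prep', {}, {'rna-seqc-file': ['files/rnaseqc.xls']}),
]

# In this notebook the call would look like:
# activities = run_steps(workflows.create_activity, s, workflow_id, steps, progress=tqdm)
```

Running this in addition to the individual cells would create each activity twice, so use one approach or the other.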




### BAM QC

*Complete in web app*