# Project analysis workflow

In this notebook, we'll walk through an RNA project analysis workflow in Ovation for Service Labs. Although each activity in the workflow can be completed in the web app (https://lab.ovation.io) by downloading and uploading files, this notebook shows the API interactions needed to complete the workflow with existing bioinformatics tools.

## Setup

In [None]:
import urllib
import texttable
import os
import glob

import ovation.lab.workflows as workflows
import ovation.lab.download as download

from ovation.session import connect
from importlib import reload
from tqdm import tqdm_notebook as tqdm
from pprint import pprint

## Connection

This notebook starts with an interactive `Session` connection. If you already have a (long-lived) API token, you can create a session with:

    s = ovation.session.Session(token, api='https://lab-services.ovation.io')

In [None]:
s = connect(input('Email: '), api='https://services-staging.ovation.io', token='/api/v1/sessions')

## Workflow

In [None]:
workflow_id = input('Workflow ID: ')

In [None]:
r = s.get(s.entity_path('workflows', workflow_id))
workflow = r.workflow

### Create batch

_Complete in web app_

### SortMeRNA

In [None]:
# Download flowcell index as JSON

In [None]:
activity_label = 'sequencing_qc_prep_sortmerna'
metadata = {'singleRead': False} # False for paired-end reads; True for single-end
resources = {'sortmerna-report': ['sequencing-sortmerna.xls'],
             'sortmerna-log-tar':['sortmerna-log.tar.gz']}
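
Before creating the activity, it can help to confirm that every file named in `resources` exists locally. The following is a minimal sketch using only the standard library. `missing_resource_files` is a hypothetical helper written for this notebook, not part of the Ovation package, and it assumes `resources` has the same shape as above (a label mapped to a list of local paths).

```python
import os


def missing_resource_files(resources):
    """Return a sorted list of paths in `resources` that are not files on disk.

    `resources` maps a resource label to a list of local file paths.
    """
    missing = []
    for paths in resources.values():
        for path in paths:
            if not os.path.isfile(path):
                missing.append(path)
    return sorted(missing)


# Same shape as the SortMeRNA resources defined above.
resources = {'sortmerna-report': ['sequencing-sortmerna.xls'],
             'sortmerna-log-tar': ['sortmerna-log.tar.gz']}

missing = missing_resource_files(resources)
if missing:
    print('Missing files:', ', '.join(missing))
```

The same check applies unchanged to the `resources` dictionaries in the later steps.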

In [None]:
seq_qc_prep_sortmerna = workflows.create_activity(s, 
                                                  workflow_id, 
                                                  activity_label, 
                                                  activity=metadata, 
                                                  resources=resources,
                                                  progress=tqdm)

### FastQC

In [None]:
activity_label = 'sequencing_qc_prep_fastqc'
metadata = {'singleRead': False} # False for paired-end reads; True for single-end
resources = {'fastqc-report': ['files/fastqc_single_end.xls']}

# Resource groups represent folders. Here we upload the "Lib-Sample" FastQC output folder.
# Ovation parses each folder name to associate it with the correct sample, assuming the
# folder is named <sample>_fastqc or <sample>_[12]_fastqc.
resource_groups = {'fastqc-output': ['files/Lib-Sample_fastqc']}

In [None]:
seq_qc_prep_fastqc = workflows.create_activity(s, 
                                               workflow_id, 
                                               activity_label, 
                                               activity=metadata,
                                               resources=resources,
                                               resource_groups=resource_groups,
                                               progress=tqdm)
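
The folder-naming convention mentioned for `resource_groups` (`<sample>_fastqc` or `<sample>_[12]_fastqc`) can be illustrated with a regular expression. This is only a sketch of one reading of that convention, useful for checking folder names locally before upload. It is not Ovation's parser, and `parse_fastqc_folder` is a hypothetical name.

```python
import os
import re

# <sample>_fastqc or <sample>_[12]_fastqc; the read number is optional.
FASTQC_DIR = re.compile(r'^(?P<sample>.+?)(?:_(?P<read>[12]))?_fastqc$')


def parse_fastqc_folder(path):
    """Return (sample, read) for a FastQC output folder, or None if the name does not fit.

    `read` is 1 or 2 when the folder name carries a read suffix, otherwise None.
    """
    name = os.path.basename(os.path.normpath(path))
    match = FASTQC_DIR.match(name)
    if match is None:
        return None
    read = match.group('read')
    return match.group('sample'), (int(read) if read else None)


for folder in ['files/Lib-Sample_fastqc', 'files/Lib-Sample_2_fastqc', 'files/notes']:
    print(folder, '->', parse_fastqc_folder(folder))
```

Under this reading, a sample whose own name ends in `_1` or `_2` is ambiguous: `Lib_1_fastqc` parses as sample `Lib`, read 1.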

### Sequencing QC

_Complete in web app_

### STAR

In [None]:
activity_label = 'alignment-star'
metadata = {}
resources = {'star-stats-file': ['star-stats.xls'],
             'star-tar': ['star.tar.gz']}

In [None]:
alignment_star = workflows.create_activity(s, 
                                           workflow_id, 
                                           activity_label, 
                                           activity=metadata,
                                           resources=resources,
                                           progress=tqdm)

### RNA-SeQC

In [None]:
activity_label = 'bam-qc-prep'
metadata = {}
resources = {'rnaseqc-metrics': ['rnaseqc.xls'],
             'rnaseqc-tar': ['rnaseqc.tar.gz']}

In [None]:
rnaseqc = workflows.create_activity(s, 
                                    workflow_id, 
                                    activity_label, 
                                    activity=metadata,
                                    resources=resources,
                                    progress=tqdm)

### Novoalign

In [None]:
activity_label = 'alignment-novo'
metadata = {}
resources = {'novo-se-pe-stats-file': ['se-pe-stats.tab'],
             'novo-pe-stats-file': ['pe-stats.tab'],
             'novo-raw-stats-file': ['raw-stats.tar.gz']}

In [None]:
novoalign = workflows.create_activity(s, 
                                      workflow_id, 
                                      activity_label, 
                                      activity=metadata,
                                      resources=resources,
                                      progress=tqdm)

### ERCC

In [None]:
activity_label = 'alignment-ercc'
metadata = {}
resources = {'ercc-stats-file': ['ercc-stats.tab'],
             'ercc-image-file': glob.glob("*.ercc.jpg"),
             'ercc-raw-stats-file': ['raw-stats.tar.gz']}
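
`glob.glob` returns an empty list when nothing matches and makes no promise about ordering, so the `ercc-image-file` entry above can silently end up empty. A minimal sketch of a guarded lookup follows. `find_ercc_images` is a hypothetical helper, and the `*.ercc.jpg` pattern is taken from the cell above.

```python
import glob
import os


def find_ercc_images(directory='.', pattern='*.ercc.jpg'):
    """Return matching image paths in sorted order (an empty list if none match)."""
    return sorted(glob.glob(os.path.join(directory, pattern)))


images = find_ercc_images()
if not images:
    print('No *.ercc.jpg files found; the ercc-image-file resource would be empty.')
```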

In [None]:
ercc = workflows.create_activity(s, 
                                 workflow_id, 
                                 activity_label, 
                                 activity=metadata,
                                 resources=resources,
                                 progress=tqdm)
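
The upload steps from STAR through ERCC all follow the same pattern and have no web-app step between them, so they could be driven from a list of specifications. The sketch below assumes nothing about Ovation beyond the call shape used in the cells above. `upload_activities` and `ALIGNMENT_SPECS` are hypothetical names, and the `create` callable is supplied by the caller.

```python
def upload_activities(create, specs):
    """Call `create(label, activity=..., resources=...)` for each spec, in order.

    Returns a dict mapping each activity label to whatever `create` returned.
    """
    results = {}
    for spec in specs:
        results[spec['label']] = create(spec['label'],
                                        activity=spec.get('metadata', {}),
                                        resources=spec.get('resources', {}))
    return results


# Labels and file names copied from the STAR and RNA-SeQC cells above.
ALIGNMENT_SPECS = [
    {'label': 'alignment-star',
     'resources': {'star-stats-file': ['star-stats.xls'],
                   'star-tar': ['star.tar.gz']}},
    {'label': 'bam-qc-prep',
     'resources': {'rnaseqc-metrics': ['rnaseqc.xls'],
                   'rnaseqc-tar': ['rnaseqc.tar.gz']}},
]
```

In this notebook, `create` could be `functools.partial(workflows.create_activity, s, workflow_id, progress=tqdm)`, which reproduces the calls made in the individual cells.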

### Alignment (BAM) QC

_Complete in web app_

### Differential expression

In [None]:
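activity_label = 'differential-expression'
metadata = {
    'activeSiteLinks': ['https://example.com/activesite']
}
# No files are listed for this activity; reset `resources` so the ERCC files
# from the previous step are not uploaded again.
resources = {}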

In [None]:
diff_expr = workflows.create_activity(s, 
                                      workflow_id, 
                                      activity_label, 
                                      activity=metadata,
                                      resources=resources,
                                      progress=tqdm)