# Time Series Pipeline: RNASeq data
    For every designation/subclassification of species in the microbiome, run the following
    time series tests.
    
    1. For each time point, find classifications whose z-score at that point deviates from those at all other time points.
    2. Treat each time point in turn as the last time point and find linear trends.
    3. Run changepoint analysis based on differences between the population and a sliding window.
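
The three tests listed above can be sketched on a small frame laid out like the data loaded later (dates as the index, one column per gene). This is a minimal sketch under assumptions, not the `tanner` implementation: the function names, the toy values, the three-point minimum for a trend, and the window width of 2 are all hypothetical, and the changepoint step is reduced to a simple z-score of each window mean against the whole-series mean.

```python
import numpy as np
import pandas as pd


def leave_one_out_zscores(df):
    """Step 1: z-score each time point against all of the other time points."""
    z = pd.DataFrame(index=df.index, columns=df.columns, dtype=float)
    for t in df.index:
        rest = df.drop(index=t)  # every time point except t
        # Columns with zero spread among the rest come out as NaN or inf.
        z.loc[t] = (df.loc[t] - rest.mean()) / rest.std(ddof=1)
    return z


def prefix_trends(series, min_points=3):
    """Step 2: fit a line to the series truncated at each candidate last point."""
    days = (series.index - series.index[0]).days.to_numpy(dtype=float)
    values = series.to_numpy(dtype=float)
    rows = []
    for end in range(min_points, len(series) + 1):
        slope, _intercept = np.polyfit(days[:end], values[:end], 1)
        r = np.corrcoef(days[:end], values[:end])[0, 1]
        rows.append((series.index[end - 1], slope, r))
    out = pd.DataFrame(rows, columns=["last_point", "slope", "r"])
    return out.set_index("last_point")


def window_vs_population(series, width=2):
    """Step 3 (simplified): z-score of each window mean against the full series."""
    values = series.to_numpy(dtype=float)
    pop_mean, pop_std = values.mean(), values.std(ddof=1)
    # Each window is labelled by its last time point.
    window_mean = series.rolling(width).mean().dropna()
    return (window_mean - pop_mean) / (pop_std / np.sqrt(width))


# Toy values (hypothetical); only the dates mirror the notebook's time points.
dates = pd.to_datetime(
    ["2014-09-25", "2014-10-29", "2014-11-25", "2014-12-19", "2015-02-02"]
)
demo = pd.DataFrame(
    {"GENE_A": [1.0, 1.1, 0.9, 1.0, 5.0], "GENE_B": [2.0, 2.5, 3.0, 3.5, 4.0]},
    index=dates,
)

print(leave_one_out_zscores(demo))
print(prefix_trends(demo["GENE_B"]))
print(window_vs_population(demo["GENE_A"], width=2))
```

The sketch reports a slope and a correlation coefficient rather than a p-value; a significance test for the slope could come from something like `scipy.stats.linregress`, whose result would then be compared against the p-value threshold passed to the pipeline.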

In [60]:
import sys

# User Libraries
import tanner.stats.timeseries as ts
import tanner.stats.helpers as shelp
import tanner.analysis.rnaseq as rs
import tanner.visual.timeseries as vts

# Python Libraries
import pandas as pd
import os 
import seaborn as sns

# IPython Configuration
%pylab inline
%load_ext autoreload
%autoreload 2

# Data and analysis paths
rnaseq_path = "/mounts/tscc/projects/Li-Fraumeni/data/family3/rna-seq/updated_runs/11292015_Tanner_RNASeq/"
analysis_path = "/mounts/tscc/projects/Li-Fraumeni/analysis/rnaseq"

Populating the interactive namespace from numpy and matplotlib
The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [61]:
date = "01192016"
analysis_path = os.path.join(analysis_path, date)
# Create the dated output directory if it does not exist yet
if not os.path.exists(analysis_path):
    os.makedirs(analysis_path)
sns.set_context("talk", font_scale=1.2)
sns.set_style("whitegrid")

### Go through each file type and create relevant plots

In [65]:
data_df = rs.load_deseq(rnaseq_path, individual='002')

In [66]:
data_df.head()

| symbol | DDX11L1 | WASH7P | MIR1302-10 | FAM138A | OR4G4P | OR4G11P | OR4F5 | RP11-34P13.7 | RP11-34P13.8 | CICP27 | ... | MT-ND4 | MT-TH | MT-TS2 | MT-TL2 | MT-ND5 | MT-ND6 | MT-TE | MT-CYB | MT-TT | MT-TP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2014-09-25 | -0.133548 | 1.858855 | 0 | 0 | 0 | 0 | 0 | 3.938958 | 0 | 6.871524 | ... | 9.015825 | 0 | 0 | 0 | 8.760108 | 5.848129 | 0 | 8.959355 | 0 | 5.264223 |
| 2014-10-29 | -0.167252 | 1.838836 | 0 | 0 | 0 | 0 | 0 | 4.018168 | 0 | 6.180839 | ... | 9.163805 | 0 | 0 | 0 | 8.811117 | 5.446424 | 0 | 9.178837 | 0 | 5.877908 |
| 2014-11-25 | 0.066191 | 1.66926 | 0 | 0 | 0 | 0 | 0 | 3.902033 | 0 | 7.348003 | ... | 9.291458 | 0 | 0 | 0 | 8.851813 | 5.864704 | 0 | 9.053104 | 0 | 5.406491 |
| 2014-12-19 | -0.129562 | 1.596803 | 0 | 0 | 0 | 0 | 0 | 4.121893 | 0 | 6.449629 | ... | 8.468865 | 0 | 0 | 0 | 8.528305 | 5.765864 | 0 | 8.43114 | 0 | 5.178837 |
| 2015-02-02 | -0.125927 | 1.645826 | 0 | 0 | 0 | 0 | 0 | 4.069842 | 0 | 6.142796 | ... | 8.385469 | 0 | 0 | 0 | 8.396658 | 5.374124 | 0 | 8.414193 | 0 | 5.23679 |


In [None]:
rs.process_timeseries(data_df, analysis_path, pvalue=0.05/100)

8.9474249311e-09
[ 0.38648978  0.19347677  1.          1.          1.        ]
0