# Lamprey Transcriptome Analysis: PGR Loci

```
Camille Scott [camille dot scott dot w @gmail.com] [@camille_codon]

camillescott.github.io

Lab for Data Intensive Biology (DIB)
University of California Davis
```

## About

    assembly version: lamp10

    assembly program: Trinity
    
    genome build: 7.0.75 (Ensembl release 75)

## Libraries

In [1]:
%load_ext autoreload
%autoreload 2
%autosave 60

Autosaving every 60 seconds


In [2]:
from libs import *
%run -i common.ipy

** Using data resources found in ../resources.json
** Using config found in ../config.json


In [3]:
import pyprind

In [4]:
from blasttools import blast_to_df

In [5]:
%pylab inline
from matplotlib import rc_context
tall_size = (8,16)
norm_size = (12,8)
mpl_params = {'figure.autolayout': True,
               'axes.titlesize': 24,
               'axes.labelsize': 16,
               'ytick.labelsize': 14,
               'xtick.labelsize': 14
               }
sns.set(style="white", palette="Paired", rc=mpl_params)
#sns.set_palette("Paired", desat=.6)
b, g, r, p = sns.color_palette("muted", 4)

Populating the interactive namespace from numpy and matplotlib


In [6]:
%config InlineBackend.close_figures = False

## Data

We'll save our results to a dictionary and dump it to JSON for use in the paper.

In [7]:
results = {}

In [8]:
store = pd.HDFStore(wdir('{}.store.h5'.format(prefix)), complib='zlib', complevel=5)

In [9]:
import atexit

In [10]:
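def dump_results(fn='../doc/petmar-pgr-loci.results.json'):
    # Write the accumulated results dict as JSON.
    # Text mode ('w') is used because json.dump emits str, which fails on a binary handle in Python 3.
    with open(fn, 'w') as fp:
        json.dump(results, fp)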
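
The `atexit` import above hints that the results are meant to be written automatically when the kernel shuts down, though no registration call appears in this part of the notebook. A minimal sketch of that pattern follows, assuming a temporary output path (`out_fn`) and a placeholder result key (`example_count`), both hypothetical:

```python
import atexit
import json
import os
import tempfile

results = {}

def dump_results(fn):
    # Text mode ('w') works with json.dump on both Python 2 and 3.
    with open(fn, 'w') as fp:
        json.dump(results, fp, indent=2, sort_keys=True)

# Hypothetical output path; the notebook writes under ../doc/ instead.
out_fn = os.path.join(tempfile.mkdtemp(), 'results.json')

# Assumption: register the dump so it runs when the interpreter exits.
atexit.register(dump_results, out_fn)

# Illustrative placeholder value, not a real result.
results['example_count'] = 3

# Calling the function directly also works, e.g. to checkpoint mid-analysis.
dump_results(out_fn)
with open(out_fn) as fp:
    reloaded = json.load(fp)
```

Registering the function once means later cells only need to add keys to `results`; the file is rewritten with whatever the dictionary holds at exit.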

In [76]:
emb_samples = resources_df[(resources_df.tissue == 'embryo') & (resources_df.meta_type == 'sample')]

In [11]:
lamp10_tpm_df = store['lamp10.eXpress.tpm.tsv']

In [12]:
lamp10_best_hits = store['lamp10_best_hits']
lamp10_blast_filter_df = store['lamp10_blast_filter_df']

In [13]:
lamp10_ortho = store['lamp10_ortho']

In [30]:
homSap_all_df = store['lamp10.fasta.x.homSap.pep.fa.db.tsv']

In [24]:
homSap_bh_df = lamp10_best_hits['lamp10.fasta.x.homSap.pep.fa.db.tsv']

In [26]:
homSap_ortho_df = lamp10_ortho['lamp10.fasta.x.homSap.pep.fa.db.tsv']

In [36]:
# Non-capturing group (?:...) matches the same IDs and avoids pandas' "match groups" warning.
tent_spopl = homSap_all_df[homSap_all_df.sseqid.str.contains(r'ENSP00000(?:280098|396006|410201)', na=False, case=False)]

In [48]:
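# .ix is deprecated; .loc does the same label-based lookup (all labels must exist in lamp10_tpm_df).
spopl_tpm = lamp10_tpm_df.loc[tent_spopl.index]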

In [57]:
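# Boolean-mask selection with .loc (replaces deprecated .ix); non-capturing group avoids the pandas warning.
spopl_df = homSap_bh_df.loc[homSap_bh_df.sseqid.str.contains(r'ENSP00000(?:280098|396006|410201)', na=False, case=False)]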
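
The filters above keep hits whose subject ID matches one of three human Ensembl peptide accessions, ignoring case and treating missing IDs as non-matches. As a rough illustration of what the pattern accepts, here is a standard-library sketch. The `sseqids` list is hypothetical sample input, and the non-capturing group `(?:...)` is a choice made here that matches the same strings as a capturing group would.

```python
import re

# Same alternation as in the notebook, written with a non-capturing group.
pattern = re.compile(r'ENSP00000(?:280098|396006|410201)', re.IGNORECASE)

# Hypothetical subject IDs; only the format is meant to be realistic.
sseqids = [
    'ENSP00000280098',
    'ensp00000396006',   # lower case still matches (case-insensitive)
    'ENSP00000410201',
    'ENSP00000000001',   # not one of the three targets
    None,                # missing value, skipped like na=False
]

# search() finds the pattern anywhere in the string, like str.contains.
matches = [s for s in sseqids if s is not None and pattern.search(s)]
```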