# Introduction
This notebook shows how to use the Data API to plot the features in a genome. After initialization, the work breaks down into a few high-level steps:
* Load a workspace (a namespace for the data; each narrative also has its own workspace)
* Find genomes in the workspace
* Select one of those genomes
* Get the feature positions in the selected genome
* Plot the feature positions

## Initialize

In [1]:
%matplotlib notebook
import seaborn as sns
import os
from biokbase import data_api
from biokbase.data_api import display

In [17]:
b = data_api.browse(1013)
g = b['kb|g.1'].object
g

<biokbase.data_api.genome_annotation.GenomeAnnotationAPI at 0x10b8d9310>

## Get genomes from workspace 654

In [2]:
# Get a "browser" for the workspace
b = data_api.browse(654)

In [3]:
# Get API object for 2nd genome (index 1)
g1 = b.filter(type_re='KBaseGenomesCondensedPrototypeV2.GenomeAnnotation-.*')[1].object

In [5]:
display.Organism(g1)

## Get feature positions in one of the genomes

In [6]:
f = display.FeaturePositions(g1)

In [11]:
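import pandas as pd
import qgrid
reload(qgrid)  # re-import the module (reload is a builtin in Python 2)
qgrid.nbinstall()
# Small test frame; each column must be an ordered sequence, not a set
f2 = pd.DataFrame({'foo': (1, 2, 3), 'bar': ('a', 'b', 'c')})
qgrid.show_grid(f2)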

## Plot the features
A 'stripplot' shows each feature as a dot. The X coordinate is the feature's start position in the sequence, and the Y axis has one row for each type of feature in the dataset (sorted, by default, alphabetically). This kind of plot doesn't help with detailed analysis, but it provides a good, simple overview of the feature data.

Because we loaded matplotlib with `%matplotlib notebook`, the plot supports zooming and panning automatically. In essence, this makes our plot a mini genome browser with a "track" for each feature type.
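
As a rough, self-contained sketch of this kind of plot, the cell below builds a small made-up feature table and draws it with `sns.stripplot`. The column names (`start`, `type`, `len`) mirror the ones used in this notebook, but the values, the `features` name, the `Agg` backend, and the explicit `order=` argument are assumptions for illustration only; they are not part of the Data API workflow.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import pandas as pd
import seaborn as sns

# Hypothetical feature table: same column names as in this notebook,
# but the values are invented for the example.
features = pd.DataFrame({
    "type": ["CDS", "CDS", "CDS", "rna", "rna", "mRNA"],
    "start": [100, 2500, 7200, 1800, 9100, 400],
    "len": [900, 1200, 300, 75, 80, 1500],
})

# Spell out the row order rather than relying on a default.
type_order = sorted(features["type"].unique())

# One dot per feature: X is the start position, Y is the feature type.
ax = sns.stripplot(x="start", y="type", data=features,
                   order=type_order, size=10, jitter=False)
ax.set_xlabel("start position")
ax.set_ylabel("feature type")

# Render to an in-memory PNG to confirm the figure draws.
buf = io.BytesIO()
ax.figure.savefig(buf, format="png")
```

In a notebook using `%matplotlib notebook`, the `matplotlib.use("Agg")` line and the `savefig` call would be dropped, and the returned axes would be displayed interactively.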

In [5]:
import numpy as np

In [2]:
import pickle
# Pickle files are binary, so open in 'rb' mode and close the file afterwards
with open('featurepos', 'rb') as fh:
    f = pickle.load(fh)

In [4]:
sns.stripplot(x='start', y='type', marker='.', size=10, data=f)

<matplotlib.axes._subplots.AxesSubplot at 0x114c30050>

In [13]:
max(f['len'])

7152

In [42]:
f.to_pickle('featurepos')